KERNEL METHODS FOR BIOSENSING APPLICATIONS

By

Hassan Aqeel Khan

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Electrical Engineering - Doctor of Philosophy

2015

ABSTRACT

This thesis examines the design of noise-robust information retrieval techniques based on kernel methods. Algorithms are presented for two biosensing applications: (1) high-throughput protein arrays and (2) non-invasive respiratory signal estimation. Our primary objective in protein array design is to maximize the throughput by enabling detection of an extremely large number of protein targets while using a minimal number of receptor spots. This is accomplished by viewing the protein array as a communication channel and evaluating its information transmission capacity as a function of its receptor probes. In this framework, the channel capacity can be used as a tool to optimize probe design; the optimal probes being the ones that maximize capacity. The information capacity is first evaluated for a small-scale protein array with only a few protein targets. We believe this is the first effort to evaluate the capacity of a protein array channel. For this purpose, models of the proteomic channel's noise characteristics and receptor non-idealities, based on experimental prototypes, are constructed. Kernel methods are employed to extend the capacity evaluation to larger protein arrays that can potentially have thousands of distinct protein targets. A specially designed kernel, which we call the Proteomic Kernel, is also proposed. This kernel incorporates knowledge about the biophysics of target and receptor interactions into the cost function employed for evaluation of channel capacity.

For respiratory estimation, this thesis investigates estimation of breathing rate and lung volume using multiple non-invasive sensors under motion artifact and high noise conditions.
A spirometer signal is used as the gold standard for evaluation of errors. A novel algorithm called segregated envelope and carrier (SEC) estimation is proposed. This algorithm approximates the spirometer signal by an amplitude-modulated signal and segregates the estimation of the frequency and amplitude information. Results demonstrate that this approach enables effective estimation of both breathing rate and lung volume. An adaptive algorithm based on a combination of Gini kernel machines and wavelet filtering is also proposed. This algorithm, titled the wavelet-adaptive Gini (or WA Gini) algorithm, employs a novel wavelet-transform-based feature extraction front end to classify the subject's underlying respiratory state. This information is then employed to select the parameters of the adaptive kernel machine based on the subject's respiratory state. Results demonstrate significant improvement in breathing rate estimation when compared to traditional respiratory estimation techniques.

Copyright by HASSAN AQEEL KHAN 2015

This thesis is dedicated to my parents.

ACKNOWLEDGEMENTS

I consider the 5 years in grad-school at Michigan State University to be the most challenging yet enlightening and exciting time of my life so far. I am truly blessed to have a wonderful family and academic advisers who have always provided their full support whenever I needed it. This thesis would not have been possible without the support and guidance of my adviser Professor Shantanu Chakrabartty and I am very grateful to him for exposing me to some very innovative and novel research problems. Dr. Chakrabartty is one of the smartest people I know and I am amazed by his ability to come up with an endless set of novel research ideas. I have thoroughly enjoyed my time working under his guidance at the AIM lab. I would also like to thank my committee members Dr. Hayder Radha, Dr. Jonathan Hall and Dr. Evangelyn Alocilja for their guidance. The knowledge that I gained in courses they taught was invaluable in helping me tackle the inter-disciplinary problems required for completion of this thesis. They were all very helpful whenever I needed their input on my research problems. I also thank the National Science Foundation and the National Institutes of Health for their generous research grants which enabled me to complete my PhD.

I am highly indebted to my parents for their unwavering love and support; they would only call me on weekends because they did not want to infringe upon my study time. Thank you Ammee and Abbu, I hope I can become as amazing a parent to my daughter as you both have been to me. My wife Tooba has been very supportive throughout this time. Thank you honey for not complaining about the long hours and weekends spent in the lab. Thank you also for taking care of our daughter, Abeer, all on your own back home in Pakistan over the last year. The most wonderful gift that I have received during my PhD is my daughter; thank you Abeer Zahra, your wonderful smile and unconditional love make me forget all my worries and troubles. Thanks are also due to my brother Hassan Jamil and my sister Beenish for taking care of our parents for all the years I have been away from home. I would also like to thank all my friends and colleagues at MSU, NUST and UET for their help, support and ideas whenever I needed them. Thank you Jawad, Faraz, Shahzad, Afshan, Samina, Zubair, Momina, Ahmad and Abhinav for being part of some of my most wonderful memories at MSU. Thank you Usman Ilyas and Mohammad Aghagolzadeh at the WAVES-Lab and Liang Zhou at the AIM-Lab for always providing attentive ears to my ideas and giving valuable feedback. Thanks are also due to Syed Ali Khayam and Dr. Arshad Ali for being wonderful mentors during my time at NUST. A big shout out to my undergrad friends from UET Taxila, the J-Gang: Rao, Iffi, Saqib, Roomi, Qazi, Moodaman, Zim-X, Beela Bhae, Chaudhry and Bukhari; you guys rock and I can't wait to be back in your company.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS

CHAPTER 1 INTRODUCTION
    1.1 Impedance Plethysmography for Respiratory Signal Estimation
    1.2 High Throughput Protein Arrays
    1.3 Contributions

CHAPTER 2 RESPIRATORY SIGNAL ESTIMATION
    2.1 Multi-lead impedance plethysmography
        2.1.1 Data Collection and Pre-processing
    2.2 Respiratory Signal Characteristics
    2.3 Spirometer Signal Regression
            2.3.0.1 Average Breathing-Rate Error ($BR_{err}$)
            2.3.0.2 Envelope Correlation Coefficient ($E_\rho$)
        2.3.1 Support Vector Regression (SVR)
        2.3.2 Gaussian Mixture Regression (GMR)
        2.3.3 DCT Based Estimation
    2.4 SEC Estimation Using the AM Approximation
        2.4.1 Envelope Estimation
        2.4.2 Carrier Estimation
    2.5 Results

CHAPTER 3 BREATHING RATE ESTIMATION USING KERNEL METHODS
    3.1 Gini Kernel Machines for Breathing Rate Estimation
        3.1.1 Supervised Learning Using Gini-Kernel Machines
        3.1.2 Probabilistic Labeling of Respiratory Data
        3.1.3 Results Gini Kernel Machine
    3.2 A Wavelet Adaptive Gini Kernel Machines
        3.2.1 Respiratory State Detection using Wavelet Filters
            3.2.1.1 Region Score Computation
        3.2.2 Respiratory State Detection using DCT Filters
        3.2.3 Rate Estimation Using Adaptive Gini Kernel Machines
        3.2.4 Results
        3.2.5 Wavelet Based Artifact Detection

CHAPTER 4 PROTEOMIC CHANNEL CAPACITY
    4.1 Proteomic Channel Models
        4.1.1 Protein Diffusion Model
        4.1.2 Receptor Response Model
    4.2 Conditional Distribution of Protein Array Channel
    4.3 Proteomic Channel Capacity

CHAPTER 5 KERNEL MACHINES FOR CAPACITY ESTIMATION
    5.1 Diffusion Model
    5.2 Receptor Response Model
    5.3 Proteomic Channel Capacity Estimation
    5.4 Proteomic Kernel
    5.5 Optimization Algorithm

CHAPTER 6 CONCLUSIONS AND FUTURE WORK
    6.1 Summary
    6.2 Future Directions

BIBLIOGRAPHY

LIST OF TABLES

Table 1.1: List of cytokines/proteins employed for cancer detection.

Table 2.1: Average Respiration Rate Error ($RR_{err}$) for 11 different human subjects.

Table 2.2: Envelope Correlation Coefficient ($E_\rho$) for 11 different human subjects.

Table 3.1: Average Respiration Rate Error ($RR_{err}$) in BPM for different human subjects. Errors are computed over 10 second windows with 5 second overlaps.

Table 3.2: Average Respiration Rate Error ($RR_{err}$) in BPM for different human subjects in artifact sessions. Errors are computed over 10 second windows with 5 second overlaps.

Table 3.3: Average Respiration Rate Error ($RR_{err}$) in BPM for different human subjects in accelerated breathing and apnea sessions. Errors are computed over 10 second windows with 5 second overlaps.

Table 4.1: Behavioral model parameters for three different types of receptors with Mouse and Rabbit IgG as target analytes [35][36]. Note: the letters 'm' and 'r', in the subscript, have been employed here (instead of the numerals '1' and '2' in equation (4.19)) to represent Mouse and Rabbit IgG respectively.

Table 4.2: Values of Receptor Parameters.
Table 4.3: Diffusion and Reaction Parameters.

LIST OF FIGURES

Figure 1.1: Block diagrams of (a) a protein sensing channel and; (b) a respiratory signal estimator.

Figure 1.2: Reference Spirometer signal and Electrode outputs (raw and filtered) under different motion and breathing conditions (a) reaching for object; (b) shallow fast breathing; (c) deep fast breathing; (d) deep slow breathing; (e) holding breath.

Figure 1.3: Recent trends in protein array construction. $N$ is the number of target proteins; $V$ is the sample volume (in ) required for a single test; $E = N/S$ is the efficiency of the protein array, where $S$ represents the total spots on the microarray.

Figure 1.4: (a) Traditional array with microspots specific to only a single protein. Maximum efficiency equal to 0.5. (b) Combinatorial array, contains both specific and combinatorial microspots. Can achieve efficiency greater than 0.5. (c) Scanning electron microscope (SEM) image of previously reported combinatorial spot. Different logic elements are plotted on top of the SEM. Experimentally measured conductance across: (d) a soft-OR receptor for mouse and rabbit IgG; (e) a soft-AND receptor for mouse and rabbit IgG; (f) a conventional (non-combinatorial) receptor specific only to mouse IgG (Figure (c) to (f) adapted from [35, 36]).

Figure 2.1: Configuration employed for measurement of respiratory signal from human subjects. The spirometer employs a differential pressure sensor (placed inside a tube over the mouth) to measure flow versus time. Multiple impedance plethysmographic sensors placed over the torso measure changes in lung volume versus time.

Figure 2.2: Percentage of data during which different impedance-electrodes give the lowest error rate.

Figure 2.3: Signal-to-Distortion Ratio of a Spirometer Signal Reconstructed for a fixed number of DCT coefficients.

Figure 2.4: (a) Original Spirometer signal. (b) Reconstructed signal using only 1 DCT coefficient; SDR = -0.104 dB. (c) Reconstructed signal using the AM approximation; SDR = 5.483 dB.

Figure 2.5: Test Signal-1; time series obtained from: (a) Reference spirometer (b) SVR ($RR_{err}$ = 4.58 BPM, $E_\rho$ = 0.338) (c) GMR ($RR_{err}$ = 5.92 BPM, $E_\rho$ = 0.771) (d) DCT based estimation ($RR_{err}$ = 6.58 BPM, $E_\rho$ = 0.327) and (e) SEC ($RR_{err}$ = 2.79 BPM, $E_\rho$ = 0.989). Subject performing physical activity between 0 to 100 sec.

Figure 2.6: Test Signal-2; time series obtained from: (a) Reference spirometer (b) SVR ($RR_{err}$ = 17.61 BPM, $E_\rho$ = 0.572) (c) GMR ($RR_{err}$ = 9.38 BPM, $E_\rho$ = 0.712) (d) DCT based estimation ($RR_{err}$ = 13.83 BPM, $E_\rho$ = 0.384) and (e) SEC ($RR_{err}$ = 2.80 BPM, $E_\rho$ = 0.868). Subject performing physical activity between 0 to 100 sec.

Figure 2.7: Test Signal-3; time series obtained from: (a) Reference spirometer (b) SVR ($RR_{err}$ = 0.396 BPM, $E_\rho$ = 0.417) (c) GMR ($RR_{err}$ = 16.21 BPM, $E_\rho$ = 0.846) (d) DCT based estimation ($RR_{err}$ = 16.21 BPM, $E_\rho$ = 0.343) and (e) SEC ($RR_{err}$ = 3.03 BPM, $E_\rho$ = 0.947). No physical activity at any time.

Figure 2.8: Test Signal-4; time series obtained from: (a) Reference spirometer (b) SVR ($RR_{err}$ = 4.80 BPM, $E_\rho$ = -0.041) (c) GMR ($RR_{err}$ = 4.91 BPM, $E_\rho$ = 0.646) (d) DCT based estimation ($RR_{err}$ = 7.81 BPM, $E_\rho$ = -0.081) and (e) SEC ($RR_{err}$ = 3.24 BPM, $E_\rho$ = 0.291). Subject performing physical activity between 100 to 180 sec.

Figure 2.9: Block diagram of SEC estimation using the AM approximation.

Figure 2.10: (a) Original Spirometer signal. (b) Signal estimate using the largest magnitude DCT coefficient. (c) Signal estimate via noisy frame suppression.

Figure 3.1: Breathing rate estimation during hyperventilation under high noisy conditions. (a) Reference Spirometer. (b) SEC output. (c) Electrode-2 output.

Figure 3.2: Breathing rate estimation during apnea under high noisy conditions. (a) Reference Spirometer. (b) SEC output. (c) Electrode-2 output.

Figure 3.3: Maximum entropy regression for supervised learning; the square region represents the constraint space. (a) $\to \infty$: Solution is the projection of $U$ onto the constraint space. (b) $= 0$: Solution $\tilde{P}$ is equal to $Y$. (c) Non-extreme values of : Solution $\tilde{P}$ lies at a location within the constraint space that minimizes the total distance to the prior distribution $Y$ and the agnostic distribution $U$.

Figure 3.4: Probabilistic transformation of respiratory signal (a) Reference Spirometer output, positive values of flow indicate expiration, negative values indicate inspiration (b) Plot of $y_{i1} = P(C_1 \mid x_i)$ (or expiration probability) versus time (expiration) (c) Probability of $y_{i2} = P(C_2 \mid x_i)$ (or inspiration probability) versus time.

Figure 3.5: WA-Gini block diagram. The top plot illustrates the main steps of the respiratory state detector (see equations (3.17) to (3.19)).

Figure 3.6: Wavelet decomposition using Multi-resolution analysis.

Figure 3.7: Reference spirometer signal $y(t)$ and its corresponding Daubechies-Wavelet [62] details at different levels.

Figure 3.8: Electrode-1 output $x_1(t)$ and its corresponding Daubechies-Wavelet [62] details at different levels.

Figure 3.9: Respiratory state detection (a) Spirometer output (b) Mean of all 10 electrodes (c) Probability curves: solid line represents probability of accelerated breathing, p 0 (t); dotted line represents probability of normal (or Low) breathing, p (t).

Figure 3.10: Impact of motion-artifact on electrodes and probability curves; subject is reaching for object between the 0.5 to 2 min mark. Plots indicate: (a) Spirometer output; (b) Outputs of three electrodes and; (c) Probability curves.

Figure 3.11: Probability curves of four different subjects when reaching for object.

Figure 3.12: Probability curves of four different subjects walking at a normal pace.
Figure 3.13: Probability curves of four different subjects when rolling left and right on bed.

Figure 4.1: (a) Cross-sectional view of diffusion in a multi-protein array. Different states of the channel: (b) $t = 0$: $X_n$ particles injected at origin $I = (0, 0, 0)$; (c) $t > 0$: concentration of particles in the receptor sub-volume is given by (5); (d) $t \to \infty$: (Steady-State) concentration of particles in the receptor sub-volume is given by (6).

Figure 4.2: Concentration as a function of time inside receptor sub-volume for different values of $D_n$. (Total input concentration = 4 g/cm$^3$; $x_R = 1$ cm; $r_n = 0.02$ s$^{-1}$; $R_s = 2$)

Figure 4.3: Illustration of receptor saturation due to unavailability of free probes.

Figure 4.4: Output signal saturation in a typical affinity based array.

Figure 4.5: Specific and Combinatorial Probes.

Figure 4.6: Conditional distribution of protein array channel; (a) 3D view (b) Top view. Receptor Parameters are fixed to $k_1 = 1$; $k_2 = 0.9$; $k_{12} = 0.9$; Diffusion parameters are as listed in Table 4.3; $x_2$ = 1.765 10 3.

Figure 4.7: Conditional distribution of protein array channel for $k_1 = 0.2$ and $x_2$ = 1.765 10 3. The values of $k_2$ and $k_{12}$ vary row and column wise respectively.

Figure 4.8: Conditional distribution of protein array channel for $k_1 = 1.0$ and $x_2$ = 1.765 10 3. The values of $k_2$ and $k_{12}$ vary row and column wise respectively.

Figure 4.9: Cross sectional view of the conditional distribution $P_{Y|X}(y \mid x)$ for a fixed $x_2$ and varying $x_1$. Conditional variance $\sigma_{Y|X}(y \mid x)^2$ is approximated by its average value $\sigma_n^2$.

Figure 4.10: KL-Divergence between true and fixed variance distributions.

Figure 4.11: Capacity of protein array channel for different values of receptor parameters. Variance $P$ of the input distribution is the same for all settings and is set equal to 10.

Figure 5.1: Cross-sectional view of diffusion in a multi-protein array.
Figure 5.2: Block diagram illustrating the computation of capacity of the proteomic channel.

Figure 5.3: Illustration of interactions between protein of type 'i' and two different types of capturing probes.

LIST OF SYMBOLS

Whenever possible the following notation will be employed throughout this thesis:

$n$ : Lowercase symbols represent scalar values. Generally employed to represent indices of vectors and matrices.
$N$ : Uppercase symbols also represent scalar values. Generally employed to represent sizes of vectors and matrices.
$\mathbf{x}$ : Lowercase boldface symbols represent vectors.
$\mathbf{X}$ : Uppercase boldface symbols represent matrices.
$\mathbf{x}_i$ : Represents the $i$-th vector.
$x_i$ : Represents the $i$-th element of vector $\mathbf{x}$.
$x_{ij}$ : Represents the element $(i, j)$ of matrix $\mathbf{X}$.
$\mathbb{R}^D$ : Represents the $D$-dimensional space of real numbers.
$\mathbb{R}$ : Represents the 1-dimensional space of real numbers.
$\mathbb{R}^D_{\geq 0}$ : Represents the $D$-dimensional space of positive real numbers, including 0.
$\mathbb{Z}$ : Represents the 1-dimensional space of integers.
$\mathbb{Z}^D_{\geq 0}$ : Represents the $D$-dimensional space of positive integers, including 0.

CHAPTER 1

INTRODUCTION

Modern statistical and pattern recognition techniques have found applications in a diverse range of scientific disciplines. Modern biology, for example, has benefited tremendously from these inter-disciplinary interactions, resulting eventually in the emergence of new disciplines such as bioinformatics and computational biology. This thesis explores the application of statistical learning approaches to enhance the performance of biological and medical sensing systems. In particular, this work focuses on the use of signal processing and kernel methods for reliable information extraction from high-dimensional and noisy data found in protein and respiratory sensing data.
The block diagram of a typical protein sensing framework is shown in Figure 1.1(a). The goal in this application is to simultaneously detect the concentration levels of multiple target proteins within a test sample. From a communication theoretic perspective this setup can be viewed as a multiplexed communication channel. The throughput of this channel depends on a number of factors which may or may not be under our control. These include: diffusion noise, receptor response and saturation characteristics, the Hook effect, etc. This thesis is primarily concerned with the impact of diffusion noise and the receiver response characteristics. Ideally, we would like to obtain expressions that demonstrate the effect of receptor parameters on the throughput of multiplexed protein array platforms in the presence of channel irregularities such as diffusion noise. Information and communication theory provide a number of useful tools for evaluating the performance limits of any communication channel in the presence of noise; the foremost being the channel capacity. Enhancing the throughput of any communication channel entails the maximization of its information-theoretic capacity, $C$, which depends on the conditional probability distribution of the channel, $p(y \mid x)$. It is generally very difficult to find closed-form expressions for this distribution; this can be attributed to the complexity of the diffusion channel and the non-linear nature of the protein receptors. Therefore, the conditional distribution, $p(y \mid x)$, must be computed via numerical techniques. The numerical techniques employed in this thesis primarily employ Monte-Carlo simulations and kernel methods for evaluation of capacity and similar measures of information.

Figure 1.1: Block diagrams of (a) a protein sensing channel and; (b) a respiratory signal estimator.

Figure 1.1(b) shows the block diagram of a multi-electrode respiratory sensing system.
The objective of such a system is to measure a human subject's respiratory parameters such as: Breathing Rate (BR), indicating the frequency at which the subject inhales and exhales, and Lung Volume (LV), which corresponds to the volume of air contained within the lungs at any given instant of time. Such a framework should preferably employ non-invasive sensors in order to avoid causing discomfort to the subject being monitored. Impedance plethysmography is a popular non-invasive respiratory signal estimation technique which operates by placing plethysmographic sensors over the subject's chest and abdomen areas. Under normal breathing conditions the cross-section of the chest and abdomen areas increases during inhalation and returns to a baseline during exhalation [1]; this causes a change in the impedance of the attached electrodes, resulting in output signals from which respiratory parameters of interest can be extracted. Unfortunately, these sensors suffer from motion artifacts and noise, making it difficult to estimate the breathing rate and lung volume, especially when the subject performs some physical activity. A potential solution to this problem is to employ multiple sensors so that information from multiple sources may be combined to obtain an artifact-free estimate of the desired respiratory information. This thesis uses a number of signal processing and pattern recognition techniques to demonstrate that it is indeed possible to minimize (or diminish) the impact of noise and channel artifacts by using multiple plethysmographic sensors. Also proposed are algorithms based on kernel methods for robust recovery of respiratory parameters in the presence of motion artifacts. The reference respiratory signal for comparison, and training, is obtained from a Spirometer, which is immune to motion artifacts but is invasive and therefore not feasible for long-term subject monitoring.
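To make the notion of breathing rate concrete, the following minimal sketch estimates the rate, in breaths per minute (BPM), by counting upward zero crossings of a mean-removed signal. It is an illustrative baseline only and is not one of the estimators developed in this thesis; the function names and the synthetic sinusoidal test signal are assumptions introduced here. On real electrode outputs, baseline drift and motion artifacts would be expected to corrupt such a simple count, which is part of the motivation for the multi-sensor methods described above.

```python
import math


def breathing_rate_bpm(samples, sample_rate_hz):
    """Estimate breathing rate in breaths per minute (BPM).

    Counts upward zero crossings of the mean-removed signal and treats each
    crossing as the start of one breath cycle. Illustrative baseline only.
    """
    if len(samples) < 2 or sample_rate_hz <= 0:
        raise ValueError("need at least two samples and a positive sample rate")
    mean = sum(samples) / len(samples)
    centred = [s - mean for s in samples]
    # An upward crossing is a negative sample followed by a non-negative one.
    crossings = sum(
        1 for prev, cur in zip(centred, centred[1:]) if prev < 0.0 <= cur
    )
    duration_min = len(samples) / sample_rate_hz / 60.0
    return crossings / duration_min


def synthetic_breathing(rate_bpm, duration_s, sample_rate_hz, phase=0.3):
    """Hypothetical clean test signal: a sinusoid at the given breathing rate."""
    n = int(duration_s * sample_rate_hz)
    freq_hz = rate_bpm / 60.0
    return [
        math.sin(2.0 * math.pi * freq_hz * k / sample_rate_hz + phase)
        for k in range(n)
    ]


if __name__ == "__main__":
    signal = synthetic_breathing(rate_bpm=15.0, duration_s=60.0, sample_rate_hz=50.0)
    print(breathing_rate_bpm(signal, sample_rate_hz=50.0))
```

The phase offset in the synthetic signal simply keeps the zero crossings away from the sample instants, so that the count does not depend on floating-point rounding.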
This thesis is organized as follows: Motivation for both applications is presented in section 1.1 and section 1.2. Contributions are listed in section 1.3. Chapter 2 presents an approach which employs DCT based filtering and pattern recognition for estimation of breathing rate and (tidal) lung volume. Chapter 3 examines accurate breathing rate estimation using kernel methods. Also presented is an innovative wavelet filtering based front-end which enables detection of different respiratory and physical states of human subjects with high accuracy. Coupled with kernel methods, this technique demonstrates significant reduction in the error obtained when estimating breathing rate from impedance-plethysmographic electrode channels. An information theoretic analysis of the protein array channel is conducted in chapter 4. Optimal probe configurations that maximize information exchange across the proteomic channel are also investigated. Chapter 5 proposes a framework based on kernel methods for evaluating the quadratic capacity of the proteomic channel. Chapter 5 also presents a novel proteomic kernel which is based on the bio-physical interaction of the receptor probes and the target particles. Conclusions and future work are presented in chapter 6.
1.1 Impedance Plethysmography for Respiratory Signal Estimation

Chronic obstructive pulmonary disease (COPD) is the 3rd leading cause of death worldwide [2] and is a major cause of disability, affecting more than 12 million people in the United States [3]. Over 5 million people in the United States (US) are affected by Heart Failure (HF), which accounts for 300,000 deaths per year in the US [4]. Difficulty in breathing and shortness of breath are early indicators of deteriorating patient conditions in both these diseases. There are currently no cures for HF and COPD; therefore, continuous monitoring of the respiratory condition in these patients can enable caregivers to intervene at an early stage and manage disease symptoms, forestalling catastrophic events and avoiding loss of precious lives. The two most important parameters extracted from the respiratory signal are the respiration-rate and lung-volume. Lung-volume is indicative of the size of the lungs and the volume of air a patient can breathe in or out; it is the most important factor for detection of COPD [5]. Tachypnoea, or an increase in respiration-rate, can be representative of an attempt by the body to compensate for poor pulmonary gas exchange and/or poor cardiac circulation. It has been demonstrated to be a significant factor in the prediction of cardiac arrest in the ICU [6]. Depression of the respiratory center due to severe deterioration of the patient or narcotic overmedication often corresponds to a decreased respiratory-rate [7]. However, despite the significance of monitoring patient breathing patterns and respiratory rates, these measurements are frequently ignored in clinical practice [1]. Studies of ICU practices have revealed that, in spite of the installation of vital-signs-based early warning scoring systems, respiratory measurements are neglected over 40% of the time [8]. This can be attributed to the fact that the most accurate respiratory rate measurement methods, such as CO2 sensors and flow sensors, are difficult to administer and often intolerable for non-intubated ambulatory patients. An alternative is to measure respiratory conditions using impedance based electrodes. This method is not invasive and thus is more likely to be accepted by both clinicians and patients due to the already routine use of electrode based monitoring systems at hospitals.

Figure 1.2: Reference Spirometer signal and Electrode outputs (raw and filtered) under different motion and breathing conditions (a) reaching for object; (b) shallow fast breathing; (c) deep fast breathing; (d) deep slow breathing; (e) holding breath.

Unfortunately, electrode-based respiratory measurements are noisy and therefore require post-processing to minimize the impact of noise irregularities. The second major objective of this thesis will be to employ multiple electrodes placed at different spatial locations over the bodies of human subjects to estimate the respiratory signal. A spirometer (invasive flow sensor) is used as the gold-standard reference signal. The top plot in Figure 1.2 displays the output of a spirometer in different respiratory states. The corresponding outputs of three distinct electrodes are shown in the middle plot. Notice that the electrodes introduce a slowly varying DC baseline in the respiratory signal. Furthermore, there is significant distortion in region (a) due to motion artifacts that occur when the subject reaches for an object. The electrode output after bandpass filtering is shown in the bottom plot. This thesis proposes a number of different techniques for obtaining an accurate estimate of the spirometer/respiratory signal from electrode outputs by formulating the task at hand as a machine learning problem. A novel technique called Segregated Envelope Carrier (SEC) estimation is proposed. This approach is based on the hypothesis that the respiration information lies on two distinct manifolds: (1) a high frequency manifold and; (2) a low frequency manifold. The details of this approach are presented in chapter 2. The SEC enables the estimation of both breathing rate and lung volume. It is highlighted that traditional non-invasive approaches to respiratory signal estimation concentrate solely on respiration-rate estimation and generally do not cater for lung-volume estimation. For example, a Kalman filter framework was proposed in [9]. This approach estimates the respiration-rate by combining information from multiple physiological sources. Respiration rate can also be derived from the electrocardiogram (ECG); such approaches generally employ algorithms based on the R-peak amplitude (RPA) modulation [10] or the respiratory sinus arrhythmia (RSA) [11], [12]. More recently, an approach combining both RSA and RPA has been proposed in [13]. Although all the aforementioned techniques achieve high accuracy (in estimating respiration-rate only), they are tested on data collected from non-ambulatory subjects generally resting in a supine position. In contrast, the database employed for this thesis has been created under more challenging conditions and records the respiratory signal in both ambulatory and non-ambulatory conditions. There are only a small number of very recent studies that measure both lung-volume and respiration-rate, which can be found in [14] and [15].

In addition to the SEC, an innovative approach based on the combination of wavelet filtering and kernel methods is proposed in chapter 3. This technique achieves a significant reduction in the breathing rate error under noise and artifact conditions.

1.2 High Throughput Protein Arrays

The human body is thought to contain more than 2 million proteins, each associated with a different biological function [16]. Decoding of these complex biological functions requires detecting and measuring the state of numerous proteins simultaneously. In this regard, high-throughput protein microarrays have become an essential tool which enables rapid, direct, quantitative and multiplexed detection of a multitude of proteins. Applications of the protein microarray technology range from drug development to disease detection and diagnosis.

Table 1.1: List of cytokines/proteins employed for cancer detection.

    Cancer type        Cytokines/proteins
    Breast Cancer      Sialyl Lewis x, C3, C4, C5, IL-8, TM-peptide, IL-5, IL-7, MCP-3, CXCL8, IL-8, CXCL1, GRO
    Ovarian Cancer     IL-6, IL-8, VEGF, EGF, MCP-1, CA-125, Leptin, Prolactin, Osteopontin, IGF-1, MIF
    Prostate Cancer    MCP-1, IL-6, IL-8, GRO-, ENA-78, CXL-16
Consider, for example, the case for detecting cytokines, which are signaling proteins that are collectively responsible for a number of physiologic functions and play an important role in detecting the onset of many diseases [17]. Using protein microarrays, researchers have been able to uncover new and improved cytokine based biomarkers for a number of diseases such as Alzheimer's [18] and Parkinson's [19] diseases, and many types of cancers [20]. Table 1.1 lists some examples of cytokine and protein targets that could be used as biomarkers for different types of cancers (Refs: [17, …]). One of the trends, and corresponding challenges, in the design of protein assays is to be able to simultaneously detect as many biomarkers as possible while minimizing the volume of the sample required for analysis.

Figure 1.3: Recent trends in protein array construction. $N$ is the number of target proteins; $V$ is the sample volume (in ) required for a single test; $E = N/S$ is the efficiency of the protein array, where $S$ represents the total spots on the microarray.

Figure 1.3 shows the specifications of some of the protein microarrays that have been reported during the last 15 years. The plot compares the total number of targets that can be simultaneously detected versus the efficiency of the array, for different ranges of test sample volume. Figure 1.3 clearly shows that the overall trend has been to enhance the throughput and multiplexing capability of protein arrays by detecting a large number of target proteins while consuming as little of the test sample volume as possible. For instance, one of the first protein arrays was proposed in [25]; it employed 504 spots for multiplexed detection of 7 targets, ensuring very high redundancy at the cost of achieving very low efficiency (plotted on the bottom left of Figure 1.3). Recently developed microarrays, however, generally employ approximately 2 spots per target and therefore have an efficiency value around 1/2. Figure 1.3 also indicates that the best arrays (in terms of $E$ and $N$) also consume the smallest amount of sample volume per target per test.
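The efficiency figures quoted above follow directly from the definition $E = N/S$. The short sketch below works through the two cases; the function name is an assumption introduced here, and the 100-target array is a hypothetical example chosen only to illustrate the "2 spots per target" case.

```python
def array_efficiency(num_targets, num_spots):
    """Efficiency E = N / S of a protein array with N targets and S spots."""
    if num_targets < 0 or num_spots <= 0:
        raise ValueError("need a non-negative target count and a positive spot count")
    return num_targets / num_spots


# Early array reported in [25]: 7 targets detected with 504 spots.
early_array = array_efficiency(num_targets=7, num_spots=504)

# Hypothetical recent array with 2 spots per target (100 targets, 200 spots).
recent_array = array_efficiency(num_targets=100, num_spots=200)

if __name__ == "__main__":
    print(f"early array:  E = {early_array:.4f}")
    print(f"recent array: E = {recent_array:.4f}")
```

Under this definition, any array that dedicates two or more spots to each target has $E \leq 1/2$, which is the efficiency ceiling that combinatorial probes aim to exceed.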
Figure 1.4: (a) Traditional array with microspots specific to only a single protein. Maximum efficiency equal to 0.5. (b) Combinatorial array, contains both specific and combinatorial microspots. Can achieve efficiency greater than 0.5. (c) Scanning electron microscope (SEM) image of previously reported combinatorial spot. Different logic elements are plotted on top of the SEM. Experimentally measured conductance across: (d) a soft-OR receptor for mouse and rabbit IgG; (e) a soft-AND receptor for mouse and rabbit IgG; (f) a conventional (non-combinatorial) receptor specific only to mouse IgG (Figure (c) to (f) adapted from [35, 36]).

This thesis investigates the limits of multi-analyte detection capability of a generic proteomic microarray platform based on information theoretic and computational modeling techniques. From an information theoretic point of view, a proteomic platform can be viewed as a biosensing channel where target proteins (with different concentration levels) constitute the signal being transmitted, and the channel noise arises due to different biosensing artifacts like non-specific binding, saturation, hook effect, spot corruption and measurement noise [37, 38]. This concept is illustrated in Figure 1.4(a) for a small scale assay where each of the spots (labeled 1-3) is immobilized with target-specific antibody probes. For the sake of simplicity the input to the assay is a binary vector, with a '0' indicating absence and a '1' indicating presence of the target protein. Each of the probes comprises epitopes, which are recognition sites that bind with the target protein with some degree of affinity. Thus, a protein-probe hybridization can be viewed as an equivalent "inner-product" between the assay matrix and the input vector, with the resulting output being a measurable electrical or optical signal vector.
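The "inner-product" view can be sketched in a few lines. The example below is a deliberately idealized, noiseless and linear model: the affinity values, the matrix names and the helper functions are hypothetical and are not taken from the receptor models of this thesis, which account for saturation and noise. It compares a target-specific assay (one spot per target) with a two-spot combinatorial assay and counts how many of the eight binary input patterns for three targets produce distinct outputs.

```python
from itertools import product


def assay_response(affinity, targets):
    """Noiseless assay output: inner product of each spot's affinity row
    with the binary target-presence vector."""
    return tuple(sum(a * t for a, t in zip(row, targets)) for row in affinity)


def distinguishable_inputs(affinity, num_targets):
    """Number of distinct outputs over all binary input patterns."""
    responses = {
        assay_response(affinity, pattern)
        for pattern in product((0, 1), repeat=num_targets)
    }
    return len(responses)


# Hypothetical affinity matrices (rows = spots, columns = targets).
SPECIFIC = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
COMBINATORIAL = [
    [1.0, 0.5, 0.0],  # spot 1 binds target 1 strongly and target 2 weakly
    [0.0, 0.5, 1.0],  # spot 2 binds target 3 strongly and target 2 weakly
]

if __name__ == "__main__":
    print(assay_response(COMBINATORIAL, (1, 1, 0)))
    print(distinguishable_inputs(SPECIFIC, 3))
    print(distinguishable_inputs(COMBINATORIAL, 3))
```

In this idealized setting both assays separate all eight input patterns, but the combinatorial assay does so with two spots instead of three. Whether such an advantage survives noise and receptor saturation is exactly the question that a capacity analysis is meant to answer.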
Conventionalmicroassaysandmicroarraysusemultiplespotsofantibodyprobestoim- provethereliabilityofdetectingasingletarget.Thus,fromachannelcodingpointofview thisprocedurecanbeviewedasusinga repetition block-codeandexistingmicroarrayplat- formusea repetition codetocombatchannelerrors.However,itiswellknownthatthe channelcapacityofarepetitioncodeisnote˚cient,especiallyifthesizeoftheblock-code becomeslarge.Inthisregard,usingabinatorial"probethatcanbindwithdi˙erent targetproteinswithdi˙erenta˚nities(asshowninFigure1.4(b))couldbeusedtoenhance thecapacityoftheassay.Investigatingthisprincipleusingacomputationallye˚cientap- proachisoneofthemainobjectiveofthisthesis.Thisrequiresdevelopmentofsuitable channelmodelsfollowedbytheevaluationofthechannelprobabilitydistributionswhichare requiredforcomputingthechannelcapacity.Modelsoftheproteomicdi˙usionchanneland di˙erenttypesofreceptorsarederivedinchapter4.Numericalresultsindicatethatcapacity oftheproteomicchannelcanindeedbeenhancedusingcombinatorialprobes.Furthermore, e˚ciencyofnumericalcomputationofthechanneldistributionscanbeimprovedemploying kernelmethods. 10 1.3Contributions Theprimarymotivationforthisthesisistoemploytheprinciplesofpatternrecognitionand informationandsignalprocessingfornoiserobustretrievalofinformationfromemerging biosensingapplications.Inthisrespectthisthesisfocusesontwoapplicationsareasnamely: (1)Non-invasiverespiratorysignalestimationand(2)Highthroughputproteindetection arrays.Inboththeseapplicationsigni˝cantemphasisisplacedonkernelmethods.Thekey contributionsarelistedbelow: 1. 
This thesis employs a novel dataset recorded by General Electric Global Research in Niskayuna, New York for estimation of respiratory signal parameters. This dataset contains respiratory signals recorded from multiple non-invasive sensors from 19 human subjects. In comparison to datasets employed in existing literature our dataset is unique in the sense that it contains multiple instances of subjects performing various physical activities. Our dataset contains multiple instances of the subjects in different respiratory states such as apnea, accelerated breathing and hyper-ventilation, slow breathing etc. Existing datasets, in contrast, generally focus on only one or two types of respiratory states. Therefore, the respiratory dataset employed in this thesis is unique in the sense that it contains a diverse set of respiratory states and physical activities.

2. Existing literature in respiratory estimation focuses primarily on breathing rate estimation alone. Lung volume estimation is generally ignored and there are only a few works that have investigated the estimation of lung volume from non-invasive sensors [14] and [15]. However, the datasets employed in these works do not contain any motion artifacts. This thesis proposes a novel approach called the Segregated Envelope Carrier (SEC) estimation which examines the estimation of both breathing rate and lung volume from non-invasive sensors under both artifact and artifact-free conditions.

3. A set of novel features based on the discrete wavelet transform is proposed. These features provide a simple method for classification of the subject's respiratory and physical states. This thesis employs these features for detection of artifacts, apnea, accelerated and normal breathing regions. To the best of the author's knowledge this is the first time that these types of features have been employed for classification of respiratory and physical states.

4.
An adaptive framework based on Gini kernel machines is proposed for breathing rate estimation. This is titled the Wavelet-Adaptive-Gini (or WAGini) algorithm for breathing rate estimation. This algorithm employs wavelet based features for respiratory state classification and uses the classifier's decision for selecting a kernel machine that has been trained specifically for the underlying respiratory state. Evaluation of the output indicates that the WAGini algorithm enables significant reduction in the breathing rate estimation error. The performance improvement obtained is significant in comparison to standard rate estimation techniques.

5. For protein array sensing this thesis evaluates the impact of various channel irregularities on information transfer between the input and output of the affinity based protein array sensors. For this purpose a protein array is viewed as a communication channel and its channel capacity is evaluated. To the best of the author's knowledge this is the first effort undertaken to evaluate the information transmission capacity of a protein array channel.

6. Capacity evaluation of the protein array channel entails modeling of the various irregularities that can have an adverse impact on the information of interest. For this purpose models of diffusion processes and receptor artifacts, based on experimental prototypes constructed in lab, are presented. Existing literature investigating the capacity of biological communication channels generally employs ideal models of receptors; see for example [39].

7. Evaluation of Shannon's channel capacity becomes challenging when dealing with non-linear channels with continuous and high-dimensional input alphabets. An alternative can be to employ metrics such as a quadratic form of mutual information which in practice are more amenable to optimization. In this context an optimization framework based on a quadratic information measure is proposed in chapter 5. Of particular importance in this framework is the use of a novel kernel which we call the Proteomic Kernel. The proteomic kernel is designed specifically to capture the biophysical interactions of the receptor probes and the target protein particles.
This enables the easy extension of protein array designs to a large number of target proteins. It is envisioned that the theoretical discussion provided in this thesis will eventually lead to software tools that will enable prompt and cost-effective design of high-throughput protein arrays.

CHAPTER 2
RESPIRATORY SIGNAL ESTIMATION

Accurate respiratory signal estimation using impedance plethysmography can be challenging under certain conditions, such as during patient motion or under noisy conditions at high breathing rates. This chapter discusses a number of different signal processing and learning techniques for estimation of breathing rate and lung volume from multiple non-invasive impedance plethysmographic electrode channels. The organization of this chapter is as follows: section 2.1 describes the hardware employed for obtaining respiratory data from different human subjects; it also details the different conditions under which the data was collected. Section 2.2 discusses the salient characteristics of the respiratory signal; this is done to provide theoretical background and gain insight into what strategy to employ to obtain a good estimate of the respiratory signal from the electrode outputs. Section 2.3 contains details of the different regression techniques employed to predict the respiration/spirometer signal from the electrode outputs. A total of four different regression techniques have been employed; they are listed in sections 2.3.1 to 2.4. The first two approaches are based on conventional techniques such as Support Vector regression (SVR) and Gaussian mixture regression (GMR). A simple scheme based on DCT filtering is discussed in section 2.3.3. A novel approach, Segregated Envelope and Carrier (SEC) estimation, is proposed in section 2.4. It operates on the assumption that respiratory information is contained in two distinct manifolds: (1) the Envelope Manifold containing slow varying temporal information and (2) the Carrier Manifold containing the relatively faster varying temporal information. Consolidated results for all human subjects are summarized in section 2.5.
Figure 2.1: Configuration employed for measurement of respiratory signal from human subjects. The spirometer employs a differential pressure sensor (placed inside a tube over the mouth) to measure flow versus time. Multiple impedance plethysmographic sensors placed over the torso measure changes in lung volume versus time.

2.1 Multi-lead impedance plethysmography

The most reliable and accurate methods of measuring the breathing rate employ instruments such as spirometers that measure the changes in the airflow directly from the patient's airway. The spirometer (or flowmeter) setup employed during data recording is illustrated in Figure 2.1; it uses a differential pressure sensor placed inside a tube located over the subject's mouth. The subject's nose is blocked using a nose clip so that only the airflow to and from the mouth is captured. The difference in airflow to and from the mouth is measured by the differential pressure sensor, which then produces the time-series signal $y(t)$ corresponding to the variation of airflow into and out of the lungs as a function of time.

Figure 2.2: Percentage of data during which different impedance-electrodes give the lowest error rate.

An alternate sensing mechanism that can be employed to measure the respiratory signal uses impedance-plethysmographic sensors placed over the subject's torso. These sensors operate by capturing a subject's chest motion as it inflates and deflates during inspiration and expiration. An impedance plethysmographic electrode sensor measures the changes in the air volume inside the subject's lungs as a function of time and its output $x(t)$
is, therefore, the integral of the flow signal output from the spirometer:

$$x(t) = \int y(t)\,dt \tag{2.1}$$

Due to its non-invasive nature the plethysmographic sensing mechanism is very appealing for long-term and remote monitoring of patients. Unfortunately, this mechanism is prone to motion artifacts [40], since essentially any activity by the subject, such as arm movement, can also be measured by a plethysmographic sensor and can therefore interfere with the respiratory information. A potential solution to this problem is to employ multiple sensing electrodes placed at different spatial locations over the patient's body. Since the respiratory signal is correlated among all the impedance electrodes and the patient movements are sporadic and generally not correlated, we should be able to separate the respiratory signal from motion artifacts using a multi-sensor setup. A simple procedure to demonstrate the potential advantage of using multiple impedance-electrodes is to divide the respiratory data into non-overlapping segments of equal time duration and then compute the percentage of segments during which the breathing rate obtained from any particular chest-sensor is closest to the reference breathing rate obtained from the spirometer. In this way we can compute the percentage of segments during which a certain electrode gives the best performance (or the smallest breathing rate error). The bar-graph in Figure 2.2 plots the percentage of segments during which each of the 10 chest-sensors gives the best performance. It can be observed that there is no clear winner and the most accurate chest-sensor gives the best performance in only about 20% of the total segments. This is not unexpected since the data is not artifact free and the degree of impact of a motion artifact on an impedance-electrode depends on the nature of the underlying physical activity and the electrode's location. For example, a sudden right arm movement is more likely to distort outputs of electrodes on the right side of the torso than it is to distort the left side sensors. As a result there seems to be no single electrode-sensor that gives the best performance across all the different activities contained in the various segments of data. Therefore, it seems likely that multiple
impedance-electrodes may enable us to minimize or eliminate the impact of motion artifacts on respiratory signal estimation.

2.1.1 Data Collection And Pre-processing

The experiments in this chapter are based on 11 respiratory datasets, each of which was recorded from a distinct adult human subject. An additional 8 subjects are added for the results in the next chapter. Data was collected by General Electric (GE) Global Research at their Niskayuna, NY location. The data collection protocol was approved by GE's institutional review board. A total of 10 impedance electrodes were placed at different spatial locations on a human subject's torso as shown in Figure 2.1. As mentioned previously, a spirometer (Model RX137F, Biopac Inc., Goleta, CA) is used as the reference respiratory signal. The spirometer and electrodes were switched on and off by two different operators and the output signals were aligned manually. Excess pre- and post-samples were truncated. The spirometer and electrode hardware have different sampling rates; therefore, interpolation was employed to obtain waveforms with identical sampling rates. Each individual dataset is approximately 50 minutes in duration. During recording the human subjects were instructed to maintain different positions/postures such as sitting in a chair, lying face-up on a bed and standing. Furthermore, subjects were told to achieve distinct respiratory-states such as normal breathing, deep breathing, shallow fast breathing, deep fast breathing, coughing, yawning and holding breath, while simultaneously maintaining different postures/positions. Each dataset also incorporated motion artifacts by recording the respiratory signal while the patient performed different physical-activities such as reading, eating, walking and reaching to grab an object. Consider, for example, interval (a) in Figure 1.2 (on page 6) where the subject is reaching for an object while breathing normally (as indicated by the spirometer signal) in a seated position. Motion artifacts in this interval cause significant distortion in the electrode signals. During intervals (b) through (e) the subject is lying, face-up, on a bed and maintaining different respiratory-states, each for a duration of approximately 30 seconds.
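The rate-matching step mentioned above can be sketched as follows. This is only an illustration: the text does not state the two sampling rates or the interpolation method, so the rates (100 Hz and 32 Hz), the use of linear interpolation and the function name `resample_to_common_rate` are all assumptions made here.

```python
import numpy as np

def resample_to_common_rate(signal, fs_in, fs_out):
    """Linearly interpolate `signal` (sampled at fs_in Hz) onto a grid at fs_out Hz."""
    signal = np.asarray(signal, dtype=float)
    t_in = np.arange(len(signal)) / fs_in            # original sample times
    duration = (len(signal) - 1) / fs_in             # time of the last input sample
    n_out = int(np.floor(duration * fs_out)) + 1     # output samples that fit inside
    t_out = np.arange(n_out) / fs_out
    return t_out, np.interp(t_out, t_in, signal)

# Hypothetical rates: a 100 Hz spirometer trace mapped onto a 32 Hz electrode grid.
fs_spiro, fs_elec = 100.0, 32.0
t = np.arange(1000) / fs_spiro
spiro = np.sin(2 * np.pi * 0.25 * t)                 # 15 breaths per minute
t_new, spiro_32 = resample_to_common_rate(spiro, fs_spiro, fs_elec)
```

Linear interpolation is adequate here only because the breathing band sits far below either sampling rate; a band-limited resampler would be the more careful choice if that were not the case.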
In addition to motion artifacts, impedance electrode outputs include a changing DC-baseline (see Figure 1.2, middle plot). This can be attributed to slight shifts in electrode position over time. Therefore, the first pre-processing step is to apply a DC blocking filter to the electrode outputs. After this a low pass FIR filter is applied to eliminate high frequency interference and noise. The bottom plot in Figure 1.2 displays the filtered electrode signals and demonstrates that simple filtering eliminates the DC-baseline and high frequency noise.

2.2 Respiratory Signal Characteristics

This section provides a brief background about the important information contained within the respiratory signal output from the spirometer and examines its time-frequency characteristics. A typical spirometer employs a mouthpiece to directly measure the airflow in the lungs during inspiration and expiration [41]. The breathing rate is contained in the spirometer frequency whereas the lung volume can be obtained by integrating the spirometer output. Therefore, it is critical to preserve both the frequency and amplitude of the spirometer signal in order to produce an accurate estimate of the respiration rate and lung volume. As a result, we approximate the spirometer output by an Amplitude-Modulated (AM) signal. The harmonic nature of the spirometer signal implies that it can be represented efficiently using a harmonic basis such as the Discrete Fourier Transform (DFT) or the Discrete Cosine Transform (DCT). If true, this may provide us with a method to construct sparse features for signal regression. To examine the harmonic nature of the respiratory signal we take a sample spirometer output and subdivide it into non-overlapping frames (or windows) of length $M = 200$ samples each. Given a total of $F$ non-overlapping frames in the spirometer output, it is transformed using a DCT basis

$$\mathbf{z}_i = \mathbf{T}\mathbf{u}_i \quad \text{for } i = [1, \ldots, F] \tag{2.2}$$

where $\mathbf{T}$ represents the $(M \times M)$ DCT basis. The vector $\mathbf{u}_i \in \mathbb{R}^M$ represents the $i$-th frame of the spirometer waveform and $\mathbf{z}_i \in \mathbb{R}^M$ represents the corresponding vector of coefficients obtained after application of the DCT transform. A quantized/distorted estimate of the spirometer signal is then obtained using the inverse DCT transform as below

$$\tilde{\mathbf{u}}_i = \mathbf{T}^{-1}\tilde{\mathbf{z}}_i \quad \text{for } i = [1, \ldots, F] \tag{2.3}$$

where $\tilde{\mathbf{z}}_i$ is the vector that is obtained by retaining only the $N \le M$ coefficients in $\mathbf{z}_i$ that have the largest absolute values; the remaining coefficients are set to zero. For a given value of $N$ the quality of the reconstructed signal is evaluated by computing its average Signal-to-Distortion Ratio (SDR) as below:

$$SDR = \frac{1}{F}\sum_{i=1}^{F} 20 \log \frac{\|\tilde{\mathbf{u}}_i\|}{\|\mathbf{u}_i - \tilde{\mathbf{u}}_i\|} \tag{2.4}$$

where $\|\cdot\|$ represents the $l_2$-norm. Figure 2.3 displays a plot of the Signal-to-Distortion Ratio (SDR) for different values of $N$.

The length $M$ of each frame $\mathbf{u}_i$ is equal to 200 samples and therefore the maximum value of $N$ can be equal to 200. However, we are interested in the SDR values at small values of $N$, hence Figure 2.3 displays SDR only up to a maximum of $N = 30$ coefficients.

Figure 2.3: Signal-to-Distortion Ratio of a spirometer signal reconstructed for a fixed number of DCT coefficients.

Figure 2.3 indicates that a reasonable SDR (> 10 dB) can be achieved even using a small number (10 to 15) of DCT coefficients. It is highlighted here that we are primarily interested in extracting the frequency and the amplitude; other factors such as the exact shape of the waveform are not critical and can therefore be compromised in order to preserve these two parameters. Therefore, the SDR may not be the best metric to measure the quality of the spirometer signal, since even a low SDR signal may be acceptable as long as it preserves the critical parameters. Hence, SDR is only employed in this section for demonstration purposes; the final results are evaluated using different metrics (described in section 2.3).
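The truncated-DCT experiment of equations (2.2) to (2.4) can be sketched as below. The orthonormal DCT-II basis, the base-10 logarithm and the synthetic test signal are assumptions made for illustration; the text specifies only "a DCT basis" and a 200-sample frame.

```python
import numpy as np

def dct_basis(M):
    """Orthonormal DCT-II matrix T (M x M); its inverse is its transpose."""
    n = np.arange(M)
    k = n.reshape(-1, 1)
    T = np.sqrt(2.0 / M) * np.cos(np.pi * (2 * n + 1) * k / (2 * M))
    T[0, :] /= np.sqrt(2.0)
    return T

def reconstruct_top_n(u_frames, N):
    """Keep the N largest-magnitude DCT coefficients of each frame (eqs. 2.2-2.3)."""
    T = dct_basis(u_frames.shape[1])
    z = u_frames @ T.T                               # row i holds z_i = T u_i
    idx = np.argsort(np.abs(z), axis=1)[:, -N:]      # positions of the N largest |z|
    rows = np.arange(z.shape[0])[:, None]
    z_q = np.zeros_like(z)
    z_q[rows, idx] = z[rows, idx]
    return z_q @ T                                   # row i holds T^{-1} z~_i

def average_sdr_db(u_frames, u_hat):
    """Average SDR over frames, eq. (2.4), using log base 10 (assumed).

    The ratio is undefined when a frame is reconstructed exactly (N = M)."""
    num = np.linalg.norm(u_hat, axis=1)
    den = np.linalg.norm(u_frames - u_hat, axis=1)
    return float(np.mean(20.0 * np.log10(num / den)))

# Synthetic stand-in for a spirometer trace: an amplitude-modulated tone plus noise.
rng = np.random.default_rng(0)
t = np.arange(5 * 200)
signal = np.sin(2 * np.pi * t / 50) * (1 + 0.5 * np.sin(2 * np.pi * t / 400))
frames = (signal + 0.05 * rng.standard_normal(t.size)).reshape(5, 200)
sdr_by_n = {N: average_sdr_db(frames, reconstruct_top_n(frames, N)) for N in (1, 5, 15)}
```

Because the numerator of (2.4) is the norm of the reconstruction rather than of the original frame, the SDR can be negative for very small $N$, which is consistent with the values reported in this section.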
A demonstration of the AM approximation is presented in Figure 2.4, where the top plot contains the reference spirometer signal. Figure 2.4(b) contains a plot of the quantized spirometer signal, $\tilde{\mathbf{u}} = [\tilde{\mathbf{u}}_1, \ldots, \tilde{\mathbf{u}}_F]$, obtained by retaining only the single largest DCT coefficient in each frame, i.e., $N = 1$. The SDR obtained for this signal is equal to -0.104 dB. The plot in Figure 2.4(c) shows the signal obtained via the AM approximation. In this case the carrier (breathing rate) component is obtained by setting the spirometer DCT coefficient with the largest absolute value (in each frame) to 1. The remaining $M - 1$ coefficients are set to 0. This carrier component is then multiplied with the ideal envelope obtained from the spirometer signal in Figure 2.4(a) to obtain the AM approximation displayed in Figure 2.4(c). The SDR in this case is equal to 5.483 dB; therefore, moving from the single DCT coefficient to the AM approximation results in a gain of about 5.6 dB. The AM approximation also uses only a single DCT coefficient; however, it additionally multiplies the resulting carrier with the envelope. Therefore, it seems that the AM approximation is a fair assumption. Note that use of the envelope here is only for demonstration purposes; for the SEC algorithm proposed in this chapter the envelope component is learnt from the spirometer signal during the training phase and predicted from electrode outputs during the test phase.

Figure 2.4: (a) Original spirometer signal. (b) Reconstructed signal using only 1 DCT coefficient; SDR = -0.104 dB. (c) Reconstructed signal using the AM approximation; SDR = 5.483 dB.
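A minimal sketch of the AM approximation described above follows. Several details are assumptions: the envelope is supplied by the caller (the text does not say how the ideal envelope is extracted), the sign of the dominant coefficient is preserved, the carrier is rescaled to unit peak amplitude so that the envelope alone sets the amplitude, and the DCT is taken to be the orthonormal DCT-II.

```python
import numpy as np

def dct_basis(M):
    """Orthonormal DCT-II matrix (assumed basis); its inverse is its transpose."""
    n = np.arange(M)
    k = n.reshape(-1, 1)
    T = np.sqrt(2.0 / M) * np.cos(np.pi * (2 * n + 1) * k / (2 * M))
    T[0, :] /= np.sqrt(2.0)
    return T

def am_approximation(u_frames, envelope_frames):
    """Per-frame carrier from the dominant DCT coefficient, scaled by a given envelope."""
    T = dct_basis(u_frames.shape[1])
    z = u_frames @ T.T                               # DCT coefficients, one row per frame
    rows = np.arange(z.shape[0])
    peak = np.argmax(np.abs(z), axis=1)              # dominant coefficient of each frame
    unit = np.zeros_like(z)
    unit[rows, peak] = np.sign(z[rows, peak])        # magnitude 1, sign kept (assumption)
    carrier = unit @ T                               # inverse DCT of the unit coefficient
    scale = np.max(np.abs(carrier), axis=1, keepdims=True)
    scale[scale == 0] = 1.0                          # guard against all-zero frames
    return envelope_frames * (carrier / scale)

# Synthetic check: a slowly varying envelope multiplying a single DCT basis function.
M = 200
T = dct_basis(M)
n = np.arange(2 * M)
envelope = (1.0 + 0.3 * np.sin(2 * np.pi * n / 400)).reshape(2, M)
carrier_true = T[8] / np.max(np.abs(T[8]))
u_frames = envelope * carrier_true
approx = am_approximation(u_frames, envelope)
```

On this synthetic input the approximation recovers the signal exactly, because the signal really is an envelope times one basis function; a real spirometer trace would only be matched approximately.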
2.3 Spirometer Signal Regression

The multi-lead plethysmography system employed for data collection consists of a total of 10 impedance electrodes strategically placed at different spatial locations over the patient's body. As mentioned previously, a spirometer was employed to capture the patient's true respiratory state. On average each subject's dataset consisted of 95,000 samples. Each dataset was split into 9 non-overlapping sets, out of which 8 were employed for training and 1 was used for testing at one time. A number of different regression techniques were tested to obtain the best estimate of the spirometer signal. The following subsections describe in detail some of the regression approaches that were employed. Due to space constraints it is not possible to plot all of the reconstructed test signals. Therefore, only 4 test signals for each regression technique are plotted here. For consistency and fair comparison the same set of test signals is plotted for all the regression methods presented in the following subsections. Results for all the datasets are summarized at the end in Tables 2.1 and 2.2. The critical parameters here are the signal's breathing rate and envelope. Therefore, to evaluate the quality of the estimated respiratory signal the following performance metrics are employed:

2.3.0.1 Average Breathing-Rate Error ($BR_{err}$)

The accuracy of the estimated respiration rate is evaluated by comparing the predicted signal with the reference spirometer signal. More specifically, both the estimated and reference signals are divided into 60 sec frames and the respiration rate is calculated by identifying the highest energy frequency component in their respective spectrograms. The spectrogram was evaluated using 60 sec long Gaussian windows with an overlap of 25 sec between successive windows. The average breathing rate error ($BR_{err}$), in breaths-per-minute (BPM), is then computed by first taking the absolute difference between the reference and estimated breathing-rate curves and then averaging over the total number of frames.
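The breathing-rate error metric just described can be sketched as follows. The 60 sec window and 25 sec overlap come from the text; the width of the Gaussian window (`sigma_frac`), the mean removal before the FFT and the 10 Hz sampling rate of the example are assumptions made here.

```python
import numpy as np

def breathing_rate_bpm(signal, fs, win_sec=60.0, overlap_sec=25.0, sigma_frac=0.25):
    """Peak-frequency breathing rate (BPM) of each Gaussian-windowed frame."""
    signal = np.asarray(signal, dtype=float)
    win = int(round(win_sec * fs))
    hop = int(round((win_sec - overlap_sec) * fs))
    n = np.arange(win)
    window = np.exp(-0.5 * ((n - (win - 1) / 2) / (sigma_frac * win)) ** 2)
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    rates = []
    for start in range(0, len(signal) - win + 1, hop):
        frame = signal[start:start + win]
        frame = frame - frame.mean()                 # remove any residual DC offset
        power = np.abs(np.fft.rfft(frame * window)) ** 2
        power[0] = 0.0                               # never report the DC bin
        rates.append(60.0 * freqs[np.argmax(power)]) # Hz -> breaths per minute
    return np.array(rates)

def br_error(reference, estimate, fs, **kwargs):
    """Mean absolute difference between the two breathing-rate curves, in BPM."""
    ref = breathing_rate_bpm(reference, fs, **kwargs)
    est = breathing_rate_bpm(estimate, fs, **kwargs)
    return float(np.mean(np.abs(ref - est)))

# Synthetic example: reference at 15 BPM, estimate at 18 BPM, sampled at 10 Hz.
fs = 10.0
t = np.arange(int(180 * fs)) / fs
reference = np.sin(2 * np.pi * 0.25 * t)
estimate = np.sin(2 * np.pi * 0.30 * t)
err = br_error(reference, estimate, fs)
```

With a 60 sec window the frequency resolution is 1/60 Hz, that is 1 BPM, so errors smaller than that cannot be resolved by this simple peak-picking approach.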
2.3.0.2 Envelope Correlation Coefficient ($E_\rho$)

The correlation coefficient is employed to quantify the relationship between the temporal variations of the envelopes of the reference and the estimated respiratory signals. In particular, this metric is critical for evaluation of test signals that contain a mixture of different respiratory states (such as the signal in Figure 1.2) where the envelope exhibits significant temporal variations.

2.3.1 Support Vector Regression (SVR)

The description of support vector machines in this section is based on Smola et al.'s tutorial [42]. Support vector machines (SVM) [43], [44] are amongst the most popular and widely applied tools in regression problems. We are given a set of input training vectors, $[\mathbf{x}_1, \ldots, \mathbf{x}_l]$, belonging to the input space $\mathcal{X}$ ($= \mathbb{R}^d$ in the current context), and corresponding training labels, given by $[y_1, \ldots, y_l] \in \mathbb{R}$. Support-Vector Regression attempts to find a function $f(\mathbf{x})$ that has at most $\varepsilon$ deviation from all training values, $y_i$, and is as flat as possible [42]. In a linear formulation the function $f(\mathbf{x})$ is assumed to have the following form:

$$f(\mathbf{x}) = \mathbf{w}^t\mathbf{x} + b \tag{2.5}$$

where $\mathbf{w} \in \mathcal{X}$ and $b \in \mathbb{R}$. For functions of the type in (2.5), flatness corresponds to seeking a small $\mathbf{w}$. This may be achieved by minimizing the norm of $\mathbf{w}$. Thus one can solve for $\mathbf{w}$ within the framework of convex optimization. However, it is possible that a function $f(\mathbf{x})$ that satisfies the $\varepsilon$-deviation constraint for all pairs $(\mathbf{x}_i, y_i)$ may not exist. Therefore, slack variables $\xi_i, \xi_i^*$ are introduced to tolerate some errors and make the optimization feasible. Minimization of $\|\mathbf{w}\|$ subject to the constraints discussed above can now be formulated as the following optimization problem [43]:

$$\min \; \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_{i=1}^{l}(\xi_i + \xi_i^*) \tag{2.6}$$

$$\text{subject to} \quad \begin{cases} y_i - \mathbf{w}^t\mathbf{x}_i - b \le \varepsilon + \xi_i \\ \mathbf{w}^t\mathbf{x}_i + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i, \xi_i^* \ge 0 \end{cases}$$

The trade-off between the error-tolerance and the flatness of $f(\mathbf{x})$ is controlled by the constant $C > 0$ [42]. Non-linear support vector regression operates by mapping the training instance $\mathbf{x}_i$ into a (generally higher dimensional) feature space $\mathcal{S}$ using the map $\Phi: \mathcal{X} \to \mathcal{S}$ and then applying the standard support vector regression algorithm.
The primal objective function of (2.6) can be solved more easily in its dual form by making use of a Lagrangian function. The dual of the primal in (2.6), generalized to the non-linear case, is given by:

$$\max \; -\frac{1}{2}\sum_{i,j=1}^{l}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)K(\mathbf{x}_i, \mathbf{x}_j) - \varepsilon\sum_{i=1}^{l}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i(\alpha_i - \alpha_i^*) \tag{2.7}$$

$$\text{subject to} \quad \sum_{i=1}^{l}(\alpha_i - \alpha_i^*) = 0 \quad \text{and} \quad \alpha_i, \alpha_i^* \in [0, C]$$

where $\alpha_i \ge 0$ and $\alpha_i^* \ge 0$ represent the Lagrange multipliers; $K(\mathbf{x}_i, \mathbf{x}_j) := \Phi(\mathbf{x}_i)^t\Phi(\mathbf{x}_j)$ is called the Kernel function. The dual in (2.7) depends only on the dot product in the feature space and can therefore be solved without explicitly computing $\Phi(\mathbf{x}_i)$. In the non-linear case, $\mathbf{w}$ and $f(\mathbf{x})$ take the following form:

$$\mathbf{w} = \sum_{i=1}^{l}(\alpha_i - \alpha_i^*)\Phi(\mathbf{x}_i) \tag{2.8}$$

$$f(\mathbf{x}) = \sum_{i=1}^{l}(\alpha_i - \alpha_i^*)K(\mathbf{x}_i, \mathbf{x}) + b \tag{2.9}$$

Therefore, after training, predictions using future, or test, data vectors can be made using equation (2.9). For SV regression the open-source LIBSVM toolbox [45] is employed. The kernel used is the radial basis function (RBF) kernel, $K(\mathbf{x}_i, \mathbf{x}) := \exp(-\gamma\|\mathbf{x}_i - \mathbf{x}\|^2)$. The dimension, $d$, of the input feature vectors is equal to 20. Each feature vector contains 10 electrode samples (one corresponding to each of the 10 electrodes) and an additional 10 values containing the delta coefficients of each electrode. The delta coefficients correspond to the 1st derivative of the electrode time-series outputs and are employed here to capture the temporal dependence on the preceding samples. In order to capture temporal dependence we also experimented with features based on the auto-regressive (AR) model; however, the results were not encouraging and are therefore not presented here. The parameter $C$ and the parameter $\gamma$ of the RBF kernel were selected by performing a grid-search on a large collection of test signals from different patients; the values that gave the best performance (in terms of the performance metrics described above) were selected for the rest of the simulations.
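The feature construction and the prediction rule (2.9) can be sketched as below. This is not the LIBSVM implementation used in the thesis: the backward first difference standing in for the "1st derivative", the helper names and the toy inputs are assumptions, and the dual coefficients $(\alpha_i - \alpha_i^*)$ are taken as given rather than trained.

```python
import numpy as np

def delta_features(electrodes):
    """Stack each electrode sample with its first difference: (T, 10) -> (T, 20)."""
    electrodes = np.asarray(electrodes, dtype=float)
    # Backward difference as a stand-in for the first derivative (assumption);
    # the first row is padded so that its delta is zero.
    delta = np.diff(electrodes, axis=0, prepend=electrodes[:1])
    return np.hstack([electrodes, delta])

def rbf_kernel(A, B, gamma):
    """K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))     # clip tiny negative round-off

def svr_predict(X_test, support_vectors, dual_coef, b, gamma):
    """Eq. (2.9): f(x) = sum_i (alpha_i - alpha_i^*) K(x_i, x) + b."""
    return rbf_kernel(X_test, support_vectors, gamma) @ dual_coef + b

# Toy input: 3 time samples from 10 electrodes.
raw = np.arange(30, dtype=float).reshape(3, 10)
features = delta_features(raw)

# One hypothetical support vector with dual coefficient 2 and offset 0.5.
sv = features[1:2]
prediction = svr_predict(features[1:2], sv, np.array([2.0]), 0.5, gamma=0.1)
```

Evaluating the kernel at the support vector itself gives $K = 1$, so the prediction in this toy case is simply the dual coefficient plus the offset.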
The estimated time series obtained via SVR for four test signals are shown in Figures 2.5(b)-2.8(b). The reference spirometer output is also shown for comparison. Respiratory signal estimation is generally more challenging at higher breathing rates; therefore, we selected test signals that contain instances of apnea, normal and accelerated breathing. Furthermore, to demonstrate the impact of motion artifacts, three out of the four test signals also contain regions where the subjects are performing a physical activity. For Test Signal-1 (Figure 2.5) SVR results in an overall $RR_{err} = 4.58$ BPM; however, the envelope correlation coefficient is only 0.338. Additionally, there seems to be significant degradation in the envelope and rate estimation due to motion artifacts when the subject is physically active. For Test Signal-2 the $RR_{err} = 17.61$ BPM, which is quite high, and there also seems to be noticeable degradation in the motion artifact region. For Test Signal-3 the $RR_{err}$ is almost zero; however, this signal does not contain any regions of physical activity and the envelope correlation coefficient is still quite low. Test Signal-4 follows a similar trend and there appears to be significant degradation in the region containing physical activity.

Figure 2.5: Test Signal-1; time series obtained from: (a) Reference spirometer, (b) SVR ($RR_{err}$ = 4.58 BPM, $E_\rho$ = 0.338), (c) GMR ($RR_{err}$ = 5.92 BPM, $E_\rho$ = 0.771), (d) DCT based estimation ($RR_{err}$ = 6.58 BPM, $E_\rho$ = 0.327) and (e) SEC ($RR_{err}$ = 2.79 BPM, $E_\rho$ = 0.989). Subject performing physical activity between 0 to 100 sec.

Therefore, there seems to be a significant margin for improvement and other alternatives must be investigated.
2.3.2 Gaussian Mixture Regression (GMR)

Given the harmonic nature of the respiratory signal, a Gaussian Mixture based approach seems to be a very appealing option. For example, Gaussian mixture models (GMMs) are one of the most widely employed methods in speech based applications [46], [47] where the underlying signal contains complex harmonic information. GMMs are based on the assumption that the underlying distribution of the data can be approximated by a multi-modal Gaussian distribution. A single instance of the feature vector, $\mathbf{x}$, at the input of the Gaussian mixture model is $d\,(= 20)$ dimensional and consists of the electrode outputs plus their corresponding delta coefficients (at one time sample). Thus the feature extraction block is identical to the one employed for SVM regression in section 2.3.1. The Gaussian mixture density of the $(d+1)$ dimensional multivariate random variable $[\mathbf{x}; y]$, obtained by concatenating $\mathbf{x}$ and the (1-dimensional) spirometer output $y$, is given by [48]:

$$p(\mathbf{x}, y) = \sum_{k=1}^{K}\pi_k\,\mathcal{N}([\mathbf{x}; y];\, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \tag{2.10}$$

where $K$ is the total number of Gaussian components, $\pi_k$ are non-negative mixing coefficients with $\sum_{k=1}^{K}\pi_k = 1$ and $\mathcal{N}(\cdot\,; \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$ represents a multi-variate Gaussian density. Furthermore, $\boldsymbol{\mu}_k$ denotes the mean vector and is given by:

$$\boldsymbol{\mu}_k = \begin{pmatrix} \boldsymbol{\mu}_{k\mathbf{x}} \\ \mu_{ky} \end{pmatrix} \tag{2.11}$$

Figure 2.6: Test Signal-2; time series obtained from: (a) Reference spirometer, (b) SVR ($RR_{err}$ = 17.61 BPM, $E_\rho$ = 0.572), (c) GMR ($RR_{err}$ = 9.38 BPM, $E_\rho$ = 0.712), (d) DCT based estimation ($RR_{err}$ = 13.83 BPM, $E_\rho$ = 0.384) and (e) SEC ($RR_{err}$ = 2.80 BPM, $E_\rho$ = 0.868). Subject performing physical activity between 0 to 100 sec.

Figure 2.7: Test Signal-3; time series obtained from: (a) Reference spirometer, (b) SVR ($RR_{err}$ = 0.396 BPM, $E_\rho$ = 0.417), (c) GMR ($RR_{err}$ = 16.21 BPM, $E_\rho$ = 0.846), (d) DCT based estimation ($RR_{err}$ = 16.21 BPM, $E_\rho$ = 0.343) and (e) SEC ($RR_{err}$ = 3.03 BPM, $E_\rho$ = 0.947). No physical activity at any time.
$\boldsymbol{\Sigma}_k$ represents the covariance and is given by:

$$\boldsymbol{\Sigma}_k = \begin{pmatrix} \boldsymbol{\Sigma}_{k\mathbf{xx}} & \boldsymbol{\Sigma}_{k\mathbf{x}y} \\ \boldsymbol{\Sigma}_{ky\mathbf{x}} & \Sigma_{kyy} \end{pmatrix} \tag{2.12}$$

After initialization using K-means clustering [49] the EM algorithm [50] is employed to find the parameters of the Gaussian mixture distribution (of equation (2.10)) that best fit the training data. The number of mixture components ($K$) is determined using the Bayesian Information Criterion (BIC).

The conditional distribution, $p_k(y \mid \mathbf{x})$, of a component $k$, of the spirometer output $y$ given the input feature vector $\mathbf{x}$ is determined by dividing the joint distribution, $p_k(\mathbf{x}, y)$, by the marginal distribution, $p_k(\mathbf{x})$ [48]:

$$p_k(y \mid \mathbf{x}) = \frac{p_k(\mathbf{x}, y)}{p_k(\mathbf{x})} = \mathcal{N}\left(y \mid \mu_{ky|\mathbf{x}},\, \Lambda_{kyy}^{-1}\right) \tag{2.13}$$

where $\Lambda_{kyy}$ is a submatrix of the matrix $\boldsymbol{\Lambda}_k = \boldsymbol{\Sigma}_k^{-1}$ given by:

$$\boldsymbol{\Lambda}_k = \begin{pmatrix} \boldsymbol{\Lambda}_{k\mathbf{xx}} & \boldsymbol{\Lambda}_{k\mathbf{x}y} \\ \boldsymbol{\Lambda}_{ky\mathbf{x}} & \Lambda_{kyy} \end{pmatrix} \tag{2.14}$$

The conditional mean, $\mu_{ky|\mathbf{x}}$, is given by:

$$\mu_{ky|\mathbf{x}} = \mu_{ky} - \Lambda_{kyy}^{-1}\boldsymbol{\Lambda}_{ky\mathbf{x}}(\mathbf{x} - \boldsymbol{\mu}_{k\mathbf{x}}) \tag{2.15}$$

During the test phase the spirometer output is predicted from the feature vectors using the Gaussian mixture distribution learnt during the training phase. More specifically, given a test vector $\mathbf{x}$, the estimate $\hat{y}$ of the spirometer output is equal to the expectation of the conditional distribution $p(y \mid \mathbf{x})$:

$$\hat{y} = E[y \mid \mathbf{x}] \tag{2.16}$$

Figure 2.8: Test Signal-4; time series obtained from: (a) Reference spirometer, (b) SVR ($RR_{err}$ = 4.80 BPM, $E_\rho$ = -0.041), (c) GMR ($RR_{err}$ = 4.91 BPM, $E_\rho$ = 0.646), (d) DCT based estimation ($RR_{err}$ = 7.81 BPM, $E_\rho$ = -0.081) and (e) SEC ($RR_{err}$ = 3.24 BPM, $E_\rho$ = 0.291). Subject performing physical activity between 100 to 180 sec.

The spirometer test signals estimated using Gaussian Mixture Regression (GMR) are plotted in Figures 2.5(c)-2.8(c). It seems that GMR, as compared to SVR, gives a better estimate of the envelope, as indicated by both the shape of the GMR estimates and the higher values of $E_\rho$. However, in terms of breathing rate estimation it seems that GMR also degrades in high breathing rate and motion artifact regions. Therefore, overall it seems that GMR does result in performance improvement over SVR, especially at normal respiration rates. However, its performance at higher respiration rates is not satisfactory and there is still margin for improvement.
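The prediction step of equations (2.13) to (2.16) can be sketched as follows, assuming the mixture parameters have already been fitted. Two points are assumptions rather than statements from the text: the per-component conditional means are combined with weights proportional to $\pi_k\,p_k(\mathbf{x})$, which is the usual way to evaluate (2.16) for a mixture, and the conditional mean is computed from covariance blocks, which is algebraically equal to the precision-matrix form (2.15). The numerical parameters are made up for the example.

```python
import numpy as np

def gmr_predict(x, weights, means, covs):
    """E[y | x] under a joint Gaussian mixture over [x; y] (y is the last coordinate)."""
    x = np.asarray(x, dtype=float)
    d = x.size
    log_resp, cond_means = [], []
    for pi_k, mu, S in zip(weights, means, covs):
        mu_x, mu_y = mu[:d], mu[d]
        S_xx, S_yx = S[:d, :d], S[d, :d]
        diff = x - mu_x
        sol = np.linalg.solve(S_xx, diff)            # S_xx^{-1} (x - mu_x)
        # Conditional mean of component k; equals eq. (2.15) in covariance form.
        cond_means.append(mu_y + S_yx @ sol)
        # log( pi_k * N(x; mu_x, S_xx) ): unnormalised weight of component k
        _, logdet = np.linalg.slogdet(S_xx)
        log_resp.append(np.log(pi_k) - 0.5 * (diff @ sol + logdet + d * np.log(2 * np.pi)))
    log_resp = np.array(log_resp)
    resp = np.exp(log_resp - log_resp.max())         # stabilised normalisation
    resp /= resp.sum()
    return float(resp @ np.array(cond_means))

# Made-up single-component example with a 2-dimensional x and scalar y.
mu = np.array([0.0, 0.0, 1.0])
S = np.array([[2.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.5]])
x_test = np.array([0.7, -0.2])
y_hat = gmr_predict(x_test, [1.0], [mu], [S])

# The same quantity via the precision-matrix form of eq. (2.15).
Lam = np.linalg.inv(S)
y_hat_precision = mu[2] - (Lam[2, :2] @ (x_test - mu[:2])) / Lam[2, 2]
```

The covariance form avoids inverting the full $(d+1) \times (d+1)$ matrix for every component, which is why it is used in this sketch.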
2.3.3 DCT Based Estimation

In general, it is difficult to optimize machine learning classifiers to give optimal performance in regression applications that have significant temporal correlation between neighboring samples. This is because the underlying theory, more often than not, assumes that the data points are independent and identically distributed. A potential solution to this problem can be to consider techniques such as Markov Models that explicitly account for the temporal dependence, or to employ a feature extraction front-end that outputs feature vectors that effectively capture temporal dependence. The addition of delta features in SVR and GMR did ameliorate the situation to some extent; however, there is still room for improvement. This subsection presents a very simple time-frequency technique that operates on temporal frames of finite length and estimates the spirometer signal on a frame-by-frame basis, in contrast to the sample-by-sample estimation approach employed by SVR and GMR. The output signal from each electrode is split into non-overlapping frames of length $M$ temporal samples. Different frame lengths were experimented with; $M = 200$ samples was found to give the best performance. Assuming that the number of frames in a given test signal equals $F$, let $\mathbf{X}_i$ denote the $(M \times N)$ matrix for the $i$-th frame, whose columns contain the time samples from the $N\,(= 10)$ electrodes. The DCT coefficient matrix $\mathbf{C}_i$ is then obtained by

$$\mathbf{C}_i = \mathbf{T}\mathbf{X}_i \quad \text{for } i = [1, \ldots, F] \tag{2.17}$$

where $\mathbf{T}$ represents the $(M \times M)$ DCT basis. In the next step the quantized coefficient matrix $\tilde{\mathbf{C}}_i$ is obtained by setting all, but the $P$ ( ˘ (2.22)

the largest magnitude element of c Tst i is assigned a magnitude of 1 (sign remains unchanged); the remaining elements are set to zero. If, on the other hand, condition (2.22) is not satisfied, then all coefficients are set to zero. The parameter was set to 0.15; its value was determined heuristically from the data. Figure 2.10(c) demonstrates the spirometer estimate obtained via the noisy frame suppression procedure outlined above.
Results for Test Signals-1 to 4 are displayed in Figures 2.5(e)-2.8(e). It can be observed that the performance improvement is significant. For Test Signal-1 SEC outperforms all other techniques by a large margin both in terms of envelope and breathing rate estimation. The envelope correlation coefficient for this signal is almost equal to 1, $RR_{err} = 2.80$ BPM. Additionally, SEC seems to be more robust to the effect of motion artifacts, as is apparent from its estimate between 0 to 100 sec. Both SVR and GMR suffer from degradation in this region. Similar trends are observed in Test Signal-2 as well. For Test Signal-3 the $RR_{err} = 3.03$ BPM for SEC is slightly higher than that achieved by using SVR; however, SEC is still significantly better in terms of envelope estimation. For Test Signal-4 the envelope correlation coefficient value is not the best amongst the four techniques mentioned in this work. However, this metric is not perfect and visually it seems that SEC estimation gives an acceptable performance. In terms of $RR_{err}$, SEC still gives the best performance and again seems to be more tolerant of motion artifacts than the other three approaches.

Table 2.1: Average Respiration Rate Error ($RR_{err}$) for 11 different human subjects.

Table 2.2: Envelope correlation Coefficient ($E_\rho$) for 11 different human subjects.

2.5 Results

Due to space constraints it is not possible to plot all the test signals, and the performance obtained by all methods for 11 different human subjects is summarized in Tables 2.1 and 2.2. As mentioned at the start of section 2.3, each patient's data was divided into 9 non-overlapping subsets. At one time a single subset was used as the test signal and the rest were used for training. In this fashion each subset was used as a test signal and compared
with the spirometer reference signal. The respiration rate and envelope performance metrics were computed; after this the subset was replaced in the training set, the next subset was selected, and the training and testing phases were repeated. Furthermore, we subdivided each subject's test signals into two groups depending on whether they contained Low or High respiration rates. For example, all signals such as Test Signal-1 were labeled as Low respiration rate signals whereas signals such as Test Signals-2, 3 and 4 were labeled as High respiration rate signals. Each metric listed in Table 2.1 and Table 2.2 was computed by averaging over all the Low or High breathing rate test signals for that particular human subject. For example, the average respiration rate error for subject-1 using the SEC is 1.789 BPM for low breathing rate test signals and 10.212 BPM for high breathing rate test signals.

It can be observed from Table 2.1 that in terms of respiration rate, the AM based SEC approach gives the best performance for the majority of subjects. Most practical applications require that the respiration rate error ($RR_{err}$) must be less than 10 BPM at all times. We employ a stricter threshold of 5 BPM here. At low breathing rates all approaches, except SVR, give low errors. However, overall SEC gives a more consistent performance and its $RR_{err}$ is never greater than 5 BPM, whereas $RR_{err}$ for GMR exceeds 5 BPM for 4 out of the 11 subjects. DCT and SVR estimation exceed the 5 BPM threshold for 1 and 6 subjects respectively. At high respiration rates SEC gives the best performance on average and its $RR_{err}$ exceeds the 5 BPM threshold for 3 subjects. SVR and GMR exceed the threshold for 11 and 8 subjects respectively, whereas the $RR_{err}$ for DCT based estimation exceeds 5 BPM for all but one of the 11 subjects. Overall, the values in Table 2.1 demonstrate a trend similar to that observed for Test Signals-1 to 4, i.e., GMR gives reasonable performance at low respiration rates; however, at high respiration rates it gives a reasonable performance in a few cases but completely misses the mark in the majority of cases. SEC, in contrast, delivers a much more consistent performance.
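The rotation of test subsets described above can be sketched as below. The text states only that the 9 subsets are non-overlapping; taking them as contiguous blocks of samples, and the function name `leave_one_subset_out`, are assumptions made for this illustration.

```python
import numpy as np

def leave_one_subset_out(n_samples, n_subsets=9):
    """Yield (train_idx, test_idx) pairs over contiguous, non-overlapping subsets."""
    subsets = np.array_split(np.arange(n_samples), n_subsets)
    for k, test_idx in enumerate(subsets):
        # Every subset except the k-th one is used for training in this round.
        train_idx = np.concatenate([s for j, s in enumerate(subsets) if j != k])
        yield train_idx, test_idx

# A dataset of about 95,000 samples, as quoted in section 2.3.
folds = list(leave_one_subset_out(95000))
```

Each sample appears in exactly one test subset, so averaging a metric over the nine rounds uses every part of a subject's recording once.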
Envelope correlation coefficients ($E_\rho$) are listed in Table 2.2. At high breathing rates, SEC gives the best performance for all subjects except subjects 6 and 9. For human subject-6 GMR performs slightly better; however, SEC also results in a very high correlation coefficient (0.918). For human subject-9, the envelope estimation performance for all 4 techniques is subpar; this may be due to extraordinarily high levels of noise or interference during data collection. At low breathing rates there is not a significant difference between the performance of the four techniques in terms of envelope estimation. On average DCT gives the best performance followed by the SEC. However, it is highlighted that at low breathing rates the signal envelope does not exhibit significant variations in shape. The majority of signals at low respiration rates are similar to Test Signal-1 and therefore have an almost flat envelope with an amplitude close to the subject's average lung volume. In these scenarios a few false peaks in the estimated envelope may result in significant variations in $E_\rho$. Consider for example Test Signal-1: although the correlation coefficient of the SEC estimate of this signal is lower than that of the GMR estimate, it can be observed visually that there is not a significant difference between the two envelopes. At high breathing rates, however, the reference signals exhibit significant variations in lung volume shape and the correlation coefficient gains much more importance in these regions. The results demonstrate that SEC delivers significant performance improvements over all other approaches.
CHAPTER 3
BREATHING RATE ESTIMATION USING KERNEL METHODS

The previous chapter discussed the SEC, which is a framework for estimating both the breathing rate and lung volume. The technique used for estimating the breathing rate in the SEC is quite simple. Although it gives reasonable performance, there is still room for improvement. In this chapter the emphasis is shifted primarily to the estimation of breathing rate alone. There are two primary reasons for this: (1) Breathing rate is considered to be much more important in clinical practice than lung volume; and (2) Improvement in rate estimation is bound to benefit lung volume estimation as well, since accurate estimation of breathing rate is critical for lung volume estimation using the SEC. To elaborate further, the breathing rate estimation approach proposed in this chapter can be used to replace the rate (carrier) estimation technique employed in section 2.4.2. In this chapter kernel machines are employed for robust breathing rate estimation. The best performing technique employs an innovative set of features constructed from the discrete wavelet transform to differentiate between various respiratory states, which enables the learning algorithm to adapt based on the underlying state (such as apnea, fast breathing or normal breathing).

A detailed analysis of the results obtained from the DCT filter based breathing rate estimation employed in the previous chapter reveals that (other than artifact regions) the most challenging respiratory states for rate estimation using impedance electrodes are hyperventilation and apnea, especially when electrode outputs are noisy. Two example cases are presented in Figures 3.1 and 3.2. Detection of breathing rate during hyperventilation is challenging primarily because the subject is taking very shallow breaths while breathing at a very rapid rate. Therefore, the changes in the lung volume are quite small and can be missed easily, especially when the electrode signal contains noise.

Figure 3.1: Breathing rate estimation during hyperventilation under high noise conditions. (a) Reference Spirometer. (b) SEC output. (c) Electrode-2 output.

Consider for example, the
respiratory signal shown in Figure 3.1(a); the subject starts breathing at an accelerated rate beyond the 30 second mark. Examination of only the spirometer signal seems to indicate that there should not be much difficulty in estimating breathing rates at any rate, because the amplitude of the spirometer signal during accelerated breathing is comparable to its amplitude under normal breathing conditions. However, the spirometer measures flow; our task is to estimate the breathing rate from the electrodes, which measure the lung volume directly, and the lung volume does not change significantly during shallow-breathing hyperventilation conditions. Therefore, when noise power is high it becomes very difficult to differentiate between noise and genuine breath signals. This can be observed from the plot of electrode-2 shown in Figure 3.1(c): the amplitude of the electrode output in the hyperventilation region (between 0.5 min and 2 min) is quite low; as a result a spectral energy based rate estimation algorithm may under- or over-estimate the breathing rate, as shown in Figure 3.1(b). Similarly, a high level of noise may also cause an apnea region to be falsely treated as a hyperventilation region, as shown in Figure 3.2(c).

Figure 3.2: Breathing rate estimation during apnea under high noise conditions. (a) Reference Spirometer. (b) SEC output. (c) Electrode-2 output.

Although the SEC employs noisy-frame suppression (described in section 2.4.2) to mitigate the impact of noise, it is dependent on the accurate estimation of the envelope and therefore may not work when envelope estimation is not accurate. Therefore, this chapter presents a technique which employs kernel machines for accurate estimation of breathing rate; results demonstrate improvements in performance.
3.1 Gini Kernel Machines for Breathing Rate Estimation

As discussed above, the breathing rate estimation algorithm employed by the SEC degrades in the presence of motion artifacts and high noise power. Training kernel machines on a dataset that contains example scenarios containing artifacts and noise may enable better estimation of the breathing rate. The training dataset contains a number of representative noise and artifact scenarios and also contains multiple electrode channels; therefore, it is anticipated that the kernel machine should be able to learn the mapping to the actual breathing signal even when some of the electrode channels are corrupted. The subsections that follow describe in detail the optimization techniques employed for training Gini kernel machines and how they can be used for breathing rate estimation. This optimization framework was presented in [51]; it is described here for convenience and is modified according to the demands of the application where necessary.

3.1.1 Supervised Learning Using Gini-Kernel Machines

In a general supervised learning framework the learner is trained using a set of feature vectors $\mathcal{T} \subset \mathcal{X} : \mathcal{T} = \{\mathbf{x}_i\},\ i = 1,\dots,N$, independently drawn from a fixed distribution $P(\mathbf{x})$, with $\mathbf{x} \in \mathcal{X}$. Furthermore, the learner is also provided with a set of soft (or hard) labels $y_{ik} = P(C_k | \mathbf{x}_i)$ that represent the conditional probability of observing class-$k$ given feature vector $\mathbf{x}_i$. The set of classes is discrete $(k \in [1,\dots,M])$ and the labels therefore are normalized to satisfy the condition $\sum_{k=1}^{M} y_{ik} = 1$. For breathing rate estimation the number of classes $M = 2$, with class-1 representing expiration and class-2 representing inspiration. The soft labels for each class are derived using the logistic transformation as described in section 3.1.2. The task of the learner is to search for useful patterns in the training data and use them to select a finite set of regression functions $\tilde{P} = \{\tilde{P}_k(\mathbf{x})\},\ k = 1,\dots,M$ that are accurate estimates of the true conditional probabilities $P(C_k | \mathbf{x})$. The learner accomplishes this by incorporating prior knowledge of the topology of the feature space using a distance metric $D_Q : \mathbb{R}^M \times \mathbb{R}^M \rightarrow \mathbb{R}$. In addition to $D_Q$, the learner also employs an agnostic (or non-informative) distance metric $D_I : \mathbb{R}^M \times \mathbb{R}^M \rightarrow \mathbb{R}$ which assumes no knowledge of the training set. Use of the agnostic prior enables enforcement of smoothness constraints on the regression function $\tilde{P}_k(\mathbf{x})$. Smoothing is consistent with the principles of maximum entropy [52] and is required because the prior labels $y_{ik}$ are based only on the training data; use of $D_Q$ alone results in an over-fitted solution that does not generalize well to unseen data. Therefore, the training procedure for estimation of the probability functions $\tilde{P} = \{\tilde{P}_k(\mathbf{x})\}$ is generally performed by minimization of a joint distance metric as below

$$\min_{\tilde{P}} G(\tilde{P}) = \min_{\tilde{P}} \left[ D_Q(Y, \tilde{P}) + \gamma D_I(\tilde{P}, U) \right] \tag{3.1}$$

where $Y \in \mathbb{R}^{N \times M}$ is a matrix of prior labels $y_{ik} = P(C_k | \mathbf{x}_i)$, with $i \in [1,\dots,N]$ and $k \in [1,\dots,M]$. $U$ denotes a uniform distribution given by $U_k(\mathbf{x}) = 1/M,\ \forall k = 1,\dots,M$. $\gamma > 0$ is a hyper-parameter that controls the trade-off between the prior $(D_Q)$ and agnostic $(D_I)$ distance metrics. The solution obtained by minimizing the cost function (3.1) is close to both the prior distribution $Y$, with respect to the distance metric $D_Q(\cdot,\cdot)$, and the agnostic (non-informative) uniform distribution $U$. The maximum entropy framework [52] also permits the imposition of linear constraints on the optimization problem (3.1). These constraints should be in terms of cumulative statistics defined on the training set. The first linear constraint imposes equality between the frequency of occurrence of each class $k = 1,\dots,M$ under the distribution $\tilde{P}$ and the equivalent measure under the prior distribution $y_{ik}$. This first constraint expresses equivalence between average estimated probabilities and empirical frequencies for each class over the training set

$$\sum_{i=1}^{N} \tilde{P}_k(\mathbf{x}_i) = \sum_{i=1}^{N} y_{ik}, \quad k = 1,\dots,M \tag{3.2}$$

The underlying assumption here is that all features $\mathbf{x} \in \mathcal{X}$ are equally likely. The normalization and boundary conditions for valid probability distributions are expressed using a second set of linear constraints

$$\tilde{P}_k(\mathbf{x}) \geq 0, \quad k = 1,\dots,M, \tag{3.3}$$

$$\sum_{k=1}^{M} \tilde{P}_k(\mathbf{x}_i) = 1 \tag{3.4}$$
where the normalizing equality constraint subsumes the additional inequality constraint $\tilde{P}_k(\mathbf{x}) \leq 1,\ k = 1,\dots,M$.

Figure 3.3: Maximum entropy regression for supervised learning; the square region represents the constraint space. (a) $\gamma \rightarrow \infty$: Solution is the projection of $U$ onto the constraint space. (b) $\gamma = 0$: Solution $\tilde{P}$ is equal to $Y$. (c) Non-extreme values of $\gamma$: Solution $\tilde{P}$ lies at a location within the constraint space that minimizes the total distance to the prior distribution $Y$ and the agnostic distribution $U$.

An illustration of the solution to the optimization problem in equation (3.1) is shown in Figure 3.3. For the purpose of illustration the linear constraints (3.2), (3.3) and (3.4) are represented by the shaded square region. As a result any solution to (3.1) must lie within or at the boundary of the constraint space. The proximity of the learned distribution $\tilde{P}$ to the prior empirical distribution $Y$ is determined by the distance $D_Q(Y, \tilde{P})$. The distance $D_I(\tilde{P}, U)$ defines an agnostic model which assumes zero prior knowledge. This framework is similar to the maximum entropy approach [52, 53]. Note that the prior distribution $Y$ lies within the constraint space whereas the agnostic distribution $U$ will lie outside the constraint space under non-degenerate conditions. The location of the solution $\tilde{P}$ with respect to the prior $Y$ and agnostic $U$ distributions is influenced by the value of the hyper-parameter $\gamma > 0$. This parameter also determines the generalization performance and sparsity of classifiers defined by $\tilde{P}$. As demonstrated in Figure 3.3, for $\gamma = 0$ the solution overlaps with the prior distribution $Y$, resulting in over-fitting of the training set. When $\gamma \rightarrow \infty$, the maximum entropy solution is achieved, which is the projection of the agnostic distribution $U$ on the constraint space.
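After application of first order Karush-Kuhn-Tucker (KKT) conditions [54] the optimization problem in (3.1) along with the constraints (3.2) to (3.4) can be represented by a Lagrangian function $\mathcal{L}$

$$\mathcal{L}(G, b_k, \beta_k, z) = G(\tilde{P}) - b_k \sum_{i=1}^{N} \left[ \tilde{P}_k(\mathbf{x}_i) - y_{ik} \right] - \beta_k \tilde{P}_k(\mathbf{x}) - z \left( 1 - \sum_{k=1}^{M} \tilde{P}_k(\mathbf{x}) \right) \tag{3.5}$$

Here $b_k$ and $\beta_k$ represent Lagrange multipliers corresponding to the frequency constraints (3.2) and the inequality constraints (3.3) respectively. $z(\mathbf{x})$ corresponds to the Lagrange multiplier for the normalization constraint (3.4). Minimization with respect to the probability function $\tilde{P} = \{\tilde{P}_k(\mathbf{x})\}$ can be achieved by taking the gradient and setting the result to zero.

$$\gamma \frac{\partial D_I(\tilde{P}, U)}{\partial \tilde{P}_k(\mathbf{x})} = -\frac{\partial D_Q(Y, \tilde{P})}{\partial \tilde{P}_k(\mathbf{x})} + b_k - z(\mathbf{x}) + \beta_k(\mathbf{x}) \tag{3.6}$$

For simplicity it is assumed that $D_I(\tilde{P}, U)$ has a form that can be decomposed into independent and identically distributed (i.i.d.) components. Furthermore, a quadratic formulation is employed as a distance metric. This leads to the following form for $D_I(\tilde{P}, U)$

$$D_I(\tilde{P}, U) = \sum_{k=1}^{M} \sum_{i=1}^{N} \frac{1}{2} \left[ \tilde{P}_k(\mathbf{x}_i) - U_{ik} \right]^2 \tag{3.7}$$

The agnostic distribution $U_{ik}$ $(= 1/M)$ is uniform, and upon substitution (using the normalization constraint (3.4)) yields the following form

$$D_I(\tilde{P}, U) = \frac{1}{2} \sum_{k=1}^{M} \sum_{i=1}^{N} \tilde{P}_k(\mathbf{x}_i)^2 - \frac{N}{2M} \tag{3.8}$$

The Lagrange function in (3.5) can now be rearranged to give the class-conditional probability for any vector $\mathbf{x}$

$$\tilde{P}_k(\mathbf{x}) = \frac{1}{\gamma} \left[ -\frac{\partial D_Q(Y, \tilde{P})}{\partial \tilde{P}_k(\mathbf{x})} + b_k - z(\mathbf{x}) + \beta_k(\mathbf{x}) \right] \tag{3.9}$$

There are a number of choices for $D_Q(\cdot,\cdot)$, the prior distance metric; amongst the most popular is the quadratic distance, which is employed widely in kernel methods [55] and in Bayesian methods (as covariance functions) [56]. For two distributions $\hat{P} = \{\hat{P}_k(\mathbf{x})\}$ and $\tilde{P} = \{\tilde{P}_k(\mathbf{x})\}$ the quadratic distance is given by

$$D_Q(\hat{P}, \tilde{P}) = \frac{C}{2} \sum_{k=1}^{M} \sum_{\mathbf{x}, \mathbf{v} \in \mathcal{T}} K(\mathbf{x}, \mathbf{v}) \left[ \hat{P}_k(\mathbf{x}) - \tilde{P}_k(\mathbf{x}) \right] \left[ \hat{P}_k(\mathbf{v}) - \tilde{P}_k(\mathbf{v}) \right] \tag{3.10}$$

where $K : \mathbb{R}^M \times \mathbb{R}^M \rightarrow \mathbb{R}$ represents a symmetric, positive definite kernel that satisfies Mercer's criterion$^1$. Some popular kernels employed in machine learning applications include the Gaussian radial basis function and polynomial splines [55, 57]. The kernel $K(\mathbf{x}, \mathbf{v})$ quantifies the topology of the metric space for the points $\mathbf{x}, \mathbf{v} \in \mathcal{X}$ and therefore embeds prior knowledge into the distance $D_Q(\cdot,\cdot)$. The gradient of the quadratic distance $D_Q(\cdot,\cdot)$ of (3.10), evaluated with $\hat{P} = Y$, with respect to $\tilde{P}_k(\mathbf{x})$ is given by

$$\frac{\partial D_Q(Y, \tilde{P})}{\partial \tilde{P}_k(\mathbf{x})} = -C \sum_{\mathbf{v} \in \mathcal{T}} K(\mathbf{x}, \mathbf{v}) \left[ \hat{P}_k(\mathbf{v}) - \tilde{P}_k(\mathbf{v}) \right] \tag{3.11}$$

To avoid complexity in notation, $\mathbf{v}$ in equation (3.11) is rewritten as an indexed vector $\mathbf{x}_i$ to give:

$$\frac{\partial D_Q(Y, \tilde{P})}{\partial \tilde{P}_k(\mathbf{x})} = -C \sum_{i=1}^{N} K(\mathbf{x}, \mathbf{x}_i) \left[ \hat{P}_k(\mathbf{x}_i) - \tilde{P}_k(\mathbf{x}_i) \right] \tag{3.12}$$

The first order conditions (3.9) for the quadratic form $D_Q(\cdot,\cdot)$ in equation (3.10) can now be rewritten as

$$\tilde{P}_k(\mathbf{x}) = \frac{1}{\gamma} \left[ f_k(\mathbf{x}) - z(\mathbf{x}) + \beta_k(\mathbf{x}) \right] \tag{3.13}$$

where

$$f_k(\mathbf{x}) = \sum_{i=1}^{N} \lambda_i^k K(\mathbf{x}_i, \mathbf{x}) + b_k$$

with inference parameters

$$\lambda_i^k = C \left[ y_{ik} - \tilde{P}_k(\mathbf{x}_i) \right].$$

The Lagrange parameter function $\beta_k(\mathbf{x})$ in equation (3.13) needs to ensure that the probability scores $\tilde{P}_k(\mathbf{x}) \geq 0\ \forall \mathbf{x} \in \mathcal{X}$ according to (3.3), and the Lagrange parameter function $z(\mathbf{x})$ needs to ensure normalized probabilities $\sum_{k=1}^{M} \tilde{P}_k(\mathbf{x}) = 1$ according to (3.4).

$^1$ $K(\mathbf{x}, \mathbf{v}) = \Phi(\mathbf{x}) \cdot \Phi(\mathbf{v})$. There is no need to explicitly compute the map $\Phi(\cdot)$ since it only appears in inner-product form.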
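The roles of the two Lagrange parameter functions in equation (3.13) can be made concrete with a small sketch. It relies on an assumption that is not spelled out above: by complementary slackness, the multiplier of constraint (3.3) is nonzero only where the probability is zero, so it can be replaced by a positive part, and the normalization multiplier becomes a scalar `z` chosen so that the scores sum to one. The name `gamma` stands for the trade-off hyper-parameter of (3.1); the Gaussian kernel width `sigma`, the toy training points and the inference parameters `lam` are hypothetical values chosen for illustration, not trained quantities.

```python
import math

def rbf_kernel(x, v, sigma=1.0):
    """Gaussian radial basis function kernel K(x, v)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def class_scores(x, train_x, lam, b, sigma=1.0):
    """f_k(x) = sum_i lam[i][k] * K(x_i, x) + b[k] for every class k."""
    k_vals = [rbf_kernel(xi, x, sigma) for xi in train_x]
    return [sum(lam[i][k] * k_vals[i] for i in range(len(train_x))) + b[k]
            for k in range(len(b))]

def normalize(f, gamma):
    """Return P_k = max(f_k - z, 0) / gamma with z chosen so that sum_k P_k = 1."""
    ordered = sorted(f, reverse=True)
    running, z = 0.0, ordered[0] - gamma
    for m, f_m in enumerate(ordered, start=1):
        running += f_m
        z_m = (running - gamma) / m   # z if exactly the m largest scores stay positive
        if f_m <= z_m:                # the m-th score would be clipped: stop
            break
        z = z_m
    return [max(f_k - z, 0.0) / gamma for f_k in f]

# Hypothetical two-sample, two-class toy problem (illustrative values only).
train_x = [(0.0,), (1.0,)]
lam = [[0.5, -0.5], [-0.5, 0.5]]   # rows: samples i, columns: classes k
b = [0.5, 0.5]
f = class_scores((0.0,), train_x, lam, b)
probs = normalize(f, gamma=1.0)
print(probs)
```

Increasing `gamma` in `normalize` pulls the output toward the uniform distribution, which mirrors the behaviour described for large values of the hyper-parameter in Figure 3.3.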
The procedure for obtaining the set of inference parameters $\lambda = \{\lambda_i^k\},\ i = 1,\dots,N,\ k = 1,\dots,M$ entails solving (3.1) over the set of training data $\mathcal{T}$. Expressing the quadratic distance $D_Q(Y, \tilde{P})$ in equation (3.10) in terms of the inference parameters $\lambda_i^k$ and substituting back in the cost function (3.1) along with the agnostic distance $D_I(\tilde{P}, U)$ (3.8) leads to a dual formulation for the Gini SVM cost function

$$H_g = \sum_{k=1}^{M} \left[ \frac{1}{2C} \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_i^k Q_{ij} \lambda_j^k + \frac{\gamma}{2} \sum_{i=1}^{N} \left( y_{ik} - \lambda_i^k / C \right)^2 \right] \tag{3.14}$$

where $Q_{ij} = K(\mathbf{x}_i, \mathbf{x}_j)$ denote elements of the kernel matrix $Q$. The constant term $(= N/2M)$ in the agnostic distance $D_I(\cdot,\cdot)$ of equation (3.8) has been discarded since it has no effect on the minimization. As is the case for the primal (3.1), minimization of the dual $H_g$ should also be performed while ensuring that the linear constraints (3.2)-(3.4) are satisfied when searching for solutions. The constraints rewritten in terms of the inference parameters are as below

$$\sum_{k=1}^{M} \lambda_i^k = 0,\ i = 1,\dots,N; \qquad \sum_{i=1}^{N} \lambda_i^k = 0,\ k = 1,\dots,M; \qquad \lambda_i^k \leq C y_{ik}. \tag{3.15}$$

The solution to the Gini dual in equation (3.14) subject to the constraints (3.15) can now be obtained using standard quadratic optimization techniques that are available in several packages.

3.1.2 Probabilistic Labeling of Respiratory Data

The Gini kernel machine framework described above estimates the probabilities of a discrete set of classes. As a result the respiratory signal that is to be estimated must be converted into probabilities. This can be accomplished by viewing the estimation of the breathing signal as a two class problem with class-1 representing expiration (or exhalation) and class-2 representing inspiration (or inhalation). Furthermore, the exact value of the respiratory (flow) curve obtained from the spirometer can be mapped to soft probability labels using the logistic transform. Consider for example the spirometer output shown in Figure 3.4(a). Here positive values represent expiration (or class-1) and negative values represent inspiration (or class-2). The probability $y_{i1} = P(C_1 | \mathbf{x}_i)$ can be obtained by

$$y_{i1} = \frac{1}{1 + \exp(-c \phi_i)} \tag{3.16}$$

where $c$ is a constant controlling the shape (or saturation) of the logistic curve and $\phi_i$ is the value of the spirometer signal at the $i$-th time sample. Since there are only two classes the probability $y_{i2} = 1 - y_{i1}$. Probability labels obtained by application of the logistic transform of equation (3.16), with $c = 20$, to the spirometer signal in Figure 3.4(a) are shown in Figures 3.4(b) and (c).

Figure 3.4: Probabilistic transformation of respiratory signal. (a) Reference spirometer output; positive values of flow indicate expiration, negative values indicate inspiration. (b) Plot of $y_{i1} = P(C_1 | \mathbf{x}_i)$ (expiration probability) versus time. (c) Plot of $y_{i2} = P(C_2 | \mathbf{x}_i)$ (inspiration probability) versus time.

3.1.3 Results: Gini Kernel Machine

For evaluation of respiration rate error the evaluation criterion is made more stringent. In chapter 2 a temporal window of 60 seconds with an overlap of 25 seconds was employed. This is now reduced to a window length of 10 seconds with an overlap of 5 seconds. A shorter window implies that even errors of very short duration will be taken into consideration when evaluating the breathing rate error. The output of the Gini-SVR is denoised using the Daubechies wavelet. The window length employed for denoising is 20 seconds. As done in the previous chapter, data sessions are grouped into two categories: (1) Artifact Sessions (or subsets) during which the subject is mobile but breathing at a normal rate; and (2) Accelerated Breathing and Apnea Sessions during which the subject is stationary and breathing at an accelerated rate or holding breath; these sessions do not contain motion artifacts. The average breathing rate error for all subjects is presented in Table 3.1. Also presented in Table 3.1 is the breathing rate error obtained when employing only a single impedance-plethysmographic electrode sensor. Electrode-4 is selected for comparison
because it gives the lowest breathing rate error amongst all the 10 individual sensors (as illustrated in Figure 2.2).

Table 3.1: Average Respiration Rate Error ($RR_{err}$) in BPM for different human subjects. Errors are computed over 10 second windows with 5 second overlaps.

            Artifact Sessions           Accelerated Breathing & Apnea Sessions
Subject     Elec-4    SEC      Gini     Elec-4    SEC      Gini
1           4.54      2.29     1.81     11.67     8.67     15.29
2           5.65      2.04     1.93     12.41     5.78     15.74
3           3.07      19.06    1.90     10.52     13.11    12.25
4           7.93      5.76     3.60     16.52     2.34     8.74
5           7.23      3.66     3.13     15.99     4.65     4.13
6           5.06      5.10     3.23     4.93      5.18     12.94
7           8.41      8.33     2.54     28.58     32.85    16.03
8           12.08     4.63     3.75     12.49     3.90     16.49
9           2.73      2.77     2.88     3.70      3.50     5.85
10          4.77      3.77     2.57     12.15     6.59     15.34
11          6.26      2.75     2.27     9.95      9.64     15.56
12          6.19      2.83     2.47     9.52      9.68     14.15
13          8.86      3.38     2.48     8.42      7.77     10.10
14          3.14      2.94     2.72     7.93      11.30    13.61
15          3.39      3.18     2.80     5.10      7.42     6.29
16          3.29      2.69     2.54     10.14     9.17     15.56
17          5.64      4.42     3.49     8.92      3.32     16.74
18          6.11      3.82     2.95     7.06      3.54     7.79
19          4.60      4.79     4.20     6.37      2.77     11.01
Mean        5.73      4.62     2.80     10.65     7.96     12.29

Figure 3.5: WA-Gini block diagram. The top plot illustrates the main steps of the respiratory state detector (see equations (3.17) to (3.19)).

Results demonstrate that use of the Gini kernel machine results in significant performance improvements in artifact sessions, where the average reduction in breathing rate error is almost equal to 2 BPM when compared to the SEC. The Gini kernel machine outperforms both electrode-4 and SEC in artifact sessions for all subjects except one (subject-9). Unfortunately, performance degrades in accelerated breathing and apnea sessions, where the mean error increases from 7.96 BPM to 12.29 BPM when compared to the SEC. The problems that cause this performance degradation are discussed and remedied in the next section.
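The windowed evaluation protocol used for Table 3.1 (10 second windows with 5 second overlaps) can be sketched as follows. The windowing follows the text; the per-window rate estimator, which counts upward zero crossings of the mean-removed signal, is only an assumed stand-in, since the estimator used for scoring is not specified here. The sampling frequency and the sinusoidal test signals are likewise hypothetical.

```python
import math

def windows(n_samples, fs, win_s=10.0, overlap_s=5.0):
    """Yield (start, stop) sample indices of overlapping analysis windows."""
    win = int(round(win_s * fs))
    hop = int(round((win_s - overlap_s) * fs))
    start = 0
    while start + win <= n_samples:
        yield start, start + win
        start += hop

def rate_bpm(segment, fs):
    """Breaths per minute from upward zero crossings (assumed estimator)."""
    mean = sum(segment) / len(segment)
    centered = [s - mean for s in segment]
    crossings = sum(1 for a, b in zip(centered, centered[1:]) if a < 0.0 <= b)
    return crossings * 60.0 * fs / len(segment)

def mean_rate_error(reference, estimate, fs):
    """Average absolute rate difference (BPM) over all analysis windows."""
    errors = [abs(rate_bpm(reference[a:b], fs) - rate_bpm(estimate[a:b], fs))
              for a, b in windows(len(reference), fs)]
    return sum(errors) / len(errors)

fs = 10.0  # hypothetical sampling frequency in Hz
# 60 s of synthetic breathing at 0.3 Hz (18 BPM) and at 0.5 Hz.
ref = [math.sin(2 * math.pi * 0.3 * i / fs + 0.3) for i in range(600)]
est = [math.sin(2 * math.pi * 0.5 * i / fs + 0.3) for i in range(600)]
print(mean_rate_error(ref, ref, fs), mean_rate_error(ref, est, fs))
```

Because every window contributes equally to the average, a brief misestimate that would be diluted in a 60 second window now shows up in at least one 10 second window, which is what makes the shorter window the stricter criterion.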
3.2 Wavelet Adaptive Gini Kernel Machines

Performance degradation of the Gini kernel machine in accelerated breathing and apnea can be attributed to two factors: (1) Training data is slightly unbalanced because there are relatively more instances of each subject breathing normally than there are of accelerated breathing and apnea. As a result, the learning algorithm is biased towards normal breathing rates. (2) Denoising suppresses or eliminates high frequencies in regions containing accelerated breathing. To overcome these problems an additional respiratory state detection block is added on top of the Gini kernel machine algorithm. Two enhancements are made to the Gini kernel machine regression framework discussed in the preceding section. First, wavelet filtering is employed to accurately identify accelerated-breathing and apnea regions. Second, a separate Gini kernel machine trained just on accelerated breathing data is also added. This means that the framework now has two Gini kernel machines: one for normal breathing and one for accelerated breathing. Upon detection of an accelerated region, the algorithm switches to the accelerated breathing Gini. This algorithm is titled the Wavelet-Adaptive-Gini (or WA-Gini) and is illustrated in Figure 3.5. The respiratory state detector employs wavelet filtering for detecting the presence of accelerated breathing. The Gini-selector block switches between the accelerated and normal-breathing Ginis based on the output of the respiratory state detector. For comparison purposes, results for a much simpler respiratory state detector, which employs the Discrete Cosine Transform (DCT), are also presented in section 3.2.4.
3.2.1 Respiratory State Detection using Wavelet Filters

Knowledge of the subject's respiratory state can enable further enhancement of the performance of the learning algorithm. For example, if the learning algorithm knows with high confidence that the subject is breathing normally then it knows that the high frequencies in the electrodes are caused by noise or interference and should be attenuated. Similarly, knowledge that the subject is breathing at an accelerated rate should trigger the inverse process, resulting in amplification of high frequencies and suppression of lower frequencies. The SEC rate estimation does utilize respiratory state detection to differentiate between hyperventilation and apnea regions; however, it makes a hard decision. The respiratory state in a frame is classified as either accelerated breathing or apnea depending on the energy level in the high frequency bands. There are a number of disadvantages to using this approach, the primary being constant resolution in the spectral and temporal domains. Use of fixed length time windows implies that a wrong decision will impact the entire frame. The single Gini approach operates at the other end of the spectrum: it has much higher temporal resolution since it predicts the value of each individual sample, but as a consequence a few noisy samples in its output can increase the rate estimation error. Another problem arises from the fact that the physiological breathing rates are located well below the sampling rate ($< 0.05 f_s$), making it difficult to design traditional filters that can differentiate between the various frequency bands spanning the physiological frequencies.
Fortunately, tools like wavelet based filtering provide an efficient way to achieve high frequency resolution at lower frequency bands. The discrete wavelet transformation (DWT) of a signal of length $N$ can be obtained efficiently via multi-resolution analysis (MRA) proposed by Mallat [61]. The MRA based DWT for three decomposition levels is illustrated in Figure 3.6. MRA can be considered to have a tree like structure where at each level the input is filtered using a low-pass filter (LPF), $h[n]$, and a high-pass filter (HPF), $g[n]$. The impulse response of each filter depends on the mother wavelet being employed. The LPF $h[n]$ at each level retains the lower half of the input frequency band and discards the upper half, whereas the HPF $g[n]$ does the opposite and retains only the upper half of the frequency band. For example, at level-1 the highest frequency in the input signal is $\pi$ rad/sample (corresponding to half of the sampling frequency in Hz). The output of the level-1 LPF $h[n]$ therefore contains the lower half of the input frequency band, $[0, \pi/2]$, whereas the output of the corresponding HPF, $g[n]$, covers the upper half, the $[\pi/2, \pi]$ frequency band. Since the output of each filter contains half of the original frequency band, half of the time samples in the output become redundant and therefore can be discarded by down-sampling by a factor of 2. As more levels are added to the tree the frequency resolution starts to increase, doubling at each stage.

Figure 3.6: Wavelet decomposition using multi-resolution analysis.
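The filter-and-downsample tree described above can be sketched in a few lines. For brevity the sketch uses the Haar wavelet (the two-tap, simplest member of the Daubechies family) rather than the longer Daubechies filters used in this chapter; that substitution, the function names and the test signal are assumptions made only to keep the example self-contained.

```python
import math

def haar_step(x):
    """One MRA stage: low-pass and high-pass filter, then keep every second sample."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (x[i] + x[i + 1]) for i in range(0, len(x) - 1, 2)]  # LPF h[n] branch
    detail = [s * (x[i] - x[i + 1]) for i in range(0, len(x) - 1, 2)]  # HPF g[n] branch
    return approx, detail

def haar_dwt(x, levels):
    """Return the final approximation and the detail coefficients d_1 ... d_levels."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, d = haar_step(approx)  # only the low-pass branch is split again
        details.append(d)
    return approx, details

signal = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]  # illustrative samples
approx, details = haar_dwt(signal, 3)
print([len(d) for d in details], approx)
```

Each stage halves the number of samples, so level-$l$ detail coefficients are available at a rate of $f_s / 2^l$, and because the transform is orthonormal the total energy of the coefficients equals that of the input.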
The enhanced frequency resolution achieved by the application of DWT can be very useful for differentiating between the various respiratory states which are located very close to each other in the frequency domain. An illustration of this is presented in Figure 3.7, where the top plot contains the reference spirometer signal of a subject in different respiratory states. Decomposition levels $d_3$, $d_5$, $d_6$ and $d_8$ are shown in the lower plots. Lower wavelet decomposition levels correspond to higher frequency bands whereas higher decomposition levels represent lower frequencies. It can be seen that the $d_3$ coefficients have the highest value in shallow high breathing (or hyperventilation) regions and are almost zero in normal and slow breathing regions. This means that the decomposition coefficients $d_3$ may be employed to differentiate hyperventilation regions from normal and slow breathing regions. Similarly, decomposition $d_5$ seems to accurately indicate the presence of Deep High breathing regions, whereas decompositions $d_6$ and $d_8$ seem to indicate the presence of normal shallow breathing regions or respiratory states. The next section describes how probabilistic curves can be computed from the multiple electrode signals to identify different respiratory states.

Figure 3.7: Reference spirometer signal $y(t)$ and its corresponding Daubechies-Wavelet [62] details at different levels.
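The correspondence between decomposition level and breathing rate can be made explicit with a small calculation. Under the idealized half-band view of MRA, the level-$l$ detail covers frequencies between $f_s / 2^{l+1}$ and $f_s / 2^{l}$. The sketch below converts these band edges to breaths per minute; the sampling frequency of 16 Hz is a hypothetical value used only for illustration, and real wavelet filters overlap somewhat at the band edges.

```python
def detail_band_bpm(level, fs_hz):
    """Approximate (low, high) band edges of detail level `level`, in breaths per minute."""
    high_hz = fs_hz / 2 ** level
    low_hz = fs_hz / 2 ** (level + 1)
    return low_hz * 60.0, high_hz * 60.0

fs_hz = 16.0  # hypothetical sampling frequency, not taken from the text
for level in (3, 4, 5, 6, 7, 8):
    low, high = detail_band_bpm(level, fs_hz)
    print(f"d{level}: {low:.2f} to {high:.2f} BPM")
```

With this assumed sampling frequency the band of each level is half as wide as that of the level before it, which is why neighbouring respiratory states that are hard to separate with fixed-resolution filters fall into distinct detail levels.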
3.2.1.1 Region Score Computation

Ideally the identification of any respiratory state or event should be based on a probabilistic measure, as it would enable soft decisions, making the overall framework flexible. The advantages of such a setup will become clear as its details are examined in the forthcoming discussion. Figure 3.7 demonstrates that different respiratory states are easy to identify at distinct decomposition levels; for instance, hyperventilation is most easily identifiable at level-3 whereas slow-deep-breathing lies at level-8. Therefore, a measure for detection of a certain respiratory state can be constructed from the values of the detail coefficients of the wavelet decomposition level at which it occurs. However, the wavelet decompositions in Figure 3.7 are derived from the spirometer signal; the actual measures will need to be based on the outputs of multiple electrode channels. The top plot of Figure 3.8 displays the output of electrode-1 corresponding to the spirometer signal shown in Figure 3.7. Note the high level of noise in $d_3$ shown in the bottom plot. This explains why it has been so challenging to extract the breathing rate during hyperventilation so far. There exists a significant amount of noise in the frequency band containing hyperventilation rates. The problem is further aggravated by the fact that the lung volume changes are very small in this state, making the energy of the desirable frequencies quite low. Fortunately it seems that the energy is large enough to be detectable, since the wavelet details are higher in the hyperventilation region than they are in the other states. In addition to this, use of multiple electrode channels can also salvage the situation.

Figure 3.8: Electrode-1 output $x_1(t)$ and its corresponding Daubechies-Wavelet [62] details at different levels.

We now derive two probabilistic scores based on the wavelet detail curves that enable us to
make decisions as to whether the underlying respiratory state is normal (or low) breathing, hyperventilation or apnea. A pleasantly surprising bonus outcome of using these score curves is that, in addition to respiratory state detection, they also enable detection of motion artifacts with very high temporal resolution and accuracy. Given $d_{lm}(t)$, the wavelet detail curve corresponding to level-$l$ of electrode-$m$, we first extract its envelope:

$$d'_{lm}(t) = \mathrm{Envelope}[d_{lm}(t)] \tag{3.17}$$

The envelope is employed here because the intent is to capture the slow temporal variations of the wavelet curve. For a given test signal we obtain a distance measure $r_{lm}(t)$ as below:

$$r_{lm}(t) = d'_{lm}(t) - E_{trn}[d'_{lm}(t)] \tag{3.18}$$

Here $E_{trn}[d'_{lm}(t)]$ represents the expected value of $d'_{lm}(t)$ in the training data. $r_{lm}(t)$ is a simple distance measure that indicates deviation of the wavelet detail-$l$ (of electrode-$m$) above or below its mean value in the training data. $r_{lm}(t)$ can be converted into a probabilistic measure using the logistic function:

$$p_{lm}(t) = \frac{1}{1 + \exp(-c\, r_{lm}(t))} \tag{3.19}$$

where $c$ is a constant controlling the steepness of the curve$^2$. Positive values of $r_{lm}(t)$ are transformed to $p_{lm}(t) \in (0.5, 1]$ whereas negative values of $r_{lm}(t)$ result in $p_{lm}(t) \in [0, 0.5)$. $p_{lm}(t) = 0.5$ when $r_{lm}(t) = 0$. As illustrated in Figures 3.7 and 3.8, the high or accelerated breathing rates are generally located in levels 3 and 4. Therefore, the probability that a sample, at time $t$, from electrode-$m$ belongs to an accelerated respiratory state is given by:

$$p'_m(t) = \frac{1}{2} \left( p_{3m}(t) + p_{4m}(t) \right) \tag{3.20}$$

Similarly, the probability that a sample from electrode-$m$ belongs to a normal (or low) breathing state is obtained by merging the probabilities obtained from levels 5 to 8:

$$p_m(t) = \frac{1}{4} \sum_{l=5}^{8} p_{lm}(t) \tag{3.21}$$

$^2$ $p_{lm}(t)$ represents the value of probability score $p_{lm}$ at time $t$.

The overall probability that a vector consisting of a single sample from all 10 electrodes belongs to the accelerated respiratory state is obtained by combining the probability scores of the individual electrodes:

$$p'(t) = \sum_{m=1}^{10} \omega_m p'_m(t) \tag{3.22}$$

where $\omega_m$ represents the weight assigned to electrode-$m$. Furthermore, the electrode weights must sum to 1 in order to ensure a valid probability score.

$$\sum_{m=1}^{10} \omega_m = 1 \tag{3.23}$$

The results presented here assign equal weights ($\omega_m = 1/10,\ m = 1,\dots,10$) to all electrodes; however, it is also possible to assign electrode weights based on some signal quality indicators (SQI). In such a setup electrodes that are considered to be noisy, based on the value of the SQI, should be assigned a lower weight whereas electrodes with low noise should be assigned higher weights. The combined probability of a sample belonging to the normal or low breathing respiratory state is given by:

$$p(t) = \sum_{m=1}^{10} \omega_m p_m(t) \tag{3.24}$$

The normalization constraint in equation (3.23) applies in this case as well.

Figure 3.9: Respiratory state detection. (a) Spirometer output. (b) Mean of all 10 electrodes. (c) Probability curves: the solid line represents the probability of accelerated breathing, $p'(t)$; the dotted line represents the probability of normal (or low) breathing, $p(t)$.

An illustration of how the probability scores $p'(t)$ and $p(t)$ enable classification of different respiratory states is presented in Figure 3.9. It is apparent that the probability scores vary according to the underlying respiratory state. For instance, the value of $p'(t)$ is close to 0.5 in normal, low breathing and apnea regions. It rises sharply in the hyperventilation region (between the 35 min and 35.5 min marks) and then begins to drop. Similarly, a normal or low breathing state is indicated by the dominance of the normal (or low) breathing probability curve $p(t)$. An apnea region can be characterized by $p'(t) \approx 0.5$ and extremely low values of $p(t)$. Ideally, $p'(t)$ should also approach 0 in an apnea region; however, in practice this does not happen because of the presence of noise in the electrode signals. Note that the probabilities $p'(t)$ and $p(t)$, at a time sample $t$, do not sum to 1; this is because almost all respiratory states have some contribution from both high and low frequencies. The above discussion indicates that comparison of the
relative value of both probabilities allows us to make a correct decision about the underlying respiratory state. The highest priority in this work is assigned to differentiation between three respiratory states, namely hyperventilation (or accelerated breathing regions), apnea (or regions with zero breathing rates) and normal breathing. Although it is possible to differentiate between more respiratory states, this thesis differentiates between only three. The decision making process about whether a sample at time $t$ belongs to a hyperventilation, apnea or normal breathing region is described below.

Hyperventilation: A sample at time $t$ belongs to the hyperventilation respiratory state if its accelerated breathing probability score $p'(t)$ is greater than 0.5 and the normal breathing probability score $p(t)$ is less than $p'(t)$. This can be represented in terms of an indicator function, $\mathbb{1}_H(t)$, as below:

$$\mathbb{1}_H(t) = \begin{cases} 1, & p'(t) > \tau_w \text{ and } p(t) < p'(t) \\ 0, & \text{otherwise} \end{cases} \tag{3.25}$$

$$\mathbb{1}_A(t) = \begin{cases} 1, & p'(t) \leq 0.5 \text{ and } p(t) < \xi_w \\ 0, & \text{otherwise} \end{cases} \tag{3.26}$$

The threshold $\xi_w$ is fixed at 0.3 for all subjects.

Normal Breathing: All samples that do not belong to either an apnea or hyperventilation state are considered to be generated from a normal breathing state. The indicator function $\mathbb{1}_N(t)$ representing the locations of normal breathing samples can be obtained by applying the exclusive-OR operation to the indicator functions $\mathbb{1}_H(t)$ and $\mathbb{1}_A(t)$ and negating the answer:

$$\mathbb{1}_N(t) = \neg \left[ \mathbb{1}_H(t) \oplus \mathbb{1}_A(t) \right] \tag{3.27}$$

where the symbols $\oplus$ and $\neg$ represent the exclusive-OR and negation operations respectively$^3$. Since $\mathbb{1}_H(t)$ and $\mathbb{1}_A(t)$ are mutually exclusive, the exclusive-OR operation gives the locations of the time samples where the respiratory state is either hyperventilation or apnea. Negation of the exclusive-OR output gives the location of all non-apnea and non-hyperventilation samples.

3.2.2 Respiratory State Detection using DCT Filters

In theory wavelet filtering should enable high-accuracy classification of different respiratory states; however, in order to validate this claim, results for respiratory state detection using DCT filtering are also presented. The mean sensor signal, $x(t)$, is first divided into non-overlapping time frames of 5 seconds each. The $i$-th frame $\mathbf{x}_i$, consisting of $M$ $(= 5 f_s)$ time samples$^4$ of $x(t)$, is transformed to obtain the DCT coefficient vector $\mathbf{c}_i$:

$$\mathbf{c}_i = T \mathbf{x}_i \tag{3.28}$$

Here, $T$ denotes the $(M \times M)$ DCT matrix. Frame-$i$ is considered to belong to an apnea region if the norm of its DCT coefficients $|\mathbf{c}_i|$ falls below a threshold $\xi_D$. In order to be classified as accelerated-breathing, frame-$i$ must satisfy the following two conditions: (1) It must contain significant energy at higher frequency components. In other words, the ratio of the norm of its high-frequency DCT coefficients to the norm of all its DCT coefficients must exceed a threshold $\tau_D$.
(2) It must not be an apnea frame. Most apnea frames also contain significant energy at higher spectral components due to high-frequency noise in the sensor output. Therefore, condition (2) is imposed to avoid misclassifying an apnea frame as an accelerated-breathing frame. Frames that do not fall into apnea or accelerated breathing are classified as normal-breathing frames. The thresholds $\tau_D$ and $\xi_D$ control the decision function and are determined from their receiver-operating-characteristic (ROC) curves. That is, the DCT apnea threshold, $\xi_D$, is varied over a wide range to determine the apnea detection ROC curve, which is a plot of the false-positive rate (FPR) versus the true-positive rate (TPR). The value of $\xi_D$ which results in an FPR of 1% is selected and used for respiratory state detection. Similarly, the value of $\tau_D$ resulting in 1% FPR for accelerated breathing frames is selected from its corresponding accelerated breathing ROC curve. Note that in section 3.2.1 the wavelet thresholds $\tau_w$ and $\xi_w$ were fixed to 0.5 and 0.3 respectively. We would like to point out that these values also correspond to a 1% FPR for apnea and accelerated breathing regions. Limiting both the wavelet and DCT respiratory state detectors to the same FPR ensures that the comparison is fair.

$^3$ All logical operations are modulo-2.
$^4$ $f_s$ represents the sampling frequency in Hz.
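The frame-level decision rule of section 3.2.2 can be sketched as below. Several details are assumptions made for illustration because the text does not fix them: the DCT is taken to be the orthonormal DCT-II, the boundary between "low" and "high" frequency coefficients is a hypothetical cutoff `cutoff_hz`, and the threshold values and test frames are invented rather than the ROC-derived values used in this work.

```python
import math

def dct_ii(frame):
    """Orthonormal DCT-II of one frame (assumed form of the product T x_i)."""
    n = len(frame)
    coeffs = []
    for k in range(n):
        s = sum(frame[j] * math.cos(math.pi * (j + 0.5) * k / n) for j in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * s)
    return coeffs

def norm(values):
    """Euclidean norm of a coefficient vector."""
    return math.sqrt(sum(v * v for v in values))

def classify_frame(frame, apnea_thr, ratio_thr, cutoff_hz, fs_hz):
    """Label a frame as 'apnea', 'accelerated' or 'normal'."""
    c = dct_ii(frame)
    total = norm(c)
    if total < apnea_thr:              # apnea is tested first, as condition (2) requires
        return "apnea"
    # DCT-II coefficient k corresponds to a frequency of k * fs / (2 * n) Hz.
    k_cut = int(math.ceil(2.0 * len(frame) * cutoff_hz / fs_hz))
    if norm(c[k_cut:]) / total > ratio_thr:
        return "accelerated"
    return "normal"

fs_hz, n = 20.0, 100                   # hypothetical: 5 s frames sampled at 20 Hz
slow = [math.sin(2 * math.pi * 0.25 * j / fs_hz) for j in range(n)]
fast = [0.3 * math.sin(2 * math.pi * 1.5 * j / fs_hz) for j in range(n)]
quiet = [0.01 * (-1) ** j for j in range(n)]   # low-amplitude high-frequency noise
labels = [classify_frame(f, 0.5, 0.5, 0.75, fs_hz) for f in (slow, fast, quiet)]
print(labels)
```

The `quiet` frame consists entirely of high-frequency content, so without the apnea check it would pass the energy-ratio test; testing the overall norm first is what prevents that misclassification.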
3.2.3 Rate Estimation Using Adaptive Gini Kernel Machines

In order to improve estimation of breathing rate in accelerated breathing regions, the WA-Gini algorithm employs a Gini kernel machine trained only on accelerated breathing samples. Upon detection of an accelerated breathing region the algorithm switches from the normal breathing Gini to the accelerated breathing kernel machine. For apnea regions, one possibility is to set the output to zero. However, this approach may result in large rate estimation errors in case of false alarms. Therefore, a soft-decision approach is employed, i.e., the normal-breathing Gini is used in apnea regions as well. The main issue with apnea regions is high-frequency noise, which is removed by the wavelet denoising applied at the output of the normal breathing Gini; this results in zero, or minimal, breathing rate estimation error for the vast majority of apnea regions. Results indicate that this approach performs better than explicitly setting apnea regions to zero. In order to evaluate which of the two respiratory state detectors performs better, results are presented for both configurations. The framework that employs the wavelet based respiratory state detection is titled Wavelet-Adaptive Gini or WAGini, whereas the DCT based framework is titled the DCT-Adaptive Gini or DAGini. Both approaches are identical in all other respects, with the only difference being the respiratory state detector.
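The switching logic of this section can be summarized with a small dispatch function. The sketch below is schematic: `normal_machine`, `accel_machine` and `denoise` are hypothetical callables standing in for the two trained Gini kernel machines and the wavelet denoiser, and the state labels are assumed to come from one of the two respiratory state detectors.

```python
def adaptive_rate(state, features, normal_machine, accel_machine, denoise):
    """Select the rate estimator for one frame from its detected state.

    All three callables are placeholders for components described in the
    text; their names and signatures are assumptions of this sketch.
    """
    if state == "accelerated":
        return accel_machine(features)
    # Soft decision: 'apnea' frames are not forced to zero. They use the
    # normal-breathing machine, whose output is then denoised.
    return denoise(normal_machine(features))

# Usage with stand-in estimators (constant outputs, identity denoiser).
rate = adaptive_rate("normal", [0.1, 0.2],
                     normal_machine=lambda f: 14.0,
                     accel_machine=lambda f: 42.0,
                     denoise=lambda r: r)
print(rate)
```

Swapping the wavelet detector for the DCT detector changes only how `state` is produced, which mirrors the statement that the two frameworks are identical in all other respects.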
3.2.4 Results

The breathing rate errors obtained using a number of different techniques for 19 different human subjects in the normal breathing artifact sessions are presented in Table 3.2. The results for accelerated breathing and apnea sessions are contained in Table 3.3. The results for DAGini, in accelerated-breathing and apnea sessions, demonstrate noticeable performance improvement when compared to the single-Gini and the single sensor approaches. However, when compared to the SEC, the performance improvement in accelerated breathing sessions is very small. In artifact sessions, the DAGini results in more than 1 BPM improvement when compared to the SEC; however, it performs worse than the single-Gini approach. This degradation is due to false detections of accelerated breathing frames in normal breathing sessions of some subjects. Consider, for example, the single-Gini results in artifact sessions of subject-1 and subject-2. For subject-1 the error rates for single-Gini and DAGini are identical because there are no false detections of accelerated frames. For subject-2, however, the single-Gini error rate is lower than that for DAGini because of high noise in certain normal breathing sessions. This results in false detection of accelerated breathing regions by the DCT based respiratory state detector, causing an increase in the overall breathing rate error.
The results for WAGini demonstrate that wavelet filtering coupled with multiple kernel machines produces the best estimate of breathing rate in accelerated-breathing and apnea sessions. When compared to the SEC and DAGini we obtain more than 2 BPM improvement in the average error estimate. In artifact sessions the single-Gini still remains the best performing approach; however, the difference with WAGini is not very significant. As was the case with DAGini, the errors in this case may also be attributed to false detection of accelerated breathing regions, since the wavelet approach is not perfect. However, the overall rate error is still much less than that obtained via the DAGini. Overall it seems that the WAGini enables us to achieve a balance between rate estimation in artifact and accelerated breathing sessions.

Subject   Elec-4    SEC     Gini    DAGini   WAGini
                  (Artifact Sessions)
1          4.54     2.29    1.81     1.81     1.81
2          5.65     2.04    1.93     2.77     3.58
3          3.07    19.06    1.90    10.07     4.68
4          7.93     5.76    3.60     3.60     4.00
5          7.23     3.66    3.13     3.13     3.13
6          5.06     5.10    3.23     5.33     3.23
7          8.41     8.33    2.54     5.49     2.54
8         12.08     4.63    3.75     3.75     3.75
9          2.73     2.77    2.88     2.88     2.88
10         4.77     3.77    2.57     2.57     2.57
11         6.26     2.75    2.27     2.27     2.27
12         6.19     2.83    2.47     2.47     2.47
13         8.86     3.38    2.48     2.48     2.48
14         3.14     2.94    2.72     2.72     3.26
15         3.39     3.18    2.80     2.80     2.80
16         3.29     2.69    2.54     2.54     2.54
17         5.64     4.42    3.49     3.49     3.49
18         6.11     3.82    2.95     2.95     2.95
19         4.60     4.79    4.20     4.20     4.27
Mean       5.73     4.62    2.80     3.54     3.09

Table 3.2: Average Respiration Rate Error ($RR_{err}$) in BPM for different human subjects in artifact sessions. Errors are computed over 10 second windows with 5 second overlaps.

Subject   Sensor-4   SEC     Gini    DAGini   WAGini
             (Accelerated Breathing & Apnea Sessions)
1          11.67     8.67   15.29    10.27    11.90
2          12.41     5.78   15.74     7.91     6.21
3          10.52    13.11   12.25    17.62    10.73
4          16.52     2.34    8.74     5.12     3.04
5          15.99     4.65    4.13     1.38     1.33
6           4.93     5.18   12.94     8.99     3.08
7          28.58    32.85   16.03    15.89    12.74
8          12.49     3.90   16.49     8.17     4.78
9           3.70     3.50    5.85     3.09     3.48
10         12.15     6.59   15.34     7.64     9.84
11          9.95     9.64   15.56    10.99     7.84
12          9.52     9.68   14.15    15.38     4.09
13          8.42     7.77   10.10     8.16     7.27
14          7.93    11.30   13.61     5.30     3.92
15          5.10     7.42    6.29     4.43     2.46
16         10.14     9.17   15.56     4.06     3.67
17          8.92     3.32   16.74     3.48     2.66
18          7.06     3.54    7.79     4.20     3.98
19          6.37     2.77   11.01     6.64     7.44
Mean       10.65     7.96   12.29     7.83     5.81

Table 3.3: Average Respiration Rate Error ($RR_{err}$) in BPM for different human subjects in accelerated breathing & apnea sessions. Errors are computed over 10 second windows with 5 second overlaps.

3.2.5 Wavelet Based Artifact Detection

An added advantage of using the wavelet based probability curves is that they also enable very accurate detection of regions containing motion artifacts. Even though these metrics were not constructed primarily for this purpose, they seem to be very good at identifying the presence of motion artifacts. Furthermore, it seems that they may also enable identification of the underlying physical state/posture of the human subject. This section briefly discusses the possibility of using the probability scores of section 3.2.1.1 for artifact detection. The discussion is kept brief on purpose, but it paves the way for any possible future work. A number of different motion artifacts introduce low-frequency, high-amplitude distortions in the electrode signals. For example, consider the plot of electrode outputs shown in Figure 3.10(b). In this case the subject is reaching for an object between the 0.5 and the 2 minute mark.
It can be observed that the artifact signals are severely distorted. A brief glance at the value of the probability curves indicates that the low-frequency curve $p_0(t)$ is significantly above the average level for the duration of the artifact. The high-frequency curve is also above the average value for most of the artifact duration. Therefore, it seems that very high values of the $p_0(t)$ curve may enable very accurate identification of artifact regions.

In addition to artifact detection, it seems that we may also be able to identify the underlying physical state of the subject. For example, comparison of the plots in Figure 3.11, which correspond to a reaching activity, and Figure 3.12, which correspond to walking, indicates that during walking the difference between the high-frequency and low-frequency curves is smaller than the difference during reaching. Similarly, the plots in Figure 3.13 correspond to the case when the subjects are on a bed rolling from left to right. Therefore, it seems highly likely that the exact shape of the score curves will allow differentiation between the underlying physical states. This type of knowledge may enable further improvement in the breathing rate estimate obtained via algorithms such as the WA-Gini; however, these possibilities have been left for future work and have not been investigated further in this thesis.

Figure 3.10: Impact of motion artifact on electrodes and probability curves; subject is reaching for an object between the 0.5 and 2 min mark. Plots indicate: (a) Spirometer output; (b) Outputs of three electrodes; and (c) Probability curves.

Figure 3.11: Probability curves of four different subjects when reaching for an object.

Figure 3.12: Probability curves of four different subjects walking at a normal pace.

Figure 3.13: Probability curves of four different subjects when rolling left and right on a bed.
CHAPTER 4
PROTEOMIC CHANNEL CAPACITY

This chapter presents a detailed strategy for evaluating the channel capacity of a proteomic channel. In order to be of any practical use, the channel capacity calculations must be based on realistic channel conditions. This requires development of models that incorporate noise irregularities that may degrade protein detection performance. This chapter is organized as follows: Section 4.1 introduces the basic components of the protein receptor channel. Section 4.1.1 presents a model of the diffusion process that is relevant for sensing applications such as protein arrays. A two-compartment approach is employed by sub-dividing the diffusion process into two stages: (1) large-scale, deterministic diffusion from transmitter to receiver probe; (2) small-scale, stochastic diffusion in a small volume around the receptor. Section 4.1.2 describes the response of combinatorial and specific probes and highlights the impact of different parameters on detection performance. Section 4.2 introduces the conditional distribution of the protein array channel and discusses the impact of receptor parameters on the noise variance. Capacity is computed as a function of receptor parameters in Section 4.3.

4.1 Proteomic Channel Models

A typical affinity-based sensing application begins with the addition of a test sample to the array reaction chamber, which contains a number of different probes immobilized in micrometer or nanometer sized spots throughout its entire volume. In the absence of any drift, the protein particles present in the test sample follow a Brownian type motion and over time distribute evenly over the reaction chamber. The concentration of a protein is measured by using receptors that capture particles in their vicinity. A single Receptor consists of a large number of probes/recognition elements that are uniformly distributed over its surface. The physical representation of a simplified dual protein assay with a single combinatorial receptor that detects both input proteins is shown in Figure 4.1. At the start of the test, a sample volume containing particles to be analyzed is injected into the reaction chamber.
The change in particle concentration rate at the spot of the target receptor depends on the Diffusion properties of the medium (determined by factors such as the viscosity and the physical dimensions of the target proteins). The receptor samples the concentration information in its encompassing volume and, depending on whether the probe is specific or combinatorial, produces an output signal proportional to the concentration of one or more proteins. In the following sub-sections we present a mathematical formulation that models the diffusion process and the receptor binding process.

4.1.1 Protein Diffusion Model

While at a macroscopic level diffusion can be viewed as a deterministic process, at the microscopic level the perpetual Brownian motion of particles causes random variations in the transport of particles, resulting in so-called diffusion noise. Although mass-transport limited biochemical systems can be modeled using stochastic differential equations such as the Langevin equation [63], the calculus of multivariate stochastic differential equations becomes cumbersome except for some special cases. In order to keep the model mathematically tractable we use a deterministic transport model based on Fick's second law of diffusion. The variation due to the constant random motion of particles is modeled only within a small sub-volume around the receiver probe. For a protein array which performs multiplexed detection of $N$ types of proteins, we denote the input of the system by a vector $\mathbf{x} \in \mathbb{Z}^N_{\ge 0}$.
Each component of $\mathbf{x}$ is a non-negative integer random variable denoted by $x_n$ and represents the total number of particles of protein-$n$ in the test sample. Particles are injected into the reaction chamber using a device such as a micro-pipette, and the injection location is taken to be the origin $I = (0, 0, 0)$ in three-dimensional space. Fick's second law can be used to predict how the concentration of protein-$n$ changes over time at a receptor located at the coordinate $R = (x_R, 0, 0)$ [64] and can be expressed as:

$$\frac{\partial \lambda_n(R,t)}{\partial t} = D_n \nabla^2 \lambda_n(R,t). \tag{4.1}$$

Figure 4.1: (a) Cross-sectional view of diffusion in a multi-protein array. Different states of the channel: (b) $t = 0$: $x_n$ particles injected at origin $I = (0, 0, 0)$; (c) $t > 0$: concentration, $\lambda_n(R,t)$, of particles in the receptor sub-volume is given by (4.5); (d) $t \to \infty$ (steady state): concentration, $\lambda_n(R)$, of particles in the receptor sub-volume is given by (4.6).

Here $\lambda_n(R,t)$ is the concentration of particles at the receptor location $R$ at time $t$, and $D_n$ is the diffusion coefficient of the protein-$n$ molecule. However, the model in equation (4.1) does not account for the interaction of the protein particles with the epitopes (binding sites) on the receptor and hence needs to be modified accordingly. The concentration of protein-$n$ within a small sub-volume where the receptors are immobilized (hereafter referred to as the sub-volume $V_R$) is primarily affected by two processes:

1. Sorption, which refers to the adsorption and desorption of protein particles to the receptor surface or surface epitopes. We assume that the sorption rate is high compared to the transport rate of particles in the test sample; therefore, it is fair to assume that a local equilibrium exists for the adsorption and desorption processes. Under these conditions the sorption rate changes in proportion to the concentration [65]; this enables us to model sorption as a sink located at $R$:

$$\frac{\partial \lambda_n(R,t)}{\partial t} = D_n \nabla^2 \lambda_n(R,t) - K_d \frac{\partial \lambda_n(R,t)}{\partial t} \tag{4.2}$$

where $K_d$ is the equilibrium-partitioning coefficient between the fluid and the sorption to the receptor surface.

2. Binding of adsorbed particles to epitopes on the receptor. Binding between the adsorbed particles and the receptor epitopes can be modeled by adding a reaction term to equation (4.2) according to:

$$\frac{\partial \lambda_n(R,t)}{\partial t} = D_n \nabla^2 \lambda_n(R,t) - K_d \frac{\partial \lambda_n(R,t)}{\partial t} - r_n \lambda_n(R,t) \tag{4.3}$$

where $r_n$ is the reaction rate at which protein-$n$ targets bind with the epitopes on the receptor.

Equation (4.3) can be rearranged to give:

$$\frac{\partial \lambda_n(R,t)}{\partial t} = \frac{D_n}{R_s} \nabla^2 \lambda_n(R,t) - \frac{r_n}{R_s} \lambda_n(R,t) \tag{4.4}$$

where $R_s = (1 + K_d)$. Equation (4.4) indicates that both the diffusive transport and the probe binding process are inhibited due to the equilibrium adsorption and desorption [65]. We assume that the size of the receptor sub-volume is significantly small in comparison to the overall size of the reaction chamber. We will approximate the input to the diffusion channel to be a volumetric point source that injects $x_n$ protein particles during an infinitesimally small time interval (compared to the diffusion time-scales). This can be modeled using an impulse which occurs at the start of the test ($t = 0$), and hence the impulse response based on equation (4.4) leads to:

$$\lambda_n(R,t) = \frac{x_n}{\sqrt{4 \pi D'_n t}} \exp\left( -\frac{x_R^2}{4 D'_n t} - r'_n t \right). \tag{4.5}$$

Figure 4.2: Concentration as a function of time inside the receptor sub-volume for different values of $D_n$. (Total input concentration $\lambda_n(I,0) = 4\ \mu\text{g/cm}^3$; $x_R = 1$ cm; $r_n = 0.02\ \text{s}^{-1}$; $R_s = 2$.)

Here $D'_n = D_n / R_s$ and $r'_n = r_n / R_s$ represent the diffusion and reaction rates adjusted for the inhibition caused by sorption. The diligent observer will note that a 1-D diffusion model has been employed to solve for equation (4.5). This is justified for the following reasons: (1) Given the 3-dimensional reference axes shown in Figure 4.1(a), we can assume that the respective concentrations of the receptors and the protein particles in the cross-sectional plane (along the y-axis) are constant and can vary only along the x-axis. A similar assumption was used in [66], where the effect of diffusion on the kinetics of an evanescent wave biosensor was investigated. The variations along the z-axis can also be ignored under the assumption that the depth of the reaction is shallow. (2) Although higher-dimensional models could be employed, they often do not yield an analytical solution and hence need to be solved using numerical techniques. A simple 1-D model is therefore tractable and preferable.

Figure 4.2 is a plot of equation (4.5) for different values of $D_n$; it shows the concentration observed in the receptor sub-volume. For our analysis we are interested in the concentration of protein-$n$ inside the receptor under steady-state conditions. This can be calculated by integrating equation (4.5) over time according to:

$$\lambda_n(R) = \int_0^\infty \frac{x_n}{\sqrt{4 \pi D'_n t}} \exp\left( -\frac{x_R^2}{4 D'_n t} - r'_n t \right) dt = \frac{x_n}{\sqrt{4 D'_n r'_n}} \exp\left( -\sqrt{\frac{r'_n x_R^2}{D'_n}} \right). \tag{4.6}$$

Due to the Brownian dynamics of particle diffusion, the true steady-state concentration inside the receptor volume is a random variable whose average value is given by equation (4.6). Under the assumption that the probability of two particles occupying the exact same spatial location is negligible, and that the motion of all individual particles inside the reaction chamber is independent of each other, it can be shown that the actual concentration $\tilde{\lambda}_n(R)$ inside $V_R$ is a Poisson random variable, with an arrival rate equal to the average concentration given by $\lambda_n(R)$ [67], [68]:

$$\tilde{\lambda}_n(R) \sim \text{Poiss}(\lambda_n(R)) \tag{4.7}$$

For the rest of the analysis we will not include $R$ in our expressions, with the understanding that $\lambda_n$ represents the average concentration of protein-$n$ inside the receptor sub-volume located at $R$. We now have a model which we can use to characterize the random variation of concentration inside the receptor sub-volume, $V_R$, at steady state.

4.1.2 Receptor Response Model

Deviations from ideal behavior at the receptor are critical for constructing realistic models of the system in question. In this context, a majority of the preliminary investigations into molecular communication systems have focused primarily on non-idealities resulting from the diffusion process, while assuming ideal receptor models (see for example [68], [39]). Only a limited number of studies have addressed this problem. For example, in [69] the impact of sensor cleanse time on the performance of a molecular communication system has been investigated. In this section a realistic model based on actual receptor prototypes, constructed in the lab, is presented.

Figure 4.3: Illustration of receptor saturation due to unavailability of free probes.
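As a numerical sanity check on the diffusion model of Section 4.1.1, the closed-form steady-state expression (4.6) can be compared against a direct numerical integration of the impulse response (4.5). The sketch below is illustrative only: the function names are hypothetical, the symbol `lam` for concentration is a label chosen here, and the parameter values are those listed in Table 4.3 together with the particle count $1.765 \times 10^3$ quoted in the figure captions.

```python
import math

def concentration(t, x_n, d_eff, r_eff, x_r):
    """Equation (4.5): mean concentration at the receptor at time t > 0."""
    return (x_n / math.sqrt(4.0 * math.pi * d_eff * t)
            * math.exp(-x_r ** 2 / (4.0 * d_eff * t) - r_eff * t))

def steady_state(x_n, d_eff, r_eff, x_r):
    """Equation (4.6): closed form of the time integral of (4.5)."""
    return (x_n / math.sqrt(4.0 * d_eff * r_eff)
            * math.exp(-math.sqrt(r_eff * x_r ** 2 / d_eff)))

def integrate(x_n, d_eff, r_eff, x_r, t_max=5.0, steps=20000):
    """Midpoint-rule integral of (4.5) over (0, t_max]."""
    h = t_max / steps
    return h * sum(concentration((i + 0.5) * h, x_n, d_eff, r_eff, x_r)
                   for i in range(steps))

# Table 4.3 values: D' = 1e-6 cm^2/s, r' = 10 1/s, x_R = 2e-3 cm.
params = dict(x_n=1.765e3, d_eff=1e-6, r_eff=10.0, x_r=2e-3)
print("closed form :", steady_state(**params))
print("numerical   :", integrate(**params))
```

With these values the closed form evaluates to roughly 500, which suggests why that particular particle count appears in the captions: it corresponds to a round value of the mean steady-state concentration.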
The number of particles inside the receptor sub-volume can be measured using an electrical or an optical transducer that is also immobilized to the probes [35, 70] (e.g., gold nanoparticles for optical detection or conductive polymer for electrical detection). Since there are only a limited number of epitopes on a receptor, at high concentrations not all protein particles inside $V_R$ will be able to find a vacant epitope to bind with, as illustrated in Figure 4.3. Therefore, at low-to-medium concentrations a change in the transducer's output signal $Y$ varies in direct proportion to a corresponding change in the average concentration of protein-$n$ inside $V_R$. However, at high concentrations the receptor reaches saturation and the rate of change of $Y$ decreases due to unavailability of free probes, as shown in Figure 4.4. Furthermore, at ultra-large protein concentrations a large number of affinity based assays suffer from a phenomenon called the Hook Effect [71], which results in a drop in the output with increasing concentration. The Hook Effect occurs beyond the saturation region and is thought to be the result of factors such as shadowing, in which the high density of captured particles prevents the binding of detector probes, resulting in a drop in the overall signal. Under the Hook Effect an assay will almost certainly give a faulty reading, resulting in a capacity equal to zero and making capacity calculations irrelevant. As a result we assume that the input concentration is upper-bounded by a value which is well below the Hook concentration and does not affect our receptor model.

Figure 4.4: Output signal saturation in a typical affinity based array.

We can now write the rate of change in the average transducer output with respect to a corresponding change in the average concentration inside $V_R$ as [72]:

$$\frac{dy}{d\lambda_n} = k\, F(\lambda_n) \tag{4.8}$$

where $k$ is a proportionality constant; it models the sensitivity of the receptor to protein-$n$ and is independent of $\lambda_n$. The function $F(\cdot)$ incorporates the saturation response; in general it should satisfy the following boundary conditions:

$$F(0) = K < \infty \tag{4.9}$$
$$F(\infty) = 0 \tag{4.10}$$

Furthermore, in order to ensure that the change in magnitude of $y$ reduces as more recognition elements are occupied by incoming particles, the following conditions should also be satisfied:

$$F(\lambda) > 0 \tag{4.11}$$
$$\frac{d}{d\lambda} F(\lambda) \le 0 \tag{4.12}$$

One function satisfying conditions (4.9) to (4.12) is listed below [72]:

$$F(\lambda) = \frac{1}{\alpha + \lambda} \tag{4.13}$$

where $\alpha > 0$ is a constant and controls the saturation function under control conditions (when no particles are present). Equation (4.13) can now be written in differential form and integrated to give a model for the transducer output:

$$dy = k \frac{d\lambda}{\alpha + \lambda} \tag{4.14}$$
$$\int_{y_0}^{y} dy = k \int_0^{\lambda_n} \frac{d\lambda}{\alpha + \lambda} \tag{4.15}$$
$$y(\lambda_n) = y_0 + k \log\left( \frac{\alpha + \lambda_n}{\alpha} \right) \tag{4.16}$$

Therefore, the output signal of the transducer is a log-linear function of the concentration inside the receptor sub-volume. For the dual-protein combinatorial probe, which is constructed by immobilizing recognition elements specific to two different proteins (as shown in Figure 1.4(b)), the output signal will be a function of particles of both proteins. In this case the gradient of the output signal with respect to the concentration of protein-1 and protein-2 will be given by:

$$\frac{\partial y}{\partial \lambda_1} = k_1 \frac{1}{\alpha + \lambda_1} + k_{12} \frac{1}{\gamma + \lambda_1 + \lambda_2} \tag{4.17}$$
$$\frac{\partial y}{\partial \lambda_2} = k_2 \frac{1}{\beta + \lambda_2} + k_{12} \frac{1}{\gamma + \lambda_1 + \lambda_2} \tag{4.18}$$

where $(k_1, \alpha)$ and $(k_2, \beta)$ are the model parameters corresponding to protein-1 and protein-2 respectively. The effect of Joint Hybridization on the output signal is captured by the parameters $(k_{12}, \gamma)$. Joint hybridization can be interpreted as the sensitivity of combinatorial probes to the concentrations of both input proteins. In combinatorial probes, joint hybridization may be introduced by design as shown in Figure 4.5. The solution to equations (4.17) and (4.18) is given by:

$$y = g(\lambda_1, \lambda_2) = y_0 + k_1 \log\left( \frac{\alpha + \lambda_1}{\alpha} \right) + k_2 \log\left( \frac{\beta + \lambda_2}{\beta} \right) + k_{12} \log\left( \frac{\gamma + \lambda_1 + \lambda_2}{\gamma} \right) \tag{4.19}$$

Table 4.1: Behavioral model parameters for three different types of receptors with Mouse and Rabbit IgG as target analytes [35][36]. Note: the letters 'm' and 'r', in the subscript, have been employed here (instead of the numerals '1' and '2' in equation (4.19)) to represent Mouse and Rabbit IgG respectively.

The log-linear model in equation (4.19) is consistent with the experimental results that have been previously reported [35] for soft-logic receptors (corresponding to rabbit and mouse IgG) shown in Figure 1.4(c). The response of the soft-logic functions remains consistent with the presence and the absence of the IgG targets, whereas the magnitude of the measured output (conductance in the case of [35]) scales log-linearly with the analyte concentration.

Figure 4.5: Specific and Combinatorial Probes.

Equation (4.19) can also be generalized to more than two proteins and different types of combinatorial circuits such as: (1) an "AND" gate, which generates a signal when both input proteins are present; (2) an "OR" gate, which generates a signal when either protein is present; (3) an "XOR" gate, which generates a signal when one protein is present and the other is absent.
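The log-linear receptor response of equation (4.19) is straightforward to evaluate numerically. The following sketch is a minimal illustration, assuming a natural logarithm and saturation constants equal to 1; the names `alpha`, `beta`, `gamma` and `receptor_output`, and the two example parameter sets, are labels and values chosen for this sketch rather than the experimentally fitted parameters of Table 4.1.

```python
import math

def receptor_output(lam1, lam2, k1, k2, k12,
                    y0=0.5, alpha=1.0, beta=1.0, gamma=1.0):
    """Transducer output for mean concentrations lam1 and lam2.

    Implements the form of equation (4.19); the natural logarithm and
    the default constants are assumptions of this sketch.
    """
    return (y0
            + k1 * math.log((alpha + lam1) / alpha)
            + k2 * math.log((beta + lam2) / beta)
            + k12 * math.log((gamma + lam1 + lam2) / gamma))

# Hypothetical probes: one specific to protein-1, one purely combinatorial.
specific = dict(k1=1.0, k2=0.0, k12=0.0)
combinatorial = dict(k1=0.0, k2=0.0, k12=1.0)

for lam1, lam2 in [(0, 0), (100, 0), (0, 100), (100, 100)]:
    print(lam1, lam2,
          round(receptor_output(lam1, lam2, **specific), 3),
          round(receptor_output(lam1, lam2, **combinatorial), 3))
```

The specific probe ignores protein-2 entirely, while the combinatorial probe responds to either protein and saturates logarithmically as the total concentration grows.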
Although traditional protein assays suppress the joint hybridization effect, it was shown to be helpful under certain conditions in [73] and [74]. A demonstration of the advantages of exploiting joint hybridization was presented in [35, 37], where FEC codes were constructed using combinatorial probes and, in comparison to specific probes, an overall reduction in protein detection error rate was observed. However, until now probe parameters have been selected experimentally; this is laborious, time consuming and consumes expensive lab material. For instance, Table 4.1 shows the experimentally determined parameters of the model (4.19) for the rabbit IgG and mouse IgG combinatorial probes (non-combinatorial, soft-AND and soft-OR functions). In this chapter, however, we are primarily interested in the impact of the different model parameters on the overall capacity; therefore, we vary the different model parameters instead of using fixed values as listed in Table 4.1. In this context capacity estimation plays an important role and can be used as a tool for selecting model parameters. The optimal probe parameters should in principle correspond to the maximum capacity, and therefore the capacity calculation should also provide key insights on designing synthetic probes with the desired hybridization parameters.

Figure 4.6: Conditional distribution of protein array channel; (a) 3D view, (b) top view. Receptor parameters are fixed to $k_1 = 1$, $k_2 = 0.9$, $k_{12} = 0.9$; diffusion parameters are as listed in Table 4.3; $x_2 = 1.765 \times 10^3$.

4.2 Conditional Distribution of Protein Array Channel

The next step towards determining the information capacity of the proteomic channel is to determine the conditional distribution $P_{Y|X}(y|\mathbf{x})$, where $y$ is the output and $\mathbf{x} = [x_1, x_2]$ is the input vector to the channel. The conditional distribution $P_{Y|X}(y|\mathbf{x})$ of the proteomic channel will capture the effect of the diffusion noise as described in section 4.1.1 and the effect of the receptor response model as described in section 4.1.2. As is the case for any communication channel, this conditional distribution can be empirically determined by observing the receptor outputs $Y$ corresponding to a large number of inputs. Our empirical approach will be to use equation (4.6) to determine the steady-state concentration $\lambda_n$ corresponding to each protein variant inside the receptor sub-volume, for different instances of the input particle concentration vector $\mathbf{x} = [x_1, x_2]$. Based on equation (4.7), the noisy values of the receptor concentration will be obtained by sampling a Poisson random variable with rate equal to the steady-state concentration $\lambda_n$. These noisy concentration samples are then used to obtain the corresponding values of the transducer output using equation (4.19), which will then be used to evaluate the conditional distribution $P_{Y|X}(y|\mathbf{x})$. The parameters of the receptor response model of equation (4.19) are listed in Table 4.2. Since we do not know the optimal set of parameters, $k_1$, $k_2$ and $k_{12}$ are varied to determine their effect on the conditional distribution and eventually the array capacity. Although a practical array can have sensitivity parameters greater than 1, we limit the parameter range to between 0 and 1. The diffusion parameters for both the input proteins are assumed to be identical and are listed in Table 4.3. Figure 4.6 displays the conditional distribution $P_{Y|X}(y|\mathbf{x})$ (for a fixed value of $x_2 = 1.765 \times 10^3$) from two different viewing angles; the receptor parameters are fixed to $k_1 = 1$, $k_2 = 0.9$, $k_{12} = 0.9$ and the diffusion-reaction parameters are as listed in Table 4.3.

Parameter    Value/Range
$Y_0$        0.5
$\alpha$     1
$\beta$      1
$\gamma$     1
$k_1$        [0, 1]
$k_2$        [0, 1]
$k_{12}$     [0, 1]

Table 4.2: Values of Receptor Parameters

Parameter    Value/Range
$D'$         $10^{-6}$ cm$^2$ s$^{-1}$
$r'$         10 s$^{-1}$
$x_R$        $2 \times 10^{-3}$ cm

Table 4.3: Diffusion and Reaction Parameters

Figure 4.7: Conditional distribution of protein array channel for $k_1 = 0.2$ and $x_2 = 1.765 \times 10^3$. The values of $k_2$ and $k_{12}$ vary row and column wise respectively.

Figure 4.8: Conditional distribution of protein array channel for $k_1 = 1.0$ and $x_2 = 1.765 \times 10^3$. The values of $k_2$ and $k_{12}$ vary row and column wise respectively.

Figure 4.9: Cross sectional view of the conditional distribution $P_{Y|X}(y|\mathbf{x})$ for a fixed $x_2$ and varying $x_1$. The conditional variance $\sigma^2_{Y|X}(y|\mathbf{x})$ is approximated by its average value $\sigma_n^2$.
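The empirical procedure of this section (mean concentration from (4.6), Poisson sampling as in (4.7), transducer output from (4.19)) can be sketched as a small Monte Carlo simulation. The code below is an illustration under stated assumptions: all function names are hypothetical, the logarithm is taken to be natural, the saturation constants are set to 1 as in Table 4.2, and a chunked version of Knuth's method is used for Poisson sampling simply because Python's standard library has no Poisson generator.

```python
import math
import random

def steady_state(x_n, d_eff=1e-6, r_eff=10.0, x_r=2e-3):
    """Mean steady-state concentration, equation (4.6), Table 4.3 values."""
    return (x_n / math.sqrt(4.0 * d_eff * r_eff)
            * math.exp(-math.sqrt(r_eff * x_r ** 2 / d_eff)))

def poisson(rate, rng):
    """Exact Poisson sample: Knuth's method applied in chunks of at most
    30, using the fact that a sum of Poisson variables is Poisson."""
    total, remaining = 0, rate
    while remaining > 0.0:
        chunk = min(remaining, 30.0)
        limit, prod = math.exp(-chunk), rng.random()
        while prod > limit:
            prod *= rng.random()
            total += 1
        remaining -= chunk
    return total

def transducer(lam1, lam2, k1, k2, k12, y0=0.5):
    """Equation (4.19) with all saturation constants equal to 1."""
    return (y0 + k1 * math.log(1.0 + lam1) + k2 * math.log(1.0 + lam2)
            + k12 * math.log(1.0 + lam1 + lam2))

def sample_outputs(x1, x2, k1, k2, k12, trials=500, seed=0):
    """Noisy transducer outputs for one input vector [x1, x2]."""
    rng = random.Random(seed)
    lam1, lam2 = steady_state(x1), steady_state(x2)
    return [transducer(poisson(lam1, rng), poisson(lam2, rng), k1, k2, k12)
            for _ in range(trials)]

def mean_and_variance(values):
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / (len(values) - 1)

outputs = sample_outputs(1.765e3, 1.765e3, k1=1.0, k2=0.9, k12=0.9)
print(mean_and_variance(outputs))
```

Repeating `sample_outputs` over a grid of `x1` values and histogramming the results would give an empirical estimate of the conditional distribution for one receptor configuration; the sample variance at each input is the signal-dependent quantity discussed next.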
The effect of receptor parameters on the channel conditional distribution can be visually inspected for some instances of the conditional distribution, which are displayed in Figure 4.7 and Figure 4.8. For a receptor with low sensitivity to protein-1, as shown in Figure 4.7, we observe a more pronounced effect on the conditional distribution. In contrast, for a receptor with high sensitivity to protein-1 (Figure 4.8), the effect of the other two parameters ($k_2$ and $k_{12}$) is less pronounced. Plots of the conditional distribution indicate that the noise at the output of the protein array channel is signal dependent. Unfortunately, closed form expressions for the capacity of channels with signal dependent noise are too complex to compute or cannot be computed in most scenarios. This problem is further complicated if the channel distribution has a non-standard form, as is the case for a proteomic channel. We therefore have to resort to a numerical approach or use some simplifying assumptions such that capacity expressions corresponding to standard channel distributions can be used. Because the objective of this chapter is to determine approximate capacity expressions, we have opted for the analytical approximation based approach. Since we are interested in the impact of receptor parameters on the array capacity, we assume that the variance of the output signal is fixed for all input values and is therefore independent of the value of the input particles. Thus, we approximate $P_{Y|X}(y|\mathbf{x})$ by a normal distribution with mean given by the true value of $y$ of equation (4.19) and a constant variance $\sigma_n^2$ that is independent of the input. This is demonstrated in Figure 4.9, where the dotted line indicates the cross-sectional view of the actual conditional distribution $P_{Y|X}(y|\mathbf{x})$, whose variance $\sigma^2_{Y|X}(y|\mathbf{x})$ is signal dependent and decreases with increasing values of $x_1$. The highlighted area indicates the approximate conditional distribution with constant variance $\sigma_n^2$, which is equal to the mean value of $\sigma^2_{Y|X}(y|\mathbf{x})$. In order to capture the dependence on the receptor parameters, we compute the noise variance by transmitting a large number of inputs and observing the output $y$ for a fixed set of receptor parameters. Since the actual value of the variance depends on the value of the input $\mathbf{x}$, the value of the variance corresponding to a fixed set of receptor parameters is obtained by averaging over the variance observed for different values of the input. This process is repeated until we obtain the variance values over the complete range of receptor parameters. The variance $\sigma_n^2$ is a function of the receptor parameters and is computed by regression on the average variance of the true conditional distribution that is observed for a range of different receptor parameters. Multivariate polynomial regression results in the following expression for $\sigma_n^2$:

$$\sigma_n^2 = (0.946 k_1^2 + 0.711 k_2^2 + 27.518 k_{12}^2 - 22.041 k_1 k_2 + 5.806 k_1 k_{12} + 4.684 k_2 k_{12} + 21.804 k_1 + 19.352 k_2 - 66.008 k_{12} + 58.781) \times 10^{-3} \tag{4.20}$$

The noise distribution can now be employed to evaluate the capacity of the protein array channel.

Figure 4.10: KL-Divergence between true and fixed variance distributions.

The degree of error incurred by applying the constant variance assumption can be gauged by comparing the actual, variable variance, probability distributions with the constant variance probability distribution employed for capacity calculation using a metric such as the Kullback-Leibler (KL) Divergence. Figure 4.10 plots, along the y-axis, the percentage of the observed (variable variance) output distributions whose KL-Divergence with the approximate, constant variance, distribution is less than the KL-Divergence listed along the x-axis. For example, given a KL-Divergence of 0.5 we can see that 55% of the observed distributions have a KL-Divergence of less than 0.5 when compared with the approximate distribution. Similarly, about 70% of the observed distributions are within a KL-Divergence of less than 1 from the constant variance distribution.
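The kind of comparison summarized in Figure 4.10 can be illustrated with a short sketch. It assumes, purely for illustration, that each observed distribution is itself Gaussian with the same mean as the approximation, so that the closed-form KL divergence between two normal distributions applies; how the divergence was actually computed for the figure is not stated here, and the variance values below are made up for the example.

```python
import math

def kl_gaussian(mu_p, var_p, mu_q, var_q):
    """KL divergence D(P || Q) between two univariate normal distributions."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)

def fraction_below(observed_vars, approx_var, threshold):
    """Fraction of observed distributions whose divergence from the
    constant-variance approximation is below the threshold."""
    hits = sum(1 for v in observed_vars
               if kl_gaussian(0.0, v, 0.0, approx_var) < threshold)
    return hits / len(observed_vars)

# Hypothetical signal-dependent variances at five different inputs.
observed = [0.02, 0.05, 0.066, 0.1, 0.3]
print(fraction_below(observed, approx_var=0.066, threshold=0.5))
```

Sweeping `threshold` and plotting the returned fraction gives a cumulative curve of the same type as the one in Figure 4.10.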
4.3 Proteomic Channel Capacity

The approximate conditional distribution can now be used to estimate the information capacity of the proteomic channel. The information capacity of any communication system is the maximum amount of information that can be successfully conveyed from a transmitter to a receiver in a single use [75]. Formally it is defined as the maximum mutual information between the transmitted and the received signal, with maximization performed over all probability distributions defined on the input alphabet. For the protein array communication system with input $\mathbf{x} = [x_1, x_2]$, output $Y$ and a given set $(k_1, k_2, k_{12})$ of probe parameters, we define capacity as:

$$C = \max I(\mathbf{x}; y)\,\big|_{(k_1, k_2, k_{12})} \tag{4.21}$$
$$= \max\,[H(\mathbf{x}) - H(\mathbf{x}|y)] \tag{4.22}$$
$$= \max\,[H(y) - H(y|\mathbf{x})] \tag{4.23}$$

where $I(\mathbf{x}; y)$ represents the mutual information [76]. Capacity is obtained via maximization of the mutual information $I(\mathbf{x}; y)$ between the input and output signals over all possible probability distributions defined on the input alphabet. A combinatorial protein array channel can be viewed as a transform $\Gamma: \mathbb{Z}^2_{\ge 0} \to \mathbb{R}$ that maps the input $\mathbf{x}$ to the output $y$. The transform $\Gamma$ is noisy and models the effect of different noise sources found in a proteomic channel. In section 4.2 it was assumed that the output noise distribution is identical for all possible noise-free (deterministic) outputs $y_D$; therefore, we can replace the noisy map $\Gamma$ with a deterministic noise-free transform $\Gamma_D$ followed by an AWGN noise model. Equation (4.23) can now be rewritten as:

$$C = \max\,[H(y) - H(y|y_D)] \tag{4.24}$$
$$= \max I(y; y_D) \tag{4.25}$$

Since the output $y$ is equal to the deterministic transducer output $y_D$ plus Gaussian noise, we can employ the expression for the capacity of an additive white Gaussian noise (AWGN) channel.
The validity of using a Gaussian distribution can be investigated using Goodness-of-Fit (GOF) measures. For this purpose three quantitative GOF metrics, namely (1) Kolmogorov-Smirnov, (2) Chi-Squared and (3) Anderson-Darling test metrics, were employed. These metrics enable us to check whether observed samples are generated by a specific distribution (Gaussian in this case). To evaluate each metric we observe the noisy transducer outputs for different instances of the input particle concentration vector $\mathbf{x} = [x_1, x_2]$ in a manner similar to that outlined in section 4.2. The observed output sample vector, for a given parameter configuration $[k_1, k_2, k_{12}]$ and input concentration instance $\mathbf{x} = [x_1, x_2]$, is considered to have a Gaussian distribution if the GOF test accepts the null hypothesis at a significance level of 1%. The number of observed output sample vectors that pass the GOF test is then divided by the total number of transmitted sample vectors to compute the percentage of observed data that is considered to be Gaussian distributed. This value is averaged over all parameter configurations to obtain the mean value for each of the three metrics. It was observed that for all 3 test metrics, on average, more than 80% of the observed outputs have a Gaussian distribution. To be specific, 88.5% of the data passed the GOF test when using the Chi-Squared test; 97.47% of the data passed using the Kolmogorov-Smirnov test and 81.6% passed using the Anderson-Darling test. Therefore, the assumption of the output having a Gaussian distribution is not an unfair one.

The capacity of an AWGN channel with noise variance $\sigma_n^2$ and an average input power less than or equal to $P$ is given by [75]:

$$C = \frac{1}{2} \log\left( 1 + \frac{P}{\sigma_n^2} \right) \tag{4.26}$$

For a proteomic channel the equivalent of a power constraint is an upper bound on the variance of the input particles. However, as described in section 4.1.2, the variance of the concentration of the input particles is not under our control. But it is reasonable to bound (from above) the concentration level of the input particles (by placing an upper bound on the variance of $y_D$). The capacity of a protein array as a function of receptor parameters can now be computed by substitution of equation (4.20) into equation (4.26).

Figure 4.11: Capacity of protein array channel for different values of receptor parameters. Variance $P$ of the input distribution is the same for all settings and is set equal to 10.

Figure 4.11 plots the estimated capacity of the protein array channel for different values of receptor parameters. The concentration level (equivalent to power) of the input particles is kept fixed (at the same level) for all the experiments. Because three receptor parameters were involved in the sweep, each panel in Figure 4.11 corresponds to a fixed value of $k_1$, which is the receptor sensitivity to protein-1. Due to limited space we present only a small number of the capacity curves here; however, the trends observed in Figure 4.11 were observed over the entire range of the receptor parameters. It can be observed from the plots that for all parameter settings a higher capacity is achieved when we use a combinatorial receptor that can bind with both proteins simultaneously. Increasing the joint hybridization parameter $k_{12}$ always improves the capacity. At low values of the protein-1 sensitivity parameter, $k_1$, increasing the sensitivity ($k_2$) to protein-2 generally results in a decrease in the capacity, as can be seen in Figure 4.11(a) to (c). At higher sensitivities to protein-1, however, the value of the sensitivity parameter of protein-2 does not have a significant impact on the overall capacity. Relative to a specific receptor, the highest capacity gains are achieved at lower values of $k_1$. This can be attributed to the fact that the joint hybridization parameter $k_{12}$ has a more significant impact on the variance at lower values of $k_1$. For example, by comparing the bottom two plots in Figure 4.7 and Figure 4.8, it is apparent that increasing $k_{12}$ from 0 to 1 results in a much more significant reduction in the overall variance when $k_1 = 0.2$ (Figure 4.7) in comparison to the case where $k_1 = 1$ (Figure 4.8). As the value of $k_1$ increases, the gain in capacity (relative to a specific probe) diminishes; however, the loss is not very significant.
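The substitution of equation (4.20) into equation (4.26) can be sketched directly in code. The snippet below is a hedged illustration: the coefficients and signs are those read from equation (4.20), the base-2 logarithm (capacity in bits per channel use) is an assumption since the base is not specified, and the function names are hypothetical.

```python
import math

def noise_variance(k1, k2, k12):
    """Regression model of the average noise variance, equation (4.20)."""
    poly = (0.946 * k1 ** 2 + 0.711 * k2 ** 2 + 27.518 * k12 ** 2
            - 22.041 * k1 * k2 + 5.806 * k1 * k12 + 4.684 * k2 * k12
            + 21.804 * k1 + 19.352 * k2 - 66.008 * k12 + 58.781)
    return poly * 1e-3

def capacity(k1, k2, k12, power=10.0):
    """AWGN capacity, equation (4.26), in bits (log base 2 assumed)."""
    return 0.5 * math.log2(1.0 + power / noise_variance(k1, k2, k12))

# Sweep the joint hybridization parameter for a low protein-1 sensitivity.
for k12 in (0.0, 0.5, 1.0):
    print(k12, round(capacity(0.2, 0.5, k12), 3))
```

Under this model the variance falls monotonically as `k12` grows over [0, 1], which is consistent with the observation that increasing the joint hybridization parameter always improves the capacity, and at low `k1` a larger `k2` raises the variance and so lowers the capacity.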
CHAPTER 5
KERNEL MACHINES FOR CAPACITY ESTIMATION

In chapter 4 the capacity of a dual-protein proteomic channel was computed by approximating it as an additive white Gaussian noise (AWGN) channel. The proteomic channel is a non-linear channel with high-dimensional, continuous input alphabets. Capacity evaluation of such a channel is challenging and generally requires numerical solutions. Furthermore, even with numerical techniques it often becomes very challenging to optimize Shannon's information measures such as the mutual information. This chapter presents a framework that employs Gini kernel machines to evaluate the (quadratic) mutual information of the proteomic channel. For this purpose a novel proteomic kernel is proposed which incorporates the bio-physics of the receptor and target protein interactions into the optimization problem. Furthermore, it enables array designers to identify the most important probes amongst a large number of candidates. In contrast to the capacity evaluation approach in chapter 4, which was limited to only 2 input proteins, the framework presented in this chapter considers a large number of input proteins. Although the earlier approach can be extended to a larger number of proteins, this becomes difficult because the number of cross terms ($k_{ij}$) in the transducer model of equation (4.19) increases significantly as the number of target proteins increases.

The organization of this chapter is as follows. Section 5.1 describes the diffusion model employed. The transducer model is presented in section 5.2. A framework for evaluation of the capacity using Gini kernel machines is presented in section 5.3. A novel kernel employed for capacity estimation is proposed in section 5.4. The optimization algorithm is described in section 5.5.

Figure 5.1: Cross-sectional view of diffusion in a multi-protein array.
5.1 Diffusion Model

Analytic solutions for diffusion models containing multiple protein targets in 3-dimensional volumes are often impossible to obtain. With this in mind, this chapter employs a simple diffusion model which views the protein array reaction chamber at steady state under well-mixed conditions. The physical representation of a simplified three-protein assay with multiple receptors and input proteins is shown in Figure 5.1. At the start of the test, a sample volume containing the particles to be analyzed is injected into the reaction chamber. In order to keep the model mathematically tractable we assume that the reaction volume can be divided into smaller, cubic sub-volumes of equal size, as shown in Figure 5.1, and consider the random variation of particles only inside the sub-volumes containing the receptors located at the bottom of the reaction volume.

For a protein array which performs multiplexed detection of $P$ types of proteins, we denote the input of the system by a vector $\mathbf{u} \in \mathbb{Z}_+^P$. Each component of $\mathbf{u}$ is a non-negative integer random variable denoted by $u_n$ and represents the total number of particles of protein-$n$ in the test sample. Particles are injected into the reaction chamber using a device such as a micro-pipette. It is assumed that the input particles are distributed uniformly throughout the reaction volume, so that there is an equal number of protein particles, of each type, inside each sub-volume. The vector $\mathbf{x} \in \mathbb{Z}_+^P$ represents the average number of proteins of each type inside each sub-volume. Therefore, the average number of particles of protein-$n$ inside each sub-volume is given by:

$$x_n = \frac{u_n}{M}, \qquad n = 1, \ldots, P \tag{5.1}$$

where $M$ represents the total number of sub-volumes inside the reaction volume. Due to the Brownian dynamics of particle diffusion, the true number of particles inside a sub-volume is a random variable whose average value is given by equation (5.1). Under the assumptions that the probability of two particles occupying the exact same spatial location is negligible and that the motion of all individual particles inside the reaction volume is independent of each other, it can be shown that the actual number of particles $\tilde{x}_n$, of protein-$n$, inside a sub-volume is a Poisson random variable, with an arrival rate equal to the average number
of particles given by $x_n$ [77], [78]:

$$\tilde{x}_n \sim \mathrm{Poiss}(x_n), \qquad n = 1, \ldots, P \tag{5.2}$$

This model can now be used to characterize the random variation of concentration inside the receptor sub-volumes.

5.2 Receptor Response Model

The receptor response model employed here is an extension of the joint model presented in section 4.1.2. The output of a receptor in a protein array with $P$ different types of input proteins is given by:

$$y = g(\mathbf{x}) = y_0 + \sum_{i=1}^{P} k_i \log\left(\delta + x_i\right) + \sum_{i \neq j} k_{ij} \log\left(\delta + x_i + x_j\right) \tag{5.3}$$

Here, the sensitivity parameters $k_i, k_{ij} \in \mathbb{R}_{\geq 0}$. $k_i = 0$ implies that the receptor does not contain probes that interact with particles of protein-$i$, whereas $k_{ij} = 0$ implies that the probes of type-$i$ do not interact with protein-$j$.

5.3 Proteomic Channel Capacity Estimation

The information capacity of any communication system is defined as the maximum mutual information between the transmitted signal $\mathbf{x}$ and the received signal $\mathbf{y}$ [76]. The maximization is generally performed over all probability distributions defined on the input alphabet:

$$C = \max_{P(\mathbf{x})} I(\mathbf{x}; \mathbf{y}) \tag{5.4}$$

Channel capacity is generally a difficult metric to compute. Deriving it analytically for complex channels can be very challenging, if not impossible. Numerical solutions are a more viable option; however, classic numerical approaches for capacity evaluation such as the Arimoto-Blahut algorithm [79], [80] are limited to finite input and output alphabets. Although these algorithms have been extended to continuous input and/or output alphabets [81], the evaluation of capacity for continuous channels (such as the proteomic channel) with high-dimensional input-output alphabets is still an open problem. The primary challenge in capacity evaluation of the proteomic channel is accurate estimation of the conditional channel distribution $P(\mathbf{y} \mid \mathbf{x})$, which is difficult due to the high-dimensional and continuous nature of the input and output alphabets. To elaborate further, a traditional route would be to use the empirical conditional channel distribution $\hat{P}(\mathbf{y} \mid \mathbf{x})$; however, an accurate estimate is difficult to obtain for high-dimensional, continuous channels. As a
result, we model proteomic channel capacity estimation as a supervised learning problem in which we employ regression to learn the channel conditional distribution $\tilde{P}(\mathbf{y} \mid \mathbf{x})$ from a finite set of training examples and then use it to evaluate the mutual information, which is then maximized to attain capacity. The capacity estimation of the proteomic channel is illustrated in Figure 5.2. Note that instead of Shannon's mutual information the framework in this chapter employs a quadratic measure of mutual information.

Figure 5.2: Block diagram illustrating the computation of capacity of the proteomic channel.

In the framework of supervised learning, the learner is supplied with a training set of feature vectors $\mathcal{T} \subset \mathcal{X}$: $\mathcal{T} = \{\mathbf{x}_i\}$, $i = 1, \ldots, N$, drawn independently from a fixed distribution $P(\mathbf{x})$, $\mathbf{x} \in \mathcal{X}$. In the current scenario the input feature space is $\mathcal{X} = \mathbb{Z}_+^P$ and corresponds to the number of particles of each protein type present inside the receptor sub-volumes. Also provided to the learner is a set of conditional probability measures $y_{ik} = \hat{P}(y_k \mid \mathbf{x}_i)$ defined over the set of receptor spots $y_k$ with $k = 1, \ldots, S$. The labels are therefore normalized and satisfy $\sum_{k=1}^{S} y_{ik} = 1$. The task of the learner is to choose a set of regression functions $\tilde{P} = \{\tilde{P}(y_k \mid \mathbf{x})\}$, $k = 1, \ldots, S$, that accurately predict the true conditional probabilities $P(y_k \mid \mathbf{x})$ for the receptor spots. This is accomplished by using a distance metric $D_Q : \mathbb{R}^S \times \mathbb{R}^S \to \mathbb{R}$ that embeds prior knowledge about the topology of the feature space. The capacity estimation of the proteomic channel can therefore be formulated as the maximization of the mutual information subject to the constraint that the distance between the empirical distribution $\hat{P}(y_k \mid \mathbf{x})$ and the estimated distribution $\tilde{P}(y_k \mid \mathbf{x})$ is less than or equal to an error threshold $\varepsilon$:

$$\max_{\boldsymbol{\alpha}} \; I(\mathbf{x}; \mathbf{y}) \quad \text{s.t.} \quad D_Q\left(\hat{P}(y_k \mid \mathbf{x}), \tilde{P}(y_k \mid \mathbf{x})\right) \leq \varepsilon \tag{5.5}$$

In contrast to the conventional capacity computation, the maximization for the proteomic channel is performed over the channel parameters $\boldsymbol{\alpha} = [\alpha_1, \ldots, \alpha_P]$, which the array designer can vary to embed the maximum amount of input information in the channel output. The parameters $\boldsymbol{\alpha} \in \mathbb{R}_{\geq 0}^P$ determine the importance assigned to each type of probe; they are discussed in detail in section 5.4. The optimization in (5.5) can also be rewritten as:

$$\max_{\boldsymbol{\alpha}} \; D_I\left(\hat{P}(y_k \mid \mathbf{x}), P(y_k)\right) \quad \text{s.t.} \quad D_Q\left(\hat{P}(y_k \mid \mathbf{x}), \tilde{P}(y_k \mid \mathbf{x})\right) \leq \varepsilon \tag{5.6}$$

where $D_I : \mathbb{R}^S \times \mathbb{R}^S \to \mathbb{R}$ represents a distance metric. Although it is possible to employ a metric such as the Kullback-Leibler divergence (KLD), it makes the optimization problem difficult, and therefore a quadratic distance will be employed instead. For the proteomic channel it is reasonable to assume that the output distribution is $P(y_k) = P_u = U[y_0, y_{\max}]$ for $k = 1, \ldots, S$. In other words, $P(y_k)$ is uniformly distributed between the control output $y_0$ and the maximum transducer output under saturation conditions, $y_{\max}$. This enables us to reformulate the problem of mutual information estimation as a training procedure involving the minimization of a joint distance metric over the probability functions $\tilde{P} = \{\tilde{P}(y_k \mid \mathbf{x})\}$:

$$\min_{\tilde{P}} G(\tilde{P}) = \min_{\tilde{P}} \left[ D_Q(\hat{P}, \tilde{P}) + \gamma D_I(\tilde{P}, P_u) \right] \tag{5.7}$$

In this setting the distance metric $D_I(\cdot, \cdot)$ can be viewed as an agnostic (non-informative) distance measure which assumes no knowledge of the training data. The hyper-parameter $\gamma > 0$ controls the trade-off between the agnostic and prior distance metrics. Minimization of the cost function in (5.7) yields the mutual information $I(\mathbf{x}; \mathbf{y})$ based on the distribution $\tilde{P}(\mathbf{y} \mid \mathbf{x})$ that lies between the prior distribution $\hat{P}(\mathbf{y} \mid \mathbf{x})$
and the non-informative (agnostic) distribution $P_u$. The minimization setup in (5.7) can be coupled with linear constraints defined on the cumulative statistics of the training set. The first linear constraint expresses equivalence between the average estimated probabilities and the empirical frequencies for each receptor over the training data:

$$\sum_{i=1}^{N} \tilde{P}(y_k \mid \mathbf{x}_i) = \sum_{i=1}^{N} \hat{P}(y_k \mid \mathbf{x}_i), \qquad k = 1, \ldots, S \tag{5.8}$$

This is based on the assumption that all features $\mathbf{x} \in \mathbb{Z}_+^P$ are equally likely. The normalization and boundary conditions for valid probability distributions are given by a second set of linear constraints:

$$\tilde{P}(y_k \mid \mathbf{x}) \geq 0, \qquad k = 1, \ldots, S \tag{5.9}$$

$$\sum_{k=1}^{S} \tilde{P}(y_k \mid \mathbf{x}_i) = 1 \tag{5.10}$$

where the normalizing constraint (5.10) subsumes the additional inequality constraint $\tilde{P}(y_k \mid \mathbf{x}) \leq 1$, $k = 1, \ldots, S$. Combining (5.5) and (5.7), the evaluation of the proteomic channel capacity becomes a max-min optimization problem:

$$C_Q = \max_{\boldsymbol{\alpha}} \min_{\tilde{P}} \left[ D_Q(\hat{P}, \tilde{P}) + \gamma D_I(\tilde{P}, P_u) \right] \tag{5.11}$$

subject to the constraints listed in (5.8), (5.9) and (5.10). Here, capacity is denoted by $C_Q$ to highlight that it is a quadratic capacity and to differentiate it from the optimization constant $C$ used in the subsequent discussion.

The minimization step in (5.11) is the same as the optimization problem used in section 3.1.1; therefore, the same process can be applied to obtain the quadratic distance $D_Q$ between the conditional distributions $\hat{P}(y_k \mid \mathbf{x})$ and $\tilde{P}(y_k \mid \mathbf{x})$:

$$D_Q(\hat{P}, \tilde{P}) = \frac{C}{2} \sum_{k=1}^{S} \sum_{\mathbf{x}, \mathbf{v} \in \mathcal{T}} K(\mathbf{x}, \mathbf{v}) \left[ \hat{P}(y_k \mid \mathbf{x}) - \tilde{P}(y_k \mid \mathbf{x}) \right] \left[ \hat{P}(y_k \mid \mathbf{v}) - \tilde{P}(y_k \mid \mathbf{v}) \right] \tag{5.12}$$

Here $K : \mathbb{R}^S \times \mathbb{R}^S \to \mathbb{R}$ represents a symmetric, positive definite kernel satisfying Mercer's criterion. Although any standard off-the-shelf kernel such as a Gaussian radial basis function or a polynomial spline [55, 57] can be employed for optimization, we employ a kernel designed specifically for the proteomic channel. This is done because the purpose of the kernel $K(\mathbf{x}, \mathbf{v})$ is to quantify the topology of the metric space for points $\mathbf{x}, \mathbf{v} \in \mathcal{X}$, and it should therefore, preferably, capture the underlying bio-physics of the proteomic channel.
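The quadratic distance of equation (5.12) is a kernel-weighted quadratic form in the difference between the empirical and estimated conditional probabilities. The Python sketch below evaluates it for a precomputed kernel matrix. The kernel matrix `Q`, the probability tables and the value of `C` are hypothetical illustration values, not data from this thesis.

```python
def quadratic_distance(Q, p_hat, p_tilde, C=1.0):
    """Evaluate D_Q of equation (5.12) for a precomputed kernel matrix.

    Q[i][j]       : kernel value K(x_i, x_j) for training points x_i, x_j
    p_hat[i][k]   : empirical probability of receptor spot k given x_i
    p_tilde[i][k] : estimated probability of receptor spot k given x_i
    """
    n_samples = len(Q)
    n_spots = len(p_hat[0])
    total = 0.0
    for k in range(n_spots):
        # Difference between empirical and estimated probabilities for spot k.
        diff = [p_hat[i][k] - p_tilde[i][k] for i in range(n_samples)]
        for i in range(n_samples):
            for j in range(n_samples):
                total += Q[i][j] * diff[i] * diff[j]
    return 0.5 * C * total


# Hypothetical symmetric, positive definite kernel matrix for N = 3 samples.
Q = [[1.0, 0.6, 0.1],
     [0.6, 1.0, 0.2],
     [0.1, 0.2, 1.0]]

# Hypothetical empirical distributions over S = 2 receptor spots.
p_hat = [[0.7, 0.3], [0.6, 0.4], [0.1, 0.9]]
p_uniform = [[0.5, 0.5] for _ in range(3)]

d_same = quadratic_distance(Q, p_hat, p_hat)
d_uniform = quadratic_distance(Q, p_hat, p_uniform)
print(d_same, d_uniform)
```

With a positive definite kernel the distance is zero only when the estimate reproduces the empirical distribution at every training point, and it grows as the estimate moves towards the uniform (agnostic) distribution.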
Use of a Gini quadratic distance as the agnostic distance metric $D_I$, along with a uniform distribution $P_u = 1/(y_{\max} - y_0)$, yields the following cost function:

$$H_g = \sum_{k=1}^{S} \left[ \frac{1}{2C} \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_k^i Q_{ij} \lambda_k^j + \frac{\gamma}{2} \sum_{i=1}^{N} \left( \hat{P}(y_k \mid \mathbf{x}_i) - \lambda_k^i / C \right)^2 \right] \tag{5.13}$$

where $Q_{ij} = K(\mathbf{x}_i, \mathbf{x}_j)$ and the inference parameters are defined as $\lambda_k^i = C \left[ \hat{P}(y_k \mid \mathbf{x}_i) - \tilde{P}(y_k \mid \mathbf{x}_i) \right]$. As was the case in chapter 3, the Gini dual is subject to the following constraints:

$$\sum_{k=1}^{S} \lambda_k^i = 0, \; i = 1, \ldots, N; \qquad \sum_{i=1}^{N} \lambda_k^i = 0, \; k = 1, \ldots, S; \qquad \lambda_k^i \leq C \hat{P}(y_k \mid \mathbf{x}_i) \tag{5.14}$$

The Gini dual in (5.13) is a quadratic function and can be minimized using standard quadratic optimization libraries. The next section introduces the proteomic kernel which can be employed to optimize the receptor parameters.

5.4 Proteomic Kernel

Figure 5.3: Illustration of interactions between protein of type 'i' and two different types of capturing probes.

The purpose of the kernel $K(\mathbf{x}, \mathbf{v})$ is to incorporate knowledge of the metric space $\mathcal{X}$ of the input vectors $\mathbf{x}$ and $\mathbf{v}$. In the current context this means that we require a similarity measure which takes into consideration the underlying bio-physics of the receptor and protein interactions. To elaborate further, a conventional off-the-shelf kernel returns a high value if its input vectors $\mathbf{x}$ and $\mathbf{v}$ contain similar values and a lower value if they are dissimilar. For the proteomic sensing application, in contrast, we require a kernel which returns a high value if the receptor probes interact with the two input concentration vectors in a similar manner.
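The behaviour of a conventional kernel described above can be checked with a short Python sketch using a Gaussian radial basis function. The concentration vectors and the kernel width `sigma` are hypothetical illustration values. The point of the sketch is that such a kernel depends only on the Euclidean distance between its inputs, so it cannot tell which protein concentration has changed.

```python
import math


def rbf_kernel(x, v, sigma):
    """Gaussian radial basis function kernel: exp(-|x - v|^2 / (2 sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Hypothetical particle counts for three protein types.
x = (100.0, 0.0, 0.0)
v_extra_protein3 = (100.0, 0.0, 50.0)  # differs from x only in protein-3
v_extra_protein2 = (100.0, 50.0, 0.0)  # differs from x only in protein-2

sigma = 50.0  # hypothetical kernel width
k_protein3 = rbf_kernel(x, v_extra_protein3, sigma)
k_protein2 = rbf_kernel(x, v_extra_protein2, sigma)

# Both comparisons give the same similarity, irrespective of which
# protein a given receptor is actually able to bind.
print(k_protein3, k_protein2)
```

A receptor that binds protein-2 but not protein-3 would respond differently to the two perturbed vectors, yet the off-the-shelf kernel assigns them the same similarity to `x`. This is the gap the proteomic kernel is intended to close.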
EXAMPLE: Consider the simple case of an array with 3 protein targets and input vectors $\mathbf{x} = [100, 0, 0]$ and $\mathbf{v} = [100, 0, 50]$. Assume that the receptor under consideration (shown in Figure 5.3) contains only probes that interact with protein-1 and protein-2 but not with protein-3; as a result, the output of the receptor will be unaffected by the concentration of protein-3 and will be identical for these values of $\mathbf{x}$ and $\mathbf{v}$. The proteomic kernel should take this into consideration and return a high similarity value when comparing $\mathbf{x}$ and $\mathbf{v}$. If, however, the concentration vectors are $\mathbf{x} = [100, 0, 0]$ and $\mathbf{v} = [100, 50, 0]$, then the receptor outputs resulting from $\mathbf{x}$ and $\mathbf{v}$ will be different. This should be reflected by a corresponding decrease in the similarity value returned by the proteomic kernel.

The proteomic kernel can be developed by using a product of two sub-kernels, as will become clear below. Assume that, within the sub-volume of a combinatorial receptor, a particle of protein-$q$ reacts with a probe of type-$p$ with a finite probability $\xi_{pq}$, as illustrated in Figure 5.3. Since there are a maximum of $P$ different types of target proteins, the number of different types of probes which can be immobilized on a receptor is also $P$. The total number of particles of protein-$q$ within a receptor sub-volume is $x_q$ (the $q$-th element of the vector $\mathbf{x}$). The binding of the $x_q$ particles to the different types of probes can then be modeled by a multinomial distribution with $P$ possible outcomes (under the assumption that the binding of individual particles is independent of each other). The average number of particles of protein-$q$ that can be attached to probes of type-$p$ is therefore given by:

$$\omega_{pq} = x_q \xi_{pq} \tag{5.15}$$

However, the number of probes immobilized at the receptor is finite; therefore, $\omega_{pq}$ is bounded from above by the maximum number of probes of type-$p$ (denoted by $L_p$):

$$\omega_{pq} = \min\left( L_p, \; x_q \xi_{pq} \right) \tag{5.16}$$

We now have a $P$-dimensional vector $\boldsymbol{\omega}_p$ which tells us the (average) number of particles of each protein type that can be accommodated by the probes of type-$p$ for a given $\mathbf{x}$. The similarity between $\mathbf{x}$ and $\boldsymbol{\omega}_p$ can be computed using a radial basis function kernel:

$$K_p(\mathbf{x}, \boldsymbol{\omega}_p) = \exp\left( -\frac{|\mathbf{x} - \boldsymbol{\omega}_p|^2}{2\sigma^2} \right) \tag{5.17}$$

For a receptor containing $P$ different types of probes we define the proteomic kernel $K(\mathbf{x}, \mathbf{v})$ as:

$$K(\mathbf{x}, \mathbf{v}) = \sum_{p=1}^{P} \alpha_p K_p(\mathbf{x}, \boldsymbol{\omega}_p) K_p(\mathbf{v}, \boldsymbol{\omega}_p) \tag{5.18}$$

where the parameter $\alpha_p$ controls the importance assigned to probes of type-$p$. A higher value of $\alpha_p$ indicates the need to immobilize a large number of probes of type-$p$ at the receptor, whereas a value close to zero implies that probes of type-$p$ should not be employed. In other words, the value of $\alpha_p$ indicates how much importance should be assigned to probes of type-$p$, whose reaction characteristics are given by the vector $\boldsymbol{\omega}_p$.

Notice that the number of $\alpha_p$ parameters in equation (5.18) is equal to $P$. Therefore, the number of optimization variables is only $P$. This is a significant advantage that is achieved only due to the use of kernel methods. If the optimization problem had been formulated without kernel methods, directly in terms of the probe parameters $k_1, k_2, k_{12}$ etc., then the number of optimization variables would have been prohibitively large. Consider for example the joint model in equation (5.3): formulated directly in terms of the probe parameters, the number of optimization variables would have been $P + P(P-1) = P^2$ due to the large number of cross terms ($k_{ij}$). In the proteomic kernel, however, the cross terms are present inside the $\boldsymbol{\omega}_p$ and therefore do not need to be optimized explicitly.

5.5 Optimization Algorithm

Evaluation of the quadratic capacity in equation (5.11) is performed using an alternating max-min procedure. In the first step, the probe parameters $\alpha_p$ are initialized to uniform values and minimization is performed using a process identical to the optimization approach employed in chapter 3. After the minimization step, the algorithm fixes the inference parameters $\lambda_k^i$ and $\lambda_k^j$ and performs maximization over the probe parameters. This section first incorporates the probe parameters inside the optimization function and then describes the algorithm which can be employed for maximization.
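Before the kernel is substituted into the dual, the construction of section 5.4 (equations (5.15) to (5.18)) can be sketched in Python as follows. All numerical values (binding probabilities `xi`, probe capacities, weights `alpha` and width `sigma`) are hypothetical. The sketch also assumes that the occupancy vector of each probe type is computed separately from each of the two input vectors, which is one possible reading of "for a given x" in the text.

```python
import math


def probe_occupancy(x, xi_p, capacity_p):
    """omega_pq = min(L_p, x_q * xi_pq) for every protein q (equation 5.16)."""
    return [min(capacity_p, x_q * xi_pq) for x_q, xi_pq in zip(x, xi_p)]


def sub_kernel(x, omega_p, sigma):
    """K_p(x, omega_p) = exp(-|x - omega_p|^2 / (2 sigma^2)) (equation 5.17)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, omega_p))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def proteomic_kernel(x, v, xi, capacities, alpha, sigma):
    """K(x, v) = sum_p alpha_p K_p(x, omega_p) K_p(v, omega_p) (equation 5.18).

    Assumption: omega_p is evaluated from x for the first factor and from v
    for the second factor.
    """
    total = 0.0
    for xi_p, capacity_p, alpha_p in zip(xi, capacities, alpha):
        k_x = sub_kernel(x, probe_occupancy(x, xi_p, capacity_p), sigma)
        k_v = sub_kernel(v, probe_occupancy(v, xi_p, capacity_p), sigma)
        total += alpha_p * (k_x * k_v)
    return total


# Hypothetical set-up with P = 3 proteins and 3 probe types.
xi = [[0.5, 0.5, 0.0],   # probe type 1 binds protein-1 and protein-2 only
      [0.0, 0.2, 0.8],   # probe type 2
      [0.3, 0.0, 0.7]]   # probe type 3
capacities = [40, 30, 60]          # L_p, maximum number of probes per type
alpha = [1.0 / 3.0] * 3            # uniform probe weights, summing to one
sigma = 50.0

x = (100.0, 0.0, 0.0)
v = (100.0, 0.0, 50.0)
k_xv = proteomic_kernel(x, v, xi, capacities, alpha, sigma)
k_vx = proteomic_kernel(v, x, xi, capacities, alpha, sigma)
print(k_xv)
```

Only the $P$ weights in `alpha` would be optimized. The pairwise binding behaviour enters through `xi` and `capacities`, which are treated as fixed properties of the probes.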
Substituting the proteomic kernel of equation (5.18) into the dual of (5.13) yields a new cost function:

$$H_p = \sum_{p=1}^{P} \frac{\alpha_p}{2C} \sum_{k=1}^{S} \sum_{i,j=1}^{N} \lambda_k^i Q_{ip} Q_{jp} \lambda_k^j + \frac{\gamma}{2} \sum_{k=1}^{S} \sum_{i=1}^{N} \left( \hat{P}(y_k \mid \mathbf{x}_i) - \lambda_k^i / C \right)^2 \tag{5.19}$$

where $Q_{ip} = K_p(\mathbf{x}_i, \boldsymbol{\omega}_p)$ and $Q_{jp} = K_p(\mathbf{x}_j, \boldsymbol{\omega}_p)$. In addition to the constraints in (5.14), the proteomic dual (5.19) is subject to the following constraints:

$$\sum_{p=1}^{P} \alpha_p = 1, \qquad \alpha_p \geq 0, \quad p = 1, \ldots, P \tag{5.20}$$

The cost function in (5.19) is a non-homogeneous polynomial and can have both positive and negative coefficients. In addition, the probability variables $\alpha_p$ in (5.19) are normalized $\forall p$. Such a function can be maximized directly by applying results from [82] and [83].

Theorem 2 ([83]). Let $H(\{\alpha_p\})$ be a polynomial of degree $d$ in variables $\alpha_p$ in the domain $D : \alpha_p \geq 0$, $\sum_{p=1}^{P} \alpha_p = 1$, $p = 1, \ldots, P$. Define an iterative map according to the following recursion:

$$\hat{\alpha}_p \leftarrow \frac{\alpha_p \left( \frac{\partial H}{\partial \alpha_p}(\alpha_p) + \Gamma \right)}{\sum_{p=1}^{P} \alpha_p \left( \frac{\partial H}{\partial \alpha_p}(\alpha_p) + \Gamma \right)} \tag{5.21}$$

where $\Gamma \geq M d (P+1)^{d-1}$ with $M$ being the smallest coefficient of the polynomial $H(\{\alpha_p\})$. Then $\{\hat{\alpha}_p\} \in D$ and $H(\{\hat{\alpha}_p\}) > H(\{\alpha_p\})$.

The polynomial dual in equation (5.19) can be maximized using the result above. Assume that the kernel matrices are bounded such that $|Q_{ip}| \leq Q_{\max}$, $\forall i, p$ and $|Q_{jp}| \leq Q_{\max}$, $\forall j, p$. Furthermore, let the initial value of the probability distribution be $\alpha_p^0 = 1/P$ $\forall p$. Denoting the value of the probability at the $m$-th iteration by $\alpha_p^m$, the update at every step is given by

$$\alpha_p^{m+1} \leftarrow \frac{\alpha_p^m \eta_p^m}{\sum_{p=1}^{P} \alpha_p^m \eta_p^m}$$

where

$$\eta_p^m = \frac{1}{2C} \sum_{k=1}^{S} \sum_{i,j=1}^{N} \lambda_k^i Q_{ip} Q_{jp} \lambda_k^j + \Gamma$$

and $\Gamma = \frac{1}{2C}(P+1) Q_{\max}^2$. The cost function in (5.19) increases at each iteration and the process is repeated until convergence. Some of the distribution variables $\alpha_p$ can never reach unity or zero; this is due to the multiplicative update procedure [51]. In practice, however, they approach the limits within precision margins that are comparable to other implementations of training algorithms for SVMs. Furthermore, values of the distribution $\alpha_p$ close to unity or zero show almost no change, which implies that caching and shrinking [84] can also be employed to improve the speed of the large margin growth transformation.
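The multiplicative update above can be illustrated with a small Python sketch. For fixed inference parameters, the cost in (5.19) is linear in the probe weights, so its gradient with respect to each weight is a constant. The gradient values and the constant `gamma_const` used below are hypothetical numbers chosen for illustration. They are not computed from a trained Gini kernel machine, and the sketch covers only the maximization step, not the alternating max-min procedure.

```python
def growth_transform_step(alpha, gradient, gamma_const):
    """One multiplicative update: alpha_p <- alpha_p (g_p + Gamma) / normalizer."""
    weighted = [a * (g + gamma_const) for a, g in zip(alpha, gradient)]
    normalizer = sum(weighted)
    return [w / normalizer for w in weighted]


def maximize_linear_objective(gradient, gamma_const, iterations=200):
    """Iterate the update from uniform weights and record the objective values."""
    n_probes = len(gradient)
    alpha = [1.0 / n_probes] * n_probes  # uniform initialization, alpha_p = 1/P
    history = []
    for _ in range(iterations):
        # Objective that is linear in alpha: sum_p alpha_p * g_p.
        history.append(sum(a * g for a, g in zip(alpha, gradient)))
        alpha = growth_transform_step(alpha, gradient, gamma_const)
    return alpha, history


# Hypothetical per-probe gradient terms (non-negative, as for a squared form)
# and a hypothetical positive constant Gamma.
gradient = [0.2, 0.5, 0.1]
gamma_const = 1.0

alpha_final, history = maximize_linear_objective(gradient, gamma_const)
print(alpha_final)
print(history[0], history[-1])
```

The weights stay on the probability simplex at every step, the objective never decreases, and the weight of the probe type with the largest gradient term approaches one without reaching it exactly, which mirrors the remark above about the multiplicative update.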
The framework in this chapter can be employed for evaluating the information transfer across a multiple-spot protein array channel. Here the information is measured in terms of a quadratic distance. Validation of this framework using experimental data and numerical simulations shall be performed as part of future work.

CHAPTER 6
CONCLUSIONS AND FUTURE WORK

6.1 Summary

The primary objective of this thesis is to examine the potential benefits that can be achieved by applying the principles of kernel methods and signal processing in biosensing applications. For respiratory signal estimation it has been demonstrated that the use of multiple non-invasive electrodes enables a significant reduction in breathing rate estimation error in comparison to using only one or two electrodes. Furthermore, the use of well-designed learning algorithms appears to give additional performance improvement. In this regard a number of algorithms were tested, and a combination of both signal processing and kernel methods appears to be the best approach. In terms of lung-volume estimation, the SEC algorithm appears to enable effective estimation. Wavelet based features were developed for classification of the subject's respiratory state; these features provide a simple yet accurate method for detecting the subject's respiratory state. Wavelet based respiratory state detection outperforms the much simpler DCT based respiratory state detection.

The capacity of the proteomic channel for a small-scale array was evaluated in the presence of diffusion noise and non-ideal receptors. For this array different probe parameters were investigated and it was demonstrated that combinatorial probes give higher capacity than conventional receptor probes. A framework for evaluation of capacity using quadratic information measures was also presented. This framework can be employed to evaluate the capacity of arrays with a significantly higher number of target proteins.
6.2 Future Directions

One of the most appealing future directions that has resulted from this thesis is the use of the wavelet based probability curves for classifying not only the subject's respiratory state but also their physical state. As was demonstrated in section 3.2.5, these curves exhibit different characteristics based on the subject's physical activity and may therefore enable identification of the subject's physical state. This in turn may allow the adaptive algorithm to adjust its behavior accordingly, e.g. assign lower weightage to chest electrodes and higher weightage to abdominal electrodes if arm motion is detected.

For proteomic channel capacity calculations we intend to validate the framework proposed in chapter 5 by employing numerical simulations and experimental prototypes. Another avenue for future research is to investigate the possibility of employing more complex models of diffusion and receptors when evaluating the channel capacity.

BIBLIOGRAPHY

[1] M. R. Neuman, signs Pulse, IEEE, vol. 2, no. 1, pp. 2011.
[2] WHO. (2014, May) The top 10 causes of death. [Online]. Available: http://www.who.int/mediacentre/factsheets/fs310/en/
[3] NIH. (2013) Explore copd. [Online]. Available: http://www.nhlbi.nih.gov/health/health-topics/topics/copd/
[4] (2014) Explore heart failure. [Online]. Available: http://www.nhlbi.nih.gov/health/health-topics/topics/hf/
[5] (2014, Jun.) Lung function tests. [Online]. Available: http://www.nhlbi.nih.gov/health//dci/Diseases/lft/lft_types.html
[6] M. A. Cretikos, R. Bellomo, K. Hillman, J. Chen, S. Finfer, and A. Flabouris, ratory rate: the neglected vital sign, Medical Journal of Australia, vol. 188, no. 11, p. 657, 2008.
[7] T. Moore, assessment in Nursing standard, vol. 21, no. 49, pp. 2007.
[8] C. Butler-Williams, staff awareness of respiratory rate Resuscitation, vol. 62, no. 2, pp.
[9] N. Shamim, M. Atul, C. Gari D et al., fusion for improved respiration rate EURASIP journal on advances in signal processing, vol. 2010, 2010.
[10] G. B. Moody, R. G. Mark, A. Zoccola, and S. Mantero, ation of respiratory signals from multi-lead Computers in cardiology, vol. 12, pp. 1985.
[11] J. A. Hirsch, B. Bishop et al.
, sinus arrhythmia in humans: how breathing pattern modulates heart Am J Physiol, vol. 241, no. 4, pp. 1981.
[12] S.-B. Park, Y.-S. Noh, S.-J. Park, and H.-R. Yoon, improved algorithm for respiration signal extraction from electrocardiogram measured by conductive textile electrodes using instantaneous frequency Medical & biological engineering & computing, vol. 46, no. 2, pp. 2008.
[13] C. Orphanidou, S. Fleming, S. Shah, and L. Tarassenko, fusion for estimating respiratory rate from a single-lead Biomedical Signal Processing and Control, vol. 8, no. 1, pp. 2013.
[14] C. Voscopoulos, D. Ladd, L. Campana, and E. George, vasive respiratory volume monitoring to detect apnea in post-operative patients: Case Journal of clinical medicine research, vol. 6, no. 3, p. 209, 2014.
[15] C. Voscopoulos, J. Brayanov, D. Ladd, M. Lalli, A. Panasyuk, and J. Freeman, aluation of a novel noninvasive respiration monitor providing continuous measurement of minute ventilation in ambulatory subjects in a variety of clinical sc Anesthesia & Analgesia, vol. 117, no. 1, pp. 2013.
[16] American Medical Association, Jun. 2013. [Online]. Available: http://www.ama-assn.org//ama/pub/physician-resources/medical-science/genetics-molecular-medicine/current-topics/proteomics.page
[17] R. Huang, B. Burkholder, V. Sloane Jones, W. Jiang, Y. Mao, Q. Chen, and Z. Shi, antibody arrays in biomarker discovery and v Current Proteomics, vol. 9, no. 1, pp. 2012.
[18] H. Akiyama, S. Barger, S. Barnum, B. Bradt, J. Bauer, G. Cole, N. Cooper, P. Eikelenboom, M. Emmerling, B. Fiebich et al., ation and alzheimer's Neurobiology of aging, vol. 21, no. 3, pp. 2000.
[19] E. Hirsch, S. Hunot et al., in parkinson's disease: a target for Lancet neurology, vol. 8, no. 4, pp. 2009.
[20] A. Mantovani, P. Allavena, A. Sica, and F. Balkwill, Nature, vol. 454, no. 7203, pp. 2008.
[21] A. Carlsson, C. Wingren, J. Ingvarsson, P. Ellmark, B. Baldertorp, M. Fernö, H. Olsson, and C. A. Borrebaeck, proteome profiling of metastatic breast cancer using recombinant antibody microarra European Journal of Cancer, vol. 44, no. 3, pp. 2008.
[22] A. Vazquez-martin, R. Colomer, and J. A. Menendez, array technology to detect her2 (erbb-2)-induced 'cytokine signature' in breast c European Journal of Cancer, vol. 43, no. 7, pp. 2007.
[23] E. Gorelik, D. P. Landsittel, A. M. Marrangoni, F. Modugno, L. Velikokhatnaya, M. T. Winans, W. L. Bigbee, R. B. Herberman, and A. E. Lokshin, immunobead-based cytokine profiling for early detection of ovarian Cancer Epidemiology Biomarkers & Prevention, vol. 14, no. 4, pp. 2005.
[24] I. Visintin, Z. Feng, G. Longton, D. C. Ward, A. B. Alvero, Y. Lai, J. Tenthorey, A. Leiser, R. Flores-Saaib, H. Yu et al., markers for early detection of ovarian Clinical Cancer Research, vol. 14, no. 4, pp. 2008.
[25] R.-P. Huang et al., of multiple proteins in an antibody-based protein microarray Journal of immunological methods, vol. 255, no. 1, pp. 2001.
[26] S. W. Tam, R. Wiese, S. Lee, J. Gilmore, and K. D. Kumble, ultaneous analysis of eight human th1/th2 cytokines using microarra Journal of immunological methods, vol. 261, no. 1, pp. 2002.
[27] P. Oroszlan and M. Ehrat, protein microarrays: a novel high performance microarray platform for low abundance protein Proteomics, vol. 2, pp. 393, 2002.
[28] RayBiotech, angiogenesis array g1 (8) code: http://www.raybiotech.com/g-series-human-angiogenesis-array-g1-8.html, 2004.
[29] angiogenesis array g2 (8) code: http://www.raybiotech.com/g-series-human-angiogenesis-array-g2-8.html, 2004.
[30] angiogenesis array g1000 (8) code: http://www.raybiotech.com/g-series-human-angiogenesis-array-g1000-8.html, 2005.
[31] S. Sukhanov and P. Delafontaine, chip-based microarray profiling of oxidized low density lipoprotein-treated Proteomics, vol. 5, no. 5, pp. 2005.
[32] B. Huelseweh, R. Ehricht, and H.-J. Marschall, simple and rapid protein array based method for the simultaneous detection of biowarfare agen Proteomics, vol. 6, no. 10, pp. 2006.
[33] RayBiotech, cytokine array c5 (4) code: https://www.raybiotech.com/c-series-human-cytokine-array-5-4.html, 2009.
[34] R. Huang, W. Jiang, J. Yang, Y. Q. Mao, Y. Zhang, W. Yang, D. Yang, B. Burkholder, R. F. Huang, and R.-P. Huang, biotin label-based antibody array for high-content profiling of protein Cancer Genomics-Proteomics, vol. 7, no. 3, pp.
2010.
[35] Y. Liu, M. Gu, E. Alocilja, and S. Chakrabartty, Ultra-reliable nanoparticle-based electrical detection of biomolecules in the presence of large background in Biosensors and Bioelectronics, vol. 26, no. 3, pp. 2010.
[36] Y. Liu, D. Zhang, E. C. Alocilja, and S. Chakrabartty, detection using a silver-enhanced gold nanoparticle-based biochi Nanoscale research letters, vol. 5, no. 3, pp. 2010.
[37] Y. Liu and S. Chakrabartty, actor graph-based biomolecular circuit analysis for designing forward error correcting Biomedical Circuits and Systems, IEEE Transactions on, vol. 3, no. 3, pp. 2009.
[38] A. Hassibi, H. Vikalo, and A. Hajimiri, noise processes and limits of performance in Journal of applied physics, vol. 102, no. 1, pp. 014909, 2007.
[39] M. Pierobon and I. F. Akyildiz, y of a diffusion-based molecular communication system with channel memory and molecular Information Theory, IEEE Transactions on, vol. 59, no. 2, pp. 2013.
[40] I. Smith, J. Mackay, N. Fahrid, and D. Krucheck, rate measurement: a comparison of metho British Journal of Healthcare Assistants, vol. 5, no. 1, p. 18, 2011.
[41] E. P. Scilingo, A. Lanata, and A. Tognetti, for wearable systems, in Wearable Monitoring Systems. Springer, 2011, pp.
[42] A. J. Smola and B. Schölkopf, tutorial on support vector Statistics and computing, vol. 14, no. 3, pp. 2004.
[43] C. Cortes and V. Vapnik, ort-vector netw Machine learning, vol. 20, no. 3, pp. 1995.
[44] I. Guyon, B. Boser, and V. Vapnik, capacity tuning of very large vc-dimension Advances in neural information processing systems, pp. 1993.
[45] C.-C. Chang and C.-J. Lin, LIBSVM: A library for support vector mac ACM Transactions on Intelligent Systems and Technology, vol. 2, pp. 2011, software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[46] J.-L. Gauvain and C.-H. Lee, um a posteriori estimation for multivariate gaussian mixture observations of markov c Speech and audio processing, ieee transactions on, vol. 2, no. 2, pp. 1994.
[47] D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, eaker verification using adapted gaussian mixture mo Digital signal processing, vol. 10, no. 1, pp. 2000.
[48] C. M. Bishop and N. M. Nasrabadi, Pattern recognition and machine learning. springer New York, 2006, vol. 1.
[49] S. Lloyd, squares quantization in p Information Theory, IEEE Transactions on, vol. 28, no. 2, pp. 1982.
[50] A. P. Dempster, N. M. Laird, D. B. Rubin et al., um likelihood from incomplete data via the em a Journal of the Royal statistical Society, vol. 39, no. 1, pp. 1977.
[51] S. Chakrabartty and G. Cauwenberghs, ni support vector machine: Quadratic entropy based robust multi-class probability The Journal of Machine Learning Research, vol. 8, pp. 2007.
[52] E. T. Jaynes, theory and statistical mec Physical review, vol. 106, no. 4, p. 620, 1957.
[53] T. Jebara, e, generative and imitative Ph.D. dissertation, Massachusetts Institute of Technology, 2001.
[54] D. P. Bertsekas, Nonlinear programming. Athena scientific Belmont, 1999.
[55] B. Schölkopf, C. Burges, and A. Smola, dvances in kernel methods-support vector learning mit Cambridge, MA, 1999.
[56] M. I. Jordan and R. A. Jacobs, hical mixtures of experts and the em Neural computation, vol. 6, no. 2, pp. 1994.
[57] G. Wahba et al., ort vector machines, reproducing kernel hilbert spaces and the randomized Advances in Kernel Methods-Support Vector Learning, vol. 6, pp. 1999.
[58] E. Osuna, R. Freund, and F. Girosi, raining support vector machines: an application to face in Computer Vision and Pattern Recognition, 1997. Proceedings., 1997 IEEE Computer Society Conference on. IEEE, 1997, pp.
[59] J. Platt et al., ast training of support vector machines using sequential minimal opti- Advances in kernel methods-support vector learning, vol. 3, 1999.
[60] T. Poggio and G. Cauwenberghs, tal and decremental support vector machine Advances in neural information processing systems, vol. 13, p. 409, 2001.
[61] S. G. Mallat, theory for multiresolution signal decomposition: the wavelet represen Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 11, no. 7, pp. 1989.
[62] I. Daubechies et al., Ten lectures on wavelets. SIAM, 1992, vol. 61.
[63] D. S. Lemons and A. Gythiel, aul langevin's 1908 paper "on the theory of brownian motion" ["sur la théorie du mouvement brownien," cr acad. sci. (paris) [bold 146], 533 American Journal of Physics, vol. 65, p. 1079, 1997.
[64] P. F. Green, Kinetics and Transport in Soft and Hard Materials. Taylor & Francis Group, 2005.
[65] J. S. Gulliver, Introduction to chemical transport in the environment. Cambridge University Press, 2007.
[66] P. Schuck, of ligand binding to receptor immobilized in a polymer matrix, as detected with an evanescent wave biosensor. i. a computer simulation of the influence of mass transp Biophysical journal, vol. 70, no. 3, pp. 1996.
[67] A. Papoulis and S. U. Pillai, Probability, random variables, and stochastic processes. Tata McGraw-Hill Education, 2002.
[68] M. Pierobon and I. F. Akyildiz, noise analysis for molecular communication in nanonetw Signal Processing, IEEE Transactions on, vol. 59, no. 6, pp. 2011.
[69] S. Wang, W. Guo, S. Qiu, and M. D. McDonnell, erformance of macro-scale molecular communications with sensor cleanse in Telecommunications (ICT), 2014 21st International Conference on. IEEE, 2014, pp.
[70] Y. Liu, S. Chakrabartty, and E. Alocilja, undamental building blocks for molecular biowire based forward error-correcting Nanotechnology, vol. 18, no. 42, p. 424017, 2007.
[71] R. G. Ryall, C. J. Story, and D. R. Turner, of the causes of the "hook effect" in two-site immunoradiometric assa Analytical Biochemistry, vol. 127, no. 2, pp. 1982.
[72] M. Gu and S. Chakrabartty, ast: A framework for simulation and analysis of large-scale protein-silicon biosensor Biomedical Circuits and Systems, IEEE Transactions on, vol. 7, no. 4, pp. 2013.
[73] H. Vikalo, B. Hassibi, and A. Hassibi, statistical model for microarrays, optimal estimation algorithms, and limits of performance Signal Processing, IEEE Transactions on, vol. 54, no. 6, pp. 2006.
[74] limits of performance of dna microarra in Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on, vol. 2. IEEE, 2006, pp. III.
[75] C. E. Shannon, mathematical theory of comm ACM SIGMOBILE Mobile Computing and Communications Review, vol. 5, no. 1, pp. 2001.
[76] T. Cover and J. Thomas, Elements of information theory. Wiley-interscience, 2006.
[77] S. Chandrasekhar, chastic problems in physics and astronomy Reviews of modern physics, vol. 15, no. 1, p. 1, 1943.
[78] M. Höller, dvanced fluorescence fluctuation spectroscopy with pulsed interleaved ex- Ph.D. dissertation, lmu, 2011.
[79] S. Arimoto, algorithm for computing the capacity of arbitrary discrete memoryless c Information Theory, IEEE Transactions on, vol. 18, no. 1, pp. 1972.
[80] R. E. Blahut, of channel capacity and rate-distortion Information Theory, IEEE Transactions on, vol. 18, no. 4, pp. 1972.
[81] J. Dauwels, computation of the capacity of continuous memoryless chan- in Proceedings of the 26th Symposium on Information Theory in the BENELUX. Citeseer, 2005, pp.
[82] L. E. Baum, G. R. Sell et al., rowth transformations for functions on Pacific J. Math, vol. 27, no. 2, pp. 1968.
[83] P. Gopalakrishnan, D. Kanevsky, A. Nadas, and D. Nahamoo, inequality for rational functions with applications to some statistical estimation Information Theory, IEEE Transactions on, vol. 37, no. 1, pp. Jan 1991.
[84] T. Joachims, Text categorization with support vector machines: Learning with many relevant features. Springer, 1998.