CAPACITYASSURANCEINHOSTILENETWORKS By JianLi ADISSERTATION Submittedto MichiganStateUniversity inpartialentoftherequirements forthedegreeof ElectricalEngineering{DoctorofPhilosophy 2015 ABSTRACT CAPACITYASSURANCEINHOSTILENETWORKS By JianLi Linearnetworkcodingprovidesanewcommunicationdiagramtotlyincreasethe networkcapacitybyallowingtherelaynodestoencodetheincomingmessages.However, thiscommunicationdiagramisfragiletocommunicationerrorsandpollutionattacks.How tocombaterrorswhilemaintainingthenetworkisachallengingresearchproblem. Inthisdissertation,westudyhowtocombattheattacksinbothxednetworkcodingand randomnetworkcoding. Fornetworkcoding,weprovideanovelmethodologytocharacterizelinearnetwork codingthrougherrorcontrolcoding.Weproposetomapeachlinearnetworkcodingto anerrorcontrolcoding.Underthismapping,thesetwocodesareessentiallyidenticalin algebraicaspects.Meanwhile,weproposeanovelmethodologytocharacterizealinear networkcodingthroughaseriesofcascadedlinearerrorcontrolcodes,andtodevelopnetwork codingschemesthatcancombatnodecompromisingattacks. Forrandomnetworkcoding,weproposeanewerror-detectionanderror-correction(EDEC) schemetodetectandremovemaliciousattacks.TheproposedEDECschemecanmaintain throughputunchangedwhenmoderatenetworkpollutionexistswithonlyaslightincrease incomputationaloverhead.ThenweproposeanimprovedLEDECschemebyintegrating thelow-densityparitycheck(LDPC)decoding.Ourtheoreticalanalysis,performanceevalu- ationandsimulationresultsusingns-2simulatordemonstratethattheLEDECschemecan guaranteeahighthroughputevenforheavilypollutednetworkenvironment. Distributedstorageisanaturalapplicationofnetworkcoding.Itplaysacrucialrolein thecurrentcloudcomputingframeworkinthatitcanprovideadesignbetweensecu- ritymanagementandstorage.Regeneratingcodebasedapproachattracteduniqueattention becauseitcanachievetheminimumstorageregeneration(MSR)pointandminimumband- widthregeneration(MBR)pointfordistributedstorage.Sincethen,Reed-Solomoncode basedregeneratingcodes(RS-MSRcodeandRS-MBRcode)weredeveloped.Theycanalso maintaintheMDS(maximumdistanceseparable)propertyincodereconstruction.However, inthehostilenetworkwherethestoragenodescanbecompromisedandthepacketscanbe tamperedwith,thestoragecapacityofthenetworkcanbetlyted.Inthis dissertation,weproposeaHermitiancodebasedminimumstorageregenerating(H-MSR) codeandaHermitiancodebasedminimumbandwidthregenerating(H-MBR)code.We provethattheycanachievethetheoreticalMSRboundandMBRboundrespectively.We thenproposedataregenerationandreconstructionalgorithmsfortheH-MSRcodeandthe H-MBRcodeinbotherror-freenetworksandhostilenetworks.Theoreticalevaluationshows thatourproposedschemescandetecttheerroneousdecodingsandcorrectmoreerrorsin thehostilenetworkthantheRS-MSR/RS-MBRcodewiththesamecoderaterespectively. InspiredbythenovelconstructionofHermitiancodebasedregeneratingcodes,anatural questionishowtoconstructoptimalregeneratingcodesbasedonthelayeredstructurelike Hermitiancodeindistributedstorage.ComparedtotheHermitianbasedcode,thesecodes havesimplerstructuresandareeasiertounderstandandimplement.Weproposetwooptimal constructionsofMSRcodesthroughrate-matchinginhostilenetworks:2-layerrate-matched MSRcodeand m -layerrate-matchedMSRcode.Forthe2-layercode,wecanachievethe optimalstorageforgivensystemrequirements.Ourcomprehensiveanalysisshows thatourcodecandetectandcorrectmaliciousnodeswithhigherstoragecompared totheRS-MSRcode.Thenweproposethe m -layercodebyextendingthe2-layercodeand achievetheoptimalerrorcorrectionbymatchingthecoderateofeachlayer'sMSR code.Wealsodemonstratethattheoptimizedparametercanachievethemaximumstorage capacityunderthesameconstraint.ComparedtotheRS-MSRcode,ourcodecanachieve muchhighererrorcorrection.Theoptimized m -layercodealsohasbettererror correctioncapabilitythantheH-MSRcode. Copyrightby JIANLI 2015 ThisdissertationisdedicatedtomywifeZhang,Ying. v ACKNOWLEDGMENTS DuringmyPh.D.study,Dr.JianRenhasbeenanexcellentadvisorandmentor.Iwould liketothankDr.JianRenforbringingmeintotheacademicarea.Henotonlyteaches mesolidknowledgeincybersecurityarea,butalsohelpsmeestablishthemethodologyof doingseriousandmeaningfulresearch.Withouthissupportthisdissertationwouldnothave happened. Ialsowouldliketoexpressgratitudetotheprofessorsinmycommittee:Dr.Subir Biswas,Dr.FathiSalem,andDr.RichardEnbodyfromdepartmentofComputerScience. Iwouldnothaveachievedcurrentgoalswithouttheirhelpfuladvice. IwanttothankDr.TongtongLiforthoseenlighteningdiscussions.Shehastaughtme somuchandgreatlyexpandedmyresearchvisions. AndIcannotimaginewhatitwouldbewithoutthegreatsupportfrommyfamily,my labmatesandmyfriends.Iloveyouall. vi TABLEOFCONTENTS LISTOFTABLES ................................... xi LISTOFFIGURES ................................... xii KEYTOABBREVIATIONS .............................. xv LISTOFALGORITHMS ................................ xvi CHAPTER1INTRODUCTION ........................... 1 1.1CombatingPollutionAttacksinNetworkCoding...............1 1.1.1BriefReviewofNetworkCoding.....................1 1.1.2SecurityProblemsofNetworkCoding..................2 1.1.3ExistingworkonCombatingPollutionAttacksinNetworkCoding..3 1.1.4SummaryoftheLimitationsofexistingworkonCombatingPollu- tionAttacksinNetworkCoding.....................4 1.2DistributedStorageinHostileNetworks....................5 1.2.1BriefReviewofCurrentAlgorithmsforDistributedStorage.....5 1.2.2ExistingWorkonDistributedStorage..................7 1.2.3ExistingWorkonDistributedStorageinHostileNetworks......8 1.2.4LimitationsofExistingWorkonDistributedStorageinHostileNetworks9 1.2.5withExistingWorkonSecureNetworkCommunication..9 1.3ProposedResearchDirections..........................9 1.3.1DirectionsforCombatingPollutionAttacksinNetworkCoding...9 1.3.2DirectionsforDistributedStorageinHostileNetworks........10 1.4Overviewofthedissertation...........................11 1.4.1MajorContributions...........................11 1.4.2Structure.................................13 CHAPTER2PRELIMINARY ............................ 15 2.1NetworkCoding..................................15 2.2ErrorControlCoding...............................16 2.2.1ErrorDetection..............................17 2.2.2ErrorCorrection.............................18 2.2.3SomePropertiesofErrorControlCodes................19 2.3RegeneratingCode................................20 2.4HermitianCode..................................21 CHAPTER3COMBATINGPOLLUTIONATTACKSFORFIXEDNETWORKS . 24 3.1CharacterizationofLinearNetworkCodingforPollutionAttacks......24 3.1.1ModelsandAssumptions.........................24 3.1.2AnIllustrativeExample.........................25 vii 3.1.3RelationshipbetweenNetworkCodingandErrorControlCodingin Point-to-PointCommunication.....................27 3.1.3.1The.........................27 3.1.3.2TheNecessity..........................31 3.1.3.3ApplicationinCombatingpollutionattacks.........33 3.1.4MulticastCase..............................34 3.1.4.1The.........................34 3.1.4.2TheNecessity..........................35 3.2ACascadedErrorControlCodingApproach..................35 3.2.1ModelsandAssumptions.........................35 3.2.2AnIllustrativeExample.........................36 3.2.3CharacterizationofNetworkCodingusingCascadedErrorControl CodinginPoint-to-PointCommunication................38 3.2.3.1The.........................38 3.2.3.2TheNecessity..........................41 3.2.3.3ApplicationinCombatingpollutionattacks.........42 3.2.4MulticastCase..............................43 CHAPTER4COMBATINGPOLLUTIONATTACKSFORRANDOMNETWORKS45 4.1System/AdversarialModelsandAssumptions.................45 4.2ProposedEDECScheme.............................46 4.2.1EDECScheme..............................47 4.2.1.1LimitationsofErrorControlCode..............47 4.2.1.2MoErrorControlCode.................47 4.2.1.3PerformanceofMoErrorControlCode........49 4.2.1.4AlgorithmsforEDECScheme.................50 4.2.2Simulationinns-2.............................53 4.2.2.1SimulationPlatform......................54 4.2.2.2NodesDesign..........................56 4.2.2.3SimulationResults.......................57 4.3LDPCDecodingandLEDECScheme......................59 4.3.1LDPCCode................................59 4.3.2DecodingofLDPCCode.........................60 4.3.3RelationshipBetweenLinearNetworkCodeandLDPCCode.....61 4.3.4LEDECSchemeUsingBPA.......................62 4.3.5TheoreticalAnalysis...........................62 4.3.6PerformanceAnalysisandSimulation..................64 4.3.6.1NodesDesign..........................64 4.3.6.2SimulationResults.......................65 CHAPTER5DISTRIBUTEDSTORAGEINHOSTILENETWORKS|HER- MITIANCODEBASEDREGENERATINGCODESAPPROACH .. 71 5.1System/AdversarialModelsandAssumptions.................71 5.2AnIllustrativeExample.............................72 viii 5.2.1RSCodeinDistributedStorage.....................72 5.2.2HermitianCodeinDistributedStorage.................75 5.2.3Inspirationfromthisexample......................77 5.3HermitianCodeBasedMSRRegeneratingCode(H-MSRCode).......77 5.3.1EncodingH-MSRCode..........................77 5.3.2RegenerationoftheH-MSRCodeintheError-freeNetwork.....83 5.3.3RegenerationoftheH-MSRCodeintheHostileNetwork.......85 5.3.3.1DetectionMode.........................86 5.3.3.2RecoveryMode.........................89 5.3.4ReconstructionoftheH-MSRCodeintheError-freeNetwork....89 5.3.5ReconstructionoftheH-MSRCodeintheHostileNetwork......91 5.3.5.1DetectionMode.........................91 5.3.5.2RecoveryMode.........................100 5.3.6RecoverMatrices S l;t ;T l;t from q 2 StorageNodes...........100 5.4HermitianCodeBasedMBRRegeneratingCode(H-MBRCode).......102 5.4.1EncodingH-MBRCode.........................102 5.4.2RegenerationoftheH-MBRCodeintheError-freeNetwork.....105 5.4.3RegenerationoftheH-MBRCodeintheHostileNetwork.......106 5.4.3.1DetectionMode.........................107 5.4.3.2RecoveryMode.........................108 5.4.4ReconstructionoftheH-MBRcodeintheError-freeNetwork....108 5.4.5ReconstructionoftheH-MBRcodeintheHostileNetwork......110 5.4.5.1DetectionMode.........................110 5.4.5.2RecoveryMode.........................112 5.4.6RecoverMatrices M l ;t from q 2 StorageNodes.............113 5.5PerformanceAnalysis...............................114 5.5.1ScalableErrorCorrection........................114 5.5.1.1Errorcorrectionfordataregeneration............114 5.5.1.2Errorcorrectionfordatareconstruction...........115 5.5.2ErrorCorrectionCapability.......................115 5.5.3ComplexityDiscussion..........................117 5.5.3.1H-MSRregeneration......................118 5.5.3.2H-MSRreconstruction.....................118 CHAPTER6DISTRIBUTEDSTORAGEINHOSTILENETWORKS|OPTI- MALCONSTRUCTIONOFREGENERATINGCODESTHROUGH RATE-MATCHINGAPPROACH ................... 119 6.1System/AdversarialModelsandAssumptions.................119 6.2ComponentCodesofRate-matchedMSRCode................120 6.2.1FullRateCode..............................120 6.2.1.1Encoding............................120 6.2.1.2Regeneration..........................121 6.2.1.3Reconstruction.........................121 6.2.2FractionalRateCode...........................123 ix 6.2.2.1Encoding............................123 6.2.2.2Regeneration..........................124 6.2.2.3Reconstruction.........................125 6.32-LayerRate-matchedMSRCode........................125 6.3.1RateMatching..............................126 6.3.2Encoding.................................126 6.3.3Regeneration...............................127 6.3.4ParametersOptimization.........................128 6.3.5Reconstruction..............................129 6.3.5.1OptimizedParameters.....................130 6.3.6PerformanceEvaluation.........................131 6.4 m -LayerRate-matchedMSRCode.......................132 6.4.1RateMatchingandParametersOptimization.............133 6.4.1.1Optimizationfor m =3....................134 6.4.1.2EvaluationoftheOptimizationfor m =3..........135 6.4.1.3GeneralOptimizationResult.................136 6.4.1.4EvaluationoftheOptimization................139 6.4.2PracticalConsiderationoftheOptimization..............141 6.4.2.1EvaluationoftheOptimalErrorCorrection...142 6.4.3Encoding.................................142 6.4.4Regeneration...............................143 6.4.5Reconstruction..............................144 6.4.5.1OptimizedParameters.....................145 CHAPTER7CONCLUSIONS ............................ 146 BIBLIOGRAPHY .................................... 148 x LISTOFTABLES Table2.1 q 3 rationalpointsoftheHermitiancurve..................23 Table4.1Fourcasesofdecodedcodewordsinmoerrorcontrolcode......49 xi LISTOFFIGURES Figure1.1Asimpleexampleofnetworkcoding....................2 Figure1.2Diagramofdistributedstorage........................5 Figure2.1Illustrationofregeneratingcode.......................20 Figure3.1Exampleforillustratingthemainidea...................26 Figure3.2Theprocessesoftransferringnetworkcodeintobipartitegraph.....26 Figure3.3ThecorrespondingbipartitegraphofFigure3.1..............26 Figure3.4Equivalenceofthreekindsofnodesinnetworkcoding...........27 Figure3.5Anexampleofpoint-to-pointnetworkcoding...............29 Figure3.6ThecorrespondingbipartitegraphofFigure3.5..............29 Figure3.7Implementthe(7 ; 4)Hammingcodeinnetworkcoding..........34 Figure3.8TransferthenetworkcodingschemeinFigure3.1intoa3-levelcascaded codingbyadding2virtualnodes.......................37 Figure3.9Thecorrespondingbipartitegraphsof3cascadedlevelsinFigure3.8..37 Figure3.10Transferincomingedgesofnodeshavingmultipleincomingedgesby addingvirtualnodes.............................38 Figure3.11Partitionanetworkcodeintoseverallevels.................39 Figure3.12ThecorrespondingcascadedbipartitegraphofFigure3.5.........40 Figure3.13Implementa2levelcascadederrorcontrolcodeinnetworkcoding....44 Figure3.14ThecorrespondingcascadedbipartitegraphofFigure3.13........44 Figure4.1Applyerrorcontrolcodesinlinearnetworkcoding.............46 Figure4.2Limitationsoferrorcontrolcodes......................47 Figure4.3TheencodingprocessofmoerrorcontrolcodeinEDECscheme..48 Figure4.4ThedecodingprocessofmoerrorcontrolcodeinEDECscheme..49 xii Figure4.5PerformanceofmoerrorcontrolcodeinEDECschemewhen k =650 Figure4.6PerformanceofmoerrorcontrolcodeinEDECschemewhen k =451 Figure4.7Simulationscenario..............................55 Figure4.89timeslotstoavoidpacketscollisions...................55 Figure4.9ThroughputcomparisonbetweenEDECschemeandtheerror-detection schemesbasedonthenumberofbitcorruptedineachsymbol|for smallnumberoferrors............................58 Figure4.10ThroughputcomparisonbetweenEDECschemeandtheerror-detection schemesbasedonthenumberofbitcorruptedineachsymbol|forlarge numberoferrors...............................59 Figure4.11AnillustrativeexampleofparitycheckmatrixandTannergraph....60 Figure4.12MainideaoftheLEDECscheme......................63 Figure4.13FlowchartoftheLEDECalgorithmimplementedinthesinknodes...66 Figure4.14Anexampleoftheparitycheckmatrixinnetworkcoding.........66 Figure4.15Performancecomparisonforsmallnumberofmaliciousnodes.......67 Figure4.16Performancecomparisonformediumnumberofmaliciousnodes.....68 Figure4.17Performancecomparisonforlargenumberofmaliciousnodes.......69 Figure4.18Performancecomparisonbasedonmediumnumberofmaliciousnodes (randomnumberoferrors)..........................70 Figure5.1Anexampleillustrationofmatrix S .....................78 Figure5.2Illustrationofstoringthecodewordmatricesindistributedstoragenodes81 Figure5.3Anexampleillustrationofmatrix M ....................103 Figure5.4ComparisonoferrorcorrectioncapabilitybetweentheH-MSRcodeand theRS-MSRcode...............................117 Figure6.1Thenumberoffractional/fullratecodeblocksfort P det .....131 Figure6.2ratiosbetweenthe2-layerrate-matchedMSRcodeandthe RS-MSRcodefort P det .......................132 xiii Figure6.3Comparisonoftheerrorcorrectioncapabilitybetween m -layerrate- matchedMSRcodefor m =3andRS-MSRcode.............136 Figure6.4Comparisonoferrorcorrectioncapabilitybetweenthe m -layerrate matchedMSRcodeandtheH-MSRcode..................139 Figure6.5Theoptimalerrorcorrectionofthe m -layerrate-matchedMSR codeundertmfor2 m 16...................140 Figure6.6Theoptimalerrorcorrectionfor2 m 16..........142 Figure6.7Latticeofreceivedhelpsymbolsforregeneration..............143 xiv KEYTOABBREVIATIONS AODVAdhocOn-DemandDistanceVector BECBinaryErasureChannel BPABeliefPropagationAlgorithm CRCCyclicRedundancyCheck DCDataCollector EDECErrorDetectionandErrorCorrection GFGaloisField H-MBRHermitianCodeBasedMinimumBandwidthRegeneration H-MSRHermitianCodeBasedMinimumStorageRegeneration LDPCLowDensityParityCheck LEDECLDPCBasedErrorDetectionandErrorCorrection MACMediaAccessControl MBRMinimumBandwidthRegeneration MSRMinimumStorageRegeneration MDSMaximumDistanceSeparable P2PPeertoPeer RSReed-Solomon RS-MBRReed-SolomonCodeBasedMinimumBandwidthRegeneration RS-MSRReed-SolomonCodeBasedMinimumStorageRegeneration xv LISTOFALGORITHMS Algorithm4.1EDECAlgorithmforSourceNodes................. 52 Algorithm4.2EDECAlgorithmforRelayNodes.................. 53 Algorithm4.3EDECAlgorithmforSinkNodes.................. 53 Algorithm4.4BPADecodingAlgorithmforBEC................. 61 Algorithm5.1EncodingH-MSRCode........................ 81 Algorithm5.2 z 0 RegeneratesSymbolsoftheFailedNode z ............ 85 Algorithm5.3(DetectionMode) z 0 RegeneratesSymbolsoftheFailedNode z . 87 Algorithm5.4(RecoveryMode) z 0 RegeneratesSymbolsoftheFailedNode z .. 90 Algorithm5.5DCReconstructstheOriginalFile................. 91 Algorithm5.6(Detectionmode)DCReconstructstheOriginalFile....... 92 Algorithm5.7(RecoveryMode)DCReconstructstheOriginalFile........ 100 Algorithm5.8EncodingH-MBRCode....................... 104 Algorithm5.9 z 0 RegeneratesSymbolsoftheFailedNode z ............ 106 Algorithm5.10(DetectionMode) z 0 RegeneratesSymbolsoftheFailedNode z . 107 Algorithm5.11(RecoveryMode) z 0 RegeneratesSymbolsoftheFailedNode z .. 109 Algorithm5.12DCReconstructstheOriginalFile................. 109 Algorithm5.13(DetectionMode)DCReconstructstheOriginalFile....... 111 Algorithm5.14(RecoveryMode)DCReconstructstheOriginalFile........ 113 Algorithm6.1 z 0 RegeneratesSymbolsoftheFailedNode z ............ 121 Algorithm6.2Regenerationforthe2-layerRate-matchedMSRCode...... 127 Algorithm6.3Reconstructionforthe2-layerRate-matchedMSRCode..... 130 Algorithm6.4Regenerationforthe m -layerRate-matchedMSRCode...... 144 Algorithm6.5Reconstructionforthe m -layerRate-matchedMSRCode..... 145 xvi CHAPTER1 INTRODUCTION Inthisdissertation,wehavedoneresearchesonensuringthenetworkcapacityintwohot researchareas:networkcodinganddistributedstorage. 1.1CombatingPollutionAttacksinNetworkCoding Networkcodingisanewcommunicationdiagramthatisdesignedtoimprovethethroughput androbustnessinnetworkenvironment.Thecorenotationofnetworkcodingisthatit allowstheparticipatingnodestoencodeincomingpacketsatintermediatenetworknodesin awaythatwhenasinkreceivesthepackets,itcanrecovertheoriginalmessage.Network codingprovidesatradebetweenmaximummulticastwrateindirectednetworksand computationalcomplexity.However,inthecontextofnetworkcoding,allparticipating nodesmustencodetheincomingpacketsaccordingtoacodingalgorithm.Ifapacket fromanintermediaterelaynodeiscorruptedorbeingtampered,theentirecommunications maybedisrupted.Onemainpurposeofthisdissertationistodevelopschemesthatcan combatnetworkpollutionandmaliciousattacksfromthenetworknodesbasedonerror controlcoding. 1.1.1BriefReviewofNetworkCoding Networkcodingwasintroducedintheseminalpaperby[1].[2]formulatedthemulticast probleminnetworkcodingasthewfromthesourcetoeachreceivingnode.They provedthatlinearcodingisttoachievetheoptimum.Thisworkmadethenetwork codingsimplerandmorepractical.[3]haveshownthatlinearcodesarettoachieve themulticastcapacitybycodingonalargeenough[4]haveshownthatusingofrandom 1 Figure1.1Asimpleexampleofnetworkcoding linearnetworkcodingisamorepracticalwaytodesignlinearcodes.[5]haveappliedthe principlesofrandomnetworkcodingtothecontextofpeer-to-peer(P2P)contentdistribu- tion,andhaveshownthatledownloadingtimescanbereduced.Sinceithasbeenproved thatlinearnetworkcodesarettoachievethemulticastcapacity,wewillfocusour discussiononlinearnetworkcodinginthisdissertation. ThemainideaofnetworkcodingcanbeillustratedthroughFigure1.1.Assumethe capacityofalltheedgesis C ,thecapacityofthisnetworkis2 C accordingtothew min-cuttheorem.Onlybyencodingtheincomingpacketsymbols x 1 ;x 2 atnode3,this networkcanachievethemaximumcapacity. 1.1.2SecurityProblemsofNetworkCoding Fornetworkcodingtoachievetheexpectedbs,alltheparticipatingnodesinthe networkshouldbefreeofnetworkpollutionandmaliciousattacks.Supposeunderthelinear networkcoding,thesinknodereceives m packetsymbols y 1 ;:::;y m .Itdecodestheoriginal messagesymbols x 1 ;:::;x l bysolvingasetoflinearequations: Bx = 2 6 6 6 6 6 6 6 4 11 12 ::: 1 l 21 22 ::: 2 l . . . . . . . . . . . . m 1 m 2 ::: ml 3 7 7 7 7 7 7 7 5 2 6 6 6 6 6 6 6 4 x 1 x 2 . . . x l 3 7 7 7 7 7 7 7 5 = 2 6 6 6 6 6 6 6 4 y 1 y 2 . . . y m 3 7 7 7 7 7 7 7 5 : (1.1) 2 Ifalltherelaynodesencodecorrectlyand m l ,wecandecodeallthemessagesymbols successfully.However,ifthereareadversariesinthenetworkthatcanmodifythecontents ofthemessagesandsendthemtothesucceedingrelaynodes,theequationsabovemaynot besolvedsuccessfully.Sothecommunicationfails.Inaddition,foralargescalenetwork,a smallerroroccurringatthebeginningofrelaymaytomostofthemessagesinthe end.Thiscancauseatwasteofnetworkresourcesandsometimescanevenruin thewholenetworkcommunication.Thiskindofattackiscalled pollutionattack . 1.1.3ExistingworkonCombatingPollutionAttacksinNetworkCoding Existingworkonpollutioneliminationcanlargelybedividedintoerror-detectionbased schemesanderror-correctionbasedschemes.Forerror-detectionbasedschemes,theerrors arenormallydetectedattheintermediateforwardnodes,whileforerror-correctionbased schemes,theerrorsaregenerallycorrectedatthesinknode.Whiletheerror-correction basedschemesseemtobemoreappealing,thecomplexityforencodinganddecodingis relativelyhigh.Italsocomeswitharelativelyhighercomputationaloverhead.In[6{8], classicnetworkerrorcorrectioncodeswerestudiedfollowingtheworkof[9].Theexistence andconstructionofMDSnetworkerrorcorrectioncodeswerealsostudiedin[10{19].In[20], theauthorsproposedtousenetworkerrorcorrectioncodetolocatethemaliciousattackers. Thedecodingofnetworkerrorcorrectioncodewasstudiedin[21].Anothertypeoferror- correctionbasedschemesuserankmetriccodestocorrecttheerrorsinthesinknodes:[22{27]. In[28],Jaggi etal .developedatwo-partrate-regionfortheircodesbasedonBECchannel codes.[29{31]studiedthetheoreticalnetworkcapacityunderpollutionattacks. Theerror-detectionbasedschemesareattractiveinsomenetworkscenarioswherethenet- worktopologyisunknown.In[32],Zhen etal .proposedaprobabilistickeypre-distribution andmessageauthenticationcodesbasedschemeagainstpollutionattacks.Theirscheme istintheXORnetworkcodingenvironments.Krohn etal .proposedtousehomo- morphichashfunctions[33]toguaranteethecorrectnessofnetworkw.Themainideais 3 thateachintermediatenodewillcheckthecorrectnessofthepackets.Ifapacketfailsatan intermediatenodevitwillbediscarded.Thisapproachcanreducethecommu- nicationoverheadandcanbeusedinrandomnetworkcoding.However,thecomputational complexityisstillveryhigh.Whenthenetworkscaleislarge,computingtoomanyhash valuesalsocreateshighdelay.Similarly,[34]usedthecryptographicideatocaptureand discardthecorruptedpackets.Othererror-detectionbasedschemeswerestudiedin[35{43]. Toaddressthecomputationallimitations,[44]developedasimpleerror-detectionbased Null Key scheme.Themainideaistopartitionthe n -dimensionallinearspaceover GF ( q n )into twoorthogonalsubspacesofdimension k (symbolsubspace)and n k (nullkeyspace). Comparingtothehomomorphichashfunction,theNullKeyschemeismuchmoreet andhasvirtuallynomessagedelay.Foralltheseschemesabove,allcorruptedpacketswill bediscarded.Inpacketizednetworks,alargemessageisdividedintosmallpackets.Ifa maliciousnodecancorruptonefragment(packet)inthewholemessage,accordingtothe approachesdescribedintheerror-detectionbasedschemes,thisfragmentwillbediscarded. Inthisway,thenettransmissioncanbeclosetozero. Therearealsootherapproachestoimprovetheerrorresilienceinnetworkcoding[45{ 53],includingdesigningsecureprotocols,combiningnetworkcodeswithothercodesand implementingsecurenetworkcodesinspscenarios. 1.1.4SummaryoftheLimitationsofexistingworkonCombatingPollution AttacksinNetworkCoding Error-correctionbasedschemesfornetworks Highcomplexityforencodinganddecoding. Highcomputationaloverhead. Error-detectionbasedschemesforrandomnetworks: Highcomputationaloverheadandcommunicationdelayforsomeschemes. 4 Figure1.2Diagramofdistributedstorage Lowtransmissionciencyifamaliciousnodecontinuestocorruptonemessagepacket. 1.2DistributedStorageinHostileNetworks Distributedstorageisanon-demandnetworkdatastorageandaccessparadigm.Thedis- tributeddatastoragearchitecturemodel(inFigure1.2)distributesthedatabasetomultiple serversinmanylocationsacrosstheparticipatingnetworkinthestoragecloud.Underthis model,protecteddataisdistributedonserversinmanylocationsacrosstheparticipating network.EachlocationisdirectlypluggedintotheInternet.Thesedistributedserversmay evenbeuntrustedandunreliable.Ifanattackismadeonthedatainonelocation,ortryto jamthecommunications,onlyasmallamountofbackedupdataisimpacted.Inaddition, sincethepotentialstorageisdispersedtomanylocations,accesstodatadoesnotcome underthesamebandwidthconstraintssinceeachlocationhasitsownpipetotheInternet. Thedecentralizationalsoshrinksthefootprintfordataattacks,sinceabreachofonedata centerdoesnotexposeallbackedupdatatotheattacker. 1.2.1BriefReviewofCurrentAlgorithmsforDistributedStorage Toensureaccessibilityofremotelystoreddataatanytime,atypicalsolutionistostore thedataacrossmultipleserversorclouds,ofteninareplicatedfashion.Datareplication notonlylacksyindatarecovery,butalsorequiressecuredatamanagementforthe storeddata. 5 Itiswellknownthatsecuritydatamanagementisgenerallyverycostlyandveryhardto defendagainstcompromisingattacks.Distributeddatastorageprovidesanelegant betweenthecostlysecuredatamanagementtaskandthecheapstoragemedia.Themain ideaisinsteadofstoringtheentiredatainoneserver,wecansplitthedatainto n data components.Theoriginaldatacanberecoveredonlywhentherequired(threshold)number ofcomponents,say k ,arecollected.Theoriginaldataisinformationtheoreticallysecure foranyonewhocanaccesseitheranindividualcomponentormultiplecomponentswhen thenumberofcomponentscombinedislessthanthethreshold k .Inthiscase,whenthe individualcomponentsarestoreddistributivelyacrossmultiplecloudstorageservers,each cloudstorageserveronlyneedstoassuredataintegrityanddataavailability.Thecostlydata encryptionandsecurekeymanagementmaynolongerbeneededanymore.Thedistributed cloudstoragecanalsoincreasedataavailabilitywhilereducingnetworkcongestionthat leadstoincreasedresiliency.Apopularapproachistoemployan( n;k )maximumdistance separable(MDS)codesuchasanReed-Solomon(RS)code[54,55].ForRScode,thedatais storedin n storagenodesinthenetwork.Thedatacollector(DC)canreconstructthedata byconnectingtoany k healthynodes. WhileRScodeworksperfectinreconstructingthedata,itlacksscalabilityinrepairing orregeneratingafailednode.Todealwiththisissue,theconceptofregeneratingcode wasintroducedin[56].Themainideaoftheregeneratingcodeistoallowareplacement nodetoconnecttosomeindividualnodesdirectlyandregenerateasubstituteofthefailed node,insteadofrecoveringtheoriginaldatathenregeneratingthefailedcomponent. Inthisway,therecoveryproblemofthedistributedstoragecanbeviewedasamulticasting problemwhichcanbesolvedusingnetworkcoding.Networkcodingbecomesthebaseofthe regeneratingcode. ComparedtotheRScode,regeneratingcodeachievesanoptimalbetweenband- widthandstoragewithintheminimumstorageregeneration(MSR)andtheminimumband- widthregeneration(MBR)points.RScodebasedMSR(RS-MSR)codeandMBR(RS- 6 MBR)code[57]havebeenexplicitlyconstructed.However,theexistingresearcheitherhas noerrordetectioncapability,orhastheerrorcorrectioncapabilitylimitedbytheRScode. Moreover,theschemeswitherrorcorrectioncapabilityareunabletodeterminewhetherthe errorcorrectionissuccessful. 1.2.2ExistingWorkonDistributedStorage Whenastoragenodeinthedistributedstoragenetworkthatemployingtheconventional ( n;k )RScode(suchasOceanStore[54]andTotalRecall[55])fails,thereplacementnode connectsto k nodesanddownloadsthewholetorecoverthesymbolsstoredinthefailed node.Thisapproachisawasteofbandwidthbecausethewholehastobedownloaded torecoverafractionofit.Toovercomethisdrawback,Dimakis etal .[56]introducedthe conceptionof f n;k;d;;;B g regeneratingcodebasedonthenetworkcoding.Inthecon- textofregeneratingcode,thecontentsstoredinafailednodecanberegeneratedbythe replacementnodethroughdownloading helpsymbolsfrom d helpernodes.Theband- widthconsumptionforthefailednoderegenerationcouldbefarlessthanthewholeA datacollector(DC)canreconstructtheoriginalestoredinthenetworkbydownloading symbolsfromeachofthe k storagenodes.In[56],theauthorsprovedthatthereisa betweenbandwidth andpernodestorage .Theytwooptimalpoints:minimum storageregeneration(MSR)andminimumbandwidthregeneration(MBR)points.Cur- rentlytherearemanyliteraturesfocusingontheoptimalregeneratingcodesdesign:[58{69]. In[70,71]theimplementationoftheregeneratingcodewerestudied. Theregeneratingcodecanbedividedintofunctionalregenerationandexactregeneration. Inthefunctionalregeneration,thereplacementnoderegeneratesanewcomponentthatcan functionallyreplacethefailedcomponentinsteadofbeingthesameastheoriginalstored component.[72]formulatedthedataregenerationasamulticastnetworkcodingproblemand constructedfunctionalregeneratingcodes.[73]implementedarandomlinearregenerating codesfordistributedstoragesystems.[74]provedthatbyallowingdataexchangeamongthe 7 replacementnodes,abetterbetweenrepairbandwidth andpernodestorage can beachieved.Intheexactregeneration,thereplacementnoderegeneratestheexactsymbols ofafailednode.[75]proposedtoreducetheregenerationbandwidththroughalgebraic alignment.[76]providedacodestructureforexactregenerationusinginterferencealignment technique.[57]presentedoptimalexactconstructionsofMBRcodesandMSRcodesunder product-matrixframework.Thisistheworkthatallowsindependentselectionofthe nodesnumber n inthenetwork. 1.2.3ExistingWorkonDistributedStorageinHostileNetworks NoneoftheseworksinSection1.2.2consideredcoderegenerationundernodecorruption oradversarialmanipulationattacksinhostilenetworks.Infact,alltheseschemeswillfail inbothregenerationandreconstructioniftherearenodesinthestoragecloudsendingout incorrectresponsestotheregenerationandreconstructionrequests. In[77],theByzantinefaulttoleranceofregeneratingcodeswerestudied.In[78],the authorsdiscussedtheamountofinformationthatcanbesafelystoredagainstpassiveeaves- droppingandactiveadversarialattacksbasedontheregenerationstructure.In[79],the authorsproposedtoaddCRCcodesintheregeneratingcodetochecktheintegrityofthe datainhostilenetworks.Unfortunately,theCRCcheckscanalsobemanipulatedbythe maliciousnodes,resultinginthefailureoftheregenerationandreconstruction.In[80],the authorsproposedtoadddataintegrityprotectionindistributedstorage.In[81],theauthors proposedanerasure-codeddistributedstoragebasedonthresholdcryptography.In[82],the authorsanalyzedthevcostforboththeclientreadandwriteoperationinwork- loadswithidleperiods.In[83],theauthorsanalyzedtheerrorresilienceoftheRScode basedregeneratingcodeinthenetworkwitherrorsanderasures.Theyprovidedthethe- oreticalerrorcorrectioncapability.TheirresultisanextensionoftheMDScodetothe regeneratingcodeandtheirschemeisunabletodeterminewhethertheerrorsinthenetwork aresuccessfullycorrected. 8 1.2.4LimitationsofExistingWorkonDistributedStorageinHostileNetworks ExistingWorkonDistributedStorageinHostileNetworkshasthefollowinglimitations: Theerrordetection/correctioncapabilityisIfthereareafewerrors,therewill beawasteofbandwidth.Iftherearetoomanyerrors,theerrorcorrectingprocesswill failwithoutbeingdetected. TheerrorcorrectioncapabilityislimitedbytheerrorcorrectioncapabilityoftheMDS codes. 1.2.5withExistingWorkonSecureNetworkCommunication Itisworthwhiletopointoutthatalthoughtherearestrongconnectionsbetweenregenerating codeindistributedstorageandgeneralnetworkcommunicationofwhichsecurityproblems havebeenwellstudied,ourproposedH-MSR/H-MBRcodesaretfromthesesecurity studiesofnetworkcommunicatione.g.[84{86]inbothprinciplesandscopes.First,unlike allthestudiesabove,theniceerrorcorrectioncapabilityoftheproposedH-MSR/H-MBR codesisduetotheunderlyingHermitiancode[87].Second,theregeneratingcodesstudiedin thisworkandthegeneralnetworkcommunicationarefundamentallytinthatbesides theoveralldatareconstructiontheregeneratingcodesalsoemphasizetherepairofcorrupted codecomponents(regeneration),whilegeneralnetworkcommunicationonlyfocusesondata reconstruction.Noneoftheresearchesofgeneralsecurenetworkcommunicationstudiesthe regenerationproblem.Thescopeofthisworkistfromthatofthoseresearches. 1.3ProposedResearchDirections 1.3.1DirectionsforCombatingPollutionAttacksinNetworkCoding Fornetworks,insteadofdesigningthenetworkcodeswitherrorcorrectioncapability, wefocusonthecharacterizingofthenetworkcodingsothatnetworkcodingcanbeviewed 9 fromtheperspectiveoferrorcontrolcoding.Inparticular,wewillanalyzetherelationship betweenthenetworkcodingandtheerrorcontrolcodinginbothunicastandmulticastcases. Wethealgebraicaspectsforthesetwocasesareessentiallyidentical. Afterwehaveproventhateachnetworkcodingcanbetransferredintoanerrorcontrol codeinabipartitegraphbyignoringthestructureoftheunderlyingerrorcontrolcoding, wethentransferanetworkcodingschemeintoaseriesofcascadederrorcontrolcodes byexploringtheinnerstructureofnetworkcoding.Thismappingenablesustoidentify theminimumnumberofindependenterrorpatterninthecorrespondingnetworkleveland identifythemaliciousnetworknodes. Forrandomnetworks,weproposeanewschemethatcombineserror-detectionanderror- correcting(EDEC)tocombatnetworkpollutionattacks.Originalmessagesymbolsare encodedusingan( n;k )codethensentoutinpackets.Whenanintermediatenodedetects anerror,insteadofdiscardingthepacket,theintermediatenodewillcontinuetoforwardit. Aslongastheerrorsarewithinthedecodingcapability,thesinknodeswillbeabletorecover thecorruptedpacket.InourLDPCbasedEDEC(LEDEC)scheme,wetreatthepacketsas LDPCcodesatsinknodesandusethebeliefpropagationalgorithm(BPA)todecodethe LDPCcode.Inthiscase,theLEDECschemecanmaintainthethroughputunchangedfor moderatenetworkpollution.Itcanalsoguaranteeahighnetworkthroughputevenfora heavilypollutednetworkenvironment,whilethethroughputbecomesverylowforallerror- detectionbasedschemesinthiscase.Intheanalyses,wemainlyfocusonthethroughput impactbroughtbytstrategies(discardvs.keep)towardscorruptedpackets. 1.3.2DirectionsforDistributedStorageinHostileNetworks Fordistributedstorage,weproposeHermitiancodebasedregeneratingcodes:H-MSRcode andH-MBRcode.WeconstructtheH-MSRcodebycombiningtheHermitiancodeand regeneratingcodeattheMSRpoint,thenweconstructtheH-MBRcodebycombiningthe HermitiancodeandregeneratingcodeattheMBRpoint.Thenweprovethatthesecodes 10 canachievethetheoreticalMSRboundandMBRboundrespectively.Wealsoproposedata regenerationandreconstructionalgorithmsfortheH-MSRcodeandtheH-MBRcodein botherror-freenetworksandhostilenetworks. Moreover,inspiredbytheniceperformanceofHermitiancodebasedregeneratingcodes, westepforwardtofurtherconstructoptimalregeneratingcodeswhichhavesimilarlayered structurelikeHermitiancodeindistributedstorage.Weproposeasimpleoptimal constructionof2-layerrate-matchedMSRcode.Weconductboththeoreticalanalysisand performanceevaluationtoshowthatthiscodecanachievetheoptimalstorage. Thenweproposeanoptimalconstructionof m -layerrate-matchedMSRcode.The m -layer codecanachievetheoptimalerrorcorrection. 1.4Overviewofthedissertation 1.4.1MajorContributions Themajorcontributionsofthisdissertationareasfollows: Combatingpollutionattacksinnetworkcoding. 1. Weprovideacomprehensiveanalysisandtheoreticalresultsontherelationships betweenthenetworkcodingandtheerrorcontrolcoding. 2. Weproposeamethodologytodesigntnetworkcodingschemesthatcan combatnetworkerrorsandnetworkpollution. 3. Wedevelopamethodologytomapeachnetworkcodingintoaseriesofcascaded errorcontrolcodes. 4. Weprovideanovelapproachtodesigntnetworkcodingschemesthatcan combatpollutionattacksandlocatethemaliciousnodesbyutilizingtheinner structureofthenetworkcode. 11 Combatingpollutionattacksinrandomnetworkcoding. 1. OurproposedEDECschemeandtheLEDECschemecanmaintainthethroughput unimpactedfornetworkenvironmentwithmoderatemaliciousattackswithonly aslightincreaseincomputationaloverhead. 2. Theproposedschemecanguaranteeahighthroughputevenforaheavilypolluted networkenvironment. 3. WeprovidecomprehensivethroughputanalysisoftheproposedEDECscheme andtheLEDECscheme. 4. Weconductextensivesimulationsusingns-2toevaluatetheperformanceof theproposedschemesandcompareourschemeswiththeerror-detectionbased schemes. Distributedstorageinhostilenetworks|Hermitiancodebasedregeneratingcodes 1. Theoreticalevaluationshowsthatourproposedschemescandetecttheerroneous decodingswhileotherexistingworkcannot. 2. Ourproposedschemescancorrectmoreerrorsinthehostilenetworkthanthe RS-MSR/RS-MBRcodeswiththesamecoderates. 3. OuranalysisalsodemonstratesthattheproposedH-MSR/H-MBRcodeshave lowercomplexitythantheRS-MSR/RS-MBRcodesinbothcodesregeneration andcodesreconstruction. Distributedstorageinhostilenetworks|Optimalconstructionofregeneratingcodes throughrate-matching 1. Ourproposedoptimalconstructionof2-layerrate-matchedMSRcodecanachieve theoptimalstorage,whichishigherthantheRS-MSRcodeproposed in[83]. 12 2. Ourproposedoptimalconstructionof m -layerrate-matchedMSRcodecanachieve theoptimalerrorcorrection,whichishigherthanthecodeproposed in[83]andtheH-MSRcodeproposedin[88].Furthermore,the m -layeredcode iseasiertounderstandandhasmoreythantheH-MSRcode. 1.4.2Structure Thedissertationisstructuredasfollows. Chapter2 introducesthepreliminaryforthisdissertation.Somebasicconceptsand propertiesofnetworkcoding,errorcontrolcoding,regeneratingcodeandhermitiancodeare presented. Chapter3 ismainlyaboutcombatingpollutionattacksfornetworks.The sectionstudiestherelationshipbetweennetworkcodinganderrorcontrolcoding.Thesecond sectioncharacterizesnetworkcodingusingcascadederrorcontrolcoding. Chapter4 ismainlyaboutcombatingpollutionattacksforrandomnetworks.The system/adversarialmodelsandassumptionsarepresentedinthesection.Theproposed EDECschemeandperformanceanalysisaredescribedinthesecondsection.Thethird sectionpresentstheLDPCdecodingandanalysisoftheLEDECscheme. Chapter5 ismainlyabouttheHermitiancodebasedregeneratingcodesindistributed storageinhostilenetworks.Afterthesystem/adversarialmodelsandassumptionsarepre- sented,ourproposedH-MSRcodeisdescribedandanalyzedinthesecondsection.The proposedH-MBRcodeisdescribedandanalyzedinthethirdsection.Performanceanalysis isconductedinthefourthsection. Chapter6 ismainlyabouttheoptimalconstructionofregeneratingcodethroughrate- matching.Thesectionpresentsthesystem/adversarialmodelsandassumptions.The secondsectionproposestwocomponentcodesfortherate-matchedMSRcodes.Thethird sectionproposesandanalyzesthe2-layerrate-matchedMSRcode.Thenthefourthsection proposesandanalyzesthem-layerrate-matchedMSRcode. 13 Chapter7 summarizesthedissertation. 14 CHAPTER2 PRELIMINARY Inthischapter,wewillpresentsomebasicconceptsandpropertiesofnetworkcoding,error controlcoding,regeneratingcodeandhermitiancode. 2.1NetworkCoding Inthisdissertation,weadoptthenotationsof[3].Anetworkisequivalenttoadirected graph G =( V;E ),where V representsthesetofverticescorrespondingtothenetworknodes (sourcenodes,relaynodesandsinknodes)and E representsallthedirectededgesbetween verticescorrespondingtothecommunicationlink.Thestartvertex v ofanedge e iscalled thetailof e andwrittenas v = tail ( e ),whiletheendvertex u ofanedge e iscalledthe headofof e andwrittenas u = head ( e ).Wethecapacityofanedgeasthenumber ofbitsthatcanbetransmittedthroughtheedgeinonetimeunit.Sothecapacityshould benon-negativeintegers.Inthisdissertation,wenormalizethecapacityofoneedgeto1. Ifachannelbetweentwonodeshascapacity C largerthan1,wemodelthischannelas C multipleedgeseachwithcapacity1.Weassumethenetworkisdelay-free[3],thatisallthe edgesinthegraphhavezerodelay.Andthenetworkisacyclic,thatisalltheverticesinthe graphcanbeorganizedinanancestralordering. Forasourcenode u ,thereisasetofdiscreterandomprocessestobesent.Eachofthe randomprocesscanberepresentedbyabinaryvectoroflength m ,thatiseverysymbol sequenceoftherandomprocessisfromthe F 2 m .Wewritethesetofthepro- cessesas X ( u )= f X ( u; 1) ;X ( u; 2) ; ;X ( u; ( u )) g ,inwhich ( u )isthenumberofrandom processesinnode u .Sincewenormalizethecapacityofeachedgeto1,itisreasonableto normalizetherateoftherandomprocess X ( u;i )(1 i ( u ))to1. 15 Wecanwritealink e between r 1 and r 2 as e =( r 1 ;r 2 ).Therandomprocess Y ( e )onthe link e isthefunctionofallthe Y ( e 0 )fromlinks e 0 (suchthat head ( e 0 )= r 1 )andtherandom processes X ( r 1 )fromnode r 1 .In F 2 m linearnetworkcoding[2], Y ( e )canbewrittenas: Y ( e )= ( r 1 ) X l =1 l;e X ( v;l )+ X e 0 : head ( e 0 )= r 1 e 0 ;e Y ( e 0 ) ; (2.1) inwhichtheencodingcots l;e ; e 0 ;e 2F 2 m . Forasinknode v ,thereisalsoasetofdiscreterandomprocessestobeobserved.We writethesetoftheprocessesas Z ( v )= f Z ( v; 1) ;Z ( v; 2) ; ;Z ( v; ( v )) g ,inwhich ( v )is thenumberofrandomprocessesobservedinnode v .Inlinearnetworkcoding, Z ( v;j )can bewrittenas: Z ( v;j )= X e 0 : head ( e 0 )= v e 0 ;j Y ( e 0 ) ; (2.2) inwhichtheencodingcots e 0 ;j 2F 2 m . Aconnectionbetweenasourcenode u andasinknode v canbewrittenas C = ( u;v; X ( u )).Fromtheassumptionsanddeductionsabove,therateofthisconnection R ( C ) isequalto jX ( u ) j ,where j x j isthecardinalityoftheset x .Aslongaswecanretrieve X ( u ) from Z ( v ),wesaythatthisconnectionispossible.Becauseweapplythelinearencoding inthenetwork,wecanthesystemtransfermatrix M betweeninput x andoutput z . Ifwewrite x =( X ( u; 1) ;X ( u; 2) ; ;X ( u; ( u )))and z =( Z ( v; 1) ;Z ( v; 2) ; ;Z ( v; ( v ))), wehave z = x M . Inthisdissertation,sinceweareonlyconcernedaboutthisrelationshipineachsingle encodingperiod,wesimplywritetherandomprocesses X ( u;i ) ;Y ( e ) ;Z ( v;j )asrandom numbers x u;i ;y e ;z v;j . 2.2ErrorControlCoding Inthissection,wepresentthepreliminaryforerrordetection,whichisthebaseoftheNull KeyschemeandtheproposedEDECscheme.Errorcorrection,whichisalsothebaseofthe 16 proposedEDECscheme,isalsopresented. 2.2.1ErrorDetection Supposetheoriginalmessagesymbolsareinthe k -dimensionallinearspaceover GF (2 k ). Afterweencodethesymbolsusingageneratingmatrix G k n froman( n;k )blockcode,the encodedcodewordswillformalinearsubspaceover GF (2 n )ofdimension k .Sotherewill beanother n k dimensionalsubspaceoverthe n dimensionalspace,whichisorthogonal tothecodewordssubspace.Ifwedenoteavalidcodewordby c andthebasesforthe n k dimensionalsubspaceby h 1 ;:::; h n k ,wehave < c ; h i > =0 ; 1 i n k ,where < ; > representstheinnerproduct. Let H ( n k ) n =[ h 1 ;:::; h n k ] T ,then H formstheparity-checkmatrixofthecodewords andwehave c H T = 0 : (2.3) Suppose r = c + e isareceivedcodeword,where e isan n -tupleerrorgeneratedbya maliciousnode.Forthereceivedword r ,accordingtoequation(2.3), c isorthogonalto H , therefore,wehave: r H T =( c + e ) H T = c H T + e H T = e H T : (2.4) Forareceivedword r ,therearetwopossibilities:(i) e isacodewordgeneratedby G but tfromtheoriginalcodeword.Inthiscase,though r containserror,however,because r H T =0,theerrorisundetectableusingconventionalerrorcontrolcodingtechniques;(ii) e containsanonzeroprojectiontotheorthogonalparitychecksubspace,then r H T 6 =0. Inthiscase,wecandetectthatthereceivedwordcontainserror. Innetworkcoding,suppose c 1 ;:::; c i arevalidcodewords, c = P j c j b j isalinear combinationofthecodewords c 1 ;:::; c i ,where b j isthenetworkencodingcotsand 17 hasthevalue0or1.Itcanbeeasilyvthat c H T = X j c j H T b j = 0 : (2.5) Equation(2.5)isthetheoreticalfoundationforerrorcontrolcodingtobeusedinnetwork coding.Therowsof H arecalled NullKeys in[44].Bycheckingpacketsymbolsatevery node,thereisahighprobabilitythattheNullKeyschemecandetectthepollutedpackets afterafewhopsoftransmission.However,the`check-and-dump'strategymayresultina verylowcommunicationundercontinuousnetworkpollutionandpacketcorruption attacks. 2.2.2ErrorCorrection Equation(2.4)iscalledthe syndrome oferrorpattern e ,denotedas s .Itisclearthat r is acodewordifandonlyif s =0.Thetaskofmaximumlikelihooddecodingistothe minimumweighterrorpattern e suchthat r H T = e H T .Inthiscase,thereceived r is correctedto r + e = c . Inlinearnetworkcoding,althoughthepacketsymbolsarenottheoriginalonessentfrom sourcenodes,wecanstillperformerrorcorrectionusingequation(2.4). Suppose r 1 ;:::; r i arethereceivedcodewordsfrom i incomingedges, e istheerrorvector addedtothenetworkcoding r = P j r j b j .Iftheerroriswithinthecorrectioncapabilityof the( n;k )code,thesyndromewillstillbe r H T = e H T + P j r j H T b j = e H T + 0 = e H T . Thenwecancorrecttheerrorusingsyndromedecoding. IntheproposedEDECscheme,thecorruptedpacketsdetectedattheintermediatenodes willnotbedumped.Boththeintactandcorruptedpacketswillbegatheredbythesink nodes.Thesinknodescancorrectthecorruptedpacketsandhaveahighercommunication thantheerror-detectionbasedschemes. 18 2.2.3SomePropertiesofErrorControlCodes Theerrordetectionandcorrectioncapabilitiesaredecidedbythe( n;k )codestructure.We canadoptappropriateerrorcontrolcodesaccordingtothepollutionlevelsofthenetwork. Inthissection,somepropertiesoferrorcontrolcodeswillbepresented. Theorem2.1. (Singletonbound[89]) Foran ( n;k ) blockcodewiththeminimumdistance d ,thefollowingrelationshipholds: k + d n +1 . Theminimumdistance d isastheminimumhammingdistanceforanytwodistinct codewords x and y of C : d min = min f d ( x ; y ) j8 x ; y 2 C g ,where d ( x ; y )isthenumberof positionsatwhichthecorrespondingbitsaretbetween x and y .Foran( n;k )block codewithminimumdistance d ,ifwedeletethe d 1bitsofeverycodewords,allthe codewordsarestilldistinct.Sothereareatmost2 n ( d 1) codewords.Thetotalnumberof originalmessagesymbolsisatmost2 k anditcannotbebiggerthanthenumberofpossible codewords:2 k 2 n ( d 1) . Theorem2.2. ([89]) Foran ( n;k ) blockcodewiththeminimumdistance d ,itcandetect allthe d 1 orlesserrors,oritcancorrectallthe j d 1 2 k orlesserrors. Accordingtotheofminimumhammingdistance,allcodewordswithinthe distance d 1orshorterofavalidcodewordareinvalid.Soallthe d 1orlesserrors canbedetected.Suppose x and y aretwovalidcodewordswiththeminimumhamming distanceand z isacorruptedversionofcodeword x with t errors.Thatis d ( x ; z )= t .Ifwe wanttocorrect z to x ,wemusthave d ( y ; z ) >t .Byusingthetriangleinequity,wehave d ( y ; z ) d ( x ; y ) d ( x ; z )= d t .Wecanensure d ( y ; z ) >t bymaking d t>t .Thatis 2 t ) orfewererrorsifitsminimumdistance d min + ˝ +1 . 19 Figure2.1Illustrationofregeneratingcode Proof. Since ˝> ,wehave d min + ˝ +1 > 2 +1, < j d 1 2 k .Sowecancorrect or fewererrors.Suppose x and y aretwovalidcodewordswiththeminimumhammingdistance d min and z isacorruptedversionofcodeword x with ˝ errors.Inordertoavoidthewrong correction,wemustmakesurethat d ( y ; z ) > .Accordingtothetriangleinequity,wehave d ( y ; z ) d ( x ; y ) d ( x ; z )= d min ˝ .Wecanensure d ( y ; z ) > bymaking d min ˝> . Thatis d min ˝ + +1. 2.3RegeneratingCode Regeneratingcodeintroducedin[56]isalinearcodeover GF q withasetofparameters f n;k;d;;;B g .Aofsize B isstoredin n storagenodes,eachofwhichstores symbols.Areplacementnodecanregeneratethecontentsofafailednodebydownloading symbolsfromeachof d randomlyselectedstoragenodes.Sothetotalbandwidthneededto regenerateafailednodeis = .Thedatacollector(DC)canreconstructthewholeby downloading symbolsfromeachof k d randomlyselectedstoragenodes.Anillustration ofregeneratingcodeisshowninFigure2.1. 20 In[56],thefollowingtheoreticalboundwasderived: B k 1 X i =0 min f ; ( d i ) g : (2.6) Fromequation(2.6),abetweentheregenerationbandwidth andthestoragere- quirement wasderived. and cannotbedecreasedatthesametime.Therearetwo specialcases:minimumstorageregeneration(MSR)pointinwhichthestorageparameter isminimized; ( MSR ; MSR )= B k ; Bd k ( d k +1) ; (2.7) andminimumbandwidthregeneration(MBR)pointinwhichthebandwidth isminimized: ( MBR ; MBR )= 2 Bd 2 kd k 2 + k ; 2 Bd 2 kd k 2 + k : (2.8) 2.4HermitianCode AHermitiancurve H ( q )over GF ( q 2 )incoordinatesisby: H ( q ): y q + y = x q +1 : (2.9) Thegenusof H ( q )is % =( q 2 q ) = 2andthereare q 3 pointsthatsatisfyequation(2.9),denoted as P 0 ; 0 ; ;P 0 ;q 1 ; ;P q 2 1 ; 0 ; ;P q 2 1 ;q 1 (SeeTable2.1),where 0 ; 1 ; ; q 1 are the q solutionsto y q + y =0and ˚ isaprimitiveelementin GF ( q 2 ). L ( mQ )isas: L ( mQ )= f f 0 ( x )+ yf 1 ( x )+ + y q 1 f q 1 ( x ) j deg f j ( x ) < ( j ) ;j =0 ; 1 ; ;q 1 g ; (2.10) where ( j )=max f t j tq + j ( q +1) m g +1 ; (2.11) for m q 2 1.AcodewordoftheHermitiancode[87] H m isas( % ( P 0 ; 0 ) ; ;% ( P 0 ;q 1 ) ; ;% ( P q 2 1 ; 0 ) ; ;% ( P q 2 1 ;q 1 )),where % 2 L ( mQ ).Thedimensionofthemessagebe- foreencodingcanbecalculatedasdim( H m )= P j = q 1 j =0 (deg f j ( x )+1).Asoft-decisionlist 21 decodingalgorithmforHermitiancodeswasproposedin[90].In[87],anovelapproach fordecodingHermitiancodeswithbursterrorswasproposed.Somegoodpropertiesof Hermitiancodeswerestudiedin[91]. 22 Table2.1 q ^3rationalpointsoftheHermitiancurve P 0 ; 0 =(0 ; 0 ) P 1 ; 0 =(1 ;˚ + 0 ) P q 2 1 ; 0 =( ˚ q 2 2 ;˚ ( q 2 2)( q +1)+1 + 0 ) P 0 ; 1 =(0 ; 1 ) P 1 ; 1 =(1 ;˚ + 1 ) P q 2 1 ; 1 =( ˚ q 2 2 ;˚ ( q 2 2)( q +1)+1 + 1 ) . . . . . . . . . . . . P 0 ;q 1 =(0 ; q 1 ) P 1 ;q 1 =(1 ;˚ + q 1 ) P q 2 1 ;q 1 =( ˚ q 2 2 ;˚ ( q 2 2)( q +1)+1 + q 1 ) 23 CHAPTER3 COMBATINGPOLLUTIONATTACKSFORFIXEDNETWORKS Inthischapter,Wethatnetworkcodinganderrorcontrolcodingareessentiallyidentical inalgebraicaspects.Wewillprovideanovelmethodologytocharacterizelinearnetwork codingthrougherrorcontrolcodingfornetworkcoding.Ourmainideaistorepresent eachlinearnetworkcodingwithanerrorcontrolcoding.Wewillprovidecomprehensive theoreticalanalysisontherelationshipsbetweenlinearnetworkcodinganderrorcontrol codinginbothunicastandmulticastscenarios. Meanwhile,ourresearchprovidesanewapproachtounderstandnetworkcodingschemes andanovelmethodologytodevelopnetworkcodingschemesthatcancombatnodecompro- misingattacksandlocatethemaliciousnodes.Wewillcharacterizealinearnetworkcoding throughaseriesofcascadedlinearerrorcontrolcodes.Thisrepresentationenablesusto determinetheindependentsourceoferrorsinthecascadednetworklevel.Itcouldleadto asuccessfuldecodingoftheoriginalmessageandcouldhelplocatingthemaliciousnetwork nodes.Wewillprovidecomprehensivetheoreticalanalysisonnetworkcodinginbothunicast andmulticastscenarios. 3.1CharacterizationofLinearNetworkCodingforPollutionAt- tacks 3.1.1ModelsandAssumptions Inthissection,ourmainideaistocharacterizeandclassifynetworkcodingaccordingtothe underlyingerrorcontrolcoding.Weonlyneedtolimitourconsiderationtolinearnetwork codesin F 2 ,whichmakesthecorrespondingerrorcontrolcodessimplebinaryblockcodes. Inthiscasetheencodingcotscanonlybe0or1andtheadditionoperationequals 24 exclusiveor. 3.1.2AnIllustrativeExample Inthissection,wewillillustrateourmainideausingtheclassicexample[1]showninFig- ure3.1.Inthisexample,sourcenode1multicaststwosymbols x 1 ; 1 ;x 1 ; 2 tosinknodes6and 7.Byencodingatnode4,bothnodes6and7canretrievethetwosymbolssuccessfully.To explainourmainidea,wewillonlyfocusonthecommunicationbetweennode1andnode 6(theshadedareainFigure3.1).Theanalysisissimilartothecommunicationbetween node1andnode7.Inthiscommunication,symbol x 1 ; 1 ispasseddirectlythroughthepath e 1 e 5 .Sowecanmergetheedges e 1 and e 5 togetherinFigure3.2(a):node1send x 1 ; 1 directlytonode6intheequivalentbipartitegraph.Meanwhile,inFigure3.2(b), x 1 ; 1 and x 1 ; 2 arepassedseparatelytonode4through e 1 e 3 and e 2 e 4 ,then x 1 ; 1 + x 1 ; 2 ispassed through e 6 e 8 afterbeingencodedatnode4.Sowecanmergetheedges e 1 ;e 3 together, e 2 ;e 4 togetherand e 6 ;e 8 togetherinthestepofFigure3.2(b):node1sends x 1 ; 1 ;x 1 ; 2 directlytonode4andnode4sends x 1 ; 1 + x 1 ; 2 directlytonode6.Inthesecondstep,wecan ignorenode4andputtheoperation x 1 ; 1 + x 1 ; 2 innode6:node1sends x 1 ; 1 ;x 1 ; 2 directly tonode6andnode6addsthetwosymbolstogetherintheequivalentbipartitegraph. UsingtheprocessesshowninFigure3.2,wetransferthisnetworkcodingproblemintoa bipartitegraphshowninFigure3.3.Inthiswaywecangettheexplicitrelationshipbetween symbolsofnode1andsymbolsofnode6.Ifweview x 1 ; 1 ;x 1 ; 2 asoriginalmessageand z 1 ; 1 ;z 1 ; 2 asthecodewordinanerrorcontrolcode,wecanviewthisnetworkcodeasa(2 ; 2) errorcontrolcodewiththegeneratormatrix 2 6 4 11 01 3 7 5 : Althoughinthisexample,thereisnoredundancyinthe(2 ; 2)errorcontrolcodeand thiscodecannotdetectorcorrecterrors,itisttoshowthatnetworkcodecanbe characterizedusingerrorcontrolcode. Intheexamplesbelow,wewillshowthatnetworkcodeswithredundanciescanbetrans- 25 Figure3.1Exampleforillustratingthemainidea Figure3.2Theprocessesoftransferringnetworkcodeintobipartitegraph Figure3.3ThecorrespondingbipartitegraphofFigure3.1 26 Figure3.4Equivalenceofthreekindsofnodesinnetworkcoding ferredintoerrorcontrolcodes.Meanwhile,theredundanciesofnetworkcodescanbeadded accordingtotheerrorcontrolcodes.Andwecancharacterizenetworkcodingusingerror controlcoding. 3.1.3RelationshipbetweenNetworkCodingandErrorControlCodinginPoint- to-PointCommunication Inthispart,wewillformallystatetherelationshipbetweennetworkcodinganderrorcon- trolcodinginthepoint-to-pointcommunication.Theciencyisstudiedthenthe necessity. 3.1.3.1The Theorem3.1. Everynetworkcodeschemecanberepresentedbyanerrorcontrolcode. Proof. Allthenodesinnetworkcodingcanbecategorizedintothreetypes:simpleforward, multicastandcodethenforward(showninFigure3.4).Sowecantransferthegraph representingthenetworkcodingusingthreeoperationsaccordingly: 1. Simpleforward :Thesenodesdonotencodetheincomingsymbols.Theysimply 27 forwardwhatevermessagetheyhavereceived.Inthissituation,wecanreplacethe nodeswithdirectlinks. 2. Multicast :Likesimpleforward,thesenodesdonotencodetheincomingsymbols either.Theysimplymulticastthemessagethattheyhavereceived.Inthissituation, wecanreplacethenodeswithmultipledirectlinks. 3. Codethenforward :Thesenodesproducethelinearcombinationoftheincoming symbols x 1 ;x 2 ; ;x k .Accordingtotheencodingcots, m outof k received symbolswillbeaddedtogethertoformthenewsymbol x l 1 + x l 2 + + x l m tobe forwarded.Wecanviewthiskindofnodeas m parallelnodes v l 1 ;v l 2 ; ;v l m ,eachof whichhasonlyoneinput.Thesymbols x l 1 ;x l 2 ; ;x l m willbedirectlyforwardedto thesinknode u .Andthesinknodewillcompletetheadditionoperation.Thereforewe cantransfercodethenforwardnodesbysplittingmultipleinputsintomultiplesimple forwardnodes.Thenwecanfurthersimplifythemultiplesimpleforwardnodesasin '1)'. Becausethenetworkcodingislinear,thethreeoperationsarecommutativeandhavethe superpositionproperty.Wecanalwaysperformtheoperationstoalloftheintermediate nodesinthenetworkandreplacethenodeswithsimplelinks.Atlastwecangetabipartite graphconsistingofonlysymbolsinthesourcenodeandencodedsymbolsinthesinknode, whichcanberepresentedusinganerrorcontrolcodecorrespondingtothebipartitegraph. TakethenetworkcodeinFigure3.5asanexample.Thesourcenode1transmitsthree symbols x 1 ; 1 ;x 1 ; 2 ;x 1 ; 3 tosinknode4inthisnetworkcode.Andsinknode4canreceive 6encodedsymbols,whichindicatesthatthereareredundanciesinthisnetworkcoding. FollowingtheoperationsmentionedintheproofofTheorem3.1,wecangetthecorresponding bipartitegraphshowninFigure3.6,whichindicatesthisisa(6 ; 3)errorcontrolcode.The 28 Figure3.5Anexampleofpoint-to-pointnetworkcoding Figure3.6ThecorrespondingbipartitegraphofFigure3.5 generatormatrixis: G = 2 6 6 6 6 4 100110 010101 001011 3 7 7 7 7 5 : (3.1) Thiscodehasaminimumhammingdistance3.Soitcandetectandcorrect1biterror. Thecanalsobevalidatedbythesystemtransfermatrix M .Toshowthis, wewillstillusethenetworktopologyshowninFigure3.5,butwithtencoding 29 cots.Thesymbolsoneachedgecanbewrittenas: y e 1 = 1 ;e 1 x 1 ; 1 + 2 ;e 1 x 1 ; 2 + 3 ;e 1 x 1 ; 3 y e 2 = 1 ;e 2 x 1 ; 1 + 2 ;e 2 x 1 ; 2 + 3 ;e 2 x 1 ; 3 y e 3 = 1 ;e 3 x 1 ; 1 + 2 ;e 3 x 1 ; 2 + 3 ;e 3 x 1 ; 3 y e 4 = 1 ;e 4 x 1 ; 1 + 2 ;e 4 x 1 ; 2 + 3 ;e 4 x 1 ; 3 y e 5 = e 3 ;e 5 y e 3 + e 4 ;e 5 y e 4 y e 6 = e 1 ;e 6 y e 1 + e 2 ;e 6 y e 2 + e 5 ;e 6 y e 5 (3.2) y e 7 = e 1 ;e 7 y e 1 + e 2 ;e 7 y e 2 + e 5 ;e 7 y e 5 y e 8 = e 1 ;e 8 y e 1 + e 2 ;e 8 y e 2 + e 5 ;e 8 y e 5 y e 9 = e 1 ;e 9 y e 1 + e 2 ;e 9 y e 2 + e 5 ;e 9 y e 5 y e 10 = e 3 ;e 10 y e 3 + e 4 ;e 10 y e 4 y e 11 = e 3 ;e 11 y e 3 + e 4 ;e 11 y e 4 : Thesymbolsatthesinknodecanbewrittenas: z 4 ;j = i =11 X i =6 e i ;j y e i ; (1 j 6) : (3.3) matrices A;B asin[3]: A = 2 6 6 6 6 4 1 ;e 1 1 ;e 2 1 ;e 3 1 ;e 4 2 ;e 1 2 ;e 2 2 ;e 3 2 ;e 4 3 ;e 1 3 ;e 2 3 ;e 3 3 ;e 4 3 7 7 7 7 5 ; (3.4) B = 2 6 6 6 6 4 1 ;e 1 1 ;e 6 . . . . . . . . . 6 ;e 1 6 ;e 6 3 7 7 7 7 5 : (3.5) 30 Wealsoamatrix = 2 6 6 6 6 6 6 6 4 1 ; 6 1 ; 7 1 ; 8 1 ; 9 00 2 ; 6 2 ; 7 2 ; 8 2 ; 9 00 5 ; 6 3 ; 5 5 ; 7 3 ; 5 5 ; 8 3 ; 5 5 ; 9 3 ; 5 3 ; 10 3 ; 11 5 ; 6 4 ; 5 5 ; 7 4 ; 5 5 ; 8 4 ; 5 5 ; 9 4 ; 5 4 ; 10 4 ; 11 3 7 7 7 7 7 7 7 5 ; (3.6) here e i ;e j iswrittenas i;j forshort.Thenthesystemmatrixis: M = A B T : (3.7) Inthisexample,thesizesofmatrices A;B are3 4and6 6.Thusthesizeoftransfer matrixis3 6.Theoriginalsymbols x 1 ; 1 ;x 1 ; 1 ;x 1 ; 3 canbeseenasanoriginalmessageof length3.Andthereceivedsymbolsatsinknodecanbeseenasacodewordoflength6.It isappropriatethatweidentifythe3 6transfermatrix M withthegeneratormatrix G of a(6 ; 3)errorcontrolcode.HenceTheorem3.1isvfromtheperspectiveoftransfer matrix. 3.1.3.2TheNecessity Wehaveprovedthatanynetworkcodecanbeviewedasanerrorcontrolcode,nowwewill considerthereverseproblem.Forapoint-to-pointcommunication,anetworkcodeisfeasible onlyifitcansuccessfullydeliverallthedesiredsymbolsfromthesourcenodetothesink node.Theorem3.2showsthecriterionofafeasiblenetworkcode. Theorem3.2. Foralinearnetworkwithsourcenode u ,sinknode v andadesiredconnection C =( u;v; X ( u )) ,thepoint-to-pointconnection C ispossibleifandonlyifthedeterminant ofthe R ( C ) R ( C ) transfermatrix M isnonzero. Theproofofthistheoremcanbefoundin[3]. SincetherearenoredundanciesinthenetworkcodeinTheorem3.2,thedimensionof symbolsinsourcenodeis R ( C )= jX ( u )) j ,andthedimensionofreceivedsymbolsinsink nodeisalsoexactly R ( C ).Therefore,thesizeofthetransfermatrix M is R ( C ) R ( C ). 31 Theorem3.3. Foralinearnetworkwithsourcenode u ,sinknode v andadesiredconnection C =( u;v; X ( u )) ,A(n,k)errorcontrolcodewiththe k n generatormatrix G canbeseen asafeasiblenetworkcodeinthepoint-to-pointconnection C ifwehavetherelationship: k R ( C ) . Proof. Ifthetheoremistruefor k = R ( C ),itwillalsoworkfor k>R ( C ).Itisstraightfor- wardthatanetworkcodecansuccessfullycompletethepoint-to-pointconnectionofwhich therateislowerthanthecode'smaximumcapacity.Soweonlyneedtoprovethecasewhen k = R ( C ). Fromthestatementsinpreliminarysection,wehave k n fora( n;k )errorcontrolcode. Foramessagesequences( x 1 ;x 2 ; ;x k ),wehaveencodeitasfollows: ( z 1 ;z 2 ; ;z n )=( x 1 ;x 2 ; ;x k ) G: (3.8) Because k n ,wecanchoose k independentcolumns( l 1 ;l 2 ; ;l k )from G toformanew matrix G 0 ,whichhastherelationship ( z l 1 ;z l 2 ; ;z l k )=( x 1 ;x 2 ; ;x k ) G 0 : (3.9) Inthiscase, G 0 isa k k fullrankmatrixwithnonzerodeterminant.Ifweview( x 1 ;x 2 ; ;x k ) assymbolsatsourcenode u withtherate R ( C )= k and( z l 1 ;z l 2 ; ;z l k )assymbolsre- ceivedatsinknode v ,then G 0 isthetransfermatrixofthenetworkcode.Accordingto Theorem3.2,thispoint-to-pointconnection C withrate R ( C )= k ispossible.Sointhis case,the( n;k )errorcontrolcodecanbeseenasafeasiblenetworkcode. Inthecasethatthesizeoftransfermatrixislargerthan R ( C ) R ( C ),therepresented errorcontrolcodewillhaveredundancieswhichcanbeusedtocontrolerrors.However, basedonouranalysis,wecanaddredundanciesappropriatelysothatthenetworkcodeis capableofdetectingandcorrectingerrors.Thiscanbedoneintwosteps: 32 1. Accordingtothecommunicationchannelandthedesignrequirements(numberoferrors todetectorcorrect,biterrorrate,etc.),determineanappropriatecoderate k=n and thetypeoftheerrorcontrolcode(Hammingcode,Cycliccode,etc). 2. Accordingtothesourcerate R ( C ),chooseproper k suchthat k R ( C )and n ,and derivethecorrespondinggeneratormatrix G .Thenapplythegeneratormatrix G as thesystemtransfermatrixtothenetworkcoding. 3.1.3.3ApplicationinCombatingpollutionattacks Forexample,inalinearnetworkshowninFigure3.7,thesourcenodeisgoingtosend4 symbols x 1 ;x 2 ; ;x 4 tosinknode.AccordingtoTheorem3.3,wecanapplythe(7 ; 4) Hammingcodewithgeneratormatrix G = 2 6 6 6 6 6 6 6 4 1000110 0100101 0010011 0101010 3 7 7 7 7 7 7 7 5 : (3.10) ThecorrespondingnetworkcodeisalsoshowninFigure3.7.Becausetheminimumdistance ofthecodeis3,thisnetworkcodecancorrect1biterror.Supposethesourcenodesends4 symbols(1 ; 0 ; 1 ; 0),theexpectedreceivedsymbolswillbe(1 ; 0 ; 1 ; 0 ; 0 ; 0 ; 0).However,because themaliciousnode M changesthesymbol,thereceivedsymbolswillbe s =(1 ; 0 ; 1 ; 0 ; 1 ; 0 ; 0). Sinknodecandecodethereceivedsymbolusingthesyndrome-decodingmethod.Theparity- checkmatrixinsinknodeis H = 2 6 6 6 6 4 1101100 1011010 0111001 3 7 7 7 7 5 : (3.11) Withthesyndromeofthereceivedsymbolscalculatedas s H T =(0 ; 0 ; 1),thesinknodecan theerrorpattern(0 ; 0 ; 0 ; 0 ; 1 ; 0 ; 0)andcorrecttheerroneoussymbol.Fromthelocation 33 Figure3.7Implementthe(7 ; 4)Hammingcodeinnetworkcoding oftheerroneoussymbol,thesinknodecanalsothemaliciousnodefromwhich x 1 + x 2 istransmitted. Fromthisexample,wecanseethatwithproperdesignoferrorcontrolcode,wecanmake thecorrespondingnetworkcodecapableofdetectingandcorrectingerrorsthenthe maliciousnodes. 3.1.4MulticastCase Inprevioussection,westudytherelationshipbetweennetworkcodinganderrorcontrol codinginpoint-to-pointcommunicationcase.Wecanderivesimilarresultsforthemulticast case,wherethenetworkconsistsofonesourcenode u andseveralsinknodes v 1 ;v 2 ; ;v N . Thenetworkcodeformulticastingisfeasibleifandonlyifallthesinknodescanreceiveall symbols X ( u )sentfromsourcenode u . 3.1.4.1The Theorem3.1stillholdsinmulticastcasebecausewedonotspecifywhetherthecommuni- cationisunicastormulticastintheproofofthetheorem,whichindicatestheproofofthe theoremisindependentofthetypeofthecommunications. 34 3.1.4.2TheNecessity Themulticastproblemcanbedividedinto N unicastproblems: C =( C 1 ;C 2 ; ;C N )=(( u;v 1 ; X ( u )) ; ( u;v 2 ; X ( u )) ; ; ( u;v N ; X ( u ))) : (3.12) Ifwewriteallthereceivedsymbolstogether: z =( z 1 ;z 2 ; ;z N )=( Z ( v 1 ; 1) ; ;Z ( v 1 ; ( v 1 )) ; ;Z ( v N ; ( v N ))) ; (3.13) wecanobtainthesystemtransferequationforthewholenetwork: z = x M ,inwhich M is amatrixdas M = jX ( u ) j i = N X i =1 ( v i ) : (3.14) Itisobviousthat M = M 1 j M 2 jj M N ; (3.15) inwhich M 1 ;M 2 ; ;M N arethesystemtransfermatrixesforeachunicast C 1 ;C 2 ; ;C N . Theorem3.4. Foralinearnetworkwithsourcenode u ,sinknode v 1 ;v 2 ; ;v N anddesired connections C i =( u;v i ; X ( u ))(1 i N ) ,Aconcatenationof N errorcontrolcodeswith the k n i generatormatrix G i canbeseenasafeasiblenetworkcodeinthemulticastproblem C =( C 1 ;C 2 ; ;C N ) ifwehavetherelationship: k R ( C i ) . Proof. ThistheoremisanaturalextensionofTheorem3.3.If k R ( C i ),allthegenerator matrix G i canbeseenasfeasiblenetworkcodesinunicastproblem C i .Sotheconcatenation of G i canbeseenasafeasiblenetworkcodeinmulticastproblem C . 3.2ACascadedErrorControlCodingApproach 3.2.1ModelsandAssumptions Ifarelaynode r iscompromised,thesymbolstransmittedoneachedge e suchthat head ( e )= r willbemoThenodesafternode r willbepollutedbecauseofthenetworkencoding. 35 Eventuallythesinknodewillreceivemoreerroneoussymbolsthanthoseoriginallybrought bythemaliciousnode.Inthissection,wetrytoexploretheinnerstructureofthenetwork codetocorrecttheerrorsandlocatethemaliciousnode.Wewillpartitionthenetworkinto severalcascadedlevelsandexploretheinnerstructureofthenetworkcode,thuswemustbe abletocorrectlyaccesstheoutputsofalltherelaynodes.Torealizethis,weaddaspecial monitornodeinthenetwork.Thisnodecancollecttheoutputencodedmessagesfromall therelaynodesandcanneverbecompromised. 3.2.2AnIllustrativeExample Letusexaminetheclassicexample[1]againshowninFigure3.1.Byencodingatnode 4,bothnodes6and7canretrievethetwosymbols x 1 ; 1 ;x 1 ; 2 successfully.InSection3.1, wemergedtheintermediatenodesandpathsandtransferredthethenetworkcodeintoa bipartitegraph.Whileinthissection,wetrytoexplorethenetworkcodetoexhibitthe innerstructureofthenetworkcode.Toexplainourmainidea,wewillonlyfocusonthe communicationbetweennode1andnode6(theshadedareainFigure3.1).Theanalysisis similartothecommunicationbetweennode1andnode7.Inthiscommunication,symbol x 1 ; 1 ispassedtonode2,node4,node5andnode6throughonehop,twohops,threehops andtwohopsrespectively,andsymbol x 1 ; 2 ispassedtonode3,node4,node5andnode6 throughonehop,twohops,threehopsandfourhopsrespectively.AsshowninFigure3.8,if weaddtwovirtualnodes v 1 and v 2 onedge e 5 ,wecanmake x 1 ; 1 passedtonode6through fourhops,thusturnalloftheintermediatenodesinto3cascadedlevels.Eachofthelevel canbeseenasasinglenetworkcode,sowecanrepresenteachlevelusingthebipartite graphshowninFigure3.9accordingto[92].Inthisway,weexploretheinnerstructure oftheoriginalnetworkcode,whichisdeterminedbythenetworktopology.Theoriginal networkcodecanbeviewedas3cascadederrorcontrolcodeswiththegeneratormatrices 36 Figure3.8TransferthenetworkcodingschemeinFigure3.1intoa3-levelcascadedcoding byadding2virtualnodes. Figure3.9Thecorrespondingbipartitegraphsof3cascadedlevelsinFigure3.8 2 6 4 110 001 3 7 5 ; 2 6 6 6 6 4 10 01 01 3 7 7 7 7 5 ; 2 6 4 10 01 3 7 5 : Althoughinthisexample,thereisnoredundancyinthethreeerrorcontrolcodes,the correspondingnetworkcodecannotdetectorcorrecterrors,itisttoshowthat networkcodecanbeexpandedtocascadederrorcontrolcodes. IntherestofSection3.2,wewillshowthatnetworkcodescanbetransferredintocascaded errorcontrolcodes.Inthisway,wecancharacterizeanddesignnetworkcodesbasedonthe underlyingcascadederrorcontrolcodesforerrordetection/correctionandmaliciousnodes locating. 37 Figure3.10Transferincomingedgesofnodeshavingmultipleincomingedgesbyadding virtualnodes 3.2.3CharacterizationofNetworkCodingusingCascadedErrorControlCod- inginPoint-to-PointCommunication Herewewillformallystatetherelationshipbetweennetworkcodingandcascadederror controlcodinginthepoint-to-pointcommunication.Theciencyisstudiedthenthe necessity. 3.2.3.1The Theorem3.5. Everynetworkcodeschemecanbeexpandedtoaseriesofcascadederror controlcodes. Proof. Toprovethis,wewillshowthatthenetworkcodecanbepartitionedintoseveral cascadedlevelsofonehopnetworkcodes.Foreachofthenodesthathavemultipleincoming edgesinthenetwork,weaddsomevirtualnodesontheseedgesasshowninFigure3.10.For eachoftheincomingedges,theremaybeseveralpathsthroughwhichmessagesarepassed fromthesourcenodetonode u includingtheedge.Amongallthepaths,wethelongest oneandcalculateitsnumberofhops.Aftercalculatingthehopvalues h 1 ;:::;h m forallthe incomingedges,wechoosethemaximumvalue h max .Foreachoftheincomingedge i ,we add h max h i virtualnodesonit,makingallthepathsfromsourcetonode u havethesame countofhops.Thevirtualnodessimplyforwardthemessagespassedonthecorresponding edges. 38 Figure3.11Partitionanetworkcodeintoseverallevels AftertheoperationinFigure3.10isperformedinallthenodeshavingmultipleincoming edges,sinceallthepathsfromsourcenodetothesametherelaynodehavethesamehop countsandthesinknodeitselfmusthavemultipleincomingedges,everypathfromthesource nodetothesinknodehasthesamenumberofhops,thusthesamenumberofintermediate nodes,includingtherelaynodesandthevirtualnodes.Wecanputthenodeshavingthe samehopcountstogetherasalevelasshowninFigure3.11.Everysinglelevelcanbeviewed asonehopnetworkcodedeterminedbytheconnectionsfromnodesofthepreviouslevel.So everynetworkcodecanbepartitionedintoseveralcascadedlevelsofonehopnetworkcode. AccordingtoTheorem3.1,theseonehopnetworkcodescanberepresentedbyerror controlcodes.Sothecascadednetworkcodescanberepresentedbyconcatenatingthe correspondingerrorcontrolcodestogether.Wecanexpandanynetworkcodetoaseriesof cascadederrorcontrolcodes. TakingthenetworkcodeinFigure3.5asanexample.Thesourcenode1transmitsthree symbols x 1 ; 1 ;x 1 ; 2 ;x 1 ; 3 tosinknode4inthisnetworkcode.Andsinknode4canreceive 6encodedsymbols,whichindicatesthatthereareredundanciesinthisnetworkcoding.In Section3.1,weanalyzethesamecodeandtransferitintoa(6 ; 3)errorcontrolcodewhich cancorrect1error.Herewewillshowthiscodecanbetransferredintoaseriesofcascaded errorcontrolcodes.FollowingtheoperationsmentionedintheproofofTheorem3.5,we cangetthecorrespondingcascadednetworkcodesandcascadederrorcontrolcodesshown 39 Figure3.12ThecorrespondingcascadedbipartitegraphofFigure3.5 inFigure3.12.Nodes v 1 and v 2 areaddedasvirtualnodestopartitiontheoriginalnetwork code.Thelevelerrorcontrolcodeisa(5 ; 3)codeandthesecondlevelcodeisa(6 ; 5) code. Ifanerroroccursonedge e 1 ,node2willreceivewrong x 1 ; 1 .Theerrorwillpropagateto thesucceedingnodes,thustherewillbetwoerroneous x 1 ; 1 and x 1 ; 1 + x 1 ; 3 inthesinknode, whichisbeyondtheerrorcorrectioncapabilityofthe(6 ; 3)errorcontrolcode.Theerrors cannotbedealtusingthetransformingmethodsinSection3.1.However,ifthemonitor nodecancollecttheoutputsymbolsofthelevel(5 ; 3)code,itcancorrecttheerroneous symbol x 1 ; 1 innode2.Sotheerrorpropagationiseliminatedfromthebeginning.By exploringtheinnerstructureofthenetworkcode,wecanmakebetteruseoftheredundancy inthenetwork. Ifnode3isanmaliciousnodeandsendoutcorruptedmessages,therewillbe3errors intheoutputofboththelevelerrorcontrolcodeandthesecondlevel.Theerroris beyondthecapabilityofthecascadederrorcontrolcodes,sowecannotcorrecterrorsor 40 locatethemaliciousnode.Wewillshowthatwecandesignnetworkcodescorresponding topropercascadederrorcontrolcodestocorrecterrorsandlocatemaliciousnodesinthe sectionsbelow. 3.2.3.2TheNecessity Wehaveprovedthatanynetworkcodecanbeviewedasaseriesofcascadederrorcontrol codes,nowwewillconsiderthereverseproblem.Forapoint-to-pointcommunication,a networkcodeisfeasibleonlyifitcansuccessfullydeliverallthedesiredsymbolsfromthe sourcenodetothesinknode. Theorem3.6. Foralinearnetworkandadesiredconnection C =( u;v; X ( u )) ,Aseriesof cascadederrorcontrolcodeswithparameters ( n 1 ;n 0 ) ; ( n 2 ;n 1 ) ;:::; ( n m ;n m 1 ) ,canbeseen asafeasiblenetworkcodeintheconnection C ifwehavetherelationship: n 0 R ( C ) . Proof. Supposetheoriginalmessageis x =( x 1 ;:::;x k ),theoutputencodedmessagefor eachlevelofthecascadederrorcontrolcodesis y i =( y i; 1 ;:::;y i;n i )(1 i m )andthe generatormatrixforeachofthecascadederrorcontrolcodesis G i (1 i m )ofthesize n i 1 n i . y i foreachlevelcanbewrittenas: y 1 = x G 1 ; y 2 = y 1 G 2 ;:::; y m = y m 1 G m : (3.16) Sotheentireencodingequationforthecascadederrorcontrolcodescanbewrittenas y m = x G 1 G 2 G m = x G: (3.17) Ifweviewthecascadederrorcontrolcodesasanerrorcontrolcodewiththegenera- tormatrix G ofthesize n 0 n m ,theparameterforthecodeis( n m ;n 0 ).Accordingto Theorem3.3,if n 0 R ( C ),thenetworkcodeisfeasible. Basedontheanalysis,byimplementingtheerrorcontrolcodeforeachlevelofthe cascadederrorcontrolcodes,wecanaddappropriateredundanciesintothenetworkcodeto controlerrorsandlocatemaliciousnodes.Thiscanbedoneintwosteps: 41 1. Accordingtothenetworktopology,determinethenumberoflevelsofthecascaded codes.Accordingtothedesignrequirements(numberoferrorstodetectorcorrect, numberofmaliciousnodestolocate),determineanappropriatecoderateandthetype oftheerrorcontrolcodeforeachlevel. 2. Accordingtothesourcerate R ( C ),chooseaproper n 0 suchthat n 0 R ( C ),andderive therestofthe n i (1 i m )basedonthecoderateforeachleveloftheerrorcontrol codes.Generatethegeneratormatrices G 1 ;:::;G m accordingtothecodetypesand applythemasthesystemtransfermatricestoeachlevelofthenetworkcodes. 3.2.3.3ApplicationinCombatingpollutionattacks Theorem3.7. Suppose d i ;d i +1 > 2 aretheminimumdistancesof2adjacentlevels( L i ;L i +1 ) ofthecascadednetworkcode.If 2 d i +1 >d i +1 ,thenerrorsin L i +1 spreadbyasingleerror in L i isuncorrectablebythe L i +1 'serrorcontrolcode.However,theycanbecorrectedby the L i 'serrorcontrolcode. Proof. Accordingto[89],onesymbolinthesourcemessageisrelatedtoatleast d min symbols intheencodedcodeword.Sooneerrorin L i canbecomeatleast d i errorsin L i +1 .If 2 d i +1 >d i +1 ,theseerrorsarebeyondthecapabilityoftheerrorcontrolcode.However, thesingleerrorcanbecorrectedat L i because d i > 2.Thentheerrorsin L i +1 canbe correctedaccordingly. LetusanalyzethelinearnetworkshowninFigure3.13,thesourcenode1isgoingto send3symbols x 1 ;x 2 ;x 3 tosinknode12.Thisnetworkcanbepartitionedinto2levels. Nodes2 ; 3 ; 4 ; 5formthelevelandnodes6 ; 7 ; 8 ; 9 ; 10 ; 11formthesecondlevel.Inorder togetthebesterrorcontrolcapability,weimplementtwosystematicRScodesinthetwo levels.Theyare(7 ; 3 ; 5)codeforlevel1and(11 ; 7 ; 5)codeforlevel2.Theminimum distancesofthetwocodesareboth5,thusbothofthemcancorrect2errors.Becausethe errorsoccurringnexttothesourcenodearemoresensitive.Theymaypropagatetothe 42 subsequentnodescausingmuchmoreerrors.Weputthelowerratecodethathasstronger errorcontrolcapabilityatthelevel. Whenthereisnoerrorinthenetwork,wehave( y i; 1 ;y i; 2 ;y i; 3 )=( x 1 ;x 2 ;x 3 ) ;i =1 ; 2 : Itiseasyforthesinknodetodecodethemessages.Ifnode6isamaliciousnodeandit sendsouterroneous y 2 ; 1 ;y 2 ; 2 ,themonitornodecancorrectthese2errorsusingthesec- ondlevelRScodeandoutthismaliciousnodeaccordingtothenetworktopology.If node2isamaliciousnodeanditsendsouterroneous y 1 ; 1 ;y 1 ; 2 ,theerrorswillpropagate to y 2 ; 1 ;y 2 ; 2 ;y 2 ; 8 ;y 2 ; 9 ;y 2 ; 10 ;y 2 ; 11 ,whichpreventsthesecondlevelcodefromcorrectingthe errors.InthecorrespondingcascadedbipartitegraphFigure3.14,theerrorsaremarked withgreycolor.Itisclearthat2errorsfromlevel1spreadto6errorsinlevel2.Evenifwe transferthenetworkcodeintoone(11 ; 3 ; 9)RScodewhichiscapableofcorrecting4errors accordingtoSection3.1,theerrorsarestilltoomanytocorrect.However,basedonthefact thattheerrorsareburstandcorrelated,afterthemonitornodecollectstheoutputsofthe level,itcancorrectthe2errorsoccurringinnode2usingthelevelRScode, outthemaliciousnodebasedonthenetworktopologyandcorrectthe6errorsinthesecond level.OurcascadedRScodecancorrectatmost6errorsbyexploringtheinnerstructure ofthecodeandismorepowerfulthanregularRScodes. Withproperdesignofeachlevelofthecascadederrorcontrolcodes,wecanmakethe correspondingnetworkcodecapableofdetectingandcorrectingerrorsthenlocatingthe maliciousnodes. 3.2.4MulticastCase Becauseinpoint-to-pointcommunicationcase,ourproofsfortherelationship(writtenas R nc;cec )betweennetworkcodeandcascadederrorcontrolcodesaresolelydependedon theproofsfortherelationship(writtenas R nc;ec )betweennetworkcodeanderrorcontrol codeinSection3.1(Theorem3.1andTheorem3.3)andthiskindofdependencehasno relationshipwiththespcommunicationcase,wecanprovethat R nc;cec inthemulticast 43 Figure3.13Implementa2levelcascadederrorcontrolcodeinnetworkcoding Figure3.14ThecorrespondingcascadedbipartitegraphofFigure3.13 caseissimilartothatinthepoint-to-pointcommunicationcase,basedonthefactthatin Section3.1 R nc;ec staysthesameinbothpoint-to-pointandmulticastcases. 44 CHAPTER4 COMBATINGPOLLUTIONATTACKSFORRANDOMNETWORKS Inthischapter,wewillproposeanewerror-detectionanderror-correction(EDEC)scheme todetectandremovethemaliciousattacksforrandomnetworkcoding.TheproposedEDEC schemeissimilarinstructuretotheexistingerrorcontrolbasedschemes.However,itcan maintainthroughputunchangedwhenmoderatenetworkpollutionexistswithonlyaslight increaseincomputationaloverhead.ThenweproposeanimprovedLEDECschemeby integratingthelow-densityparitycheck(LDPC)decoding.Ourtheoreticalanalysisdemon- stratesthattheLEDECschemecanguaranteeahighthroughputevenforheavilypolluted networkenvironment.Wealsoprovideextensiveperformanceevaluationandsimulation resultstovalidateourtheoreticalresultsusingns-2networksimulator. 4.1System/AdversarialModelsandAssumptions Inthischapter,wewillstudycombatingpollutionattacksfortheencodingforrandom networks,whereitistodesignterrorcorrectionnetworkcodeswithoutthe knowledgeofthenetworktopology.Inthiscase,alltheencodingcots l;e ; e 0 ;e and e 0 ;j willbechosenrandomly. Inthischapter,wewillusesomenotationsofSection2.1.Forasourcenode u ,thereisasetofsymbols X ( u )=( x 1 ;:::;x l )tobesent.Foralink e betweenrelaynodes r 1 and r 2 ,writtenas e =( r 1 ;r 2 ),thesymbol y e transmittedonitisthefunctionofallthe y e 0 suchthat head ( e 0 )= r 1 .And y e canbewrittenas: y e = X e 0 : head ( e 0 )= r 1 e 0 ;e y e 0 = l X i =1 e;i x i = e x ; (4.1) where e 0 ;e isthelocalnetworkencodingcot, e;i istheglobalnetworkencoding 45 Figure4.1Applyerrorcontrolcodesinlinearnetworkcoding cotforsymbol y e and e = e; 1 ; e; 2 ;:::; e;l isthenetworkencodingvector.For asinknode v ,thereisasetofincomingsymbols y e 0 ( e 0 : tail ( e 0 )= v )tobedecoded. Aswementionedabove,ifadversariescanmodifythecontentsofthepacketsandsend themtothesucceedingrelaynodes,thecommunicationwillfailandthecapacitywillbe reduced.Inaddition,foralargescalenetwork,asmallerroroccurringatanintermediate relaynodemaytomanypacketsatthesinknode.Thiscancauseatwaste ofnetworkresourcesandsometimescanevenruinthewholenetworkcommunication. Inthisdissertation,themaliciousnodecanaddrandomerrorstothesymbolsinthe receivedpacketsthensendthecorruptedpacketsouttopollutethenetwork.Weadoptthis simpleadversarialmodelbecausewemainlyfocusonthethroughputimpactbroughtby tstrategies(discardvs.keep)towardscorruptedpacketsinthisresearch. 4.2ProposedEDECScheme ThebasicideaoftheproposedEDECschemeinFigure4.1isthatthesourcenodesencode theoriginalmessagesusinganerrorcontrolcodebeforesendingthemout.Theproperties oftheerrorcontrolcodekeepunchangedduringthelinearnetworkcoding. AswementionedinSection1.1.3,theerror-detectionbasedschemesmainlyfocuson detectingthecorruptpackets.Whenacorruptpacketisidenthroughsyndromes,it willbediscarded.Soifanadversarycontinuestocorruptcertainpackets,thesepacketswill becontinuouslydroppedandthecommunicationmayneversucceed.Therefore,weneedto 46 Figure4.2Limitationsoferrorcontrolcodes developtechniquesthatcanimprovethethroughputforthesesituations. 4.2.1EDECScheme SimilartotheNullKeyscheme,ourapproachalsoutilizestheerrorcontrolcode,butwe useboththeerrordetectionanderrorcorrectionproperties.Whenacorruptedpacketis detected,wedonotdropit.Insteadwecollectthecorruptedpackettothesinknodeto correcttheerrors.However,thecorruptedpacketwillnotparticipateinnetworkcodingin thesubsequentrelaynodesonceitisidentobecorrupted. 4.2.1.1LimitationsofErrorControlCode Alinearerror-correctingcodeencodestheoriginal k bitsmessagesymbol m toan n bits codeword c usingageneratingmatrix G k n .Sothecoderateis r = k=n .Supposethe minimumdistanceis d ,accordingtotheresultsinthePreliminary,themaximumnumber oferrorswecancorrectis j d 1 2 k .Ifthenumberoferrorsismorethanthisamount,wemay correctthecorruptedcodewordintoafalseone,asillustratedinFigure4.2. 4.2.1.2MoErrorControlCode Theconventionalerrorcontrolcodemayhaveundetecteddecodingerrors.Thisisaninherent nature.Nomatterhowlowwesetthecoderate,theseundetectederrorsmayexist.The 47 Figure4.3TheencodingprocessofmoerrorcontrolcodeinEDECscheme decodingerrorscanonlybedetectedusingmechanismsotherthanastand-aloneerror- correctingcode. Therefore,weproposetoapplymoerror-controlcodetobothmessagesymbolsand networkcodingcoetsinequation(4.1).Inthissection,wewillusethemessagesymbol asanexample.Theoriginalmessagesymbol m ismappedtoa t bitvalue h usinga homomorphicMACalgorithmlike[33].The t bitswillbeappendedto m toformanew k + t bitsmessagesymbolandtogetthecodewordbyencodingthisnewmessagesymbol.So thecodebecomesan( n;k + t )code.Byaddingtheextrabits,wecanmitigatelimitations oftheconventionalerror-controlcode.Figure.4.3illustratesthemoencodingscheme. Uponasuccessfuldecoding,thedecodedmessagesymbolissplitintotwoparts m 0 and h 0 .Thenwecalculatethemappingof m 0 : h 00 .If h 00 doesnotequalto h 0 ,we candetectadecodingerror.Ourmoisequivalenttochoose2 k messagesymbols from2 k + t symbols.Othermessagesymbolsinthe2 k + t symbolspaceareconsideredtobe illegal.However,thedecodingalgorithmonlyguaranteesthatthedecodedcodewordisin the k + t dimensionalsubspace.Soifthecorrectedcodewordbelongstothe2 k + t 2 k illegal symbolspace,weknowthedecodingcontainserror.Figure4.4illustratesthecorresponding modecodingscheme. Theorem4.1. Supposeadecodingerroroccurs,thewrongcodewordwillbeanycodeword inthe 2 k + t symbolspace.Sotheprobabilityofdetectinganerroneousdecodingis: p = 2 k + t 2 k 2 k + t = 2 t 1 2 t =1 1 2 t : (4.2) 48 Figure4.4ThedecodingprocessofmoerrorcontrolcodeinEDECscheme Table4.1Fourcasesofdecodedcodewordsinmoerrorcontrolcode Case kbitsoriginalsymbol t bitsmappingvalue Results 1 Decodedright Obeymappingrule Successfully 2 Decodedright Violatemappingrule Falsealarm 3 Decodedwrong Obeymappingrule Missdetect 4 Decodedwrong Violatemappingrule Successfully Asanexample,when t =4 ;p = 15 16 ,theprobabilityfor3consecutiveerroneousdecodings tobedetectedis1 1 16 3 ˇ 0 : 9998.Therefore,weonlyneedtoaddaverysmalloverhead todetecterroneousdecodings. 4.2.1.3PerformanceofMoErrorControlCode Inthissection,wewillselectacycliccodewith n =15, t =4todemonstrateourproposed scheme.Weaddsomeerrorstotheencodedsymbols,thendecodethesesymbolsas describedinFigure4.4.Weevaluatetheperformancebycheckingthenumbersofdecoded codewordsinfourtcases.TheresultsaresummarizedinTable4.1. Codewith k =6Inthissimulation,weusea(15 ; 10)codefortheevaluation.Fromthe results(seeFigure4.5)wecansee:(i)Thiscode(withminimumhammingdistance4)can detectandcorrectallthe1biterrorandpartofthetwobitserrors.(ii)Wecansuccessfully detectmostofthedecodingerrorswhenthenumberoferrorsismorethan2.Infact, 49 Figure4.5PerformanceofmoerrorcontrolcodeinEDECschemewhen k =6 thedetectionprobabilityislargerthan0.8mostofthetimeexceptfor14errors,inwhich theprobabilityisabout0.6.(iii)Falsealarmcannotbedistinguishedfromthesuccessful detection.Infact,thefalsealarmisalsocausedbytheerrorsbeyondthecorrectingability. Theonlyerenceisthatthe t bitsappendixpartofthesymbolisdecodedwrong.However, thefalsealarmisneglectableaccordingtotheresults. Codewith k =4Inthissimulation(seeFigure4.6),weusea(15 ; 8)codetodothe evaluation.Theonlyisthatthiscodeisabletocorrectmoreerrorsbecausethe coderateisrelativelylower.Fromtheresultsofthetwotcoderates,wecanseethat adding4extracheckbitsisenoughtodetecterroneouscorrection. 4.2.1.4AlgorithmsforEDECScheme TheproposedEDECschemeisdividedintotwophases:initializationphaseandtransmission phase.Theinitializationphaseisfornullkeyandsecurityparameterdistributionwhiledata symbolsaretransmittedthroughnetworkcodinginthetransmissionphase. 50 Figure4.6PerformanceofmoerrorcontrolcodeinEDECschemewhen k =4 InitializationPhase Ininitializationphase,thesourcenodewilldistributetherow vectorsoftheparitycheckmatrixcorrespondingto G inAlgorithm4.1(nullkeys)toall therelaynodessimilarto[44]usinghomomorphichashes.Unlikenormallinearnetwork codinginwhichthenetworkencodingvectorswillbeattachedtothestartortheendof thepackets,weproposetoinserttheencodingvectorstoapredeterminedsecretlocationin thepackets.Thesourcenodewillsendthelocationinformationtoallthesinknodesduring initializationphasethroughasecuretransmissionprotocolsuchasTLS[93].Thiswill preventthemaliciousnodesfromcorruptingtheencodingvector,whichisessentialforthe datadecoding.Moreover,thesourcenodewillalsosendtheencodingmatrix G c fornetwork encodingvectorsand G fordatasymbolstoallthesinknodes.Oncetheinitializationphase isdone,thesourcenodescanmulticastanynumberofpacketstosinknodes.Theoverhead oftheinitializationphaseisnegligible. 51 TransmissionPhase Inthetransmissionphase,thesourcenodes,relaynodesandsink nodeswillperformtheproposedEDECschemeaccordingtoAlgorithm4.1,4.2and4.3. Algorithm4.1 EDECAlgorithmforSourceNodes for packeti do == Encodenetworkencodingvector i inequation(4.1)usingthemoderror-control code(Figure.4.3) h c map( i ) u c ( i j h c ) Encodednetworkencodingvector u c G c for everysymbol m ofthepacket do == Encode m usingthemoderror-controlcode(Figure.4.3) h map( m ) u ( m j h ) Encodedsymbol u G endfor Sendouttheencodedencodingvectorandsymbols endfor Inalgorithm4.1,thesourcenodewillencodethenetworkencodingvector i usingthe moerror-controlcodewithamuchlongerappendixandlowercoderate,comparedto theencodingofdatasymbols.Thiscanimprovetheerrorresistanceanddetectionproba- bilityforerroneousdecodingstoguaranteethecorrectnessofencodingvectorsusedfordata decoding.Sincethereisonlyoneencodingvectorineachpacket,theoverheadbroughtby thishighersecuritylevelisnegligible. Algorithm4.2presentstheEDECalgorithmforrelaynodes.Sincethenullkeys(rowvec- torsoftheparitycheckmatrixcorrespondingto G inalgorithm4.1)arealreadydistributed ininitializationphase,therelaynodescancheckwhetherapacketisintact. Algorithm4.3presentstheEDECalgorithmforsinknodes.Sincethesinknodehas alreadyreceivedtheencodingmatrix G c and G inAlgorithm4.1ininitializationphase,it canperformtheerror-controlcodedecodinganddetectionforerroneousdecoding.Thenit canderivetheoriginaldatasymbolsthroughdecodingofnetworkcoding. 52 Algorithm4.2 EDECAlgorithmforRelayNodes if everysymbolinthereceivedpacketisintact then if thepacketisindependent then Savethepacket if x (apredeterminednumber)packetsarecollected then repeat Generate x randomly,linearlycombinedpacketsusingthesavedpackets(network coding) until the x newpacketsareindependent Sendoutthe x packets endif endif else if thepacketisindependentfromallthepreviouspackets then Markthepacketascorruptedandsenditout endif endif Algorithm4.3 EDECAlgorithmforSinkNodes Apacketisreceived Decodethenetworkencodingvectorandeverysymbolinthepacketusingthedecoding algorithmforthemoerror-controlcode(Figure.4.4) if thenetworkencodingvectorandallsymbolsaredecodedcorrectly then if thepacketisindependent then Savethepacket if l (inequation(1.1))independentpacketsaresaved then Solvethenetworkcodingequations endif endif endif 4.2.2Simulationinns-2 Inthissection,thesimulationplatformforEDECschemeinns-2[94]ispresented.Then wewillcomparetheEDECschemeandtheerror-detectionbasedschemes.Inthesimulation, weimplementtheNullKeyschemetorepresenttheerror-detectionbasedschemes. 53 4.2.2.1SimulationPlatform ns-2isadiscreteeventsimulatorthatprovidescomprehensivesupportforsimulationof networkprotocols.ItisidealforthesimulationofEDECscheme.Thescenarioissetasa gridnetworkwithonesourcenode,anumberofrelaynodesandsinknodes.Allthenodesare setaswirelessnodesusingwirelessphysicallayer,802.11MACprotocolandAODVrouting protocol.ThewirelesschannelissettoTwoRayGround.Thenodestransmitpacketsusing broadcasting.Onceanodereceivesapacket,itwillstartthecorrespondingoperations dependingonitstype(source,relay,malicious,sink)andthepacketcontent. Figure4.7showsthetopologyofthesimulatednetwork.Thesourcenodeislocated atthelowerleftcornerand19sinknodeslieattheupperright.Therestnodesareall intermediatenodesthatcanrelaypackets.Inthesimulation,werandomlypickanumber ofintermediatenodesasmaliciousnodestoperformpollutionattacks.Thesenodescanadd certainerrorstoreceivedpacketsbeforesendingthepacketsouttopollutethenetwork.We canchangethenumberofmaliciousnodestoevaluateperformanceofthealgorithmsunder tnetworkconditions.Asanexample,inFigure4.7,werandomlypick50nodesoutof 209intermediatenodestobemaliciousnodes.Themaliciousratioisabout50 = 209=24%. Therestoftheintermediatenodesactasrelaynodes.Afterreceivingapacket,theywill conductthepollutiondetection.Intheerror-detectionbasedschemes,ifthepacketis corrupted,itwillbedropped.WhileintheEDECscheme,wewillforwardthesepackets. However,thesepacketswillnotparticipateinnetworkcoding.Thenodesbehaviorswillbe detailedinthenextsection. Becausethepacketsaretransmittedthroughbroadcasting,althoughtheMACprotocolis IEEE802.11,wewillstillhavepacketscollisionsthatwilleventuallytthesimulation results.Inthisdissertation,weonlyfocusonnetworklayerprotocols.Thusafterconsidering thetransmissionrangeofthesinglenode,adjacentnodesareassignedttimeslots (seeFigure4.8)toavoidpacketscollisions.Thereare9timeslotsintotalandtheduration 54 Figure4.7Simulationscenario Figure4.89timeslotstoavoidpacketscollisions ofeachtimeslotis100ms.Thenodesareallowedtosendpacketsonlyiftheyareintheir ownslots.Ifnot,theywillhavetowaituntiltheirnextslots.InFigure4.8,wegivean examplefornodesthatbelongtotimeslot2tosimultaneouslytransmitwithoutpackets collisions. 55 4.2.2.2NodesDesign Aftersettingupthesimulationplatform,wecanfairlyevaluatethealgorithmswithoutcon- sideringotherfacts.Fourtypesofnodesaredesignedaccordingtothealgorithmsdescribed above. SourceNode Inthesimulation,thesourcenodewillmulticasta352-symbolmessage, whichisfragmentedinto32packetsof11symbols.Eachsymbolhasthesize k =512 bits.Inthewholenetworktherewillbe32linearlyindependentpackets.Thatis l =32in equation(1.1).Afterinitializingthenetwork,thesourcenodewillencodeeachdatasymbol usingthemoerror-controlcodepresentedinFigure.4.3with t =16.Theencoding vector i willalsobeencodedusingthemoerror-controlcodewith t =32.According toTheorem4.1,theprobabilityofdetectinganerroneousdecodingfortheencodingvector isabout1 2 10 ,whichmeansoncethedecodedencodingvectorpassthevin Figure.4.4,wecanviewtheencodingvectorasintact.Thenthesourcenodewillinsert encodednetworkencodingvectorintothepredeterminedlocationineachpacketandsend outthepacket. RelayNodes RelaynodeswillperformEDECschemeaccordingtoalgorithm4.2.Because thenetworkiscollisionfreeandallthetransmittedpacketscanbereceived,eachpacketonly needstobetransmittedonce.Soifanewlyreceivedvalidpacketislinearlydependentof previoustransmittedpackets,itwillbediscarded.Sincethereare32linearlyindependent packetsintotal,ifarelaynodedoesnottransmitpacketsuntilallthe32packetshavebeen received,thetimedelaywillbehuge.Moreover,arelaynodemayneverbeabletocollect allthe32validpacketsduetomaliciousattacks.Tousenetworkcodingtlywhile minimizingthetimedelay,relaynodeswillperformnetworkencodingoncetheycollect4 independentvalidpackets. 56 MaliciousNodes Similartotherelaynodes,maliciousnodesonlysendoutindependent packets.However,weassumethatthemaliciousnodeswillnotperformnetworkencoding inthiscase.Theyonlypollutepacketsandsendoutcorruptedpackets. SinkNodes Sinknodeswilldecodingboththenetworkcodingandtheunderlingmo error-controlcodeaccordingtoalgorithm4.3.Aftertheoriginalsymbolsaresuccessfully retrieved,allthepacketsreceivedafterwardswillbeignored. 4.2.2.3SimulationResults Weconductedsimulationsundertpercentagesofthemaliciousrelaynodes.Tomake theresultsmoreclear,wethenumberofbitsthatthemaliciousnodescancorrupt foreachsymbol.Thenwemakethisnumberrandomaccordingtoouradversarymodel. SmallNumberofErrors Whenthenumberofbitsthatmaliciousnodescancorrupt foreachsymboliswithinthecapabilityoferrorcontrolcodes,thethroughputcomparison betweentheEDECschemeandtheerror-detectionbasedschemesisshowninFigure4.9. Inthewecanseethat:(i)Whenthepercentageofmaliciousnodesislessthan 10%,theperformanceofthetwoschemesarealmostthesame.(ii)Withtheincreasingof themaliciousnodes,theperformanceoferror-detectionbasedschemesdegradetly. WhilethethroughoutoftheEDECschemeremainsunchanged.(iii)Whenthepercentage ofthemaliciousnodesislargerthan65%,theerror-detectionbasedschemesdonotworkat allbecausetoomanycorruptedpacketshavebeendumped.However,thethroughputforthe EDECschemestillremainsunchangedbecausetheEDECschemecansuccessfullyrecover allofthemessagesymbolsfromthecorruptedpackets.Thisscenariowillremaintrueas longasthecorruptedpacketsymbolsarewithinthecapabilityoftheerrorcontrolcodes.In thiscase,theEDECschemesurpassestheerror-detectionbasedschemesinthroughput. 57 Figure4.9ThroughputcomparisonbetweenEDECschemeandtheerror-detectionschemes basedonthenumberofbitcorruptedineachsymbol|forsmallnumberoferrors LargeNumberofErrors Whenthenumberofbitscorruptedineachsymbolofthere- ceivedpacketsisbeyondthecapabilityoftheerrorcontrolcodes,thethroughputcomparison betweentheEDECschemeandtheerror-detectionbasedschemesisshowninFigure4.10. Fromtheresultswecanseethattheperformanceofthetwoschemesarealmostthesame. ThisisbecausethecorruptedpacketsthatcannotberecoveredbytheEDECschemehave alreadybeendumpedbytheerror-detectionbasedschemes. RandomNumberofErrors Whenthemaliciousnodesaddsrandomerrorstothesym- bolsinthereceivedpackets,theperformanceoftheEDECschemeiscomparablewith theerror-detectionbasedschemes.Thisisbecausethatwhilesomeofthesymbolsinthe corruptedpacketscanbecorrected,butsomearebeyondthedecodingcapabilityofthe errorcontrolcode,whichmakesthepacketunusablewithresultsimilartothepacketbeing dumpedintheerror-detectionbasedschemes. 58 Figure4.10ThroughputcomparisonbetweenEDECschemeandtheerror-detection schemesbasedonthenumberofbitcorruptedineachsymbol|forlargenumberoferrors 4.3LDPCDecodingandLEDECScheme IntheEDECscheme,onlylinearlyindependentpacketsparticipateinthenetworkdecoding atthesinknodes.Corruptedorlinearlydependentpacketswillnotbeused.Inthissec- tion,wewillexploreutilizingthesepacketstorecovermoremessagesymbolsusingLDPC decoding. 4.3.1LDPCCode Lowdensityparitycheck(LDPC)linearblockcodewasintroducedbyGallagerin 1962[95].OneoftheimportantcharacteristicofLDPCcodeisitssparseparitycheck matrix.Byusingiterativedecoding,LDPCcodecanachieveerror-correctionperformance closetoShannonbounds[96].TheadvantagesofLDPCcodewerediscussedin[97,98]. SomenewclassesofasymptoticallygoodLDPCcodeswerestudiedin[99{101].Andsome decodingalgorithmsofLDPCcodeswerepresentedin[102{105]. LDPCcodescanbecategorizedastheregularLDPCcode,ofwhichtheparitycheck 59 Figure4.11AnillustrativeexampleofparitycheckmatrixandTannergraph matrix H hasednumberof1'spercolumnandperrow,andtheirregularLDPCcode[106], ofwhichtheparitycheckmatrixmayhavetnumberof1'sineachcolumnandeach row.Inthisdissertation,wewillformulatethenetworkcodingtotheirregularLDPCcode. 4.3.2DecodingofLDPCCode Theiterativedecodingalgorithm,knownasbeliefpropagationalgorithm(BPA),isgenerally usedtodecodetheLDPCcode.TheBPAisasoft-decisionalgorithmstudiedin[107{109]. Forabinaryerasurechannel(BEC),thebitsinthecodewordsarereceivedas0's,1'sor x 's(erasures).TheBPAcanbedescribedovertheTannergraph[110],whichisabipartite graph.InaTannergraph,therearetwotypesofnodes:thesymbolnodes(corresponding tothereceivedbits),andthechecknodes(correspondingtotherowsoftheparitycheck matrix).AnillustrativeexampleoftheparitycheckmatrixanditsTannergraphisshown inFigure4.11.Intheparitycheckmatrix,everyrowrepresentsaparitycheckequation. Thesymbolnodes,whichcorrespondtothebitsequalto1'sinarowoftheparitycheck matrix,areconnectedtothechecknodewhichcorrespondstothesamerow.Thesenodes andedgesintheTannergraphexpresstheparitycheckequationofthatrow.InFigure4.11, node h 1representstherowoftheparitycheckmatrix.Andthethesecondand thirdelementsoftherowinparitycheckmatrixare1's,sosymbolnodes d 1, d 2and d 3 areconnectedto h 1intheTannergraph. Thedecodingalgorithmcanbedescribedthroughthefollowingalgorithm: 60 Algorithm4.4 BPADecodingAlgorithmforBEC while Therearechecknodesconnectedtoonlyoneunknownsymbolnode do for Eachofthesechecknodes do Theunknownsymbolnode xor(Alloftheothersymbolnodesconnectedtothe checknode) endfor endwhile if Alltheunknownsymbolnodesarerecovered then Decodesuccessfully endif 4.3.3RelationshipBetweenLinearNetworkCodeandLDPCCode Inlinearnetworkcoding,packetsarelinearlycombinedattheintermediatenodes.The packetsthatarereceivedatthesinknodessatisfyequation(1.1).Inthenetworkcode decodingpartoftheEDECalgorithm,onlyindependentvalidpacketsareused.However, thereisalsohelpfulinformationinthelinearlydependentpacketsorcorruptedpackets.If wecanexploitandusethesepackets,wecanimprovethesystemperformance.Denotethe receivedencodingvectoras a i =( a 1 ;i ; ;a l;i ) T ,where1 i m and m isthenumberof receivedpackets.Thenthegenerationmatrixoftheblockcodecanbeas: G =[ a 1 a l ; a l +1 a m ]=[ P 1 ;P 2 ] ; (4.3) wherethematrix P 1 canbemadeasa l l fullrankmatrixthroughcolumnexchangeafter l independentpacketsarereceived,and P 2 isa l ( m l )matrix. Asanexample,supposethereisonlyonebit x i ineveryoriginalpacketinthesource node(1 i l ). x =( x 1 ;:::;x l ).Inthiscase,thereisalsoonlyonebit y j inevery receivedpacketinthesinknodes(1 j m ).Denoteallthe m receivedpacketsasavector y .Wehavethefollowingencodingequation: y = x G : (4.4) Thecorrespondingparitycheckmatrix H canbederivedasfollows. 61 thegeneratingmatrixas G = P 1 1 [ I l ;P 1 1 P 2 ] ; (4.5) andtheparity-checkmatrixas H =[( P 1 1 P 2 ) T ;I m l ] : (4.6) Wecanverifythecorrectnessof H byverifyingthefollowequation: G H T = P 1 1 [ I l ;P 1 1 P 2 ] [( P 1 1 P 2 ) T ;I m l ] T = 0 : (4.7) Afterderivingthecorrespondingparitycheckmatrix H ,wecandecodethelinearnetwork codeusingtheBPAalgorithm.ThelinearnetworkcodecanbeviewedasaratelessLDPC code,andhasthepropertyoferrorcontrolcodes. AlthoughlinearnetworkcodescanbeseenasratelessLDPCcodes,theBPAalgorithm cannotbeusedtodecodeanetworkcodeifthenetworkcodeisderivedafteranormal errorcontrolencode,becausewecannottheincorrectdecodingswhichcanbeviewed aserasures.However,forthemoerrorcontrolcodesintheEDECscheme,wecan determinetheerroneousdecodingsandmarkthecorrespondingbitsaserasures.Therefore, wecandecodethelinearnetworkcodeusingtheBPA. 4.3.4LEDECSchemeUsingBPA IntheLEDECscheme,weusethelinearlydependentpacketsandthecorruptedpacketsand decodethelinearnetworkcodeusingBPAalgorithm.Figure4.12illustratesthismainidea ofthedecodingalgorithm. 4.3.5TheoreticalAnalysis Whenthenumberoferrorsispartiallybeyondthedecodingcapabilityoftheerrorcontrol code,theLEDECschemecangetadditionalbfromdecodingoftheLDPCcodes. 62 Figure4.12MainideaoftheLEDECscheme Hereweusea(31 ; 15)cycliccodewithgenerationpolynomial x 16 + x 14 + x 10 + x 9 + x 8 + x 7 + x 5 + x 4 + x 3 + x 2 + x +1asanillustrativeexample.Itiseasytocalculatethat thereareonly17515entriesforthe4-biterrorsinthesyndrometable,whilethenumber forallthe4-biterrorsis 31 4 =31465.Itmeansonly17515distinct4-biterrorscanbe successfullycorrected.Inthissituation,4-biterrorsareconsideredtobepartiallybeyond thedecodingcapabilityofthe(31 ; 15)code.Thesuccessfulerrorcorrectionprobabilityis about17515 = 31465=0 : 5567.Becauseweusethemoerror-controlcode,whichcan detecttheerroneousdecodings,thefailederrorcorrectionscanbeseenaserasureswith erasureprobability P e =1 0 : 5567=0 : 4433. Considertheworstcaseinwhichalmostallthepacketsarecorruptedbythemalicious nodes.Theerror-detectionbasedschemesdonotworkatallbecauseallofthepackets aredumped.TheEDECschemedoesnotworkeitherbecausewitherasureprobability P e =0 : 4433therewillnotbeenoughcorrectablepacketstosolvethenetworkcodingequa- tion(1.1). FortheLEDECscheme,let d denotetheprobabilitythatanedgefromachecknode isconnectedtoasymbolnodeofdegree d ,and ˆ d denotetheprobabilitythatanedge fromasymbolnodeisconnectedtoachecknodeofdegree d intheTannergraphofthe correspondingLDPCcode.ThegeneratingfunctionsforanLDPCcodeisas: ( x )= 63 P d d x d 1 , ˆ ( x )= P d ˆ d x d 1 .Accordingto[111],themaximalfractionoferasuresthat arandomLDPCcodewithgivengeneratingfunctionscancorrectisboundedby P max = min f x (1 ˆ (1 x )) g (0 2 k 2canbederivedthesamewaythrough 77 Figure5.1Anexampleillustrationofmatrix S truncatingoperations. Let 0 ; ; q 1 beastrictlydecreasingintegersequencesatisfying0 < i ( i ) ; 0 i q 1,where i istheparameter fortheunderlyingregeneratingcode.Theleast commonmultipleof 0 ; ; q 1 is A .Supposethedatacontains B = A P q 1 i =0 ( i +1) messagesymbolsfromthe GF ( q 2 ).Inpractice,ifthesizeoftheactualdata islargerthan B symbols,wecanfragmentitintoblocksofsize B andprocesseachblock individually. Wearrangethe B symbolsintotwomatrices S;T asbelow: S = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 S 0 S 1 . . . S q 1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 ;T = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 T 0 T 1 . . . T q 1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 ; (5.13) where S i =[ S i; 1 ;S i; 2 ; ;S i ] ; T i =[ T i; 1 ;T i; 2 ; ;T i ] : (5.14) 78 S i;j ; 0 i q 1 ; 1 j i ,isasymmetricmatrixofsize i i withtheupper- triangularentriesdbydatasymbols.Thus S i;j contains i ( i +1) = 2symbols.The numberofcolumnsofeachsubmatrix S i ; 0 i q 1,isthesame: i i = A .Thesize ofmatrix S is( P q 1 i =0 i ) A .Soitcontains P q 1 i =0 ( i ( i +1) = 2) i =( A P q 1 i =0 ( i +1)) = 2 datasymbols.Figure5.1showsanexampleofmatrix S for q =4 ; 0 =6 ; 1 =5 ; 2 = 4 ; 3 =3.In5.1,thesubmatrix S i;j isrepresentedbythesquareinthecorresponding positionwiththesizerepresentingthesizeofthesubmatrix. T i;j (0 i q 1 ; 1 j i )isconstructedthesameas S i;j .So T hasthesame structureas S andcontainstheother( A P q 1 i =0 ( i +1)) = 2datasymbols. 1. ForaHermitiancode H m over GF ( q 2 ) ,weencodematrix M dim( H m ) A = [ M 1 ;M 2 ;M A ] byencodingeachcolumn M i ;i =1 ; 2 ; ;A ,individuallyusing H m .The codewordmatrixisdas H m ( M )=[ H m ( M 1 ) ; H m ( M 2 ) ; ; H m ( M A )] ; (5.15) where H m ( M i ) hasthefollowingform( % 2 L ( mQ ) ): [ % ( P 0 ; 0 ) ; ;% ( P 0 ;q 1 ) ; ;% ( P q 2 1 ; 0 ) ; ;% ( P q 2 1 ;q 1 )] T ; (5.16) andtheelementsof M i areviewedasthecoofthepolynomials f 0 ( x ) ; ;f q 1 ( x ) in % when M i isencoded. Let i = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 100 0 111 1 1 ˚˚ 2 ˚ i 1 . . . . . . . . . . . . . . . 1 ˚ q 2 2 ( ˚ q 2 2 ) 2 ( ˚ q 2 2 ) i 1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 (5.17) 79 beaVandermondematrix,where ˚ istheprimitiveelementin GF ( q 2 )mentionedinsec- tion2.4and0 i q 1. = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 0 0 0 0 1 0 . . . . . . . . . . . . 00 q 2 1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 (5.18) tobeadiagonalmatrixcomprisedof q 2 elements,where i ; 0 i q 2 1,ischosenusing thefollowingtwocriteria:(i) i 6 = j ; 8 i 6 = j; 0 i;j q 2 1.(ii)Any d i =2 i rowsof thematrix i ; i ],0 i q 1,arelinearlyindependent. Wealso i = i I (5.19) tobea q q diagonalmatrixfor0 i q 2 1,whereIisthe q q identicalmatrix.And = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 0 0 0 0 1 0 . . . . . . . . . . . . 00 q 2 1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 (5.20) isa q 3 q 3 diagonalmatrixformedby q 2 diagonalsubmatrices 0 ; ; q 2 1 . Fordistributedstorage,weencodeeachpairofmatrices( S;T )usingAlgorithm5.1.We willnamethisencodingschemeas Hermitian-MSRcodeencoding ,or H-MSRcodeencoding . ForH-MSRcodingencoding,wehavethefollowingtheorem. 80 Algorithm5.1 EncodingH-MSRCode Step1: Encodethedatamatrices S;T inequation(5.13)usingaHermitiancode H m over GF ( q 2 )withparameters ( j )(0 j q 1)and m ( m q 2 1).Denote thegenerated q 3 A codewordmatricesas H m ( S )and H m ( T ). Step2: Computethe q 3 A codewordmatrix Y = H m ( S )+ H m ( T ). Step3: Divide Y into q 2 submatrices Y 0 ; ;Y q 2 1 ofsize q A andstoreeachsubmatrix inastoragenodeasshowninFigure.5.2. Figure5.2Illustrationofstoringthecodewordmatricesindistributedstoragenodes Theorem5.1. TheH-MSRcodeencodingdescribedinAlgorithm5.1canachievetheMSR pointindistributedstorage. Proof. Westudythestructureofthecodewordmatrix H m ( S ).Sinceeverycolumn ofthematrixisanindependentHermitiancodeword,wecandecodethecolumn h = [ h 0 ; 0 ; ;h 0 ;q 1 ; ;h q 2 1 ; 0 ; ;h q 2 1 ;q 1 ] T asanexamplewithoutlossofgenerality.We arrangethe q 3 rationalpointsoftheHermitiancurvefollowingtheorderinTable2.1.Inthe table,wecanthatforeach i;i =0 ; 1 ; ;q 2 1,therationalpoints P i; 0 ;P i; 1 ; ;P i;q 1 allhavethesamecoordinate. Suppose % 2 L ( mQ ): % ( P i;l )= f 0 ( P i;l )+ y ( P i;l ) f 1 ( P i;l )+ +( y ( P i;l )) q 1 f q 1 ( P i;l ) ; 0 i q 2 1,0 l q 1 ; deg f j ( x )= j 1for0 j q 1.Since P i; 0 ;P i; 1 ; ;P i;q 1 allhavethesamecoordinateand f j ( P i;l )isonlyappliedtothecoordinateof P i;l , wehave f j ( P i;l )= f j ( ˚ s i ) ;s 0 = ;s i = i 1 ; for i 1 ;˚ =0,whichdoesnotdepend 81 on l .Therefore,wecanderive q 2 setsofequationsfor0 i q 2 1: 8 > > > > > > > > > > > > > > < > > > > > > > > > > > > > > : f 0 ( ˚ s i )+ y ( P i; 0 ) f 1 ( ˚ s i )+ +( y ( P i; 0 )) q 1 f q 1 ( ˚ s i )= h i; 0 f 0 ( ˚ s i )+ y ( P i; 1 ) f 1 ( ˚ s i )+ +( y ( P i; 1 )) q 1 f q 1 ( ˚ s i )= h i; 1 ................................................................... f 0 ( ˚ s i )+ y ( P i;q 1 ) f 1 ( ˚ s i )+ +( y ( P i;q 1 )) q 1 f q 1 ( ˚ s i )= h i;q 1 : (5.21) IfwestorethecodewordmatrixinstoragenodesaccordingtoFigure.5.2,eachsetofthe equationscorrespondstoastoragenode.Aswementionedabove,thesetofequationsin equation(5.21)canbederivedinstoragenode i . Sincethecotmatrix B i isaVandermondematrix: B i = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 1 y ( P i; 0 ) y ( P i; 0 ) q 1 1 y ( P i; 1 ) y ( P i; 1 ) q 1 . . . . . . . . . . . . 1 y ( P i;q 1 ) y ( P i;q 1 ) q 1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 : (5.22) wecansolve u i =[ f 0 ( ˚ s i ) ;f 1 ( ˚ s i ) ; ;f q 1 ( ˚ s i )] T from u i = B 1 i h i ; (5.23) where h i =[ h i; 0 ;h i; 1 ; ;h i;q 1 ] T . Fromallthe q 2 storagenodes,wecangetvectors F i =[ f i (0) ;f i (1) ; ;f i ( ˚ q 2 2 )] T ; i =0 ; ;q 1,whichcanbeviewedasextendedReed-Solomoncodes. Nowconsiderallthecolumnsof H m ( S ),wecangetthefollowingequation: i S i;j = F i;j ; (5.24) 82 where F i;j =[ F (1) i ; ; F ( i ) i ],0 i q 1,1 j i ,and F ( l ) i correspondstothe l th columnofthesubmatrix S i;j . Nextwewillconsiderthestructureofthecodewordmatrix H m ( T ).Becausetheencoding processfor H m ( T )isthesameasthatof H m ( S ),for H m ( T ),wecanderive i T i;j = E i;j ; (5.25) where E i =[ e i (0) ;e i (1) ; ;e i ( ˚ q 2 2 )] T , E i;j =[ E (1) i ; ; E ( i ) i ],0 i q 1,1 j i ,and E ( l ) i correspondstothe l th columnofthesubmatrix T i;j . Thirdly,wewillstudytheoptimalityofthecodeinthesenseoftheMSRpoint.For i S i;j + i T i;j ; 0 i q 1 ; 1 j i ,since S i;j ;T i;j aresymmetricandsatisfy therequirementsforMSRpointaccordingto[57]withparameters d =2 i ;k = i +1 ; = i ; =1 ;B = i ( i +1).Byencoding S;T using H m ( S )+ H m ( T )anddistributing Y 0 ; ;Y q 2 1 into q 2 storagenodes,eachrowofthematrix i S i;j + i T i;j ; 0 i q 1 ; 1 j i ,canbederivedinacorrespondingstoragenode.Because i S i;j + i T i;j achievestheMSRpoint,datarelatedtomatrices S i;j ;T i;j ; 0 i q 1 ; 1 j i ,can beregeneratedattheMSRpoint.Therefore,Algorithm5.1canachievetheMSRpoint. 5.3.2RegenerationoftheH-MSRCodeintheError-freeNetwork Inthissection,wewilldiscusstheregenerationfortheH-MSRcodeintheerror-freenetwork. Let v i =[ e 0 ( ˚ ( s i )) ;e 1 ( ˚ ( s i )) ; ;e q 1 ( ˚ ( s i ))] T ,then u i + i v i = B 1 i y i =[ f 0 ( ˚ s i )+ i e 0 ( ˚ s i ) ; ;f q 1 ( ˚ s i )+ i e q 1 ( ˚ s i )] T ; (5.26) foreverycolumn y i of Y i . Themainideaoftheregenerationalgorithmsistoregenerate f l ( ˚ s i )+ i e l ( ˚ s i ),0 l q 1,bydownloadinghelpsymbolsfrom d l =2 l nodes,where d l representstheregeneration parameter d for f l ( ˚ s i )+ i e l ( ˚ s i )intheH-MSRcoderegeneration. 83 Supposenode z fails,wedeviseAlgorithm5.2inthenetworktoregeneratetheexact H-MSRcodesymbolsofnode z inareplacementnode z 0 .Forconvenience,wesuppose d q =2 q =0and V i;j;l = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 i;l i +1 ;l . . . j;l 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 ; (5.27) where t;l ;i t j ,isthe t th rowof l ; l ].Eachnode i ,0 i q 2 1,onlystoresits ownencodingvector i;l ,0 l q 1. First,replacementnode z 0 willsendrequeststohelpernodesforregeneration: z 0 sends theinteger j to d j d j +1 helpernodes,towhich z 0 hasnotsentrequestsbefore,forevery j from q 1to0indescendingorder. Uponreceivingtherequestinteger j ,helpernode i willcalculateandsendthehelp symbolsasfollows:node i willcalculate e Y i = B 1 i Y i toremovethecotmatrix B i fromthecodewordmatrix.Sincethe( l +1) th rowof e Y i correspondstothesymbols relatedto f l ( ˚ s i )+ i e l ( ˚ s i ),for0 l j ,node i willdividethe( l +1) th rowof e Y i into l rowvectorsofthesize1 l :[ ~y i;l; 1 ; ~y i;l; 2 ; ; ~y i;l l ].Thenforevery0 l j and1 t l ,node i willcalculatethehelpsymbol~ p i;l;t = ~y i;l;t T z;l ,where z;l isthe z th rowoftheencodingmatrix l inequation(6.2).Atlast,node i willsendoutall thecalculatedsymbols~ p i;l;t .Here j indicatesthat z 0 isrequestingsymbols~ p i;l;t ,0 l j and1 t l ,calculatedby[ f 0 ( ˚ s i )+ i e 0 ( ˚ s i ) ; ;f j ( ˚ s i )+ i e j ( ˚ s i )] T Since d l 1 >d l 2 for l 1 > > > < > > > > : 1 ;r = s +1 0 ;r 6 = s +1 ; 1 r d l ; 0 s d l 1 : (5.29) V 0 ;d l 1 ;l ^ x 2 = V 0 ;d l 1 ;l V 1 1 ;d l ;l [ e 1 ;e 2 ; ;e d l ] T = V 0 ;d l 1 ;l [ 0 ; 1 ; ; d l 1 ] [ e 1 ;e 2 ; ;e d l ] T (5.30) =[ x 2 ; 0 ;e 1 ; ;e d l 1 ] T : Tocalculate x 2 ; 0 ,wederivetheexpressionof 0 ;l .Because 1 ;l ; 2 ;l ; ; d l ;l are linearlyindependent,theycanbeviewedasasetofbasesofthe d l dimensionallinearspace. 87 Sowehave 0 ;l = r = d l X r =1 r r;l ; (5.31) where r 6 =0 ;r =1 ; ;d l ,becauseany d l vectorsoutof 0 ;l ; 1 ;l ; ; d l ;l arelinearly independent.Then x 2 ; 0 = 0 @ r = d l X r =1 r r;l 1 A [ 0 ; 1 ; ; d l 1 ][ e 1 ;e 2 ; ;e d l ] T = r = d l X r =1 r e r : (5.32) If e 0 = r = d l X r =1 r e r ; (5.33) then V 0 ;d l 1 ;l ^ x 1 = V 0 ;d l 1 ;l ^ x 2 and ^ x 1 = ^ x 2 . Whenonlyoneelementof e 0 ;e 1 ; ;e d l isnonzero,since 1 ; ; d l areallnonzero, equation(5.33)willneverhold.Inthiscase,theprobabilityis0.Whentherearemorethan onenonzeroelements,itmeanstherearemorethanonemaliciousnodes.Ifthenumberof maliciousnodesislessthan d l +1,theywillnotbeabletocolludetosolvethecots r in(5.31).Theprobabilitythatequation(5.33)holdswillbe1 =q 2 . Theorem5.2. Whenthenumberofmaliciousnodesinthe d l +1 helpernodesofAlgo- rithm5.3islessthan d l +1 ,theprobabilityforthebogussymbolssentfromthemalicious nodestobedetectedisatleast 1 1 =q 2 . Proof. Since V 0 ;d l 1 ;l and V 1 ;d l ;l arefullrankmatrices, x 1 canbecalculatedby(Forcon- venience,use e i torepresent e i;l;t ): x 1 = V 1 0 ;d l 1 ;l " ~ p 0 ;l;t + e 0 ; ; ~ p d l 1 ;l;t + e d l 1 # T = x + V 1 0 ;d l 1 ;l [ e 0 ;e 1 ; ;e d l 1 ] T = x + ^ x 1 : (5.34) 88 x 2 canbecalculatedthesameway: x 2 = x + V 1 1 ;d l ;l [ e 1 ;e 2 ; ;e d l ] T = x + ^ x 2 : (5.35) If ^ x 1 = ^ x 2 ,Algorithm5.3willfailtodetecttheerrors.Sowewillfocusontherelationship between ^ x 1 and ^ x 2 .AccordingtoLemma1,whenthenumberofmaliciousnodesinthe d l +1helpernodesislessthan d l +1,theprobabilitythat ^ x 1 = ^ x 2 isatmost1 =q 2 .Sothe probabilitythat x 1 6 = x 2 ,equivalentlythedetectionprobability,isatleast1 1 =q 2 . 5.3.3.2RecoveryMode Oncethereplacementnode z 0 detectserrorsusingAlgorithm5.3,itwillsendinteger j = q 1 toalltheother q 2 1nodesinthenetworkrequestinghelpsymbols.Helpernode i willsend helpsymbolssimilartoSection5.3.2. z 0 canregeneratesymbolsusingAlgorithm5.4. 5.3.4ReconstructionoftheH-MSRCodeintheError-freeNetwork HerewewilldiscussthereconstructionoftheH-MSRcodeintheerror-freenetwork.The mainideaofthereconstructionalgorithmsistoreconstruct f l ( ˚ s i )+ i e l ( ˚ s i ),0 l q 1, bydownloadinghelpsymbolsfrom k l = l +1nodes,where k l isusedtorepresentthe reconstructionparameter k for f l ( ˚ s i )+ i e l ( ˚ s i )intheH-MSRcodereconstruction.We deviseAlgorithm5.5inthenetworkforthedatacollectorDCtoreconstructtheoriginal Forconvenience,wesuppose q =0. First,DCwillsendrequeststothestoragenodesforreconstruction:DCsendsinteger j to k j k j +1 helpernodes,towhichDChasnotsentrequestsbefore,forevery j from q 1 to0indescendingorder. Uponreceivingtherequestinteger j ,node i willcalculateandsendsymbolsasfollows: node i willcalculate ~ Y i = B 1 i Y i toremovethecotmatrix B i fromthecodeword matrix.Sincethe( l +1) th rowof ~ Y i correspondstothesymbolsrelatedto f l ( ˚ s i )+ i e l ( ˚ s i ), for0 l j ,node i willsendoutthe( l +1) th rowof ~ Y i : ~y i;l .Here j indicatesthatDC 89 Algorithm5.4 [RecoveryMode] z 0 RegeneratesSymbolsoftheFailedNode z inHostile Network Step1: Forevery q 1 l 0indescendingorderand1 t l inascendingorder, wecanregeneratethesymbolswhentheerrorsinthereceivedhelpsymbols~ p 0 i;l;t from q 2 1helpernodescanbecorrected.Withoutlossofgenerality,weassume 0 i q 2 2. Step1.1: Let p 0 =[~ p 0 0 ;l;t ; ~ p 0 1 ;l;t ; ; ~ p 0 q 2 2 ;l;t ] T .Since V 0 ;q 2 2 ;l x = p 0 , p 0 canbe viewedasanMDScodewithparameters( q 2 1 ;d l ;q 2 d l ). Step1.2: Substitute~ p 0 i;l;t in p 0 withthesymbol representinganerasureifnode i hasbeendetectedtobecorruptedinthepreviousloops(previousvaluesof l;t ). Step1.3: Ifthenumberoferasuresin p 0 islargerthanmin f q 2 d l 1 ; b ( q 2 d q 1 1) = 2 cg ,thenthenumberoferrorshaveexceededtheerrorcorrectioncapability.So herewewillthedecodingfailureandexitthealgorithm. Step1.4: Sincethenumberoferrorsiswithintheerrorcorrectioncapabilityof theMDScode,decode p 0 to p 0 cw andsolve x . Step1.5: Ifthe i th positionsymbolsof p 0 cw and p 0 aredit,marknode i as corrupted. Step1.6: Compute ~y z;l;t = z;l S l;t + z z;l T l;t asdescribedinAlgorithm5.2. Step2: Let e Y z bea q A matrixwiththe l th rowas[ ~y z;l; 1 ; ; ~y z;l l ] ; 0 l q 1. Step3: Calculatetheregeneratedsymbolsofthefailednode z : Y z 0 = Y z = B z e Y z . isrequestingsymbolsof ~y i;l ,0 l j ,calculatedby[ f 0 ( ˚ s i )+ i e 0 ( ˚ s i ) ; ;f j ( ˚ s i )+ i e j ( ˚ s i )] T . Since k l 1 >k l 2 for l 1 > > > < > > > > : C 1 ;i;j + i D 1 ;i;j = ^ R 1 ;i;j + ^ W 1 ;i;j C 1 ;i;j + j D 1 ;i;j = ^ R 1 ;j;i + ^ W 1 ;j;i ; (5.38) where C 1 ;i;j ;D 1 ;i;j ; ^ R 1 ;i;j ; ^ W 1 ;i;j aretheelementsinthe i th row, j th columnof C 1 ;D 1 ; ^ R 1 ; ^ W 1 respectively.Solveequation(5.38)forallthe i;j ( i 6 = j; 0 i l ; 0 j l 1),wecan getthecorresponding C 1 ;i;j ;D 1 ;i;j .Becausethestructureof C 1 and D 1 arethesame,wewill onlyfocuson C 1 (correspondingto S 1 )intheproof.Thecalculationfor D 1 (corresponding to T 1 )isthesame. DC 1 S 1 T DC 1 = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 0 1 . . . l 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 S 1 [ T 0 ; T 1 ; ; T l ]= C 1 : (5.39) Sotheelementsofthe i th rowof C 1 (excepttheelementinthediagonalposition)canbe writtenas: i S 1 [ T 0 ; ; T i 1 ; T i +1 ; T l ]=[ C 1 ;i; 0 ; ;C 1 ;i;i 1 ;C 1 ;i;i +1 ; ;C 1 l ] : (5.40) 94 Let= 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 0 1 . . . l 1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 ,thenisan l l fullrankmatrix,andwecanderive S 1 from S 1 = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 [ C 1 ; 0 ; 1 ;C 1 ; 0 ; 2 ; ;C 1 ; 0 l ][ T 1 ; T 2 ; ; T l ] 1 [ C 1 ; 1 ; 0 ;C 1 ; 1 ; 2 ; ;C 1 ; 1 l ][ T 0 ; T 2 ; ; T l ] 1 [ C 1 l 1 ; 0 ;C 1 l 1 ; 1 ; ;C 1 l 1 l ][ T 0 ; T 1 ; ; T l ] 1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 : (5.41) For R 2 0 = R 2 + W 2 inAlgorithm5.6,wecanget S 2 thesameway.If S 1 = S 2 , Algorithm5.6willfailtodetecttheerrors.Thiswillhappenifalltherowsof S 1 and S 2 arethesame.Sowewillfocusonthe i th rowof S 1 and S 2 . Step2.Calculatethefailureprobabilities Dependingonthevaluesof i ,wediscuss twocases: (a)Ifnoneofthe i (0 i l )equalsto0,wecansolve C 1 ;i;j inequation(5.38): C 1 ;i;j = j ^ R 1 ;i;j i ^ R 1 ;j;i i j + e i T j i e j T i j = N 1 ;i;j + Q 1 ;i;j : (5.42) Inequation(5.42), N 1 ;i;j representstheoriginalsolutionwithouterrors,while Q 1 ;i;j repre- 95 sentstheimpactoftheerrors.Sothe i th rowof S 1 canbewrittenas: [ C 1 ;i; 0 ; ;C 1 ;i;i 1 ;C 1 ;i;i +1 ; ;C 1 l ] 1 1 ;i =[ N 1 ;i; 0 ; ;N 1 ;i;i 1 ;N 1 ;i;i +1 ; ;N 1 l ] 1 1 ;i +[ Q 1 ;i; 0 ; ;Q 1 ;i;i 1 ;Q 1 ;i;i +1 ; ;Q 1 l ] 1 1 ;i (5.43) = ˘ i + 1 ;i ; where 1 ;i =[ T 0 ; ; T i 1 ; T i +1 ; ; T l ]. ˘ i correspondstothepartindependentofthe errors. 1 ;i istheerrorpartandcanbefurtherexpandedas: 1 ;i = " e i T 0 i ; ; e i T i 1 i ; e i T i +1 i ; ; e i T l i # 1 1 ;i " e 0 T i 0 ; ; e i 1 T i i 1 ; e i +1 T i i +1 ; ; e l T i l # 1 1 ;i : (5.44) Thepartofequation(5.44)canbereducedasfollows: " e i T 0 i ; ; e i T i 1 i ; e i T i +1 i ; ; e i T l i # 1 1 ;i = e i i h T 0 ; ; T i 1 ; T i +1 ; ; T l i 1 1 ;i (5.45) = e i i : Sowehave: 1 ;i = e i i " e 0 T i 0 ; ; e i 1 T i i 1 ; e i +1 T i i +1 ; ; e l T i l # 1 1 ;i = e i i ˆ 1 ;i : (5.46) For R 2 0 = R 2 + W 2 inAlgorithm5.6where W 2 = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 e 0 . . . e l 1 e l +1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 ,wecanderive C 2 ;i;j ,then 96 S 2 thesameway.The i th rowof S 2 canbewrittenas: ˘ i + 2 ;i = ˘ i + e i i ˆ 2 ;i ; (5.47) where ˆ 2 ;i = " e 0 T i 0 ; ; e i 1 T i i 1 ; e i +1 T i i +1 ; ; e l 1 T i l 1 ; e l +1 T i l +1 # 1 2 ;i , 2 ;i =[ T 0 ; ; T i 1 ; T i +1 , ; T l 1 ; T l +1 ]. Because 1 ;i isafullrankmatrix, ˆ 1 ;i = ˆ 2 ;i isequivalentto ˆ 1 ;i 1 ;i = ˆ 2 ;i 1 ;i . SimilartotheproofofLemma1,suppose 1 2 ;i = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 0 . . . l 1 l +1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 ,wehave s T r = 8 > > > > < > > > > : 1 r = s 0 r 6 = s .So ˆ 1 ;i 1 ;i = " ; e i 1 T i i 1 ; e i +1 T i i +1 ; ; e l 1 T i l 1 ; e l T i l # ; (5.48) ˆ 2 ;i 1 ;i = " ; e i 1 T i i 1 ; e i +1 T i i +1 ; ; e l 1 T i l 1 ;x 2 l # : (5.49) Because T 0 ; ; T i 1 ; T i +1 ; ; T l 1 ; T l +1 arelinearlyindependent,theycanbeviewed asasetofbasesofthe l dimensionallinearspace.Sowehave T l = r = l +1 X r =0 ;r 6 = l r T r : (5.50) Thus x 2 l = " ; e i 1 T i i 1 ; e i +1 T i i +1 ; ; e l 1 T i l 1 ; e l +1 T i l +1 # 1 2 ;i 0 @ r = l +1 X r =0 ;r 6 = l r T r 1 A = 0 @ r = l +1 X r =0 ;r 6 = l r e r T i r 1 A : (5.51) If e l T i l = r = l +1 X r =0 ;r 6 = l r e r T i r (0 i l 1) ; (5.52) 97 ˆ 1 ;i and ˆ 2 ;i willbeequal,soare S 1 and S 2 .Therefore,Algorithm5.6willfail. Fortheerror e i (0 i l +1),thefollowingequationholds: e i [ T 0 ; T 1 ; ; T l 1 ]=[^ e i; 0 ; ^ e i; 1 ; ; ^ e l 1 ]= ^ e i : (5.53) Because[ T 0 ; T 1 ; ; T l 1 ]isafullrankmatrix,thereisaone-to-onemappingbetween e i and ^ e i .Equation(5.52)canbewrittenas: ^ e l ;i l = r = l +1 X r =0 ;r 6 = l r ^ e r;i r (0 i l 1) : (5.54) Whenthenumberofmaliciousnodesinthe k l +1nodesislessthan k l +1,themalicious nodescancolludetosatisfyequation(5.54)foratmostoneparticular i .Sotheprobability thatequation(5.54)holdsis1 =q 2 foratleast l 1outof l i 0 s between0and l 1. Ifweconsiderequation(5.54)forallthe i 0 s simultaneously,theprobabilitywillbeatmost (1 =q 2 ) l 1 .Aswehavementionedabove,theprobabilityfor T 1 = T 2 willbeatmost (1 =q 2 ) l 1 .Inthiscase,thedetectionprobabilityisatleast1 (1 =q 2 ) 2( l 1) . (b)Ifoneofthe i (0 i l )equalsto0,wecanassume 0 =0withoutlossof generality.When i =0,thesolutionforequation(5.38)is: C 1 ; 0 ;j = ^ R 1 ; 0 ;j + e 0 T j = N 1 ; 0 ;j + Q 1 ; 0 ;j : (5.55) Similartoequations(5.43),(5.44)and(5.45),wehave 1 ; 0 = e 0 .For R 2 0 = R 2 + W 2 ,itis easytoseethat 2 ; 0 = e 0 .Sotherowsof S 1 and S 2 arethesamenomatterwhat theerrorvector e 0 is. When i> 0 ;j =0,thesolutionforequation(5.38)is: C 1 ;i; 0 = ^ R 1 ;i; 0 + 0 T 0 + e 0 T i = N 1 ;i; 0 + Q 1 ;i; 0 ; (5.56) where 0 isazerorowvector.When i> 0 ;j> 0,thesolutionhasthesameexpressionas equation(5.42).Inthiscase,forthe i th ( i> 0)rowof S 1 ,equation(5.44)canbewritten 98 as: 1 ;i = " 0 ; ; e i T i 1 i ; e i T i +1 i ; ; e i T l i # 1 1 ;i " e 0 T i ; ; e i 1 T i i 1 ; e i +1 T i i +1 ; ; e l T i l # 1 1 ;i : (5.57) Thepartofequation(5.57)canbedividedintotwoparts: " e i T 0 i ; ; e i T i 1 i ; e i T i +1 i ; ; e i T l i # 1 1 ;i " e i T 0 i ; 0 ; ; 0 # 1 1 ;i = e i i e i i [ T 0 ; 0 ; ; 0 ] 1 1 ;i : (5.58) Soequation(5.57)canbefurtherwrittenas: 1 ;i = e i i " e i T 0 i e 0 T i ; ; e i 1 T i i 1 ; e i +1 T i i +1 ; ; e l T i l # 1 1 ;i = e i i ˆ 1 ;i : (5.59) Byemployingthesamederivationincase(a),for1 i l 1, ˆ 1 ;i and ˆ 2 ;i willbe equalif e l T i l = r = l +1 X r =1 ;r 6 = l r e r T i r 0 e 0 T i + 0 e i T 0 i ; (5.60) ^ e l ;i l = r = l +1 X r =1 ;r 6 = l r ^ e r;i r 0 ^ e 0 ;i + 0 ^ e i; 0 i : (5.61) Whenthenumberofmaliciousnodesinthe k l +1nodesislessthan k l +1,forthesamereason asincase(a),theprobabilitythatequation(5.61)holdsis1 =q 2 foratleast l 2outof l 1 i 0 s between1and l 1.Ifweconsiderequation(5.61)forallthe i 0 s simultaneously,the probabilitywillbeatmost(1 =q 2 ) l 2 .Heretheprobabilityfor T 1 = T 2 willbe(1 =q 2 ) l 2 . Inthiscase,thedetectionprobabilityis1 (1 =q 2 ) 2( l 2) . Combiningbothcases,thedetectionprobabilityisatleast1 (1 =q 2 ) 2( l 2) . 99 5.3.5.2RecoveryMode OnceDCdetectserrorsusingAlgorithm5.6,itwillsendinteger j = q 1toallthe q 2 nodesinthenetworkrequestingsymbols.Storagenodeswillstillusethewaysimilartothat oftheerror-freenetworkinSection5.3.4tosendsymbols.Thereconstructproceduresare describedinAlgorithm5.7. Algorithm5.7 [RecoveryMode]DCReconstructstheOriginalFileinHostileNetwork Step1: Forevery0 l q 1,wedividethesymbolvector ~y 0 i;l into l equalrowvectors: [ ~y 0 i;l; 1 ; ~y 0 i;l; 2 ; ; ~y 0 i;l l ].(Withoutlossofgenerality,weassume0 i q 2 1.) Step2: Forevery q 1 l 0indescendingorderand1 t l inascendingorder, DCcanreconstructthematricesrelatedtotheoriginalwhentheerrorsinthe receivedsymbolvectors ~y 0 i;l;t from q 2 storagenodescanbecorrected: Step2.1: Let R 0 =[ ~y 0 T 0 ;l;t ; ~y 0 T 1 ;l;t ; ; ~y 0 T q 2 1 ;l;t ] T . Step2.2: Ifthenumberofcorruptednodesdetectedislargerthanmin f q 2 k l ; b ( q 2 k q 1 ) = 2 cg ,thenthenumberoferrorshaveexceededtheerrorcorrection capability.Wewillthedecodingfailureandexitthealgorithm. Step2.3: Sincethenumberoferrorsiswithintheerrorcorrectioncapabilityof theH-MSRcode,substitute ~y 0 i;l;t in R 0 withthesymbol representinganerasure vectorifnode i hasbeendetectedtobecorruptedinthepreviousloops(previous valuesof l;t ). Step2.4: Solve S l;t ;T l;t usingthemethoddescribedinsection5.3.6.Ifsymbols fromnode i aredetectedtobeerroneousduringthecalculation,marknode i as corrupted. Step3: DCreconstructstheoriginalfromallthematrices S l;t ;T l;t ,0 l q 1and 1 t l . 5.3.6RecoverMatrices S l;t ;T l;t from q ^2 StorageNodes Whentherearebogussymbols~ p 0 i;l;t sentbythecorruptednodesforcertain l;t ,wecan recoverthematrices S l;t ;T l;t asfollows: 100 For R 0 inAlgorithm5.7,wehave DC 2 6 6 6 6 4 S 0 T 0 3 7 7 7 7 5 = R 0 ,and DC S 0 T DC + DC DC T 0 T DC = R 0 T DC ; (5.62) where DC = DC ; DC DC ], DC = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 0 1 . . . q 2 1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 and i represents i;l whichisthe i th rowoftheencodingmatrix l intheproofofTheorem5.1. Let C = DC S 0 T DC , D = DC T 0 T DC ,and ^ R 0 = R 0 T DC ,then C + DC D = ^ R 0 : (5.63) Since C;D arebothsymmetric,wecansolvethenon-diagonalelementsofthemasfollows: 8 > > > > < > > > > : C i;j + i D i;j = ^ R 0 i;j C i;j + j D i;j = ^ R 0 j;i : (5.64) Becausematrices C and D havethesamestructure,hereweonlyfocuson C (corresponding to S 0 ).Itisstraightforwardtoseethatifnode i ismaliciousandthereareerrorsinthe i th rowof R 0 ,therewillbeerrorsinthe i th rowof ^ R 0 .Furthermore,therewillbeerrorsinthe i th rowand i th columnof C . S 0 T DC = ^ S 0 ,wehave DC ^ S 0 = C: (5.65) Herewecanvieweachcolumnof C asa( q 2 1 ; l ;q 2 l )MDScodebecause DC isa Vandermondematrix.Thelengthofthecodeis q 2 1sincethediagonalelementsof C is 101 unknown.Supposenode j isuncorrupted.Ifthenumberoferasures ˙ (correspondingtothe previouslydetectedcorruptednodes)andthenumberofthecorruptednodes ˝ thathave notbeendetectedsatisfy: ˙ +2 ˝ +1 q 2 l ; (5.66) thenthe j th columnof C canberecoveredandtheerrorlocations(correspondingtothe corruptednodes)canbepinpointed.Thenon-diagonalelementsof C canberecovered.So DCcanreconstruct S l;t usingthemethodsimilarto[57].For T l;t ,therecoveringprocessis similar. 5.4HermitianCodeBasedMBRRegeneratingCode(H-MBR Code) 5.4.1EncodingH-MBRCode Inthissection,wewillanalyzetheH-MBRcodebasedontheMBRpointwith =1. Accordingtoequation(2.8),wehave d = . Let 0 ; ; q 1 beastrictlydecreasingintegersequencesatisfying0 < i ( i ) ; 0 i q 1.Theleastcommonmultipleof 0 ; ; q 1 is A .Let k 0 ; ;k q 1 beainteger sequencesatisfying0 ˝ RS MSR when q 3 . Proof. For ˝ RS MSR ,wehave ˝ RS MSR = 6 6 6 4 0 @ q 3 q q 1 X l =0 d l 1 A = 2 7 7 7 5 (5.81) j ( q 3 q q d q 1 q 2 ( q 1)) = 2 k = q ( q 2 d q 1 1) = 2 q ( q 1) 4 q ( q 2 d q 1 1) = 2 q ( q 1) 4 : For ˝ H MSR ,wehave ˝ H MSR = q b ( q 2 d q 1 1) = 2 c : (5.82) When q =3,itiseasytoverifythat ˝ H MSR >˝ RS MSR . When q> 3,Wecanrewriteequation(5.82)as ˝ H MSR q ( q 2 d q 1 1) = 2 q= 2 : (5.83) Thegapbetween ˝ H MSR and ˝ RS MSR isatleast q ( q 1) 4 q 2 = q 2 3 q 4 > 0 ;q> 3 ; (5.84) sowehave ˝ H MSR >˝ RS MSR . Example1. Suppose q =4 and m =37 ,theHermitiancurveisdby y 4 + y = x 5 over GF (4 2 ) .Fromthepreviousdiscussion,wehave (0)=10 ; (1)=9 ; (2)=7 ; (3)=6 . Choose 0 =6 ; 1 =5 ; 2 =4 ; 3 =3 .So d 0 =12 ;d 1 =10 ;d 2 =8 ;d 3 =6 .Accordingto theanalysisabove,wehave ˝ H MSR =4 ˝ 3 =4 b (15 6) = 2 c =16 ,whichislargerthan ˝ RS MSR = b (60 36) = 2 c =12 . 116 Figure5.4ComparisonoferrorcorrectioncapabilitybetweentheH-MSRcodeandthe RS-MSRcode Wealsoshowthemaximumnumberofmaliciousnodesfromwhichtheerrorscanbe correctedbytheH-MSRcodeinFigure.5.4.Herewesettheparameter q oftheHermitian codefrom4to16withastepof2.InthetheperformanceoftheRS-MSRcode withthesamecoderatesastheH-MSRcodeisalsoplotted.Thecomparisonresultfur- therdemonstratesthatfordataregenerationtheH-MSRcodehasbettererrorcorrection capabilitythantheRS-MSRcode. FordatareconstructionAlgorithm5.7,H-MSRcodecanbeviewedas q MDScodeswith parameters( q 2 1 ;k l 1 ;q 2 k l +1).Thedecodingforthereconstructionisperformedfrom thecodewiththelargestminimumdistancetothecodewiththesmallestminimumdistance asinthedataregenerationcase.Similarly,wecanconcludethatfordatareconstructionthe H-MSRcodehasbettererrorcorrectioncapabilitythantheRS-MSRcodeunderthesame coderate. 5.5.3ComplexityDiscussion ForthecomplexityoftheH-MSRcode,weconsidertwoscenarios. 117 5.5.3.1H-MSRregeneration FortheH-MSRregeneration,comparedwithRS-MSRcode,theH-MSRcodewillslightly increasethecomplexityofthehelpernodes.Foreachhelpernode,theextraoperationis amatrixmultiplicationbetween B 1 i and Y i .Thecomplexityis O ( q 2 )= O (( n 1 = 3 ) 2 )= O ( n 2 = 3 ).Similarto[87],forareplacementnode,fromAlgorithm5.2andAlgorithm5.3, wecanderivethatthecomplexitytoregeneratesymbolsforRS-MSRis O ( n 2 ),whilethe complexityforH-MSRisonly O ( n 5 = 3 ).Likewise,forAlgorithm5.4,thecomplexityto recovertheH-MSRcodeis O ( n 5 = 3 ),and O ( n 2 )forRS-MSRcode. 5.5.3.2H-MSRreconstruction Forthereconstruction,comparedwithRS-MSRcode,theadditionalcomplexityoftheH- MSRcodeforeachstoragenodeis O ( q 2 ),whichis O ( n 2 = 3 ).Thecomputationalcomplexity forDCtoreconstructthedatais O ( n 5 = 3 )fortheH-MSRcodeand O ( n 2 )fortheRS-MSR code. 118 CHAPTER6 DISTRIBUTEDSTORAGEINHOSTILENETWORKS|OPTIMAL CONSTRUCTIONOFREGENERATINGCODESTHROUGH RATE-MATCHINGAPPROACH InspiredbytheniceperformanceofHermitiancodebasedregeneratingcodes,wewillstep forwardinthischaptertofurtherconstructoptimalregeneratingcodeswhichhavesimilar layeredstructurelikeHermitiancodeindistributedstorage.ComparedtotheHermitian basedcode,thesecodeshavesimplerstructureandareeasiertounderstandandimplement. WewillproposetwooptimalconstructionsofMSRcodesthroughrate-matchinginhostile networks:2-layerrate-matchedMSRcodeand m -layerrate-matchedMSRcode.Forthe 2-layercode,wecanachievetheoptimalstorageforgivensystemrequirements. Ourcomprehensiveanalysisshowsthatourcodecandetectandcorrectmaliciousnodes withhigherstoragecomparedtotheRS-MSRcode.Thenwewillproposethe m - layercodebyextendingthe2-layercodeandachievetheoptimalerrorcorrectionciency bymatchingthecoderateofeachlayer'sMSRcode.Wewillalsodemonstratethatthe optimizedparametercanachievethemaximumstoragecapacityunderthesameconstraint. ComparedtotheRS-MSRcode,ourcodecanachievemuchhighererrorcorrection. Theoptimized m -layercodealsohasbettererrorcorrectioncapabilitythantheH-MSRcode. 6.1System/AdversarialModelsandAssumptions Thesystem/adversarialmodelsandassumptionsinthischapterarethesamewithChapter5. Weusethenotation CH torefertoeitherthefullrateMSRcodeoracodewordofthefull rateMSRcode.Theexactmeaningcanbediscriminatedclearlyaccordingtothecontext. 119 6.2ComponentCodesofRate-matchedMSRCode Inthissection,wewillintroducetwotcomponentcodesforrate-matchedMSRcode ontheMSRpointwith d =2 k 2.ThecodebasedontheMSRpointwith d> 2 k 2can bederivedthesamewaythroughtruncatingoperations.Intherate-matchedMSRcode, therearetwotypesofMSRcodeswithtcoderates:fullratecodeandfractionalrate code. 6.2.1FullRateCode 6.2.1.1Encoding Thefullratecodeisencodedbasedontheproduct-matrixcodeframeworkin[57].According toequation(2.7),wehave H = d= 2, H =1foroneblockofdatawiththesize B H = ( +1) .Thedatawillbearrangedintotwo symmetricmatrices S 1 ;S 2 ,eachofwhich willcontain B H = 2data.Thecodeword CH isas CH = 2 6 6 6 6 4 S 1 S 2 3 7 7 7 7 5 = S H ; (6.1) where = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 111 ::: 1 1 ˚˚ 2 :::˚ 1 . . . . . . . . . . . . . . . 1 ˚ n 1 ( ˚ n 1 ) 2 ::: ( ˚ n 1 ) 1 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 (6.2) isaVandermondematrixand=diag[ 1 ; 2 ; ; ]suchthat i 6 = j for1 i;j ;i 6 = j ,where i 2 GF q for1 i , ˚ isaprimitiveelementin GF q ,andany d rowsof 120 arelinearindependent.Theneachrow ch i ,0 i