MATRIX COMPLETION WITH SIDE INFORMATION FOR EFFECTIVE RECOMMENDATION

By

Iman Barjasteh

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Electrical Engineering - Doctor of Philosophy

2016

ABSTRACT

The massive number of online choices, e-commerce items, and related data available on the web makes it profoundly challenging for users to extract insightful information that can help them decide what to select among a vast multitude of choices. In recent years, recommender systems have played an important role in reducing the overwhelming impact of such information overload. In particular, a specific form of recommender system, known as collaborative filtering, is the most popular approach to building these systems, and it has been successfully employed in many applications. More importantly, the matrix completion paradigm provides an appealing solution to the collaborative filtering problem in recommendation systems. However, collaborative filtering based approaches perform poorly for sparse data, and specifically for the so-called cold-start users.

Recently, there has been an upsurge of interest in utilizing other rich sources of side information about items/users to compensate for the lack of rating information. Such information is of particular importance to aggregate when a single view of the data is sparse or even incomplete. Due to the advent and popularity of online social networks and e-commerce websites, many different types of side information are available that can be taken into account, in addition to traditional rating matrices, in order to improve the recommendation.

The overarching goal of this thesis is to propose a novel and general algorithmic framework based on matrix factorization that simultaneously exploits the similarity information among users and items to alleviate the data sparsity issue, and specifically the cold-start problem. We extend matrix factorization and propose a model that takes into account the side information as well as the rating matrix. By modeling different types of side information, such as social or trust/distrust relationships between users and meta-data about items, as a constraint similarity/dissimilarity graph, we propose an effective recommendation framework that is able to boost the recommendation accuracy and overcome the challenges in existing recommendation systems, such as the cold-start user/item and data sparsity problems. The proposed modeling framework is capable of performing both rating and link prediction.

Based on the proposed framework, a key objective of this thesis is to develop novel algorithms and derive theoretical guarantees for their performance. The algorithms we developed have been experimentally evaluated and compared against existing state-of-the-art methods on real-life datasets (such as MovieLens, NIPS, and Epinions). Our experimental results show that our proposed modeling framework and related algorithms achieve substantial quality gains when compared with existing methods. Our experimental results also illustrate how our framework and algorithms can overcome the shortcomings of other state-of-the-art recommendation techniques.

Copyright by
IMAN BARJASTEH
2016

To my parents, Leila & Karim

ACKNOWLEDGMENTS

First and foremost, I must offer my most heartfelt thank you to my advisor, Dr. Radha, for his guidance, encouragement, and inspiring supervision throughout the course of this research work. His motivation to reach higher taught me that I should not place any limits on my dreams. He taught me how I should earn my PhD rather than getting it, and his approach made me do just that. It's hard to express how thankful I am for his unwavering support over the last years.

I would like to take this opportunity to thank my thesis committee members, Dr. Esfahanian, Dr. Deb, and Dr. Biswas, who accommodated my timing constraints despite their busy schedules and provided me with precious feedback on the presentation of the results, in both written and oral form.

This work would not be possible without the wonderful guidance of two friends, Dr. Rana Forsati and Dr. Mehrdad Mahdavi, who helped me a lot throughout my PhD. I greatly enjoyed working with them, and massive thanks to them for sharing their interest and expertise.

To my friends: you all have stood by me both personally and professionally throughout my journey through graduate school. I would not have gotten through it without you all, nor would my time at Michigan State have been as much fun. Thank you all.

Last but not least, I want to express my deepest gratitude to my beloved parents, Leila and Karim, and my dearest siblings.
Thank you for encouraging intellectual curiosity throughout my entire life. That, along with your love and support, was instrumental in this very challenging endeavour.

PREFACE

"We are leaving the era of search and entering one of discovery!"
- CNN Money, the race to create a smart Google

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter 1  Introduction
  1.1 Objectives and Proposed Framework
    1.1.1 Collaborative Filtering Framework
    1.1.2 Collaborative Ranking Method I
    1.1.3 Collaborative Ranking Method II
  1.2 Organization of the Thesis

Chapter 2  Preliminaries and Related Works
  2.1 Preliminaries
  2.2 Rich Side Information of Users and Items
    2.2.1 Social Networks
    2.2.2 User-Contributed Information
  2.3 Related Works
    2.3.1 Matrix Completion and Cold-start Recommendation
    2.3.2 Network Completion and Cold-start Recommendation

Chapter 3  Decoupled Completion and Transduction Algorithm
  3.1 Matrix Completion
    3.1.1 The Setting
    3.1.2 Subspace sharing and error propagation
    3.1.3 A decoupled solution
  3.2 Network Completion
    3.2.1 The Setting
    3.2.2 The Proposed Algorithm
  3.3 Analysis of Recovery Error
    3.3.1 Proof of Theorem 3.3.1

Chapter 4  Semi-supervised Collaborative Ranking Algorithm
  4.1 The Setting
  4.2 Transductive Collaborative Ranking
    4.2.1 A basic formulation
    4.2.2 Semi-supervised collaborative ranking
  4.3 The Optimization
    4.3.1 Gradient descent with shrinkage operator
    4.3.2 Efficient optimization by dropping convexity

Chapter 5  PushTrust: An Efficient Recommendation Algorithm by Leveraging Trust and Distrust Relations
  5.1 The Setting
    5.1.1 Matrix factorization for recommendation
    5.1.2 Matrix factorization with social regularization
  5.2 The PushTrust Algorithm
    5.2.1 Collaborative social ranking
    5.2.2 A convex formulation for social ranking
  5.3 Optimization procedure

Chapter 6  Cold-start Recommendation: Rating Prediction
  6.1 Introduction
  6.2 Experiments
    6.2.1 Datasets
    6.2.2 Metrics
    6.2.3 Experiments on synthetic dataset
    6.2.4 Experiments setup
    6.2.5 The baseline algorithms
    6.2.6 Existing Users / Existing Items
    6.2.7 Existing Users / Cold-start Items
    6.2.8 Cold-start Users / Existing Items
    6.2.9 Cold-start Users / Cold-start Items

Chapter 7  Cold-start Recommendation: Ranking Prediction I
  7.1 Introduction
  7.2 Experiments
    7.2.1 Datasets
    7.2.2 Metrics
    7.2.3 Methodology
    7.2.4 Baseline Algorithms
    7.2.5 S2COR vs. rating methods
    7.2.6 Robustness to not-missing-at-random ratings
    7.2.7 Dealing with cold-start items

Chapter 8  Cold-start Recommendation: Ranking Prediction II
  8.1 Introduction
  8.2 Experiments
    8.2.1 The Epinions dataset
    8.2.2 Evaluation metrics
    8.2.3 Baseline algorithms
    8.2.4 Comparison to baseline algorithms
    8.2.5 Experiments on handling cold-start users

Chapter 9  Network Completion
  9.1 Introduction
  9.2 Experimental Evaluation
    9.2.1 Datasets
    9.2.2 Evaluation Metrics
    9.2.3 Baseline Algorithms
    9.2.4 Experiments on synthetic dataset
    9.2.5 Performance evaluation on link prediction
      9.2.5.1 Link prediction on Epinions
      9.2.5.2 Link prediction on Weibo
    9.2.6 Performance evaluation on Network Completion

Chapter 10  Conclusions

BIBLIOGRAPHY

LIST OF TABLES

Table 6.1: Statistics of real datasets used in our experiments
Table 6.2: Synthetic dataset
Table 6.3: Results of scenario I on MovieLens 100K and 1M and Epinions
Table 6.4: Results of cold-start scenarios on real datasets
Table 7.1: Statistics of real datasets used in our experiments
Table 7.2: Results of comparing ranking methods versus rating methods on ML-IMDB, reporting the regularization parameter, the dimension of latent features h, the number of iterations T, the learning rate, and the standard deviation
Table 7.3: Results of employing missing ratings versus ignoring them on ML-IMDB. The regularization parameter is 0.6, h = 10 is the dimension of latent features, and T = 100 is the number of iterations
Table 7.4: Results on cold-start items, reporting the regularization parameters, the dimension of latent features h, the number of similarity functions l, and the number of iterations T
Table 8.1: Statistics of rating data and social network in the sample Epinions dataset used in our experiments
Table 8.2: Comparison with other methods in terms of MAE and RMSE errors
Table 8.3: The accuracy of handling cold-start users and the effect of social relations. The number of latent features in these experiments is k = 5
Table 9.1: Statistics of Facebook and Google+ datasets
Table 9.2: Link prediction results on the Epinions dataset and the effect of training size
Table 9.3: Link prediction results on the Weibo dataset and the effect of training size
Table 9.4: Comparison of different algorithms on Google+ with different sizes of observed nodes

LIST OF FIGURES

Figure 5.1: The ultimate goal is to extract latent features from the rating matrix R in a way that respects the social context of users in the social network S. In particular, from each user's perspective, say user u, the goal is to find latent features for u's neighbors such that, when ranked based on their similarity to the latent features of the user u, i.e., u in R^k, the trusted friends N+(u) are pushed to the top portion of the list, the distrusted friends N-(u) are pushed to the bottom of the list, and the neutral friends appear in the middle of the ranked list
Figure 6.1: RMSE on the synthetic dataset for different values of p and q
Figure 6.2: RMSE and MAE on the synthetic dataset for different values of noise variance
Figure 9.1: The recovery error of the proposed MC-DT algorithm, i.e., ||A_c - A||_F, for different noise variances
Figure 9.2: The recovery error of different algorithms on a synthetic dataset for different sizes of the partially observed submatrix with m nodes
Figure 9.3: The recovery error of four algorithms on the Facebook dataset measured in RMSE under different percentages of observed nodes
Figure 9.4: The recovery error of four algorithms on the Facebook dataset measured in MAE under different percentages of observed nodes

Chapter 1
Introduction

Due to the popularity and exponential growth of e-commerce websites (e.g., Amazon, eBay) and online streaming websites (e.g., Hulu), a compelling demand has been created for efficient recommender systems to guide users toward items of their interest (e.g., products, books, movies). Recommender systems seek to predict the rating that a user would give to an item, and further try to suggest items that are most appealing to the users.
Recently, recommender systems have received a considerable amount of attention and have been the main focus of many research studies [2]. Content-based (CB) filtering and collaborative filtering (CF) are well-known examples of recommendation approaches. As demonstrated by its performance in the KDD Cup [29] and the Netflix competition [13], the most successful recommendation technique used is collaborative filtering. This technique exploits the users' opinions (e.g., movie ratings) and/or purchasing (e.g., watching, reading) history in order to extract a set of interesting items for each user. In factorization based CF methods, both users and items are mapped into a latent feature space based on observed ratings, and the latent features are later used to make predictions.

In stark contrast to CF, in collaborative ranking (CR) models [58, 22, 119, 25], where the goal is to rank the unrated items in the order of relevance to the user, popular ranking measures such as discounted cumulative gain (DCG), normalized discounted cumulative gain (NDCG), and average precision (AP) [54] are often employed to collaboratively learn a ranking model for the latent features. Recent studies have demonstrated that CR models lead to significantly higher ranking accuracy than their traditional CF counterparts that optimize rating prediction. This is important considering the fact that what we really care about in recommendation is not the actual values of ratings, but the order of items to be recommended to a specific user. Therefore, error measures such as RMSE are often hopelessly flawed, as they place equal emphasis on all the ratings. Among ranking models, the methods that mainly concentrate on the top of the list have received a considerable amount of attention, due to the higher probability that users examine the top portion of the list of recommendations. Therefore, the introduction of ranking metrics such as the push norm [99, 5, 22, 59] sparked a widespread interest in CR models, and such metrics have proven to be more effective in practice [112, 22].
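The ranking measures mentioned above are simple to compute from a ranked list of graded relevance scores. A minimal sketch of DCG and NDCG follows (function names are illustrative; note that some formulations use a gain of 2^rel - 1 instead of the linear gain used here):

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the top-k ranked items:
    each relevance is discounted by the log of its rank position."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering, in [0, 1]."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# A ranking that places highly relevant items first scores closer to 1.
print(ndcg_at_k([3, 2, 3, 0, 1, 2], k=6))
```

Because the discount shrinks with rank, mistakes at the top of the list are penalized far more than mistakes at the bottom, which is exactly the emphasis that rating-oriented measures like RMSE lack.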
On the other hand, the hardness of gathering the complete structure of networked data such as social networks, biological networks, and publication networks is one of the challenging problems that most network analysis applications are currently facing. However, there are currently very few techniques for working with incomplete network data. Therefore, it is tempting to observe a partial sample of a large network and, based on that sample, infer what the rest of the network looks like, a problem known as network completion or survey sampling. Specifically, the sample is chosen by randomly selecting a number of vertices, and for each we are allowed to observe all, or a uniformly sampled subset, of the edges it is incident with. The problem of recovering the topology of an undirected network by observing only a partially observed random subsample of it with enough edges has also received a considerable amount of attention and has been the main focus of many research studies [55]. This general problem arises in a number of different settings in information retrieval, social network analysis, and computational biology [7, 86]. For instance, in online social networks such as Facebook, Myspace, and Twitter, given the large adoption of these networks, inferring the full network from a small sample of links between users, or the implicit social networking structure between them, is a challenging problem [86].

Despite significant improvements in recommendation systems and network completion approaches, and in particular factorization based methods, these systems suffer from a few inherent limitations and weaknesses, such as data sparsity and cold-start problems. Specifically, in many real-world applications, the rating data or the observed networks are very sparse (i.e., most users do not rate most items, typically resulting in a very sparse rating matrix), or for a subset of users or items the rating data is entirely missing (known as the cold-start user and cold-start item problem, respectively [103]). To address the difficulties associated with the latter issues, there has been an active line of research during the past decade, and a variety of techniques have been proposed [87, 100, 109, 121, 64, 114, 122, 77, 92, 120, 62]. The studies in the literature have approached the cold-start problem from many different angles, but they commonly exploit the auxiliary information about the users and items, in addition to the rating data, that is usually available (see, e.g., [106] for a more recent survey). By leveraging multiple sources of information, one can potentially bridge the gap between existing items (or users) and new (cold-start) items (or users) to mitigate the cold-start problem. The main motivation behind these techniques stems from the observation that other sources of data can be used to reveal more information about the underlying patterns between users and items and thus complement the rating data.
For instance, knowing the underlying social connections (friends, family, etc.) between users can give us a better understanding of the sources of influence on a user's decision. Subsequently, the availability of auxiliary information such as users' profile information [2], the social context (trust and distrust relations) of users [35], information embedded in review text [64], and features of items [37] provides tangible benefits to the recommender. Hence, an intriguing research question, which is the main focus of this thesis, is: How can side information about users and items be effectively incorporated in factorization methods to overcome the cold-start problem?

1.1 Objectives and Proposed Framework

In this section, two major categories of algorithms will be highlighted.

1.1.1 Collaborative Filtering Framework

Although there are many different collaborative filtering algorithms in the literature that augment matrix factorization with side information, such as shared subspace learning [102] and kernelized matrix factorization [122], the dominant paradigm in existing methods is to perform (a) completion of the partially observed matrix (e.g., the rating matrix) and (b) transduction of knowledge from observed entries (e.g., existing ratings) to cold-start entries (e.g., items/users) simultaneously. While these methods are able to generate good results in practice, they have notable drawbacks:

1. these methods propagate the completion and transduction errors repetitively and in an uncontrolled way;
2. many state-of-the-art algorithms are application specific, e.g., see [114, 63], and do not offer a general framework for alleviating cold-start problems.

The overarching goal of this thesis is to answer the above question by proposing an efficient matrix factorization approach that uses similarity information. In fact, not only do we propose a general framework for dealing with cold-start problems, but we also study a somewhat more general problem where both cold-start users and cold-start items are present, and we address these two challenges simultaneously. In particular, by considering the drawbacks of the existing methods, we propose a two-stage algorithm that decouples the completion and transduction stages. First, we exclude the cold-start items and users and complete the rating matrix. Next, we transduce the knowledge to cold-start users/items using the recovered sub-matrix in addition to the available side information about the users and items. Hence, there is no error propagation between completion and transduction. Interestingly, beyond just dealing with the cold-start problem, the proposed algorithm also provides a new way to exploit the side information about users (or items) to mitigate the data sparsity and compensate for the lack of rating data.
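The two-stage decoupling described above can be sketched end to end. In this illustration, the completion stage is stood in for by simple iterative SVD imputation and the transduction stage by a similarity-weighted average over recovered warm rows; both are hypothetical stand-ins for exposition, not the RECT algorithm developed in Chapter 3:

```python
import numpy as np

def complete_submatrix(R_obs, mask, rank=2, iters=100):
    """Stage 1: fill the fully-warm rating sub-matrix with a low-rank
    approximation via iterative SVD imputation (a simple stand-in for
    any matrix-completion routine)."""
    X = np.where(mask, R_obs, 0.0)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        X = np.where(mask, R_obs, X_low)  # keep observed entries fixed
    return X

def transduce_cold_users(R_hat, S_cold_warm):
    """Stage 2: predict rows for cold-start users as similarity-weighted
    combinations of the recovered warm-user rows."""
    W = S_cold_warm / np.maximum(S_cold_warm.sum(axis=1, keepdims=True), 1e-12)
    return W @ R_hat

# Usage: a rank-1 "warm" block, then one cold user tied to warm users 0 and 1.
R = np.outer([1., 2., 3.], [1., 0., 2.])
mask = np.ones_like(R, dtype=bool)        # here: fully observed
R_hat = complete_submatrix(R, mask, rank=1)
S_cold_warm = np.array([[1., 1., 0.]])    # side-information similarities
print(transduce_cold_users(R_hat, S_cold_warm))
```

The key structural point survives even in this toy version: the side-information similarities only ever touch the already-recovered matrix, so errors in transduction cannot feed back into completion.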
1.1.2 Collaborative Ranking Method I

Although CR models for recommender systems have been studied extensively and some progress has been made, the state of affairs remains unsettled: handling cold-start items in ranking models and coping with the not-missing-at-random nature of ratings are elusive open issues. First, in many real-world applications, the rating data are very sparse (e.g., the density of the data is around 1% for many publicly available datasets), or for a subset of users or items the rating data is entirely missing (known as the cold-start user and cold-start item problem, respectively) [103]. Second, collaborative filtering and ranking models rely on the critical assumption that the missing ratings are sampled uniformly at random. However, in many real applications of recommender systems, this assumption is not believed to hold, as invariably some users are more active than others, and some items are rated by many people while others are rarely rated [111]. These issues have been investigated in factorization based methods; nonetheless, it is not straightforward to adapt those solutions to CR models, and they are left open [22]. Motivated by these challenges, we ask the following fundamental question in the context of collaborative ranking models: Is it possible to effectively learn a collaborative ranking model, in the presence of cold-start items/users, that is robust to the sampling of observed ratings?

In this thesis, we give an affirmative answer to the above question. In particular, we introduce a semi-supervised collaborative ranking model, dubbed S2COR, that leverages side information about both observed and missing ratings while collaboratively learning the ranking model. In the learned model, unrated items are conservatively pushed after the relevant and before the irrelevant items in the ranked list of items for each individual user. This crucial feature greatly boosts the performance and limits the bias caused by learning only from sparse, non-random observed ratings. We also introduce a graph regularization method to exploit the side information about users to overcome the cold-start user problem. In summary, the key features of S2COR are:

- Inspired by recent developments in ranking at the top [99, 5, ?], the proposed model is a collaborative ranking model that primarily focuses on the top of the recommendation list for each user. Moreover, in stark contrast to pairwise ranking models, which have a quadratic dependency on the number of items, the proposed ranking model has a linear dependency on the number of items, making it suitable for large-scale recommendation.

- It leverages side information about items with both observed and missing ratings while collaboratively learning the ranking model, which enables it to effectively incorporate the available side information in knowledge transduction.

- By incorporating the unrated items in ranking, it limits the bias caused by learning solely from the observed ratings and consequently deals with the not-missing-at-random issue of ratings.

- It is also able to leverage the similarity information between users, via a graph regularization method, to make high-quality recommendations for users with few ratings or for cold-start users without any rating information.

1.1.3 Collaborative Ranking Method II

Recently, online networks where two opposite kinds of relationships can occur have become common. For instance, Epinions [44], an e-commerce site for reviewing and rating products, allows users to evaluate others based on the quality of their reviews and to form trust and distrust relations with them. Similar patterns can be found in online communities such as Slashdot, in which millions of users post news and comments daily and are capable of tagging other users as friends/foes or fans/freaks. Additionally, users on Wikipedia can vote for or against the nomination of others to adminship [16]. The more users issue distrust statements, the more important it becomes to exploit this new information source in recommender systems.

It is acknowledged that, along with trust relationships, distrust can also play an important role in boosting the accuracy of recommendations [117, 116, 35]. Recently, some attempts have been made to explicitly incorporate distrust relations in the recommendation process [44, 70, 116]. This demonstrated that recommender systems can benefit from the proper incorporation of distrust relations in social networks. However, despite these positive results, there are some unique challenges involved in distrust-enhanced recommender systems. In particular, it has proven challenging to model distrust propagation in a manner which is both logically consistent and psychologically plausible. Furthermore, the naive modeling of distrust as negative trust raises a number of challenges, both algorithmic and philosophical. Finally, it is an open challenge how best to incorporate trust and distrust relations in model-based methods, e.g., matrix factorization, simultaneously.

In memory based recommender systems, the simultaneous exploitation of trust and distrust has been investigated in [118, 115]. An attempt to simultaneously exploit trust and distrust relations in a factorization based method has been made very recently in [35]. In particular, a ranking model was proposed to rank the latent features of users, from the perspective of individual users, in a way that respects their social context. Despite the encouraging improvements brought by the simultaneous exploitation of trust and distrust relations, the proposed algorithm suffers from two issues. First, as the number of constraint triplets imposed by the social regularization of latent features can increase cubically in the number of users in the social network, it is computationally difficult to make these ideas scalable to large social graphs. Second, the algorithm proposed in [35] only considers the trusted and distrusted friends of each user in order to regularize the latent features. Thus, it ignores the neutral friends, the users who have no relation to the user. As a result, the neutral friends might appear before the trusted friends in the ranked list. Neutral friends might therefore become more influential than trusted friends and impact the recommendation negatively.

In this thesis we describe and analyze a fairly general and efficient method to learn and infer from rating data, along with trust and distrust relationships between users. The main building block of our proposed algorithm, dubbed PushTrust, is an efficient algorithm that ranks the latent features of users by leveraging their social context.
In ranking the latent features, we wish to make sure that the top portion of the list includes the trusted friends and that the distrusted friends are pushed to the bottom of the list. Also, we would like the neutral friends to appear after the trusted and before the distrusted friends in the list. Compared to the method proposed in [35], the key features of the PushTrust algorithm are: (i) its quadratic time complexity in the number of users, and (ii) its ability to take the neutral friends of users into consideration in regularizing the latent features of users. Thorough experiments on the Epinions dataset demonstrate the merits of the proposed algorithm in boosting the quality of recommendations and dealing with data sparsity and cold-start user problems.

1.2 Organization of the Thesis

The remainder of the thesis is organized as follows. Chapter 2 lays out the foundation for the rest of the thesis. In particular, we provide a survey of some of the background material and state-of-the-art algorithms. In Part ?? we present the theory of our proposed algorithms and their analysis. In Chapter 3 we propose our decoupled matrix factorization based algorithm (RECT) and its analysis. In Chapter 4 we present our semi-supervised collaborative ranking algorithm (S2COR) and its analysis. In Chapter 5, we propose a new collaborative ranking algorithm that utilizes the trust/distrust relations between users. In Part ?? we present three different major real-life applications of our proposed algorithms. Chapter 6 focuses on the application of RECT to cold-start recommendation. Chapter 7 is also a cold-start application, of the algorithm S2COR, which provides a ranking of items for users. Chapter 8 is a cold-start application of the algorithm PushTrust. Chapter 9 is another application of RECT, to the network completion problem. Part ?? presents the conclusions (Chapter 10) of this thesis and the future works (Chapter ??).

Chapter 2
Preliminaries and Related Works

2.1 Preliminaries

In this chapter we formally define the recommendation problem and specifically discuss matrix completion based problems and different classes of existing methods. Then we discuss the approaches for evaluating recommender systems. It is worth mentioning that we explain the problem as a recommendation problem, but the proposed framework is a general framework for the matrix completion problem.

In recommendation systems we assume that there is a set of n users U = {u_1, ..., u_n} and a set of m items I = {i_1, ..., i_m}, where each user u_i expresses opinions about a subset of items. In this thesis, we assume opinions are expressed through an explicit numeric rating (e.g., on a scale from one to five), but other rating methods, such as hyperlink clicks, are possible as well. We are mainly interested in recommending a set of items to an active user such that the user has not rated these items before. The rating information is summarized in an n x m matrix R in R^{n x m}, whose rows correspond to the users, whose columns correspond to the items, and whose (p, q)-th entry is the rating given by user u_p to the item i_q. We note that the rating matrix is partially observed and is sparse in most cases. Two general tasks can be defined for a recommender: one is rating prediction and the other is top-N recommendation. In the following we formally define these two problems:

Rating Prediction: Given a user u in U and an item i in I for which r_{u,i} is unknown, compute the predicted rating of u on item i, r_{u,i}, using the rating matrix R and the social network S.

Top-N Prediction: Given a user u in U and the rating matrix R, recommend the top N items most related to user u that u has not rated yet.

The main focus of this thesis is on both rating prediction and top-N recommendation. Using a rating prediction system, one can infer a top-N recommender by simply ranking all items according to their predicted rating. This approach is obviously not efficient but effective. There are more sophisticated ways to perform top-N recommendation that we discuss later in this thesis.

An efficient and effective approach for recommender systems is to factorize the user-item rating matrix R by a product of rank-k matrices, R ~ UV^T, where U in R^{n x k} and V in R^{m x k}, and to utilize the factorized user-specific and item-specific matrices, respectively, to predict the missing data. There are two basic formulations to solve this problem: optimization approaches (see, e.g., [97, 65, 71, 58]) and probabilistic approaches [80]. Let Omega denote the set of observed ratings in the user-item matrix R in R^{n x m}, i.e., Omega = {(i, j) in [n] x [m] : R_ij has been observed}. In optimization based matrix factorization, the goal is to learn the latent matrices U and V by solving the following optimization problem:

  min_{U, V}  sum_{(i,j) in Omega} (R_ij - u_i^T v_j)^2 + lambda (||U||_F^2 + ||V||_F^2),

where u_i and v_j denote the i-th row of U and the j-th row of V, respectively, and lambda >= 0 is a regularization parameter.
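This objective is commonly minimized by stochastic gradient descent over the observed entries. A minimal sketch follows (the function name and hyperparameter values are illustrative, not taken from any of the cited methods):

```python
import numpy as np

def factorize(R, mask, k=2, lam=0.01, lr=0.02, epochs=1000, seed=0):
    """Learn U (n x k) and V (m x k) so that U @ V.T fits the observed
    entries of R, by SGD on the squared loss with Frobenius regularization."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.standard_normal((n, k))
    V = 0.1 * rng.standard_normal((m, k))
    observed = list(zip(*np.nonzero(mask)))
    for _ in range(epochs):
        for i, j in observed:
            err = R[i, j] - U[i] @ V[j]          # residual on one rating
            Ui = U[i].copy()                      # keep old row for V's update
            U[i] += lr * (err * V[j] - lam * Ui)  # gradient step on user factors
            V[j] += lr * (err * Ui - lam * V[j])  # gradient step on item factors
    return U, V
```

After training, an unobserved rating (i, j) is predicted as `U[i] @ V[j]`, which is exactly the rating-prediction task defined above.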
2.2 Rich Side Information of Users and Items

As mentioned in the Introduction, we look beyond the U-I matrix to include two types of additional information that are considered useful for improving the recommendations: rich side information about users and items, and information about the situation in which users interact (e.g., rate, click, or purchase) with items. The main focus of this thesis is on incorporating the available rich side information to better complete the rating matrix.

The range of sources of side information on users and items stretching beyond the U-I matrix is quite broad and varied. One of the most common side information sources is attribute information. User attributes may include information such as the user's gender, age, and hobbies. Item attributes reflect properties of the item, such as category or content. However, two sources of information that have recently increased in importance in recommender system research are social networks and user-contributed information. In the remainder of this subsection, we discuss these information sources in more detail.

2.2.1 Social Networks

The emergence of social networks has impacted a wide range of research disciplines in the past years, and recommender systems are no exception. Specific to the recommender system area, social networks introduce information in the form of user-user relationships, which may be particularly useful for improving the quality of recommendation. In general, the social relationships between users can be either directed or undirected. Social trust and distrust relationships are among the most studied directed social relationships. The trust/distrust relationship can usually be described as an asymmetric user-user graph/matrix, which indicates whether one user trusts/distrusts another. Another important directed social relationship is follow, exemplified by the follow relationship used by Twitter. The follow relationship is similar to the trust relationship in that it reflects the appreciation of one user (the follower) for another (the followee). In the case of Twitter, the follower receives the followee's microblog posts. The canonical example of an undirected social relationship is friendship, as used in Facebook. Friendship can be represented as a symmetric user-user graph/matrix, which encodes whether two users are friends of each other. It is also possible to extract more complex relationships, such as tie strength and similarity, between users in a social network by analyzing the link structure and the common patterns of user behavior. Recommender system algorithms that attempt to leverage social relationships, both directed and undirected, apply the assumption that users who stand in a positive relationship with each other may also share similar interests, as will be discussed later.

2.2.2 User-Contributed Information

User-contributed information has become widely available in most recommender systems, and its volume has grown steadily since the introduction of Web 2.0 technology. Strictly speaking, the user ratings contained in the U-I matrix can be considered one type of user-contributed information as well. Here, we introduce four types of user-contributed information that go beyond the U-I matrix: tags, geotags, multimedia content, and free-text reviews and comments, which are increasingly used in recommender systems [106]:

Tags: Tags are short textual labels that users assign to items. Tags differ from conventional category labels in that users can assign them freely; that is, they are not constrained by a preset list. Users tag items for different reasons. Some tags describe item properties, and others express how users feel about an item. Tags are recognized as an information source that can be highly beneficial for improving the performance of recommender systems. In addition to exploiting tags for recommending items, personalized tag recommendation has also become an active research topic, which, however, falls outside the scope of this survey.

Geotags: Since GPS positioning has become a standard functionality of mobile digital devices (e.g., cell phones or digital cameras), location information has become abundant in social media sites, such as photo and video sharing sites and microblogging sites. The location information of an item in those sites is usually in the form of geotags, that is, explicit latitude and longitude coordinates. As its name suggests, a geotag can be regarded as a special class of tag that is particularly used for geographical positions. In a photo sharing site, the geotags of a photo may indicate that the uploader of the photo has been to that location (or nearby). Similarly, the geotags of a user's tweets may be used to trace the location of the user. Due to the availability of large quantities of geotags, remarkable progress has been achieved both in the research on exploiting location information for improving recommendation and in facilitating location/travel recommendation.
hievedinboththeresearchonexploitinglocationinformationforimprovingrecommendationandonfacilitatinglocation/travelrecommendation.MultimediaContent:Socialmediasites,suchasFlickrandYouTube,havefa-cilitatedtheirusersforuploadingandsharingmultimediacontent(e.g.,imagesandvideos).Theuser-contributedmultimediacontentservesasanothertypeofsideinfor-mationfortheonlineusers.Forexample,thecategoriesofthephotosinauser'salbummaywhatkindsofitemsshelikestosee.Theparticulartypeofvideosthatauserusuallypostsmayindicateherinterestinaparticularitem.Suchinformationcanbeexploitedformoreelaboratelymodelingtheuserinterestsandthuscontributingtocontentrecommendation[81,11,6].ReviewsandComments:Lastbutnotleast,movingbeyondtagsandgeotags,freetextreviewsandcommentsthatarepublishedbyusersonlineareanotherimpor-tantsourceofcommunitycontributions.Theyarevaluablenotonlybecauseoftheir15semanticsbutalsobecauseofthesentimentdimension.Forthisreason,itisnotsur-prisingthatreviewsandcommentshavebeenexploitedasatypeofsideinformationforimprovingrecommendersystems.2.3RelatedWorksBeforedelvingintothedetaileddescriptionandanalysisoftheproposedframework,wewouldliketodrawconnectionsto,andputourworkincontextof,someofthemorerecentwork.Thetapproachescanberoughlydividedintothefollowingcategories.2.3.1MatrixCompletionandCold-startRecommendationNaivemethods.Thecategoryincludesnaivealgorithmsthattrytorecommenditemstousersbasedontheirpopularity[87],orbasedonarandomselection[66].Naivemethodstreatallcold-startusers/itemsinasamewayandassumethatallusers/itemscontributethesametorecommendations.ThishastheoftremendouslyreducingtheaccuracyduetoalackofanyWarm-startmethods.Forcold-startscenarios,sincethereisnohistoricaldataavailableforeitherusersoritems,warm-startmethodseitheraskuserstorateasetofitemsorimporttheirpreferencesfromanothersourceofauxiliaryinformationinordertoexpandtheuser[66,121,24].Inparticular,thesemethodsexplicitlyaskanewusertoratekrepresentativeitemsinordertoregulatethetasteofnewuserfordealingwithcold-staruserproblem
Similarly, a new item can be forced to be rated by k representative users in cold-start item scenarios. Liu et al. [66] proposed a representative-based matrix factorization model which is able to adjust itself to new users/items by learning their parameters. Sun et al. [113] proposed an algorithm for rapid profiling of new users, guided by a decision tree that focuses on increasing the recommendation accuracy while minimizing the user interaction. Zhou et al. [121] proposed a new-user preference elicitation strategy to learn the profiles of users and items. More recently, Contardo et al. [24] proposed a representation-learning formalism for cold-start user recommendation, based on an inductive method whose principle is to code ratings on items in the latent representation space. The drawback of these approaches is that cold-start users are forced to rate a number of representative items, which increases the user interactions.

Feature combination. As mentioned earlier, in recent years there has been an upsurge of interest in utilizing other rich sources of side information about items and users, along with the rating matrix, to increase the accuracy of recommendation and deal with cold-start challenges [106]. Therefore, feature combination approaches, as the third category of cold-start recommendations, became quite appealing. These methods employ and combine features related to users (e.g., profiles) or items (e.g., metadata) to increase the accuracy while minimizing the user interactions. Feature combination methods can be used as a single model regardless of the types of available information. Recent advances in matrix factorization methods suggest that subspace sharing, or matrix co-factorization, can effectively incorporate side information [68, 43, 48, 47]: several matrices (rating and side-information matrices) are simultaneously decomposed while sharing some factor matrices. The kernelized matrix factorization approach studied by Zhou et al. [122] incorporates the auxiliary information into the matrix factorization to assess the similarity of latent features using the available similarity matrices. Saveski et al.'s [102] matrix factorization method collectively decomposes the content and the collaborative matrices in a common low-dimensional space. We also note that several recent studies extend maximum margin matrix factorization, which was developed for collaborative filtering [110], to incorporate side information [85, 1]. These studies aim to overcome the data sparsity problem by reducing the number of sampled entries, and this is different from our work.

An alternative way for feature combination is to map the available auxiliary features to the latent features of the factorization model. Elbadrawy et al. [30] proposed an approach that learns a mapping function to transform the feature vectors of items into their latent space. Gantner et al. [37] also proposed a matrix factorization model that maps the features to the latent features of the model. Boltzmann machines [46] and aspect models [103] are other factor models that utilize side information for cold-start recommendation. Another family of feature combination methods includes those that rely on explicit features of items/users to compute the similarity between items/users by extracting different key features such as textual similarity or semantic similarity [64, 114].

Boltzmann machines are a type of latent factor model which models the joint distribution of a set of binary variables through their pairwise interactions [46]. Gunawardana et al. [46] built a recommender system that combines individual user information with pairwise similarity across users and learns weights for features, which is a measure of how well each user's actions can be predicted. The aspect model, another family of latent factor models, has also been employed to tackle cold-start problems. The aspect model is a latent class variable framework constructed for contingency table smoothing. Schein et al. [103] employed the aspect model to propose a probabilistic model combining content and collaborative data. The methods that rely on explicit features of items/users compute the similarity between items/users by extracting different key features such as textual similarity or semantic similarity. Ling et al. [64] introduced an efficient model combining the ratings with the textual features of the reviews using a topic-modeling approach. Trevisiol et al. [114] studied the news browsing behavior of users, built a BrowseGraph, and exploited its structural and temporal properties to propose a news recommender that deals with the cold-start problem.¹
Model combination. Finally, model combination methods combine the outputs of different recommenders by various strategies, such as voting [90], weighting the output scores of recommenders [23], switching between recommenders [14], or re-ranking the outputs [17]. The drawback of these approaches is that they require building different and separate recommender systems for each source of used information and combining their outputs [46]. Park et al. [87] cast cold-start recommendation as a regression problem and applied a combination of all information about users/items to incorporate the side information.

Transduction with matrix completion. In matrix completion the objective is to fill out the missing entries of a matrix based on the observed ones. Early work on matrix completion, also referred to as maximum margin matrix factorization [110], was developed for collaborative filtering. Theoretical studies show that it is possible to perfectly recover a low-rank matrix from a sufficient number of observed entries [122, 77, 92]. We also note that several recent studies extend maximum margin matrix factorization, which was developed for collaborative filtering [110], to incorporate side information [32, 85], and that matrix completion with side information was also exploited in [1].

Finally, we note that although various hybrid methods, such as factorization machines [94], content-boosted collaborative filtering [76], probabilistic models [91], pairwise kernel methods [12], and filterbot-based methods [88], have been developed to blend collaborative filtering with side information, they are specifically designed to address the data sparsity problem and fail to cope with the cold-start user or item problem, which is the main focus of this thesis. A method based on predictive discrete latent factor models was proposed in [4], which makes use of additional information in the form of pre-specified covariates.

¹Prediction and modeling of networks are very popular in different fields of study, such as biology [26] or civil engineering [27].

2.3.2 Network Completion and Cold-start Recommendation

Network completion and inference. There are a few learning algorithms for the network completion problem.
The common theme is to make structural assumptions about the underlying network and devise efficient methods to construct it. In [55], the network completion problem is addressed for an incomplete network that includes unobserved nodes and edges; it is assumed that the underlying network follows the Kronecker graph model, and the Expectation Maximization (EM) framework is used to infer the unobserved part of the network. A mathematical and computational method that can identify both missing and spurious interactions in complex networks, by using stochastic block models to capture the structural features of the network, is presented in [45]; this method was also applied to a protein interaction network of yeast. In [49], a sampling method to derive confidence intervals from sample networks is proposed. The problem of link prediction [61], in which the aim is to predict the future edges of a network, also relates to our work. Since we do not observe any links for unsampled nodes in network completion, we cannot utilize the local link structures or simple link prediction algorithms for those nodes. Therefore, these statistical models are not feasible or do not perform well for the network completion problem.

Transduction with matrix completion. In matrix completion the objective is to fill out the missing entries of a matrix based on the observed ones. Early work on matrix completion, also referred to as maximum margin matrix factorization [110], was developed for collaborative filtering. Theoretical studies show that it is possible to perfectly recover a low-rank matrix from a sufficient number of observed entries [122, 77, 92]. Several recent studies on matrix recovery with side information are based on graphical models that assume special distributions of the latent factors; these algorithms, as well as [32, 85], consider side information in matrix factorization. Finally, we note that matrix completion with side information was also exploited in [1].

Chapter 3

Decoupled Completion and Transduction Algorithm

3.1 Matrix Completion

Although the matrix completion paradigm provides an appealing solution to the collaborative filtering problem in recommendation systems, some major issues, such as data sparsity and cold-start problems, still remain open. In particular, when the rating data for a subset of users or items is entirely missing, commonly known as the cold-start problem,
the standard matrix completion methods are inapplicable due to the non-uniform sampling of available ratings. In recent years there has been considerable interest in dealing with cold-start users or items, principally based on the idea of exploiting other sources of information to compensate for this lack of rating data. In this chapter, we propose a novel and general algorithmic framework based on matrix factorization that simultaneously exploits the similarity information among users and items to alleviate the cold-start problem. In contrast to existing methods, our proposed algorithm decouples the following two aspects of the cold-start problem to effectively exploit the side information: (i) the completion of a rating sub-matrix, which is generated by excluding cold-start users/items from the original rating matrix; and (ii) the transduction of knowledge from existing ratings to cold-start items/users using side information. This crucial difference significantly boosts the performance when appropriate side information is incorporated. The recovery error of the proposed algorithm is analyzed theoretically and, to the best of our knowledge, this is the first algorithm that addresses the cold-start problem with provable guarantees on performance. We conduct thorough experiments on real datasets that complement our theoretical results. These experiments demonstrate the effectiveness of the proposed algorithm in handling the cold-start user/item problem and mitigating data sparsity issues.

3.1.1 The Setting

In this section we establish the notation used throughout the chapter. Our general convention throughout this chapter (and thesis) is to use lowercase letters such as u for scalars and boldface lowercase letters such as u for vectors. The set of non-negative real numbers is denoted by R_+. We use [n] to denote the set of integers {1, 2, ..., n}. We use boldface uppercase letters such as M to denote matrices. The transpose of a vector and a matrix are denoted by m^T and M^T, respectively. The Frobenius and spectral norms of a matrix M ∈ R^{n×m} are denoted by ‖M‖_F, i.e.,

$$\|M\|_F=\sqrt{\sum_{i=1}^{n}\sum_{j=1}^{m}|M_{ij}|^2},$$

and ‖M‖_2, respectively. The nuclear norm of a matrix is denoted by ‖M‖_* = trace(√(M^T M)). We use M^† to denote the Moore-Penrose pseudoinverse of the matrix M. The dot product between two vectors m and n is denoted by m^T n.
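These norms and the pseudoinverse are easy to check numerically. The following small numpy sketch (the matrix and variable names are ours, purely for illustration) computes each quantity for a rank-one matrix:

```python
import numpy as np

# A rank-1 matrix: its only nonzero singular value is 5, so the
# Frobenius, spectral, and nuclear norms all coincide for this example.
M = np.array([[3.0, 0.0],
              [4.0, 0.0]])

fro = np.sqrt((M ** 2).sum())                    # ||M||_F
spec = np.linalg.norm(M, 2)                      # ||M||_2, the largest singular value
nuc = np.linalg.svd(M, compute_uv=False).sum()   # ||M||_*, sum of singular values

# Moore-Penrose pseudoinverse M^dagger; M is singular, so np.linalg.inv would
# fail, but the pseudoinverse always exists and satisfies M M^dagger M = M.
M_dag = np.linalg.pinv(M)
```

For a general matrix the three norms differ; they agree here only because M has a single nonzero singular value.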
In collaborative filtering, we assume that there is a set of n users U = {u_1, ..., u_n} and a set of m items I = {i_1, ..., i_m}, where each user u_i expresses opinions about a subset of items. In this chapter, we assume opinions are expressed through an explicit numeric rating (e.g., a scale from one to five), but other rating methods, such as hyperlink clicks, are possible as well. We are mainly interested in recommending a set of items to an active user such that the user has not rated these items before. The rating information is summarized in an n × m matrix R ∈ R^{n×m}, where the rows correspond to the users, the columns correspond to the items, and the (p, q)-th entry is the rating given by user u_p to item i_q. We note that the rating matrix is partially observed and is sparse in most cases.

An efficient and effective approach for recommender systems is to factorize the user-item rating matrix R as a product of rank-k matrices, R ≈ UV^T, where U ∈ R^{n×k} and V ∈ R^{m×k}, and to utilize the factorized user-specific and item-specific matrices, respectively, to predict the missing entries. There are two basic formulations to solve this problem: optimization approaches (see, e.g., [97, 65, 71, 58]) and probabilistic approaches [80]. Let 𝓡 be the set of observed ratings in the user-item matrix R ∈ R^{n×m}, i.e., 𝓡 = {(i, j) ∈ [n] × [m] : R_{ij} has been observed}. In optimization-based matrix factorization, the goal is to learn the latent matrices U and V by solving the following optimization problem:

$$\mathcal{L}(U,V)=\min_{U,V}\;\frac{1}{2}\sum_{(i,j)\in\mathcal{R}}\left(R_{i,j}-u_i^\top v_j\right)^2+\lambda_U\|U\|_F^2+\lambda_V\|V\|_F^2 \qquad (3.1)$$

The optimization problem in Eq. (3.1) has three main terms. The first term aims to minimize the inconsistency between the observed entries and the corresponding values obtained from the factorized matrices. The last two terms regularize the latent matrices for users and items, respectively; the parameters λ_U and λ_V are regularization parameters introduced to control the regularization of the latent matrices U and V. After learning the latent features for users and items, the prediction for each missing entry is computed as the inner product of the latent vectors of the corresponding row and column.

For new users or items in the system, since there is no observed rating data in 𝓡, the corresponding rows and columns in the rating matrix are fully unobserved, which results in an inability to draw inferences to either recommend existing items to new users or new items to existing users.
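The objective in Eq. (3.1) can be minimized by alternating least squares: with V fixed, each row of U solves a small ridge-regression problem over that user's observed entries, and symmetrically for V. A minimal numpy sketch on synthetic data (the sizes, regularization value, and mask density are arbitrary choices of ours, not from the thesis):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, lam = 20, 15, 2, 0.1     # users, items, latent rank, lambda_U = lambda_V

# Synthetic rank-k "ground truth" ratings with ~60% of entries observed.
R = rng.standard_normal((n, k)) @ rng.standard_normal((k, m))
observed = rng.random((n, m)) < 0.6

U = 0.1 * rng.standard_normal((n, k))
V = 0.1 * rng.standard_normal((m, k))

for _ in range(30):
    # Update each user vector u_i: ridge regression on the items user i rated.
    for i in range(n):
        Vo = V[observed[i]]
        U[i] = np.linalg.solve(Vo.T @ Vo + lam * np.eye(k),
                               Vo.T @ R[i, observed[i]])
    # Update each item vector v_j: ridge regression on the users who rated j.
    for j in range(m):
        Uo = U[observed[:, j]]
        V[j] = np.linalg.solve(Uo.T @ Uo + lam * np.eye(k),
                               Uo.T @ R[observed[:, j], j])

R_hat = U @ V.T   # inner products u_i^T v_j predict every entry, missing ones included
```

Each alternating step is convex, which is why the simple row-by-row ridge updates decrease the objective monotonically even though Eq. (3.1) is jointly non-convex in U and V.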
In particular, it is not feasible for standard matrix factorization methods to fill in a row or a column that is entirely missing in the original rating matrix. In this chapter, we assume that there is a sub-matrix M ∈ R^{p×q}, 1 ≤ p ≤ n, 1 ≤ q ≤ m, which includes enough rating data to be fully recovered via standard methods such as matrix factorization or matrix completion. We call the rest of the items and users, for which the rating data is entirely missing and which are not present in M, cold-start. To make recommendations to cold-start users/items, we assume that besides the observed entries in the matrix M, there exist two auxiliary similarity matrices A ∈ R^{n×n} and B ∈ R^{m×m} that capture the pairwise similarity between users and between items, respectively. The similarity matrices can be computed from the available information, such as users' profiles or social context, or items' features. The main focus of this chapter is on exploiting available side information to improve the accuracy of recommendations and resolve the cold-start item and user problems.

Although the main focus of this chapter is dealing with cold-start problems, our general framework also applies to other problems involving matrix completion with side information, such as network completion [73], where a small sample of a network is observed and side information about the nodes is available (i.e., R ∈ {0, 1, ?}^{n×n} is the partially observed adjacency matrix, where '?' stands for an unobserved link, and A = B ∈ R_+^{n×n} is the pairwise similarity matrix between the nodes) and the goal is to infer the unobserved part of the network.

3.1.2 Subspace sharing and error propagation

Before delving into the algorithm, we discuss the matrix co-factorization method used to exploit the similarity information. This approach has proven to be very effective for handling cold-start problems [102] and motivates our own algorithm. A straightforward approach to exploit and transfer knowledge from the similarity matrices to the rating data is to cast the problem as a shared subspace learning framework (a.k.a. matrix co-factorization) based on a joint matrix factorization that jointly learns a common subset of basis vectors for the rating matrix R and the similarity matrices A and B for users and items, as formulated in the following optimization problem:

$$\min_{U\in\mathbb{R}^{n\times r},\,V\in\mathbb{R}^{m\times r},\,W\in\mathbb{R}^{n\times r},\,Z\in\mathbb{R}^{m\times r}}\;\frac{1}{2}\|R-UV^\top\|_F^2+\lambda\|U\|_F^2+\lambda\|V\|_F^2+\frac{1}{2}\|A-UW^\top\|_F^2+\frac{1}{2}\|B-ZV^\top\|_F^2+\lambda\|W\|_F^2+\lambda\|Z\|_F^2 \qquad (3.2)$$
with λ as the regularization parameter for the norms of the solution matrices; a common latent space representation is achieved by reusing the matrices U and V across the reconstruction terms. The main issue with this approach is that the completion of the unobserved entries in the rating matrix R and the transduction of knowledge from these entries to cold-start users/items via the similarity matrices are carried out simultaneously. Therefore, the completion and transduction errors are propagated repetitively and uncontrollably. The issue with error propagation becomes even worse due to the non-convexity of the optimization problem in Eq. (3.2), which is jointly non-convex in the parameters U and V.

In an effort to alleviate this difficulty, we propose an alternative approach that diverges from these algorithms and transfers information from the similarity matrices to the rating matrix via the fully recoverable sub-matrix M. In particular, the proposed algorithm decouples the completion from the transduction and consists of two stages: (i) completion of the sub-matrix M, which can be done perfectly, with zero completion error, with very high probability; and (ii) transduction of rating data from the recovered sub-matrix to cold-start items and users using the similarity matrices. This crucial difference greatly boosts the performance of the proposed algorithm when appropriate side information is incorporated.

3.1.3 A decoupled solution

With our assumption on the correlation of the rating and similarity matrices in place, we can now describe our algorithm. To this end, we construct an orthogonal matrix U_A = [u_1^A, ..., u_s^A] ∈ R^{n×s} whose column space subsumes the column space of the rating matrix. We also construct another orthogonal matrix U_B = [u_1^B, ..., u_s^B] ∈ R^{m×s} whose column space subsumes the row space of the rating matrix. To construct the subspaces U_A and U_B, we use the first s eigenvectors, corresponding to the s largest eigenvalues, of the provided similarity matrices A and B, respectively. We note that the extent to which the extracted subspaces U_A and U_B subsume the corresponding column and row spaces of the rating matrix depends on the richness of the similarity information. To formalize this, from the low-rank assumption on the rating matrix R, it follows that it can be decomposed as R = Σ_{i=1}^{r} u_i v_i^T, where r is the rank of the matrix. Now we note that the i-th latent feature vector u_i can be decomposed in a unique way into two parts, parallel and orthogonal: u_i = u_i^∥ + u_i^⊥, where u_i^∥ is the part that is spanned by the subspace U_A extracted from the similarity information and u_i^⊥ is the part orthogonal to U_A.
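Concretely, extracting U_A and splitting a latent vector into its parallel and orthogonal parts amounts to a dense eigendecomposition plus an orthogonal projection. A small numpy sketch (the similarity matrix here is an arbitrary PSD matrix we construct only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, s = 30, 4

# A symmetric positive semi-definite "similarity" matrix (illustrative only).
F = rng.standard_normal((n, 8))
A_sim = F @ F.T

# U_A: orthonormal eigenvectors for the s largest eigenvalues.
# np.linalg.eigh returns eigenvalues in ascending order, so take the last s.
eigvals, eigvecs = np.linalg.eigh(A_sim)
U_A = eigvecs[:, -s:]

# Decompose an arbitrary vector u into u = u_par + u_perp with respect to
# span(U_A); u_par is the orthogonal projection onto the subspace.
u = rng.standard_normal(n)
u_par = U_A @ (U_A.T @ u)
u_perp = u - u_par
```

Because U_A has orthonormal columns, the projector is simply U_A U_A^T; u_perp is exactly the component that the side information cannot explain.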
In a similar way, for the latent feature vector of the j-th item, i.e., v_j, we have the decomposition v_j = v_j^∥ + v_j^⊥, where v_j^∥ is the part that is spanned by the subspace U_B extracted from the similarity information about items and v_j^⊥ is the part orthogonal to U_B. We note that the orthogonal components of the left and right singular vectors, u_i^⊥ and v_i^⊥, capture the extent to which the similarity matrices do not provide information about the rating data and cannot be recovered using this auxiliary information.

To build intuition for the algorithm that we propose, we relate the rating matrix to the similarity matrices. Having decomposed the latent features as above into parallel and orthogonal components, we can rewrite the rating matrix R as:

$$R=\sum_{i=1}^{r}u_i v_i^\top=\sum_{i=1}^{r}(u_i^{\parallel}+u_i^{\perp})(v_i^{\parallel}+v_i^{\perp})^\top=\sum_{i=1}^{r}u_i^{\parallel}v_i^{\parallel\top}+\sum_{i=1}^{r}u_i^{\parallel}v_i^{\perp\top}+\sum_{i=1}^{r}u_i^{\perp}v_i^{\parallel\top}+\sum_{i=1}^{r}u_i^{\perp}v_i^{\perp\top}=R^{\parallel}+R^{L}+R^{R}+R^{E}, \qquad (3.3)$$

where R^∥ is the part of the rating matrix that is fully spanned by the subspaces U_A and U_B; the matrix R^L is the part where only the left singular vectors are spanned by U_A and the right singular vectors are orthogonal to the subspace spanned by U_B; and the matrix R^R is the part where the left singular vectors are orthogonal to the subspace spanned by U_A and the right singular vectors are spanned by U_B. Finally, the matrix R^E is the error matrix whose left and right singular vectors are orthogonal to the subspaces spanned by U_A and U_B, respectively, and which does not benefit from the side information at all. In particular, the error matrix R^E cannot be recovered from the side information, as the extracted subspaces provide no information about the orthogonal parts u_i^⊥ and v_i^⊥ of the singular vectors. Therefore, the error contributed by this matrix to the estimation error of the recovered rating matrix is unavoidable.

Algorithm 1 RECT
1: Input: R ∈ R^{n×m}, r: the partially observed rating matrix and its rank
          A ∈ R^{n×n}: the users' similarity matrix
          B ∈ R^{m×m}: the items' similarity matrix
2: Extract the maximal recoverable rating sub-matrix M ∈ R^{p×q} (according to Theorem ??)
3: Complete the sub-matrix M to get M̂ (according to Eq. (3.4))
4: Decompose M̂ as M̂ = Σ_{i=1}^{r} û_i v̂_i^T
5: Extract subspaces U_A and U_B by spectral clustering from the similarity matrices A and B, respectively
6: Compute â_i = (Û_A^T Û_A)^† Û_A^T û_i,  i = 1, 2, ..., r
7: Compute b̂_i = (Û_B^T Û_B)^† Û_B^T v̂_i,  i = 1, 2, ..., r
8: Compute R̂^∥ = U_A (Σ_{i=1}^{r} â_i b̂_i^T) U_B^T
9: Output: R̂

In the following subsections, we devise an effective method to recover the rating matrix R from the sub-matrix M and the subspaces U_A and U_B, and then provide theoretical guarantees on the estimation error in terms of the magnitude of the error matrix ‖R^E‖_F.

The completion stage. The first step in Algorithm 1, after extracting the sub-matrix M, is to complete M to get the fully recovered matrix M̂. To do so, we use the matrix factorization formulation in Eq. (3.1), which has achieved great success and popularity among the existing matrix completion techniques [110, 80, 104]. In particular, we solve the following convex optimization problem [18] to fully recover the sub-matrix:

$$\hat{M}=\arg\min_{X\in\mathbb{R}^{p\times q}}\|X\|_*\quad\text{s.t.}\quad X_{ij}=M_{ij},\;\forall (i,j)\in\Omega_M \qquad (3.4)$$

where Ω_M is the set of observed ratings in M. We note that, based on matrix completion theory, it is guaranteed that the low-rank matrix M can be perfectly recovered provided that the number of observed entries is sufficient.

The transduction stage. We now turn to recovering the matrix R^∥ = Σ_{i=1}^{r} u_i^∥ v_i^∥T from the sub-matrix M̂ and the subspaces U_A and U_B extracted from the similarity matrices A and B about users and items, respectively. The detailed steps of the proposed completion algorithm are shown in Algorithm 1. In the next step, the rating information in the recovered matrix M̂ is transduced to the cold-start users and items. To motivate the transduction step, let us focus on the matrix R^∥ as in Eq. (3.3). Since u_i^∥ and v_i^∥ are fully spanned by the subspaces U_A and U_B following our construction above, we can write them as:

$$u_i^{\parallel}=U_A a_i,\qquad v_i^{\parallel}=U_B b_i,\qquad i=1,2,\ldots,r, \qquad (3.5)$$

where a_i ∈ R^s and b_i ∈ R^s, i = 1, 2, ..., r, are the orthogonal projections of the singular vectors onto the corresponding subspaces. By substituting the equations in Eq. (3.5) into the decomposition of R^∥, we get:

$$R^{\parallel}=\sum_{i=1}^{r}u_i^{\parallel}v_i^{\parallel\top}=\sum_{i=1}^{r}U_A a_i b_i^\top U_B^\top=U_A\left(\sum_{i=1}^{r}a_i b_i^\top\right)U_B^\top \qquad (3.6)$$

From the above derivation, we observe that the key to the recovery of the matrix R^∥ is to estimate the vectors a_i, b_i, i = 1, 2, ..., r. Next we show how the recovered rating sub-matrix M̂, along with the subspaces extracted from the similarity matrices, can be utilized to estimate these vectors under some mild conditions on the number of cold-start users and items.
To this end, consider the decomposition of the recovered matrix as M̂ = Σ_{i=1}^{r} û_i v̂_i^T. The estimation of the vectors a_i, b_i, i = 1, 2, ..., r, in Eq. (3.6), and equivalently of the matrix R^∥, proceeds as follows. First, let Û_A ∈ R^{p×s} be the sub-matrix of U_A whose sampled rows correspond to the subset of rows in the matrix M̂. Similarly, we construct a sub-matrix of U_B, denoted by Û_B ∈ R^{q×s}, by sampling the rows of U_B corresponding to the columns in M̂. An estimate of the vectors a_i, b_i, i ∈ [r], is obtained by orthogonal projection of the left and right singular vectors of M̂ onto the sampled subspaces Û_A and Û_B, by solving the following optimization problems:

$$\hat{a}_i=\arg\min_{a\in\mathbb{R}^s}\left\|\hat{u}_i-\hat{U}_A a\right\|_2^2,\qquad \hat{b}_i=\arg\min_{b\in\mathbb{R}^s}\left\|\hat{v}_i-\hat{U}_B b\right\|_2^2. \qquad (3.7)$$

Then, we estimate R^∥ by:

$$\hat{R}^{\parallel}=U_A\left(\sum_{i=1}^{r}\hat{a}_i\hat{b}_i^\top\right)U_B^\top=U_A\left(\hat{U}_A^\top\hat{U}_A\right)^{\dagger}\hat{U}_A^\top\left(\sum_{i=1}^{r}\hat{u}_i\hat{v}_i^\top\right)\hat{U}_B\left(\hat{U}_B^\top\hat{U}_B\right)^{\dagger}U_B^\top,$$

where in the last equality we used the fact that (Û_A^T Û_A)^† Û_A^T û_i and (Û_B^T Û_B)^† Û_B^T v̂_i are the optimal solutions to the ordinary least squares regression problems in Eq. (3.7). Here (·)^† denotes the Moore-Penrose pseudoinverse of a matrix. The estimated rating matrix R̂ is simply set to be R̂ = R̂^∥.

Remark 3.1.1. The main computational cost in implementing Algorithm 1 is the spectral clustering of the similarity matrices, the orthogonal projections onto the subspaces Û_A and Û_B, and computing the singular value decomposition of M̂. We note that although the original rating matrix R might be large, the recoverable sub-matrix is significantly smaller and can be decomposed efficiently.

3.2 Network Completion

This section investigates the network completion problem, where it is assumed that only a small sample of a network (e.g., a complete or partially observed subgraph of a social graph) is observed and we would like to infer the unobserved part of the network. In this chapter, we assume that besides the observed subgraph, side information about the nodes, such as the pairwise similarity between them, is also provided. In contrast to the original network completion problem, where standard methods such as matrix completion are inapplicable due to the non-uniform sampling of observed links, we show that by effectively exploiting the side information it is possible to accurately predict the unobserved links. In contrast to existing matrix completion methods with side information, such as shared subspace learning and matrix completion with transduction, the proposed algorithm decouples the completion from the transduction to effectively exploit the similarity information.
This crucial difference greatly boosts the performance when appropriate similarity information is used. The recovery error of the proposed algorithm is theoretically analyzed based on the richness of the similarity information and the size of the observed sub-matrix. To the best of our knowledge, this is the first algorithm that addresses network completion with node similarities and provable guarantees. Experiments on synthetic and real networks from Facebook and Google+ show that the proposed two-stage method is able to accurately reconstruct the network and outperforms other methods.

3.2.1 The Setting

Before proceeding to the proposed algorithm, in this section we establish the notation used throughout the section and formally describe our problem setting.

Notation. Scalars are denoted by lowercase letters, vectors by boldface lowercase letters such as a, and matrices by boldface uppercase letters such as M. We use [n] to denote the set of integers {1, 2, ..., n}. The dot product between two vectors a and b is denoted by a^T b. The parallel and perpendicular projections of a vector a onto a subspace are denoted by a^∥ and a^⊥, respectively. The Frobenius, spectral, and trace norms of a matrix M are denoted by ‖M‖_F, ‖M‖_2, and ‖M‖_*, respectively. The transpose of a vector and a matrix are denoted by a^T and M^T, respectively. We use A^† to denote the Moore-Penrose pseudoinverse of the matrix A.

To formalize the setting, we assume there is a true undirected, unweighted graph G = (V, E) on n = |V| distinguishable vertices with adjacency matrix A ∈ {0, 1}^{n×n}. In this section we assume that only a partially observed sub-matrix O ∈ {0, 1, ?}^{m×m}, 1 ≤ m ≤ n, induced by a sampled subgraph G' = (V', E'), E' ⊂ E, of the original graph is given. Here "0" represents a known absent edge, "1" denotes a known present edge, and "?" indicates an unobserved edge. In contrast to the classical link prediction problem, where the assumption is that random entries of A are missing, in the network completion problem [55] the information about a subset of nodes is entirely missing, i.e., the corresponding rows of these nodes are completely missing. However, we assume that the missing edges in O are sampled uniformly at random, so that matrix completion methods are capable of completing this matrix. We assume that besides the observed entries, side information about the nodes in the social graph is also available.
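As a concrete instance of this setting, and as a preview of the decoupled extract/complete/transduce recipe developed in Section 3.2.2, the following numpy sketch builds a toy graph with two fully connected communities (self-loops included, so rank(A) = 2), takes the similarity matrix equal to the adjacency matrix, observes only the subgraph on the first m nodes, and reconstructs the rest. All sizes and the 0.5 rounding threshold are illustrative choices of ours:

```python
import numpy as np

# Two disjoint communities (6 and 8 nodes); with self-loops the adjacency
# matrix is block-diagonal with all-ones blocks, so rank(A) = 2 and the
# community indicator vectors play the role of the a_i membership vectors.
a1 = np.array([1.0] * 6 + [0.0] * 8)
a2 = 1.0 - a1
A = np.outer(a1, a1) + np.outer(a2, a2)
n = A.shape[0]

S = A.copy()        # side information: similarity matrix equal to adjacency
m, s = 9, 2         # observed subgraph size and subspace dimension

# [Extraction] top-s eigenvectors of S (eigh sorts eigenvalues ascending).
U_s = np.linalg.eigh(S)[1][:, -s:]

# [Completion] the subgraph on the first m nodes is taken fully observed,
# so the convex completion step is a no-op in this toy example.
O_hat = A[:m, :m]

# [Transduction] least-squares estimate through the sampled subspace rows,
# then lift with the full subspace and round scores back to 0/1 edges.
Us_samp = U_s[:m]
B_hat = np.linalg.pinv(Us_samp) @ O_hat @ np.linalg.pinv(Us_samp).T
A_hat = U_s @ B_hat @ U_s.T
A_pred = (A_hat > 0.5).astype(float)
```

The observed subgraph touches both communities, so the sampled rows Û_s retain full column rank and the links involving the five unseen nodes are recovered exactly; if a community had no member among the observed nodes, the least-squares system would be rank-deficient and that block could not be inferred.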
In particular, it is assumed that the pairwise similarity between nodes is obtained from features of the nodes and is captured in a similarity matrix S ∈ R^{n×n}. The overarching goal is to show that side information can potentially benefit the process of network completion. In particular, we aim at predicting the unobserved edges to complete the adjacency matrix A based on the partially observed sub-matrix O and the similarity matrix S.

3.2.2 The Proposed Algorithm

We now turn to describing our algorithm and the assumptions underlying it. We assume that the structure of the network and the similarity information are correlated and, to an extent that will be formalized later, share the same latent information; that is, the row vectors in A share an underlying subspace spanned by the leading eigenvectors of the similarity matrix S. This assumption follows from the fact that the similarity of nodes provides auxiliary information about the links that exist between them; otherwise there would be no hope of benefiting from this information in completing the matrix.

Before delving into the algorithm, we discuss two alternative methods and note a few of their deficiencies for our setting. A naive approach to solve the network completion problem is to apply the standard tools from matrix completion [19, 18, 93], and in particular transductive matrix completion [42], by incorporating S as side information.¹ However, as mentioned earlier, in contrast to the classical matrix completion or missing-link inference problem, where the assumption is that random entries of A are missing, in our case complete rows of A are missing, which makes the standard matrix completion algorithms inapplicable.

¹We note that for many real-world social networks the underlying adjacency matrix is low-rank (e.g., see [21]).

Another straightforward approach to exploit and transfer knowledge from the similarity matrix to predict the structure of the network is to cast the problem as a shared subspace learning framework based on a joint matrix factorization that jointly learns a common subset of basis vectors for the adjacency matrix A and the corresponding similarity matrix S. In particular, the goal would be to factorize the matrices A and S into three subspaces, one shared between the adjacency and the similarity matrices and two specific to the respective matrices, as formulated in the following optimization problem:
$$\min_{U,V,W}\;\frac{1}{2}\|A-UV^\top\|_F^2+\frac{1}{2}\|S-UW^\top\|_F^2+\lambda\left(\|U\|_F^2+\|V\|_F^2+\|W\|_F^2\right) \qquad (3.8)$$

with λ as the regularization parameter for the norms of the solution matrices. The main issue with subspace sharing is that the completion of the unobserved entries of the adjacency matrix A from the observed samples in O and the transduction of knowledge from these entries to fully unobserved nodes via the similarity matrix S are carried out simultaneously. Therefore, the completion and transduction errors are propagated repetitively and in an uncontrolled way that hinders the effectiveness of the similarity information.

In order to alleviate these difficulties, we propose an alternative approach that diverges from these algorithms and transfers information from the similarity matrix to the adjacency matrix via the fully recoverable sub-matrix O. In particular, the proposed algorithm decouples the completion from the transduction and consists of two stages: (i) completion of the partially observed sub-matrix O, which can be done perfectly with zero completion error, and (ii) transduction of links from the recovered sub-matrix to unobserved nodes using the similarity matrix. This crucial difference provides an effective way to exploit the similarity information and greatly boosts the performance of the proposed algorithm when appropriate side information is incorporated.

Algorithm 2 Network Completion with Side Information [34]
1: Input: n: the number of nodes in the network G = (V, E)
          O: the adjacency matrix of the subgraph with m nodes
          S: the partially observed pairwise similarity matrix
          s ≥ rank(A): the number of eigenvectors in the subspace
2: Extract U_s from S by spectral clustering [Extraction]
3: Complete the sub-matrix O by solving the convex optimization problem in (3.9) to get Ô [Completion]
4: Sample m rows of U_s ∈ R^{n×s} uniformly at random to create the matrix Û_s ∈ R^{m×s}
5: Set B̂ = (Û_s^T Û_s)^† Û_s^T Ô Û_s (Û_s^T Û_s)^†
6: Output: Â = U_s B̂ U_s^T [Transduction]

With our assumption on the correlation of the adjacency matrix and node similarities in place, we can now describe our algorithm. The detailed steps of the proposed algorithm are shown in Algorithm 2. In what follows, we will discuss each stage in greater detail.

Subspace extraction. The first step in Algorithm 2 is to extract a representative subspace from the similarity matrix S. To this end, we extract an orthogonal matrix U_s ∈ R^{n×s} from the similarity matrix whose column space subsumes the column space of the adjacency matrix,
s is chosen to be larger than the rank of the adjacency matrix, i.e., s ≥ rank(A). To construct the subspace U_s we use the s eigenvectors corresponding to the s largest eigenvalues of the provided similarity matrix S, which can be done by singular value decomposition or spectral clustering [67].

Completion. In the second step, we recover the partially observed submatrix O. Although tools from matrix completion are not applicable to the full adjacency matrix A, since by our assumption the observed links in the sampled subgraph are uniformly sampled, we can recover the submatrix O using tools from matrix completion. To do so, let Ω be the set of observed links in the induced submatrix O, i.e., Ω = {(i, j) ∈ [n] × [n] : O_{ij} has been observed}. Then we solve the following convex optimization problem [18] to fully recover the submatrix:

minimize ‖X‖_*   s.t.   X_{ij} = O_{ij}, ∀(i, j) ∈ Ω.   (3.9)

We use Ô to denote the optimal solution of the optimization problem in Eq. (3.9).

Transduction. Having recovered the submatrix and extracted the subspace U_s from the similarity matrix, we now turn to the transduction step. To do so, we note that since both the adjacency matrix A of the social network and the recovered submatrix Ô are low rank, we can decompose these matrices as:

A = Σ_{i=1}^r a_i a_i^⊤   and   Ô = Σ_{i=1}^r â_i â_i^⊤,   (3.10)

where r is the rank of the adjacency matrix. To see this, we can consider a_i ∈ {0, 1}^n and â_i ∈ {0, 1}^m as the membership assignments to the i-th hidden component of the graph. We note that if the similarity matrix is set to be equivalent to the adjacency matrix, then the indicator vectors of the connected components are exactly a_1, a_2, …, a_r.

To formalize the correlation between the similarity information and the adjacency matrix, we assume that both matrices share a common subspace. Therefore each a_i, i ∈ [r], can be decomposed in a unique way into two parallel and orthogonal parts as a_i = a_i^∥ + a_i^⊥, where a_i^∥ belongs to the column span of U_s and a_i^⊥ is orthogonal to the column span of U_s. Moreover, since a_i^∥ belongs to the column span of U_s, we can write

a_i^∥ = U_s b_i,   i = 1, 2, …, r,   (3.11)

for some b_i ∈ R^s.

To build intuition for the transduction step, we relate the adjacency matrix to the similarity matrix. Having decomposed the latent features as above into parallel and orthogonal components, we can rewrite the adjacency matrix as:

A = Σ_{i=1}^r a_i a_i^⊤ = Σ_{i=1}^r (a_i^∥ + a_i^⊥)(a_i^∥ + a_i^⊥)^⊤ = Σ_{i=1}^r a_i^∥ (a_i^∥)^⊤ + Σ_{i=1}^r a_i^⊥ (a_i^⊥)^⊤ = A_∥ + A_E,   (3.12)

where A_∥ is the part fully captured by the similarity information, and A_E is the error matrix, whose singular vectors are orthogonal to the subspace spanned by the similarity subspace and do not benefit from the side information at all. In particular, the error matrix A_E cannot be recovered from the side information, as the extracted subspace provides no information about the orthogonal parts a_i^⊥ of the singular vectors. Therefore, the error contributed by this matrix to the recovery error of the inferred adjacency matrix is unavoidable. By combining Eq. (3.11) with Eq. (3.12), the adjacency matrix can be written as:

A = Σ_{i=1}^r a_i^∥ (a_i^∥)^⊤ + A_E = U_s (Σ_{i=1}^r b_i b_i^⊤) U_s^⊤ + A_E.   (3.13)

From the above derivation, we observe that the key to recovering the matrix A is to estimate the vectors b_i, i = 1, 2, …, r. In the following we show how the recovered submatrix Ô, along with the subspace U_s extracted from the similarity matrix, can be utilized to estimate these vectors perfectly under a mild condition on the number of sampled nodes in the subgraph G′ = (V, E′). The key idea is that although we cannot solve the linear system a_i^∥ = U_s b_i for b_i (as a_i^∥ is not accessible), we can replace a_i and U_s with the accessible vector â_i and subspace Û_s, respectively, and estimate a_i^∥ perfectly with high probability. In particular, let Û_s be the random subspace obtained by Û_s = Ψ U_s, where Ψ is an m × n random matrix distributed as m rows of a uniformly random permutation matrix of size n, whose selected rows correspond to the rows in O. To obtain an estimate of b_i we solve the following least squares problem:

b̂_i = argmin_{b ∈ R^s} ‖â_i − Û_s b‖_2^2 = (Û_s^⊤ Û_s)^† Û_s^⊤ â_i,

where (·)^† denotes the Moore–Penrose pseudoinverse of a matrix, and in the last equality we used the fact that (Û_s^⊤ Û_s)^† Û_s^⊤ â_i is the optimal solution of the ordinary least squares regression problem above. Having computed an estimate for each b_i, we recover the full matrix by:

Â = U_s (Σ_{i=1}^r b̂_i b̂_i^⊤) U_s^⊤.

3.3 Analysis of Recovery Error

In order to see the impact of the similarity information on predicting the missing structure of the network, we theoretically analyze the recovery error of the proposed algorithm. In particular, the performance of Algorithm 2 on inferring the full structure of the input graph with adjacency matrix A is stated in the following theorem [10].
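The subspace-extraction, completion, and transduction steps above can be sketched numerically. The snippet below is a minimal illustration under strong simplifying assumptions, not the thesis implementation: the error matrix A_E is taken to be zero (A lies exactly in the span of U_s), and the nuclear-norm completion step of Eq. (3.9) is treated as already solved, so the sampled submatrix O is fully observed. All sizes and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, s, r, m = 60, 4, 3, 25

# Orthonormal similarity subspace U_s (the top eigenvectors of S in the text).
U_s, _ = np.linalg.qr(rng.standard_normal((n, s)))

# Build a rank-r adjacency-like matrix whose components lie in span(U_s),
# so that the error matrix A_E of Eq. (3.12) is zero and recovery can be exact.
B = rng.standard_normal((s, r))
A = (U_s @ B) @ (U_s @ B).T

# Sample m nodes uniformly; O is the induced submatrix, here fully observed
# (standing in for the output of the completion step, Eq. (3.9)).
idx = rng.choice(n, size=m, replace=False)
O = A[np.ix_(idx, idx)]
U_hat = U_s[idx, :]                       # \hat U_s = Psi U_s (sampled rows)

# Eigen-decompose O to obtain the factors \hat a_i of Eq. (3.10).
w, V = np.linalg.eigh(O)
top = np.argsort(w)[::-1][:r]
factors = V[:, top] * np.sqrt(np.maximum(w[top], 0.0))  # columns are \hat a_i

# Transduction: least-squares estimates b_i from \hat a_i = \hat U_s b_i.
B_hat, *_ = np.linalg.lstsq(U_hat, factors, rcond=None)

# Reconstruct the full matrix: A_rec = U_s (sum_i b_i b_i^T) U_s^T.
A_rec = U_s @ (B_hat @ B_hat.T) @ U_s.T
print(np.linalg.norm(A - A_rec) / np.linalg.norm(A))  # near zero when A_E = 0
```

Because Ψ U_s has full column rank here, the least-squares step recovers B B^⊤ exactly, matching the claim that the b_i can be estimated perfectly when A_E = 0.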
Theorem 3.3.1. Let A ∈ {0, 1}^{n×n} be the adjacency matrix of the social graph G = (V, E) with coherence parameter μ. Let O ∈ {0, 1, ?}^{m×m} be a partially observed submatrix of A whose links are uniformly sampled, with m ≥ 8μ r log r sampled nodes and number of observed links |Ω| ≥ C μ^4 m r^2 log^2 m, where r is the rank of the original matrix A. Let Â be the matrix recovered by Algorithm 2 using the similarity matrix S about the nodes. Then, with probability at least 1 − n^{−3}, it holds that:

‖A − Â‖_F ≤ (2n/m) ‖A_E‖_F^2 + (1 + 2n √(2n/m)) ‖A_E‖_F,

where A_E = Σ_{i=1}^r a_i^⊥ (a_i^⊥)^⊤ is the component of the adjacency matrix orthogonal to the subspace of the similarity matrix.

In the above inequality, the parameter μ is known as the incoherence [18, 93], which is the prevailing assumption in the analysis of matrix completion algorithms. This parameter states that the singular vectors of A should be de-localized, in the sense of having small inner products with the standard basis, to make full recovery possible; it is formally defined below.

Definition 3.3.2. An n × n matrix M with singular value decomposition M = UΣV^⊤ is μ-incoherent if

max_{i,j} |U_{ij}| ≤ √μ/√n   and   max_{i,j} |V_{ij}| ≤ √μ/√n.   (3.14)

Before proving Theorem 3.3.1, let us pause to make some remarks concerning the result given above. First, one can observe that the recovery error is stated in terms of the norm of the error matrix A_E, which captures the extent to which the similarity matrix fails to infer the links between users. Also, as expected, the error decreases as the number of unobserved nodes is reduced.

The following lemma, which follows from [19], shows that under high incoherence and uniform sampling, solving Eq. (3.9) exactly recovers O with high probability.

Lemma 3.3.3. Let M be an m × m matrix with rank r. In addition, assume M is μ-incoherent. Then there exists some constant C such that, if |Ω| ≥ C μ^4 m r^2 log^2 m and the entries are uniformly sampled, then with probability at least 1 − n^{−3}, M is the unique optimizer of Eq. (3.9).

Lemma 3.3.3 indicates that if the observed subgraph has enough links, i.e., |Ω| ≥ C μ^4 m r^2 log^2 m, then one can fully recover the subgraph. Therefore, the recovery error of the induced submatrix is zero, provided the number of observed entries is sufficiently large and uniformly sampled.

3.3.1 Proof of Theorem 3.3.1

We now prove the result in Theorem 3.3.1.

Proof (of Theorem 3.3.1). To prove the result, we rewrite the error as:

‖A − Â‖_F = ‖A_∥ + A_E − Â‖_F
≤ ‖A_∥ − Â‖_F + ‖A_E‖_F
= ‖U_s (Σ_{i=1}^r b_i b_i^⊤) U_s^⊤ − U_s (Σ_{i=1}^r b̂_i b̂_i^⊤) U_s^⊤‖_F + ‖A_E‖_F
= ‖U_s (Σ_{i=1}^r b_i b_i^⊤ − Σ_{i=1}^r b̂_i b̂_i^⊤) U_s^⊤‖_F + ‖A_E‖_F
= ‖Σ_{i=1}^r b_i b_i^⊤ − Σ_{i=1}^r b̂_i b̂_i^⊤‖_F + ‖A_E‖_F,

where the last equality holds because U_s has orthonormal columns. For simplicity, we define the two matrices B = [b_1, b_2, …, b_r] and B̂ = [b̂_1, b̂_2, …, b̂_r]. Then the above inequality can be written as:

‖A − Â‖_F ≤ ‖B B^⊤ − B̂ B̂^⊤‖_F + ‖A_E‖_F.

To bound the first term on the right-hand side, we expand B B^⊤ − B̂ B̂^⊤ = (B − B̂)(B − B̂)^⊤ + (B − B̂) B̂^⊤ + B̂ (B − B̂)^⊤, so that:

‖B B^⊤ − B̂ B̂^⊤‖_F ≤ ‖B − B̂‖_F^2 + 2 ‖(B − B̂) B̂^⊤‖_F ≤ ‖B − B̂‖_F^2 + 2 ‖B − B̂‖_F ‖B‖_F.

Now we turn to bounding ‖B − B̂‖_F. To this end, we have:

‖B − B̂‖_F^2 = Σ_{i=1}^r ‖b_i − b̂_i‖_2^2.   (3.15)

First, we show that b̂_i = b_i + M Ψ a_i^⊥, where M = (Û_s^⊤ Û_s)^† Û_s^⊤. Indeed,

b̂_i = (Û_s^⊤ Û_s)^† Û_s^⊤ â_i
= (Û_s^⊤ Û_s)^† Û_s^⊤ Ψ (a_i^∥ + a_i^⊥)
= (Û_s^⊤ Û_s)^† Û_s^⊤ Û_s b_i + (Û_s^⊤ Û_s)^† Û_s^⊤ Ψ a_i^⊥
= b_i + M Ψ a_i^⊥,

where we used Ψ a_i^∥ = Ψ U_s b_i = Û_s b_i. By substituting this equality into Eq. (3.15) we have:

‖B − B̂‖_F^2 = Σ_{i=1}^r ‖M Ψ a_i^⊥‖_2^2 ≤ ‖Û_s^†‖_2^2 Σ_{i=1}^r ‖a_i^⊥‖_2^2 = ‖Û_s^†‖_2^2 ‖A_E‖_F^2,

where we used σ_max(M) = ‖Û_s^†‖_2 and the fact that Ψ selects rows, so that ‖Ψ a‖_2 ≤ ‖a‖_2. Putting all the inequalities together, and noting that ‖B‖_F ≤ n, yields:

‖A − Â‖_F ≤ ‖B − B̂‖_F^2 + 2 ‖B − B̂‖_F ‖B‖_F + ‖A_E‖_F ≤ ‖Û_s^†‖_2^2 ‖A_E‖_F^2 + 2n ‖Û_s^†‖_2 ‖A_E‖_F + ‖A_E‖_F.   (3.16)

It remains to bound ‖Û_s^†‖_2, for which we rely on the following result [40].

Lemma 3.3.4. Let U be an n × k matrix with orthonormal columns, and take μ to be the coherence of U. Select δ ∈ (0, 1) and a nonzero failure probability β. Let Ψ be a random matrix distributed as p rows of a uniformly random permutation matrix of size n, where

p ≥ (2μk / (1 − δ)^2) log(k/β).

Then, with probability exceeding 1 − β, the matrix ΨU has full column rank and

‖(ΨU)^†‖_2^2 ≤ n / ((1 − δ) p).

Applying Lemma 3.3.4 with p = m to the inequality in Eq. (3.16) yields:

‖A − Â‖_F ≤ (n/((1 − δ)m)) ‖A_E‖_F^2 + 2n √(n/((1 − δ)m)) ‖A_E‖_F + ‖A_E‖_F.   (3.17)

Setting δ = 1/2 in Lemma 3.3.4, we get the desired bound stated in Theorem 3.3.1. ∎

Chapter 4

Semi-supervised Collaborative Ranking Algorithm

Existing collaborative ranking based recommender systems tend to perform best when there are enough observed ratings for each user and the observations are made completely at random. Under this setting, recommender systems can properly suggest a list of recommendations according to the user's interests. However, when the observed ratings are extremely sparse (e.g., in the case of cold-start users, for whom no rating data is available) and are not sampled uniformly at random, existing ranking methods fail to effectively leverage side information to transfer the knowledge from existing ratings to unobserved ones. We propose a semi-supervised collaborative ranking model, dubbed S2COR [8], to improve the quality of cold-start recommendation. S2COR mitigates the sparsity issue by leveraging side information about both observed and missing ratings while collaboratively learning the
ranking model. This enables it to deal with the case of data missing not at random, and also to effectively incorporate the available side information in the transduction. We experimentally evaluated our proposed algorithm on a number of challenging real-world datasets and compared it against state-of-the-art models for cold-start recommendation. We report significantly higher quality recommendations with our algorithm compared to the state-of-the-art.

4.1 The Setting

In this section we establish the notation used throughout the thesis and formally describe our problem setting. Scalars are denoted by lowercase letters and vectors by boldface lowercase letters such as u. We use boldface uppercase letters such as M to denote matrices. The Frobenius norm of a matrix M ∈ R^{n×m} is denoted by ‖M‖_F, i.e., ‖M‖_F = (Σ_{i=1}^n Σ_{j=1}^m |M_{ij}|^2)^{1/2}, and its (i, j)-th entry is denoted by M_{i,j}. The trace norm of a matrix is denoted by ‖M‖_*, which is defined as the sum of its singular values. The transpose of a vector and of a matrix are denoted by u^⊤ and U^⊤, respectively. We use [n] to denote the set of integers {1, 2, …, n}. The set of non-negative real numbers is denoted by R_+. The indicator function is denoted by I[·]. For a vector u ∈ R^p we use ‖u‖_1 = Σ_{i=1}^p |u_i|, ‖u‖_2 = (Σ_{i=1}^p |u_i|^2)^{1/2}, and ‖u‖_∞ = max_{1≤i≤p} |u_i| to denote its ℓ_1, ℓ_2, and ℓ_∞ norms, respectively. The dot product between two vectors u and u′ is denoted by either ⟨u, u′⟩ or u^⊤u′.

In collaborative filtering, we assume that there is a set of n users U = {u_1, …, u_n} and a set of m items I = {i_1, …, i_m}, where each user expresses opinions about a set of items. The rating information is summarized in an n × m matrix R ∈ {−1, +1, ?}^{n×m}, where the rows correspond to the users, the columns correspond to the items, and the (p, q)-th entry is the rating given by user u_p to item i_q. We note that the rating matrix is only partially observed, and it is sparse in most cases. We are mainly interested in recommending a set of items to an active user such that the user has not rated these items before.

4.2 Transductive Collaborative Ranking

We now turn our attention to the main thrust of the thesis, where we present our transductive collaborative ranking algorithm with accuracy at the top, obtained by exploiting the features of unrated data. We begin with the basic formulation and then extend it to incorporate the unrated items. The pseudo-code of the resulting learning algorithm is provided in Algorithm 3.

4.2.1 A basic formulation

We consider
a ranking problem where, given a set of users U and known user feedback on a set of items I, the goal is to generate rankings of unobserved items adapted to each user's preferences. Here we consider the bipartite setting, in which items are either relevant (positive) or irrelevant (negative). Many ranking methods have been developed for bipartite ranking, and most of them are essentially based on pairwise ranking. These algorithms reduce the ranking problem to a binary classification problem by treating each relevant/irrelevant instance pair as a single object to be classified [?].

As mentioned above, most research has concentrated on the rating prediction problem in CF, where the aim is to accurately predict the ratings of the unrated items for each user. However, most applications that use CF typically aim to recommend only a small ranked set of items to each user. Thus, rather than concentrating on rating prediction, we instead approach this problem from the ranking viewpoint, where the goal is to rank the unrated items in the order of relevance to the user. Moreover, it is desirable to concentrate aggressively on the top portion of the ranked list, to include mostly relevant items and push irrelevant items down from the top. Specifically, we propose an algorithm that maximizes the number of relevant items which are pushed to the absolute top of the list by utilizing the P-Norm Push ranking measure, which is specially designed for this purpose [99].

Algorithm 3 S2COR
1: input: λ ∈ R_+: the regularization parameter, and {η_t}_{t≥1}: the sequence of scalar step sizes
2: Initialize W_0 ∈ R^{n×d}
3: Choose an appropriate step size
4: for t = 1, …, T do
5:   Compute the sub-gradient G_t ∈ ∂L(W_t) using Eq. (4.11)
6:   [U_t, Σ_t, V_t] ← SVD(W_{t−1} − (1/η_{t−1}) G_t)
7:   W_t ← U_t [Σ_t − (λ/η_{t−1}) I]_+ V_t^⊤
8: end for
9: output: W_T
For simplicity of exposition, let us consider the ranking model for a single user u. Let X^+ = {x_1^+, …, x_{n^+}^+} and X^− = {x_1^−, …, x_{n^−}^−} be the sets of feature vectors of the n^+ relevant and n^− irrelevant items for user u, respectively. We consider linear ranking functions, where each item feature vector x ∈ R^d is mapped to a score w^⊤x. The goal is to find parameters w for each user such that the ranking function best captures the past feedback from the user. The goal of ranking is to maximize the number of relevant items ranked above the highest-ranking irrelevant item. We cast this idea for each user u individually into the following optimization problem:

min_{w ∈ R^d} (1/n^+) Σ_{i=1}^{n^+} I[ ⟨w, x_i^+⟩ ≤ max_{1≤j≤n^−} ⟨w, x_j^−⟩ ],   (4.1)

where I[·] is the indicator function, which returns 1 when its input is true and 0 otherwise, and n^+ and n^− are the numbers of relevant and irrelevant items for user u, respectively.

Let us now derive the general form of our objective. We hypothesize that most users base their decisions about items on a number of latent features of the items. In order to uncover these latent feature dimensions, we impose a low-rank constraint on the set of parameters of all users. To this end, let W = [w_1, w_2, …, w_n]^⊤ ∈ R^{n×d} denote the matrix of all parameter vectors for the n users. Let I_i^+ ⊆ {1, 2, …, m} and I_i^− ⊆ {1, 2, …, m} be the sets of relevant and irrelevant items of the i-th user, respectively. The overall objective for all users is formulated as follows:

F(W) = λ‖W‖_* + Σ_{i=1}^n (1/|I_i^+|) Σ_{j∈I_i^+} I[ ⟨w_i, x_j⟩ ≤ max_{k∈I_i^−} ⟨w_i, x_k⟩ ],   (4.2)

where ‖·‖_* is the trace norm (also known as the nuclear norm), i.e., the sum of the singular values of the input matrix. The objective in Eq. (4.2) is composed of two terms. The first term is the regularization term and is introduced to capture the factor model intuition discussed above. The premise behind a factor model is that there is only a small number of factors influencing the preferences, and that a user's preference vector is determined by how each factor applies to that user. Therefore, the parameter vectors of all users must lie in a low-dimensional subspace. Trace-norm regularization is a widely used and successful approach for collaborative filtering and matrix completion. The trace norm is well known to be a convex surrogate of the matrix rank, and has repeatedly shown good performance in practice [110, 20].
The second term is introduced to push the relevant items of each user to the top of the list when the items are ranked based on the parameter vector of the user and the features of the items.

The above optimization problem is intractable due to the non-convex indicator function. To design practical learning algorithms, we replace the indicator function in (4.2) with a convex surrogate. To this end, we define the convex loss function ℓ: R → R_+ as ℓ(x) = [1 − x]_+. This is the widely used hinge loss in SVM classification (see, e.g., [15]). (We note that other convex loss functions, such as the exponential loss ℓ(x) = exp(−x) and the logistic loss ℓ(x) = log(1 + exp(−x)), can also be used as surrogates of the indicator function, but for simplicity of derivation we only consider the hinge loss here.) This loss function penalizes the amount by which the ranking constraints are not satisfied. Replacing the non-convex indicator function with this convex surrogate leads to the following tractable convex optimization problem:

F(W) = λ‖W‖_* + Σ_{i=1}^n (1/|I_i^+|) Σ_{j∈I_i^+} ℓ( ⟨w_i, x_j⟩ − ‖X_i^− w_i‖_∞ ),   (4.3)

where X_i^− = [x_1, …, x_{n_i^−}]^⊤ is the matrix of features of the n_i^− irrelevant items in I_i^− and ‖·‖_∞ is the max norm of a vector.

4.2.2 Semi-supervised collaborative ranking

In this part, we extend the proposed ranking idea to learn from both rated and unrated items. The motivation for incorporating unrated items comes from the following key observations. First, we note that commonly there is a small set of rated (either relevant or irrelevant) items for each user and a large number of unrated items. As can be seen from Eq. (4.2), the unrated items do not play any role in learning the model for each user, as the learning is based only on pairs of rated items. When feature information about the items is available, it would be very helpful if one could leverage such unrated items in the learning-to-rank process to effectively exploit the available side information. By leveraging both rated and unrated items, we can compensate for the lack of rating data. Second, the non-randomness of the observed ratings creates a bias in learning the model that may degrade the resulting recommendation accuracy. Therefore, a precise model that reduces the effect of the bias introduced by non-random missing ratings seems essential.

To address these two issues, we extend the basic formulation in Eq. (4.2) to incorporate items with missing ratings in the ranking of items for individual users.
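Before extending the objective, the per-user hinge surrogate of Eq. (4.3) is easy to make concrete. The sketch below evaluates it for a single user with random, purely illustrative feature matrices; it assumes the ‖X⁻w‖_∞ term stands for the maximum score over the irrelevant items, as in the derivation of Eq. (4.1).

```python
import numpy as np

def hinge(z):
    """Hinge loss l(z) = [1 - z]_+, the convex surrogate of the indicator."""
    return np.maximum(0.0, 1.0 - z)

def push_loss(w, X_pos, X_neg):
    """Per-user loss of Eq. (4.3): average hinge of the margin between each
    relevant item's score and the highest-scoring irrelevant item."""
    top_neg = np.max(X_neg @ w)          # max score over irrelevant items
    margins = X_pos @ w - top_neg
    return hinge(margins).mean()

rng = np.random.default_rng(0)
d = 5
w = rng.standard_normal(d)               # one user's parameter vector
X_pos = rng.standard_normal((8, d))      # relevant-item features (illustrative)
X_neg = rng.standard_normal((12, d))     # irrelevant-item features (illustrative)
print(push_loss(w, X_pos, X_neg))        # 0 only if every relevant item beats
                                         # the best irrelevant item by margin 1
```

The same helper, applied with the unrated-item feature matrix in place of X_neg (and with unrated items in place of relevant ones), gives the two extra terms introduced below.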
A conservative solution is to push the items with unknown ratings to the middle of the ranked list, i.e., after the relevant and before the irrelevant items. To do so, let I_i^∘ = I \ (I_i^+ ∪ I_i^−) denote the set of items unrated by user i ∈ U. We introduce two extra terms into the objective in Eq. (4.2) to push the unrated items I_i^∘ below the relevant items and above the irrelevant items, which yields the following per-user objective:

L(w) = (1/|I^+|) Σ_{i∈I^+} ℓ( ⟨w, x_i⟩ − max_{j∈I^−} ⟨w, x_j⟩ )
     + (1/|I^+|) Σ_{i∈I^+} ℓ( ⟨w, x_i⟩ − max_{j∈I^∘} ⟨w, x_j⟩ )
     + (1/|I^∘|) Σ_{i∈I^∘} ℓ( ⟨w, x_i⟩ − max_{j∈I^−} ⟨w, x_j⟩ ).   (4.4)

Equipped with the objective for individual users, we now turn to the collaborative ranking objective:

F(W) = λ‖W‖_*
     + Σ_{i=1}^n (1/|I_i^+|) Σ_{j∈I_i^+} ℓ( ⟨w_i, x_j⟩ − ‖X_i^− w_i‖_∞ )
     + Σ_{i=1}^n (1/|I_i^+|) Σ_{j∈I_i^+} ℓ( ⟨w_i, x_j⟩ − ‖X_i^∘ w_i‖_∞ )
     + Σ_{i=1}^n (1/|I_i^∘|) Σ_{j∈I_i^∘} ℓ( ⟨w_i, x_j⟩ − ‖X_i^− w_i‖_∞ ),   (4.5)

where X_i^∘ = [x_1, …, x_{n_i^∘}]^⊤ is the matrix of features of the n_i^∘ unrated items in I_i^∘.

Remark 4.2.1. We emphasize that, beyond the accuracy considerations of pushing relevant items to the top of the list, the push norm has a clear advantage over pairwise ranking models from a computational point of view. In particular, the push norm has a linear O(m) dependency on the number of items, which is quadratic, O(m^2), for pairwise ranking models.

4.3 The Optimization

We now turn to solving the optimization problem in (4.5). We start by discussing a gradient descent method with a shrinkage operator, followed by its accelerated version, and then propose a non-convex formulation with alternating minimization for more efficient optimization of the S2COR objective.

4.3.1 Gradient descent with shrinkage operator

Due to the presence of the trace norm of the parameter matrix, this objective function falls into the general category of composite optimization, which can be solved by stochastic gradient or gradient descent methods. In this part we propose a projected gradient descent method to solve the optimization problem. First we write the objective as:

min_{W ∈ R^{n×d}} F(W) = λ‖W‖_* + L(W),   (4.6)

where L(W) = Σ_{i=1}^n L(w_i). A simple way to solve the above optimization problem is the gradient descent algorithm [84], which needs to evaluate the gradient of the objective at each iteration. To deal with the non-smooth trace norm λ‖W‖_* in the objective, we note that the optimization problem in Eq. (4.6) can be reformulated under the framework of proximal regularization or composite gradient mapping [84].
By taking advantage of the composite structure, it is possible to retain the same convergence rates as the gradient method for smooth optimization problems. In particular, the optimization problem in (4.6) can be solved iteratively by:

W_t = argmin_W  L(W_{t−1}) + tr( (W − W_{t−1})^⊤ ∇L(W_{t−1}) ) + (η_t/2) ‖W − W_{t−1}‖_F^2 + λ‖W‖_*,   (4.7)

where {η_t}_{t≥1} is a sequence of scalar step sizes and tr(·) is the trace of the input matrix. By ignoring the constant terms, Eq. (4.7) can also be rewritten as:

W_t = argmin_W  (η_t/2) ‖W − ( W_{t−1} − (1/η_t) ∇L(W_{t−1}) )‖_F^2 + λ‖W‖_*.   (4.8)

We use the singular value shrinkage operator introduced in [18] to find the optimal solution of Eq. (4.8). To this end, consider the singular value decomposition (SVD) of a matrix M ∈ R^{n×d} of rank r as M = UΣV^⊤, Σ = diag({σ_i}_{1≤i≤r}), where U and V are, respectively, n × r and d × r matrices with orthonormal columns, and the singular values σ_i are positive. For a scalar τ ∈ R_+, the singular value shrinkage operator P_τ is defined as:

P_τ(M) := U P_τ(Σ) V^⊤,   P_τ(Σ) = diag({[σ_i − τ]_+}),   (4.9)

where [x]_+ is the positive part of x, namely [x]_+ = max(0, x). The shrinkage operator applies a soft-thresholding rule to the singular values of M, effectively shrinking them toward zero.

Theorem 4.3.1 (Theorem 2.1, [18]). For each τ ≥ 0 and W ∈ R^{n×d}, the singular value shrinkage operator (4.9) obeys:

P_τ(W) = argmin_X { (1/2) ‖X − W‖_F^2 + τ‖X‖_* }.   (4.10)

The above theorem shows that the singular value shrinkage operator of a matrix is the solution of Eq. (4.10). Equipped with this result, the optimization problem in Eq. (4.8) can be solved by computing the SVD of the updated W_{t−1} and then applying soft thresholding to the singular values:

W_t = P_{λ/η_{t−1}} ( W_{t−1} − (1/η_{t−1}) ∇L(W_{t−1}) ).
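The shrinkage step is only a few lines of linear algebra. The sketch below (illustrative sizes, not the thesis code) implements the operator of Eq. (4.9) and numerically spot-checks the prox characterization of Theorem 4.3.1 against random perturbations of its output.

```python
import numpy as np

def svt(M, tau):
    """Singular value shrinkage P_tau of Eq. (4.9): soft-threshold singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def prox_objective(X, W, tau):
    """Objective of Eq. (4.10): 0.5*||X - W||_F^2 + tau*||X||_*."""
    return 0.5 * np.linalg.norm(X - W) ** 2 + tau * np.linalg.svd(X, compute_uv=False).sum()

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 6))
tau = 0.7
X_star = svt(W, tau)

# Theorem 4.3.1 says X_star is the global minimizer of the prox objective,
# so it should beat any perturbed candidate.
f_star = prox_objective(X_star, W, tau)
for _ in range(100):
    cand = X_star + 0.1 * rng.standard_normal(W.shape)
    assert f_star <= prox_objective(cand, W, tau) + 1e-9
print("P_tau passed the local-optimality spot check")
```

One S2COR iteration is then simply `W = svt(W - G / eta, lam / eta)` for a subgradient `G` of L.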
Now we need to evaluate the (sub)gradient of L(W) at W_{t−1}. The convex function L(W) is not differentiable, so we use a subgradient when updating the solution, which can be computed for the i-th parameter vector w_i as follows:

g_i = ∂L/∂w_i = (1/|I_i^+|) Σ_{j∈I_i^+} I[ ⟨w_i, x_j⟩ − ‖X_i^− w_i‖_∞ ≤ 1 ] ( ∂‖X_i^− w_i‖_∞ − x_j )
            + (1/|I_i^+|) Σ_{j∈I_i^+} I[ ⟨w_i, x_j⟩ − ‖X_i^∘ w_i‖_∞ ≤ 1 ] ( ∂‖X_i^∘ w_i‖_∞ − x_j )
            + (1/|I_i^∘|) Σ_{j∈I_i^∘} I[ ⟨w_i, x_j⟩ − ‖X_i^− w_i‖_∞ ≤ 1 ] ( ∂‖X_i^− w_i‖_∞ − x_j ),   (4.11)

where ∂‖X_i^− w_i‖_∞ denotes a subgradient of the function w ↦ ‖X_i^− w‖_∞ at the point w_i. Since the subdifferential of a maximum of functions is the convex hull of the union of the subdifferentials of the active functions at the point w [83], we have:

∂‖X_i^− w_i‖_∞ = ∂ max_{j∈I_i^−} ⟨w_i, x_j⟩ = conv{ x_j | ⟨w_i, x_j⟩ = ‖X_i^− w_i‖_∞, j ∈ I_i^− }.

Then G = [g_1, g_2, …, g_n]^⊤ ∈ R^{n×d} is a subgradient at W, i.e., G ∈ ∂L(W).

Remark 4.3.2. We note that here, for ease of exposition, we only consider the non-differentiable convex hinge loss as the surrogate of the non-convex indicator function, due to its computational virtues. However, by using other smooth convex surrogate losses, such as the smoothed hinge loss or the exponential loss, one can apply accelerated gradient descent methods [83] to solve the optimization problem, which results in a significantly faster convergence rate compared to naive gradient descent (i.e., an O(1/√ε) convergence rate for the accelerated method, compared to the O(1/ε^2) rate of subgradient descent for non-smooth optimization, where ε is the target accuracy).

4.3.2 Efficient optimization by dropping convexity

The main computational cost in each iteration of the S2COR algorithm lies in computing the SVD of W_t. An alternative, cheaper way to solve the optimization problem in Eq. (4.6) is as follows. For a fixed rank of the target parameter matrix W, say k, one can decompose it as W = UV^⊤. From the equivalence relation between the trace norm and the Frobenius norms of the components of this decomposition,

‖W‖_* = min_{U ∈ R^{n×k}, V ∈ R^{d×k}, W = UV^⊤} (1/2)( ‖U‖_F^2 + ‖V‖_F^2 ),

we can write the objective in terms of these low-rank components as:

min_{U ∈ R^{n×k}, V ∈ R^{d×k}}  (λ/2)( ‖U‖_F^2 + ‖V‖_F^2 ) + L(UV^⊤).   (4.12)

This factored optimization problem does not have the explicit trace norm regularization. However, the new formulation is non-convex and potentially subject to stationary points that are not globally optimal. Despite its non-convexity, the formulation in Eq. (4.12) is competitive with trace-norm minimization, while its scalability is much better. In particular, the objective is not jointly convex in U and V, but it is convex in each of
them when the other is held fixed. Therefore, to find a local solution, one can apply the standard gradient descent method in an iterative manner as follows:

U_{t+1} ← (1 − λη_t) U_t − η_t ∇_U L |_{U=U_t, V=V_t},
V_{t+1} ← (1 − λη_t) V_t − η_t ∇_V L |_{U=U_t, V=V_t}.

Algorithm 4 S2COR+
1: input: λ, γ ∈ R_+: the regularization parameters, {η_t}_{t≥1}: the sequence of scalar step sizes, and S ∈ R^{n×n}: the similarity matrix of users
2: Initialize W_0 ∈ R^{n×d}
3: Choose an appropriate step size
4: for t = 1, …, T do
5:   Compute the sub-gradient G_t ∈ ∂L(W_t) using Eq. (4.11)
6:   Compute Ĝ_t = G_t + 2γ X X^⊤ W_t L
7:   [U_t, Σ_t, V_t] ← SVD(W_{t−1} − (1/η_{t−1}) Ĝ_t)
8:   W_t ← U_t [Σ_t − (λ/η_{t−1}) I]_+ V_t^⊤
9: end for
10: output: W_T

Remark 4.3.3. Note that a large number of users or items may cause computational problems when solving the optimization problem with the GD method. The reason is essentially that computing the gradient at each iteration requires going through all the users and computing the gradient over pairs of items. To alleviate this problem, one can utilize the stochastic gradient method [82] to solve the optimization problem. The main idea is to choose a random subset of pairs for the gradient computation instead of all pairs at each iteration, or to sample a user at random for the gradient computation instead of including all users. We note that this strategy generates unbiased estimates of the true gradient and makes each iteration of the algorithm computationally more efficient compared to its full gradient counterpart.

Chapter 5

PushTrust: An Efficient Recommendation Algorithm by Leveraging Trust and Distrust Relations

The significance of social-enhanced recommender systems is increasing, along with their practicality, as online reviews, ratings, friendship links, and follower relationships become increasingly available. In recent years, there has been an upsurge of interest in exploiting social information, such as trust and distrust relations, in recommendation algorithms. The goal is to improve the quality of suggestions and mitigate the data sparsity and cold-start user problems of existing systems. In this chapter, we introduce a general collaborative social ranking model to rank the latent features of users, extracted from rating data, based on the social context of the users. In contrast to existing social regularization methods, the proposed framework is able to simultaneously leverage trust, distrust, and neutral relations, and has a linear dependency on the social network
size. By integrating the ranking-based social regularization idea into the matrix factorization algorithm, we propose a novel recommendation algorithm, dubbed PushTrust. Our experiments on the Epinions dataset demonstrate that collaboratively ranking the latent features of users by exploiting trust and distrust relations leads to a substantial increase in performance, and effectively deals with the cold-start user problem.

5.1 The Setting

In this section we establish the notation used throughout this chapter and formally describe our problem setting. Scalars are denoted by lowercase letters and vectors by boldface lowercase letters such as u. We use boldface uppercase letters such as A to denote matrices. The Frobenius norm of a matrix A ∈ R^{n×m} is denoted by ‖A‖_F, i.e., ‖A‖_F = (Σ_{i=1}^n Σ_{j=1}^m |A_{ij}|^2)^{1/2}, and its (i, j)-th entry is denoted by A_{i,j}. The transpose of a vector and of a matrix are denoted by u^⊤ and U^⊤, respectively. We use [n] to denote the set of integers {1, 2, …, n}. The set of non-negative real numbers is denoted by R_+. The indicator function is denoted by I[·]. For a vector u ∈ R^p we use ‖u‖_1 = Σ_{i=1}^p |u_i|, ‖u‖_2 = (Σ_{i=1}^p |u_i|^2)^{1/2}, and ‖u‖_∞ = max_{1≤i≤p} |u_i| to denote its ℓ_1, ℓ_2, and ℓ_∞ norms, respectively. The dot product between two vectors u and u′ is denoted by either ⟨u, u′⟩ or u^⊤u′. We use [z]_+ = max(0, z) to denote the positive part of a scalar.

5.1.1 Matrix factorization for recommendation

In collaborative filtering, we assume that there is a set of n users U = {u_1, …, u_n} and a set of m items I = {i_1, …, i_m}, where each user expresses opinions about a set of items. In this chapter, we assume opinions are expressed through an explicit numeric rating (e.g., on a scale from one to five), but other rating methods, such as hyperlink clicks, are possible as well. We are mainly interested in recommending a set of items to an active user such that the user has not rated these items before. To this end, we aim at learning a model from the existing ratings, i.e., an offline phase, and then using the learned model to generate recommendations for active users, i.e., an online phase. The rating information is summarized in an n × m matrix R ∈ R^{n×m}, where the rows correspond to the users, the columns correspond to the items, and the (p, q)-th entry is the rating given by user u_p to item i_q. We note that the rating matrix is only partially observed, and it is sparse in most cases. An efficient and effective
approach to recommender systems is to factorize the user–item rating matrix R as a product of rank-k matrices, R ≈ UV^⊤, where U ∈ R^{n×k} and V ∈ R^{m×k}, and to utilize the factorized user-specific and item-specific matrices, respectively, to predict the missing entries. There are two basic formulations for solving this problem: optimization based (see, e.g., [97, 65, 71, 58]) and probabilistic [80]. Let Ω_R be the set of observed ratings in the user–item matrix R ∈ R^{n×m}, i.e., Ω_R = {(i, j) ∈ [n] × [m] : R_{ij} has been observed}, where n is the number of users and m is the number of items to be rated. In optimization based matrix factorization, the goal is to learn the latent matrices U and V by solving the following optimization problem:

F(U, V) = (1/2) Σ_{(i,j)∈Ω_R} ( R_{i,j} − u_i^⊤ v_j )^2 + λ_U ‖U‖_F^2 + λ_V ‖V‖_F^2.   (5.1)

The optimization problem in Eq. (5.1) consists of three terms: the first term aims to minimize the inconsistency between the observed entries and the corresponding values obtained from the factorized matrices. The last two terms regularize the latent matrices for users and items, respectively. The parameters λ_U and λ_V are regularization parameters introduced to control the regularization of the latent matrices U and V, respectively.

5.1.2 Matrix factorization with social regularization

In this section we review the existing factorization methods that are capable of exploiting the social context of users. The general theme of these methods is to regularize the latent features of users based on their social context. To incorporate the trust relations into the optimization problem formulated in Eq. (5.1), a few papers [70, 52, 72, 65] proposed the social regularization method, which aims at keeping the latent vector of each user similar to those of his/her trusted neighbors in the social network. The proposed models force the user feature vectors to be close to those of their neighbors, in order to be able to learn the latent features of users with no or very few ratings [52]. More specifically, the optimization problem becomes:

F(U, V) = (1/2) Σ_{(i,j)∈Ω_R} ( R_{ij} − u_i^⊤ v_j )^2 + (λ_U/2) ‖U‖_F^2 + (λ_V/2) ‖V‖_F^2 + (λ_S/2) Σ_{i=1}^n ‖ u_i − (1/|N^+(i)|) Σ_{j∈N^+(i)} u_j ‖^2,   (5.2)

where λ_S is the social regularization parameter and N^+(i) is the subset of users who have trust relationships with the i-th user in the social graph.
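To make the optimization in Eq. (5.1) concrete, the following toy sketch (random data and illustrative parameter values, not the Epinions experiments) evaluates the basic objective and its exact gradients and checks that a small gradient step decreases it; the social term of Eq. (5.2) would simply add its own gradient contribution on top.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 12, 15, 3
mask = rng.random((n, m)) < 0.4          # observed set Omega_R
R = rng.standard_normal((n, m)) * mask   # toy ratings (illustrative only)

lam_u = lam_v = 0.1
U = 0.1 * rng.standard_normal((n, k))
V = 0.1 * rng.standard_normal((m, k))

def objective(U, V):
    """Eq. (5.1): squared error on observed entries plus Frobenius regularization."""
    E = mask * (R - U @ V.T)
    return 0.5 * np.sum(E**2) + lam_u * np.sum(U**2) + lam_v * np.sum(V**2)

def gradients(U, V):
    """Exact gradients of Eq. (5.1) with respect to U and V."""
    E = mask * (R - U @ V.T)
    gU = -E @ V + 2 * lam_u * U
    gV = -E.T @ U + 2 * lam_v * V
    return gU, gV

f0 = objective(U, V)
gU, gV = gradients(U, V)
eta = 0.01
f1 = objective(U - eta * gU, V - eta * gV)  # one small descent step
print("objective decreased:", bool(f1 < f0))
```

Iterating such steps, alternating between U and V, is exactly the scheme used later by the PushTrust algorithm (Algorithm 5) once the social regularizer is added.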
In a similar way, one can exploit the distrust relations by forcing the latent features of each user to be as far as possible from the average of the latent features of his/her distrusted friends, formulated as:

F(U, V) = (1/2) Σ_{(i,j)∈Ω_R} ( R_{ij} − u_i^⊤ v_j )^2 + (λ_U/2) ‖U‖_F^2 + (λ_V/2) ‖V‖_F^2 − (λ_S/2) Σ_{i=1}^n ‖ u_i − (1/|N^−(i)|) Σ_{j∈N^−(i)} u_j ‖^2,   (5.3)

where N^−(i) is the subset of users who have distrust relationships with the i-th user in the social graph.

It is remarkable that, while intuition and experimental evidence indicate that trust is somewhat transitive, distrust is certainly not transitive. Therefore, distrust cannot be considered the negative of trust relations in a social network [116]. As a result, the simultaneous exploitation of trust and distrust relationships is not as straightforward as exploiting either trust or distrust relations alone. In a very recent work [35], the authors proposed a matrix factorization based recommender algorithm that is able to exploit both types of relationships in the factorization. The main idea is to learn the latent features for each user u such that the users trusted by u in the social network (with positive edges) are close, and the users distrusted by u (with negative edges) are more distant. Learning the latent features in this way basically induces a ranking of the latent features of each user's neighbors, where the trusted friends appear at the top of the list and the distrusted friends move to the bottom. To formalize this ranking idea, let S̄ denote the set of triplets (i, j, k) where the i-th user trusts the j-th user and distrusts the k-th user, i.e.,

S̄ = { (i, j, k) ∈ [n] × [n] × [n] : S_{ij} = 1 and S_{ik} = −1 }.

Then the factorization objective becomes:

F(U, V) = (1/2) Σ_{(i,j)∈Ω_R} ( R_{ij} − u_i^⊤ v_j )^2 + (λ_U/2) ‖U‖_F^2 + (λ_V/2) ‖V‖_F^2 + (λ_S/2) Σ_{(i,j,k)∈S̄} [ 1 + ‖u_i − u_j‖^2 − ‖u_i − u_k‖^2 ]_+,   (5.4)

where the third term is introduced to penalize the latent features of users who violate the ranking constraints, i.e., it is active whenever a distrusted user u_k is not at least a unit margin farther from u_i than a trusted user u_j.

As mentioned earlier, the simultaneous exploitation of trust and distrust has also been investigated in memory based recommender systems [118, 115]. The main rationale behind the algorithm proposed in [118] is to employ the distrust information to debug, or filter out, the users' propagated web of trust. It has also been realized that the debugging methods must exhibit a moderate behavior in order to be effective. [115] addressed the problem of considering the length of the paths that connect two users when computing the trust–distrust between them, according to the concept of trust decay. This work also introduced several aggregation strategies for trust scores
with variable path lengths.

5.2 The PushTrust Algorithm

In this section we introduce the collaborative social ranking framework for social recommendation which, in contrast to social regularization based methods, is able to incorporate both the trust and the distrust relationships of the social network along with the partially observed rating matrix.

5.2.1 Collaborative social ranking

Before delving into the mathematical formulation, we discuss the main idea behind the proposed PushTrust algorithm. As discussed earlier, to incorporate the social context of users into the factorization model, the latent features of the users must be found such that each user is brought closer to the users she/he trusts and separated from the users that she/he distrusts or has different interests from. We note that simply incorporating this idea into matrix factorization, by naively penalizing the similarity of each user's latent features to his/her distrusted friends' latent features, fails to reach the desired goal. The main reason is that distrust is not as transitive as trust and cannot directly replace trust in trust propagation approaches. Therefore, simultaneously utilizing distrust and trust relations in matrix factorization requires careful consideration (trust is transitive, i.e., if user u trusts user v and v trusts w, there is a good chance that u will trust w; but distrust is certainly not transitive, i.e., if u distrusts v and v distrusts w, then w may be closer to u than v, or maybe even farther away). This observation implies that distrust should not be used as a way to reverse deviations. This statement is consistent with the preliminary experimental results in [116] for memory-based CF methods, which revealed that taking distrust as the negative of trust relations is not the correct way to incorporate distrust into recommender systems. Motivated by this negative result, ranking the latent features of the neighbors of each user based on the type of their relations was shown in [35] to be a suitable way to incorporate both trust and distrust relations. The goal was to construct a ranking of the latent features of all users based on their similarity, such that trusted friends get a higher rank/score than distrusted friends. In particular, from a ranking perspective, the similarity of the latent features induces an ordering over all users, where the trusted users
are pushed to the top of the list. However, the ranking method proposed in [35] suffers from two main issues. First, when the neighbors of a specific user u are considered in the ranking model based on the similarity of their latent features to the latent features of u, the neutral users, who have no relation to u, are ignored. This may cause the neutral users to appear anywhere in the ranking (e.g., above the trusted friends or below the distrusted friends), which affects the quality of the recommendations significantly. More specifically, this contradicts our basic assumption on the effectiveness of social information, which relies on the fact that people tend to rely more on recommendations from people they trust than on recommendations from anonymous people. Therefore, in sharp contrast to [35], which excludes the users who have no relation to a specific user, we also model these users when learning the latent features, by forcing them to appear in the ranked list after the trusted friends yet before the distrusted friends. This matches our intuition, as we expect the contribution of the neutral users to the prediction to be no better than the contribution of the trusted friends and no worse than the contribution of the distrusted friends. Moreover, the number of constraints imposed by the social regularization in the ranking model proposed in [35] increases cubically with the number of users in the network. This growth limits the applicability of their method to small-scale social networks. To resolve this issue, we introduce a novel ranking model that introduces only a quadratic number of constraints into the factorization objective, thus making it more favorable for large-scale social networks. The main intuition behind the ranking model utilized in the PushTrust algorithm is shown in Figure 5.1.

Figure 5.1: The ultimate goal is to extract latent features from the rating matrix R in a way that respects the social context of the users in the social network S. In particular, from each user's perspective, say user u, the goal is to find latent features for u's neighbors such that, when ranked based on their similarity to the latent features of the user u, i.e., u ∈ R^k, the trusted friends N^+(u) are pushed to the top portion of the list, the distrusted friends N^−(u) are pushed to the bottom of the list, and the neutral users N^∘(u) appear in the middle of the ranked list.

To keep our discussion simple and easy to follow, we consider the social regularization for a single
user u, with latent feature vector u ∈ R^k. Let N^+(u) = {v ∈ [n] | S_{uv} = +1} and N^−(u) = {v ∈ [n] | S_{uv} = −1} be the sets of trusted and distrusted neighbors of u in the social network, respectively. Also let N^∘(u) = V \ (N^+(u) ∪ N^−(u)) be the set of users to whom the user u is not socially connected, i.e., N^∘(u) = {v ∈ [n] | S_{uv} = 0}. If we rank the latent features of the users from the viewpoint of user u, we wish the trusted friends N^+(u) to have the maximum similarity to u and to be pushed to the top portion of the ranked list. Similarly, the distrusted friends N^−(u) are expected to have less similarity to u and to appear at the bottom portion of the ranked list. The neutral users who are not connected to user u, i.e., N^∘(u), are potentially either trusted or distrusted. As a result, a conservative decision is to place the latent features of the neutral users after the trusted and before the distrusted friends. To accomplish this goal, we introduce a social regularization term for each user u ∈ U, where the latent features of the user u are penalized based on the deviation of the ranking of u's neighbors from the optimal ranking discussed above.

5.2.2 A convex formulation for social ranking

To simplify the analysis, we rewrite the socially regularized matrix factorization problem as:

F(U, V) = (1/2) Σ_{(i,j)∈Ω_R} ( R_{ij} − u_i^⊤ v_j )^2 + (λ_V/2) ‖V‖_F^2 + (λ_U/2) ‖U‖_F^2 + λ_S Σ_{i=1}^n P(u_i),

where P: R^k → R_+ defines the social regularization of the latent features of an individual user. The formalization of the intuition discussed above is as follows. Let U^+ = {u_1^+, u_2^+, …, u_p^+}, U^− = {u_1^−, u_2^−, …, u_q^−}, and U^∘ = {u_1^∘, u_2^∘, …, u_r^∘} be the sets of latent features of the users in N^+(u), N^−(u), and N^∘(u), respectively, where p = |N^+(u)|, q = |N^−(u)|, and r = |N^∘(u)|. The similarity between the latent features of u and those of a neighbor v in the network is measured by ⟨u, u_v⟩. A naive implementation of the ranking idea is to consider triplets of latent features, which yields a large number of constraints to be satisfied. Here, however, we rely on recent learning-to-rank methods, such as the p-norm push, the infinite push, and the reverse-height push [99, 5], to order the latent features based on their similarity to the latent features of u. In particular, we define a ranking function that tries to put as many trusted friends as possible before the most similar distrusted and neutral friends. As will be revealed later in this section, this formulation results in a significant decrease
Formally, for user $u$ we define the following objective as its social regularization:
$$P(\mathbf{u}) = \frac{1}{2}\|\mathbf{u}\|^2 + \frac{1}{p}\sum_{i=1}^{p} \mathbb{I}\!\left[\langle \mathbf{u}, \mathbf{u}_i^{+}\rangle \le \max_{1 \le j \le q}\langle \mathbf{u}, \mathbf{u}_j^{-}\rangle\right] + \frac{1}{p}\sum_{i=1}^{p} \mathbb{I}\!\left[\langle \mathbf{u}, \mathbf{u}_i^{+}\rangle \le \max_{1 \le j \le r}\langle \mathbf{u}, \bar{\mathbf{u}}_j\rangle\right] + \frac{1}{r}\sum_{i=1}^{r} \mathbb{I}\!\left[\langle \mathbf{u}, \bar{\mathbf{u}}_i\rangle \le \max_{1 \le j \le q}\langle \mathbf{u}, \mathbf{u}_j^{-}\rangle\right], \quad (5.5)$$
where $\mathbb{I}[\cdot]$ is the indicator function, which returns 1 when its input is true and 0 otherwise. The social regularization in Eq. (5.5) consists of four terms. The first term is the standard regularization of the latent features, which is included to simplify the derivation of the objective. The second and third terms aim to put trusted friends at the top of the ranked list based on their similarity to the latent features of $u$. Finally, the last term attempts to put neutral friends before the distrusted friends. Clearly, by minimizing the social regularization for each user, $\min_{\mathbf{u} \in \mathbb{R}^k} P(\mathbf{u})$, it is guaranteed that the extracted latent features respect the social context of individual users discussed in Section 4.1.

The above optimization problem is intractable due to the non-convex indicator function. To design practical learning algorithms, we replace the indicator function in Eq. (5.5) with a convex surrogate. To this end, we define the convex loss function $\ell: \mathbb{R} \mapsto \mathbb{R}_+$ as $\ell(x) = [1 - x]_+$. This is the widely used hinge loss in SVM classification (see, e.g., [15]). (We note that other convex loss functions, such as the exponential loss $\ell(x) = \exp(-x)$ and the logistic loss $\ell(x) = \log(1 + \exp(-x))$, can also be used as surrogates for the indicator function, but for simplicity of derivation we consider only the hinge loss here.) This loss function quantifies the amount by which the constraints are not satisfied. Replacing the non-convex indicator function with this convex surrogate leads to the following tractable convex optimization problem:
$$P(\mathbf{u}) = \frac{1}{2}\|\mathbf{u}\|^2 + \frac{1}{p}\sum_{i=1}^{p} \ell\!\left(\langle \mathbf{u}, \mathbf{u}_i^{+}\rangle - \max_{1 \le j \le q}\langle \mathbf{u}, \mathbf{u}_j^{-}\rangle\right) + \frac{1}{p}\sum_{i=1}^{p} \ell\!\left(\langle \mathbf{u}, \mathbf{u}_i^{+}\rangle - \max_{1 \le j \le r}\langle \mathbf{u}, \bar{\mathbf{u}}_j\rangle\right) + \frac{1}{r}\sum_{i=1}^{r} \ell\!\left(\langle \mathbf{u}, \bar{\mathbf{u}}_i\rangle - \max_{1 \le j \le q}\langle \mathbf{u}, \mathbf{u}_j^{-}\rangle\right). \quad (5.6)$$

Equipped with the social regularization of individual users, we now turn to the matrix factorization objective. To do so, let $U_i^{-} = [\mathbf{u}_{i,1}^{-}, \ldots, \mathbf{u}_{i,q}^{-}]^{\top}$ and $\bar{U}_i = [\bar{\mathbf{u}}_{i,1}, \ldots, \bar{\mathbf{u}}_{i,r}]^{\top}$ denote the matrices of latent features of the distrusted and neutral friends of the $i$th user in the social network, respectively. Here $\mathbf{u}_{i,j}^{-}$ and $\bar{\mathbf{u}}_{i,j}$ are the latent features of the $j$th distrusted and neutral friends of the $i$th user, respectively. Then we have:
$$F(U, V) = \frac{1}{2}\sum_{(i,j) \in \Omega}\left(R_{ij} - \mathbf{u}_i^{\top}\mathbf{v}_j\right)^2 + \frac{\lambda_V}{2}\|V\|_F^2 + \frac{\lambda_U}{2}\|U\|_F^2 + \lambda_S \sum_{i=1}^{n}\frac{1}{p}\sum_{j \in \mathcal{N}^{+}(i)} \ell\!\left(\langle \mathbf{u}_i, \mathbf{u}_{i,j}^{+}\rangle - \|U_i^{-}\mathbf{u}_i\|_\infty\right) + \lambda_S \sum_{i=1}^{n}\frac{1}{p}\sum_{j \in \mathcal{N}^{+}(i)} \ell\!\left(\langle \mathbf{u}_i, \mathbf{u}_{i,j}^{+}\rangle - \|\bar{U}_i\mathbf{u}_i\|_\infty\right) + \lambda_S \sum_{i=1}^{n}\frac{1}{r}\sum_{j \in \bar{\mathcal{N}}(i)} \ell\!\left(\langle \mathbf{u}_i, \bar{\mathbf{u}}_{i,j}\rangle - \|U_i^{-}\mathbf{u}_i\|_\infty\right). \quad (5.7)$$

5.3 Optimization procedure

We now turn to solving the optimization problem in Eq. (5.7). We note that by excluding the ranking constraints from the formulation in Eq. (5.7), we obtain the standard matrix factorization objective as formulated in Eq. (5.1), which is non-convex jointly in $U$ and $V$. However, despite its non-convexity, it is widely used in practical collaborative filtering applications, as its performance is competitive with (or better than) trace-norm minimization, while its scalability is much better. For example, as indicated in [58], to address the Netflix problem, Eq. (5.1) has been applied with a fair amount of success to factorize datasets with 100 million ratings. To find a local solution, one can stick to the standard gradient descent method and find a solution in an iterative manner. The detailed steps of the proposed PushTrust algorithm in its most basic form are shown in Algorithm 5.

Algorithm 5 PushTrust Algorithm [33]
1: Input: $R \in \mathbb{R}^{n \times m}$: the partially observed rating matrix; $\Omega \subseteq [n] \times [m]$: the set of observed ratings in $R$; $S \in \{-1, 0, +1\}^{n \times n}$: the social graph between users; $\lambda_U, \lambda_V, \lambda_S \in \mathbb{R}_+$: the regularization parameters
2: for $t = 1, \ldots, T$ do
3:   for $i = 1, \ldots, n$ do
4:     Compute the gradient $\mathbf{g}_{\mathbf{u}_i}^{t}$ using Eq. (5.8)
5:     Set $\mathbf{u}_i^{t+1} = \mathbf{u}_i^{t} - \eta_t \mathbf{g}_{\mathbf{u}_i}^{t}$
6:   end for
7:   for $j = 1, \ldots, m$ do
8:     Compute the gradient $\mathbf{g}_{\mathbf{v}_j}^{t}$ using Eq. (5.9)
9:     Set $\mathbf{v}_j^{t+1} = \mathbf{v}_j^{t} - \eta_t \mathbf{g}_{\mathbf{v}_j}^{t}$
10:  end for
11: end for
12: Output: $\tilde{R} = U_T V_T^{\top}$, where $U_T = [\mathbf{u}_1^{T}, \ldots, \mathbf{u}_n^{T}]$ and $V_T = [\mathbf{v}_1^{T}, \ldots, \mathbf{v}_m^{T}]$

The optimization is done by alternating minimization, i.e., updating $U$ while keeping $V$ fixed and then updating $V$ while keeping $U$ fixed. The objective function is bi-convex, i.e., convex in one argument while the other is kept fixed. The updates of the latent factors for each user and each item can be done using gradient descent. In particular, a local minimum of the objective function can be found by performing gradient descent in the feature vectors $\mathbf{u}_i$ and $\mathbf{v}_j$. To do so, we compute the (sub)gradient of the objective function $F(U, V)$ with respect to $\mathbf{u}_i$ and $\mathbf{v}_j$ as:
$$\mathbf{g}_{\mathbf{u}_i} = \frac{\partial F}{\partial \mathbf{u}_i} = -\sum_{j=1}^{m} \mathbb{I}_{ij}\left(R_{ij} - \mathbf{u}_i^{\top}\mathbf{v}_j\right)\mathbf{v}_j + \lambda_U \mathbf{u}_i + \frac{\lambda_S}{p}\sum_{j=1}^{p} \mathbb{I}\!\left[\langle \mathbf{u}_i, \mathbf{u}_{i,j}^{+}\rangle - \|U_i^{-}\mathbf{u}_i\|_\infty \le 1\right]\left(\partial\|U_i^{-}\mathbf{u}_i\|_\infty - \mathbf{u}_{i,j}^{+}\right) + \frac{\lambda_S}{p}\sum_{j=1}^{p} \mathbb{I}\!\left[\langle \mathbf{u}_i, \mathbf{u}_{i,j}^{+}\rangle - \|\bar{U}_i\mathbf{u}_i\|_\infty \le 1\right]\left(\partial\|\bar{U}_i\mathbf{u}_i\|_\infty - \mathbf{u}_{i,j}^{+}\right) + \frac{\lambda_S}{r}\sum_{j=1}^{r} \mathbb{I}\!\left[\langle \mathbf{u}_i, \bar{\mathbf{u}}_{i,j}\rangle - \|U_i^{-}\mathbf{u}_i\|_\infty \le 1\right]\left(\partial\|U_i^{-}\mathbf{u}_i\|_\infty - \bar{\mathbf{u}}_{i,j}\right) \quad (5.8)$$
and
$$\mathbf{g}_{\mathbf{v}_j} = \frac{\partial F}{\partial \mathbf{v}_j} = -\sum_{i=1}^{n} \mathbb{I}_{ij}\left(R_{ij} - \mathbf{u}_i^{\top}\mathbf{v}_j\right)\mathbf{u}_i + \lambda_V \mathbf{v}_j, \quad (5.9)$$
where $\mathbb{I}_{ij}$ is the indicator of the observed entries in the rating matrix $R$, i.e., $\mathbb{I}_{ij} = 1$ if $(i,j) \in \Omega$ and 0 otherwise, and $\partial\|U_i^{-}\mathbf{u}_i\|_\infty$ is the subdifferential of the function $\|U_i^{-}\mathbf{u}_i\|_\infty$ at the point $\mathbf{u}_i$. Since the subdifferential of the maximum of functions is the convex hull of the union of the subdifferentials of the active functions at the point $\mathbf{u}$ [83], we have:
$$\partial\|U_i^{-}\mathbf{u}_i\|_\infty = \partial \max_{1 \le j \le q}\langle \mathbf{u}_i, \mathbf{u}_j^{-}\rangle = \mathrm{conv}\left\{\mathbf{u}_j^{-} \,\middle|\, \langle \mathbf{u}_i, \mathbf{u}_j^{-}\rangle = \|U_i^{-}\mathbf{u}_i\|_\infty,\; j \in [q]\right\}.$$
In a similar way, we can compute $\partial\|\bar{U}_i\mathbf{u}_i\|_\infty$. Having computed the gradients, we iteratively update the latent features at iteration $t+1$ by:
$$\mathbf{u}_i^{t+1} \leftarrow \mathbf{u}_i^{t} - \eta_t \mathbf{g}_{\mathbf{u}_i}^{t}, \qquad \mathbf{v}_j^{t+1} \leftarrow \mathbf{v}_j^{t} - \eta_t \mathbf{g}_{\mathbf{v}_j}^{t}.$$
Here, $\eta_t$ is the learning rate, and $\mathbf{g}_{\mathbf{u}_i}^{t}$ and $\mathbf{g}_{\mathbf{v}_j}^{t}$ are the subgradients of the objective with respect to $\mathbf{u}_i$ and $\mathbf{v}_j$ at iteration $t$, respectively. We note that, due to the non-smooth max operator in the social regularization term, one needs to use sub-gradient descent methods to optimize the above objective.

Comparing the objective in Eq. (5.7) with the objective in Eq. (5.4) introduced in [35], a few comments are in order. First, here the neutral friends of users are also incorporated in ranking the latent features. This is more conservative than the formulation in Eq. (5.4), which ignores these types of relations. In particular, in Eq. (5.4) the neutral friends may appear anywhere in the ranking, even before the trusted friends. In Eq. (5.7), by contrast, the neutral friends are pushed to the middle of the ranked list, which is more reasonable under real-world assumptions. Second, as will become clearer in the dual formulation of the objective in Eq. (5.7), the number of constraints in Eq. (5.7) grows quadratically, $O(n^2)$, with the number of users because of the maximum norm used in the formulation, while in Eq. (5.4) the number of constraints is cubic, $O(n^3)$, in the number of users due to the pairwise ranking used to rank the neighbors of each user. This provides a significant computational advantage when we tackle social recommendation problems with a large number of users.

Remark 5.3.1. We note that for large-scale networks, the computation of the gradient at each step of Algorithm 5 might be computationally expensive, and one can use stochastic methods to replace the expensive full gradient computation with a cheap stochastic gradient computation.
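As a rough illustration of the alternating updates in Algorithm 5, the sketch below (our own, assuming NumPy; all names and hyperparameter values are illustrative) implements only the data-fitting and $\ell_2$-regularization parts of the gradients in Eqs. (5.8)-(5.9), omitting the social subgradient terms for brevity:

```python
import numpy as np

def alternating_gd(R, mask, k=5, lam_u=0.1, lam_v=0.1, eta=0.01, T=200):
    """Alternating gradient descent on the matrix factorization objective:
    fit the observed entries of R (mask[i, j] = 1 iff R[i, j] is observed),
    stepping in U with V fixed, then in V with U fixed."""
    n, m = R.shape
    rng = np.random.default_rng(0)
    U = 0.1 * rng.standard_normal((n, k))
    V = 0.1 * rng.standard_normal((m, k))
    for _ in range(T):
        E = mask * (R - U @ V.T)             # residuals on observed entries
        U = U + eta * (E @ V - lam_u * U)    # descent step u_i <- u_i - eta * g_{u_i}
        E = mask * (R - U @ V.T)
        V = V + eta * (E.T @ U - lam_v * V)  # descent step v_j <- v_j - eta * g_{v_j}
    return U, V

# tiny demonstration: fit a rank-2 matrix from ~80% observed entries
rng = np.random.default_rng(1)
R_true = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))
mask = (rng.random((8, 6)) < 0.8).astype(float)
U, V = alternating_gd(R_true, mask, k=2, T=500)
```

Each pass takes a full gradient step in $U$ with $V$ fixed and then in $V$ with $U$ fixed; the hinge-based social term of Eq. (5.8) would contribute an additional subgradient to the $U$ step.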
This can be done by sampling a user and only updating his or her latent features.

Chapter 6

Cold-start Recommendation: Rating Prediction

6.1 Introduction

Due to the popularity and exponential growth of e-commerce websites (e.g., Amazon, eBay) and online streaming websites (e.g., Hulu), a compelling demand has been created for efficient recommender systems to guide users toward items of their interest (e.g., products, books, movies). Recommender systems seek to predict the rating that a user would give to an item and further try to suggest the items that are most appealing to the users. Recently, recommender systems have received a considerable amount of attention and have been the main focus of many research studies [2]. Content-based (CB) filtering and collaborative filtering (CF) are well-known examples of recommendation approaches. As demonstrated by its performance in the KDD Cup [29] and the Netflix competition [13], the most successful recommendation technique used is collaborative filtering. This technique exploits the users' opinions (e.g., movie ratings) and/or purchasing (e.g., watching, reading) history in order to extract a set of interesting items for each user. In factorization-based CF methods, both users and items are mapped into a latent feature space based on observed ratings, and these features are later used to make predictions.

Despite significant improvements in recommendation systems, and in particular in factorization-based methods, these systems suffer from a few inherent limitations and weaknesses, such as data sparsity and cold-start problems. Specifically, in many real-world applications, the rating data are very sparse (i.e., most users do not rate most items, typically resulting in a very sparse rating matrix), or for a subset of users or items the rating data are entirely missing (known as the cold-start user and cold-start item problems, respectively [103]). To address the latter issues, there has been an active line of research during the past decade, and a variety of techniques have been proposed [87, 100, 109, 121, 64, 114, 122, 77, 92, 120, 62]. The studies in the literature have approached the cold-start problem from many different angles, but they commonly exploit auxiliary information about the users and items, in addition to the rating data, that is usually available (see, e.g., [106] for a more recent survey). By leveraging multiple sources of information, one can potentially bridge
the gap between existing items (or users) and new (cold-start) items (or users) to mitigate the cold-start problem.

The main motivation behind these techniques stems from the observation that other sources of data can be used to reveal more information about the underlying patterns between users and items and thus complement the rating data. For instance, knowing the underlying social connections (friends, family, etc.) between users can give us a better understanding of the sources of influence on a user's decisions. Subsequently, the availability of auxiliary information such as users' profile information [2], the social context (trust and distrust relations) of users [35], information embedded in review text [64], and features of items [37] provides tangible benefits to the recommender. Hence, an intriguing research question, which is the main focus of this thesis, is:

How can side information about users and items be effectively incorporated in factorization methods to overcome the cold-start problem?

Although there are many different algorithms in the literature to augment matrix factorization with side information, such as shared subspace learning [102] and kernelized matrix factorization [122], the dominant paradigm in existing methods is to perform (a) the completion of the rating matrix and (b) the transduction of knowledge from existing ratings to cold-start items/users simultaneously. While these methods are able to generate good results in practice, they have notable drawbacks: 1) they propagate the completion and transduction errors repetitively and in an uncontrolled way, and 2) many state-of-the-art algorithms are application-specific, e.g., see [114, 63], and do not offer a general framework for alleviating cold-start problems.

The overarching goal of our work is to answer the above question by proposing an efficient matrix factorization approach using similarity information. In fact, not only do we propose a general framework for dealing with cold-start problems, but we also study a somewhat more general setting where both cold-start users and cold-start items are present, and we address these two challenges simultaneously. In particular, considering the drawbacks of the existing methods, we propose a two-stage algorithm that decouples the completion and transduction stages. First, we exclude the cold-start items and users and complete the rating matrix. Next, we transduce the knowledge to the cold-start users/items using the recovered sub-matrix in addition to the available side information about the users and items. Hence, there is no error propagation between completion and transduction. Interestingly, beyond just dealing with the cold-start problem, the proposed algorithm also provides a new way to exploit side information about users (or items) to mitigate data sparsity and compensate for the lack of rating data. Unlike most existing methods, we also provide a theoretical performance guarantee on the estimation error of our proposed algorithm. We further complement our theoretical results with comprehensive experiments on a few benchmark datasets to demonstrate the merits and advantages of the proposed framework in dealing with cold-start problems. Our results demonstrate the superiority of our proposed framework over several state-of-the-art cold-start recommender algorithms.

6.2 Experiments

In this section, we conduct exhaustive experiments to demonstrate the merits and advantages of the proposed algorithm. We conduct our experiments on three well-known datasets: MovieLens (1M and 100K) (http://www.grouplens.org/node/73), Epinions (http://www.trustlet.org/wiki/Epinions_dataset), and NIPS (http://www.cs.nyu.edu/~roweis/data.html). Using these, we explore several fundamental questions.

Prediction accuracy: How does the proposed algorithm perform in comparison to state-of-the-art algorithms that incorporate side information about users/items? Further, to what degree can the available side information help in making more accurate recommendations of existing items for existing users?

Dealing with cold-start users: How does exploiting similarity relationships between users affect the performance of recommending existing items to cold-start users?

Dealing with cold-start items: How does exploiting similarity information between items affect the performance of recommending cold-start items to existing users?

Dealing with cold-start users and items simultaneously: How does exploiting similarity information between users and items affect the performance of recommending cold-start items to cold-start users?

6.2.1 Datasets

Since our proposed general framework is an applicable solution for both cold-start and network completion problems, we selected well-known real datasets for each problem. We used MovieLens and Epinions as two samples of the cold-start problem and used the NIPS dataset as a network completion problem.

MovieLens dataset. We used two of the well-known MovieLens datasets, 100K and 1M, which have side information for both users and movies (items), making them ideal for our experiments. The statistics of the MovieLens datasets are given in Table 6.1.

Epinions dataset. This dataset is obtained from a user-oriented product review website that has a trust network of users. Users can specify whether they trust other users or not. We use this trust network as the side information for the users. The statistics of Epinions are also shown in Table 6.1.

NIPS dataset. We have also applied our algorithm to the paper-author and paper-word matrices extracted from the co-author network at the NIPS conference [98]. The contents of the papers are pre-processed such that all words are converted to lowercase and stop-words are removed. We compute the cosine similarity of the TF-IDF-weighted vector representation of each paper with those of all other papers. The statistics of this dataset are shown in Table 6.1.

6.2.2 Metrics

We adopt the widely used Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) metrics for prediction accuracy [50]. Let $T$ denote the set of ratings that we want to predict, i.e., $T = \{(i,j) \in [n] \times [m] \mid R_{ij} \text{ needs to be predicted}\}$, and let $\widehat{R}$ denote the prediction matrix obtained by a recommendation algorithm. Then:
$$\mathrm{MAE} = \frac{\sum_{(i,j) \in T} |R_{ij} - \widehat{R}_{ij}|}{|T|}.$$
The RMSE metric is defined as:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{(i,j) \in T} \left(R_{ij} - \widehat{R}_{ij}\right)^2}{|T|}}.$$
Given a ranked list, let $r_i$ be the relevance score of the item ranked at position $i$, where $r_i = 1$ if the item is relevant to the user and $r_i = 0$ otherwise. The NDCG measure is a normalization of the Discounted Cumulative Gain (DCG) measure, a weighted sum of the degrees of relevancy of the ranked items. The value of NDCG lies in $[0, 1]$, and at position $k$ it is defined as:
$$\mathrm{NDCG@}k = \frac{1}{Z_k}\sum_{i=1}^{k} \frac{2^{r_i} - 1}{\log(i+1)},$$
where $Z_k$ normalizes the ideal ranking to 1. In all our experiments, we set the value of $k$ to the number of rated items of each user when calculating the NDCG.

Table 6.1: Statistics of real datasets used in our experiments.

Statistics                            MovieLens 100K   MovieLens 1M   Epinions   NIPS
Number of users                       943              6,040          8,577      2,073
Number of items                       1,682            3,706          3,769      1,740
Number of ratings                     100,000          1,000,209      203,275    3,990
Range of ratings                      1-5              1-5            1-5        0-1
Maximum number of ratings by users    737              2,314          4,164      3
Maximum number of ratings for items   583              3,428          1,236      10
Average number of ratings by users    106.04           165.59         23.70      1.95
Average number of ratings for items   59.45            257.58         53.93      2.29

Figure 6.1: RMSE on the synthetic dataset for different values of p and q. [figure]

6.2.3 Experiments on a synthetic dataset

In this section we present the results of our proposed algorithm on a synthetic dataset, which we generated to evaluate our approach before moving on to real datasets. First we generate two matrices $U \in [0,1]^{4{,}000 \times r}$ and $V \in [0,1]^{2{,}000 \times r}$. Using $U$ and $V$, we generate a rating matrix $R_{4{,}000 \times 2{,}000} = UV^{\top}$ that includes 4,000 users and 2,000 items. We then generate a similarity matrix $A_{4{,}000 \times 4{,}000} = UU^{\top}$ for the users and a similarity matrix $B_{2{,}000 \times 2{,}000} = VV^{\top}$ for the items, and add random noise to all elements of $A$ and $B$, where the noise follows a Gaussian distribution $\mathcal{N}(0, 0.5)$. We consider $A$ and $B$ as the two similarity matrices between users and items, respectively.

In this study we tried different values for $p$ within the range of 100 to 2,000 with a step size of 100, and different values for $q$ within the range of 100 to 1,000 with the same step size. To generate cold-start users and items in our dataset, we divided the dataset into 4 partitions. We divided the users into two groups of $p$ existing users and $n - p$ cold-start users. We also selected $q$ items as existing items and $m - q$ as cold-start items. Hence, partitions 1 and 2 have $p$ users and partitions 3 and 4 have $n - p$ cold-start users; partitions 1 and 3 have $q$ items and partitions 2 and 4 have $m - q$ cold-start items.

The RMSE on test data for different combinations of $p$ and $q$ values is shown in Figure 6.1. It shows that by increasing $p$ and $q$, the RMSE decreases. Hence, the fewer unobserved elements we have, the lower the error. Considering the fact that our theoretical upper bound on the error decreases as $p$ and $q$ increase, this observation is completely consistent with our theoretical upper bound.

For our synthetic dataset, we added noise to the similarity matrices following a zero-mean Gaussian distribution with a variance of 0.5. To investigate the effects of the noise variance, we vary the variance from 0.1 to 1 with a step size of 0.1 and calculate the accuracy of our algorithm. As Figure 6.2 shows, increasing the noise variance increases both the RMSE and MAE on the test data; hence, we can see the stability of RECT with respect to noise.

Figure 6.2: RMSE and MAE on the synthetic dataset for different values of the noise variance. [figure]
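The synthetic protocol above (a low-rank $R = UV^{\top}$ together with noisy similarity matrices $A$ and $B$) can be sketched as follows; this is our own illustration with scaled-down dimensions, assuming NumPy:

```python
import numpy as np

def make_synthetic(n=400, m=200, r=10, noise_var=0.5, seed=0):
    """Low-rank ratings R = U V^T plus noisy similarity matrices A, B,
    following the synthetic protocol above (scaled-down dimensions)."""
    rng = np.random.default_rng(seed)
    U = rng.random((n, r))            # user latent factors in [0, 1]
    V = rng.random((m, r))            # item latent factors in [0, 1]
    R = U @ V.T                       # ground-truth rating matrix
    sd = np.sqrt(noise_var)           # N(0, noise_var) element-wise noise
    A = U @ U.T + rng.normal(0.0, sd, size=(n, n))  # noisy user similarities
    B = V @ V.T + rng.normal(0.0, sd, size=(m, m))  # noisy item similarities
    return R, A, B

R, A, B = make_synthetic()
```

Splitting the $n$ users into $p$ existing / $n - p$ cold-start users and the $m$ items into $q$ existing / $m - q$ cold-start items then yields the four partitions described above.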
6.2.4 Experiments setup

To better evaluate the effects of utilizing the features of users and items, we studied the following recommendation scenarios.

I [existing users / existing items]: There are many different algorithms for predicting the ratings of existing users on those existing items that they have not rated yet. We randomly removed 20% of the available ratings from the rating matrix and used the rest of the ratings for training to predict the removed values.

II [existing users / cold-start items]: In this case we have a number of cold-start items with no cold-start users. To simulate this scenario, we considered 80% of the items as training items and the remaining 20% as cold-start items. We then tried to predict the ratings of existing users on the cold-start items.

III [cold-start users / existing items]: Often, in a recommender system, a vast majority of the users are new users. To simulate the cold-start user scenario, we divided the users into a disjoint training (80%) set and a test (20%) set. We then predicted the ratings of cold-start users on the existing items.

IV [cold-start users / cold-start items]: In this scenario we divided both users and items into two subsets of existing and cold-start users and items, respectively. We aimed to recommend the cold-start items to the cold-start users, which is the most challenging form of the cold-start problem.

We also analyzed the running time needed for training our algorithm and the cold-start baseline algorithms to better evaluate the cold-start scenarios. It is also worth mentioning that we tuned the parameters of our algorithm by running it on the training sets, and we report these values with each result.

6.2.5 The baseline algorithms

To evaluate the performance of our proposed RECT algorithm, we considered a variety of baseline approaches. The baseline algorithms are chosen from four categories: (i) state-of-the-art algorithms for rating prediction, (ii) those that can only deal with cold-start items, (iii) ones that can deal with cold-start users, and (iv) algorithms that are capable of dealing with both cold-start items and users. In particular, we considered the following algorithms (some of the baseline algorithms used in our experiments are implemented in the MyMedia recommender framework [38]):

RS (Random Strategy) [66]: A simple baseline that selects a subset of users or items at random. Recommendation for cold-start users and items is a challenging case for which RS is one of the baseline methods.
U-KNN (User KNN) [57]: Predicts the ratings using the similarity with the K nearest neighbors, where users have weights.
I-KNN (Item KNN) [57]: A weighted item-based KNN approach for rating prediction.
GA (Global Average): Uses the average over all ratings.
I-A (Item Average): Uses the average rating of an item for its prediction.
U-A (User Average): Uses the average rating of a user for its prediction.
SO (Slope One) [60]: Pre-computes the average difference between two items that are rated by users. SO is a frequency-weighted slope-one rating predictor.
BPSO (Bi-Polar Slope One) [60]: A bi-polar frequency-weighted slope-one rating predictor.
SMF (Social Matrix Factorization) [51]: A matrix factorization algorithm that incorporates the social network for prediction.
CC (Co-Clustering) [39]: A weighted co-clustering algorithm that involves simultaneous clustering of users and items.
LFLLM (Latent Feature Log-Linear Model) [78]: A scalable log-linear model that exploits side information for dyadic prediction.
U-I-B (User Item Baseline) [57]: A rating prediction method that uses the average rating value along with a regularized user and item bias.
FWMF (Factor-Wise Matrix Factorization): A matrix factorization based model with a factor-wise learning scheme.
BMF (Biased Matrix Factorization) [96]: A matrix factorization model learned by stochastic gradient descent with explicit biases for users and items.
SVDPP (SVD++) [56]: A matrix factorization model that takes into account what users have rated, directly modeling users and items.
SSVDPP (Sigmoid SVD++) [56]: A variant of SVD++ that uses a sigmoid function.
SU-AFM (Sigmoid User Asymmetric Factor Model) [89]: An asymmetric factor model that represents items in terms of the users that rated them.
SI-AFM (Sigmoid Item Asymmetric Factor Model) [89]: An asymmetric factor model that represents users in terms of the items they rated.
SCAFM (Sigmoid Combined Asymmetric Factor Model) [89]: An asymmetric factor model that represents items in terms of the users that rated them, and users in terms of the items they rated.
CBF (Content-Based Filtering) [102]: Builds a profile for each user based on the properties of the user's past preferred items.
LCE (Local Collective Embeddings) [102]: A matrix factorization method that exploits properties of items and past user preferences while enforcing the manifold structure exhibited by the collective embeddings. LCE can be fed with properties of either items or users to deal with cold-start items or users, respectively. We also conduct experiments with a version of LCE without Laplacian regularization, referred to as LCE-NL [102].
KMF (Kernelized Matrix Factorization) [122]: A matrix completion based algorithm that incorporates external side information about the users or items into the matrix factorization process.
RECT: Our proposed algorithm.

Table 6.2: Synthetic dataset results (MAE / RMSE of RECT for different training fractions).

Setting        Training (10%)    Training (20%)    Training (30%)    Training (40%)    Training (50%)
               MAE     RMSE      MAE     RMSE      MAE     RMSE      MAE     RMSE      MAE     RMSE
Scenario I     0.0353  0.0020    0.0345  0.0018    0.0345  0.0018    0.0341  0.0018    0.0340  0.0018
Scenario II    0.0353  0.0020    0.0345  0.0018    0.0345  0.0018    0.0341  0.0018    0.0340  0.0018
Scenario III   0.0345  0.001     0.0342  0.0019    0.0342  0.0019    0.0343  0.0019    0.0342  0.0019
Scenario IV    0.0349  0.0019    0.034   0.0018    0.0346  0.0019    0.0343  0.0019    0.0344  0.0018

6.2.6 Existing Users / Existing Items

Scenario I is the standard case for recommender systems, where all users and items have a history of rating and being rated, respectively. Each user usually rates only a number of items, and there are no ratings for the other items. The goal here is to predict the ratings for those unrated items. There are many different techniques for predicting the ratings, but we can divide them into two major categories: neighbor-based approaches and latent factor models. We selected our baseline algorithms from each of these categories for our comparison. We applied RECT to the MovieLens 100K, MovieLens 1M, and Epinions datasets. GroupLens Research (grouplens.org) made five 80%/20% splits of MovieLens 100K into training and test data available. We also split MovieLens 1M and Epinions into 80%/20% randomly 5 times to conduct 5-fold cross-validation and recorded the average values of our metrics.

Table 6.3 shows the average RMSE and MAE over 5-fold cross-validation for all baseline algorithms and RECT; the smallest RMSE and MAE values in each column are indicated in boldface. The results suggest that among neighbor-based approaches, I-KNN is the best-performing algorithm for MovieLens 1M and 100K, and U-KNN is the second best. That is because the similarity between movies (genre, director, etc.) and the similarity between users' tastes play an important role in prediction accuracy. I-KNN and U-I-B outperformed the other neighbor-based methods on Epinions with respect to RMSE, while BPSO and I-KNN outperformed the others with respect to MAE. The similarity of items, and the same pattern of ratings for similar items, on Epinions helped I-KNN perform better than the other neighbor-based methods.

Table 6.3 also compares the latent factor methods, among which RECT (respectively, SVDPP) achieved the best (second best) RMSE and MAE for MovieLens 100K and the best (second best) RMSE for MovieLens 1M; RECT achieved the second best MAE for MovieLens 1M. On Epinions, too, RECT outperformed the other latent factor methods. Table 6.3 clearly shows that RECT achieved the best performance on all datasets among the methods of both categories, latent factor and neighbor-based (except for MAE on MovieLens 1M), confirming the performance advantage of RECT over all baseline algorithms. Hence, our proposed decoupled method, by incorporating the auxiliary information, reveals the need for preventing error propagation along with using side information to obtain more accurate predictions.

Table 6.3: Results of scenario I on MovieLens 100K, MovieLens 1M, and Epinions (RMSE / MAE per dataset).

Neighbor-Based Methods
Algorithm  Hyperparameters                              MovieLens 100K     MovieLens 1M       Epinions
GA         --                                           1.1190 / 0.9399    1.1160 / 0.9327    1.1692 / 0.8878
I-A        --                                           1.0220 / 0.8159    0.9759 / 0.7790    1.0695 / 0.8140
U-A        --                                           1.0390 / 0.8350    1.0340 / 0.8272    1.1276 / 0.8769
U-KNN      k=80                                         0.9355 / 0.7398    0.8952 / 0.7030    2.3999 / 2.2229
I-KNN      k=80, sh=10, λu=25, λv=10                    0.9241 / 0.7270    0.8711 / 0.6830    1.0279 / 0.6993
U-I-B      λu=5, λv=2                                   0.9419 / 0.7450    0.9081 / 0.7190    1.0290 / 0.8010
SO         --                                           0.9397 / 0.7403    0.9020 / 0.7120    1.0865 / 0.7067
BPSO       --                                           0.9744 / 0.7482    0.9390 / 0.7199    1.0449 / 0.6813
CC         Ci=3, Cu=3, T=30                             0.9559 / 0.7526    0.9118 / 0.7134    1.0573 / 0.7890
Latent Factor Models
SMF        p=10, λu=0.015, λv=0.015, λb=0.01,
           s=1, η=0.01, b=1, T=30                       1.0134 / 0.7884    1.2284 / 0.9315    1.1224 / 0.8436
FWMF       p=5, T=5, sh=150                             0.9212 / 0.7252    0.8601 / 0.6730    1.5090 / 1.0597
BMF        p=160, b=0.003, η=0.07, T=100,
           λu=0.08, λv=0.1                              0.9104 / 0.7194    0.8540 / 0.6760    1.0240 / 0.7918
MF         p=10, T=75, g=0.05, η=0.005                  0.9133 / 0.7245    0.8570 / 0.6751    1.0908 / 0.8372
LFLLM      p=10, b=0.01, λu=0.015, λv=0.015,
           η=0.01, T=30, b=1                            0.9550 / 0.7617    0.9012 / 0.7082    1.2891 / 1.0386
SI-AFM     b=0.7, g=0.015, η=0.001, p=10,
           b=0.33, T=1                                  0.9568 / 0.7628    1.0350 / 0.8488    1.1534 / 0.8816
SU-AFM     b=0.7, g=0.015, b=0.33, T=1,
           η=0.001, p=10                                0.9569 / 0.7634    0.9062 / 0.7189    1.0690 / 0.8398
SCAFM      --                                           0.9499 / 0.7559    0.9121 / 0.7239    1.0600 / 0.8312
SVDPP      b=0.07, g=1, b=0.005, p=50,
           η=0.01, T=50                                 0.9065 / 0.7135    0.8510 / 0.6680    1.0550 / 0.8220
SSVDPP     b=0.7, T=30, p=10, g=0.015,
           η=0.001, b=0.33                              1.1850 / 0.9147    0.9402 / 0.7352    1.3328 / 0.9022
RS         --                                           1.6960 / 1.3860    1.7070 / 1.3940    1.9096 / 1.5789
RECT       r=10                                         0.7002 / 0.6628    0.7157 / 0.6721    0.7157 / 0.6796

6.2.7 Existing Users / Cold-start Items

To simulate cold-start item problems (scenario II), we divided the items into two disjoint training and test subsets. We used 80% of the items as existing items for training and the remaining 20% as cold-start items for testing. As we mentioned earlier, our general framework can be employed to deal with link prediction challenges as well. In order to kill two birds with one stone, we selected the NIPS dataset not only to simulate the cold-start item scenario, but also to show the results of RECT on a link prediction challenge. NIPS has rich side information for the items (papers) and records the relationship (0 or 1) between papers and authors. Since the values in NIPS are either 0 or 1, predicting the authors of new papers can be perceived as a link prediction problem.

We compared our algorithm with four competitive recommendation methods on the NIPS dataset: CBF, KMF, LCE, and LCE-NL. Table 6.4 shows the average RMSE and MAE over 5-fold cross-validation for these algorithms in the cold-start item scenario. The parameter settings of each algorithm (the details of the parameters are explained in [38]) are also given in Table 6.4 for reproducing the experiments. The results indicate that RECT is the best-performing algorithm among all baselines in the cold-start item scenario with respect to RMSE, MAE, and NDCG. Having the highest NDCG value among all competing algorithms shows that RECT can present the top-ranked items to users better than the other algorithms. RECT also has the lowest RMSE and MAE values, from which we can conclude that its rating predictions for cold-start items are more accurate.
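The NDCG@k values reported in Table 6.4 follow the definition in Section 6.2.2; the sketch below (our own, assuming NumPy and the common base-2 logarithm convention for the discount) computes it from a binary relevance vector ordered by predicted rank:

```python
import numpy as np

def ndcg_at_k(rel, k):
    """NDCG@k for a binary relevance vector `rel`, ordered by predicted rank:
    DCG@k = sum_i (2^{r_i} - 1) / log2(i + 1), normalized by the ideal DCG Z_k."""
    rel = np.asarray(rel, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))   # log2(i + 1), i = 1..k
    dcg = ((2.0 ** rel - 1.0) / discounts).sum()
    ideal = np.sort(rel)[::-1]                        # best possible ordering
    idcg = ((2.0 ** ideal - 1.0) / discounts).sum()   # the normalizer Z_k
    return dcg / idcg if idcg > 0 else 0.0
```

A perfect ranking (all relevant items first) scores 1.0; pushing relevant items down the list lowers the score, which is why top-of-the-list quality dominates this metric.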
Hence, RECT can better suggest the top-ranked cold-start items to users with higher accuracy.

Table 6.4 also shows the running times of the algorithms. Here, CBF takes the least time to generate its results because it only needs to build the user profiles. It is worth mentioning that KMF and RECT are comparably fast, but LCE and LCE-NL take much more time due to their convergence conditions.

Table 6.4: Results of the cold-start scenarios on real datasets.

DS              Algorithm  Hyperparameters                         NDCG@k   RMSE     MAE      Time (s)
NIPS            CBF        --                                      0.3861   0.7943   0.8881   0.17597
                LCE        k=500, g=0.5, ε=0.001, Tm=500, λ=0.05   0.4240   0.7692   0.8675   709.869
                LCE-NL     k=500, g=0.5, ε=0.001, Tm=500, λ=0      0.4186   0.7532   0.8562   1823.48
                KMF        σr=0.4, D=10, η=0.003, λ=0.1            0.1415   0.8804   0.9196   19.5413
                RECT       r=1000                                  0.4626   0.5111   0.6805   23.8410
Epinions        CBF        --                                      0.2201   0.6644   0.7741   4.4800
                LCE        k=500, g=0.5, ε=0.001, Tm=500, λ=0.05   0.2327   0.6713   0.7786   1067.11
                LCE-NL     k=500, g=0.5, ε=0.001, Tm=500, λ=0      0.2319   0.6712   0.7785   11969.9
                KMF        σr=0.4, D=10, η=0.003, λ=0.1            0.2084   0.8522   0.8882   1196.32
                RECT       r=1063                                  0.2343   0.6618   0.7716   144.660
MovieLens 100K  RS         --                                      0.1022   1.1981   0.9782   0.0131
                KMF        σr=0.4, D=10, η=0.003, λ=0.1            0.2423   0.9823   0.8730   11.540
                RECT       r=100                                   0.2641   0.8672   0.7230   4.8729
MovieLens 1M    RS         --                                      0.0652   1.3820   0.9326   0.0682
                KMF        σr=0.4, D=10, η=0.003, λ=0.1            0.1834   0.9730   0.8442   63.732
                RECT       r=100                                   0.2783   0.8524   0.7162   10.578

6.2.8 Cold-start Users / Existing Items

To simulate cold-start user problems (scenario III), we divided the users into two disjoint subsets for training and testing. We used 80% of the users for training and the remaining 20% for testing the cold-start user scenario. To put the results of RECT in context, we compare it with the competitive algorithms CBF, LCE, LCE-NL, and KMF. Since Epinions has a trust network among its users, which is useful side information, we chose it to simulate this scenario. Table 6.4 shows the averaged (5-fold cross-validation) performance of these algorithms, and it clearly shows that RECT achieved the best NDCG, RMSE, and MAE on the Epinions dataset. Considering the NDCG values, RECT outperforms all other methods, while the NDCG of LCE and LCE-NL remains close. RECT, LCE, and LCE-NL also outperform the other methods with respect to RMSE and MAE. In this case, while RECT, LCE, and LCE-NL behave similarly, the major difference between RECT and LCE(-NL) is the running time, where CBF outperforms RECT and RECT outperforms the rest of the methods. Though there are no major differences in the values of these metrics between RECT and the others, the consistency in outperforming the other competitive methods on both the cold-start user and item scenarios confirms the stability and performance advantage of RECT over current state-of-the-art algorithms. Again, as CBF only needs to build the user profiles, it has the shortest running time. Our algorithm has the second fastest running time, but it should be noted that the other algorithms take substantially longer to execute than either RECT or CBF.

6.2.9 Cold-start Users / Cold-start Items

To simulate handling both cold-start users and items (scenario IV), we randomly selected 20% of the users and 20% of the items as cold-start users and items, respectively. In this scenario we tried to predict the ratings of cold-start users on cold-start items. Since there are no historical ratings for either these users or these items, we compared the results of RECT on this challenging scenario with only RS and KMF. We did not include LCE, LCE-NL, and CBF, as they are not applicable in this scenario. Table 6.4 shows the results of applying the RS, KMF, and RECT algorithms on MovieLens 100K and 1M. On both datasets, RECT outperformed RS and KMF. RS generally performs worse than the other algorithms in cold-start user/item scenarios, which is a common problem for newly launched websites; this shows that it is necessary to carefully use the similarity information of users and items to obtain more accurate recommendations. Therefore, RECT is a novel algorithm that can deal with all three types of cold-start problems. We would also like to note that RECT is able to accurately predict an initial guess of an item's popularity for cold-start items in online recommender systems.

Chapter 7

Cold-start Recommendation: Ranking Prediction

7.1 Introduction

Due to the popularity and exponential growth of e-commerce and online streaming websites, a compelling demand has been created for efficient recommender systems to guide users toward items of their interest (e.g., products, books, movies) [2]. In collaborative methods, whether filtering or ranking, by relying on the low-rank assumption on the users' preferences, both users and items are mapped into a latent feature
spacebasedonpartiallyobservedratingsthatarelaterusedtomakepredictions.Incollaborative(CF)methodssuchasmatrixfactorization[58],wheretheaimistoaccuratelypredicttheratings,thelatentfeaturesareextractedinawaytominimizethepredictionerrormeasuredintermsofpopularperformancemeasuressuchasrootmeansquareerror(RMSE).InsparkcontrasttoCF,incollaboratingranking(CR)models[58,22,119,25],wherethegoalistoranktheunrateditemsintheorderofrelevancetotheuser,thepopularrankingmeasuressuchasasdiscountedcumulativegain(DCG),normalizeddiscountedcumulativegain(NDCG),andaverageprecision(AP)[54]areoftenemployedtocollaborativelylearnarankingmodelforthelatentfeatures.RecentstudieshavedemonstratedthatCRmodelsleadtotlyhigherranking89accuracyovertheirtraditionalCFcounterpartsthatoptimizeratingprediction.Thisisimportantconsideringthefactthatwhatwereallycareinrecommendationisnottheactualvaluesofratings,buttheorderofitemstoberecommendedtoaspuser.Therefore,theerrormeasuressuchasRMSEareoftenhopelesslyt,astheirplaceequalemphasisonalltheratings.Amongrankingmodels,themethodsthatmainlyconcentrateonthetopofthelisthavereceivedaconsiderableamountofattention,duetothehigherprobabilityofexaminingthetopportionofthelistofrecommendationsbyusers.Therefore,theintroductionofrankingmetricssuchaspushnormornorm[99,5,22,59],sparkedawidespreadinterestinCRmodelsandhasbeenproventobemoreeinpractice[112,22].Thegeneralrecommendationsettingconsistsofapoolofitemsandapoolofusers,whereuserfeedbacksuchasratingsareavailableforvaryingsubsetsofitems.Recommendersystemsseektopredicttheratingthatauserwouldgivetoanitemandfurthertrytosuggestitemsthataremostappealingtotheusers.Recently,recommendersystemshavereceivedaconsiderableamountofattentionandhavebeenthemainfocusofmanyresearchstudies[2].AsdemonstratedbyitsperformanceintheKDDCup[29]andcompetition[13],currentlythemostsuccessfulrecommen-dationtechniqueusediscollaborativeInfactorizationbasedCFmethods[58],bothusersanditemsaremappedintoalatentfeaturespacebasedonobservedratingsthatarelaterusedtoma
kepredictions.AlthoughCRmodelsforrecommendersystemshasbeenstudiedextensivelyandsomeprogresshasbeenmade,however,thestateofairsremainsunsettled:theissueofhandlingcold-startitemsinrankingmodelsandcopingwithnotmissingatrandomassumptionofratingsareelusiveopenissues.First,inmanyrealworldapplications,theratingdata90areverysparse(e.g.,thedensityofthedataisaround1%formanypubliclyavailabledatasets)orforasubsetofusersoritemstheratingdataisentirelymissing(knowsascold-startuserandcold-startitemproblem,respectively)[103].Second,collaborativeandrankingmodelsrelyonthecriticalassumptionthatthemissingratingsaresampleduniformlyatrandom.However,inmanyrealapplicationsofrecommendersystems,thisassumptionisnotbelievedtohold,asinvariablysomeusersaremoreactivethanothersandsomeitemsareratedbymanypeoplewhileothersarerarelyrated[111].Theseissueshavebeeninvestigatedinfactorizationbasedmethods,nonetheless,itisnotstraightforwardtoadaptthemtoCRmodelsandareleftopen[22].Motivatedbythesechallenges,weaskthefollowingfundamentalquestioninthecontextofcollaborativerankingmodels:Isitpossibletoctivelylearnacollaborativerankingmodelinthepresenceofcold-startitems/usersthatisrobusttothesamplingofobservedratings?Inthisthesis,wegiveantiveanswertotheabovequestion.Inparticular,weintroduceasemi-supervisedcollaborativerankingmodel,dubbedS2COR,byleveragingsideinformationaboutbothobservedandmissingratingsincollaborativelylearningtherankingmodel.Inthelearnedmodel,unrateditemsareconservativelypushedaftertherelevantandbeforetheirrelevantitemsintherankedlistofitemsforeachindividualuser.Thiscrucialgreatlybooststheperformanceandlimitsthebiascausedbylearningonlyfromsparsenon-randomobservedratings.Wealsointroduceagraphregularizationmethodtoexploitthesideinformationaboutuserstoovercomethecold-startusersproblem.Insummary,thekeyfeaturesofS2CORare:Inspiredbyrecentdevelopmentsinrankingattop[99,5,?],theproposedmodelisacollaborativerankingmodelthatprimarilyfocusesonthetopoftherecommendation91listforeachuser.Moreover,instarkcontrastto
- It leverages side information about items with both observed and missing ratings while collaboratively learning the ranking model, which enables it to effectively incorporate the available side information in knowledge transduction.

- By incorporating the unrated items in ranking, it limits the bias caused by learning solely from the observed ratings and consequently deals with the not-missing-at-random issue of ratings.

- It is also able to leverage the similarity information between users through a graph regularization method to make high-quality recommendations for users with few ratings, or cold-start users without any rating information.

To build intuition on how incorporating missing ratings makes S2COR beneficial in handling the cold-start problem and mitigating the data sparsity issue, we note that in many real-world applications the available feedback on items is extremely sparse, and therefore ranking models fail to effectively leverage the available side information in transducing the knowledge from existing ratings to unobserved ones. This problem becomes especially prominent when surrogate ranking models, such as pairwise models, are used for their computational virtues, since there the unobserved ratings do not play any role in learning the model. By leveraging rich sources of information about all items, one can potentially bridge the gap between existing items and new items to overcome the cold-start problem.

Turning to the non-random sampling of observed ratings, we note that the non-randomness in observing the ratings creates a bias in learning the model that negatively impacts future predictions and may degrade the resulting recommendation accuracy if ignored. Therefore, the nature of the missing ratings has to be modeled precisely in order to obtain correct results. To reduce the effect of this bias, the proposed ranking model takes a conservative approach and pushes the items with unknown ratings to the middle of the ranked list, i.e., after the relevant and before the irrelevant items. This is equivalent to assuming a prior about the unknown ratings, which is believed to perform well, as investigated in [28]. However, unlike [28], the proposed ranking idea is free of deciding an explicit value for the missing ratings, which makes it more valuable from a practical point of view.
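The conservative ordering described above (relevant items first, then unrated, then irrelevant) can be illustrated with a minimal sketch. The function name and example lists below are hypothetical; this checks the target ordering only and is not the S2COR objective itself:

```python
# Labels: +1 = relevant, 0 = unrated, -1 = irrelevant.
# The conservative push asks each user's ranked list (best first) to place
# relevant items before unrated ones, and unrated before irrelevant ones.

def respects_conservative_order(ranked_labels):
    """True iff no lower-preference label precedes a higher-preference one."""
    return all(a >= b for a, b in zip(ranked_labels, ranked_labels[1:]))

print(respects_conservative_order([1, 1, 0, 0, -1]))  # True
print(respects_conservative_order([1, -1, 0]))        # False: irrelevant before unrated
```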
We conduct thorough experiments on real datasets and compare our results with state-of-the-art models for cold-start recommendation, to demonstrate the effectiveness of our proposed algorithm in recommendation at the top of the list and in mitigating the data sparsity issue. Our results indicate that our algorithm outperforms the other algorithms and provides recommendations of higher quality compared to the state-of-the-art methods.

7.2 Experiments
In this section, we conduct exhaustive experiments to demonstrate the merits and advantages of the proposed algorithm. We conduct our experiments on three well-known datasets: MovieLens, Amazon and CiteULike. We investigate how the proposed S2COR performs in comparison to the state-of-the-art methods. In the following, we introduce the datasets that we use in our experiments and then the metrics that we employ to evaluate the results, followed by our detailed experimental results on the real datasets.

In the following subsections, we intend to answer these key questions:

- Ranking versus rating: How does optimizing a ranking-based loss function, versus the squared loss function, affect the recommendation performance?
- Employing missing ratings: How does employing the missing ratings help in making more accurate recommendations?
- Dealing with cold-start items: How does the proposed algorithm, by incorporating side information about items, perform in comparison to the state-of-the-art algorithms for dealing with cold-start items?

7.2.1 Datasets
We use the following well-known datasets to evaluate the performance of S2COR:

ML-IMDB. ML-IMDB is a dataset extracted from IMDB and the MovieLens 1M dataset by matching the movies between MovieLens and IMDB and collecting those that have plots and keywords. The rating values are 10 discrete numbers ranging from 1 to 10, and the ratings were made binary by treating all the ratings greater than 5 as +1 and below 5 as -1.

Amazon. We used a dataset of best-selling books and their ratings on Amazon. Each book has one or two paragraphs of textual description, which has been used to derive a set of features for the books. Ratings are integers from 1 to 5. The ratings were also made binary by treating all the ratings greater than or equal to 3 as +1 and below 3 as -1.
Table 7.1: Statistics of the real datasets used in our experiments.

Statistics                             ML-IMDB    Amazon     CiteULike
Number of users                        2,113      13,097     3,272
Number of items                        8,645      11,077     21,508
Number of ratings                      739,973    175,612    180,622
Number of features                     8,744      5,766      6,359
Average number of ratings by users     350.2      13.4       55.2
Average number of ratings for items    85.6       15.9       8.4
Density                                4.05%      0.12%      0.13%

CiteULike. CiteULike is a free online service for managing and discovering scholarly references. Users can add the articles that they are interested in to their libraries. The articles collected in a user's library are considered relevant items for that user. This dataset does not have explicit irrelevant items and was chosen to illustrate the effect of considering missing data while having only relevant items.

For all of the above datasets, the descriptions of the items were tokenized, and after removing the stop words, the remaining words were stemmed. The words that appeared in fewer than 20 items, as well as those that appeared in more than 20% of the items, were then removed [105]. Finally, TF-IDF was applied to the remaining words, and the TF-IDF scores represent the features of the items. The statistics of the datasets are given in Table 7.1. As shown in Table 7.1, all three datasets have a high-dimensional feature space.

7.2.2 Metrics
We adopt the widely used metrics Discounted Cumulative Gain at n and Recall at n for assessing the performance of our algorithm and the baseline algorithms. For each user u, let s_k be the relevance score of the item ranked at position k, where s_k = 1 if the item is relevant to user u and s_k = 0 otherwise. Given the list of top-n item recommendations for user u, Discounted Cumulative Gain at n is defined as:

DCG_u@n = s_1 + \sum_{k=2}^{n} \frac{s_k}{\log_2(k)}

If we divide DCG_u@n by its maximum possible value, we get the NDCG_u@n value. Given the list of top-n item recommendations for each user u, Recall at n counts the number of relevant items appearing in the recommendation list divided by the size of the list. Recall at n is defined as:

REC_u@n = \frac{|\{\text{relevant items to } u\} \cap \{\text{top-}n \text{ recommended items}\}|}{|\{\text{top-}n \text{ recommended items}\}|}

DCG_u@n, NDCG_u@n and REC_u@n are computed for each user and then averaged over all users.
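The DCG@n, NDCG@n and REC@n definitions above translate directly into code. This is a minimal sketch with a hypothetical relevance list:

```python
import math

def dcg_at_n(relevance, n):
    """DCG@n as defined above: s_1 + sum_{k=2..n} s_k / log2(k),
    where `relevance` is the list of 0/1 scores in ranked order."""
    top = relevance[:n]
    return top[0] + sum(s / math.log2(k) for k, s in enumerate(top[1:], start=2))

def ndcg_at_n(relevance, n):
    """DCG@n normalized by the ideal (best achievable) DCG@n."""
    ideal = dcg_at_n(sorted(relevance, reverse=True), n)
    return dcg_at_n(relevance, n) / ideal if ideal > 0 else 0.0

def recall_at_n(relevance, n):
    """Fraction of the top-n list that is relevant (the definition used above)."""
    return sum(relevance[:n]) / n

scores = [1, 0, 1, 0, 0]  # hypothetical relevance of one user's ranked top-5
print(dcg_at_n(scores, 5))     # 1 + 1/log2(3) = 1.6309...
print(recall_at_n(scores, 5))  # 0.4
```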
7.2.3 Methodology
Given the partially observed rating matrix, we transformed the observed ratings of all datasets from a multi-level relevance scale to a two-level scale (+1, -1), while 0 is reserved for unobserved ratings. We randomly selected 60% of the observed ratings for training and 20% for the validation set, and considered the remaining 20% of the ratings as our test set. To better evaluate the results, we performed 3-fold cross-validation and report the average values for our results.

7.2.4 Baseline Algorithms
The proposed S2COR algorithm is compared to the following algorithms:

Matrix Factorization (MF) [110]: a matrix completion method that factorizes the incomplete observed matrix and completes the matrix using the unveiled latent features.

Matrix Factorization with Side Information (KPMF) [122]: a matrix completion based algorithm which incorporates external side information about the users or items into the matrix factorization process. It imposes a Gaussian Process prior over all rows of the matrix, and the learned model explicitly captures the underlying correlation among the rows.

Decoupled Completion and Transduction (DCT) [9]: a matrix factorization based algorithm that decouples the completion and transduction stages and exploits the similarity information among users and items to complete the (rating) matrix.

Feature Based Factorized Bilinear Similarity Model (FBS) [105]: this algorithm uses a bilinear model to capture pairwise dependencies between the features. In particular, this model accounts for the interactions between the different item features.

Collaborative User-specific Feature-based Similarity Models (CUFSM): using the history of ratings for users, it learns a personalized user model across the dataset. This method is one of the best performing collaborative latent factor based models [31].

Regression based Latent Factor Model (RLF)¹: this method incorporates the features of items in the factorization process by transforming the features to the latent space using linear regression [3]. If the learning method is Markov Chain Monte Carlo, we name it RLF-MCMC.

Cosine Similarity Based Recommender (CSR): using the similarity between the features of items, the preference score of a user for an item is estimated.

We would like to mention that as baseline algorithms we only consider state-of-the-art methods that are able to exploit the side information about items.
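The 60/20/20 split from the methodology subsection above can be sketched as follows; the (user, item, value) triple representation and the fixed seed are illustrative assumptions:

```python
import random

def split_ratings(ratings, seed=0):
    """60/20/20 train/validation/test split of the observed ratings,
    following the protocol described above. `ratings` is a list of
    (user, item, value) triples."""
    rng = random.Random(seed)
    shuffled = ratings[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    train = shuffled[: int(0.6 * n)]
    valid = shuffled[int(0.6 * n): int(0.8 * n)]
    test = shuffled[int(0.8 * n):]
    return train, valid, test
```

Repeating this with three different seeds and averaging the resulting metrics mirrors the 3-fold evaluation described above.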
7.2.5 S2COR vs. rating prediction
Many different algorithms try to provide recommendations such that the predicted rating values are very close to the actual ratings that users would provide. These algorithms try to minimize the error between the predicted values and the actual rating values by minimizing the Mean Absolute Error (MAE), the Root Mean Square Error (RMSE), etc. However, because users tend to care only about the top of their recommendation list, predicting a ranking of items of interest instead of ratings became the main focus of recent works [22]. In this section we compare the results of S2COR with state-of-the-art algorithms that predict ratings for unrated items. Among the state-of-the-art algorithms, we chose a diverse set: Matrix Factorization, Matrix Factorization with Side Information, Decoupled Completion and Transduction, and the Regression Based Latent Factor Model.

¹The implementation of this method is available in the LibFM library [95].

Table 7.2: Results of comparing ranking methods versus rating methods on ML-IMDB. λ is the regularization parameter, h is the dimension of the latent features, T is the number of iterations, η is the learning rate and σ is the standard deviation.

Algorithms   Hyperparameters          NDCG@10
MF           η=0.003, h=10, σ=0.4     0.0947
KPMF         η=0.003, h=10, σ=0.4     0.1005
DCT          h=10                     0.0095
RLF-MCMC     T=100, σ=0.1             0.0248
S2COR        λ=0.55, h=10, T=100      0.265

Table 7.2 shows the NDCG value for the top 10 items of the recommendation list. It shows that S2COR outperformed all of the rating prediction based algorithms in terms of the NDCG measure. The results confirm the effectiveness of predicting the ranks of items rather than their ratings.

7.2.6 Robustness to not-missing-at-random ratings
In this section we compare the effect of incorporating the unobserved ratings in our learning against excluding them. Most of the methods in the literature ignore the unobserved ratings and train their models only on the observed ratings. By incorporating the unrated items in ranking, our method can limit the bias caused by learning solely from the observed ratings and consequently deals with the not-missing-at-random issue of ratings. Table 7.3 shows the results of comparing these two scenarios for S2COR on ML-IMDB. In order to see the difference between the two scenarios, we considered 70% of the ratings for training and 30% for test, so as to have more ground truth for our testing. Table 7.3 shows the NDCG@5, 10, 15 and 20 for both scenarios, and it shows that incorporating the unobserved ratings improves the accuracy of the recommendation list.
Hence, the NDCG values for the top 5, 10, 15 and 20 items improved when the unrated items were included as part of the training process.

Table 7.3: Results of employing missing ratings versus ignoring them on ML-IMDB. λ=0.6 is the regularization parameter, h=10 is the dimension of the latent features, and T=100 is the number of iterations.

Algorithm: S2COR             NDCG@5   NDCG@10   NDCG@15   NDCG@20
Observed ratings             1.1690   2.2218    2.8362    3.2849
Observed + missing ratings   1.1794   2.2405    2.8585    3.3096

7.2.7 Dealing with cold-start items
We now turn to evaluating the effectiveness of S2COR for cold-start recommendation. To do so, we randomly selected 60% of the items as our training items, 20% for the validation set, and considered the remaining 20% of the items as our test set. In this scenario, the baseline algorithms used for comparison are CSR, FBS, CUFSM and RLF. For the experiments, we used the ML-IMDB, Amazon and CiteULike datasets. Table 7.4 shows the measurement results of applying the mentioned algorithms on these datasets. For each test, the parameter values producing the best ranking on the validation set were selected and reported. As can be seen from the results in Table 7.4, the proposed S2COR algorithm outperformed all of the baseline algorithms and provided recommendations of higher quality in comparison to the other methods. We can also see from the results in Table 7.4 that for the ML-IMDB dataset the improvement in terms of REC@10 is significant compared to the other datasets. Since the density of this dataset is much higher than that of the other two datasets, this observation indicates that our method is more effective in utilizing side information compared to the other methods. These results demonstrate the effectiveness of S2COR in comparison with other state-of-the-art algorithms. S2COR was able to outperform the other state-of-the-art algorithms by considering the missing data and focusing on the top of the recommendation list for cold-start items.

Table 7.4: Results on cold-start items. λ, β1 and μ are regularization parameters, h is the dimension of the latent features, l is the number of similarity functions and T is the number of iterations.

           Algorithms   Hyperparameters         DCG@10   REC@10
ML-IMDB    CSR          -                       0.1282   0.0525
           RLF          h=15                    0.0455   0.0155
           CUFSM        l=1, β1=0.005           0.2160   0.0937
           FBS          λ=0.01, μ=0.1, h=5      0.2270   0.0964
           S2COR        λ=0.6, h=10, T=200      0.2731   0.2127
Amazon     CSR          -                       0.0228   0.1205
           RLF          h=30                    0.0076   0.0394
           CUFSM        l=1, β1=0.25            0.0282   0.1376
           FBS          λ=0.1, μ=1, h=1         0.0284   0.1392
           S2COR        λ=0.6, h=10, T=200      0.1195   0.1683
CiteULike  CSR          -                       0.0684   0.1791
           RLF          h=75                    0.0424   0.0874
           CUFSM        l=1, β1=0.25            0.0791   0.2017
           FBS          λ=0.25, μ=10, h=5       0.0792   0.2026
           S2COR        λ=0.6, h=10, T=200      0.0920   0.2243

Chapter 8
Cold-start Recommendation: Ranking Prediction II

8.1 Introduction
Recommender systems have become ubiquitous in recent years, and are applied in a variety of applications [53]. Typically, the recommendation system selects a small set from the underlying pool of items (e.g., products, news, movies, books) to recommend to an active user. The user can either like (e.g., read, click, buy) or dislike the items in the list, or take no action. Content-based filtering (CB) and collaborative filtering (CF) are well-known examples of recommendation approaches. As demonstrated in the KDD Cup [29] and the Netflix competition [13], the most successful recommendation technique used is collaborative filtering, which exploits the users' opinions (e.g., movie ratings) and/or purchasing (e.g., watching, reading) history in order to extract a set of interesting items for each user. In factorization-based CF methods, both users and items are mapped into a latent feature space based on the observed ratings, and the latent features are later used to make predictions.

Despite significant improvements in recommendation approaches, and in particular collaborative filtering based methods, these systems suffer from a few inherent limitations that must be addressed. In particular, these techniques suffer from data sparsity in real-world scenarios and fail to address users who rated few or no items, or items that have not been rated by any user, commonly known as the cold-start user and cold-start item problems, respectively. For instance, according to [101], the density of extant ratings in most commercial recommender systems is often much less than one percent.

In recent years there has been an upsurge of interest in exploiting social information, such as trust relations among users, along with rating data to improve the performance of recommender systems and resolve the sparsity and cold-start problems (see, e.g., [117, 106, 35] for recent surveys). A well-known example is the FilmTrust system [41]. This is an online social network that provides movie rating and review features to its users.
The social networking component of the website requires users to provide a trust rating for each person they add as a friend. The main motivation for exploiting trust information in the recommendation process stems from the observation that the ideas we are exposed to and the choices we make are significantly influenced by our social context. For example, in [108] it was pointed out that people tend to rely more on recommendations from others they trust than on recommendations from anonymous people similar to them. This observation, combined with the growing popularity of open social networks, has generated a rising interest in trust-enhanced recommendation systems. Based on this intuition, a few trust-aware recommendation methods have been proposed [69, 74, 51].

Recently, online networks where two opposite kinds of relationships can occur have become common. For instance, Epinions [44], an e-commerce site for reviewing and rating products, allows users to evaluate others based on the quality of their reviews, and to form trust and distrust relations with them. Similar patterns can be found in online communities such as Slashdot, in which millions of users post news and comments daily and are capable of tagging other users as friends/foes or fans/freaks. Additionally, users on Wikipedia can vote for or against the nomination of others to adminship [16]. As more users issue distrust statements, it becomes more important to exploit this new information source in recommender systems. It is acknowledged that, along with trust relationships, distrust can also play an important role in boosting the accuracy of recommendations [117, 116, 35]. Recently, some attempts have been made to explicitly incorporate the distrust relations in the recommendation process [44, 70, 116]. This demonstrated that recommender systems can benefit from the proper incorporation of distrust relations in social networks. However, despite these positive results, there are some unique challenges involved in distrust-enhanced recommender systems. In particular, it has proven challenging to model distrust propagation in a manner which is both logically consistent and psychologically plausible. Furthermore, the naive modeling of distrust as negative trust raises a number of challenges, both algorithmic and philosophical. Finally, it is an open challenge how to best incorporate trust and distrust relations simultaneously in model-based methods, e.g., matrix factorization.
In memory-based recommender systems, the simultaneous exploitation of trust and distrust has been investigated in [118, 115]. An attempt to simultaneously exploit trust and distrust relations in a factorization-based method was made very recently in [35]. In particular, a ranking model was proposed to rank the latent features of users, from the perspective of individual users, in a way that respects their social context. Despite the encouraging improvements brought by the simultaneous exploitation of trust and distrust relations, the proposed algorithm suffers from two issues. First, as the number of constraint triplets imposed by the social regularization of latent features can increase cubically in the number of users in the social network, it is computationally difficult to make this idea scalable to large social graphs. Second, the algorithm proposed in [35] only considers the trusted and distrusted friends of each user in order to regularize the latent features. Thus, it ignores the neutral friends, i.e., the users who have no relation to the user. As a result, the neutral friends might appear before the trusted friends in the ranked list. Neutral friends might therefore become more influential than trusted friends and impact the recommendation negatively.

In this paper we describe and analyze a fairly general and flexible method to learn and infer from rating data, along with trust and distrust relationships between users. The main building block of our proposed algorithm, dubbed PushTrust, is an efficient algorithm to rank the latent features of users by leveraging their social context. In ranking the latent features, we wish to make sure that the top portion of the list includes the trusted friends and that the distrusted friends are pushed to the bottom of the list. Also, we would like the neutral friends to appear after the trusted and before the distrusted friends in the list. Compared to the method proposed in [35], the key features of the PushTrust algorithm are: (i) its quadratic time complexity in the number of users, and (ii) its ability to take the neutral friends of users into consideration in regularizing the latent features. Thorough experiments on the Epinions dataset demonstrate the merits of the proposed algorithm in boosting the quality of recommendations and in dealing with the data sparsity and cold-start user problems.
Outline. The rest of this chapter is organized as follows. In Section 5 we gave a formal definition of the setting and reviewed the existing socially enhanced matrix factorization methods. The PushTrust algorithm and its associated optimization method are presented in Sections 5.2 and 5.3. Experiments are provided in Section 8.2.

8.2 Experiments
In this section, we conduct several experiments to compare the recommendation quality of our PushTrust algorithm with other state-of-the-art recommendation methods. We begin with a brief description of the Epinions dataset and the evaluation metrics, then introduce the baseline methods, followed by a discussion of our experimental results.

8.2.1 The Epinions dataset
To evaluate the proposed algorithm on trust- and distrust-aware recommendations, we use the Epinions dataset, the only such social rating network dataset publicly available. Moreover, an important reason why we chose the Epinions dataset is that user trust and distrust information is included in it. Epinions allows users to evaluate other users based on the quality of their reviews, and it provides trust and distrust evaluations in addition to ratings. The social relations in Epinions are directed. Table 8.1 shows the statistics of the Epinions dataset used in our experiments. To conduct the experiments, we sampled a subset of the Epinions dataset with n = 121,240 users and m = 685,621 different items. The total number of observed ratings in the sampled dataset is 12,721,437, which covers approximately 0.02% of all entries in the rating matrix R. This demonstrates the sparsity of the rating matrix. The social network includes 481,799 trust and 96,823 distrust relations between users, respectively.

Table 8.1: Statistics of the rating data and social network in the sampled Epinions dataset used in our experiments.

Statistic                      Quantity
Number of users                121,240
Number of items                685,621
Number of ratings              12,721,437
Number of trust relations      481,799
Number of distrust relations   96,823

8.2.2 Evaluation metrics
We adopt the widely used Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) [50] metrics for prediction accuracy. The MAE metric is a very appropriate and useful measure for evaluating prediction accuracy in tests [50].
Let T denote the set of ratings to be predicted, i.e., T = \{(i, j) \in [n] \times [m] : R_{ij} \text{ needs to be predicted}\}, and let \hat{R} denote the prediction matrix obtained by a recommendation algorithm. Then the MAE and RMSE metrics are defined as:

MAE = \frac{1}{|T|} \sum_{(i,j) \in T} |R_{ij} - \hat{R}_{ij}|, \quad \text{and} \quad RMSE = \sqrt{\frac{1}{|T|} \sum_{(i,j) \in T} \left(R_{ij} - \hat{R}_{ij}\right)^2}.

The first measure (MAE) considers every error to be of equal value, while the second one (RMSE) emphasizes larger errors.

8.2.3 Baseline algorithms
In order to demonstrate the benefits of our approach, we compare our model with the following factorization-based methods, with and without social regularization.

MF (matrix factorization based recommender): the basic matrix factorization in Eq. (5.1), which does not take the social data into account.

MF-T (matrix factorization with trust information): the matrix factorization algorithm which exploits the trust relations between users in the factorization [70], as formulated in Eq. (5.2).

MF-D (matrix factorization with distrust information): the algorithm proposed in [70] to exploit only the distrust relations, as formulated in Eq. (5.3).

MF-TD (matrix factorization with trust and distrust information): the algorithm proposed in [35] to incorporate both trust and distrust relationships, as formulated in Eq. (5.4).

PushTrust: the algorithm proposed in this paper.

Experimental setup. For testing our algorithm, we perform 5-fold cross validation in our experiments. In each fold, 80% of the rating data was randomly selected as the training set and the remaining 20% as the test data. After optimization over the training data, the extracted latent models are used to predict the test data.

8.2.4 Comparison to baseline algorithms
We now focus on checking how the trust, distrust and neutral relationships between users affect the recommendation accuracy. In other words, we would like to experimentally evaluate whether incorporating different social relationships can indeed enhance the trust-based recommendation process. In this section, we present results on the accuracy of rating prediction in terms of RMSE and MAE. We compare our approach with the four state-of-the-art algorithms discussed above, MF, MF-T, MF-D, and MF-TD, on the Epinions dataset. We run the algorithms with different latent vector dimensions, k = 5 and k = 10, and evaluate them by measuring both the MAE and RMSE errors. The parameters of all algorithms are tuned to achieve the best performance.
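The MAE and RMSE definitions above reduce to a few lines of code; the (actual, predicted) pairs below are hypothetical:

```python
import math

def mae(pairs):
    """Mean Absolute Error over (actual, predicted) rating pairs on the test set T."""
    return sum(abs(r - p) for r, p in pairs) / len(pairs)

def rmse(pairs):
    """Root Mean Squared Error over (actual, predicted) rating pairs; the
    squaring makes larger errors count more, as noted above."""
    return math.sqrt(sum((r - p) ** 2 for r, p in pairs) / len(pairs))

preds = [(4, 3.5), (2, 2.0), (5, 3.0)]  # hypothetical (R_ij, R_hat_ij) pairs
print(round(mae(preds), 4))   # 0.8333
print(round(rmse(preds), 4))  # 1.1902
```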
Table 8.2: Comparison with other methods in terms of MAE and RMSE errors.

k    Method      MAE      RMSE
5    MF          0.9512   1.2612
     MF-T        0.8735   1.2030
     MF-D        0.9137   1.2230
     MF-TD       0.8312   1.1023
     PushTrust   0.8010   1.0802
10   MF          0.9763   1.2725
     MF-T        0.8930   1.2134
     MF-D        0.9234   1.2300
     MF-TD       0.8432   1.1450
     PushTrust   0.8216   1.1230

Table 8.2 reports the MAE and RMSE of all comparison algorithms on the Epinions dataset. We unsurprisingly observed that the algorithms exploiting social information perform better than the pure matrix factorization based algorithm. Additionally, the proposed algorithm outperforms the other baseline algorithms in all cases. It is surprising to see that increasing k on the Epinions dataset did not improve the results, since increasing k increases the flexibility of the algorithm, which should yield more accurate results. The Epinions dataset is small, however, and increasing k creates more parameters in the model, which leads to overfitting. We also note that MF-T performs better than MF-D due to the large number of trust relations in the dataset compared to the number of distrust relations.

8.2.5 Experiments on handling cold-start users
Handling cold-start users (i.e., users with few ratings, or new users) is one of the main challenges in existing recommender systems. For example, more than 50% of the users in the Epinions dataset are cold-start users. Hence the performance of any recommendation algorithm on cold-start users becomes very important. In this set of experiments, the performance of the proposed algorithm is evaluated on cold-start users and compared to the baseline algorithms. To evaluate the different algorithms, we randomly select 30%, 20%, and 10% of the users to be cold-start users. For cold-start users, we do not include any rating in the training data and consider all ratings made by cold-start users as test data. Table 8.3 shows the performance of the above-mentioned algorithms.

Table 8.3: The accuracy of handling cold-start users and the effect of social relations. The number of latent features in these experiments is k = 5.

% Cold-start Users   Measure   MF       MF-T     MF-D     MF-TD    PushTrust
30%                  MAE       0.9923   0.9124   0.9721   0.8603   0.8345
                     RMSE      1.7211   1.5562   1.6433   1.4602   1.3362
20%                  MAE       0.9812   0.8905   0.9405   0.8542   0.8268
                     RMSE      1.6962   1.4339   1.5250   1.2630   1.2430
10%                  MAE       0.9634   0.8802   0.9233   0.8480   0.8204
                     RMSE      1.3222   1.2921   1.3105   1.1688   1.1207

As is clear from Table 8.3, when the number of cold-start users is high, exploiting the trust and distrust relationships significantly improves the recommendation performance.
This result is interesting, as it reveals that the lack of rating information for cold-start and new users can be alleviated by incorporating the social relations of users. The improvement becomes significant when exploiting both trust and distrust relationships, which confirms the importance of social data when exploited in an appropriate way. The results reported in Table 8.3 imply that PushTrust is able to handle cold-start users more effectively than the other algorithms. We also note that PushTrust outperforms the recently developed MF-TD algorithm. This is because MF-TD ignores the neutral friends, whereas PushTrust is designed to push these users to appear after the trusted friends and before the distrusted friends in the ranked list of latent features. We note that, without considering neutral friends, their latent features might become even closer than those of the trusted friends of a specific user and consequently affect the predictions negatively. This crucial difference yields a significant improvement in performance, as revealed by the results in Table 8.3.

Chapter 9
Network Completion

9.1 Introduction
Gathering complete knowledge of the structure of very large networked datasets is of pivotal importance to most network analysis applications. However, determining this structure in networks, such as social, biological, and publication networks, is very difficult due to the sheer size of the data coupled with its general sparsity. Because of these difficulties, it is reasonable to try to make structural measurements using a partial, and partially observed, sample of a network and to extrapolate the network's properties based on this sample. Such inferences are known as network completion or survey sampling. In a graph, this is done by selecting a number of nodes at random and observing all, or a uniform sample, of the links incident to these nodes. In this work, we attempt network completion by recovering the topology of an undirected network from only a partially-observed random sample of the network with "enough" links [55].

The need for effective and efficient network completion algorithms occurs in a wide variety of settings, such as information retrieval, social network analysis, and computational biology. With the explosion of big data, possession of data is not often an issue; rather, the characterization of said data is very difficult.
For information retrieval problems, there may be natural structure in the set, but only a small sample may be feasibly obtained to determine the placement of the objects into various classes. Here, network completion attempts to use that small sample to determine the rest of the network. Similar to the information retrieval problem, the massive user adoption of social networks has both increased the reward in understanding the underlying social structure and increased the difficulty in doing so. There are billions of users across Facebook, Twitter, and other social networks. Inferring the full network, or the implicit social networking structure between users, from a small sample of links between them is a challenging problem and the subject of recent research [86]. In computational biology [7], most of the crucial questions relate to understanding the complex interactions between entities, such as genes or proteins. These interactions form large networks, and their complex structure aids biologists in determining the role of proteins in a biological system. These structural insights allow for the formulation, and experimental testing, of specific hypotheses about gene function. Unfortunately, the available experimental techniques cannot sample the complete system, but rather only a tiny fraction of it. In building a complete biological network, the network completion problem again arises.

Many methods exist to analyze, contextualize, and estimate connections in large graphs, and the temptation to apply them to our network completion problem is strong. They often use well-established statistical and machine learning tools, and they work very well under many precisely defined conditions. However, they struggle to handle the cases in which we are interested, due to size and randomness restrictions. Each of the example problems that may be aided by network completion techniques could seemingly rely on simply choosing very good subsets of the given data. From this statistical standpoint, the general trend for coping with the difficulty of understanding the structure of large-scale networks has been to apply sampling methods. The idea here is that a small, but carefully chosen, sample can be used to represent the population. There are specialized techniques, such as stratified sampling and cluster sampling, that improve the precision or efficiency of the sampling process [36].
However, due to security, user privacy, and data aggregation overhead, the existence of missing data is almost inevitable. This absent data misleads the estimation of full-network properties. Also, in many applications, such as social recommendation, a global structure of the network needs to be inferred. Since we do not observe any links for unsampled nodes, the statistical models are not feasible and, consequently, they do not perform well for these types of problems.

Another attempt at solving this problem uses learning techniques. Under the scope of learning, the network completion problem may be cast as an inference problem that aims to learn the structure of a network from a sampled subgraph. Yet there are two major issues in learning the network topology from random samples. First, the links present in the observable sample are not chosen uniformly, which is typically required in order to apply most well-known techniques, such as matrix factorization or matrix completion [93]. Secondly, there are a huge number of nodes in the network for which no edge is sampled. This, again, makes these standard methods infeasible.

Considering the aforementioned limitations, we aim to exploit auxiliary information to improve network completion. Usually we have some information about the nodes of the partially observed network, and exploiting other sources of information about the nodes provides useful hints about the underlying structure of the network. For example, in a social network, the similarity between users based on their profile information is often seen as a major factor for connectivity. According to the homophily principle, people tend to connect to those who share similar interests, tastes, beliefs, social backgrounds, popularity, race, etc. [75]. In biological networks, the interactions between proteins or other molecules require an exact or complementary fit of their complex surfaces. This can be treated analogously to similarity in the context of connectivity or micro-array expression data [107]. Therefore, if the rich auxiliary information is appropriately integrated into network models, the network model can have more predictive power.
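As a minimal illustration of turning node attributes into similarity side information, a cosine similarity between node feature vectors (the metric used elsewhere in this thesis) can be sketched as follows. The binary profile vectors below are toy assumptions:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(features):
    """Pairwise cosine similarities for all nodes; `features` is a list of
    per-node feature vectors (hypothetical binary profile encodings here)."""
    return [[cosine_similarity(u, v) for v in features] for u in features]

users = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]  # toy binary profile features
S = similarity_matrix(users)
print(round(S[0][0], 3))  # 1.0 (self-similarity)
print(round(S[0][1], 3))  # 0.5
```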
More generally, using auxiliary information may take the form of identifying similar users, so that the unobserved links can be approximated with the observed ones, or of determining graph structures that should exist among unobserved nodes. This similarity information about nodes can complement the network structure, leading to more precise prediction of links. Moreover, if one source of information is missing or noisy, the other can complement it. Thus, it is important to consider both sources of information together in completion, leading to an interesting research question: How can we effectively incorporate side information about nodes to learn the structure of a network from a sampled subgraph of the underlying network?

This problem is the main focus of this chapter, and our goal is to reconstruct the underlying network without any knowledge of its topological features, based only on a partially observed subgraph of the network and auxiliary features of the nodes. To complete an unobserved network, we propose an efficient algorithm that significantly departs from the existing methods with side information, such as shared subspace learning [32] and matrix completion with transduction [42]. Our proposed algorithm consists of two decoupled stages: completion and transduction. In the first stage, we complete only the largest sub-network for which completion can be carried out perfectly according to matrix completion theory. Specifically, under some mild assumptions on the number of sampled links and the coherence of the matrix, it may be perfectly completed (e.g., [93]). Then, the recovered sub-network, along with the available similarity side information about nodes, is utilized to transduce the knowledge to the fully unobserved nodes. We provide theoretical performance guarantees on the estimation error of the proposed algorithm. This is in contrast to most existing methods, which do not come with theoretical support. Experimental results show that, by simultaneously exploiting the sampled subgraph as well as the similarity information between nodes extracted from other sources of information, our method performs significantly better than natural alternatives and the state-of-the-art methods.

Outline. The remainder of the chapter is organized as follows. In Section 3.2.1, we begin by establishing notation and formally describing our setting.
tting.WetheninSection3.2.2de-scribetheproposedalgorithmandproveourmaintheoreticalresultonitsperformanceinSection3.3.TheexperimentalresultsareprovidedinSection9.2.9.2ExperimentalEvaluationTobetterunderstandtheperformanceoftheproposedalgorithm,wedivideourexperimentsintotwotscenarios.Inthescenario,weapplytheproposedalgorithmonafewwell-knownsocialnetworkdatasetsandcomparethemwiththebaselinelinkpredictionalgorithms.Inthesecondscenario,wecomparetheproposedalgorithmwiththestate-of-the-artalgorithmsfornetworkcompletion.Themainreasonforsplittingourexperimentsintotwopartsistheuniquechallengesthatexistinnetworkcompletionduetothesparsenessinformationofthelinks,whichmakesitachallengingproblem.Wenotethattalgo-rithmsmightachievetotallytperformancesonthesetwoscenarios.Theapplicationoftheproposedalgorithmstolinkpredictionandnetworkcompletionondatasetswillbe116discussedinSections9.2.5and9.2.6.Webeginbyintroducingthedatasetswehaveusedinourexperiments.Thencomparetheproposedalgorithmtoanumberofwell-knownalgo-rithmsaccordingtotheirqualityonbothsyntheticandmultiplepopularrealworldsocialnetworks.9.2.1DatasetsWehaveselectedfourrealworlddatasetsonwhichweperformedourexperiments.Twoofthesedatasets,whichareEpinionsandWeibowillbeusedinourapplication,whichislinkpredictionproblem.Theothertwodatasets,FacebookandGoogle+,areusedinoursecondapplicationwhichisnetworkpredictionproblem.Facebookdataset.Facebookisoneofthemostpopularrealworldsocialnetworkdatasets.OnFacebook,peoplehavedirectedfriendshiprelationshipswitheachotherandeachuserhasacontaininginformationabouttheuser.Foreachuser(anodeofournetwork)featuressuchasgender,jobtitle,ageandetcareextractedfromtheirproThenet-workofusersisproducedbycombiningego-networksofFacebookdatasetavailableattheStanfordLargenetworkDatasetCollection1.ThestatisticsofdatasetsaresummarizedinTable9.1.Togeneratethesimilaritymatrixbetweenusers,weusedthecosinesimilar-itymetricbetweentheextractedfeaturesandcomputedthepairwisesimilaritiesforallusers.Google+dataset.Thesecondrealsocia
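The same construction is used for all four datasets: each user's extracted features form a vector, and the similarity matrix holds the pairwise cosine similarities between those vectors. A minimal sketch (the function name and the toy feature matrix are illustrative, not part of the datasets):

```python
import numpy as np

def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between rows of a user-feature matrix."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against all-zero rows
    return unit @ unit.T

# Toy example: 3 users described by 4 binary profile features.
F = np.array([[1.0, 0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
S = cosine_similarity_matrix(F)  # S[0, 1] = 1.0, S[0, 2] = 0.0
```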
Google+ dataset. The second real social network that we experimented on is Google+. It is similar to Facebook in terms of the relationships between users, and each user has her own profile. The features of users, such as gender and age, are extracted from their profiles, and the ego-networks of users are combined together to form a network of users along with their features. This dataset is also available at the Stanford Large Network Dataset Collection (http://snap.stanford.edu/data), and its statistics are shown in Table 9.1. To compute the similarities between users, we used the cosine similarity to calculate the pairwise similarity between all users.

Epinions dataset. Epinions is an explicit trust/distrust social network creating a directed network. Since the number of distrust relations is very small, we only consider explicit trust relationships between users. The statistics of this dataset are summarized in Table 9.1. This dataset is available through Jiliang Tang at Arizona State University (http://www.public.asu.edu/~jtang20/datasetcode/truststudy.htm). Generally, the Epinions dataset is used to predict ratings given the trust graph of its users as side information. We, conversely, are predicting the existence of links via a link prediction process with our algorithm while using the rating information as side information. For extracting the features, each user is endowed with a collection of their ratings of the items. These ratings contain scores and timestamps. To extract more features, we also included information about the items that users were interested in, such as names and categories. Features further include a normalized triple for each item category: the total number of reviews in that category along with the number of positive and negative reviews per category, normalized by the maximum value of each present in the data. For pairwise similarities between users, we used the cosine similarity to generate the similarity matrix of users.

Tencent Weibo dataset. Tencent Weibo is a Chinese microblogging site similar to Twitter. There is a network structure where users can `follow' one another to create a link. In addition to the large network, the side information is quite rich. We extracted features for the anonymized users that include personal information (age, gender, and interests), social information (number of messages sent, messages reposted, users followed, etc.), and item content (categories and keywords). A follower-followee directed graph is given, and we use our algorithm to predict the existence of a directed edge that represents whether or not a given user will follow another. We further only considered positive links (where a user follows another) and ignored all negative links (where a user explicitly chooses not to follow another) for prediction. The data is available from the KDD Cup (http://www.kddcup2012.org/c/kddcup2012-track1/data). The original Tencent Weibo social directed graph has 1.4 million users and over 73 million links. We wanted to choose a small subset of approximately 4,000 users with many observed links among them and a reasonable amount of side information. To achieve this, we used a modified search that penalized adding nodes that did not contain links going back into the already explored nodes. Once this process terminated, other (possibly isolated) nodes were added at random if they had a large amount of side information, until our graph's order threshold was met. The observed average ratio of directed links to nodes for random directed graphs of order approximately 4,000 was 0.46. With our method, we found a directed subgraph of 4,052 vertices with a significantly higher-than-average ratio of directed links to nodes at 0.98. The statistics of the data subsets are summarized in Table 9.1. To generate the similarity matrix of users, we used the cosine similarity between all pairs of users.

Table 9.1: Statistics of the Facebook, Google+, Epinions, and Weibo datasets.

Dataset     # of Nodes   # of Links    # of Node Features
Facebook    4,089        170,174       175
Google+     250,469      30,230,905    690
Epinions    90,879       365,422       90,087
Weibo       4,052        3,957         16,786

9.2.2 Evaluation Metrics

We evaluate the performance of different algorithms by measuring the discrepancy between the original and completed matrices. We adopt the widely used Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) metrics [55] for evaluation. These measures are formally defined as follows. Let A and Â denote the full and estimated adjacency matrices, respectively, and let T be the set of unobserved test links. Then,

    MAE = ( Σ_{(i,j)∈T} |A_ij − Â_ij| ) / |T|.

The RMSE metric is defined as:

    RMSE = sqrt( Σ_{(i,j)∈T} (A_ij − Â_ij)² / |T| ).
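The two measures above translate directly into code; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def mae_rmse(A, A_hat, test_links):
    """MAE and RMSE over the set T of unobserved test links (i, j)."""
    diffs = np.array([A[i, j] - A_hat[i, j] for i, j in test_links])
    mae = np.abs(diffs).mean()
    rmse = np.sqrt((diffs ** 2).mean())
    return mae, rmse

# Toy check: one link predicted exactly, one off by 0.5.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
A_hat = np.array([[0.0, 0.5], [1.0, 0.0]])
mae, rmse = mae_rmse(A, A_hat, [(0, 1), (1, 0)])  # mae = 0.25
```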
9.2.3 Baseline Algorithms

To evaluate the performance of our proposed algorithm, we consider a variety of baseline approaches. We categorize our baseline algorithms into two sets: the first set includes those used in the link prediction experiments, and the second set includes those used in our network completion experiments.

Link prediction. The baseline algorithms are chosen from a variety of types of link prediction algorithms to have a fair comparison, and all implementations are available in the MyMediaLite package [38].

BPRMF: Matrix factorization with Bayesian personalized ranking (BPR) from implicit feedback, which produces rankings from pairwise classifications. The matrix factorization model provides item prediction optimized for BPR.

BiPolar Slope One (BPSO): Slope One rating prediction method weighted by bipolar frequency.

Matrix Factorization (MF): Matrix factorization where learning is performed by stochastic gradient descent, factoring the observed ratings into user and item factor matrices.

Slope One (SO): Slope One rating prediction method with frequency rating.

User-Item Baseline (UIB): Uses the average rating value, plus a regularized user and item bias, for baseline predictions.

CoClustering (CC): Performs simultaneous clustering on both the rows and the columns of the rating matrix.

Latent Feature log-linear Model (LFM): Rating prediction method that uses a log-linear model on the latent features of the system.

Biased Matrix Factorization (BMF): Matrix factorization with parameters for the biases of users and items; learning is performed by stochastic gradient descent.

SVDPlusPlus (SPP): Singular value decomposition matrix factorization method that makes use of implicit feedback; it further considers which items each user has rated.

Sigmoid SVDPlusPlus (SSPP): Singular value decomposition matrix factorization method that makes use of implicit feedback and utilizes a sigmoid function.

Sigmoid Item Asymmetric Factor Model (SFM): Asymmetric factor model that represents the items based on how the users rated them.

Network completion. The baseline algorithms are chosen from three different types of network completion algorithms to have a fair comparison: (i) methods that only utilize the observed links, (ii) methods that use the network structure as well as node features, and (iii) methods that learn a shared space.

MF: This class of baseline algorithms considers only the network structure and ignores the node features. We select the Matrix Completion (MF) algorithm in this class [110] for our comparison.

MF-NF: The second class of baseline algorithms includes methods that employ both node features and the structure of the network. We choose a matrix factorization based algorithm that factorizes the adjacency matrix and combines the latent features with explicit features of nodes and links using a bilinear regression model [79]. The implementation of this algorithm is provided by the authors (http://cseweb.ucsd.edu/~akmenon/code/).

MF-SS: The third class of baseline algorithms combines the network structure with node features by sharing a subspace, as formulated in Eq. (3.8). We refer to this algorithm as Matrix Factorization with Subspace Sharing (MF-SS) [32].

MC-DT: Finally, this refers to the algorithm proposed in this chapter, which is referred to as Matrix Completion with Decoupled Transduction (MC-DT).

Figure 9.1: The recovery error of the proposed MC-DT algorithm, i.e., ||A − Â||_F, for different noise variances.

9.2.4 Experiments on a synthetic dataset

In this section we present the results on a synthetic dataset. To do so, we generated an adjacency matrix A with 2,048 nodes. To generate the network, we set the rank to r = 10, i.e., the number of connected components, and evenly distributed the nodes into the components. Then we generated a pairwise similarity matrix for the nodes by adding a noise term to the adjacency matrix, S = A + N, where each entry of the noise matrix follows a uniform distribution, N_ij ~ U(0, 0.5). To generate an incomplete network in our dataset, we randomly removed 20% of all links so that the network has 119,999 links (out of 2,048 nodes and 149,999 links).

Effect of noise in side information. To better understand the effect of similarity information on the recovery error, we also investigated the effect of noise in the similarity matrix on the performance of the MC-DT algorithm. As demonstrated in Theorem 3.3.1, the recovery error has a linear dependency on the error contributed by the error component of the similarity matrix A_E.
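The synthetic setup above (a block-structured adjacency matrix of rank r, a similarity matrix S = A + N, and 20% of the links hidden) can be sketched as follows; the seed and the exact assignment of nodes to components are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 2048, 10
labels = np.arange(n) % r                  # evenly distribute nodes into r components
A = (labels[:, None] == labels[None, :]).astype(float)  # block adjacency, rank r
S = A + rng.uniform(0.0, 0.5, size=(n, n))              # noisy similarity S = A + N

# Hide 20% of the links to obtain the partially observed network.
links = np.argwhere(A > 0)
hidden = rng.choice(len(links), size=int(0.2 * len(links)), replace=False)
A_obs = A.copy()
A_obs[links[hidden, 0], links[hidden, 1]] = 0.0
```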
To empirically verify this on the synthetic dataset, we added noise to the similarity matrix following a zero-mean Gaussian distribution N(0, σ) with different values of the variance σ, while keeping the size of the observed submatrix at m = 400. In particular, we changed the variance from σ = 0.1 to σ = 1 with a step size of 0.1 and calculated the recovery error. As Figure 9.1 shows, by increasing the noise, the recovery error increases linearly. This is consistent with our theoretical result.

Effect of training size. We further investigate the effect of the size of the observed submatrix O on the recovery error of different algorithms on synthetic data. We changed the size of the observed submatrix m within the range of 200 to 1,600 with a step size of 200 and reported the recovery error for the different algorithms in Figure 9.2. From Figure 9.2 it can be observed that by increasing m, the recovery error decreases for all of the methods. Hence, the fewer unobserved elements we have, the lower the recovery error is. The proposed MC-DT algorithm performs the best, verifying its reliable performance. The gap between the recovery error of MC-DT and the three other algorithms implies the effectiveness of MC-DT in exploiting similarity information. We note that since MC-DT ignores the missing part of the network and completes the submatrix purely from the observed part of the network, the completion error becomes zero and is not propagated in the transduction stage. It is of interest to note that by increasing the size of the observed submatrix, i.e., as m approaches n, the effect of similarity information on decreasing the recovery error became less influential for all methods. This follows from the fact that the existence of links in the network provides much richer information than the pairwise similarity between users.

Figure 9.2: The recovery error of different algorithms on a synthetic dataset for different sizes of the partially observed submatrix with m nodes.

9.2.5 Performance evaluation on link prediction

In this section we demonstrate the results of applying our proposed algorithm, along with the different baseline algorithms, to the link prediction problem. We demonstrate the results on two real-world datasets, Epinions and Weibo.

9.2.5.1 Link prediction on Epinions

Epinions is a consumer review website where users may read reviews about items to help them decide on their purchase. Epinions, by allowing users to have a trust/distrust relationship with each other, created a social network of users in addition to the reviewing platform. In this section we aimed to predict the existence of links while utilizing the rating information as the side information. In order to have richer side information, we considered all available information about the items, such as category, scores, and timestamps. We compared our algorithm with a number of baseline algorithms. To do this comparison, we created a user-user binary matrix, in which 1 indicated a trust relation between two users and 0 indicated unobserved relations. To create the test set, if we wanted, for example, 20% of the data to be considered as our test set, we randomly replaced 20% of the observed entries by 0 and kept 70% of the remaining entries unchanged as our training set; the remaining 10% served as our validation set. To compare our results with other baselines, and also to explore the effect of training size, we performed our experiments on different sizes of training, validation, and test sets. For the training set, we considered 40%, 50%, 60%, 70%, and 80% of the data. In each case, 10% of the remaining data was the validation set and the rest became our test set. For each training size, we performed the experiment five times and reported the average values.

Table 9.2 shows the RMSE and MAE results on Epinions with different training sizes. As shown, our proposed algorithm outperformed all other baseline algorithms at all training sizes. Generally, perhaps unsurprisingly, increasing the training size also reduced the error. However, as the training size increased, there was no significant drop in the error. This allowed us to conclude that we can select smaller training sizes and still maintain accuracy, while also being able to compute solutions faster and avoid overfitting. This further assists in situations where there is not enough data available to use large training sets.

Table 9.2: Link prediction results on the Epinions dataset and the effect of training size.

RMSE
Method  Parameters                          40%     50%     60%     70%     80%
BPSO    r=20, u=15, i=10, T=50              0.7708  0.7683  0.7683  0.7626  0.7610
MF      r=20, 0.015, 0.01, T=50             0.7063  0.7127  0.7191  0.7173  0.7269
SO      --                                  0.7851  0.7780  0.7726  0.7650  0.7611
UIB     --                                  0.7026  0.7022  0.7032  0.6993  0.7022
CC      Cu=10, Ci=20, T=50                  0.7849  0.7776  0.7712  0.7679  0.7625
LFM     r=10, 0.01, u,i=0.05, 0.01, T=50    0.7212  0.7110  0.7188  0.7150  0.7122
BMF     r=10, 0.01, u,i=0.05, T=50          0.7086  0.7084  0.7094  0.7061  0.7087
SPP     r=10, 0.05, 0.01, T=50              0.6897  0.6903  0.6926  0.6892  0.6929
SSPP    r=10, 0.05, 0.01, T=50              0.9879  0.9867  0.9851  0.9836  0.9856
SFM     r=10, 0.015, 0.33, 0.00, T=50       0.6738  0.6743  0.6770  0.6731  0.6777
MC-DT   r=37                                0.6623  0.6495  0.6428  0.6430  0.6551

MAE
Method  Parameters                          40%     50%     60%     70%     80%
BPSO    r=20, u=15, i=10, T=50              0.9404  0.9318  0.9245  0.9156  0.9078
MF      r=20, 0.015, 0.01, T=50             0.8363  0.8422  0.8474  0.8468  0.8543
SO      --                                  0.9515  0.9370  0.9244  0.9129  0.9047
UIB     --                                  0.8280  0.8278  0.8280  0.8259  0.8276
CC      Cu=10, Ci=20, T=50                  0.9285  0.9160  0.9071  0.9013  0.8932
LFM     r=10, 0.01, u,i=0.05, 0.01, T=50    0.8425  0.8340  0.8402  0.8377  0.8342
BMF     r=10, 0.01, u,i=0.05, T=50          0.8318  0.8317  0.8319  0.8302  0.8315
SPP     r=10, 0.05, 0.01, T=50              0.8211  0.8215  0.8227  0.8206  0.8227
SSPP    r=10, 0.05, 0.01, T=50              1.2617  1.2605  1.2599  1.2566  1.2596
SFM     r=10, 0.015, 0.33, 0.00, T=50       0.8165  0.8167  0.8181  0.8155  0.8182
MC-DT   r=37                                0.7625  0.7591  0.7562  0.7562  0.7573

9.2.5.2 Link prediction on Weibo

Weibo is a social network for connecting users together and is similar to Twitter. As explained in Section 9.2.1, we selected a subset of this dataset with about 4,000 users. From these, we extracted all available features and created a similarity matrix between users. On this dataset, we aim to predict the existence of links between users while using the similarity matrix created from these features. We performed the experiments on different training sizes, similar to Epinions. Table 9.3 shows the RMSE and MAE results on the Weibo dataset. The results shown in Table 9.3 indicate that our proposed algorithm is the best performing algorithm among the baselines. One can see that there is a significant gap between the results of our proposed algorithm and the other baseline algorithms; this was not the case in the Epinions results. This significant gain in accuracy is attributed to having more side information available in Weibo than in Epinions. We can also conclude that after using 50% for training and 10% for validation, there is no significant drop in the results.

9.2.6 Performance evaluation on network completion

We now turn to comparing the proposed algorithm to the state-of-the-art algorithms for network completion. We demonstrate the results on two large real-world datasets: Facebook and Google+. In this scenario, we utilized different training sizes as well. We divided the dataset into training, validation, and test subsets by dividing the users themselves into these three subsets.
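This node-based division (as opposed to the entry-based masking used for link prediction) can be sketched as follows; the function name and the use of a random permutation are illustrative:

```python
import numpy as np

def node_split(A, train_frac, seed=0):
    """Sample a fraction of nodes; only links among sampled nodes are observed."""
    n = A.shape[0]
    perm = np.random.default_rng(seed).permutation(n)
    m = int(train_frac * n)
    observed = perm[:m]                   # nodes whose mutual links are observed
    unobserved = perm[m:]                 # nodes with no observed links at all
    O = A[np.ix_(observed, observed)]     # submatrix handed to the completion step
    return observed, unobserved, O

# Toy example: with a 10-node network and a 20% training size,
# the observed submatrix covers 2 nodes.
A = np.ones((10, 10))
obs, unobs, O = node_split(A, 0.2)  # O.shape == (2, 2)
```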
nthepreviousexperimentsontheFacebookdataset.Wemakeseveralobservationsfromresults:MC-DTgreatlyoutperformedtheotherbaselinealgorithmsinallrespects.BoththeRMSEandMAEerrorsofthecompletednetworkaremorethantwotimessmallerfor129MC-DTthantheothers.TheseexperimentsdemonstratedtheenessofMC-DTinexploitingthesimilarityinformationtorecoverthefullnetwork.ItcanalsobeinterestingtocomparetheMC-DTtothenaiveMFmethodwhichdoesnotutilizenodefeatures.WenoticethatMC-DTachievesbetterperformance,asitcombinestheinformationfromthenodefeaturesaswellasthenetworkstructure.Thetgapbetweentheperformanceofthesetwoalgorithmsdemonstratedtheimportanceofsimilarityinformationinnetworkcompletionproblem.Wenotethatwhennetworksareincomplete,theperformanceofMC-DTdegradesgracefully,asitisbeabletorelyonthenodefeaturesinformationwhichcompensateforthelackoflinksinthenetworkstructure.Finally,itisofinteresttonotethatbycomparingtheresultsforsyntheticandrealdatasets,itcanbeseenthatinthesyntheticdatasetthedecreaseoferrorbyincreasingthenumberofobservednodesistinitially,butitslowlyshrinksafterwards.Comparethistorealworlddatasetswherethedecreaseisroughlylinear.Consideringthedependencyofourtheoreticalupperboundonrecoveryerror,thisobservationiscompletelyconsistentwithourtheoreticalresult.TheresultofFigure9.2showsthattofintuitive:Eventhoughthenetworkcontainsmanymissinglinks,MC-DTstilloutperformsothermethodstlybybetterleveragingtheinformationpresentinthenodefeatures.WenotethatsinceMC-DTignoresthemissingpartofthenetworkandcompletesthesub-matirxpurelyfromtheobservedpartofthenetwork,thecompletionerrorbecomeszeroandisnotpropagatedintransductionstage.130Table9.4:ComparisonoftalgorithmsonGoogle+withdeferentsizesofobservednodes.MethodRMSE(30%)MAE(30%)RMSE(60%)MAE(60%)MF0.972010.891320.817490.76601MF-NF0.863840.820940.715050.68935MF-SS0.813680.788200.583690.68160MC-DT0.788060.501860.508510.30011Figure9.3:TherecoveryoffouralgorithmsonFacebookdatasetmeasuredinRMSEundertpercentageofobservednodes.131Figure9.4:Therecoveryoffouralgorithmso
nFacebookdatasetmeasuredinMAEundertpercentageofobservednodes.132Chapter10ConclusionsDuetointernetinformationoverloadontheweb,themostrelevantinformationisbecomingachallengingtask.RecommendersystemshaveplayedanimportantroleinreducingthenegativeimpactsofinformationoverloadonWebandhelpuserstotheusefulinformation.Therearemanytrecommendersthatallaretryingtosuggestthemostrelevantinformationtousers.Howevermostofthemfromthedatasparsityanditsextremecase,whichiscold-start.Wehaveproposedanovelfactorizationmodel,thatexplicitlyexploitsthesimilarityinformationaboutusersanditemstoalleviatethedatasparsitychallenge.Twokeyfeaturesofourproposedframeworkarethecompletionofasub-matrixoftheratingmatrix,whichisgeneratedbyexcludingthecold-startusersanditemsfromthesetofusersanditems,andtransductionofknowledgefromrecoveredsub-matrixofexistingusersanditemstothoseofcold-start.Inparticular,itdecouplesthecompletionfromtheknowledgetransduction,whichpreventstheerrorpropagationofcompletionandtransduction.Anothermajorapplicationofthisalgorithmisforcompletingthenetworks,whichthroughlydiscussedandexperimentallyevaluated.Wealsointroducedasemi-supervisedcollaborativerankingmodelbyleveragingsideinformationaboutbothobservedandmissingratingsincollaborativelylearningtherankingmodel.Inthelearnedmodel,unrateditemsareconservativelypushedaftertherelevantandbeforetherelevantitemsintherankedlistofitemsforeachindividualuser.This133crucialgreatlybooststheperformanceandlimitsthebiascausedbylearningonlyfromsparsenon-randomobservedratings.Theproposedalgorithmiscomparedwithmanybaselinealgorithmsonthreerealworlddatasetsthatdemonstratedtheectivenessofproposedalgorithminaddressingcold-startproblemandmitigatingthedatasparsityproblem,whilebeingrobusttosamplingofmissingratings.Wehavealsopresentedanovelapproachforrecommendationwithsocialinformationbycollaborativelyrankingthelatentfeaturesofusersinmatrixfactorizationbyexploitingthesocialcontextofusers.Incontrasttosocialregularizationbasedmethodswhichareabletoexploiteithertrustor
distrust relations, and exclude the neutral friends in regularization, the proposed PushTrust algorithm is able to simultaneously exploit the trust and distrust relationships between users and considers neutral friends in ranking. Finally, the potential of distrust as side information to improve the accuracy of recommender systems and overcome the cold-start problems in traditional recommender systems is experimentally investigated on the Epinions dataset.

BIBLIOGRAPHY

[1] J. Abernethy, F. Bach, T. Evgeniou, and J.-P. Vert. A new approach to collaborative filtering: Operator estimation with spectral regularization. The Journal of Machine Learning Research, 10:803–826, 2009.

[2] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749, 2005.

[3] D. Agarwal and B.-C. Chen. Regression-based latent factor models. In SIGKDD, pages 19–28. ACM, 2009.

[4] D. Agarwal and S. Merugu. Predictive discrete latent factor models for large scale dyadic data. In ACM SIGKDD, pages 26–35. ACM, 2007.

[5] S. Agarwal. The infinite push: A new support vector ranking algorithm that directly optimizes accuracy at the absolute top of the list. In SDM, pages 839–850. SIAM, 2011.

[6] M. Aghagolzadeh, I. Barjasteh, and H. Radha. Transitivity matrix of social network graphs. In 2012 IEEE Statistical Signal Processing Workshop (SSP).

[7] A. Annibale and A. Coolen. What you see is not what you get: how sampling affects macroscopic features of biological networks. Interface Focus, 1(6):836–856, 2011.

[8] I. Barjasteh, R. Forsati, A.-H. Esfahanian, and H. Radha. Semi-supervised collaborative ranking with push at top. arXiv preprint arXiv:1511.05266, 2015.

[9] I. Barjasteh, R. Forsati, F. Masrour, A.-H. Esfahanian, and H. Radha. Cold-start item and user recommendation with decoupled completion and transduction. In Proceedings of the 9th ACM Conference on Recommender Systems, pages 91–98. ACM, 2015.

[10] I. Barjasteh, R. Forsati, D. Ross, A.-H. Esfahanian, and H. Radha. Cold-start recommendation with provable guarantees: A decoupled approach. IEEE Transactions on Knowledge and Data Engineering, 28(6):1462–1474, 2016.

[11] I. Barjasteh, Y. Liu, and H. Radha. Trending videos: Measurement and analysis. arXiv preprint arXiv:1409.7733, 2014.

[12] J. Basilico and T. Hofmann. Unifying collaborative and content-based filtering. In ICML, page 9. ACM, 2004.

[13] R. M. Bell and Y. Koren. Lessons from the Netflix prize challenge. SIGKDD Explorations Newsletter, 9(2), 2007.

[14] D. Billsus and M. J. Pazzani. User modeling for adaptive news access. User Modeling and User-Adapted Interaction, 10(2-3):147–180, 2000.

[15] C. J. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121–167, 1998.

[16] M. Burke and R. Kraut. Mopping up: modeling wikipedia promotion decisions. In Proceedings of the 2008 ACM Conference on Computer Supported Cooperative Work, pages 27–36. ACM, 2008.

[17] R. Burke. Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction, 12(4):331–370, 2002.

[18] J.-F. Cai, E. J. Candes, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4):1956–1982, 2010.

[19] E. J. Candes and B. Recht. Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9(6):717–772, 2009.

[20] E. J. Candes and T. Tao. The power of convex relaxation: Near-optimal matrix completion. Information Theory, IEEE Transactions on, 56(5):2053–2080, 2010.

[21] K.-Y. Chiang, C.-J. Hsieh, N. Natarajan, I. S. Dhillon, and A. Tewari. Prediction and clustering in signed networks: a local to global perspective. JMLR, 15(1):1177–1213, 2014.

[22] K. Christakopoulou and A. Banerjee. Collaborative ranking with a push at the top. In WWW, pages 205–215. International World Wide Web Conferences Steering Committee, 2015.

[23] M. Claypool, A. Gokhale, T. Miranda, P. Murnikov, D. Netes, and M. Sartin. Combining content-based and collaborative filters in an online newspaper. In Proceedings of the ACM SIGIR Workshop on Recommender Systems, volume 60. Citeseer, 1999.

[24] G. Contardo, L. Denoyer, and T. Artieres. Representation learning for cold-start recommendation. arXiv preprint arXiv:1412.7156, 2014.

[25] P. Cremonesi, Y. Koren, and R. Turrin. Performance of recommender algorithms on top-n recommendation tasks. In ACM RecSys, pages 39–46. ACM, 2010.

[26] M. Dadashi, I. Barjasteh, and M. Jalili. Rewiring dynamical networks with prescribed degree distribution for enhancing synchronizability. Chaos: An Interdisciplinary Journal of Nonlinear Science, 20(4):043119, 2010.

[27] S. Das, H. Salehi, Y. Shi, S. Chakrabartty, R. Burgueno, and S. Biswas. Towards packet-less ultrasonic sensor networks for energy-harvesting structures. Computer Communications, 2016.

[28] R. Devooght, N. Kourtellis, and A. Mantrach. Dynamic matrix factorization with priors on unknown values. In ACM SIGKDD, pages 189–198. ACM, 2015.

[29] G. Dror, N. Koenigstein, Y. Koren, and M. Weimer. The Yahoo! Music dataset and KDD-Cup'11. In KDD Cup, pages 8–18, 2012.

[30] A. Elbadrawy and G. Karypis. Feature-based similarity models for top-n recommendation of new items. 2014.

[31] A. Elbadrawy and G. Karypis. User-specific feature-based similarity models for top-n recommendation of new items. ACM Transactions on Intelligent Systems and Technology (TIST), 6(3):33, 2015.

[32] Y. Fang and L. Si. Matrix co-factorization for recommendation with rich side information and implicit feedback. In Proceedings of the 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems, pages 65–69. ACM, 2011.

[33] R. Forsati, I. Barjasteh, F. Masrour, A.-H. Esfahanian, and H. Radha. PushTrust: An efficient recommendation algorithm by leveraging trust and distrust relations. In Proceedings of the 9th ACM Conference on Recommender Systems, pages 51–58. ACM, 2015.

[34] R. Forsati, I. Barjasteh, D. Ross, A.-H. Esfahanian, and H. Radha. Network completion by leveraging similarity of nodes. Social Network Analysis and Mining, 6(1):102, 2016.

[35] R. Forsati, M. Mahdavi, M. Shamsfard, and M. Sarwat. Matrix factorization with explicit trust and distrust side information for improved social recommendation. ACM Transactions on Information Systems, 32(4):17, 2014.

[36] O. Frank. Network sampling and model fitting. Models and Methods in Social Network Analysis, pages 31–56, 2005.

[37] Z. Gantner, L. Drumond, C. Freudenthaler, S. Rendle, and L. Schmidt-Thieme. Learning attribute-to-feature mappings for cold-start recommendations. In ICDM. IEEE, 2010.

[38] Z. Gantner, S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme. MyMediaLite: A free recommender system library. In ACM Recommender Systems, 2011.

[39] T. George and S. Merugu. A scalable collaborative filtering framework based on co-clustering. In ICDM, 2005.

[40] A. A. Gittens. Topics in randomized numerical linear algebra. PhD thesis, California Institute of Technology, 2013.

[41] J. Golbeck and J. Hendler. FilmTrust: Movie recommendations using trust in web-based social networks. In Proceedings of the IEEE Consumer Communications and Networking Conference, volume 96. Citeseer, 2006.

[42] A. Goldberg, B. Recht, J. Xu, R. Nowak, and X. Zhu. Transduction with matrix completion: Three birds with one stone. In NIPS, pages 757–765, 2010.

[43] Q. Gu and J. Zhou. Learning the shared subspace for multi-task clustering and transductive transfer learning. In ICDM '09, pages 159–168. IEEE, 2009.

[44] R. Guha, R. Kumar, P. Raghavan, and A. Tomkins. Propagation of trust and distrust. In Proceedings of the 13th International Conference on World Wide Web, pages 403–412. ACM, 2004.

[45] R. Guimerà and M. Sales-Pardo. Missing and spurious interactions and the reconstruction of complex networks. Proceedings of the National Academy of Sciences, 106(52):22073–22078, 2009.

[46] A. Gunawardana and C. Meek. A unified approach to building hybrid recommender systems. In Proceedings of the Third ACM Conference on Recommender Systems, 2009.

[47] S. K. Gupta, D. Phung, B. Adams, T. Tran, and S. Venkatesh. Nonnegative shared subspace learning and its application to social media retrieval. In ACM SIGKDD, pages 1169–1178. ACM, 2010.

[48] S. K. Gupta, D. Phung, B. Adams, and S. Venkatesh. Regularized nonnegative shared subspace learning. Data Mining and Knowledge Discovery, 26(1):57–97, 2013.

[49] S. Hanneke and E. P. Xing. Network completion and survey sampling. In AISTATS, pages 209–215, 2009.

[50] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl. Evaluating collaborative filtering recommender systems. ACM TOIS, 22(1):5–53, 2004.

[51] M. Jamali and M. Ester. A matrix factorization technique with trust propagation for recommendation in social networks. In Proceedings of the Fourth ACM Conference on Recommender Systems, pages 135–142. ACM, 2010.

[52] M. Jamali and M. Ester. A transitivity aware matrix factorization model for recommendation in social networks. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume Three, pages 2644–2649. AAAI Press, 2011.

[53] D. Jannach, M. Zanker, A. Felfernig, and G. Friedrich. Recommender systems: an introduction. Cambridge University Press, 2010.

[54] K. Järvelin and J. Kekäläinen. IR evaluation methods for retrieving highly relevant documents. In ACM SIGIR, pages 41–48. ACM, 2000.

[55] M. Kim and J. Leskovec. The network completion problem: Inferring missing nodes and edges in networks. In SDM, pages 47–58. SIAM, 2011.

[56] Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In KDD '08, pages 426–434.

[57] Y. Koren. Factor in the neighbors: Scalable and accurate collaborative filtering. ACM Transactions on Knowledge Discovery from Data (TKDD), 4(1):1, 2010.

[58] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30–37, 2009.

[59] J. Lee, S. Bengio, S. Kim, G. Lebanon, and Y. Singer. Local collaborative ranking. In Proceedings of the 23rd International Conference on World Wide Web, pages 85–96. ACM, 2014.

[60] D. Lemire and A. Maclachlan. Slope One predictors for online rating-based collaborative filtering. In SDM, volume 5, pages 1–5. SIAM, 2005.

[61] D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7):1019–1031, 2007.

[62] B. Lika, K. Kolomvatsos, and S. Hadjiefthymiades. Facing the cold start problem in recommender systems. Expert Systems with Applications, 41(4):2065–2073, 2014.

[63] J. Lin, K. Sugiyama, M.-Y. Kan, and T.-S. Chua. Addressing cold-start in app recommendation: latent user models constructed from twitter followers. In ACM SIGIR, pages 283–292. ACM, 2013.

[64] G. Ling, M. R. Lyu, and I. King. Ratings meet reviews, a combined approach to recommend. In ACM Conference on Recommender Systems, pages 105–112. ACM, 2014.

[65] J. Liu, C. Wu, and W. Liu. Bayesian probabilistic matrix factorization with social relations and item contents for recommendation. Decision Support Systems, 2013.

[66] N. N. Liu, X. Meng, C. Liu, and Q. Yang. Wisdom of the better few: cold start recommendation via representative based rating elicitation. In ACM Conference on Recommender Systems, pages 37–44. ACM, 2011.

[67] W. Liu, J. Wang, and S.-F. Chang. Robust and scalable graph-based semisupervised learning. Proceedings of the IEEE, 100(9):2624–2638, 2012.

[68] M. Long, J. Wang, G. Ding, W. Cheng, X. Zhang, and W. Wang. Dual transfer learning. In SDM, pages 540–551. SIAM, 2012.

[69] H. Ma, I. King, and M. R. Lyu. Learning to recommend with social trust ensemble. In ACM SIGIR, pages 203–210. ACM, 2009.

[70] H. Ma, M. R. Lyu, and I. King. Learning to recommend with trust and distrust relationships. In Proceedings of the Third ACM Conference on Recommender Systems, pages 189–196. ACM, 2009.

[71] H. Ma, H. Yang, M. R. Lyu, and I. King. SoRec: social recommendation using probabilistic matrix factorization. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, pages 931–940. ACM, 2008.

[72] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King. Recommender systems with social regularization. In Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pages 287–296. ACM, 2011.

[73] F. Masrour, I. Barjasteh, R. Forsati, A.-H. Esfahanian, and H. Radha. Network completion with node similarity: a matrix completion approach with provable guarantees. In ASONAM, 2015.

[74] P. Massa and P. Avesani. Trust-aware collaborative filtering for recommender systems. In On the Move to Meaningful Internet Systems 2004: CoopIS, DOA, and ODBASE, pages 492–508. Springer, 2004.

[75] M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, pages 415–444, 2001.

[76] P. Melville, R. J. Mooney, and R. Nagarajan. Content-boosted collaborative filtering for improved recommendations. In AAAI/IAAI, pages 187–192, 2002.

[77] A. K. Menon, K.-P. Chitrapura, S. Garg, D. Agarwal, and N. Kota. Response prediction using collaborative filtering with hierarchies and side-information. In ACM SIGKDD, pages 141–149. ACM, 2011.

[78] A. K. Menon and C. Elkan. A log-linear model with latent features for dyadic prediction. In ICDM, pages 364–373. IEEE, 2010.

[79] A. K. Menon and C. Elkan. Link prediction via matrix factorization. In Machine Learning and Knowledge Discovery in Databases, pages 437–452. Springer, 2011.

[80] A. Mnih and R. Salakhutdinov. Probabilistic matrix factorization. In NIPS, pages 1257–1264, 2007.

[81] J. S. Morgan, I. Barjasteh, C. Lampe, and H. Radha. The entropy of attention and popularity in YouTube videos. arXiv preprint arXiv:1412.1185, 2014.

[82] A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4):1574–1609, 2009.

[83] Y. Nes
terov.Introductorylecturesonconvexoptimization,volume87.SpringerScience&BusinessMedia,2004.[84]Y.Nesterov.Gradientmethodsforminimizingcompositefunctions.MathematicalProgramming,140(1):125{161,2013.[85]W.Pan,E.W.Xiang,N.N.Liu,andQ.Yang.Transferlearningincollaborativeforsparsityreduction.InAAAI,volume10,pages230{235,2010.[86]M.Papagelis,G.Das,andN.Koudas.Samplingonlinesocialnetworks.KnowledgeandDataEngineering,IEEETransactionson,25(3):662{676,2013.[87]S.-T.ParkandW.Chu.Pairwisepreferenceregressionforcold-startrecommendation.InACMConferenceonRecommenderSystems,pages21{28.ACM,2009.[88]S.-T.Park,D.Pennock,O.Madani,N.Good,andD.DeCoste.Naeotsforrobustcold-startrecommendations.pages699{705,2006.[89]A.Paterek.Improvingregularizedsingularvaluedecompositionforcollaborativetering.InProceedingsofKDDcupandworkshop,volume2007,pages5{8,2007.[90]M.J.Pazzani.Aframeworkforcollaborative,content-basedanddemographicAIntelligenceReview,13(5-6):393{408,1999.[91]A.Popescul,D.M.Pennock,andS.Lawrence.Probabilisticmodelsforcol-laborativeandcontent-basedrecommendationinsparse-dataenvironments.InUAI,pages437{444,2001.[92]I.Porteous,A.U.Asuncion,andM.Welling.Bayesianmatrixfactorizationwithsideinformationanddirichletprocessmixtures.InAAAI,2010.[93]B.Recht.Asimplerapproachtomatrixcompletion.TheJournalofMachineLearningResearch,12:3413{3430,2011.[94]S.Rendle.Factorizationmachines.InICDM,pages995{1000.IEEE,2010.143[95]S.Rendle.Factorizationmachineswithlibfm.ACMTransactionsonIntelligentSys-temsandTechnology(TIST),3(3):57,2012.[96]S.RendleandL.Schmidt-Thieme.Online-updatingregularizedkernelmatrixfactor-izationmodelsforlarge-scalerecommendersystems.InProceedingsofthe2008ACMconferenceonRecommendersystems,pages251{258.ACM,2008.[97]J.D.RennieandN.Srebro.Fastmaximummarginmatrixfactorizationforcollabo-rativeprediction.InICML,pages713{719.ACM,2005.[98]S.Roweis.Nipsdataset(2002).http://www.cs.nyu.edu/~roweis.[99]C.Rudin.Thep-normpush:Asimpleconvexrankingalgorithmthatconcentratesatthetopofthelist.TheJournalofMac
hineLearningResearch,10:2233{2271,2009.[100]L.SafouryandA.Salah.Exploitinguserdemographicattributesforsolvingcold-startprobleminrecommendersystem.LectureNotesonSoftwareEngineering,1(3):303{307,2013.[101]B.Sarwar,G.Karypis,J.Konstan,andJ.Riedl.Item-basedcollaborativerecommendationalgorithms.InWWW,pages285{295.ACM,2001.[102]M.SaveskiandA.Mantrach.Itemcold-startrecommendations:learninglocalcollec-tiveembeddings.InACMConferenceonRecommenderSystems,pages89{96.ACM,2014.[103]A.I.Schein,A.Popescul,L.H.Ungar,andD.M.Pennock.Methodsandmetricsforcold-startrecommendations.InACMSIGIR,pages253{260.ACM,2002.[104]H.ShanandA.Banerjee.Generalizedprobabilisticmatrixfactorizationsforcollabo-rativeInICDM,pages1025{1030.IEEE,2010.[105]M.Sharma,J.Zhou,J.Hu,andG.Karypis.Feature-basedfactorizedbilinearsimilaritymodelforcold-starttop-nitemrecommendation.InSDM,2015.[106]Y.Shi,M.Larson,andA.Hanjalic.Collaborativeingbeyondtheuser-itemmatrix:Asurveyofthestateoftheartandfuturechallenges.ACM(CSUR),47(1):3,2014.[107]M.Shiga,I.Takigawa,andH.Mamitsuka.Annotatinggenefunctionbycombiningexpressiondatawithamodulargenenetwork.Bioinformatics,23(13):i468{i478,2007.144[108]R.R.SinhaandK.Swearingen.Comparingrecommendationsmadebyonlinesystemsandfriends.[109]L.H.Son.Dealingwiththenewusercold-startprobleminrecommendersystems:Acomparativereview.InformationSystems,2014.[110]N.Srebro,J.Rennie,andT.S.Jaakkola.Maximum-marginmatrixfactorization.pages1329{1336.NIPS,2004.[111]H.Steck.Trainingandtestingofrecommendersystemsondatamissingnotatrandom.InKDD,pages713{722.ACM,2010.[112]H.Steck.Gaussianrankingbymatrixfactorization.InProceedingsofthe9thACMConferenceonRecommenderSystems,pages115{122.ACM,2015.[113]M.Sun,F.Li,J.Lee,K.Zhou,G.Lebanon,andH.Zha.Learningmultiple-questiondecisiontreesforcold-startrecommendation.InProceedingsofthesixthACMinter-nationalconferenceonWebsearchanddatamining,pages445{454.ACM,2013.[114]M.Trevisiol,L.M.Aiello,R.Schifanella,andA.Jaimes.Cold-startnewsrecommen-dationwithdomain-dependentbrowsegraph.InA
CMConferenceonRecommenderSystems,volume14,2014.[115]N.Verbiest,C.Cornelis,P.Victor,andE.Herrera-Viedma.Trustanddistrustaggre-gationenhancedwithpathlengthincorporation.FuzzySetsandSystems,202:61{74,2012.[116]P.Victor,C.Cornelis,M.D.Cock,andA.Teredesai.Trust-anddistrust-basedrecommendationsforcontroversialreviews.IEEEIntelligentSystems,26(1):48{55,2011.[117]P.Victor,C.Cornelis,andM.DeCock.Trustnetworksforrecommendersystems,volume4.Springer,2011.[118]P.Victor,N.Verbiest,C.Cornelis,andM.D.Cock.Enhancingthetrust-basedrec-ommendationprocesswithexplicitdistrust.ACMTransactionsontheWeb(TWEB),7(2):6,2013.[119]M.VolkovsandR.S.Zemel.Collaborativerankingwith17parameters.InAdvancesinNeuralInformationProcessingSystems,pages2294{2302,2012.145[120]D.Zhang,Q.Zou,andH.Xiong.Cruc:Cold-startrecommendationsusingcollabora-tiveininternetofthings.arXivpreprintarXiv:1306.0165,2013.[121]K.Zhou,S.-H.Yang,andH.Zha.Functionalmatrixfactorizationsforcold-startrecommendation.InACMSIGIR,pages315{324.ACM,2011.[122]T.Zhou,H.Shan,A.Banerjee,andG.Sapiro.Kernelizedprobabilisticmatrixfactor-ization:Exploitinggraphsandsideinformation.InSDM,volume12,pages403{414.SIAM,2012.146