CONTINUOUS USER AUTHENTICATION AND IDENTIFICATION USING USER INTERFACE INTERACTIONS ON MOBILE DEVICES

By
Vaibhav Bhushan Sharma

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Computer Science - Master of Science

2015

ABSTRACT

We investigate whether a mobile application can continuously and unobtrusively authenticate and identify its users based on only their interactions with the User Interface of the application. A unique advantage that this modality provides over currently explored implicit modalities on mobile devices is that every user who uses the mobile application is automatically enrolled into the classification system. Every user must interact with the User Interface of an application in order to use it, and therefore this modality is always guaranteed to have a sufficient number of inputs for training and testing purposes. Using different types of input controls available on the Android platform, we collected interactions from 42 users in five different sessions. We created base classifiers from each type of input control and combined them into an ensemble classifier in order to authenticate and identify users. We find that a Support Vector Machine based ensemble classifier achieves a mean equal error rate of 5% in case of authentication and a mean accuracy of 90% in case of identification. We find that Support Vector Machine based ensemble classifiers outperform other techniques in both cases. While the ensemble classifier performance for authentication and identification is not found to be sufficient for it to replace current primary authentication mechanisms used in mobile applications, its truly continuous nature provides motivation for it to be used in combination with primary mechanisms.
ACKNOWLEDGEMENTS

I would like to thank my advisor, Dr. Richard J. Enbody, for guiding me in the right direction when I was working towards a solution for the research problem presented in this thesis. I wish to express my gratitude to Dr. Arun Ross for teaching a wonderful course on pattern recognition. I would also like to thank two friends, Sunpreet Singh Arora and Sorrachai Yingchareonthawornchai, for their helpful suggestions during discussions about this research. I would also like to thank all the volunteers who provided their usage data. This work would not have been possible without their support. Finally, I wish to thank my friends and family for always encouraging me and believing in me during difficult times.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
Chapter 1 Introduction
  1.1 Explicit Authentication
  1.2 Implicit Authentication
    1.2.1 Explored Implicit Authentication Schemes
      1.2.1.1 Touch-based Authentication
      1.2.1.2 Keystroke-based Authentication
      1.2.1.3 Sensor-based Authentication
    1.2.2 UI Interaction-based Authentication
  1.3 Implicit User Identification
    1.3.1 Sensor-based Implicit User Identification
    1.3.2 UI Interaction-based Implicit User Identification
Chapter 2 Related Work
Chapter 3 General Idea and Goals
  3.1 Motivation
    3.1.1 Authentication
    3.1.2 Identification
  3.2 User Interface Design using Android Input Controls
  3.3 Continuous Authentication and Identification using UI Interactions
    3.3.1 Enrollment Phase
    3.3.2 Authentication/Identification Phase
Chapter 4 Data Acquisition
  4.1 MSU SIRB Approval for Data Collection
  4.2 Design of Android Application for Data Collection
    4.2.1 Intra-session Consistency of User Interface
    4.2.2 Inter-session Consistency of User Interface
  4.3 Data Collection Procedure
  4.4 Data Statistics
  4.5 Data Collection Device
Chapter 5 Classification Framework
  5.1 Ensemble Classifier
    5.1.1 Authentication
    5.1.2 Identification
  5.2 Feature Set
  5.3 Evaluation Methodology
    5.3.1 Evaluation for Authentication
      5.3.1.1 Support Vector Data Descriptor
      5.3.1.2 One Class Support Vector Machine
      5.3.1.3 Two Class Support Vector Machine
    5.3.2 Evaluation for Identification
      5.3.2.1 Support Vector Machine-based Classifier
      5.3.2.2 Gaussian Discriminant Analysis-based Classifier
      5.3.2.3 3-Nearest Neighbor-based Classifier
  5.4 Ensemble Parameters
    5.4.1 Authentication
    5.4.2 Identification
Chapter 6 Experimental Results
  6.1 Authentication
  6.2 Identification
  6.3 Evaluation on Real-World Application
Chapter 7 Discussion
Chapter 8 Conclusion
BIBLIOGRAPHY

LIST OF TABLES

Table 3.1: Android Input Controls used for data collection
Table 4.1: Data Set collected from 42 users
Table 5.1: Feature subset used for different input control classifiers

LIST OF FIGURES

Figure 1.1: Touch gestures on Android EditText for two users. It can be seen that these two users focused on two different parts of the EditText input control to bring keyboard focus to it.
Figure 1.2: Radiobutton usage for two users from our data collection shows separability. It can be seen that these two users exerted different pressures when attempting to select a Radiobutton for the first time.
Figure 3.1: Android Input Controls used for data collection
Figure 5.1: Ensemble classification framework for user authentication and identification using Android input controls
Figure 6.1: ROC curve comparing the ensemble classifier when using SVDD, One Class SVM and Two Class SVM as base classifiers
Figure 6.2: Equal Error Rate for the ensemble classifier when using SVDD, One Class SVM and Two Class SVM as base classifiers
Figure 6.3: Box plot showing identification accuracy of the ensemble classifier based on Gaussian Discriminant Analysis, Support Vector Machine and 3 Nearest Neighbor classifiers
Figure 6.4: Change in identification accuracy of ensemble classifier with increasing number of enrolled users. The ensemble classifier remains consistently accurate with change in user count.
Figure 6.5: Number of app usage sessions vs. top K predicted classes from candidate classifiers
Figure 6.6: Input control interactions [...] the number of sessions. Top three most probable classes were taken from candidate input control classifiers. GDA and 3NN classifiers were used, all five app usage sessions were used to get the average test error.

Chapter 1 Introduction

With the widespread use of mobile devices, a large number of mobile users use their devices for accessing sensitive information. Mobile devices which have been compromised can cause theft of such critical information and be harmful to the mobile user's wellbeing.
Examples of such information include usage of mobile banking applications, critical information saved in email accounts, and private contact information stored for friends and family. In order to prevent attackers from getting access to such information, mobile device manufacturers typically program devices to explicitly authenticate the mobile user before allowing access to the device.

1.1 Explicit Authentication

Typically used explicit authentication mechanisms include asking the user to set up a passcode or a pattern on boot of the device. However, a variety of issues manifest themselves in these authentication schemes. Users must remember the passcodes or patterns at all times. This form of explicit authentication also diverts the user's attention away from the purpose for which they wish to use the device. However, the most critical issue with such authentication schemes is that they provide all-or-nothing access to the device. In other words, the user either gets access to all the applications on the mobile device or gets no access at all. In the context of a convenience vs. security tradeoff, such schemes compromise heavily on user convenience. A critical assumption made by such schemes is that only the legitimate user will use the device after having unlocked it and that the device will be locked again after the usage session. However, mobile users often encounter situations when the device has to be put away for a brief period of time due to another situation requiring their urgent attention, leaving the device vulnerable to attackers. As reported by Lookout Security [21], one in 10 mobile users in the United States is a victim of mobile theft, which can lead to further information theft. Such scenarios necessitate other authentication schemes [19] which are less obtrusive and more continuous while remaining accurate enough to be useful in practical situations.

1.2 Implicit Authentication

Implicit Authentication (IA) schemes have been an active area of research in the past few years. The general idea when doing implicit authentication is to only use inputs which are given by users indirectly and to decide whether any such input or group of inputs was provided by a legitimate user that we have previously seen or by an unknown user.
Investigation has been carried out to check if modalities derived from implicit input contain information which is discriminant enough to distinguish between a known user and a set of unknown users. It has been found that users are surprisingly receptive [12] to using such non-intrusive schemes for authentication even though IA schemes continue to have security limitations. The criterion which makes IA more desirable is that it is strictly non-intrusive and continuous, providing users a feeling of security while not creating any barriers to providing desirable functionality to users. Some of the IA modalities that have been explored are described in the next section.

1.2.1 Explored Implicit Authentication Schemes

1.2.1.1 Touch-based Authentication

The most prominent form of capturing the user's implicit inputs to a mobile device would be via the touchscreen, since all of a user's interactions with the mobile device occur via the touchscreen. An example of such a modality would be touch-based authentication [8], wherein the authors gathered vertical and horizontal gestures performed by 41 users while using different mobile phone applications. However, this modality requires users to provide scroll gestures as inputs to the device, thereby creating a limitation of the modality not getting inputs fast enough on devices with larger screens.

1.2.1.2 Keystroke-based Authentication

Feng et al. [5] explored the possibility of authenticating users via their behavior observed when they used the soft keyboard available on mobile devices. While this modality has the advantage of being resilient to attack, it is limited to being useful only when attackers are forced to use the soft keyboard for their purposes. In scenarios when malicious tasks can be accomplished without providing any keystrokes, this modality has limited value.
1.2.1.3 Sensor-based Authentication

Kayacik et al. [15] devised a sensor-based authentication mechanism which created a user profile from sensor data during the mobile device's usage, measured its stability and switched to an authentication phase once a user profile stabilized. However, this technique requires invasive access to a user's private information such as location inferred from cell towers several times a day, access to Wi-Fi signals and accelerometer readings. In situations where a user may not be entirely comfortable with sharing this information with an authentication mechanism, this technique would have limited value.

1.2.2 UI Interaction-based Authentication

In this report, we propose a new modality for implicitly authenticating users to their mobile devices. Every mobile application will always have a User Interface (UI) to interact with its users. We observe user behavior as seen through interactions with the application's UI and use it for not only authenticating, but also identifying users. This technique has the advantage of being available on every mobile device regardless of screen size. Every popular mobile application will have a UI which users will interact with and thereby provide implicit inputs for this modality. These inputs can be captured in a non-obtrusive manner and do not impose any specific requirements on the application, such as requiring the application to gather inputs from the user via a soft keyboard. Also, this modality can capture inputs without requiring invasive access to the mobile device's sensors. Thus, this modality is almost always guaranteed inputs regardless of the sensors present on the device.

As part of this investigation, we use every available form of UI interaction in the Android OS [10] to authenticate users based on their UI interactions during application usage sessions. We propose that it is possible to use a combination of all kinds of available input controls on the Android OS to perform implicit authentication and identification on an Android device.
We hypothesize that an interaction of a user with the UI of a mobile application contains some distinguishing information which, when put together with every piece of information extracted from UI interactions in the same app usage session, can help determine the identity of the user. The intuition behind this hypothesis is illustrated in Figure 1.1. This figure shows the coordinates where two different users touched in order to bring the keyboard's focus to the edittext. It can be seen clearly that these two users focused on different parts of the edittext, which shows how the edittext input control is able to capture discriminant information from these two users. Figure 1.2 demonstrates the separability between touch features extracted from two different users when they were interacting with radiobuttons. It shows how the features naturally lend themselves to a clear classification and also how feature values for different users occupy different spaces in the feature space. The objective of this study is to evaluate if such UI interactions are discriminant enough to provide an implicit form of user authentication instead of explicit forms such as secret pins or passwords.

Figure 1.1: Touch gestures on Android EditText for two users. It can be seen that these two users focused on two different parts of the EditText input control to bring keyboard focus to it.

Figure 1.2: Radiobutton usage for two users from our data collection shows separability. It can be seen that these two users exerted different pressures when attempting to select a Radiobutton for the first time.

1.3 Implicit User Identification

While modalities providing authentication mechanisms help determine if a test input belongs to the legitimate user or not, the problem of user identification assigns every test input to one user in a set of already known users. Thus, being able to not only authenticate but also identify a user based on their test inputs provides a stronger guarantee on the classification of a test input.

1.3.1 Sensor-based Implicit User Identification

Shi et al. [23] make use of voice, location, touch and accelerometer sensor inputs for establishing user identity and trigger explicit authentication when user identity is found to have changed. A major limitation of this approach is its dependence on multiple sensors being available on the mobile device.
1.3.2 UI Interaction-based Implicit User Identification

We hypothesize that a user's interactions with the UI of a mobile application are discriminant enough to establish the user's identity. We collect UI interaction data from 42 users by asking volunteers to use 5 mobile applications with the same UI. We then train on 4 app usage sessions for every user and try to classify the 5th app usage session as belonging to 1 of the 42 known users. Being able to identify users in this manner has the advantage of not imposing the requirement of having multiple sensors available on the mobile device, while also not having to invasively capture any of the user's private information via any sensors apart from the touchscreen.

The remainder of this report details a literature survey of this research area, provides a description of the data collection performed and the classification framework used, and provides an evaluation of the classification framework. We discuss some limitations of this technique and provide concluding remarks on this modality.

Chapter 2 Related Work

The hypothesis stated above lies in the general area of implicit authentication on mobile devices. The general idea of implicit authentication is to authenticate the user based on a modality or a combination of different modalities for which the user provides samples in an implicit way, e.g. gestures that the user performs on the screen of a mobile device. Different schemes make different assumptions about the mobile device, e.g. Shi et al. [23] assume the accessibility of four sensor modalities including voice, location, multitouch and locomotion.

Frank et al. [8] extracted features from scrolling gestures performed by users during interaction with their data collection applications and showed users' touch interactions to be a practical modality for authenticating users. However, this modality makes use of an assumption that the mobile applications that the user or attacker uses will provide sufficient inputs to this technique. Shahzad et al. [22] demonstrate explicit touch gestures performed by users as another possible modality that can be used to authenticate users.
Feng et al. [4] demonstrate another touch-based modality and evaluate it on a large dataset of test users. These techniques suggest authentication mechanisms that capture data from the entire device irrespective of the application being used. However, both Feng et al. and Frank et al. suggest that taking application context into account may improve the overall authentication accuracy. Khan et al. [18] argue for a more application-centric approach and use the same touch-based methods on datasets collected from the same application.

A framework named Itus for doing app-centric implicit authentication has also been proposed by Khan et al. [17]. This framework proposes that greater flexibility, extensibility and control be provided to mobile application developers for performing implicit authentication. Other proposed implicit authentication techniques include using sensor-based modalities [14], [26], recognizing users from their gait [7] and using the device picking-up motion as a modality for user authentication [6]. Since the most recent work in this area has been in the direction of making implicit authentication more app-centric, the next logical step seemed to be to try to use app-specific User Interface interactions for authenticating the user.

The most comprehensive evaluation of different authentication schemes has been done by Khan et al. [16]. Six different IA schemes were evaluated against four independently available datasets containing data for over 300 participants. However, since none of the existing techniques had performed authentication using only the UI of a mobile application, there were no publicly available datasets on this modality, thereby prompting us to collect a dataset of our own.

One proposed authentication scheme most similar to ours would be LatentGesture [20].
However, this scheme uses only a subset of the input controls [1] available on the Android OS. The evaluation of this scheme for both authentication and identification is done on a much smaller set of users than our evaluation. In that study, the users were chosen on the basis of their prior experience with the mobile device and were also given the opportunity to test the applications before the data collection procedure was begun. This method of data collection contrasts heavily with our data collection procedure, wherein we did not select users based on any criteria, apart from asking them to volunteer their time for our data collection, and did not offer users any opportunity to get trained to interact with our data collection applications. We think this method of data collection closely mimics a practical attack scenario wherein an attacker may not have been previously trained to interact with the mobile application but would still be able to successfully interact with the applications.

Chapter 3 General Idea and Goals

In this section, we provide intuition for using UI interactions as a unique modality and describe how we modeled this modality for our investigative purposes.

3.1 Motivation

Several existing approaches to IA have demonstrated that every user behaves differently while interacting with a mobile device. The overall performance of an IA scheme depends on how well the scheme extracts discriminant features from its modality and on the classification algorithm used to authenticate and identify test inputs. The intuition behind this modality can be summarized as follows: As long as the User Interface for a mobile application remains consistent, user behavior while interacting with the User Interface also remains consistent. For example, the way a user types email into the textbox of an email application is different from the way a user types the password into a mobile banking application. Such observations motivated an investigation into checking if this behavior is consistent enough for one user while also being discriminant enough to distinguish among different users. We divided this investigation into two separate problems described in the following subsections.
3.1.1 Authentication

This problem can be phrased as follows:

Given a training set containing samples for a legitimate user and a set of impostors, classify UI interactions extracted from every app usage session as belonging to the legitimate user or an impostor.

3.1.2 Identification

This problem can be phrased as follows:

Given a training set containing samples for a set of enrolled users, classify UI interactions extracted from every app usage session as belonging to a user among the set of enrolled users.

3.2 User Interface Design using Android Input Controls

Android considers all UI elements that allow users to interact with them while dynamically providing visible feedback to users as input controls [1]. As shown in Figure 3.1, nine different types of input controls were used in this study to capture interactions with users. These, along with the corresponding interactions recorded for each user, are shown in Table 3.1.

Input Control     Functionality of Input Control
button            tap to start next activity
checkbox          tap to select value
radiobutton       tap to select value
switch            tap or horizontal gesture from left to right to toggle
togglebutton      tap to toggle value
picker            vertical gesture to pick a value
edittext          tap to bring keyboard focus
spinner item      tap to select item
spinner button    tap to show dropdown

Table 3.1: Android Input Controls used for data collection

Figure 3.1: Android Input Controls used for data collection

As shown by Table 3.1, three different kinds of button controls were used, viz. button (shown as the Next button in Figure 3.1), togglebutton and the spinner button used to initiate the spinner dropdown. The Next button's function was to suspend the current Activity [9] of the app and start the successive Activity, the togglebutton was used to change a value from True to False and vice versa, and the spinner button was used to trigger a dropdown menu from which the user could select a value. Since these three controls performed three different functions, all interactions performed with these three types of buttons were treated separately.
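The separate treatment of each input control type can be pictured as a routing step that groups logged interactions by control type before any per-type model sees them. The following is a minimal sketch only: the record format `(control_type, features)` and the names `INPUT_CONTROLS` and `group_by_control` are hypothetical and are not taken from the data collection applications.

```python
from collections import defaultdict

# The nine input control types of Table 3.1 and the gesture recorded for each.
INPUT_CONTROLS = {
    "button": "tap to start next activity",
    "checkbox": "tap to select value",
    "radiobutton": "tap to select value",
    "switch": "tap or horizontal gesture from left to right to toggle",
    "togglebutton": "tap to toggle value",
    "picker": "vertical gesture to pick a value",
    "edittext": "tap to bring keyboard focus",
    "spinner item": "tap to select item",
    "spinner button": "tap to show dropdown",
}


def group_by_control(interactions):
    """Group logged interactions by input control type.

    `interactions` is an iterable of (control_type, features) pairs; this
    record format is an assumption made for illustration.
    """
    groups = defaultdict(list)
    for control_type, features in interactions:
        if control_type not in INPUT_CONTROLS:
            raise ValueError(f"unknown input control type: {control_type}")
        groups[control_type].append(features)
    return dict(groups)
```

Grouping in this way keeps, for example, Next button taps apart from togglebutton and spinner button taps, so that each control type can be modeled on its own.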
3.3 Continuous Authentication and Identification using UI Interactions

The general idea behind verifying UI interactions for authenticating and identifying users as a new modality is to train on a user's behavior with the UI of an application during the initial sessions when the application is being used and then, during later usage sessions, authenticate or identify the user.

3.3.1 Enrollment Phase

During this phase, all interactions of a user with any element that is part of the UI of an application are logged. As many features as can be derived from every interaction are logged. The number of features that can be derived from every interaction depends on the API exposed by the underlying operating system. Most of the features we logged for every interaction could be derived from the Android GestureDetector [11] class. The number of usage sessions to be used for enrollment can be determined by the application developer. In Section 7, we will discuss the effect of changing the number of input controls on authentication and identification accuracy.

3.3.2 Authentication/Identification Phase

During this phase, every interaction of a user with any element of the UI is classified. The classification task depends on whether the decision is one of authentication or identification. At this stage one critical assumption is made: Every interaction performed during the same session of usage of an application must have been performed by the same user. Hence, all interactions performed during the same session can be combined to get one decision. In case of authentication, the entire session is classified as having been performed by either the legitimate user or the impostor. In case of identification, the entire session is classified as having been performed by one user among the set of known users.

Chapter 4 Data Acquisition

In this section, we describe the data acquisition protocol followed when collecting data for UI interactions.
4.1 MSU SIRB Approval for Data Collection

The protocol followed for collecting data for this study can be found in the documents approved by the Michigan State University - Social Science Behavioral/Education Institutional Review Board (SIRB) for IRB# 15-277. The data collection for this study began on April 23, 2015 and ended on May 22, 2015. As part of this protocol, users were first asked if they wished to volunteer for this data collection procedure. The data collection procedure and the three levels of authorization were then explained to willing volunteers. Data was then collected from every consenting volunteer as described in Section 4.3. One of the objectives of the design of the data collection applications was to prevent anyone with access to the data from identifying any of the volunteers who participated in this study.

4.2 Design of Android Application for Data Collection

An Android device was chosen for this data collection exercise. Given the nine types of input controls shown in Table 3.1, it was determined that every user should interact with 15-20 instances of each type of input control during every application usage session. In addition, it was determined that data from five such usage sessions should be obtained from every user. The value of five sessions was set to find the right balance between obtaining sufficient training and test data and keeping user frustration within reasonable limits. The questions were created in a manner which prevented any personal or private information being entered by the users while still provoking some thought in users before an interaction with an input control was begun. For example, one of the questions asked was: "Does March precede April?". These design principles were used to closely mimic usage in practical scenarios wherein users will typically stay aware of their interactions with the UI of an application. An attempt was made to keep the type of questions similar in every Activity of the application in order to reduce the time required to complete each session of data collection.

4.2.1 Intra-session Consistency of User Interface

Users interacted with 15-20 instances of the nine different types of input controls during every app usage session. An attempt was made to maintain consistency in user interactions with every type of input control.
For every input control, an attempt was made to keep all instances of the same input control vertically aligned. Also, all instances of input controls of the same type were kept of the same length and breadth to provide the same amount of area for each user to interact with the input control. We describe the design decisions made to maintain consistency among all interactions with the same type of input control.

Edittext: We captured the touch interaction done by each user to bring keyboard focus to it.

Button: The only button instances were recorded from the Next button shown in the bottom right corner of Figure 3.1.

Checkbox: As shown in Figure 3.1, instances of Checkbox were placed in two columns. In order to collect data from both placements of Checkbox instances, an equal number of correct answers was placed in both columns.

Radiobutton: Radiobutton instances allow only one possible selection in their group. Two instances of Radiobutton were included in each group, and an equal number of correct answers was placed in the 1st and the 2nd Radiobutton instance in each group.

Spinner: Placement similar to Radiobutton was used when designing the Activity containing instances of Spinner. Instances of Spinner were created to simulate the functionality provided by Spinner in order to collect data not only from the Spinner item selection but also from the initial button click to initiate the Spinner dropdown menu.

Togglebutton: Togglebuttons are different from other input controls because they can have only two possible values, one of which will be set as the default value in the Togglebutton instance. We needed to make users interact with Togglebutton instances in a natural way instead of simply changing the state of every Togglebutton instance. For this purpose, we created a set of 20 questions for which the answers were either True or False. 10 of these questions had a correct answer of True and the remaining 10 had a correct answer of False. We manipulated the Togglebutton instances such that 13 of the 20 Togglebutton instances required the users to change the state of the instance when attempting to answer correctly.
Switch: We attempted to limit users to toggling the switch from left to right during the interaction. Switches can be toggled by either tapping anywhere within the visible area of the Switch or by sliding the Switch to the intended state using horizontal gestures. While most users started by toggling switches using horizontal gestures, many of them resorted to using tap gestures with switches in the later app usage sessions.

Picker: Every user was asked to pick a value from a given set of values using vertical gestures; the set of picker values would circle back to the 1st value after the last value was passed. However, every picker would have an initial value set before the user's interaction. The target value was intentionally kept equidistant from the initial selected value in the picker.

4.2.2 Inter-session Consistency of User Interface

Among the five applications used for data collection, questions were designed for only the first application. The order of the questions was changed in order to create the remaining four applications. This design decision was taken primarily to keep the UI of all the applications consistent and to prevent users from memorizing answers to the questions they were being asked by the data collection applications. This design of the data collection app made users think at least momentarily before interacting with the question's corresponding input control for providing an answer. Since most input controls were vertically aligned, changing the order of the questions changed the applications enough to prevent memorized interactions without changing the UI of the applications.

4.3 Data Collection Procedure

The questions were intentionally made simple to answer to keep the level of frustration low among volunteers, but at the same time the questions were thought-provoking enough to prevent users from memorizing an answer without reading the complete question. This procedure also helped make the data collection applications more realistic since, during real Android application usage, users generally read and think before touching any part of the screen. No private questions were asked as part of the data collection exercise. An example of a question asked: "Which letter comes after the letter a?" Every user was identified by a subject number written in the consent form.
For every touch action performed on an input control instance, features from only the user's first interaction with that input control instance were extracted and used. For example, if, while answering a question, a user selected the 1st checkbox on a page of an application and later changed their mind and deselected that checkbox, features from only the 1st interaction were extracted. The data collection procedure started with getting the user's written consent; the users were then asked to answer the questions in the data collection applications one after the other. Every touch interaction performed by the user with the data collection applications was logged. The only instruction given to each user before beginning the data collection procedure was to use the applications as normally as they could. This method of data collection helped capture variance in user behavior since some users changed sitting positions after the first two usage sessions and some users would stop switching the finger being used to interact with the application. The actual answers being given by each user were ignored and only touch interactions performed by each user with each input control were logged. After every app usage session, users were asked to hand the mobile device back to us, and the next app usage session would begin after a couple of minutes. Thus, users had a chance to reorient themselves to the mobile device at the beginning of every session, which helped capture inter-session variance in user behavior.

4.4 Data Statistics

Table 4.1 shows some statistics on the total amount of data collected for this exercise. The number of touch events for buttons is much higher than for other input controls because button touch events were obtained from users in two ways:

1. Users were asked to touch a button on the bottom right corner of every page of the application to move to the next page, and each application usage session consisted of 11 such "next buttons".
2. Users had to touch a "spinner button" in order to cause every spinner dropdown to show up, and each application usage session consisted of 20 spinners.
Input Control     Number of Samples    Avg. samples per usage session
button            2306                 11
checkbox          4849                 23
radiobutton       4265                 20
togglebutton      2880                 14
switch            3416                 16
edittext          3926                 19
spinner           4357                 21
spinner-button    4461                 21
picker            3352                 16
Total             33812

Table 4.1: Data Set collected from 42 users

4.5 Data Collection Device

All data collection was done on a single Android device, the Google Nexus 7 (2012) running Android 4.4.3.

Chapter 5 Classification Framework

5.1 Ensemble Classifier

The presence of multiple instances of nine different types of interactions during each application usage session motivated the use of nine different classifiers. In addition, regardless of whether the problem of authentication or identification is being addressed, the classification decision should be the same for the entire session, given the assumption of the same user performing all the interactions seen during a session, thus further motivating the use of an ensemble classifier. The ensemble classifies every session, for both the authentication and identification problems, as follows.

1. An interaction with every instance of an input control type was classified by the corresponding input control classifier, e.g. interactions with instances of checkbox were classified by a checkbox classifier and interactions with instances of button were classified by a button classifier.
2. All interactions belonging to the same session were combined by each input control classifier. In case of the authentication problem, a threshold value was used with every input control classifier to choose a prediction. In case of the identification problem, the predicted class with the highest number of votes for the entire session was chosen by every input control classifier.
3. The most frequently predicted class among the set of predictions from the input control classifiers was chosen as the final prediction for both the authentication and identification problems.

The design of the ensemble classifier is summarized in Figure 5.1.

Figure 5.1: Ensemble classification framework for user authentication and identification using Android input controls

The classification framework detailed below makes use of a key assumption: All the interactions being performed during an app usage session are being performed by the same user.
This assumption leads to robustness in the ensemble classifier because all the interactions have to be ultimately classified as belonging to the same class. Hence, the classifications performed on all input control interactions can be combined into a single decision. This ensemble classification was performed differently for authentication and identification.

5.1.1 Authentication

Every input control classifier classifies every interaction as belonging to either the legitimate user or an impostor. An assumption made during this evaluation is that there is only one legitimate user for the device and all other users are impostors. Since all test interactions coming from the same app usage session must belong to the same class, the number of misclassifications (interactions classified as impostor) over all test interactions derived from the same type of input control and from the same session is calculated and compared against a threshold parameter. If this number falls below the threshold, the input control classifier predicts the legitimate class for the entire session; otherwise it predicts the impostor class for the entire session. Finally, the predicted classes from every input control classifier for the same app usage session are combined into a final predicted class by finding the most frequently occurring element (mode) for every app usage session.
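The session-level decision of Section 5.1.1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the thesis implementation: the function names, the label strings and the example data are hypothetical, and the per-interaction labels are assumed to come from already-trained input control classifiers.

```python
from collections import Counter

LEGITIMATE = "legitimate"
IMPOSTOR = "impostor"


def control_decision(interaction_labels, threshold):
    """Session-level decision of one input control classifier.

    Counts the interactions labeled as impostor and predicts the
    legitimate class only if that count falls below the threshold.
    """
    impostor_count = sum(1 for label in interaction_labels if label == IMPOSTOR)
    return LEGITIMATE if impostor_count < threshold else IMPOSTOR


def session_decision(labels_by_control, threshold):
    """Combine the per-control decisions by taking their mode.

    With an odd number of input controls (nine in the thesis) and two
    classes, no tie can occur; otherwise the first decision seen wins.
    """
    decisions = [
        control_decision(labels, threshold)
        for labels in labels_by_control.values()
    ]
    return Counter(decisions).most_common(1)[0][0]


# Hypothetical per-interaction labels for one app usage session.
session = {
    "button": [LEGITIMATE] * 10 + [IMPOSTOR] * 2,
    "checkbox": [IMPOSTOR] * 5,
    "switch": [LEGITIMATE] * 4,
}
print(session_decision(session, threshold=3))
```

Raising the threshold tolerates more impostor-labeled interactions per input control before that control rejects the session, which is the trade-off explored in Section 6.1.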
5.1.2 Identification

Every input control classifier classifies every interaction as belonging to one of 42 classes by creating 861 ((42*41)/2) classifiers, running all test interactions through the 861 classifiers and adding up votes for every test interaction. In addition, since all test interactions coming from the same app usage session must belong to the same class, the votes for every test interaction belonging to the same session are added up. At this stage, instead of choosing only the single class with the most votes, the top K classes are chosen as the most probable predicted classes for all the test interactions coming from the same app usage session. This procedure is done for every set of test interactions performed on the same input control. Thus, with nine different types of input controls, we get the top K classes from each of the nine input control classifiers. Finally, the top K classes for every set of interactions from the same app usage session are combined into a final predicted class by finding the most frequently occurring element (mode) for every app usage session. This framework is also illustrated in Figure 5.1.

5.2 Feature Set

The feature set explored for every touch interaction is as follows.

1. x coordinate of touch gesture start
2. y coordinate of touch gesture start
3. finger pressure at touch gesture start as reported by Android
4. finger size at touch gesture start as reported by Android
5. distance travelled by the finger along the x axis between touch gesture start and touch gesture end
6. distance travelled by the finger along the y axis between touch gesture start and touch gesture end
7. difference in pressure between touch gesture start and touch gesture end
8. difference in size between touch gesture start and touch gesture end
9. euclidean distance travelled between touch gesture start and touch gesture end
10. direction travelled between touch gesture start and touch gesture end
11. time lapse between touch gesture start and touch gesture end

Table 5.1 captures the feature subset found to be most useful for classification. The columns indicate the classifier associated with each input control type; the rows indicate whether the feature was used by the classifier or not.
Feature selection was done using Sequential Forward Selection. As can be concluded from Table 5.1, the difference in coordinates, distance, pressure, size and direction of touch were not found to be particularly useful. The x coordinate was particularly discriminant for all users and so was the amount of time each user spent touching the screen.

Feature              Input control classifiers using the feature
x coordinate         all nine
y coordinate         seven of nine (not edittext)
pressure             all nine
size                 five of nine
delta-x              none
delta-y              one (switch)
delta-pressure       none
delta-size           none
euclidean distance   none
direction            one (picker)
delta-time           all nine

Table 5.1: Feature subset used for different input control classifiers (button, checkbox, spinner-button, radiobutton, togglebutton, edittext, spinner-item, switch, picker)

All feature values were reported by the Android GestureListener except for time. Time was logged in milliseconds when a touch gesture start was reported and logged again when a touch gesture end was reported on the same input control. This difference was captured for every touch interaction and used.

The y coordinate not being found useful in case of the edittext input control can be intuitively explained: when most users touch an instance of an edittext in order to bring keyboard focus to it, the width of the edittext instance is much larger than its height, which gives users a bigger range to target along the width than along the height. In case of the switch input control, the usefulness of the delta-y feature can be explained by the fact that users were required to tap the switch or slide it horizontally from left to right. These two different ways of interacting with the switch input control resulted in different users moving it in different ways along the height of the switch instance. A similar insight applies to the picker input control, for which the direction feature turns out to be useful.

While scrolls and keystrokes performed by users were also captured as part of this data collection process, classifiers based on scrolls and keystrokes on mobile devices are already explored techniques and hence were not used in classification by input controls.
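To make the eleven candidate features of Section 5.2 concrete, the following Python sketch derives them from the start and end of one touch gesture. It is only an illustration: the `TouchPoint` type and the feature names are hypothetical, and two details the text does not specify are assumed here, namely that the deltas are signed (end minus start) and that direction is the angle of the displacement in radians.

```python
import math
from dataclasses import dataclass


@dataclass
class TouchPoint:
    """One logged touch sample (hypothetical structure)."""
    x: float
    y: float
    pressure: float
    size: float
    time_ms: float


def extract_features(start, end):
    """Return the eleven candidate features for one touch interaction."""
    dx = end.x - start.x
    dy = end.y - start.y
    return {
        "x": start.x,
        "y": start.y,
        "pressure": start.pressure,
        "size": start.size,
        "delta_x": dx,
        "delta_y": dy,
        "delta_pressure": end.pressure - start.pressure,
        "delta_size": end.size - start.size,
        "euclidean_distance": math.hypot(dx, dy),
        # Assumption: direction is the displacement angle in radians.
        "direction": math.atan2(dy, dx),
        "delta_time": end.time_ms - start.time_ms,
    }


# Hypothetical gesture: the finger moves 3 px right and 4 px down in 80 ms.
features = extract_features(
    TouchPoint(0.0, 0.0, 0.5, 0.2, 1000.0),
    TouchPoint(3.0, 4.0, 0.6, 0.1, 1080.0),
)
print(features["euclidean_distance"], features["delta_time"])
```

A per-control classifier would then keep only the subset of these features selected for its input control type.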
5.3 Evaluation Methodology

The evaluation of the ensemble classifier for authentication was done in an entirely different manner than that for identification. We discuss both evaluations in the following two sections. All Support Vector Machine [3] based classifiers were implemented using LibSVM [2].

5.3.1 Evaluation for Authentication

For authenticating a user to a device, we had data from 42 users, each user providing data from five application usage sessions. In order to correctly evaluate the performance of the ensemble classifier for authentication, we had to give each of the 42 users a chance to be the legitimate user and allow the remaining users to attack the ensemble classifier. However, we also had to divide the impostor user set into training and test impostors. In addition, the threshold parameter value also affects the performance of the ensemble classifier. In order to accommodate all these factors when evaluating the ensemble classifier for authentication, we calculated the mean False Acceptance Rate (FAR) and False Rejection Rate (FRR) for every possible value of threshold, test usage session, legitimate user, training impostors and test impostors. Three different base classifiers were tried for the ensemble classification task. We explain these in the following three sections. In all three base classification algorithms, any impostors who were used for training were not used for testing. This method of choosing impostors during the test phase closely resembles the real world scenario wherein unknown attackers, on whom the ensemble classifier has not been previously trained, may try to attack the application. The results from this evaluation are reported in Section 6.1.

5.3.1.1 Support Vector Data Descriptor

The Support Vector Data Descriptor (SVDD) [24] tries to fit a hypersphere in feature space around the training data provided for the legitimate user while also minimizing the volume of this hypersphere. Parameters for this classifier were found by doing 4-fold cross validation since training data was from four app usage sessions.

5.3.1.2 One Class Support Vector Machine

The One Class SVM-based classifier tries to create a hyperplane in feature space allowing all the training points to lie on one side of the hyperplane. Parameters for this classifier were found by doing 4-fold cross validation.
5.3.1.3 Two Class Support Vector Machine

The standard Support Vector Machine [3] based classifier was used, wherein four usage sessions of the legitimate user were chosen to create the training set for the legitimate class. Training data for the impostor class was created by choosing one usage session from each of four impostors. Thus an equal number of samples for both classes was used during training. For testing, the remaining usage session from the legitimate user was used, along with all the remaining users who were not part of the impostor class during training.

5.3.2 Evaluation for Identification

The performance of the ensemble classifier for multi-user identification can simply be considered to be the accuracy of the classifier when classifying each of the 42 test application usage sessions. Since data was collected from five application usage sessions, there are five possible test usage sessions which could be used. Considering each of the five application usage sessions as the test session one at a time, the remaining four app usage sessions were used for training and the mean accuracy of the classifier was calculated. Three different base classification algorithms were used to test the performance of the ensemble classifier. In all three cases, the same base classification algorithm was used for all nine input control classifiers and the 1-vs-1 multiclass classification algorithm was used for all three base classification algorithms. The results from this evaluation are reported in Section 6.2.

5.3.2.1 Support Vector Machine-based Classifier

A Support Vector Machine-based classifier was used to distinguish between samples of two classes. The Radial-Basis Function was used as the kernel function and parameters for the base classifier were found by doing 4-fold cross validation since training data was collected from four different usage sessions.

5.3.2.2 Gaussian Discriminant Analysis-based Classifier

A Gaussian Discriminant Analysis-based classifier was created by estimating the parameters (mean and covariance) from the training data. No cross validation was required for this base classifier. This technique had the advantage of not having to store the training data after the initial parameter estimation and was found to be faster than the SVM and 3NN based ensemble classifiers.

5.3.2.3 3-Nearest Neighbor-based Classifier

A 3-Nearest Neighbor-based classifier was tried as the base classifier for the ensemble.
No training and cross validation phase was required, but this technique has the disadvantage of having to store the training data at the time of classification. The ensemble classifier using only this technique was found to be the slowest of the three ensemble techniques.

5.4 Ensemble Parameters

An ensemble of the candidate input control classifiers was required for the authentication and identification problems. We discuss here the parameters introduced to achieve this ensemble. Parameters which were required for the base classification algorithms are discussed in Section 6. One parameter which affects the accuracy of the ensemble classifier for both authentication and identification is the number of interactions from the same session available per input control classifier.

5.4.1 Authentication

Every candidate input control classifier combines classifications for interactions from the same session into one predicted class, which can be either legitimate or impostor. The total number of misclassifications for the legitimate class (interactions classified as impostor) is summed up and compared against a threshold parameter. A plot showing the change in False Acceptance Rate (FAR) and True Acceptance Rate (TAR) as this threshold varies is shown in Section 6.1. This is the only parameter which is required for the entire ensemble classifier.

5.4.2 Identification

The most important parameter for this classification framework is the value chosen for K. Choosing a large value of K causes too many incorrect predictions to be included in the final ensemble classification, while choosing too small a value requires the candidate input control classifiers to be very accurate in reporting their predictions. Another parameter which affects the accuracy of classifying the test app usage session is the number of test samples available per input control in the test app usage session. We evaluate the effect of both these parameters in Section 6.2.

Chapter 6 Experimental Results

We evaluate the ensemble classifier for the authentication and identification problems and present our results in the following sections.
6.1 Authentication

We evaluated the performance of the ensemble classifier using three different base classifiers: SVDD, One Class SVM and Two Class SVM. Every usage session of an assumed legitimate user was tested once, while the impostor class was tested on impostor data not seen during training. Figure 6.1 shows the mean True Acceptance Rate (TAR) plotted against the mean False Acceptance Rate (FAR) seen while varying the threshold parameter value from 0 to 40 for the ensemble classifier based on the three base classifiers. For every classification technique, all interactions available from the test usage session are used to reach a final predicted class. It can be seen that the Two Class SVM technique gives the best overall performance while SVDD has the worst performance among the three. However, all three techniques perform better than a random guess at the predicted class, demonstrating the level of discriminant information presented by this modality. Figure 6.1 can be used by application developers to strike a balance between the number of impostors that may successfully attack the ensemble classifier and the number of times a legitimate user will be rejected by the ensemble classifier. An example of such a balance can be seen at a FAR of 5%, where a TAR of 76% is obtained. In other words, the authentication framework incorrectly admits only 5% of all impostors while still admitting the legitimate user 76% of the time.
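The threshold sweep behind such a TAR-versus-FAR curve can be sketched in Python as follows. This is a simplified illustration, not the evaluation code used in the thesis: it treats each test session as a single count of impostor-labeled interactions (rather than combining nine input control classifiers), and the function names and example counts are hypothetical.

```python
def far_tar(threshold, genuine_counts, impostor_counts):
    """Return (FAR, TAR) at one threshold.

    Each count is the number of interactions in a test session that were
    labeled as impostor; a session is accepted when its count falls
    below the threshold.
    """
    accepted_genuine = sum(1 for c in genuine_counts if c < threshold)
    accepted_impostor = sum(1 for c in impostor_counts if c < threshold)
    tar = accepted_genuine / len(genuine_counts)
    far = accepted_impostor / len(impostor_counts)
    return far, tar


def roc_points(genuine_counts, impostor_counts, thresholds=range(0, 41)):
    """Sweep the threshold (0 to 40 here) and collect (threshold, FAR, TAR)."""
    return [
        (t, *far_tar(t, genuine_counts, impostor_counts)) for t in thresholds
    ]


# Hypothetical counts for four legitimate and four impostor test sessions.
genuine = [0, 1, 2, 5]
impostor = [3, 8, 10, 12]
for t, far, tar in roc_points(genuine, impostor, thresholds=range(0, 6)):
    print(t, far, tar)
```

Plotting TAR against FAR for every threshold gives one ROC curve per base classifier; a developer can then pick the threshold whose FAR is acceptable for the application.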
Figure 6.1: ROC curve comparing the ensemble classifier when using SVDD, One Class SVM and Two Class SVM as base classifiers

A metric which indicates the true performance of the classification scheme is the Equal Error Rate (EER). It is the error rate at which the probability of a legitimate user being rejected equals the probability of an impostor being accepted by the ensemble classifier. Figure 6.2 shows the mean FAR and FRR seen at different values of the threshold parameter for the ensemble classifier based on the three base classification algorithms: SVDD, One Class SVM and Two Class SVM. It can be seen that the ensemble classifier using Two Class SVM as a base classifier achieves the lowest EER of 7%, with the ensemble classifier using One Class SVM coming second at 16% and the SVDD-based ensemble classifier achieving an EER of 30%. It can also be seen that the EER for the One Class SVM and Two Class SVM based approaches is reached at smaller threshold values of three and six respectively, whereas the EER for the SVDD-based approach is reached at a threshold value of 14. The threshold parameter states how many interactions in the session for the input control are allowed to be classified as the impostor before the input control classifier changes its decision. Thus, the Two Class SVM based ensemble classifier works best among the three techniques, correctly classifying both legitimate users and impostors 93% of the time.
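One way to locate the EER from such a sweep is sketched below in Python. The sketch assumes the same simplified setting of one impostor-label count per test session, picks the threshold where FAR and FRR are closest, and reports their average there; the function name and the example counts are hypothetical.

```python
def equal_error_rate(genuine_counts, impostor_counts, thresholds):
    """Return (threshold, EER estimate) where FAR and FRR are closest.

    A session is rejected when its count of impostor-labeled
    interactions reaches the threshold. Ties keep the smallest threshold.
    """
    best = None
    for t in thresholds:
        frr = sum(1 for c in genuine_counts if c >= t) / len(genuine_counts)
        far = sum(1 for c in impostor_counts if c < t) / len(impostor_counts)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]


# Hypothetical counts for four legitimate and four impostor test sessions.
genuine = [0, 1, 2, 5]
impostor = [3, 8, 10, 12]
threshold, eer = equal_error_rate(genuine, impostor, range(0, 41))
print(threshold, eer)
```

With real data, FAR and FRR rarely cross exactly at an integer threshold, which is why the sketch averages the two rates at the closest point instead of requiring equality.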
Figure 6.2: Equal Error Rate for the ensemble classifier when using SVDD, One Class SVM and Two Class SVM as base classifiers

6.2 Identification

We evaluated the accuracy of the ensemble classifier using three different base classifiers (K-Nearest Neighbor, Support Vector Machines (SVM) using the RBF kernel, and Gaussian Discriminant Analysis (GDA)) on the usage performed by 42 different users on these five applications. Figure 6.3 shows the median and variance observed in the test error for the different kinds of classifiers when all five of the app usage sessions were used. It also shows the median and variance in the test error when only the last three of the five app usage sessions were used. When five app usage sessions were available, four of them were used for training and one was used for testing, so the test app usage session could be switched around five times. When only three app usage sessions were available, the test app usage session could be switched around three times. For K-Nearest Neighbors, a value of three was used for the number of neighbors to be checked for every incoming test sample. In case of SVM, a Radial Basis Kernel was used where the parameter values for C and γ were selected by cross validating on one app usage session and training on the remaining app usage sessions. For example, when all five app usage sessions were used for evaluation, one session was selected as the test session and, of the remaining four sessions, three were used for training and one was used for cross validation. It can be observed from Figure 6.3 that the SVM and GDA classifiers reported the least error among the three classifiers. However, the accuracy given by the SVM classifier is marginally better than the accuracy reported by the GDA classifier. Further, Figure 6.3 also shows that the variance in the test error for the GDA classifier is less than that for SVM in both cases (last three app sessions and all five app sessions).
The median test error reported for the 3-Nearest Neighbor (3NN) classifier is consistently higher than that of both the GDA and SVM classifiers, but it has the distinct advantage of being simple to implement and use. Figure 6.3 also shows that the median error reported for the last three app usage sessions was consistently less than the median error observed for all five app usage sessions irrespective of the choice of classifier. This observation shows that user behavior had more variance during the first two app usage sessions than during the last three. We can also observe that the variance in test error for the 3NN classifier over five app usage sessions is much larger than the variance in test error over the last three app usage sessions.

Figure 6.3: Box plot showing identification accuracy of the ensemble classifier based on Gaussian Discriminant Analysis, Support Vector Machine and 3 Nearest Neighbor classifiers

Figure 6.4 shows how classification accuracy changes as the number of enrolled users is increased. The minimum value of 11 users was used because the top seven predicted classes from each candidate input control classifier were combined into the ensemble classifier in case of the 3NN classifier, while the top three predicted classes from every candidate classifier were used in case of GDA. The accuracy of the GDA and 3NN classifiers is shown in Figure 6.4 as the number of enrolled users changes from 11 to 42. It can be observed that the accuracy of the ensemble does not vary much as the number of users increases. Figure 6.4 demonstrates that the ensemble classifier is capable of correctly classifying a large number of users without a significant degradation in accuracy even with a weak candidate classifier such as 3NN.

Figure 6.4: Change in identification accuracy of the ensemble classifier with increasing number of enrolled users. The ensemble classifier remains consistently accurate with change in user count.
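The top-K voting scheme of Section 5.1.2, whose parameter K is examined in the paragraphs that follow, can be sketched in Python. This is a minimal sketch under stated assumptions: the function names and the three-user example are hypothetical, and each interaction is assumed to arrive as a mapping from user to the number of 1-vs-1 pairwise classifiers that voted for that user.

```python
from collections import Counter


def pairwise_classifier_count(num_users):
    """Number of 1-vs-1 classifiers needed for num_users classes."""
    return num_users * (num_users - 1) // 2


def top_k_classes(votes_per_interaction, k):
    """Add up the votes of all interactions of one input control in a
    session and return its K most voted users."""
    total = Counter()
    for votes in votes_per_interaction:
        total.update(votes)
    return [user for user, _ in total.most_common(k)]


def identify_session(votes_by_control, k):
    """Pool the top-K lists of every input control and return the mode."""
    ballot = Counter()
    for votes_per_interaction in votes_by_control.values():
        ballot.update(top_k_classes(votes_per_interaction, k))
    return ballot.most_common(1)[0][0]


# Hypothetical session with three enrolled users (3 pairwise classifiers,
# so the votes of every interaction add up to 3).
session_votes = {
    "button": [
        {"alice": 2, "bob": 1, "carol": 0},
        {"alice": 1, "bob": 1, "carol": 1},
    ],
    "checkbox": [{"alice": 2, "bob": 0, "carol": 1}],
    "switch": [{"alice": 1, "bob": 2, "carol": 0}],
}
print(pairwise_classifier_count(42))
print(identify_session(session_votes, k=2))
```

A small K lets a control that ranks the true user second still contribute to the final mode, while a large K floods the ballot with incorrect candidates.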
Nine candidate input control classifiers were used as part of the ensemble classifier. From each classifier, the top K most probable classes for the app usage session were chosen. In order to examine the effect of this parameter on the overall accuracy of the classifier, we plot the number of misclassified app usage sessions vs. the chosen parameter value of K in Figure 6.5. This clearly shows that small values of K ranging from one to five are capable of giving a good ensemble classifier. However, as the value of K starts to increase, more incorrect predictions from the candidate classifiers are used by the ensemble classifier in its final prediction, causing its accuracy to degrade. The total number of available users was 42 and therefore the value of K was adjusted from 4 to 42.

Figure 6.5: Number of misclassified app usage sessions vs. top K predicted classes from candidate classifiers

In order to investigate how heavily the ensemble classifier relies on the number of available test samples per input control, we increased the number of test samples in an app usage session from one up to 32 and measured the average test error reported by the ensemble classifier when the top three classes (K = 3) were provided as output by each of the candidate classifiers. Figure 6.6 shows such a plot. It can be seen that even with a single interaction per input control, when using Gaussian Discriminant Analysis, around 26 of the 42 app usage sessions are still correctly classified.

Figure 6.6: Input control interactions affecting the number of misclassified sessions. The top three most probable classes were taken from candidate input control classifiers, GDA and 3NN classifiers were used, and all five app usage sessions were used to get the average test error.

6.3 Evaluation on Real-World Application

MSUPaths [25] is a mobile application available on both the iOS [13] and Android [10] operating systems. This application allows users to navigate between any two buildings on the Michigan State University campus. The user can select a building from a list of buildings loaded in the application and the application loads a set of directions from the user's current location to the building selected by the user. We modified the application to listen for gestures on four different input controls that are part of the MSUPaths user interface and asked users to navigate to 20 different buildings. Users were provided detailed instructions on how MSUPaths is to be used for navigation and data was collected from 10 volunteers. The
average data collection time was found to be 10 minutes. We used the first 19 navigations for training and the last navigation for testing and found the identification accuracy to be 100%.

Chapter 7 Discussion

Ethical Implications during Data Collection

An advantage of this modality is that it is impossible for any user to escape from their interactions being captured by the mobile application. Every user must interact with the User Interface of an application in order to use it. Thus, every application becomes fully capable of capturing user data at all times regardless of which device it is being used on. Application developers who choose to use this modality will be required to sufficiently disclose this data collection to their users, either at the time of app installation or during the first app usage session in which the data collection begins.

Effect of Input Control Type

Nine different types of input control classifiers were used in this investigation for both training and testing purposes. This decision was taken primarily to learn whether the largest available set of input controls is useful for this modality or not. However, different subsets of these input control types are used by mobile application developers in the UI of their applications. It would be useful to know which input controls offer more discriminant information than others so that a usability vs. security trade-off can be reached by application developers.

Effect of Parameters during Ensemble Classification

Both the authentication and identification ensemble classifiers had their own set of parameters which required further fine tuning. Further investigation can be performed into which parameter values are more suited to different types of applications, e.g. a mobile banking application may prefer to admit only a small fraction of impostors while choosing to reject the legitimate user more often. This trade-off depends on the functionality being provided by the mobile application.
Effect of Placement of Input Controls in the User Interface

During this investigation, we attempted to uniformly place input controls that were of the same type to ensure user interactions were not affected. For example, all Checkbox instances were vertically aligned to prevent the user's finger from traveling in the horizontal direction between interactions with different Checkbox instances. It should be further investigated whether instances placed in a certain way provide more discriminant information when interacted with by the user.

Effect of Mobile Device

The same mobile device was used for the entire data collection exercise and the same instructor was involved in collecting data from all users. It needs to be further investigated whether similar performance of the ensemble classifier for both authentication and identification is seen on other mobile devices as well. In addition, it also needs to be checked whether any similarities in user behavior are seen in UI interactions across different devices.

Effect of Classification Algorithms

The same classification algorithm was used for all nine input control classifiers when creating an ensemble classifier for both authentication and identification. However, different combinations of base classification algorithms may improve the performance of the ensemble classifier.

Chapter 8 Conclusion

A new modality for authenticating and identifying users from their usage of a mobile device was explored. Every available input control type on the Android OS was used to create base classifiers which were then used in an ensemble classifier. Data was collected from 42 users for this modality from five applications which had the same User Interface. Three different base classification algorithms were used for the ensemble classifier for both the authentication and identification problems. The best performance for authentication was achieved by an SVM-based ensemble classifier with a mean Equal Error Rate (EER) of 5%. Similarly, for identification the best mean accuracy of 90% was reported by an SVM-based ensemble classifier. In case of identification, the SVM-based ensemble classifier was found to be robust when the number of enrolled users was increased and also when the number of available interactions per input control was varied.
In summary, this work provides a basic framework for authenticating and identifying users in a truly implicit and continuous way on mobile devices. While we find UI interactions to be useful for both authentication and identification, several avenues of further research for this modality still exist. This study does not make use of any UI interactions seen in other swipe-based and keystroke-based modalities, so it would be interesting to integrate those techniques into a bigger ensemble framework. Such a framework would truly integrate every possible interaction made by a user with a mobile application.

BIBLIOGRAPHY

[1] Input Controls | Android Developers. http://developer.android.com/guide/topics/ui/controls.html, 2015.
[2] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1-27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[3] Corinna Cortes and Vladimir Vapnik. Support-vector networks. Mach. Learn., 20(3):273-297, September 1995.
[4] Tao Feng, Jun Yang, Zhixian Yan, Emmanuel Munguia Tapia, and Weidong Shi. Tips: context-aware implicit user identification using touch screen in uncontrolled environments. In Proceedings of the 15th Workshop on Mobile Computing Systems and Applications, page 9. ACM, 2014.
[5] Tao Feng, Xi Zhao, B. Carbunar, and Weidong Shi. Continuous mobile authentication using virtual key typing biometrics. In Trust, Security and Privacy in Computing and Communications (TrustCom), 2013 12th IEEE International Conference on, pages 1547-1552, July 2013.
[6] Tao Feng, Xi Zhao, and Weidong Shi. Investigating mobile device picking-up motion as a novel biometric modality. In Biometrics: Theory, Applications and Systems (BTAS), 2013 IEEE Sixth International Conference on, pages 1-6. IEEE, 2013.
[7] Jordan Frank, Shie Mannor, and Doina Precup. Activity and gait recognition with time-delay embeddings. In AAAI, 2010.
[8] Mario Frank, Ralf Biedert, Eugene Ma, Ivan Martinovic, and Dawn Song. Touchalytics: On the applicability of touchscreen input as a behavioral biometric for continuous authentication. Information Forensics and Security, IEEE Transactions on, 8(1):136-148, 1 2013.
[9] Google. Activity | Android Developers. http://developer.android.com/reference/android/app/Activity.html, 2015.
[10] Google. Android. https://www.android.com/intl/en_us/, 2015.
[11] Google. GestureDetector | Android Developers. http://developer.android.com/reference/android/view/GestureDetector.html, 2015.
[12] Eiji Hayashi, Oriana Riva, Karin Strauss, AJ Brush, and Stuart Schechter. Goldilocks and the two mobile devices: going beyond all-or-nothing access to a device's applications. In Proceedings of the Eighth Symposium on Usable Privacy and Security, page 2. ACM, 2012.
[13] Apple Inc. Apple - iOS 8. https://www.apple.com/ios/, 2015.
[14] Hilmi Gunes Kayacik, Mike Just, Lynne Baillie, David Aspinall, and Nicholas Micallef. Data driven authentication: On the effectiveness of user behaviour modelling with mobile device sensors.
[15] Hilmi Gunes Kayacik, Mike Just, Lynne Baillie, David Aspinall, and Nicholas Micallef. Data driven authentication: On the effectiveness of user behaviour modelling with mobile device sensors. arXiv preprint arXiv:1410.7743, 2014.
[16] Hassan Khan, Aaron Atwater, and Urs Hengartner. A comparative evaluation of implicit authentication schemes. In Research in Attacks, Intrusions and Defenses, pages 255-275. Springer, 2014.
[17] Hassan Khan, Aaron Atwater, and Urs Hengartner. Itus: an implicit authentication framework for android. In Proceedings of the 20th annual international conference on Mobile computing and networking, pages 507-518. ACM, 2014.
[18] Hassan Khan and Urs Hengartner. Towards application-centric implicit authentication on smartphones. In Proceedings of the 15th Workshop on Mobile Computing Systems and Applications, page 10. ACM, 2014.
[19] Oriana Riva, Chuan Qin, Karin Strauss, and Dimitrios Lymberopoulos. Progressive authentication: Deciding when to authenticate on mobile phones. In Proceedings of the 21st USENIX Conference on Security Symposium, Security'12, pages 15-15, Berkeley, CA, USA, 2012. USENIX Association.
[20] Premkumar Saravanan, Samuel Clarke, Duen Horng (Polo) Chau, and Hongyuan Zha.
Latentgesture: Active user authentication through background touch analysis. In Proceedings of the Second International Symposium of Chinese CHI, Chinese CHI '14, pages 110-113, New York, NY, USA, 2014. ACM.
[21] Lookout Mobile Security. Phone Theft in America. https://www.lookout.com/resources/reports/phone-theft-in-america, 2014.
[22] Muhammad Shahzad, Alex X. Liu, and Arjmand Samuel. Secure unlocking of mobile touch screen devices by simple gestures: You can see it but you can not do it. In Proceedings of the 19th Annual International Conference on Mobile Computing & Networking, MobiCom '13, pages 39-50, New York, NY, USA, 2013. ACM.
[23] Weidong Shi, Jun Yang, Yifei Jiang, Feng Yang, and Yingen Xiong. Senguard: Passive user identification on smart phones using multiple sensors. In Wireless and Mobile Computing, Networking and Communications (WiMob), 2011 IEEE 7th International Conference on, pages 141-148, Oct 2011.
[24] David MJ Tax and Robert PW Duin. Support vector data description. Machine learning, 54(1):45-66, 2004.
[25] Michigan State University. MSUPaths. https://github.com/MSUPaths/MSUPaths_Android, 2015.
[26] Jiang Zhu, Pang Wu, Xiao Wang, and Joy Zhang. Sensec: Mobile security through passive sensing. 2013 International Conference on Computing, Networking and Communications (ICNC), 0:1128-1133, 2013.