A COLLABORATIVE SOFTWARE TOOLCHAIN FOR AUTOMATIC COLLECTION AND COMPARATIVE ANALYSIS OF SENSOR CHARACTERIZATION DATA

By Charles Samuel Boling

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Electrical Engineering - Master of Science
2016

ABSTRACT

Reproducible research has been recognized as a growing concern in most areas of science. To achieve widespread adoption of repeatable, transparent research practices, some commentators have identified a need for better software for authoring reproducible digital publications. Complicating this goal, investigations increasingly involve interdisciplinary teams, sophisticated workflows for acquiring and analyzing data, and huge datasets that rely on considerable metadata to interpret. Computational scientists have begun to adopt tools for managing the complex histories of their data and procedures, but software which simultaneously allows researchers to specify experiments, remotely control equipment, and capture and organize data remains immature. This thesis demonstrates a software architecture for programmable remote control of custom and commercial lab equipment, automatic annotation and queryable storage of datasets, and provenance-aware specification of experiment and analysis procedures. The design consists of a suite of small, single-purpose software services which may be controlled remotely from a web browser, including a graphical programming tool, an abstraction layer for interfacing with commercial and custom embedded systems, and a hybrid document/table database for persistent storage of annotated experimental data. The software implementation embraces modern web technologies and best practices to produce a modular, user-extensible framework that is well-suited for helping to integrate computer-controlled research labs with the emerging Internet of Things.

ACKNOWLEDGMENTS

I am especially thankful for the tremendous support of my family and loved ones throughout my time at school. In particular, I want to express my appreciation for Paula, whose patience, feedback, and reassurance have been invaluable in much more than just my academic work. The software described in this thesis would not have been possible to realize without the benefit of many prolonged design discussions with student researchers Ian Bacus and Yousef Gtat, and I am tremendously grateful for their input and development support. Furthermore, the entire AMSaC lab has been a great help and a community to work in over the last few years. This work was partially supported by funding from NIH grant 1R01ES022302.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 MOTIVATION
CHAPTER 2 BACKGROUND
  2.1 Use case: Electrochemical sensor arrays
  2.2 Requirements and Terminology
    2.2.1 Automation
    2.2.2 Metadata and data provenance
    2.2.3 Version control
    2.2.4 Collaboration
    2.2.5 Extensibility
    2.2.6 User compliance
    2.2.7 Security
  2.3 Review of existing experiment management software
    2.3.1 Electronic lab notebooks
    2.3.2 Workflow design tools
    2.3.3 Laboratory information management systems (LIMS)
    2.3.4 Equipment automation tools
  2.4 Enhancing publication value
    2.4.1 Semantic provenance models
    2.4.2 Research objects
  2.5 Summary
CHAPTER 3 ARCHITECTURE
  3.1 Network architecture
    3.1.1 Physical architecture
    3.1.2 Monolithic approach
    3.1.3 Microservices
    3.1.4 Switchboard service
  3.2 Device control
    3.2.1 Instrument manager
    3.2.2 Device enumeration
    3.2.3 Device APIs and protocol composition
  3.3 Data model
    3.3.1 Research artifact model
    3.3.2 Dataset management
  3.4 User experience
  3.5 Security model
  3.6 Summary
CHAPTER 4 IMPLEMENTATION
  4.1 Overview
    4.1.1 Design principles
      4.1.1.1 Request/Response vs. Publish/Subscribe
      4.1.1.2 Dynamic loading
  4.2 Service interconnect
    4.2.1 REST APIs
    4.2.2 WAMP routing
  4.3 User interface
    4.3.1 Thin client design
    4.3.2 Angular 2
    4.3.3 UI components on demand
    4.3.4 Jupyter
  4.4 Database management
    4.4.1 NoSQL and schemaless databases
    4.4.2 HDF5
  4.5 Device management
    4.5.1 Enumeration
    4.5.2 Protocol stacks
  4.6 Summary
CHAPTER 5 SUMMARY
  5.1 Contributions
    5.1.1 Hardware-connected lab informatics
    5.1.2 Modularity via microservices
    5.1.3 Design for customizability
  5.2 Implementation status and future work
  5.3 Conclusion
APPENDICES
  APPENDIX A: ACRONYMS
  APPENDIX B: GLOSSARY
BIBLIOGRAPHY

LIST OF TABLES

Table 2.1 Comparison of lab informatics software

LIST OF FIGURES

Figure 2.1 Electrochemical sensor characterization
Figure 2.2 Editing an IPython notebook
Figure 2.3 Editing a Taverna workflow
Figure 3.1 Research-relevant artifacts
Figure 3.2 Monolithic web architecture
Figure 3.3 Microservice-based web architecture
Figure 3.4 A typical eGor workflow
Figure 4.1 High-level structure of the eGor system
Figure 4.2 Dynamic loading of microservices
Figure 5.1 User interface screenshot

CHAPTER 1 MOTIVATION

Poor reproducibility of publications has been recognized as a growing problem in a number of research areas, particularly those in which experiments are complex and sensitive to small variations in methodology. Large-scale replication efforts have suggested reproducibility rates in some fields as low as 10% [8]. Researchers have begun to call for improved documentation of the scientific process in fields as diverse as experimental psychology [17], pharmaceutical research [8], astronomy [1], and areas of computer science such as machine learning [11]. Much of the recent attention paid to this issue comes in response to unique characteristics the scientific process has developed in the digital age, including unprecedented publication volume, insufficient documentation of complex experimental procedures, and the availability of software packages for manipulating statistics, but some critics suspect that many long-standing results are also inadequately verified. It has been suggested that factors such as publishing pressure, bias, and inadequate statistical power of hypotheses promote the widespread publication and citation of unverified claims in all areas of experimental science [36]. To make matters worse, the last few years have seen many cases of peer review fraud [25], outright data falsification [24], and ethically dubious activities such as p-hacking [34], the practice of massaging data to cross the accepted threshold of statistical significance.

In addition to these systemic problems with the publication process, the day-to-day practice of modern science and engineering research presents a steadily growing host of data organization challenges to investigators. Often an experimental program involves many personnel, each with a unique specialization and research focus, performing interdependent experiments at multiple universities. Each such experiment is the product of a huge host of [...], and data are often collected in ad-hoc or incompatible formats, making it difficult to draw honest comparisons between results or to isolate methodological problems. Especially in the case of technology development and exploratory research, it is desirable for researchers in these kinds of large projects to use uniform data acquisition protocols, unambiguously describe their experimental procedures, and collate their work into self-contained, consistently formatted units for distribution to collaborators.

One proposition for improving the reproducibility of future work is to better standardize documentation practices for laboratory procedures [36] and ultimately to transition from traditional paper-based publication models to electronic formats which capture the intricacies of modern work. Ideally, a unit of disseminated research would provide enough detail for future researchers to replicate every step of the experiment and analysis associated with a publication and for reviewers to identify sources of errors, details warranting further examination, and academic misconduct. Confirmatory studies as well as exploratory research could benefit from the adoption of flexible software tools for collecting data and chronicling experimental procedures.

Particularly in "in-silico" fields where experiments consist of the transformation and analysis of datasets within the digital domain, research stands to benefit from software that can automate and standardize tasks such as experimental design and record keeping, and some publication organizations have begun to encourage sharing of code, procedures, and raw data alongside submitted manuscripts. Software tools for managing complex simulation and data analysis pipelines have begun to emerge in recent years, offering support for a number of powerful features, including data provenance, sharing workflows, and packaging execution environments into virtual machines for later execution on different hardware [28, 53]. These tools typically do not attempt to model or automate non-software research tasks in detail. To address some of the informatics challenges of more "hands-on" research, several companies have developed so-called laboratory information management systems (LIMS), which are better suited to the inventory and data management needs of traditional facilities such as wet labs. However, many scientific endeavors involve some mixture of structuring in-silico analysis workflows and directly manipulating physical systems, and software toolchains for uniformly managing procedures of this nature remain immature.

In fields where research involves both sophisticated software analysis and intensive batteries of physical experiments, investigators could benefit from a software platform which unifies protocol design, data acquisition, result annotation and archiving, signal processing, and other tasks involved in the complete research and development lifecycle. Such a tool should be (i) automatic, employing computer control whenever possible to produce organized, uniform, and repeatable experiments; (ii) extensible and modular, promoting adoption of new equipment, experimental methods, and data analysis techniques via user-crafted plugins; (iii) collaborative, allowing results and proposed experiments to be shared, annotated, and reviewed at many levels of detail; (iv) bespoke, accommodating and complementing the focus of each researcher involved in an interdisciplinary research and development project; and (v) provenance-aware, enabling differential analysis of experimental outcomes and methodologies.

Existing approaches do not combine data acquisition and archiving features with interfaces for process customization in a way that meets all the above goals. In many cases tools built for these purposes are also insufficiently adaptable for the fast-paced and varied needs of active scientists, causing users to abandon the software once it presents more limitations than benefits.

Fortunately, modern web technologies have begun to enable software design strategies that make a complex, customizable end-to-end solution feasible. Network-enabled services with diverse purposes and internal infrastructures have become increasingly interoperable thanks to the adoption of self-documenting web application programming interfaces (APIs). The increasing sophistication of web browsers has allowed for an explosion of rich client-side software experiences, enabling full-featured user interfaces which are platform-independent and easily updated. A number of technologies such as distributed version control, demand-scaling cloud hosting services, and real-time full-duplex network data streaming have emerged as powerful tools for rapidly building robust and flexible web applications with unprecedented capabilities. The availability of inexpensive sensor and network hardware has begun to spur the growth of the emerging Internet of Things (IoT), a vision of the near future in which ubiquitous computing devices collect data, communicate with each other, and interact with their environments. Together these advancements provide a rich software ecosystem for implementing a next-generation LIMS for performing complex experiments, curating detailed datasets, and generating publication units with end-to-end reproducibility.

This thesis describes the design and implementation of a suite of software tools for data acquisition and provenance tracking with the goal of leveraging computer automation to create a dissemination format with richer facilities for comparing results, identifying new directions of research, and fostering collaboration than traditional print publications. The design of the described software tool embraces modern web technologies, separating functional units into independent networked services which communicate by discoverable web APIs. This architecture enables investigators to interact with each other's research remotely and to independently create reusable services of their own. The core components of the system are modular and loosely coupled, and users are encouraged to modify, create, and share software components to meet the unique needs of their research. By designing a modular architecture which anticipates rapidly changing requirements and enables users to take an active role in software maintenance, the platform is intended to grow with its user base and enjoy broader usefulness and greater longevity than existing free and commercial lab informatics packages. The framework provides high-level capabilities for remotely controlling lab equipment and routing captured sensor data, with a vision of connecting research labs to the nascent IoT. To demonstrate and explore the system's capabilities, an embedded system was developed for performing customizable electrochemical experiments, which includes a multi-channel arbitrary waveform generator.

The system's architecture is described in detail along with an overview of the developers' implementation choices. Throughout the exposition, the design and integration of multiple custom electrochemical instruments serve to demonstrate how users might add and modify software components to meet the needs of their own research. The following chapters explore the existing space of software tools, lay out the project's design goals, describe the high-level structure of the design, and explain the techniques and technologies used to implement the completed system. Finally, we describe the characteristics of the design that we feel are unique or notable. Due to the large volume of technical terminology involved in discussing software tools and software technologies, a glossary of terms is included at the end of the thesis.

CHAPTER 2 BACKGROUND

A growing body of work in the field of meta-research has identified a number of obstructions to research reproducibility and possible techniques for improving the trustworthiness of publications. One promising approach for addressing some of these factors is the widespread adoption of standardized procedures for experiment design, record keeping, and publication, supported wherever possible by software tools for automation and research lifecycle management. This chapter identifies the functional requirements of a software system for designing and executing complex experiments and organizing their results, and the need for a next-generation collaborative lab information management system. We describe the electrochemical sensor research which produced our group's need for the software, analyze the tooling requirements of our project, and then explore the existing ecosystem of software tools for automation and curation of research.

2.1 Use case: Electrochemical sensor arrays

Our development of a next-generation collaborative LIMS is motivated by a concrete research task, namely characterization and design of electrochemical sensor
arrays for precise concentration estimation of a broad range of chemical targets [42, 71, 72]. Electrochemical sensors are sensitive to a number of interacting environmental conditions such as temperature, humidity, ambient flow, and presence of trace interferent chemicals [44]. The sensitivity of a given sensor to a particular analyte compound is also a complex function of device geometry, electrolyte and substrate materials, and applied electrical stimulus. In order to make measurements meaningful, as much of this secondary information as possible must be collated with the raw electrical output of the sensors. Additionally, a typical characterization experiment involves a sequence of manipulations of controllable parameters such as the flow rates of input gases or applied voltage waveforms. The end engineering goal of these experiments is to determine the inverse function mapping each sensor's instantaneous output current, input voltage, and observable environmental parameters into a concentration profile of the device's chemical environment. For an experimental dataset to afford such an analysis, the input conditions should be controlled as accurately as possible, and especially for batteries of tests involving many sensors operating in tandem it is necessary to employ computer control to achieve uniform results.

This experimental scenario, depicted schematically in Figure 2.1, will serve as a running example to demonstrate the capabilities and requirements of the software tool described by this thesis. Our ideal experimental setup involves commercial lab equipment as well as custom data acquisition hardware, simultaneous operation of many sensors with different physical characteristics, and precisely timed computer choreography of electrical interrogation protocols and gas flow rates. Furthermore, the exact nature of the experiments being run changes frequently as researchers identify new questions, design new sensors, and involve new equipment in their work, requiring our control and data management software to grow with the changing requirements of its users.

We believe that a software framework capable of scheduling and autonomously executing experiments of this level of complexity has the potential to be more broadly useful in any environment with similar workflow needs. By generalizing our design from this use case, we hope to meet our project's needs and simultaneously provide the research community with powerful, much-needed open-source solutions for a set of problems that recur in many different areas. The design goals of our software package are enumerated in the following section, followed by an overview of the existing tools which fulfill some of these requirements.

2.2 Requirements and Terminology

Laboratory science presents a diverse set of operations management and informatics challenges, and many research reproducibility efforts stand to benefit from carefully designed software tools. In this section we consider some of the many scalability challenges faced by a typical research group and examine some proposed techniques for addressing them. This discussion has guided the design of the software framework presented later in this thesis.

Figure 2.1: Electrochemical sensor characterization. Schematic of an example experimental apparatus for characterizing an array of electrochemical gas sensors. Inset: some commonly used stimulus waveforms for interrogating electrochemical sensors.

2.2.1 Automation

The desire to scale experiments to much higher throughput provides a major motivation for exploring software solutions for lab management. As researchers begin to work with many devices and control parameters simultaneously, data collection and tracking tasks become difficult to manage. Additionally, when attempting to provide analyses of large sensor characterization datasets, signal processing experts require accurate information about the timing of input and output events, and adequate resolution of control events is extremely difficult to obtain under manual operation.

By employing computer control of actuators and data collection equipment whenever possible, researchers should be able to maximize the consistency of their results while simultaneously improving their productivity. The ability to automatically re-run a task with modified parameters overnight rather than carefully manipulating control dials for hours on end would allow scientists to focus their expertise on identifying new research questions rather than on tedious and meticulous experiment execution. An ideal software tool for lab automation should allow investigators to design and compose executable tasks, enabling researchers to build complex experimental protocols from a library of reusable components.

2.2.2 Metadata and data provenance

Much of the data that is collected and exchanged by researchers is stored in ad-hoc formats, often detached from the relevant metadata necessary to make these results meaningful. Examples of metadata which are often omitted from raw datasets include measurement units, input conditions, sample and equipment IDs, and annotations such as the hypothesis of an experiment or where to find further documentation or references. These key pieces of information are often recorded or remembered only by the original experimenter and may easily become unavailable to future researchers. Even when data collection and management policies are established within a group, it requires careful discipline to enforce these rules manually, especially in a typical fast-paced research environment with little direct oversight. Furthermore, in many cases drawing conclusions about a dataset relies on information about experimental conditions that is difficult to acquire for every trial and is not obviously relevant at the outset, forcing researchers to backtrack and repeat work in order to be confident in their results. By using software to collect and manage information about the flow of data through an experiment, users can be provided with powerful tools for examining their workflows at many levels of detail without requiring costly and time-consuming repeat trials.

Systematically tracking and organizing the history of datasets as they are collected, reformatted, and undergo transformations and analysis is the focus of the growing area of data provenance [12]. Provenance techniques aim to allow researchers to properly attribute a dataset, understand how it was created, and determine where and how errors were introduced. Capturing and serializing accurate and sufficient provenance information about a system remains a research topic of its own [15], but a number of existing software tools provide some features that cater to this need.

2.2.3 Version control

Whenever software provides the ability to create and modify complex documents or artifacts, version control is a valuable feature for improving productivity and auditability. Similar in concept to data provenance, a version control system (VCS) keeps checkpoints of important points in a file's edit history, allowing authors to review past states, recover lost work, and make changes to a single file rather than attempting to manually keep track of backups. Version control tools are indispensable in the software industry for tracking source code, where popular tools include Git [14] and Subversion [57], but some version control features are now commonplace in office programs, such as Microsoft Word's "Track Changes" mode [49]. Existing version control software for plain text is extremely mature, full-featured, and powerful, and may be used as a third-party tool for any work whose artifacts of interest include plain text such as code.

2.2.4 Collaboration

Modern research labs are increasingly interdisciplinary and rely on remote sharing of techniques, data, and publications. Software designed for assisting researchers with performing and documenting their work should reflect these realities, ideally offering native support for sharing and collaboratively reviewing resources over the Internet. Software systems with distribution in mind are also well equipped to enforce policies about data usage and to maintain end-to-end provenance information about artifacts by managing records in a server-side database. Furthermore, the use of electronic media enables users to assemble information-rich dissemination units, and software which supports portable and information-dense formats provides benefits for long-term collaboration as well as publication.

2.2.5 Extensibility

A common user complaint about commercial software with proprietary codebases is that tools are overly rigid and ill-suited for adapting to the rapidly changing needs of users [52]. The fast pace of research and its necessary interaction with bleeding-edge technologies provide one possible reason for the proliferation of lab management software packages with slightly different goals and feature sets. To address this problem, we feel that researchers should be allowed and encouraged to customize and modify their lab management software to meet their needs. Open-source projects are theoretically arbitrarily extensible, since users may directly modify the software, but in many cases open-source tools are still not designed with customization in mind. Systems with a modular design that support community-crafted plugins, user-level scripting, and straightforward integration with third-party tools are able to grow alongside users' changing needs and allow dedicated users to compound the initial learning investment over time. Such systems, when well-designed, often benefit from greater longevity and feature-richness than traditional monolithic programs [31, 47].

2.2.6 User compliance

A known challenge faced when developing software for applications such as reproducible research is that feature-rich tools often present users with a substantial learning curve, deterring widespread adoption. Tools which do not confer an obvious advantage immediately, or which disrupt users' existing workflows, are likely to go unused, wasting development effort. Research on the topic suggests that ease of use and accessibility of documentation are important concerns for promoting user adoption [39]. Usability can also be improved and demonstrated by providing concrete examples of how the software can solve problems faced routinely by domain scientists and by encouraging users to tailor the tools to their unique preferences and needs. Other important determinants of user compliance include upgradability, technical support, reliability, and compatibility with existing tools [70]. Addressing user experience concerns from the outset of a design and incorporating feedback in the development process can result in an ultimately richer product, and this is one of the key insights of the now-popular Agile development methodology [7].

2.2.7 Security

Intellectual property is an important issue in both industrial and academic research, given that funding, commercial competitiveness, and legal and professional recognition are often contingent on priority. Internet-connected software which manages potentially sensitive data and design documents must therefore make digital security a principal concern. An architecture for online experiment analysis and design must carefully conform to the latest security best practices and maintain careful access controls while allowing for collaboration.

2.3 Review of existing experiment management software

The complex needs of modern research have created a large specialized software market, and dozens of tools now exist for computerizing various laboratory management and research tasks. Many companies offer lab informatics software with a broad range of capabilities. Since many of
these programs are proprietary, it is difficult to compare their feature sets precisely, and many packages are defunct or poorly documented. Below we attempt to provide a broad overview of the major classes of software most aligned with our goals, giving a few examples of prominent products in each category. A comparison of these categories of tools and the functions they provide is given in Table 2.1.

Table 2.1: Comparison of lab informatics software. A comparison of the features typical of each major category of informatics software. The toolkit described in this thesis intends to provide all five of the listed capabilities.

Categories compared: ELN, WMS, LIMS, model-based design

Capability                    Categories marked as providing it
Process                       4 of 4
Analysis and documentation    1 of 4
Data management               3 of 4
Collaboration                 3 of 4
Hardware control              1 of 4

2.3.1 Electronic lab notebooks

An electronic lab notebook (ELN) is a software tool for helping researchers to chronicle their day-to-day investigations and results. A typical ELN package allows researchers to compose rich-text documents consisting of text and figures alongside technical artifacts such as data tables. Several surveys of commercially available ELNs have been published [61, 22], but the domain is still evolving rapidly and some of these programs have begun to integrate complex capabilities such as version control, experiment specification, and more. Many of the commercial products in this domain offer users compliance with the FDA's recommendation on electronic record keeping [26], a set of guidelines promoting thorough, auditable documentation of research performed in the agricultural and health sectors.

Most general-purpose programming language environments targeted toward scientific computing now include some degree of ELN functionality. These tools are typically environments for literate programming [37] which are able to embed plots and data tables alongside code and natural language documentation. Popular solutions in this domain include Mathematica [75], R [58], IPython/Jupyter [55], and MATLAB Notebook [46].

Figure 2.2: Editing an IPython notebook. Screenshot of an electronic lab notebook page in IPython/Jupyter v4.1.0 [55] integrating documentation, code, inline math, and figures.

2.3.2 Workflow design tools

Defining, composing, and documenting complex procedures is a core organizational need of many research groups. A number of so-called workflow management systems (WMSs) have emerged to help manage task schedules and dependencies in domains such as manufacturing [3], high performance computing [28], and business management [13]. Workflow editors provide users with a means of constructing executable tasks by describing how data moves through them, typically by visually manipulating a directed graph of processes as in Figure 2.3. In some cases workflows may serve purely as documentation, while workflow tools for in-silico science are often executable and may be bundled with data to provide direct replication of analysis flows on other machines.

The most prominent examples of workflow software targeted toward scientists, such as Apache Taverna [53] and VisTrails [28], are built to facilitate the design and execution of high performance computing simulations. Less attention has been paid to processes that are not completely digital and are therefore harder to fully automate. The application of similar software to managing business processes and software development suggests that these tools may also be valuable aids for describing complicated scientific experiments, and some LIMS packages provide some of this functionality [19]. By combining these workflow specification tools with software for controlling lab equipment, it may be possible to provide domain scientists with a powerful framework for executable specifications of complicated laboratory procedures.

A related class of software reproducibility tools encourages users to bundle input sets and sequences of data-transforming programs into a single distributable file intended to accompany published results. A notable example is ReproZip [16], which uses virtual machines to produce self-contained computing environments for reproducing digital analysis under identical conditions on different physical computers. ReproZip automatically determines all the files necessary for replicating an in-silico workflow by monitoring the operating system during ordinary task execution. These techniques offer a promising strategy for improving scientists' ability to capture the intricacies of their work for later review or reuse while avoiding excessive demands on the user's discipline.

Industry groups have also made several attempts to produce standardized data models for business processes and equipment, perhaps the most popular of which is Business Process Model and Notation (BPMN) [3], typically represented by a directed graph or flowchart much like the data models used in workflow software. The most full-featured model expanding on this concept is ISO 15926 [73]. This model promises a level of generality that is sufficient to enable interoperability between businesses in different sectors and countries which rely on large, varied sets of equipment and software. The still-growing specification encompasses information as diverse as process and structural description of organizations and devices, component lifecycle information, and more. ISO 15926's representation format is based on semantic web technologies such as OWL, which employs a graph model to describe semantic relationships between entities, where each entity and relationship has an associated hyperlink. The standard has been under development for 25 years, but many documents have yet to be published and no software implementations are currently freely available. The extreme complexity of the model is also an impediment to adoption by end users as well as to implementation.

Figure 2.3: Editing a Taverna workflow. Screenshot of a protein sequence analysis workflow [76] being edited in Apache Taverna v2.5 [53], an open-source workflow management tool.

2.3.3 Laboratory information management systems (LIMS)

A LIMS is a tool for tracking the operations and assets of a laboratory. Commercial tools by this name provide a wide range of features targeted toward different aspects of an enterprise-level industrial lab, such as letting researchers monitor their ongoing experiments, logging samples and datasets, and notifying relevant personnel when maintenance tasks like restocking need their attention. This field is now occupied by a staggering number of application vendors and products with a broad range of specializations, feature sets, maturity levels, and price tags [43]. These packages range from general-purpose systems built around a wiki or spreadsheet tool to specialized systems for interacting with specific types of chemical analysis equipment. Some LIMS packages provide a workflow management system (WMS) and many of them contain built-in ELNs.

The primary players in this application domain target the needs of labs in the healthcare, forensics, and pharmaceutical sectors and are mostly designed for managing and optimizing huge batch processes on fixed equipment pipelines. The designs resulting from these assumptions would seem to make many of these programs a poor fit for the rapidly
evolving experimental workflow seen in academic sensor engineering, though there are exceptions. In particular, Agilent's OpenLAB suite (formerly Kalabie) [2] offers a notebook tool which combines data collection, storage, analysis, and collaboration capabilities. This package is also capable of integrating with data collected from instruments manufactured by Agilent and some of its business partners. The tool appears to provide many of the capabilities found in a typical LIMS combined with some support for real-time hardware control, making it an attractive candidate for meeting several of our application's needs. Unfortunately, this tool is restricted to a set of associated hardware and at the time of this writing lacks desirable features such as modularity, user-customizability, and version control.

Most LIMS toolkits are proprietary and closed-source, but given the demand for this type of application from large industrial groups the field is in some ways fairly mature. Some of the architectural decisions that are commonplace in modern LIMS, especially their cloud-oriented model, support for user customization, and focus on auditability, seem well-suited for the kind of end-to-end research management system we intend to build. Throughout this thesis we refer to the software tool we are interested in building as a LIMS due to the broad range of functionality seen in tools which label themselves in this way.

2.3.4 Equipment automation tools

To extend automation of processes beyond the purely computational domain, several vendors offer tools for coordinating simultaneous operation of actuators and data acquisition modules. Likely the most visible software package providing this functionality is National Instruments LabVIEW [23], alongside similar tools for model-based design such as Simulink [62]. LabVIEW's G visual programming language allows users to connect devices, signal processing blocks, and graphical interface elements, ultimately building a custom front panel and controller for a "virtual instrument" (VI) which may communicate with many different pieces of lab equipment. LabVIEW interacts with National Instruments' line of data acquisition and control hardware and also ships with a large library of drivers for instruments produced by many vendors. G programs can be regarded to some degree as workflow-style executable process specifications, but different versions of LabVIEW have well-documented compatibility problems, preventing VIs from serving as self-contained process dissemination units.

Other software toolkits have begun to capitalize on the recent emergence of affordable network-connected microcontrollers and single-board computers. One toolkit overlapping with some of our application requirements, ZettaJS, intends to provide a hardware abstraction layer for controlling and coordinating embedded data acquisition platforms over the web [77], with the stated goal of connecting devices to the IoT using existing web technologies.

Unfortunately, relatively few LIMS vendors incorporate equipment automation into their feature sets. Even fewer packages seem to recognize the ways ELN capabilities could be complemented by end-to-end experiment design and execution support. We feel that there is a promising niche for software synthesizing the best features of automation software, cloud-based LIMS, and metadata-rich ELN, and this thesis intends to articulate the design of such a framework.

2.4 Enhancing publication value

To help manage the complexities of modern research and promote reproducible science, some commentators have identified a need for more richly structured publication units than currently exist [6]. Simple examples of rich publications include PDFs containing hyperlinks to external papers or other resources, and these have already begun to proliferate now that most research is exchanged digitally. Especially in computing, it is also desirable for a publication to be executable, unifying code and documentation and allowing for complete reproducibility of a unit of research. Knowledge engineering and data archiving researchers have proposed several approaches for representing the broad space of research-relevant information in a machine-readable form, and some of these models are recognized in this section.

2.4.1 Semantic provenance models

Academic work on structured representations of research artifacts, their relationships, and their provenance has largely built on semantic web technologies such as the so-called linked data network [10]. The semantic web refers to a body of Internet resources which are connected to one another by hyperlinks which stand for kinds of relationships, such as subClassOf or wasDerivedFrom. Similarly, linked data are resources on the web which include hyperlinks to semantically related external pages. In particular, this provides a mechanism for published datasets to record their provenance by explicitly stating a chain of relationships to their point of creation. A standards-track recommendation endorsed by the World Wide Web Consortium known as W3C PROV [50] has recently been developed to specify how dissemination units should identify their provenance. Semantic web technology has enjoyed many years of academic development and resulted in some promising projects such as DBpedia [40]. However, criticism of the semantic web's vision and approach has been readily available throughout its long history [45], and in some ways the tooling support for integrating modern web apps with resource description framework (RDF) metadata remains limited.

An important tool used in the linked data community is the notion of an ontology language. These are sets of RDF predicates which provide a vocabulary for discussing abstract relationships between entities, allowing domain experts to encode contextual information about concepts and resources in their field in a machine-readable format. W3C PROV, for instance, extends more generic ontology languages such as OWL 2 [54] to include terms which are explicitly concerned with information about an entity's relationship to its predecessors. These "knowledge graphs" can then be examined and searched for relationships by using a specialized graph query language such as SPARQL [56]. Unfortunately, existing knowledge databases require complex external tools to draw inferences based on the semantic values of the predicates in question, such as concluding from the knowledge that A relies on B and B relies on C that A relies on C indirectly. This can make knowledge graphs very difficult to work with. Some interesting theoretical work has recently made efforts to address these shortcomings by applying mathematical tools for automated reasoning to knowledge representation, e.g. [64]. Semantic web technologies provide some interesting ideas for organizing documents and providing users with meaningful connections between research artifacts of interest.

2.4.2 Research objects

A research object is a proposed format for archiving data as well as an example of a richly annotated electronic publication format [6]. The progenitors of this model argue that paper publications are inadequate to capture the intricacies of modern research activities which draw on a heterogeneous mixture of digital and physical resources. Instead, these authors call for scientists to use recent developments in social network technology and information capture to collaboratively create and share rich digital science resources such as executable workflows and electronic lab notebooks. This vision has provided a major source of motivation for our present work: we aim to build an e-laboratory software framework where research objects are native and scientists may construct and review their experiments and analyses in a flexible, provenance-aware toolkit.

Many of the existing publications on the research object model [18, 9] use linked data and semantic web technology to specify the format a research object should use for encoding relationships between the constituent artifacts of a research object as well as between datasets and the resources that produced them. This approach seems like an interesting way to leverage the existing organization mechanisms of the linked data web to further promote the usefulness of rich publication units.

2.5 Summary

A number of software tools for automating data collection, analyzing and comparing datasets, and interdisciplinary collaboration have emerged in recent years. Many of these packages provide much-needed informatics capabilities that are currently being leveraged by both academic and industry labs, especially in the biomedical and healthcare sectors. However, addressing the full set of challenges posed by interdisciplinary high-throughput sensor research and development will require the integration of LIMS functionality, an electronic notebook editor, and a scripting or graphical programming solution for equipment automation into a cloud-based software framework that currently does not exist. The remainder of this thesis will describe the proposed design and prototype implementation of a suite of software tools which synthesizes and expands upon the programs described above.

The following sections describe the design and implementation of eGor, a lab informatics software package intended to cover most of the use cases of the tools reviewed in this section and more. eGor comprises several programs which in practice typically run on several different machines and help to manage the complete process of developing a scientific experiment. This custom tool intends to provide all of the major functions listed in Table 2.1 and improve both user productivity and research reproducibility by synthesizing these capabilities into a single software tool.

CHAPTER 3
ARCHITECTURE

Characterization and development of sensor arrays presents a broad range of research challenges, not least of which relate to data organization. A LIMS adequate to the needs of our example application must provide a number of interacting software components to mediate between users and target resources such as data stores, richly featured research documents, computer-controllable lab equipment, and collaborators. This chapter abstractly describes the constituent components of the software framework we have built for collaborative design, execution, and analysis of experiments. For ease of reference we refer to our software by its pseudonym "eGor", the Digital Lab Assistant. When describing each element, we document some of the phases of our iterative design process that led to these decisions.

3.1 Network architecture

Given that the resources of interest to our software system are inherently distributed, a careful design of the system's network interconnect is critical to its scalability, security, and usefulness. Below we describe the physical system constraints driving some of our design decisions and explain how we iteratively arrived at our design.

3.1.1 Physical architecture

Typical workflows for interdisciplinary digital research involve a number of computing resources which are physically and logically separated from each other. These include (i) individual workstations where researchers perform analysis and compose code and documentation, (ii) online information banks such as chemical and biological databases, (iii) intranet and cloud storage drives for archiving and sharing documents and data, (iv) logs of research-relevant communications such as email correspondence, and (v) dedicated, typically shared scientific resources such as lab instruments and high-performance computers. In many cases, especially in electrical engineering, a device we generically categorize as a piece of "lab equipment" is a focus of research in its own right, and can be further decomposed to include computer controllers, instrumentation electronics, and physical processes or devices of interest.

Often some or all of these resources interact with each other in an ad-hoc fashion manually facilitated by users. We believe that tremendous gains can be made for research organization, accuracy, and reproducibility by coordinating the interactions between these components with a carefully designed software framework. A schematic diagram of some of these interacting components is depicted in Figure 3.1.

The most important goal of the present work is to automatically execute physical experiments by employing computer control, automatically collating raw experimental data with secondary data and metadata to produce self-contained research artifacts that are more amenable to unambiguous analysis than present ad-hoc formats. Ideally we would like for collaborating researchers at different universities to be able to review each other's experiments in real time, allowing for continuous feedback between investigators with different areas of expertise.

Although some pieces of modern lab equipment possess network interfaces and can directly act as web servers in their own right, a majority of instruments of interest operate over short-range or legacy communication links. In order to allow users to remotely interact with physical resources of this kind, at least one additional machine is required. This machine is typically represented by a PC physically located in a research lab and connected directly to external hardware devices over non-networked connections such as USB.

3.1.2 Monolithic approach

A traditional architecture for web application software involves a single server executable serving presentation-layer applications to clients and making database accesses on their behalf, as in Figure 3.2. In our case, the server would also mediate access to lab equipment, providing users with indirect and high-level access to these resources in much the same way as it abstracts over the database.

Figure 3.1: Research-relevant artifacts. Representation of some of the digital resources found in typical workflows and their relationships. Raw datasets captured from an experimental run are often insufficient to reconstruct meaningful plots or perform detailed analysis, and researchers must rely on undocumented, hidden information sources to perform a complete analysis.

Figure 3.2: Monolithic web architecture. A traditional "monolithic" web application architecture where one server process manipulates a database on behalf of many clients.

This architecture is attractive for its ease of deployment and its apparent simplicity, and early in the project's development we pursued a design along these lines. However, attempting to bundle all of eGor's server-side functionality into a single program eventually caused difficulties with system integration. For example, coupling the code for communicating with lab instruments into the server's application logic complicates both portions of the program and makes it difficult to test and develop them in isolation. This agrees with a common observation [65] that architectures of this kind are often less modular, making them more difficult for multiple programmers to develop independently and complicating the process of introducing new functionality. We feel that a more compartmentalized, modular approach better reflects the structure of the domain being modeled as well as conferring a number of software engineering benefits.

3.1.3 Microservices

As opposed to the conventional frontend-backend divide, some developers have suggested an architecture for web applications based on simple communicating modules termed microservices. In a traditional monolithic architecture, programmers compose a complicated application hierarchically, using one main module which calls library functions from many subordinate components. A microservice architecture splits functionality into many independent programs which communicate using ordinary network protocols, and modules are designed to assume that their dependencies are completely separate programs potentially running on other machines [41].

This approach promises better modularity than traditional web applications since capabilities can be added and extended independently of one another [5]. Since all services expose their functionality over a similar web API, implementations are decoupled from each other and may internally have very different architectures tailored to their special-purpose needs. Services may even be written in completely different programming languages. The flexibility that this approach affords is a good fit with our desire to adapt the framework to meet users' changing needs.

Furthermore, a microservice architecture lends itself naturally to a design where capabilities and resources are distributed geographically, as is the case with large, remotely collaborating groups of researchers. In some cases microservice architectures also scale better as performance demands on the system increase [74]. A schematic depicting the connections between some of our core microservices can be found in Figure 3.3. In our approach, no microservice is truly "central": services may communicate with any other service provided they know its URI and present an authorized access token. Throughout the following, we use the terms microservice and service interchangeably.

3.1.4 Switchboard service

Despite its internally distributed design, the web application must present a primary gateway for user interaction. In our design this role is taken by a microservice we refer to as a switchboard, which is primarily responsible for enumerating microservices and providing proxy access to them at appropriate uniform resource identifiers (URIs). The switchboard verifies that users are authorized to manipulate their target resources, then delegates their requests to the microservices responsible for performing actual resource accesses.

Figure 3.3: Microservice-based web architecture. High-level interconnection between the critical microservices composing our design.

Since the switchboard is itself a microservice, multiple switchboard services may be employed by a system, affording system administrators access controls for different components. Additionally, the switchboard of a completely different installation of the software at a different facility may be treated as an available microservice, facilitating collaboration by allowing appropriately authorized users to access external resources as if they were part of one's own installation.

3.2 Device control

A core goal of our design is to enable researchers to incorporate choreography of physical lab equipment into the executable workflows they create. Interacting with the variety of commercial and custom hardware found in a typical experimental lab requires a flexible approach, given that computer control interfaces and data formats for equipment are heterogeneous and very poorly standardized. This section describes an approach for building a modular library of device drivers which integrate with the rest of the eGor framework while providing users with tools for extension and customization.

3.2.1 Instrument manager

The instrument manager is a service responsible for detecting connected devices, determining the appropriate device driver for communicating with them, and presenting an interface to the switchboard. This service runs as a background application on the client machine which is physically connected to lab equipment and is responsible for relaying control commands to appropriate devices as well as routing captured instrument data to sinks such as a database or real-time display viewport. Much as the switchboard service identifies other microservices and mounts them at appropriate URIs, the instrument manager identifies currently connected devices, determines an appropriate driver and communication protocol for exchanging messages with them, and exposes their high-level functionality as an API available at an appropriate endpoint, allowing the rest of the system to behave as if the instruments themselves were ordinary microservices.

3.2.2 Device enumeration

One of the instrument manager's chief responsibilities is to determine which devices are presently connected to the PC hosting the service. The process of establishing a connection with a piece of equipment and verifying its identity is dependent on the physical interface as well as device-specific packet formatting. Fortunately, many instruments follow a standard convention for identifying themselves to controller PCs. In some cases, however, the instrument manager must receive explicit user guidance about which devices are connected.

Once a device produces an identification response or the user explicitly identifies an attached device, the instrument manager locates detailed device information by querying our device information service. In particular, the database record retrieved by the instrument manager includes a device driver and a protocol stack for translating low-level device commands to and from a generic high-level format. This approach allows the device-connected PC to always use the latest driver for each device, retrieve device records on demand, and communicate with any device known to a given eGor installation with minimal user interference. After the downloaded device driver code has been successfully installed, the instrument manager maps an appropriate URI to the attached device and delegates requests transmitted to the instrument to the appropriate protocol stack and device driver.

In earlier iterations of the design, the instrument manager looked for device drivers and protocol libraries in a directory on its local filesystem rather than retrieving them from the network. This would have required users to manually install or update libraries for interacting with device drivers. Additionally, the database-oriented approach allows the concrete communication code for a given instrument to be associated with the abstract data model representing the instrument as a research artifact, allowing users to examine their equipment at a fine level of detail when developing an experiment.

3.2.3 Device APIs and protocol composition

The protocol stack bundle associated with a given device is expected to expose an API that allows instruments themselves to be treated as microservices. The uniformity of this design makes it possible for the software to model many kinds of remote resources using a similar approach, and leverages existing network infrastructure to manage how commands are delegated to devices. An important responsibility of these device proxy services is translating complex sequences of commands received over the network to and from bit-level packets formatted for individual instruments. Borrowing from Internet design terminology, we refer to the sequence of data transformations and flow control operations involved in this process as a protocol stack.

To simplify and modularize the creation of communication protocols for interacting with a wide range of lab equipment, protocol stacks are designed using a library of basic data transformations as building blocks. In addition to functionally pure encoding and decoding processes, a given "layer" of a protocol stack may trigger changes in flow control or provide signals to other layers in response to certain packets. The resulting framework gives programmers the freedom to define many different kinds of communication strategies.

By compartmentalizing device drivers in this way, we improve the maintainability of the instrument management codebase and provide users with the ability to extend eGor with their own modules. This is especially important for device drivers since the number of possible communication protocols is far too large to maintain an adequate library of drivers without community support.

3.3 Data model

eGor must manage data with very heterogeneous structures. In particular, research artifacts such as equipment, experimental runs, and publications may be attached to quite different sets of information. Additionally, we wish to present these records to a number of services, each of which must have access to enough information to provide a complex set of functionalities. This section outlines an object schema focused on flexibility that serves as the core model for records in our database of research artifacts. A distinguishing feature of this model is that a given artifact may have several attached groups of assets including code and data that indicate how the artifact's attributes should be treated in different execution contexts.

3.3.1 Research artifact model

One of the most basic data types in our object model is referred to as an artifact, and is intended to provide a generic representation of research-relevant entities such as equipment, experiments, publications, analysis pipelines, et cetera. A given artifact is equipped with a set of "capabilities", which are additional data records that are interpreted in different ways in different software contexts. Example capabilities a research artifact might have include a lab notebook's ability to be edited, an experimental workflow's ability to be executed on physical equipment, an instrument's ability to operate as a standalone microservice, or an instrument's ability to capture and tabulate results. Each service may optionally load some or all capabilities and interpret them in service-dependent ways to provide extended functionality.

Artifacts may also possess "assets", which are files and resources with internal structures that are opaque to the eGor system. Examples of assets include images, code for external tools, and attachments such as PDF documents. Assets are identified by a URI and optional type information and may be accessed or created on a server's local filesystem by services with appropriate access permissions. Using an asset rather than an object model to package data is appropriate when the data does not possess an internal structure that should be managed by eGor directly. For instance, a text file containing source code might be an appropriate choice of asset: its content may change, but eGor does not need to represent it internally as a structured object. Managing and interpreting the content of an asset is typically the purview of external tools, though operating these tools may be mediated by an eGor service. In the case of a text file it would be more appropriate to use existing version control tools to represent the asset in a structured way.

3.3.2 Dataset management

Ordinarily data are captured via eGor-controlled lab instruments, adapted via an appropriate protocol stack, and delivered to one or more data sink services. Typical data sinks include real-time plotting and signal processing services. To support later analysis and experiment reuse, one of the core eGor features is a data tabulation service, which supports streaming live data captures into a data structure for permanent storage.

To achieve efficient usage of space and fast retrieval times, large tables of raw data are stored by a different strategy than metadata documents. To some extent these array datasets can be treated as ordinary assets belonging to an "experimental run" artifact, but datasets are special because their high-level structure must be cross-referenced with eGor artifacts encoding their metadata.

Externally generated datasets may also be added to the system by uploading known file formats, which are dispatched to appropriate adapter services and committed to the database. Similarly, previously recorded datasets may be exported and downloaded for processing with external computing tools. In these situations, the user is trusted to provide the structural information needed to enrich and contextualize the raw data they enter and to appropriately document the external transformations that take place.

3.4 User experience

Each capability has a corresponding user interface component, allowing users to manipulate artifacts as well as inspect the system's inner workings from the graphical browser frontend. Using a similar mechanism to the approach described above for downloading driver code on demand, the interface plugin for a given capability is loaded when the user examines its associated artifact. Artifacts may declare some capabilities as hidden by default in order to avoid cluttering the user's workspace.

A usage example of eGor's core functionality from a user's point of view is depicted in Figure 3.4, with the following major phases indicated by numerals in the figure:

1. The user constructs a virtual workbench describing the configuration and interconnections between their lab equipment. The workbench defines the set of resources available to one or more workflows, which are defined by wiring component inputs and outputs together and providing a script of when and how to change parameters as the experiment runs.
2. The user schedules their workflow to run on the equipment during an available time slot.
3. The workflow is compiled into a timetable of instructions. Assuming this process completes without errors, the workflow is recorded in the database as having been scheduled for the desired execution time.
4. The scheduled, compiled workflow is submitted to the instrument manager to await execution. Nearing the scheduled experiment time, an experiment executor service ensures that the experiment's preconditions are met.
5. The experiment executor service executes desired hardware commands at the user's specified times. As the experiment runs, real-time data is captured and streamed to the data sinks indicated in the workflow, allowing researchers to verify that the experiment is proceeding as expected.
6. A complete log indicating the status of the experiment, any errors, and any failures to meet the user's constraints is returned to the server for archiving and later review. The experiment executor service attempts to verify that experimental postconditions are met and prepares for the next scheduled experiment.
7. The raw output log is transformed and returned to the user, producing a result structure that reflects only the user's outputs of interest.

Figure 3.4: A typical eGor workflow. A typical sequence of user operations for interacting with eGor to design, schedule, execute, and analyze an experiment.

3.5 Security model

Especially when dealing with sensitive data and remote access to expensive lab equipment, careful access control is an important architectural concern. As in a traditional client-server model, when clients authenticate themselves to the system they are provided with an access token that can be used to preserve their credentials between browser sessions. When a user makes a request through a sequence of proxy services, this access token is provided along with the request and is passed along to each service on the way to the request's destination. Each eGor service requires client services to produce an access token before it will perform work on their behalf, and a service may query a user management service with an access token to determine the identity of a user and whether their access is authorized.

Switchboard services for collaborators' installations may also query the user management service of the installation requesting access in order either to deny access to a token outright or to generate an access token corresponding to a foreign user. Users may provide labmates or collaborators with authorization rights to services and artifacts which they own or manage, and doing so modifies the list of user IDs the service will permit to access certain operations.

3.6 Summary

We have outlined an end-to-end system architecture for a set of interacting e-laboratory software components. We feel that this approach provides a reasonable combination of our target system's ambitious list of desirable features and includes a number of noteworthy design ideas. In particular, we believe that the emphasis on modularity from the ground up will be rewarded by benefits in scalability, extensibility, and user customization that are not seen in existing LIMS. In the next chapter we describe our efforts to implement this vision in detail.

CHAPTER 4
IMPLEMENTATION

The complete eGor system is a distributed application consisting of numerous software components running on several different computers. To manage this complexity, the developers have attempted to make disciplined use of best practices for web programming and to make judicious use of a range of cutting-edge third-party libraries, only accepting those which have a record of stability and continued maintenance. Our complete system draws on a wide range of technologies from every level of the software stack.

This chapter provides a description of what new components were implemented by the project's development team, which external tools and libraries were used, and what architectural and practical concerns factored into the selection of these methods. This is intended to provide an overall understanding of how the application is structured rather than detailed developer documentation, which can be found at https://github.com/egor-elab/doc.

4.1 Overview

Figure 4.1 shows a high-level schematic of the current structure of the eGor platform. The components depicted are divided across three different machines in the simplest scenario, although more complex configurations are possible since all services interact over network-ready protocols such as HTTP. These machines are, from top to bottom, (i) the end-user's PC, where an interactive single-page browser application is used to interact with various eGor services graphically, (ii) a server running microservices for functions such as authentication, remotely accessible persistent data storage, and routing requests to other services and digital resources, and (iii) one or more machines physically connected to equipment of interest, responsible for managing and issuing commands to appropriate device drivers. This chapter will elaborate on the organization and communication strategies used to implement this framework in software, followed by a technical discussion of each component's internals.

Figure 4.1: High-level structure of the eGor system. A high-level architectural view of the implemented eGor system, divided across three different machines where primary activity takes place: a user's machine, connected to an eGor server via a web browser, which issues commands to a lab PC running device management services to connect with lab equipment.

Other than the in-browser graphical interface, which is written as a single-page application in HTML5 and JavaScript using the Angular framework [32], the majority of the eGor web application is written in Python [60], making use of the mature and modern palette of networking and communications libraries available in the language. An important exception is found in some portions of the database access layer, which use NodeJS [21] libraries to present a simple and effective API. The user-facing web application structure uses all the major third-party components of the popular MongoDB, ExpressJS, AngularJS, and NodeJS (MEAN) stack, but interacts with several Python microservices to add hardware connectivity.

Additionally, the eGor team has developed a model embedded device to help demonstrate how physical actuators and data acquisition modules might interact with the system; the software for this target is written in C++. This device, known as aMEASURE I, is intended to act as a flexible instrument for performing electrochemical experiments and recordings and lies outside the scope of the eGor project proper, but throughout this chapter we use it as a concrete example of the kind of device the framework supports. aMEASURE I can store and produce arbitrary waveforms by a number of methods, record digitally converted analog measurements and stream them over a serial interface in real time, and interact with external circuits via banks of I/O pins.

The software for all these core components is open-source and available in several Git repositories hosted at https://github.com/egor-elab. The diverse set of languages used helps to demonstrate a chief strength of eGor's microservice architecture: components are sufficiently decoupled that they can individually be implemented in a language and style well-suited to their unique challenges.

4.1.1 Design principles

Over the course of developing and refining the eGor toolchain, several recurring patterns have emerged which seem natural for addressing the application's goals and have informed subsequent iterations of the design. This section discusses several key design patterns which have been observed and employed throughout the codebase, providing a feel for the philosophy of the complete system before delving into implementation details.

4.1.1.1 Request/Response vs. Publish/Subscribe

The most common form of high-level network traffic on the web is HTTP, which uses request/response exchanges to pass data between hosts. For instance, a client such as a web browser might issue an HTTP request to a server with the contents GET /users, causing the server to respond with a text payload encoding a resource named "users". The client submits user input such as form data in a similar way (typically via an HTTP POST action), resulting in an acknowledgment message from the server. This approach is sufficiently flexible to allow for much of the broad range of content found on the modern web, especially since servers often deliver JavaScript source code for clients to execute locally in addition to static text documents such as HTML. Issuing commands to lab equipment can often be modeled in a similar way: a controlling computer submits a message and the device responds with a (possibly empty) acknowledgment that the command was received and executed. In many ways these operations are also analogous to the ubiquitous programming construct of calling a subroutine, and some authors place transactions such as HTTP actions under the umbrella of remote procedure calls (RPCs) [68].

Request-and-response communication is, however, a poor fit for systems where communications must be initiated bidirectionally, with event-driven applications providing a key example. A typical data-collecting lab instrument or digital microsystem produces values in real time which must be transmitted to destinations such as databases and display monitors at roughly the same rate as they are captured. Implementations can still accommodate this flow in a request/response framework by periodically requesting buffers from the data source, but this approach is fraught with difficulties and is typically complicated to use when many data sources need to be managed simultaneously.

A more elegant design pattern for systems with soft real-time requirements is given by the publish/subscribe approach, also sometimes called the Observer pattern [29]. In this scheme, a "topic" or "observable" object maintains a list of "subscribers", and notifies each of them when a variable of interest changes or an event is "published" to the event stream. The publish/subscribe technique has been adopted to solve software problems like real-time data acquisition as well as for building "reactive" applications such as user interfaces and games, where graphical interfaces are expected to react seamlessly to event streams such as user input and network communications. eGor adopts this pattern for both these use cases, treating data collection devices as publishers of streams of data fragments which may be subscribed to by other services or graphical interfaces throughout the system.

4.1.1.2 Dynamic loading

One of the chief observations underpinning eGor's design is that researchers' needs are too diverse and rapidly changing to be satisfactorily addressed by a single rigidly constructed application. The implementation effort has therefore focused on constructing a core infrastructure which allows future developers to easily integrate new functionality without disturbing the system's overall operation. Each major component of eGor allows its users to load new extensions at runtime. The core subsystems each specify an interface for how a module should allow itself to be installed and expose its functionality to the network, and otherwi
secommunity-contributedextensionsarenotrequiredtodependoneGorAPIsoreventobewritteninthesameprogramminglanguageastherestoftheframework.InadditiontobeinganessentialpartofthedailyworwforeGor'sdevelopers,adistributedversioncontrolsystem(namelyGit[14])providesaruntimemechanismforachievingthisdynamicloadingfunctionality.Gitwasdesignedtoallowmanyprogrammerstocollaborateonasoftwareproject,sharecontributionsremotely,andreviewandrevertchanges.Importantly,Gitisdistributedinthesensethateachusermaymaintainanindependenttimelineofthehistoryofthecodebaseonaprivatecomputer,withorwithoutnetworkconnectivity,sharingorpublishingchangesinapeer-to-peerfashionifandwhentheychoose.AnexampleofthedynamicloadingprocessisillustratedinFigure4.2.Thefollowingsequence41ofstepsdescribeshow,forinstance,thedevicemanagementsystemmightlazy-loadsadevicedriver,waitingtodownloadandinstalltheappropriatecodeuntilthedevicehasbeenphysicallyconnectedtoagivenhostmachineforthetime.HerethephasesarelistedwiththesamenumberingasinFigure4.2.1.Aclientserviceattemptstoaccessfunctionalityonanotherservicewhichisnotpresentlyrunning,orexplicitlyrequestsforanewservicetobeloaded.Inthiscase,theclientisaserviceresponsibleforenumeratingtheserialportsavailableonthesystemandattemptingtoretrieveidentifyinginformationfromconnecteddevices,anditsrequestforadevicedriverincludesidentifyinginformationbutmaynotnamethedriverexplicitly.2.Thefiservice-hostingservicefl,responsibleformanagingdynamicallyloadedprograms,queriesadatabaseofserviceinformationtodeterminewhereitcandownloadtherequestedcode.ThedatabaserespondswithaURIspecifyingeitheradirectlinktothenecessaryoraGitrepository.3.Theservice-hosterdownloadsthemodule,possiblyfromaninternalserverorfromapubliclyhostedlocationsuchasGitHub[30],andexecutesnecessarystartuproutines.Iftheservice-hosterdeterminesthattheserviceisalreadyloaded,itinsteadchecksifanupdatedversionexistsandprovidestheclientwiththeoptiontodownloadandusethenewversion.eGorser-vicesareexpectedtoimplementacommoninterfaceofsetup,st
art,stop,andcleanupscriptssothattheservice-hostercaninstallandrunthemautomatically.Theservice-hosteralsoindicatestothenewserviceinstancehowitcancommunicatewiththesystemswitchboardresponsiblefortriggeringthedownload.4.Oncethedynamicallyloadeddevicedriverislive,itregistersitspublicAPIwiththeswitch-board,atwhichpointthedriver'sfunctionalityisavailableforotherservicestouse.5.Theswitchboardcompletesanypendingprocedurecallsrequestedbytheclientservice,androutessubsequentrequeststotheappropriateservice.42Figure4.2:Dynamicloadingofmicroservices.Phasesofthedynamicload-ingprocessfordownloadingandinstallingauserdevicedriveratrun-time.Theservice-hostingserviceandtheservicesithosts(green)runonthesamephysicalhardware,whereaseachotherservicemaybeonadifferentde-viceconnectedviatheInternet.Asimilarprocedureisemployedforseveralothersubsystems,suchasloadingnewuserin-terfacecomponentsoraddingnewwaveformgenerationroutinestoourreal-timeelectrochemicalinterrogationplatform.4.2ServiceinterconnectAsdescribedinChapter3,themicroservicearchitecturalpatternprovidesastrategyforcom-partmentalizingthedevelopmenteffort,promotingmodularityofdesign,allowingforfuturecus-43tomizationandextension,andbuildingasystemthatemploysmanydifferentsoftwaretechnolo-giesandphysicalmachines.However,communicationbetweenmicroservicesinvolvessomechal-lengescomparedtotraditionalarchitecturesandreliesonseveralrecentlyemergedwebtechnolo-giestoallowservicestolocateanduseoneanother.Nonetheless,themicroserviceapproachpairswellwitheGor'shigh-levelgoalsandhasenabledustobuildaxibleandsophisticatedsystem.4.2.1RESTAPIsOneoftheelementsoftheweb'smoderninfrastructurethathasmadenetworkedmicroservice-orientedapplicationsapracticalpossibilityisthewidespreadadoptionbybusinessesandopen-sourcesoftwareprovidersofrelativelyuniform,publiclyavailableAPIsoverHTTP.Mostcom-monly,companiesexposereusablepubliccomponentsoftheirwebserversasHTTPinterfaceswhichaspiretorepresentationalstatetransfer(REST)principles,i.e.,theymodeltheevolutionofanapplicati
on'sstateasasequenceoftransitionsbetweenstateswhicharemodeledbyURIs.Theseconventionshaveallowedforunprecedentedinteroperabilitybetweenapplicationswrittenatdifferentcompaniesforverydifferentpurposes.Asaprominentexample,Google'sMapsAPIprovidesamechanismforotherapplicationstoretrievegeographicalinformationoveranInternetconnectionratherthanmaintainingindependentlocationdatabases.GiventhatmanyconsumersoftheseAPIsarewebbrowserapplicationswhichuseJavaScripttoissuebackgroundHTTPrequests,JavaScriptObjectNotation(JSON)isapopularserializationformatforpassingdatapayloadstoandfromAPIendpoints.Thisstructure,whereawebsiteprovidesanindexablecollectionofJSON-encodedresourceswhichcanberetrievedandmanipulatedviaHTTPverbs,isoftenwhatismeantbyaRESTAPIintoday'ssoftwarejargon.RESTAPIsareusefulinterfacesformakingapplicationstateanddataexternallyaccessible,butarealsoaviableoptionforstructuringnetworkedcommunicationbetweendifferentpartsofthesameapp.Mostcommonly,aRESTAPIisusedtoprovidestructureddatabaseaccesstoclientcoderunninginabrowserapp.Forexample,anewswebsitemightallowclientstoretrievealistofarticles(inJSONformat)bymakingaGET/articlesHTTPrequest,thenretrievethe44user'sselecteddocumentbyqueryingGET/articles/2,thencommitasubmittedcommenttothedatabasewithPOST/articles/2/comments.ManyoftheexistingtoolsforconstructingRESTAPIswithwebprogrammingframeworkssuchasPython'sFlask[4]orExpressinNodeJS[20]provideforthisusecase.IneGorweassignURIsinasimilarhierarchicalfashion,butthetotalapplicationconsistsofanumberofgroupsofmicroservices,eachpotentiallypossessingaRESTAPI.Asnewservicesareloadedbyaparticularswitchboard,theirAPIsareattachedtothetreeofexistingURIs,muchasaonanewharddrivemightbemountedataparticularpathonaUNIXAneGorswitchboardachievesthisbyservingaproxyataURIcorrespondingtoaknownlabmachine,suchas/machines/0,relayingtraftoandfromthatmachinewhichinturnprovidesaccesstoconnecteddevicesasRESTAPIsatappropriateURIs.InformationaboutthewaveformtypesthatadevicecalledfiaMEASUREIfliscapableofproducingwouldt
henbeavailablebyaccessingGET/machines/0/devices/aMEASUREI/wave/info,assumingthattheRESTAPIforaMEASUREIunderstandshowtointerpretthepath/wave/info.SpecifyingaRESTAPIispartofauser'sresponsibilitywhenadriverforanewdevice,asexplainedfurtherinsection4.5.2.4.2.2WAMProutingWebApplicationMessagingProtocol(WAMP)isanopenprotocolandsoftwarestackcreatedbyTavendo,whoprovidereferenceimplementationsinseverallanguagesintheformoftheAutobahnprotocollibrariesandarequestroutercalledcrossbar.io[67].Theauthorsofthesetoolsclaimthattheirprotocolsimultaneouslyaddressesmanyoftheusecasesofexistingprotocolsformachine-to-machinecommunicationsuchasAdvancedMessageQueueingProtocol(AMQP)andsocket.io[59].TheprotocolisbuiltontopofWebSockets,whichusesaTCPconnectiontoachievereliablefull-duplexstreamingandisnowsupportedbyallmajorwebbrowsersandanumberofwebframeworksinseveralprogramminglanguages.OneadvantageprovidedbythisprotocoldesignapproachisthatmachinesforhostingmicroservicesoractingasclientsforWAMPnetworkscanrequirelessspecialsoftwarethanisrequiredforusingsomemessagequeuinginfrastructures45suchasAMQP,simplifyingtheinstallationprocessforendusers.WAMPprovidesasetofcapabilitieswhichareagoodmatchforourapplication,includingbuilt-insupportforroutingremoteprocedurecallsbetweenanytwoconnectedservicesandbidi-rectionalpublish/subscribe-stylemessagepassing[35].Notably,WAMPalsodescribeshowaservicecalledarouterfacilitatesorganizedcommunicationbetweennodesbyredirectingremoteprocedurecallsanddatastreamstoappropriateclientendpoints.Theprotocolwasexplicitlyde-signedtosimplifytheimplementationofIoTapplications,especiallythosewithservice-orientedarchitecturesthatspanmultipledevicesofdifferenttypes.Furthermore,WAMPisdesignedtotar-getmanydifferentlanguagesandtargetdevices,providingacommonnetworkinterfacebetweenserver-sidecode,browsers,andmobileapps,andtheabstractionandseparationofconcernspro-videdbysuchaframeworkiswell-matchedtoheterogeneousservice-orientedarchitecturessuchaseGor's.Thisprovidesanattractivesolutionforaddre
ssingmanyoftheproblemsfacedwhendevelopingoursystem,especiallygiventhatitallowsfordynamicregistrationandremovalofremotely-callablemethods,xibleroutingofdatasourcesthroughdifferentmachinesandend-points,andisinherentlybidirectionalinthesensethatanyservicecaninitiatecommunicationwithanyothersolongasithassufsecurityprivileges.Inourexistingimplementation,WAMP'scapabilitieshaveprimarilybeenusedforservice-to-servicecommunicationandtoorganizestream-ingdatatransactionsfromdatasourcestosinkssuchasreal-timeplotsandarraystorageservices.Amoreelegantandtrulyservice-orienteddesigncouldbeachievedbyadoptingWAMPforissuingusercommandsandmakingdatabaseaccessesaswell,butatthetimeofthiswritingWAMPhaspoorertoolinganddocumentationthansomeofitsmoreestablishedcounterparts.4.3UserinterfaceProvidingausefulandmanageableinterfaceforuserswhoarenotcomputerexpertsisoneofeGor'smostimportantdesignconstraints.Thedevelopmentefforthasleveragedseveralpowerfullibrariesfordevelopingwebapplicationsandprovidingdesireduserinterfacefeaturesto46allowresearcherstoimmediatelytakeadvantageofeGor'scapabilities.4.3.1ThinclientdesignAllgraphicalinterfacecomponentsofeGorareimplementedasinteractivewebpagesandrequireonlyamodernwebbrowserandanInternetconnectiontouse.InthiswaytheuserinterfaceactsasathinclientportalconnectinguserstoaneGorserver.Thisapproachhastheadvantageofrequiringnoinstallationontheuser'spartotherthanregisteringanaccount,aswellasmakingsoftwareupdatestransparenttousers,sincethelatestversionisautomaticallyretrievedfromtheservereachtimeauserconnects.Furthermore,ourapproachimplementstheuserinterfaceasasingle-pageapplication,meaningthatauser'sentiresessiontakesplacewithoutretrievingmorethanonepagefromtheserverorrefreshing,insteadusingasynchronousHTTPrequestsinthebackgroundandbidirectionalWebSocketcommunicationtosynchronizeapplicationstatewiththeserver.Structuringclient-serverinteractionsinthiswayhelpstodecouplethebrowserfromtheback-end,allowingthesecomponentstobedevelopedindependently.Byleveragingabstractionsp
rovidedbyWAMP'sapplicationframework,itispossibleforuserinterfacecomponentssuchasplotsandcontrolpanelstoactaspeerswithothermicroservices,simplifyingthestructureoftheprogram.4.3.2Angular2TheJavaScriptframeworkunderpinningeGor'sbrowserappuserinterface(UI)isAngular2,acompleterewriteofthepopularandsophisticatedAngularJSframeworkforbuildingsingle-pageapplications[32].AngulargivesprogrammersatoolkitforcustomHTML5tagswithdynamicbehaviorandforfitwo-waydatabindingflbetweenelementsofawebpageandJavaScriptobjects.ThismeansthatdisplayelementsareautomaticallyupdatedwhenJavaScriptvariableschange,andsimilarlyuserinputssuchaschangestoformelementsareautomaticallyinboundJavaScriptdatastructures.Thissynchronizationbetweenelementsofapage's47documentobjectmodel(DOM)andcorrespondingvariablesintheJavaScriptprogramallowsforadeclarativeprogrammingstyletobeusedtodescribehowanappisdisplayedwhilewritingthecontrollogicwithsequential,imperativeJavaScript.AdvantagesofAngular2overitspredecessorandotherclient-sideprogrammingframeworksincludeastructured,object-orientedstyle,afocusonreactiveprogrammingusingtheObserverpattern,andimprovedsupportforasynchronousfunctionalitysuchaslazy-loadingcomponentsandapplicationstructure.Inparticular,thedesignpatternsembracedbyAngularalloweGor'sdeveloperstocarryamodular,service-orientedphilosophythroughtotheuserinterface,compart-mentalizingfunctionalityintoaconnectedgroupofindependent,dynamicallyloadedservices.4.3.3UIcomponentsondemandAlthoughAngularisbuiltforcreatingsingle-pagebrowserapplications,theirassociatedJavaScriptcodeoftenmakesmanybehind-the-scenesnetworkrequeststoretrieverequestedorup-to-dateinformation.Amapapplicationprovidesafamiliarexample:ratherthanloadinggeographicalinformationabouttheentireglobewhenthepageistloaded,newconnectionstoRESTAPIsaremadeasynchronouslyinthebackgroundtoretrievemoredataastheuserpansandzoomsthemaptoviewdifferentlocations.Inadditiontoprovidingfull-featuredabstractionsforretrievingandmanagingremotedatasourcesofthiskind,Angular2hascap
abilitiesfordynamicallyloadingapplicationcodeaswell.Atypicaluse-caseforthisfeatureistolazy-loadcomponentsthatdonotneedtobepresentinthepageinitially,reducingstartuptimebywaitingtodownloadsomeJavaScriptmodulesorHTMLtemplatesuntiltheusernavigatestoastatewhichrequiresthem.eGor'sdesignmakesuseofthisdynamicloadingfunctionalitytoallowforarbitraryexten-sionstotheUI.EachdocumentstoredineGor'sartifactdatabasemaybeassociatedwithoneormoreUIcomponents,self-containedAngularmoduleswhichprovidespecial-purposefunctional-ityassociatedtoauser,device,orexperiment.Forinstance,theaMEASUREIelectrochemicalmeasurementdeviceprovidesareal-timedatamonitorandseveralcontrolpanelscorrespondingtothedifferentwaveformgenerationmechanismsitsupports.Thedatabaserecordstoringinforma-48tionaboutthisdevicecontainsanentryforeachoftheseUIwidgetsincludinglinkstoJavaScriptcodeanAngularcomponentanditsbusinesslogic,HTMLandCSSdeclaringhowtheyshouldbedisplayed,andlinkstoadditionalassetsuchasimagesandPDFoperator'smanuals.AswithotherelementsoftheeGorframeworkthatemploydynamicloading,thisdesignallowsthecoresoftwaretoremainsmall,efandsingle-purposewhileallowingfuturedevelopersanduserstocreateandcustomizecomponentstomeettheirneeds.BychoosingWAMPasthenetworkinterconnectbetweenservices,wehavealsomadeitpossibleforJavaScriptcomponentsinthebrowsertointeractwithback-endservicesoverthesamecommunicationinterfaceasthemicroservicesusewitheachother,promotingscalabilityandmodulardesign.Thiscomponent-basedapproachtobuildingthebrowserappcouldbesupplementedbyagraphicaltoolallowinguserstodrag-and-dropinterfacecomponentstobuildtheiridealcontrolpanelinasimilarwaythatusersofLabVIEWareaccustomedtoconstructingcontrolpanelsfortheirvirtualinstruments[23].Thisapproachalsoalignswithourmicroservicearchitecture,wherefunctionalityiscontainedinindependentinteractingcomponentswhichmayhaveverydifferentinternalbehaviors.Thedecoupleddesignalsoallowsthesystemtoembedthird-partywebcomponentsandevenentirelydifferentbrowserappsintothefront-end,enablingeGo
rtointegrateexistingopen-sourcesoftwarepackagessuchaselectroniclabnotebookswithourdesign.4.3.4JupyterTheJupyterproject(formerlyIPython)isanopen-sourcesoftwaretoolprovidingaxiblearchi-tectureforcreatingelectroniclabnotebooksforcomputing.Jupyternowsupportsmanyofthemostpopularprogramminglanguagesforscienceandengineeringapplicationsandhasseveralextensionpackagesprovidingadditionalfunctionality,mostnotablyJupyterHub.Jupyter-Hubprovidesawebserverwhichallowsteamstoshareandcollaborativelyeditnotebooksinthebrowser,embeddingplots,equations,codeandmoreinsideadocumentthatdoublesasanexe-cutableanalysisprogram.Thistooliswell-supportedandhasaddressedmanyoftheimportant49challengesassociatedwithbuildingacollaborativedocumentationtoolforcomputationalscience.eGorexperimentartifactmodelincludesaUIcomponentcalledfiNotebookflwhichembedsaJupyterPythonnotebookinsidetheeGorbrowserapp.Thisallowsresearcherstoattachanalysisandobservationstoanexperimentinaxibleformatusingafull-featuredlanguageandtoolkitforscientcomputing.SincemanyeGorcomponentsweredevelopedwithPython,thisalsoprovidesamechanismforuserstointeractdirectlywithothersystemcomponentsatmanylevelsofabstraction,potentiallyinterleavinginstrumentcontrolcommandsandsignalprocessinginasinglenotebook.PythonlibrariessuchasNumpy[69]orPandas[48]alsoprovidepowerfulhigh-levelAPIsforinteractingwithdatasets,andeGorprovidesaccesstoanexperiment'srawdatafromwithintheassociatedJupyternotebook.ThisapproachusesamatureprogramtosupplementeGorwithmuch-neededELNfunctionalityanddemonstrateshowthemodulardesignallowsforembeddingusefulthird-partytoolsintothebrowserapp.4.4DatabasemanagementAcarefulchoiceandimplementationofthesystem'sdatamodelisimportantforperformance,xibility,anddeterminingtheorganizationoftheapp.eGoremploysmultipleserver-sidedatastoresforpersistingdatasets,imagesanddocuments,informationaboutexperiments,devices,andusers,andrecordsdescribingeGorcomponentssuchascodefordynamicallyloadedmodules.4.4.1NoSQLandschemalessdatabasesFormanyyearswebapplication
sprimarilyusedrelationaldatabasesforpersistentdatastorage,queryingandassemblingthemusingStructuredQueryLanguage(SQL).Thesedatabasesarebasedonwell-understoodtheoreticalfoundationsandhaveanumberofadvantagesforapplicationssuchasbusinessoperationsmanagementandtraditionalwebarchitectures,buthaveperformancedif-whendealingwithcomplexdatastructuresthatarenotnaturallysuitedtoatableformat50[51].Relationaldatabasesareoftenalsorigidlytiedtoadatabaseschema,makingthemill-suitedforrecordswithmanysmallvariationsinstructureorwithstructuresthatchangeovertime.Inresponsetothedifposedbythistechnology,considerableinvestmenthasbeenmadeinrecentyearsindevelopingalternativedatabasestyleswhichoffermorexibilityandparal-lelscalability.Theseso-calledfiNoSQLfldatabaseshavenativedatastructuressuchasgraphs,key-valuepairs,orobjectmodelssuchasthosefoundinobject-orientedprogramminglanguages.Oftenthesetoolsbilltheirdatamodelsasfischemalessfl,contrastingthemselveswithtraditionalrelationaldatabaseswhereadministratorsmustprovideasetofnames,types,andrela-tionshipsfortherowswhenanewtablestructureiscreated.ClaimedofNoSQLdatabasesoftwareincludeimprovedadaptabilitytochangingdataformatsandbetterperformanceforsomeapplications[38].TheprimaryinformationofinteresttoeGoriscomplexlystructuredandconcernedwithrelationshipsbetweenpublications,experiments,datasets,anddevices,andwesoughttoadoptadatabasetechnologycapableofnaturallymodelingtheseartifactsandtheirconnections.Forthisreasonweinitiallyexaminedgraphdatabasesandrelationaldatabaserepresentationsofsemanticwebcontent,butultimatelychosethedocumentdatabaseMongoDBtothenested,inheritance-focusedstructureofourdatamodel.MongoDB'sinternalstructuresalsomapdirectlytotheJSONobjectstructureusedforcommunicationandstaterepresentationthroughouttheeGorsystem.ThepopularityofdocumentdatabasesintheNodeJSecosystemhascreatedathrivingspaceofopen-sourcetoolsforworkingwithsystemslikeMongoDBandconnectingthemtootherimportantapplicationcomponents.ConstructinganAPIlayertoexposeRESTfulaccesstoa
database involves a substantial amount of boilerplate and can become quite error-prone and difficult to manage as system and data model complexity increases. StrongLoop LoopBack [66] is an open-source framework which generates API endpoints and documentation for one or more server-side data stores. LoopBack is built on top of the NodeJS web application framework ExpressJS, allowing it to be used in conjunction with the wide array of community plugins and middlewares available for Express. In eGor, LoopBack is used to produce object models for artifacts such as instruments, virtual workbenches, experimental runs, result sets, and eGor software plugins. LoopBack generates hand-customizable REST APIs for declaratively defined data models, automatically handles accesses to several different database backends, and includes application logic for important fundamental tasks such as access controls, account creation, and file uploads.

4.4.2 HDF5

Although NoSQL databases such as MongoDB provide a flexible solution for persistent storage of complex document-like data, they are ill-suited for efficiently querying large array-like datasets. HDF5 is a technology for manipulating multidimensional time-indexed formats that has seen strong adoption in machine learning and data science in recent years [33]. The open source working group responsible for developing HDF5 has also provided a Python web server for storing datasets and presenting them as network resources, which includes a reference implementation of a REST API for remotely manipulating and extracting subsets of datasets.

eGor keeps a record of a given experimental run by capturing its metadata in a MongoDB collection reserved for cataloging past experiments. This document contains annotations about the experiment's purpose and outcomes, links to related artifacts, such as a workbench record describing the instruments involved and their connections, and an embedded result log document providing information on the experimental results. This result log provides information about where the tables of associated data can be found, in the form of links to HDF5 resources stored elsewhere on the network, and metadata about how the attached datasets should be interpreted, such as information about measurement units. The HDF5 interfacing portion of eGor also includes a microservice which subscribes to data streams published by device drivers over WebSockets, buffers the incoming data, and appends it to appropriately organized HDF5 stores.

4.5 Device management

eGor's core framework includes a microservice which runs on an instrument-connected lab PC and is responsible for detecting which instruments are connected, loading appropriate device drivers and protocol-translating software modules, and presenting a uniform network interface for handling device controls and data in the form of REST and WAMP APIs. This section discusses the design of the instrument management service (written in Python) and explains how some of the challenges in its implementation were addressed.

4.5.1 Enumeration

One complication that arises when attempting to communicate with many different lab devices is that it is difficult for a PC to determine exactly which devices are connected to it. Many scientific instruments use legacy protocols and hardware such as serial or parallel ports which provide no built-in mechanism for identifying a device or its capabilities to a host machine. To address this problem, the measurement equipment industry standardized the Virtual Instrument Software Architecture (VISA) API, which is implemented by instruments from a number of different manufacturers [27]. To determine what device is connected and load an appropriate device driver, the host must send a message to the instrument asking for identifying information.

Typically serial instruments are connected to modern PCs using USB-to-serial adapters. Further complicating the enumeration process, information provided to user-space applications about USB-connected devices and USB connection events varies by operating system and does not necessarily contain information about whether a given USB device is a serial port. The eGor microservice responsible for tracking connected devices therefore uses the following algorithm to keep an up-to-date record of which equipment is connected.

1. When a USB event occurs signaling connection or disconnection of a device, the instrument manager triggers a re-scan of all serial ports.

2. To scan a given port, the instrument manager transmits the VISA command "*IDN?" and waits for a response. Since the communication rate of the target device is unknown, this step must sequence through a list of commonly used baud rates, pausing after each transmission to see if a response is received. Since some devices of interest do not conform to the VISA specification, different messages are transmitted to probe for some other known devices.

3. Once a response is received to an identification request, the database is queried with the instrument's response string and the baud rate that was used to retrieve it. The database is searched for a known device driver matching this identification.

4. If an appropriate device is found, the service-hosting service on the lab PC is asked to download and install the microservice used to manage interaction with the target device.

5. The device driver microservice is brought up, issues startup commands to the device, and registers its high-level API with the eGor switchboard.

4.5.2 Protocol stacks

One of the most important capabilities of the eGor system is its ability to connect to computer-controlled lab equipment. Scientific processes often involve a wide range of legacy instruments which use different communication protocols and data formats. Rather than attempting to provide individual drivers and protocol translators for the many devices that our future users may need, we have constructed a Python library of simple protocol building blocks which can be connected into more elaborate protocol stacks.

We abstractly define a protocol layer as a composable software unit for transforming units of data from one format to another. A protocol layer consists of a codec, a transport, and a controller. One or more of these subcomponents may inherit the default implementation from the "layer" base class, which simply passes data through unaltered. A codec is a direct, often reversible, transformation from one encoding to another, a transport is responsible for logically partitioning streams into different units, and a controller determines how packets are generated or consumed at each layer. Layers are bidirectional and symmetric by default, in the sense that device-bound traffic and host-bound traffic are assumed to have the same format at a given layer. Using different layer assembly functions provided by the library, layers are "stacked" to form more complex protocols.

Since the protocol stack abstraction captures the process of transcoding data between different formats as well as managing data transfers, combining a sequence of protocol layers can be used to assemble
a Python device driver for a custom or commercial instrument. A completed eGor-compatible driver captures and transmits data to a low-level byte interface at one end and presents a high-level network-connected API at the other. eGor drivers typically expose a set of remote procedure calls for manipulating the associated device and produce one or more event-driven data streams which are published to the WAMP router. A developer may make a new device driver known to the system by assembling an appropriate protocol stack, identifying its network-facing communication points, and pointing eGor's database at a Git repository containing the driver code and an appropriate metadata file via the graphical interface.

We have developed complete device drivers for aMEASUREI, its successor aMEASUREII, and a commercial gas flow control device manufactured by Alicat. Additionally, the eGor team provides a "seed" repository containing skeleton code for a custom device driver at https://github.com/egor-elab/driver-seed.git. We are hopeful that researchers who find our toolchain useful will help support the project's long-term usability by developing and contributing to a library of device drivers and other eGor services.

4.6 Summary

This chapter outlined the design patterns employed by the implemented eGor software and investigated some of the concrete tool selections and problem solving approaches chosen for assembling the system. The resulting application draws on a broad swath of

CHAPTER 5
SUMMARY

This thesis has provided a review of the design and implementation of software systems for facilitating reproducible research and has described eGor, a toolkit for collaboratively specifying, executing, and analyzing real-world experiments. The completed design has a number of subsystems and draws inspiration from a number of existing free and commercial software packages, and would not be possible without the huge range of open source libraries available for web programming today. Nevertheless, creating eGor has involved solving a number of difficult software development problems and has resulted in several design approaches and capabilities that appear to be unique. While minimum functionality has been achieved, this project is an ongoing effort and will rely on continued contribution from core developers and the open-source community to make a truly powerful framework.

5.1 Contributions

eGor was designed in response to a set of challenges that are not adequately addressed by existing software. The tool described in this thesis combines a number of features that have not previously been explored in this application domain, centered around a modern, flexible web application architecture built on a broad range of advanced software technologies and design patterns. Here we reiterate the unique features of the system and their value for automated, reproducible research as well as other large-scale web applications.

5.1.1 Hardware-connected lab informatics

eGor represents the only software solution we are aware of for cloud-based structured management of experimental data and procedures that is also capable of directly interacting with arbitrary commercial or custom hardware. The system's design allows users to remotely control lab equipment via an ordinary browser interface, but also allows scheduled sequences of device commands and parameters to be captured alongside datasets, discussion, and analysis in an electronic lab notebook document. Raw datasets and the metadata necessary for their interpretation can be automatically recorded in a richly structured archival format, allowing collaborators or reviewers to examine, analyze, and share the complete lifecycle of an experiment involving manipulations of physical equipment as well as sophisticated software analysis. We believe that this integrated set of capabilities will improve researcher productivity, the reproducibility of work, and the state of software-driven workflows in general.

5.1.2 Modularity via microservices

The core of eGor's design is its focus on subdividing functionality into small functional subsystems called microservices which communicate over ordinary network protocols and may be composed into larger systems. This is not a design innovation on its own, and in fact is being increasingly adopted by a number of technology companies to power their internal operations. However, eGor makes extensive use of dynamic module loading to allow new microservices to be created and enabled at runtime and seamlessly integrated with existing functionality. This behavior is made possible by a special microservice responsible for installing other services. By compartmentalizing each service into its own version-control repository, system components can also be upgraded using this mechanism whenever they change, allowing the entire system to cooperate with continuous integration strategies and making sure all users stay up-to-date. Additionally, each device driver is encapsulated in a service, and this strategy at once simplifies the design and makes it possible for users and eGor developers to add support for new devices far into the future.

5.1.3 Design for customizability

The ability to dynamically load new services also provides users with a powerful mechanism for customizing and extending the eGor system to meet their needs. Ultimately the developers envision an ecosystem where researchers can create and share services independently, collaboratively extending eGor in much the same way as scientists share publications or technical tips. eGor's design is marked by a focus on user customization, and we provide libraries for creating new device drivers and user interface components. User interface components may also be loaded dynamically into the browser interface, and each service or device driver may be associated with any number of these graphical control panels. The browser app makes use of a "hot reloading" approach to allow parts of the page to be updated without performing a complete refresh, enabling a very fast development cycle for users to create and customize their control panels. We believe the novel architectural choices made to enable these capabilities will help to overcome the limited longevity of many lab informatics software packages.

5.2 Implementation status and future work

The design vision elaborated in this thesis is ambitious and has not yet been completely realized. At the time of this writing the system's core architecture is in place, but some usability features have yet to be implemented. The infrastructure for dynamically loading microservices and communicating between them has been completed, and the system can automatically detect connected devices, look up appropriate drivers by querying our internal server, and bind the driver to the connected port. Real-time remote interaction with devices via the browser interface is available, as shown in Figure 5.1. A small set of device drivers for the equipment used by our collaborators has been implemented by composing together simple protocol and control elements from a library provided as part of the eGor system.

An archiving subsystem has also been developed, allowing data streams from multiple devices to be permanently recorded in an efficient indexable table format. These data tables are cross-referenced with a database of user-provided metadata describing the experiment which created them, and can be examined via the user interface and downloaded for external use. This functionality is intended to allow researchers to manage their datasets alongside the specifications for their experiments, ultimately building publication units which include detailed links to all the research artifacts that contributed to a set of findings.

Figure 5.1: User interface screenshot. Screenshot from the web browser front-end, showing a user interacting with a remote piece of equipment. Data may be captured and monitored in real time, and the device may be controlled either using graphical interface elements or directly communicating with the device over a browser-embedded terminal.

Given the scope of eGor's application domain and the opportunities such a tool provides for researcher productivity and auditability, the developers have begun to maintain a large and continually growing list of desired features. Much of the functionality still to be implemented has to do with improving the richness of the experiment metadata model. In particular, researchers should be able to compare different experimental trials and to better understand the impact of changing a parameter or piece of equipment. A more object-oriented approach for defining experiment templates would also be beneficial for improving research productivity, and an object database may be a good fit for improving how systems are modeled. Additionally, given the infrastructure already in place it should be straightforward to expand the existing system to allow interaction between many different machines and independent eGor installations, but this functionality was not necessary to our immediate use case and is not yet implemented.

5.3 Conclusion

As it exists, eGor is usable for remotely controlling devices and for capturing and sharing data in richer formats than are currently typically found in ad-hoc data collection. The core architecture has been implemented for allowing user interface components, database access layers, and de
vicedriverstointeract,andduetothearchitecturalfocusonmodularityandruntimeextensibilitywebelievethatfutureuserswillbeabletograduallyextendthesystemtomeettheiruniqueresearchgoals.Thecurrentimplementationactsasausableproofofconceptforthevisionelaboratedinthisthesis,connectingresearchers,equipment,anddatainunprecedentedwaysusingthenascentInternetofThingsasatechnologicalsubstrate.60APPENDICES61APPENDIXA:ACRONYMSAMQPAdvancedMessageQueueingProtocol.45,46,Glossary:AMQPAPIapplicationprogramminginterface.3,4,27,29,30,42,44,45,51,53Œ55,Glossary:APIBPMNbusinessprocessmodelnotation.15,Glossary:BPMNDOMdocumentobjectmodel.48,Glossary:DOMELNelectroniclabnotebook.13,17Œ19,50,Glossary:ELNIoTInternetofThings.3,4,46,Glossary:IoTJSONJavaScriptObjectNotation.44,51,Glossary:JSONLIMSlaboratoryinformationmanagementsystem.2,3,6,13,15,17Œ19,21,23,36,Glossary:laboratoryinformationmanagementsystem(LIMS)MEANawebapplicationsoftwarestackconsistingofMongoDB(database),ExpressJS(webservermiddleware),AngularJS(clientfront-end),andNodeJS(networkprogrammingar-chitecture).39RDFresourcedescriptionframework.20,Glossary:RDFRESTrepresentationalstatetransfer.44,51,53,Glossary:RESTRPCremoteprocedurecall.40,Glossary:RPCSQLStructuredQueryLanguage.50,Glossary:SQLUIuserinterface.47Œ50,Glossary:UIURIuniformresource.28Œ30,32,42,44,45,66,Glossary:URI62VCSversioncontrolsystem.10,Glossary:VCSVISAVirtualInstrumentSoftwareArchitecture.53,54,Glossary:VISAWAMPWebApplicationMessagingProtocol.45Œ47,49,53,55,Glossary:WAMPWMSwwmanagementsystem.17,Glossary:WMS63APPENDIXB:GLOSSARYAMQPanetworkprotocolforsharingdataandcomputationsbetweenaclusterofconnectedcomputers.ImplementedmostprominentlybyRabbitMQ[63].45APIAnapplicationprogramminginterface(API)isapublicallyexposedsetofsoftwarefunction-alityintendedtobeusedforcomposingotherapplications.Inawebprogrammingcontext,sometimesaRESTAPIismeant.3artifactagenerictermforanentityofinteresttoaresearchproject,especiallyadataset,sourcecodeofaprogram,oranexperimentalprotocol10,13backendtheinter
nalserver-sidelogicofawebapp,typicallyresponsibleforinteractingwithdatabasesandperformingintensivecomputations.27BPMNBusinessProcessModelNotation,aofawchart-likeformatfordescribingproceduresfoundinbusinessandmanufacturingsettings.See[3].15dataprovenanceAgeneralizedtermfortrackingscidataasitundergoesasequenceoftransformationsfromrawdataintoapublication-ready10databaseschemaadescribingthestructureofalloweddatabaseentries.51declarativeaprogrammingstylewhichfocusesonassertingrelationshipsbetweensoftwareenti-tiesratherthandescribingthestatetransitionsnecessarytotransformdata.48designpatternarecurring,reusableapproachforstructuringsoftware;ablueprintforhowapar-ticularprogrammingproblemmaybesolved.39,40,48,55DOMatermforthedatastructureusedtorepresentcomponentsofawebpage,namelythetreeofHTMLorXMLtagsandtheirassociatedproperties.4864ELNAnelectroniclabnotebook(ELN)isasoftwaretoolforhelpingresearcherstochronicletheirday-to-dayinvestigationsandresultsbycomposingrich-textdocumentswhichconsolidatedata,code,plots,andnatural-languageresearchquestionsandanalysis.13frontendtheuser-facingportionofawebapp,e.g.thegraphicalinterfacedisplayedbyabrowser.27in-silicoAdesignationappliedtoendeavorswhichconsistentirelyofcomputeranalysisofdata,namedincontrastwithin-vivobiologicalexperiments.2,14,15IoTTheInternetofThingsdescribesanear-futurenetworkinfrastructurecharacterizedbyun-precedenteddevice-to-devicecommunicationandubiquitousInternet-capablesensorsandactuators.3,18JSONasimpletextformat,originallynativetoJavaScript,forencodinghierarchicaldatastruc-turescontainingofmanydifferenttypes.44laboratoryinformationmanagementsystem(LIMS)Abundleofsoftwaretoolsforcoordinat-ingtheactivitiesofresearchers,trackinginventoryanddatasets,anddescribingandmoni-toringexperimentalprocessesinoneormorelaboratories.2lazy-loadastrategywhereanapplicationwaitstoloadsubcomponentsuntiltheyareabouttobeused,decreasingtheprogram'sstartuptimeandallowingittodependonresourceswhichmaynotbelocatableuntilruntimeinformationisavailab
le.42,48microserviceasmall,single-purposewebapplicationintendedtocommunicatewithacollectionofothermicroservices.27,66protocolstackaseriesoftranslationstepsconvertingonecommunicationprotocolintoanother.3165RDFaformalismforencodinggraphsofsemanticconnectionsbetweenentitiesviasubject-verb-objecttripleswhichservesasthebaselanguagelevelfortheW3C'ssemanticwebstandards.20researchobjectaproposedtypeofrichelectronicpublicationformatforpackagingdata,exe-cutableprocedures,anddocumentationinasinglesemantically-linkedarchive.20RESTasoftwarearchitecturewhereclientsinteractwithserversbynavigatingasequenceofstatesorresources,eachassociatedwithaparticularURI.44RESTAPIAnAPIadheringtoRepresentationalStateTransfer(REST)principles.RESTAPIsareendpointsforissuingcontrolanddatacommandsoveranHTTPinterface,allowingwebserverstoexposefunctionalityovertheinternetinaclient-agnosticfashion.44,45,48,52,64RPCaprogrammingabstractionwhereasequenceofnetworktransactionsisthoughtofasonemachineremotelyinvokingasubroutineoverthenetwork,receivingitsreturnvalueasaresponse.40prioritycreditforbeingthetopublishordescribeaninventionordiscovery.12semanticwebanapproachtoknowledgemanagementwhereInternetresourcesareannotatedwithgroupsofhyperlinksdescribingtheirrelationshipstootherresources.66SQLastandardizedlanguageforaccessingandmanagingdatabasesusingsetsofsearchcriteria.50switchboardamicroserviceresponsiblefordeterminingwhichothermicroservicesareactiveandmakingthemavailableatappropriateURIs.27Œ29thinclientahardwareorsoftwarecomponentwhichactsasalightweightportalconnectinguserstoserver-sidefunctionality,involvinglittleornoclient-sidesoftwaretouse.4766UItheportionofasoftwareapplicationconcernedwithacceptinginputfromtheuserandproduc-ingoutput;oftensynonymouswithgraphicaluserinterface(GUI).47URIatextstringuniquelyidentifyinganInternetresource.28VCSAsetofsoftwarefeaturesrelatedtotrackingrevisionsandallowingauthorstoreverttopreviousstates.10VISAanindustrystandardforthecommunicationinterfacethatatestingormeasurementinst
rumentshouldprovide.53WAMPahigh-levelapplicationprotocolbuiltontopofWebSocketsforallowingheterogeneousservicestocommunicateviaremoteprocedurecallsandpublish/subscribeeventstreams.45WMSAsoftwarepackageforcreatingandcomposingdirectedgraphsofprocessphasesand/ordependencies..13,1767BIBLIOGRAPHY68BIBLIOGRAPHY[1]AlbertoAccomazzietal.AggregationandLinkingofObservationalMetadataintheADS.2016.eprint:arXiv:1601.07858.[2]AgilentTechnologies.OpenLABSoftwareSuite.(Accessedon05/06/2016).URL:https://www.agilent.com/en-us/products/software-informatics/openlabsoftwaresuite.[3]ThomasAllweyer.BPMN2.0.BoD,2010.ISBN:3839149851,9783839149850.[4]FlaskauthorsArminRonacher.Flask(APythonMicroframework).http://flask.pocoo.org/.(Accessedon06/01/2016).[5]ArminBalalaie,AbbasHeydarnoori,andPooyanJamshidi.fiMicroservicesArchitectureEnablesDevOps:MigrationtoaCloud-NativeArchitecturefl.In:IEEESoftw.33.3(May2016),pp.42Œ52.DOI:10.1109/ms.2016.64.URL:http://dx.doi.org/10.1109/MS.2016.64.[6]SeanBechhoferetal.fiResearchobjects:Towardsexchangeandreuseofdigitalknowledgefl.In:(2010).[7]AndrewBegelandNachiappanNagappan.fiUsageandperceptionsofagilesoftwaredevelopmentinanindustrialcontext:Anexploratorystudyfl.In:EmpiricalSoftwareEngineeringandMeasurement,2007.ESEM2007.FirstInternationalSymposiumon.IEEE.2007,pp.255Œ264.[8]C.GlennBegleyandLeeM.Ellis.fiDrugdevelopment:Raisestandardsforpreclinicalcancerresearchfl.In:Nature483(2012).DOI:10.1038/453531a.[9]KhalidBelhajjameetal.fiUsingasuiteofontologiesforpreservingww-centricresearchobjectsfl.In:WebSemantics:Science,ServicesandAgentsontheWorldWideWeb32(2015),pp.16Œ42.[10]ChristianBizer,TomHeath,andTimBerners-Lee.Linkeddata-thestorysofar.Ed.byAmitSheth.Hershey,PA,2011.[11]MikioL.BraunandChengSoonOng.fiOpenScienceinMachineLearningfl.In:ImplementingReproducibleResearch.Ed.byVictoriaStodden,FriedrichLeisch,andRogerD.Peng.CRCPress.69[12]PeterBuneman,SanjeevKhanna,andWang-ChiewTan.fiDataprovenance:Somebasicissuesfl.In:FSTTCS2000:Foundationsofsoftwaretechnologyandtheoretica
lcomputerscience.Springer,2000,pp.87Œ93.[13]JorgeCardoso,RobertPBostrom,andAmitSheth.fiWwmanagementsystemsandERPsystems:Differences,commonalities,andapplicationsfl.In:InformationTechnologyandManagement5.3-4(2004),pp.319Œ338.[14]ScottChacon.ProGit.1st.Berkely,CA,USA:Apress,2009.ISBN:1430218339,9781430218333.[15]JamesCheneyetal.fiProvenance:AFutureHistoryfl.In:Proceedingsofthe24thACMSIGPLANConferenceCompaniononObjectOrientedProgrammingSystemsLanguagesandApplications.OOPSLA'09.Orlando,Florida,USA:ACM,2009,pp.957Œ964.ISBN:978-1-60558-768-4.DOI:10.1145/1639950.1640064.URL:http://doi.acm.org/10.1145/1639950.1640064.[16]FernandoChirigati,DennisShasha,andJulianaFreire.fiReproZip:UsingProvenancetoSupportComputationalReproducibilityfl.In:Proceedingsofthe5thUSENIXConferenceonTheoryandPracticeofProvenance.TaPP'13.Lombard,IL:USENIXAssociation,2013,pp.1Œ1.URL:http://dl.acm.org/citation.cfm?id=2482613.2482614.[17]TheOpenScienceCollaboration.fiEstimatingthereproducibilityofpsychologicalsciencefl.In:Science349.6251(2015).DOI:10.1126/science.aac4716.URL:http://science.sciencemag.org/content/349/6251/aac4716.[18]OscarCorchoetal.fiWw-centricresearchobjects:Firstclasscitizensinscholarlydiscourse.flIn:(2012).[19]CoreLIMS.WManagementintheCoreLIMS.https://corelims.com/wwmanagement.htm.(Accessedon05/05/2016).[20]ExpressJSdevelopers.Express-Node.jswebapplicationframework.http://expressjs.com/.(Accessedon06/01/2016).[21]Node.jsDevelopers.Node.js.https://nodejs.org/.(Accessedon06/01/2016).[22]UlrichDirnaglandIngoPrzesdzing.fiApocketguidetoelectroniclaboratorynotebooksintheacademiclifesciencesfl.In:F1000Research(Jan.2016).DOI:10.12688/f1000research.7628.1.URL:http://dx.doi.org/10.12688/f1000research.7628.1.[23]CELLIOTTetal.fiNationalInstrumentsLabVIEW:AProgrammingEnvironmentforLaboratoryAutomationandMeasurementfl.In:JournaloftheAssociationforLaboratory70Automation12.1(Feb.2007),pp.17Œ24.DOI:10.1016/j.jala.2006.07.012.URL:http://dx.doi.org/10.1016/j.jala.2006.07.012.[24]DanieleFanelli.fiHowManyS
cientistsFabricateandFalsifyResearch?ASystematicReviewandMeta-AnalysisofSurveyDatafl.In:PLoSONE4.5(May2009).Ed.byTomTregenza,e5738.DOI:10.1371/journal.pone.0005738.URL:http://dx.doi.org/10.1371/journal.pone.0005738.[25]CatFerguson,AdamMarcus,andIvanOransky.fiPublishing:Thepeer-reviewscamfl.In:Nature515.7528(Nov.2014),pp.480Œ482.DOI:10.1038/515480a.URL:http://dx.doi.org/10.1038/515480a.[26]UnitedStatesFoodandDrugAdministration.GuidanceforIndustryPart11,ElectronicRecords;ElectronicSignatures-ScopeandApplication.URL:http://www.fda.gov/RegulatoryInformation/Guidances/ucm125067.htm.[27]IVIFoundation.IVI.http://www.ivifoundation.orault.aspx.(Accessedon05/23/2016).[28]J.FreireandC.T.Silva.fiMakingComputationsandPublicationsReproduciblewithVisTrailsfl.In:ComputinginScienceEngineering14.4(July2012),pp.18Œ25.ISSN:1521-9615.DOI:10.1109/MCSE.2012.76.[29]ErichGammaetal.DesignPatterns:ElementsofReusableObject-orientedSoftware.Boston,MA,USA:Addison-WesleyLongmanPublishingCo.,Inc.,1995.ISBN:0-201-63361-2.[30]Inc.GitHub.GitHub.https://github.com/.(Accessedon06/01/2016).[31]GNUEmacs.https://www.gnu.org/software/emacs/.(Accessedon05/07/2016).[32]AngularJSdevelopersGoogle.Oneframework.-Angular2.https://angular.io/.(Accessedon06/01/2016).[33]TheHDFGroup.HierarchicalDataFormat,version5.http://www.hdfgroup.org/HDF5/.1997-NNNN.[34]MeganL.Headetal.fiTheExtentandConsequencesofP-HackinginSciencefl.In:PLOSBiology13.3(Mar.2015),e1002106.DOI:10.1371/journal.pbio.1002106.URL:http://dx.doi.org/10.1371/journal.pbio.1002106.[35]FuguoHuang.fiWebTechnologiesfortheInternetofThingsfl.71[36]JohnP.A.Ioannidis.fiWhymostpublishedresearcharefalsefl.In:PLoSMed(2005).URL:http://dx.doi.org/10.1371/journal.pmed.0020124.[37]DonaldE.Knuth.fiLiterateProgrammingfl.In:Comput.J.27.2(May1984),pp.97Œ111.ISSN:0010-4620.DOI:10.1093/comjnl/27.2.97.URL:http://dx.doi.org/10.1093/comjnl/27.2.97.[38]NealLeavitt.fiWillNoSQLDatabasesLiveUptoTheirPromise?flIn:Computer43.2(Feb.2010),pp.12Œ14.ISSN:0018-9162.DOI:10.1109/MC.2010.58.U
RL:http://dx.doi.org/10.1109/MC.2010.58.[39]AlbertL.Ledereretal.fiThetechnologyacceptancemodelandtheWorldWideWebfl.In:DecisionSupportSystems29.3(Oct.2000),pp.269Œ282.DOI:10.1016/s0167-9236(00)00076-2.URL:http://dx.doi.org/10.1016/S0167-9236(00)00076-2.[40]JensLehmannetal.fiDBpediaŒalarge-scale,multilingualknowledgebaseextractedfromWikipediafl.In:SemanticWeb6.2(2015),pp.167Œ195.[41]JamesLewisandMartinFowler.Microservices.http://martinfowler.com/articles/microservices.html.(Accessedon05/09/2016).Mar.2014.[42]H.Lietal.fiLowPowerMultimodeElectrochemicalGasSensorArraySystemforWearableHealthandSafetyMonitoringfl.In:IEEESensorsJournal14.10(Oct.2014),pp.3391Œ3399.ISSN:1530-437X.DOI:10.1109/JSEN.2014.2332278.[43]LIMSWiki.LIMSvendor.http://www.limswiki.org/index.php/LIMSvendor.(Accessedon05/06/2016).[44]S.MarcoandA.Gutierrez-Galvez.fiSignalandDataProcessingforMachineOlfactionandChemicalSensing:AReviewfl.In:IEEESensorsJ.12.11(Nov.2012),pp.3189Œ3214.DOI:10.1109/jsen.2012.2192920.URL:http://dx.doi.org/10.1109/JSEN.2012.2192920.[45]CatherineC.MarshallandFrankM.Shipman.fiWhichSemanticWeb?flIn:ProceedingsoftheFourteenthACMConferenceonHypertextandHypermedia.HYPERTEXT'03.Nottingham,UK:ACM,2003,pp.57Œ66.ISBN:1-58113-704-4.DOI:10.1145/900051.900063.URL:http://doi.acm.org/10.1145/900051.900063.[46]MATLABR2016a.Natick,Massachusetts,2016.[47]JamesMcCartney.fiRethinkingthecomputermusiclanguage:SuperColliderfl.In:ComputerMusicJournal26.4(2002),pp.61Œ68.72[48]WesMcKinney.fiDataStructuresforStatisticalComputinginPythonfl.In:Proceedingsofthe9thPythoninScienceConference.Ed.bySt´efanvanderWaltandJarrodMillman.2010,pp.51Œ56.[49]MicrosoftInc.MicrosoftWord.2013.URL:https://products.office.com/en-us/word.[50]PaoloMissier,KhalidBelhajjame,andJamesCheney.fiTheW3CPROVFamilyofforModellingProvenanceMetadatafl.In:Proceedingsofthe16thInternationalConferenceonExtendingDatabaseTechnology.EDBT'13.Genoa,Italy:ACM,2013,pp.773Œ776.ISBN:978-1-4503-1597-5.DOI:10.1145/2452376.2452478.URL:http://doi.acm.org/10.1145/
2452376.2452478.[51]C.Mohan.fiHistoryRepeatsItself:SensibleandNonsenSQLAspectsoftheNoSQLHooplafl.In:Proceedingsofthe16thInternationalConferenceonExtendingDatabaseTechnology.EDBT'13.Genoa,Italy:ACM,2013,pp.11Œ16.ISBN:978-1-4503-1597-5.DOI:10.1145/2452376.2452378.URL:http://doi.acm.org/10.1145/2452376.2452378.[52]LorraineMorganandPatrickFinnegan.anddrawbacksofopensourcesoftware:anexploratorystudyofsecondarysoftwareIn:OpenSourceDevelopment,AdoptionandInnovation.Springer,2007,pp.307Œ312.[53]TomOinnetal.fiTaverna:lessonsincreatingawwenvironmentforthelifesciences:ResearchArticlesfl.In:ConcurrencyandComputation:Practice&Experience18(June2006).ISSN:1532-0626.DOI:10.1002/cpe.v18:10.URL:http://dl.acm.org/citation.cfm?id=1148437.1148448.[54]W3COWLWorkingGroup.OWL2WebOntologyLanguage:DocumentOverview.Availableathttp://www.w3.org/TR/owl2-overview/.W3CRecommendation,27October2009.[55]FernandoP´erezandBrianE.Granger.fiIPython:aSystemforInteractiveComputingfl.In:ComputinginScienceandEngineering9.3(May2007),pp.21Œ29.ISSN:1521-9615.DOI:10.1109/MCSE.2007.53.URL:http://ipython.org.[56]JorgeP´erez,MarceloArenas,andClaudioGutierrez.fiSemanticsandComplexityofSPARQLfl.In:ACMTrans.DatabaseSyst.34.3(Sept.2009),16:1Œ16:45.ISSN:0362-5915.DOI:10.1145/1567274.1567278.URL:http://doi.acm.org/10.1145/1567274.1567278.[57]MichaelPilato.VersionControlWithSubversion.Sebastopol,CA,USA:O'Reilly&Associates,Inc.,2004.ISBN:0596004486.73[58]RDevelopmentCoreTeam.R:ALanguageandEnvironmentforStatisticalComputing.ISBN3-900051-07-0.RFoundationforStatisticalComputing.Vienna,Austria,2008.URL:http://www.R-project.org.[59]GuillermoRauch.Socket.IO.http://socket.io/.(Accessedon06/01/2016).[60]GuidoRossum.PythonReferenceManual.Tech.rep.Amsterdam,TheNetherlands,TheNetherlands,1995.[61]MichaelRubacha,AnilK.Rattan,andStephenC.Hosselet.fiAReviewofElectronicLaboratoryNotebooksAvailableintheMarketTodayfl.In:JournalofLaboratoryAutomation16.1(Feb.2011),pp.90Œ98.DOI:10.1016/j.jala.2009.01.002.URL:http://dx.doi.org/10.1016/j.jala
.2009.01.002.[62]Simulink.Natick,Massachusetts.URL:http://www.mathworks.com/help/simulink/.[63]PivotalSoftware.RabbitMQ-Messagingthatjustworks.https://www.rabbitmq.com/.(Accessedon06/01/2016).[64]DavidISpivakandRobertEKent.fiOlogs:acategoricalframeworkforknowledgerepresentationfl.In:PLoSOne7.1(2012),e24274.[65]RodStephens.BeginningSoftwareEngineering.1st.Birmingham,UK,UK:WroxPressLtd.,2015.ISBN:1118969146,9781118969144.[66]StrongLoop.LoopBackFramework.https://strongloop.com/node-js/loopback-framework/.(Accessedon06/01/2016).[67]Tavendo.Crossbar.io.http://crossbar.io/.(Accessedon06/01/2016).[68]SabuM.Thampi.fiIntroductiontoDistributedSystemsfl.In:CoRRabs/0911.4395(2009).URL:http://arxiv.org/abs/0911.4395.[69]Ste´fanvanderWalt,SChrisColbert,andGae¨elVaroquaux.fiTheNumPyArray:AStructureforEfNumericalComputationfl.In:Comput.Sci.Eng.13.2(Mar.2011),pp.22Œ30.DOI:10.1109/mcse.2011.37.URL:http://dx.doi.org/10.1109/MCSE.2011.37.[70]HuaiqingWangandChenWang.fiOpensourcesoftwareadoption:Astatusreportfl.In:Software,IEEE18.2(2001),pp.90Œ95.[71]ZheWangetal.fiHighlySensitiveCapacitiveGasSensingatIonicLiquidŒElectrodeInterfacesfl.In:AnalyticalChemistry88.3(Feb.2016),pp.1959Œ1964.DOI:7410.1021/acs.analchem.5b04677.URL:http://dx.doi.org/10.1021/acs.analchem.5b04677.[72]ZheWangetal.fiMethaneŒoxygenelectrochemicalcouplinginanionicliquid:arobustsensorforsimultaneousIn:TheAnalyst139.20(June2014),pp.5140Œ5147.DOI:10.1039/c4an00839a.URL:http://dx.doi.org/10.1039/C4AN00839A.[73]MatthewWest.fiComplexSystemsinKnowledge-basedEnvironments:Theory,ModelsandApplicationsfl.In:Berlin,Heidelberg:SpringerBerlinHeidelberg,2009.Chap.OntologyMeetsBusiness-ApplyingOntologytotheDevelopmentofBusinessInformationSystems,pp.229Œ260.ISBN:978-3-540-88075-2.DOI:10.1007/978-3-540-88075-2_9.URL:http://dx.doi.org/10.1007/978-3-540-88075-2_9.[74]E.Wolff.Microservices:FlexibleSoftwareArchitectures.CreateSpaceIndependentPublishingPlatform,2016.ISBN:9781523361250.URL:https://books.google.com/books?id=X7YzjwEACAAJ.[75]Wolf
ramResearch,Inc.Mathematica8.0.Version0.8.2010.URL:https://www.wolfram.com.[76]KatyWolstencroft.myExperiment-W-BlastAlignandTree(KatyWolstencroft)[Taverna2W.http://www.myexperiment.org/wws/3369.html.(Accessedon05/08/2016).Jan.2013.[77]Zetta.Zetta-AnAPI-FirstInternetofThings(IoT)Platform-FreeandOpenSourceSoftware.http://www.zettajs.org/.(Accessedon05/04/2016).75