DECIPHERING THE GENETIC BASIS FOR COMPLEX TRAIT VARIATION: UTILIZING ALTERNATIVE GENOME-WIDE ASSOCIATION METRICS AND MOLECULAR PHENOTYPES

By

Scott A Funkhouser

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Genetics - Doctor of Philosophy

2019

ABSTRACT

Within any population, complex trait variation can be attributed to an impressive number of genetic factors. Identification of such factors has been made possible, in part, by large biomedical datasets comprising genotypes and phenotypes for hundreds of thousands of individuals. Furthermore, understanding the biological mechanisms through which genetic variation creates complex trait variation has been facilitated by high-throughput sequencing technology, used to quantify molecular, intermediate phenotypes. Despite such datasets being widely available, we lack understanding of the full spectrum of genetic effects, including gene-by-sex (G×S) interactions. We also have yet to uncover various molecular phenotypes that may "link" genetic variation to complex trait variation. To address these gaps in knowledge, the following chapters will 1) develop and utilize statistical methodology for mapping G×S interactions among human traits, and 2) utilize a pig model to characterize RNA editing, a relatively understudied form of transcriptional modification, and evaluate its potential to link genetic variation with complex trait variation.
Growing evidence from genome-wide parameter estimates suggests that males and females from human populations possess differing genetic architectures. Despite this, mapping G×S interactions remains challenging, suggesting that the magnitude of a typical G×S interaction is exceedingly small. We have developed a local Bayesian regression (LBR) approach to estimate sex-specific single nucleotide polymorphism (SNP) marker effects after fully accounting for local linkage-disequilibrium (LD) patterns. This provided means to infer G×S interactions either at the SNP level, or by aggregating multiple sex-specific SNP effects to make inferences at the level of small, LD-based regions. In simulations, LBR provided greater power and resolution to detect G×S interactions than the traditional approach to genome-wide association (GWA), single-marker regression (SMR). When using LBR to analyze human traits from the UK Biobank ($N \sim 250{,}000$) including height, BMI, bone-mineral density, and waist-to-hip ratio, we find evidence of novel G×S interactions where sex-specific effects explain a very small proportion of phenotypic variance ($R^2 < 1 \times 10^{-4}$) but are enriched in expression quantitative trait loci (eQTL). By leveraging large datasets and powerful metrics, we are providing evidence that G×S interactions may influence phenotypic variance for a variety of human complex traits.

Adenosine to inosine (A-to-I) RNA editing impacts gene function by converting adenosine to inosine molecules within specific regions of the transcriptome and is catalyzed by adenosine deaminase acting on RNA (ADAR). High-throughput sequencing studies, most of which utilize human models, have found thousands of A-to-I edited loci commonly located within repetitive elements such as the primate-specific Alu element. Here, we utilized matched whole-genome sequencing and RNA sequencing from the same animal to demonstrate that widespread RNA editing occurs within pig transcriptomes, largely within pig-specific repetitive elements known as PRE-1.
The degree to which sites in the transcriptome are edited by ADAR (the "editing level") has been observed to vary within populations, but it is largely unknown how genetic variation as a whole influences editing level variation. Using 168 F2 pigs with SNP genotyping data and RNA sequencing from skeletal muscle, we identified five RNA editing sites across four genes whose editing level variation was significantly attributed to the additive effects of all observed SNP markers (estimated genomic heritability $\hat{h}^2_g$ = [...]; $p$-value = $8.2 \times 10^{-5}$ to [...]$\times 10^{-4}$). We then used bivariate models to estimate how genetics influences covariance between site-specific RNA editing levels and complex traits in pigs. We found modest evidence that SNPs near ADAR contribute to covariance in RNA editing activity and numerous growth traits such as average daily gain (local genetic correlation $\hat{\rho}_{g_{local}}$ [SE] = -0.87 [0.16]; $p$-value = 0.029). These results suggest potential pleiotropic effects between RNA editing activity and complex traits and encourage further use of multivariate mixed models to determine if RNA editing can "link" genetic variation with complex trait variation.

Copyright by
SCOTT A FUNKHOUSER
2019

This work is dedicated to Kelsey Funkhouser and our cat, Duchess

ACKNOWLEDGEMENTS

First and foremost, I would like to thank my advisor Cathy Ernst for giving me a chance to succeed as a graduate student. During my first year, I struggled to identify the area of genetics that both excited me and gave me confidence as a scientist. Cathy provided me an opportunity to work with the National Swine Registry (thanks to Doug Newcom and NSR as well) to develop methods to estimate breed composition from genetic data (I wouldn't have succeeded in that project without vital help from Juan Steibel and Ron Bates). This work exposed me to the greater world of statistics and quantitative genetics; today, these are two subjects that I love and feel compelled to continually acquire greater skills in. I am very thankful for a supportive committee who have allowed me to grow as a young quantitative geneticist (truly, I am thankful because prior to coming to MSU, I had never taken a statistics, probability, or linear algebra course).
I also cannot thank Cathy enough for her guidance. Without Cathy's help I could not have obtained a USDA NIFA pre-doctoral fellowship, which helped me obtain confidence as a scientist and a taste of successful grant writing experience.

I would like to thank Gustavo de los Campos who taught me many fundamentals of quantitative genetics, linear models, Bayesian estimation techniques, (the list goes on). In working with Gustavo, I began appreciating many open questions in the field of statistical and quantitative genetics. Together, we have spent a considerable amount of time brainstorming how we can improve upon existing GWAS methods. These experiences have been incredibly rewarding and will continue to inspire me well into my future scientific career.

Lastly, I have to thank my close friends, who have helped these six years become the most formative years of my life thus far. They include (but are not limited to) Ryan Corbett, Amanda Koenig, Agu González Reymúndez, and Kaitlyn Daza. Together, we've leaned on each other during the hardest times of graduate school. I also have to thank my best friend and partner, Kelsey Funkhouser, for being the strongest and most inspiring scientist I know.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
  1.1 Genetic factors that influence variation in complex traits
  1.2 Limitations to common GWAS techniques and alternative solutions
  1.3 Functional implications of noncoding variants on gene expression
  1.4 RNA editing and its role in gene expression
CHAPTER 2 DECIPHERING SEX-SPECIFIC GENETIC ARCHITECTURES USING LOCAL BAYESIAN REGRESSIONS
  2.1 Abstract
  2.2 Author Summary
  2.3 Introduction
  2.4 Results
    2.4.1 Overview of the LBR model, inference methods, and implementation
      2.4.1.1 Prior assumptions
      2.4.1.2 Local-regression
      2.4.1.3 Inferences
    2.4.2 LBR offers improved power with lower false-discovery rates
      2.4.2.1 Power and FDR when causal variants are genotyped
      2.4.2.2 Power and FDR under imperfect LD
    2.4.3 For real human traits, many newly discovered G×S interactions show relatively small sex-specific effects
    2.4.4 Inferred G×S interactions are enriched in tissue-specific eQTL
  2.5 Discussion
  2.6 Methods
    2.6.1 Genotype data
    2.6.2 Phenotype data
    2.6.3 LBR hyperparameters
    2.6.4 Inference using post-processing of posterior samples
    2.6.5 Defining local, LD-based windows
    2.6.6 Single marker regression
    2.6.7 Simulations
  2.7 Acknowledgments
CHAPTER 3 EVIDENCE FOR TRANSCRIPTOME-WIDE RNA EDITING AMONG SUS SCROFA PRE-1 SINE ELEMENTS
  3.1 Abstract
  3.2 Background
  3.3 Results and discussion
    3.3.1 DNA and RNA sequencing
    3.3.2 Identification of candidate RNA editing events
    3.3.3 Tissue differences
    3.3.4 Controlling for errors due to mapping quality
    3.3.5 Pig editome functional implications
    3.3.6 Pig editome association with pig-specific SINE elements
  3.4 Conclusions
  3.5 Methods
    3.5.1 Sequence data
    3.5.2 Sequence preparation and mapping
    3.5.3 Variant calling and mismatch detection
    3.5.4 Quantitative real-time PCR
    3.5.5 Calculating probability of mapping error
    3.5.6 Incorporating RepeatMasker and Variant Effect Predictor data using editTools
  3.6 Authors' Contributions
  3.7 Acknowledgements
CHAPTER 4 ESTIMATING THE COHERITABILITY BETWEEN SITE-SPECIFIC RNA EDITING AND ECONOMICALLY IMPORTANT TRAITS IN PIGS
  4.1 Abstract
  4.2 Introduction
  4.3 Results
    4.3.1 Heritable RNA editing activity impacts pig longissimus dorsi muscle gene expression
    4.3.2 Genetic variants near ADAR are suspected to contribute to editing level variation across sites
    4.3.3 Suggestive evidence for a shared genetic architecture between RNA editing activity and complex traits
  4.4 Discussion
  4.5 Materials and Methods
    4.5.1 Sequencing data
    4.5.2 Genotyping data
    4.5.3 Phenotypes
    4.5.4 Sequencing data preparation and RNA editing detection
    4.5.5 Editing level estimation
    4.5.6 Univariate variance component estimation and GWA
    4.5.7 Bivariate analysis to estimate genomic covariances
CHAPTER 5 CONCLUSION
  5.1 Gene-by-sex interactions and ideas for future analyses
  5.2 Present limitations to modeling RNA editing activity and ideas for future functional genetics studies
  5.3 Overall conclusions
APPENDICES
  APPENDIX A CHAPTER 2 SUPPLEMENTARY MATERIAL
  APPENDIX B CHAPTER 4 SUPPLEMENTARY MATERIAL
BIBLIOGRAPHY
LIST OF TABLES

Table 2.1: G×S interactions inferred through sex-specific window variances
Table 3.1: A-to-G mismatches resulting in amino acid changes
Table 4.1: RNA editing sites exhibiting heritable variability in longissimus dorsi muscle tissue
Table 4.2: Proportion of editing level genomic variance explained by SNPs flanking ADAR and OXCT1
Table 4.3: Top genomic covariance estimates between site-specific RNA editing levels and growth, meat quality, and carcass composition traits
Table 4.4: Top local genomic covariance estimates attributable to ADAR-flanking SNPs between site-specific RNA editing and growth, meat quality, and carcass composition traits
Table A.1: Sex-specific phenotype statistics
Table A.2: Inferred G×S interactions using sex-specific window variances. Listed are all windows with $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$
Table B.1: Heritability estimates for all RNA editing sites with sample size
Table B.2: ADAR-localized genomic variance estimates for 67 carcass composition, meat quality, and growth traits
Table B.3: Genomic covariance estimates between site-specific RNA editing levels and 67 carcass composition, meat quality, and growth traits

LIST OF FIGURES

Figure 2.1: Strategy for implementing local Bayesian regressions genome-wide
Figure 2.2: Estimated power and false-discovery rate for discovering observed SNPs with G×S interactions
Figure 2.3: Power vs false-discovery rate for discovering genomic regions containing masked G×S interactions
Figure 2.4: Comparing sex-specific genetic effects
Figure 2.5: Evidence that LBR-identified G×S interactions are enriched in tissue-specific eQTL
Figure 3.1: DNA to RNA mismatch counts
Figure 3.2: Shared A-to-G mismatches between tissues
Figure 3.3: Relative ADAR transcript abundance between tissues
Figure 3.4: A-to-G mismatch locations relative to the nearest annotated genes
Figure 3.5: Distribution of repetitive A-to-G mismatches
Figure 4.1: GWA for site-specific editing levels
Figure 4.2: Quantile-quantile plot testing for genome-wide genomic covariances between site-specific editing levels and complex traits
Figure A.1: LD statistics across distances
Figure A.2: Estimated power and false-discovery rate for discovering observed SNPs with effects in at least one sex
Figure A.3: Power vs false-discovery rate for discovering genomic regions containing masked causal variants
Figure A.4: Comparison between SMR and LBR for discovering G×S interactions
Figure A.5: eQTL enrichment as a function of the number of SNPs selected
Figure B.1: Pairwise LD plot between SNPs flanking ADAR

CHAPTER 1

INTRODUCTION

1.1 Genetic factors that influence variation in complex traits

For more than a decade, genome-wide association studies (GWAS) have identified single-nucleotide polymorphisms (SNPs) that associate with complex traits and diseases [1, 2]. Using human height as a model complex trait, it became clear that the proportion of phenotypic variance explained by GWAS-identified loci was much lower than the narrow-sense heritability [3, 4], where narrow-sense heritability (or simply, heritability) is the proportion of phenotypic variance explained by the additive effects at all quantitative trait loci (QTL). This "missing heritability" problem was partially solved; by estimating the proportion of variance explained by all SNPs genome-wide (estimating so-called "SNP" or "genomic" heritability), it became evident that many loci possess exceedingly small additive effects that go undetected by GWAS mainly due to the burden of multiple test correction [5]. It is generally accepted that any "still missing" heritability can be attributed to imperfect linkage-disequilibrium (LD) between observed SNPs and underlying QTL [6].
Although the nature of missing heritability is largely understood (in the sense that we understand the large disparity between GWAS-identified loci and trait heritability), work is ongoing to map the locations of both large- and small-effect QTL for a wide variety of complex, polygenic traits [7, 8, 9, 10]. This has been made possible, in part, by large biomedical datasets that contain genotypic and phenotypic records for hundreds of thousands of individuals (examples include the UK Biobank, China Kadoorie Biobank, FinnGen, 23andMe®, etc.). When using ~700,000 individuals of European ancestry, GWAS-identified loci at a stringent $p$-value threshold of $1 \times 10^{-8}$ are able to explain a much larger proportion of variance in height and BMI than before [11]. This suggests locations of small additive effects can be revealed with increasing sample size.

Still, many additional genetic factors contribute to the variance of complex traits (by contributing to broad-sense heritability). These factors cannot be discovered simply by increasing sample size but rely more on appropriate modeling and experimental design. They include a multitude of complex interactions such as dominance effects (interactions between alleles at the same locus), epistasis (interactions between alleles at different loci), and various gene-by-environment (G×E) interactions. Conceptually, one may interpret G×E to be a locus with different effects depending on the environment it is placed in. Common approaches to GWAS or genomic heritability estimation assume that additive effects at SNPs are homogeneous (constant for every member in the population). However, members within any population may undergo different environmental exposures (either endogenous or exogenous), creating the possibility of heterogeneous genetic effects, should G×E exist.
Gene-by-sex (G×S) interactions are a form of G×E; sex directly influences both the endogenous (for instance, sex hormones influence transcriptional mechanisms) and exogenous (for instance, contraceptive use) environment. The existence of G×S interactions is one of several theories used to explain sex differences for numerous complex traits (for quantitative traits such as height, these include differences in both mean and variance). Evidence for G×S interactions largely stems from between-sex genetic correlation estimates [12, 13, 14]. When evidently less than one, these estimates indicate that genetic effects are disproportional between sexes [15]. Even with relatively large sample sizes (>100,000), mapping G×S interactions remains challenging [16, 17], suggesting that for many traits the magnitude of any G×S interaction is small and is likely escaping GWAS detection due to the burden of multiple test correction. Just as numerous small, homogeneous, additive effects accumulate to influence narrow-sense heritability, numerous small G×S interactions may influence broad-sense heritability by inducing mean and variance differences between sexes.

In chapter 2, individuals of European ancestry from the UK Biobank ($N \sim 250{,}000$) are used to map G×S interactions. In this chapter we discuss the difficulty of mapping G×S interactions using traditional GWAS methods and develop an alternative strategy for mapping such events. We find evidence of small-magnitude G×S interactions impacting such traits as height, BMI, and bone-mineral density.

1.2 Limitations to common GWAS techniques and alternative solutions

Despite sample sizes continuing to grow in recent years, the methods used for GWAS have remained largely the same since the first reported GWAS in 2006 [18]. These methods, commonly referred to as "single marker analysis" or "single marker regression" (SMR), test for an association between a complex trait and a SNP, doing so one SNP at a time. To control the family-wise error rate (probability of making at least one type I error), a $p$-value threshold of $5 \times 10^{-8}$ is routinely adopted.
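As a rough illustration of the single-marker approach described above, the following Python sketch regresses a phenotype on one SNP at a time and applies the genome-wide threshold. The function names, the toy data generator, and the normal approximation to the test statistic are assumptions made for illustration; they are not taken from any particular GWAS software or from the analyses in this dissertation.

```python
import math
import random


def single_marker_regression(genotypes, phenotype):
    """Fit y = a + b*x separately for each SNP and return (slope, p_value) pairs.

    genotypes: list of SNP columns, each a list of allele counts (0, 1, or 2).
    Assumes every SNP is polymorphic and n is large enough for a normal
    approximation to the t statistic (an illustrative simplification).
    """
    n = len(phenotype)
    y_mean = sum(phenotype) / n
    results = []
    for snp in genotypes:
        x_mean = sum(snp) / n
        sxx = sum((x - x_mean) ** 2 for x in snp)
        sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(snp, phenotype))
        slope = sxy / sxx
        intercept = y_mean - slope * x_mean
        rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(snp, phenotype))
        std_err = math.sqrt(rss / (n - 2) / sxx)
        z = slope / std_err
        p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
        results.append((slope, p_value))
    return results


def significant_snps(results, threshold=5e-8):
    """Indices of SNPs whose p-value falls below the genome-wide threshold."""
    return [j for j, (_, p) in enumerate(results) if p < threshold]


def simulate_two_snps(n=2000, effect=0.5, seed=0):
    """Hypothetical toy data: SNP 0 affects the trait, SNP 1 does not."""
    rng = random.Random(seed)
    causal = [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n)]
    null = [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n)]
    phenotype = [effect * x + rng.gauss(0, 1) for x in causal]
    return [causal, null], phenotype
```

Calling `single_marker_regression(*simulate_two_snps())` and passing the result to `significant_snps` flags only markers that clear the threshold. Because each SNP is tested in isolation, any marker in LD with a causal locus can also clear it, which is the resolution problem discussed next.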
Although this straightforward approach is useful, it has a few drawbacks. As mentioned previously, the burden of multiple testing can severely hinder statistical power. Additionally, patterns of LD are not fully accounted for; as sample size continues to grow, mapping resolution will worsen for moderate- to large-effect QTL. This is because many SNPs can be associated with one or more QTL due to LD, creating spurious signals that amplify as sample size increases.

Numerous methods have been developed that treat GWAS like a variable selection problem, one in which multiple covariates (SNPs, in this case) are considered simultaneously and only the most relevant ones are selected as non-null, useful predictors. Such methods have shown improved mapping resolution when compared to SMR [19, 20, 21, 22]. Under this variable selection paradigm, one can estimate the additive effects of multiple SNPs simultaneously using Bayesian multiple regression mixture models [23, 24]. Bayesian mixture models have several nice properties: i) the proportion of SNPs with non-null genetic effects (the polygenicity) can be treated as random and inferred from the data and ii) inferring whether each SNP has a non-null effect can be done formally by estimating the corresponding posterior probability (the probability of a non-null SNP effect, given the data). The first point is crucial for achieving appropriate error control [25]. One important criticism of Bayesian mixture models for GWAS (that applies to multiple regression models in general) is that in regions of high LD, the association of any given SNP with a trait may be exceedingly small when conditioning on all other SNPs in the region [26]. Fernando et al. demonstrated that in instances when individual SNP-trait associations are small due to LD, the aggregate effect of multiple SNPs in a window can associate more strongly with traits and be used to locate QTL.

In chapter 2, we adapt Bayesian mixture models to infer sex-specific genetic effects and G×S interactions. Using simulations, we show that aggregating sex-specific SNP effects within small LD-based windows can enhance the power and precision to detect G×S interactions relative to traditional SMR techniques.
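The window-level aggregation described above can be sketched in a few lines of Python. This is a minimal illustration that assumes posterior draws of SNP effects are already available as plain lists and that draws from the null component are stored as exact zeros; the function names are hypothetical and the estimator shown is not necessarily the one used in the cited work.

```python
import statistics


def window_variance(window_genotypes, effect_sample):
    """Sample variance of the genetic values X_w * b for one posterior draw.

    window_genotypes: one row per individual, one allele count per SNP in the window.
    effect_sample: one effect per SNP in the window (a single posterior draw).
    """
    genetic_values = [
        sum(x * b for x, b in zip(row, effect_sample)) for row in window_genotypes
    ]
    return statistics.variance(genetic_values)


def posterior_prob_nonzero_window(window_genotypes, effect_samples):
    """Share of posterior draws in which the window explains nonzero variance."""
    variances = [window_variance(window_genotypes, s) for s in effect_samples]
    return sum(v > 0 for v in variances) / len(variances)
```

Even when every individual SNP in a high-LD window has a low posterior probability of a non-null effect, the window as a whole can still have a high posterior probability of explaining variance, which is the motivation for aggregating.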
1.3 Functional implications of noncoding variants on gene expression

In 2009, Visscher et al. [27] popularized the term "causal variant" to mean the genetic variant that causes an observed GWAS association signal. This term is somewhat unfortunate in that GWAS is limited to finding loci with allele content that associates with one or more phenotypes in a population, regardless of whether a causal relationship (such as a biological mechanism) can explain the association. One major observation from GWAS, however, is that common variants that explain some proportion of complex trait variation are typically within non-coding portions of genomes and enriched in expression QTL [28] (eQTL; QTL that explain variation in transcript abundance for one or more genes). This suggests variation in heritable complex traits is at least partially driven by variation in transcript abundance. More recently, it has been shown that for some complex traits, GWAS hits are equally enriched in splicing QTL (QTL that explain variation in alternatively-spliced isoforms), many of which did not influence transcript abundance [29].

In chapter 2, we show that G×S interactions identified using Bayesian mixture models generally show greater eQTL enrichment than G×S interactions identified from single marker regression. This may indicate our approach for finding G×S interactions is working well toward identifying functional regions that may contribute to sex differences and phenotypic variance. In chapter 3, we characterize a relatively understudied form of gene expression known as RNA editing using a pig model, and in chapter 4 we evaluate the potential that heritable RNA editing variation contributes to complex trait variation in pigs.
1.4 RNA editing and its role in gene expression

RNA editing comprises a wide set of modifications to RNA transcripts including deletion, insertion, and substitution of ribonucleotides [30, 31]. In mammals, RNA editing predominantly involves an adenosine to inosine transition within double-stranded pre-mRNA transcripts, catalyzed by adenosine deaminase acting on RNA (ADAR) [32]. At many edited genes, ADAR catalyzes this reaction without perfect efficacy, resulting in only a proportion of transcripts (named the editing level) containing the inosine variant. This means that, like alternative splicing, RNA editing enables variation in transcript content from individual to individual, without necessarily affecting transcript abundance.

RNA editing by ADAR has been shown to be essential for the function of some genes and essential for life. For instance, the GluR-B receptor is highly edited at key positions (nearly all transcripts contain the inosine variant) and reduction in GluR-B editing in mice leads to Ca2+ permeability in neural cells and death from seizures [33]. Other editing events are shown to influence complex traits; editing of serotonin receptor 2C is known to influence energy dissipation and fat mass [34]. Perhaps most RNA editing sites show highly variable editing levels in a population [35]. Like SNPs, editing levels that show little variation in the population likely have large effects or essential functions (such as GluR-B editing), while those that exhibit large variation are expected to possess smaller effects but may still explain some proportion of complex trait variation.

Most RNA editing events in human transcriptomes are within Alu elements [32], a type of short interspersed nuclear element (SINE) unique to primates. In chapter 3, we find evidence that RNA editing activity among pigs largely occurs within PRE-1 elements, a type of SINE element unique to pigs, hogs, and peccaries. This finding has been replicated among several subsequent pig RNA editing studies [36, 37]. In chapter 4, we further evaluate the possibility that highly variable and heritable editing levels may explain variation in complex traits.
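To make the notion of an editing level concrete, here is a minimal Python sketch. It assumes editing is measured from RNA sequencing reads, where inosine is read as guanosine so that an edited transcript appears as an A-to-G mismatch; the function name and the read counts in the usage note are hypothetical.

```python
def editing_level(a_reads, g_reads):
    """Proportion of reads at a site that carry the edited base.

    a_reads: reads supporting the unedited adenosine.
    g_reads: reads supporting guanosine, taken here as a proxy for inosine.
    Returns None when the site has no coverage.
    """
    total = a_reads + g_reads
    if total == 0:
        return None
    return g_reads / total
```

For example, a site covered by 30 unedited and 10 edited reads has an editing level of 0.25, whereas a site such as those in GluR-B, where nearly all transcripts are edited, would sit close to 1.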
CHAPTER 2

DECIPHERING SEX-SPECIFIC GENETIC ARCHITECTURES USING LOCAL BAYESIAN REGRESSIONS

This chapter is available on bioRxiv (doi:10.1101/653386). It was prepared alongside co-authors Ana I Vazquez, Juan P Steibel, Catherine W Ernst, and Gustavo de los Campos.

2.1 Abstract

Many complex human traits exhibit differences between sexes. While numerous factors likely contribute to this phenomenon, growing evidence from genome-wide studies suggests a partial explanation: that males and females from the same population possess differing genetic architectures. Despite this, mapping gene-by-sex (G×S) interactions remains a challenge, likely because the magnitude of such an interaction is typically exceedingly small; traditional genome-wide association techniques may be underpowered to detect such events partly due to the burden of multiple test correction. Here, we developed a local Bayesian regression (LBR) method to estimate sex-specific SNP marker effects after fully accounting for local linkage-disequilibrium (LD) patterns. This enabled us to infer sex-specific effects and G×S interactions either at the single SNP level, or by aggregating the effects of multiple SNPs to make inferences at the level of small LD-based regions.
Using simulations in which there was imperfect LD between SNPs and causal variants, we showed that aggregating sex-specific marker effects with LBR provides improved power and resolution to detect G×S interactions over traditional single-SNP-based tests. When using LBR to analyze traits from the UK Biobank, we detected a relatively large G×S interaction impacting bone-mineral density within ABO and replicated many previously detected large-magnitude G×S interactions impacting waist-to-hip ratio. We also discovered many new G×S interactions impacting such traits as height and BMI within regions of the genome where both male- and female-specific effects explain a small proportion of phenotypic variance ($R^2 < 1 \times 10^{-4}$), but are enriched in known expression quantitative trait loci. By combining biobank-level data and techniques to estimate sex-specific SNP effects after accounting for local-LD patterns, we are providing evidence that numerous small-magnitude G×S interactions exist to influence sex differences in a variety of complex traits.

2.2 Author Summary

Many complex human traits are known to be influenced by an impressive number of causal variants each with very small effects, posing great challenges for genome-wide association studies (GWAS). To add to this challenge, many causal variants may possess context-dependent effects such as effects that are dependent on biological sex. While GWAS are commonly performed using specific methods in which one single nucleotide polymorphism (SNP) at a time is tested for association with a trait, we instead utilize methods more commonly observed in the genomic prediction literature. Such methods are advantageous in that they are not burdened by multiple test correction in the same way as traditional GWAS techniques are, and can fully account for linkage-disequilibrium patterns to accurately estimate the true effects of SNP markers. Here we adapt such methods to estimate genetic effects within sexes and provide a powerful means to compare sex-specific genetic effects.
2.3 Introduction

Sex differences are widespread in nature, observed readily among many human traits and diseases. For quantitative traits, sex may affect the distribution of phenotypes at various levels, including mean differences between genetic males and genetic females (hereafter referred to as males and females, respectively) as well as differences in variance. Sex differences are likely due to a myriad of factors including differential environmental exposures, unequal gene dosages for sex-linked genes, as well as sex-heterogeneity in the architecture of genetic effects at one or more autosomal loci (i.e., gene-by-sex (G×S) interactions). In this way, sex is considered an environmental variable, providing two well-defined conditions in which allele frequencies and linkage disequilibrium (LD) patterns are equivalent but nevertheless genetic effects of one or many autosomal loci may differ.

Evidence for different genetic architectures between sexes among human populations is largely supported by genome-wide parameters [38, 12, 13, 14] including unequal within-sex heritabilities ($h^2_{male} \neq h^2_{female}$) and between-sex genetic correlations less than one ($r_g < 1$); the former suggests that the proportion of phenotypic variance explained by genetic factors varies between sexes, while the latter suggests genetic effects are disproportional between sexes [15]. Although many traits seem to have a between-sex genetic correlation that is evidently less than one, genome-wide association (GWA) studies intended to map G×S interactions have struggled to pinpoint such loci [17, 39]. Based on this dichotomy, G×S interactions presumably exist for many traits, but the magnitude of a typical G×S interaction is suspected to be exceedingly small, explaining why such events commonly elude detection, particularly after multiple test correction. However, just as numerous small-effect causal loci accumulate to affect phenotypic variance, small G×S interactions may accumulate to influence both sex differences and phenotypic variance.
Most GWA studies utilize single-marker regression (SMR), in which the phenotype is regressed upon allele content one SNP at a time, thereby obtaining marginal SNP effect size estimates that do not fully account for LD patterns. In contrast, whole-genome regression methods, in which the phenotype is regressed upon all SNPs across the genome concurrently, fully account for multi-locus LD. These methods are increasingly being used as a one-stop solution to estimate true (conditional) effect sizes of SNP markers and to provide genome-wide estimates including genomic heritability [23, 6, 5] and between-sex genetic correlations [12, 13, 14]. By estimating true SNP effect sizes, the goal across many studies is to select SNPs with non-zero effects and to build a model for predicting polygenic scores [40, 41, 42]. Other works have directly illustrated the use of whole-genome regression methods for GWAS [26, 19, 43, 44]. Whole-genome regressions are computationally challenging to use with biobank-level data; however, recent work suggests relatively accurate genomic prediction and SNP effect estimation can be achieved by simply accounting for local LD patterns (as opposed to global LD patterns) [45].

Building on the idea of utilizing true SNP marker effects, here we developed local Bayesian regressions (LBR) in which the phenotype is regressed upon multiple SNPs spanning multiple LD blocks (thereby accounting for local LD patterns) to study sex differences in complex traits from the UK Biobank. The LBR model uses random-effect SNP-by-sex interactions [46, 47] that decompose conditional SNP effects into three components: i) one shared across sexes, ii) a male-specific deviation from the shared component, and iii) a female-specific deviation from the shared component. Using samples from the posterior distribution of conditional SNP effects, we developed methods to infer sex-specific effects and G×S interactions at the single SNP level and by aggregating SNP effects within small LD-based regions, offering multiple perspectives to study sex-specific genetic architectures.
In this study, we have utilized genotypes for 607,497 autosomal SNPs from ~259,000 distantly related Caucasians from the UK Biobank for assessing LBR's performance in analyzing simulated and real complex traits including height, BMI, waist-to-hip ratio (WHR), and heel bone-mineral density (BMD). Simulations showed that (i) for inferences of G×S interactions, LBR offers higher power with lower false-discovery rate (FDR) than methods based on marginal effects (i.e., single-marker regression) and (ii) under imperfect LD between SNPs and causal variants (i.e., when causal variants are not genotyped), aggregating SNP effects within small LD-based regions offers higher power than methods based on testing individual SNPs.

The traits analyzed in this study span a range of genome-wide metrics and G×S suggestibility; from height and BMI, for which previous studies indicate males and females possess very similar genetic architectures [13], to WHR, a trait with well-documented G×S interactions [48, 16, 49, 50], and BMD, for which G×S interactions are thought to exist but have eluded detection [51]. LBR provided evidence of G×S interactions impacting height, BMI, and BMD at regions of the genome where sex-specific genetic effects are relatively small; however, such regions are enriched in known eQTL. For WHR, LBR replicated many large-magnitude G×S interactions previously discovered using single-marker regression, but also located novel G×S interactions near such genes as the estrogen receptor ESR1.
2.4 Results

2.4.1 Overview of the LBR model, inference methods, and implementation

To study sex differences we regressed male and female phenotypes ($y_m$ and $y_f$) on male and female genotypes ($X_m$ and $X_f$) using a SNP-by-sex interaction model of the form

$$
\begin{bmatrix} y_m \\ y_f \end{bmatrix} =
\begin{bmatrix} 1\mu_m \\ 1\mu_f \end{bmatrix} +
\begin{bmatrix} X_m \\ X_f \end{bmatrix} b_0 +
\begin{bmatrix} X_m \\ 0 \end{bmatrix} b_m +
\begin{bmatrix} 0 \\ X_f \end{bmatrix} b_f +
\begin{bmatrix} \varepsilon_m \\ \varepsilon_f \end{bmatrix}.
\tag{2.1}
$$

Above, $\mu_m$ and $\mu_f$ are male and female intercepts, $b_0 = \{b_{0j}\}$ ($j = 1, \ldots, p$) is a vector of main effects, $b_m = \{b_{mj}\}$ and $b_f = \{b_{fj}\}$ are male and female interactions, respectively, and $\varepsilon_m = \{\varepsilon_{m_i}\}$ and $\varepsilon_f = \{\varepsilon_{f_i}\}$ are male and female errors, which were assumed to follow normal distributions with zero mean and sex-specific variances. Female-specific and male-specific SNP effects are defined as $\beta_{fj} = b_{0j} + b_{fj}$ and $\beta_{mj} = b_{0j} + b_{mj}$, respectively.

2.4.1.1 Prior assumptions

For SNP effects we adopted priors from the spike-slab family with a point of mass at zero and a Gaussian slab [24]; specifically, $p(b_{kj}) = \pi_k N(0, \sigma^2_{b_k}) + (1 - \pi_k)\, 1(b_{kj} = 0)$ (where $k$ = 0, $f$, or $m$). Here, $\pi_k$ and $\sigma^2_{b_k}$ are hyper-parameters representing the proportion of nonzero effects and the variance of the slab; these hyper-parameters were treated as unknown and given their own hyperpriors (see Methods).
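The block structure of equation 2.1 can be illustrated with a short Python sketch that builds one design row per individual and checks that the interaction parameterization reproduces the sex-specific effects. The function names and the tiny example values in the usage note are assumptions for illustration only; this is not the software used in the chapter.

```python
def interaction_design(genotypes, is_male):
    """Build rows [x | x if male else 0 | x if female else 0].

    The three column blocks multiply b_0, b_m, and b_f respectively,
    mirroring the stacked matrices [X_m; X_f], [X_m; 0], and [0; X_f].
    """
    rows = []
    for x, male in zip(genotypes, is_male):
        zeros = [0.0] * len(x)
        male_block = list(x) if male else zeros
        female_block = zeros if male else list(x)
        rows.append(list(x) + male_block + female_block)
    return rows


def sex_specific_effects(b0, bm, bf):
    """Return (beta_m, beta_f), where beta_m = b0 + bm and beta_f = b0 + bf."""
    beta_m = [shared + dev for shared, dev in zip(b0, bm)]
    beta_f = [shared + dev for shared, dev in zip(b0, bf)]
    return beta_m, beta_f


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))
```

For a male with genotypes `x`, `dot(row, b0 + bm + bf)` (list concatenation) equals `dot(x, beta_m)`, and likewise for females with `beta_f`. A SNP with `bm[j] == bf[j] == 0` therefore has the same effect in both sexes, and a G×S interaction corresponds to `beta_m[j] != beta_f[j]`.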
2.4.1.2 Local-regression

Implementing the above model with whole-genome SNPs ($p \approx 600$K) and very large sample size ($n \approx 300$K) is computationally extremely challenging. However, LD in homogeneous, unstructured human populations spans relatively short regions ($R^2$ between allele dosages typically vanishes within 1-2 Mb; Fig A.1). Therefore, we applied LBR to long, overlapping chromosome segments (Fig 2.1). Specifically, we divided the genome into segments containing 1,500 contiguous SNPs (roughly 8 Mb, on average), then applied the regression in equation 2.1 to SNPs in the core segment plus 250 SNPs (i.e., roughly 1 Mb) in each flanking region, which were added to account for LD between SNPs at the edge of each core segment with SNPs in neighboring segments.

Figure 2.1: Strategy for implementing local Bayesian regressions genome-wide. The phenotype is regressed upon multiple sequential SNPs using a sliding window approach. The core region contained 1500 SNPs (roughly 8 Mb, on average) and each buffer region contained 250 SNPs (roughly 1 Mb, on average). Core parameters (posterior samples) are stitched together, then sex-specific effects and G×S interactions are inferred at the level of SNP $j$ and window $j^*$.

2.4.1.3 Inferences

We used the BGLR [52] software to draw samples from the posterior distribution of the model parameters and used these samples to make inference about individual SNP effects, including (i) the posterior probability that the $j$th SNP has a nonzero effect in males ($\mathrm{PPM}_{\mathrm{SNP}_j}$) and females ($\mathrm{PPF}_{\mathrm{SNP}_j}$) and (ii) the posterior probability that the female and male effects are different ($\mathrm{PPDiff}_{\mathrm{SNP}_j}$).
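The core-plus-buffer segmenting strategy of section 2.4.1.2 can be sketched as follows. The function name and the treatment of chromosome ends (buffers are simply truncated) are assumptions made for illustration; the sketch operates on SNP indices within a single chromosome and does not reproduce the exact segment boundaries used in this study.

```python
def core_buffer_segments(n_snps, core=1500, flank=250):
    """Split SNP indices 0..n_snps-1 into core segments with flanking buffers.

    Returns a list of (fit_start, core_start, core_end, fit_end) tuples with
    end-exclusive indices. The regression is fit to SNPs in [fit_start, fit_end),
    but only posterior samples for SNPs in [core_start, core_end) are retained.
    """
    segments = []
    for core_start in range(0, n_snps, core):
        core_end = min(core_start + core, n_snps)
        fit_start = max(core_start - flank, 0)      # assumption: truncate at chromosome start
        fit_end = min(core_end + flank, n_snps)     # assumption: truncate at chromosome end
        segments.append((fit_start, core_start, core_end, fit_end))
    return segments

# Example with a hypothetical chromosome of 4,000 SNPs.
for fit_start, core_start, core_end, fit_end in core_buffer_segments(4000):
    print(f"fit SNPs [{fit_start}, {fit_end}), keep core [{core_start}, {core_end})")
```

Because the cores tile the chromosome without overlap, stitching the retained core samples together yields exactly one set of posterior samples per SNP, while the buffers let edge SNPs be estimated conditional on their LD neighbors.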
In regions involving multiple SNPs in strong LD, inferences at the individual-SNP level may be questionable. Therefore, we built upon previous work by Fernando et al. [26], which enabled us to aggregate multiple sex-specific SNP effects within relatively small regions using window variances. For each SNP $j$ we defined a window $j^*$ around the SNP based on local LD patterns (see Methods). We then defined the male-specific and female-specific window variances as $\sigma^2_{g_{mj^*}} = \mathrm{var}(X_{j^*}\beta_{mj^*})$ and $\sigma^2_{g_{fj^*}} = \mathrm{var}(X_{j^*}\beta_{fj^*})$, respectively. Here, $X_{j^*}$ represents genotypes at SNPs within the $j^*$ window and $\mathrm{var}(\cdot)$ is the sample variance operator. Prior to model fitting, the phenotype is scaled across sexes; thus, sex-specific window variances may be interpreted as the proportion of total phenotypic variance explained by sex-specific SNP effects. From samples of sex-specific window variances, we computed the posterior probability of (i) nonzero male-specific window variance ($\mathrm{PPM}_{\sigma^2_{g_{j^*}}}$), (ii) nonzero female-specific window variance ($\mathrm{PPF}_{\sigma^2_{g_{j^*}}}$), and (iii) sex difference in window variances ($\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$).

2.4.2 LBR offers improved power with lower false-discovery rates

We used simulations to assess the power and false discovery rate (FDR) of LBR and to compare it with that of standard single-marker regression (SMR). Traits were simulated using SNP genotypes from the Axiom UK-Biobank (119,190 males and 139,738 females, all distantly related Caucasians). We simulated a highly complex trait with one causal variant (CV) per ≈2 Mb, which on average explained a proportion of the phenotypic variance equal to $3.3 \times 10^{-4}$. Our simulation used a total of 60,000 SNPs (consisting of 6,000 consecutive SNPs taken from 10 different chromosomes) and 150 CVs; on the complete human genome this corresponds to a trait with 1,500 CVs and a heritability of 0.5 (see Methods for further details). 40% of the CVs (a total of 60 SNPs in our simulation) had differing sex-specific effects and the remaining 60% (90 SNPs) had effects that were the same in males and females.

2.4.2.1 Power and FDR when causal variants are genotyped

First, we analyzed the simulated phenotypes using all SNPs (including the 150 causal ones).
Initially interested in inferring G×S interactions, we ranked SNPs based on LBR's $\mathrm{PPDiff}_{\mathrm{SNP}_j}$ metric and based on SMR's $p$-value for sex difference (pvalue-diff, see Methods) and used the two rankings to estimate power and FDR as a function of the number of SNPs selected (Fig 2.2). LBR showed consistently higher power (achieving a power of 80% when selecting the top-50 SNPs with highest $\mathrm{PPDiff}_{\mathrm{SNP}_j}$) and lower FDR than SMR. The false discovery rate of LBR was very low when selecting the top-50 SNPs with highest $\mathrm{PPDiff}_{\mathrm{SNP}_j}$ and exhibited a very sharp phase transition, with a fast increase in FDR thereafter.

Figure 2.2: Estimated power and false-discovery rate for discovering observed SNPs with G×S interactions. Shown as a function of the number of SNPs selected. Each point represents a sample average and error bars represent 95% confidence intervals, each derived using 30 Monte Carlo replicates. LBR (SNP): local Bayesian regression, utilizing $\mathrm{PPDiff}_{\mathrm{SNP}_j}$. SMR: single-marker regression, utilizing pvalue-diff.

We also compared the two methods based on arbitrary, albeit commonly used, mapping thresholds for SMR and LBR. At $\mathrm{PPDiff}_{\mathrm{SNP}_j} \geq 0.95$, LBR selected an average (across simulation replicates) of 38.33 SNPs with an estimated power of 0.634 and estimated FDR of 0.007. Conversely, at pvalue-diff $\leq 5 \times 10^{-8}$, SMR selected an average of 50.7 SNPs with an estimated power of 0.436 and estimated FDR of 0.451. Altogether, these results suggest that for G×S discovery, LBR offers higher power and lower FDR than SMR, the method most widely used in GWAs, at least when G×S interactions are observed.

When trying to map SNPs that had an effect in at least one sex, we used $\mathrm{PP}_{\mathrm{SNP}_j} = \max\left[\mathrm{PPM}_{\mathrm{SNP}_j}, \mathrm{PPF}_{\mathrm{SNP}_j}\right]$ and $p$-values from an F-test (see Methods) as metrics for the LBR and SMR methods, respectively. Again, LBR showed higher power with lower FDR than a standard SMR $p$-value (Fig A.2). At traditional mapping thresholds, LBR and SMR had similar power but LBR achieved that power with much lower FDR; at $\mathrm{PP}_{\mathrm{SNP}_j} \geq 0.95$, the average number of SNPs selected was 120.83 with an estimated power of 0.799 and estimated FDR of 0.009, while at $p$-value $\leq 5 \times 10^{-8}$, the number of SNPs selected was 374.56 with an estimated power of 0.794 and FDR of 0.66.
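The way power and FDR are estimated from a ranked list of SNPs when causal variants are genotyped can be sketched as below. The function name and the toy SNP labels are hypothetical; the sketch assumes SNPs are already sorted from strongest to weakest evidence (for example by decreasing $\mathrm{PPDiff}_{\mathrm{SNP}_j}$ or increasing pvalue-diff) and that the set of true G×S causal variants is known, as in the simulation.

```python
def power_and_fdr(ranked_snps, true_set, n_selected):
    """Estimate power and FDR when the top n_selected ranked SNPs are declared discoveries.

    power = true discoveries / number of true causal variants
    FDR   = false discoveries / number of selected SNPs
    """
    selected = ranked_snps[:n_selected]
    true_pos = sum(1 for snp in selected if snp in true_set)
    power = true_pos / len(true_set)
    fdr = (len(selected) - true_pos) / len(selected) if selected else 0.0
    return power, fdr

# Toy example with hypothetical SNP labels: three true G x S variants, four ranked SNPs.
ranked = ["snp_a", "snp_b", "snp_c", "snp_d"]
truth = {"snp_a", "snp_c", "snp_e"}
for n in range(1, len(ranked) + 1):
    power, fdr = power_and_fdr(ranked, truth, n)
    print(n, round(power, 3), round(fdr, 3))
```

Averaging these two quantities over Monte Carlo replicates at each value of `n_selected` gives curves of the kind summarized in Fig 2.2.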
2.4.2.2 Power and FDR under imperfect LD

In a second round of analyses, we removed all CVs from the panel of SNPs used in the analysis to represent a situation where CVs are not observed and genotyped SNPs tag CVs to varying degrees. As before, we initially assessed the relative performance of LBR to infer segments harboring G×S interactions. Power and FDR were assessed at several resolutions: 1 Mb, 500 Kb, and 250 Kb regions around each CV. At each resolution, a discovery was considered true if the finding lay within a segment harboring a G×S CV. Power and FDR were computed at different thresholds ($\mathrm{PPDiff}_{\mathrm{SNP}_j}$ and $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$ for LBR and pvalue-diff for SMR; Fig 2.3). When using a 1 Mb target area (such that correct G×S discoveries must be within 500 Kb on either side of a true G×S event), $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$ thresholds (LBR's window-based metric) provided the highest power within an FDR range of 0-0.3; thereafter SMR provided slightly higher power. As expected, when removing CVs, power was estimated to be much lower than when CVs were observed; at $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.95$, the estimated power and FDR were 0.454 and 0.004, respectively, while at pvalue-diff $\leq 5 \times 10^{-8}$, estimated power and FDR were 0.22 and 0.006. As seen in Fig 2.3, when considering a finer resolution (500 Kb and 250 Kb) the performance of both LBR-based approaches was more robust than that of SMR. Altogether this indicates that for the discovery and mapping of unobserved G×S interactions, LBR's window-based metric provides higher power with equivalent FDR and finer resolution than single-marker regression methods.

To infer segments containing CVs that affect at least one sex, we again used LBR to decide whether either sex-specific effect was nonzero at the level of individual SNPs or windows. Using a 1 Mb target area, LBR's window-based metrics provided the highest power within an FDR range of 0-0.025. When decreasing the target area, LBR provided the highest power over larger FDR ranges (Fig A.3).
Figure 2.3: Power vs false-discovery rate for discovering genomic regions containing masked G×S interactions. Here power is defined as the expected proportion of G×S interactions that are being tagged by at least one selected SNP $j$ or window $j^*$. False discovery rate is defined as the expected proportion of selected SNPs or windows that are not tagging any G×S interactions. Each point is an estimate and error bars for both axes represent 95% confidence intervals. Point estimates and intervals were derived using 30 Monte Carlo replicates. Each facet corresponds to a different target area: a fixed width around each G×S interaction that defines the set of SNPs effectively tagging it. LBR (SNP): uses the $\mathrm{PPDiff}_{\mathrm{SNP}_j}$ metric spanning 1-0. LBR (Window): uses the $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$ metric spanning 1-0. SMR: uses the pvalue-diff metric spanning (on the $-\log_{10}$ scale) 8-0.

2.4.3 For real human traits, many newly discovered G×S interactions show relatively small sex-specific effects

We analyzed four complex human traits (height, BMI, BMD, and WHR) measured among ≈259,000 distantly related Caucasians from the UK Biobank (≈119,000 males and ≈140,000 females). For each trait, we fit the LBR model (equation 2.1) across the entire autosome consisting of 607,497 genotyped SNPs using 417 overlapping segments (Fig 2.1) and obtained evidence of G×S interactions at the level of SNP $j$ and window $j^*$.
To compare both the magnitude and sign of sex-specific SNP effects, we plotted each $\hat{\beta}_{fj}$ against $\hat{\beta}_{mj}$ (Fig 2.4A). The trait was scaled across sexes prior to model fitting; thus, male- and female-specific effects were not constrained to the same scale. In this way, one might expect male-specific SNP effects to uniformly differ from female-specific SNP effects by a multiplicative factor if the variance of the phenotype is different between sexes (sample statistics within each sex are provided in Table A.1). Surprisingly, we did not observe evidence of sex-specific SNP effects uniformly differing due to differences in phenotypic scale; for height, BMD, and BMI, as seen in Fig 2.4A, most large-effect SNPs lie near the blue diagonal line. For WHR, we observed results largely consistent with prior studies [48, 16, 49]: namely, the prevalence of numerous SNPs with relatively large effects in females but little to no effect in males. No traits exhibited evidence of any SNPs with (i) high-confidence male- and female-specific effects (no SNPs with $\mathrm{PPM}_{\mathrm{SNP}_j} \geq 0.9$ and $\mathrm{PPF}_{\mathrm{SNP}_j} \geq 0.9$) and (ii) differing signs between sexes.

We then aggregated sex-specific SNP effects within small LD-based regions to estimate sex-specific window variances $\sigma^2_{g_{mj^*}}$ and $\sigma^2_{g_{fj^*}}$ and compared the magnitude of each (Fig 2.4B). Interestingly, for traits such as height, many large-effect regions bear slightly larger window variances for males than for females. This was not observed at the single-SNP level, suggesting that many regions bearing numerous small-effect SNPs produce aggregate effects that are potentially larger (although not reaching a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$ threshold) in males than in females. One example is the GDF5 locus, previously known to strongly associate with adult height [53], where a peak $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$ signal centered on rs143384 had slightly different estimated sex-specific window variances ($\hat{\sigma}^2_{g_{mj^*}} = 3.0 \times 10^{-3}$ and $\hat{\sigma}^2_{g_{fj^*}} = 2.6 \times 10^{-3}$) but weak evidence of a G×S interaction ($\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} = 0.544$). For BMD, several large-effect regions show suggestive evidence of G×S interactions, including the AKAP11 locus and the CCDC170 locus ($\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} = 0.856$ and 0.745, respectively), both previously associated with bone mineral density [54, 55, 56, 57].

Figure 2.4: Comparing sex-specific genetic effects. (A) Plot of estimated female SNP effects against estimated male SNP effects for all 607,497 genotyped autosomal SNPs. Points are colored by their posterior probability of sex difference at the level of individual SNPs. (B) Plot of estimated female window variances against estimated male window variances for all 607,497 LD-based windows, with each window $j^*$ centered on a different focal SNP $j$. Points are colored by their posterior probability of sex difference at the level of window variances. (C) Miami-like plot depicting location and magnitude of G×S interactions identified through sex-specific window variances. For each trait, showing estimated male window variance above the x-axis and estimated female window variance below the x-axis. Vertical lines denote changing chromosomes. A sample of windows is labeled with nearest gene annotation, obtained from Axiom UKB WCSG annotations, release 34. Gray labels indicate nearest genes with relatively large window variances evidently shared across sexes, while red labels indicate nearest genes with detected G×S interactions.

To make G×S inferences at the level of window variances irrespective of the magnitude of sex-specific effects, we adopted a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$ threshold of 0.9, which in simulations (Fig 2.3) provided optimal power at an estimated FDR of 0.029 when using a 1 Mb target area. For height, a total of eight distinct regions possessed a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$, two of which possessed a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.95$. For BMI, 5 distinct regions possessed a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$, with none reaching a more stringent $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.95$ threshold and none overlapping with two previously suggested BMI G×S SNPs [58]. As seen in Fig 2.4C, inferred G×S interactions for height and BMI possess relatively small sex-specific window variances; as an example, for height, the window centered on SNP rs1535515 (near LRRC8C) had a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} = 0.96$, while $\hat{\sigma}^2_{g_{mj^*}} = 2.1 \times 10^{-5}$ and $\hat{\sigma}^2_{g_{fj^*}} = 1.1 \times 10^{-4}$. For BMD, seven regions reached a 0.9 $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$ threshold, while one higher-confidence G×S interaction ($\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.95$) was detected within ABO, the gene controlling blood type.
For WHR, roughly 45 distinct genomic regions possessed a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$, while 34 of these possessed a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.95$. We found many previously detected G×S interactions known to associate with WHR or a related trait, WHR adjusted for BMI (WHRadjBMI) [48, 16, 49, 50]. These included interactions at LYPLAL1, MAP3K1, COBLL1, RSPO3, and VEGFA, among others. We also detected numerous novel G×S interactions (Table 2.1) near physiologically intriguing genes such as the estrogen receptor gene ESR1 and the ATP binding cassette transporter A1 gene ABCA1, known to play a role in HDL metabolism ($\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.95$). As seen in Table 2.1, both novel signals possessed a high-confidence female-specific effect with weak evidence for a male-specific effect ($\mathrm{PPF}_{\sigma^2_{g_{j^*}}} \geq 0.95$; $\mathrm{PPM}_{\sigma^2_{g_{j^*}}} \leq 0.6$); however, the magnitude of the female-specific effect was relatively small ($\hat{\sigma}^2_{g_{fj^*}} \leq 1.4 \times 10^{-4}$). As evident from Table 2.1, most novel WHR G×S interactions detectable with LBR are those with relatively small sex-specific effects.

Additionally, we utilized a traditional SMR approach (see Methods) for the discovery of G×S interactions among traits to compare pvalue-diff signals to $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$ signals (Fig A.4). At pvalue-diff $\leq 5 \times 10^{-8}$, there were no genome-wide significant G×S-interacting SNPs for height, one significant SNP for BMI nearby a window with $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$, and one significant peak within ABO for BMD (the same signal detected using $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$). Regions with a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$ generally coincided with at least nominally significant pvalue-diff signals; for height and BMD, regions with $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$ also possessed a peak SNP with pvalue-diff $\leq 0.01$. For BMI, $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$ signals possessed a peak SNP of pvalue-diff $\leq 0.1$. This, together with the fact that novel G×S interactions found using LBR possess relatively small sex-specific effects, suggests that LBR may be detecting G×S interactions that are otherwise missed due to low power. Lastly, for WHR, most of the high-confidence $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$ signals coincided with clear and obvious pvalue-diff peaks.

Table 2.1: G×S interactions inferred through sex-specific window variances (a)

| Focal SNP (b) | Trait | $\hat{\sigma}^2_{g_{mj^*}}$ (c) | $\hat{\sigma}^2_{g_{fj^*}}$ (c) | PPM | PPF | PPDiff | Nearest gene (d) | Location | eQTL (e) |
|---|---|---|---|---|---|---|---|---|---|
| rs8176719 | BMD | 0.06000 | 0.00182 | 1.000 | 0.794 | 1.000 | ABO | exon/frameshift | yes |
| rs1535515 | height | 0.00211 | 0.01170 | 0.819 | 0.999 | 0.956 | LRRC8C | intron | yes |
| rs1544926 | height | 0.00763 | 0.00035 | 0.983 | 0.418 | 0.955 | COL23A1 | UTR-3 | yes |
| rs6905288 | WHR | 0.00567 | 0.22200 | 0.920 | 1.000 | 1.000 | VEGFA | downstream | |
| rs72961013 | WHR | 0.03260 | 0.18100 | 1.000 | 1.000 | 1.000 | RSPO3 | downstream | |
| rs1128249 | WHR | 0.00132 | 0.10700 | 0.614 | 1.000 | 1.000 | COBLL1 | intron | yes |
| rs12022722 | WHR | 0.00080 | 0.07180 | 0.490 | 1.000 | 1.000 | LYPLAL1 | downstream | yes |
| rs1776897 | WHR | 0.00870 | 0.06140 | 0.976 | 1.000 | 0.950 | HMGA1 | upstream | yes |
| rs11057401 | WHR | 0.00438 | 0.06030 | 0.846 | 1.000 | 1.000 | CCDC92 | exon/missense | yes |
| rs17777180 | WHR | 0.00031 | 0.05950 | 0.291 | 1.000 | 1.000 | CMIP | intron | yes |
| rs4607103 | WHR | 0.00195 | 0.05920 | 0.809 | 1.000 | 1.000 | ADAMTS9-AS2 | intron | yes |
| rs6937293 | WHR | 0.00457 | 0.04660 | 0.839 | 1.000 | 1.000 | LOC728012 | downstream | yes |
| rs16861373 | WHR | 0.00066 | 0.04300 | 0.389 | 1.000 | 0.995 | PLXND1 | intron | |
| rs73068463 | WHR | 0.00068 | 0.04220 | 0.461 | 1.000 | 1.000 | SNX10 | intron | yes |
| rs9376422 | WHR | 0.00107 | 0.04180 | 0.524 | 1.000 | 1.000 | LOC645434 | upstream | |
| rs6867983 | WHR | 0.00192 | 0.03820 | 0.440 | 1.000 | 0.998 | MAP3K1 | upstream | |
| rs2171522 | WHR | 0.00241 | 0.03650 | 0.561 | 1.000 | 0.998 | ITPR2 | downstream | yes |
| rs3810068 | WHR | 0.00026 | 0.03590 | 0.174 | 1.000 | 1.000 | EMILIN2 | upstream | yes |
| rs568890 | WHR | 0.00129 | 0.03110 | 0.809 | 1.000 | 1.000 | NKX2-6 | upstream | yes |
| rs1332955 | WHR | 0.00647 | 0.02940 | 0.970 | 1.000 | 0.973 | LOC284688 | downstream | yes |
| rs13133548 | WHR | 0.00019 | 0.02400 | 0.175 | 0.969 | 0.956 | FAM13A | intron | yes |
| rs11263641 | WHR | 0.00207 | 0.02340 | 0.723 | 1.000 | 0.991 | MYEOV | downstream | yes |
| rs2800999 | WHR | 0.00201 | 0.02220 | 0.691 | 1.000 | 0.979 | TSHZ2 | intron | |
| rs2244506 | WHR | 0.00101 | 0.02070 | 0.453 | 0.998 | 0.985 | MIR5694 | downstream | |
| rs7259285 | WHR | 0.00182 | 0.01710 | 0.767 | 1.000 | 0.989 | HAUS8 | downstream | yes |
| rs4450871 | WHR | 0.00002 | 0.01680 | 0.027 | 1.000 | 1.000 | CYTL1 | downstream | |
| rs4080890 | WHR | 0.00153 | 0.01630 | 0.594 | 0.999 | 0.975 | KCNJ2 | downstream | |
| rs4684859 | WHR | 0.00039 | 0.01570 | 0.330 | 0.998 | 0.994 | PPARG | downstream | |
| rs7704120 | WHR | 0.00049 | 0.01370 | 0.476 | 0.998 | 0.991 | STC2 | downstream | |
| rs10991417 | WHR | 0.00048 | 0.01230 | 0.339 | 0.986 | 0.966 | ABCA1 | intron | yes |
| rs12454712 | WHR | 0.00087 | 0.01020 | 0.360 | 0.996 | 0.965 | BCL2 | intron | yes |
| rs62070804 | WHR | 0.00004 | 0.00887 | 0.052 | 0.969 | 0.961 | ABHD15 | exon/missense | yes |
| rs10760322 | WHR | 0.00027 | 0.00812 | 0.282 | 0.986 | 0.968 | LHX2 | downstream | |
| rs1361024 | WHR | 0.00022 | 0.00760 | 0.203 | 0.982 | 0.962 | ESR1 | intron | |
| rs1358503 | WHR | 0.00021 | 0.00716 | 0.309 | 0.989 | 0.966 | SEMA3C | upstream | yes |
| rs13156948 | WHR | 0.00016 | 0.00660 | 0.079 | 0.970 | 0.957 | IRX1 | downstream | |
| rs12432376 | WHR | 0.01740 | 0.00074 | 1.000 | 0.552 | 0.994 | STXBP6 | upstream | |

(a) Listed are loci with at least 0.95 posterior probability that sex-specific window variances differ. The PPM, PPF, and PPDiff columns refer to the window-variance metrics $\mathrm{PPM}_{\sigma^2_{g_{j^*}}}$, $\mathrm{PPF}_{\sigma^2_{g_{j^*}}}$, and $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$. The table is sorted first by trait, then by magnitude of the female-specific window variance. Results are filtered such that each window listed consisted of a distinct set of SNPs. A full list of all G×S signals at a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.90$ threshold is provided in Table A.2.
(b) Focal SNP is defined as the center SNP $j$ in window $j^*$.
(c) The proportion of variance explained by sex-specific SNP effects, expressed as a percentage.
(d) Nearest gene and location identified through Axiom UKB WCSG annotations, release 34. The gene/locus is bold if it has been previously detected as a G×S interaction for WHR or WHR adjusted for BMI [48, 16, 49, 50].
(e) If the focal SNP is significantly associated with gene expression in at least one tissue, according to GTEx V7.

2.4.4 Inferred G×S interactions are enriched in tissue-specific eQTL

As seen previously, many G×S interactions inferred using LBR have exceedingly small sex-specific effects. To further investigate whether G×S detections using the $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$ metric may be functionally relevant, we tested whether such signals are enriched in eQTL identified from GTEx. Specifically, using a hypergeometric test we asked whether $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$-selected focal SNPs (SNP $j$ within window $j^*$) were enriched in eQTL, then compared to eQTL enrichment from pvalue-diff-selected SNPs as a function of the number of SNPs selected (Fig A.5).
$\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}}$-selected focal SNPs showed consistently higher eQTL enrichment than pvalue-diff-selected SNPs for all traits except WHR. For instance, at $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$, the total number of windows (focal SNPs) selected was 36, 264, 34, and 13 for height, WHR, BMD, and BMI, respectively. With these selections, eQTL enrichment $p$-values were $2.39 \times 10^{-4}$, $1.52 \times 10^{-12}$, $2.01 \times 10^{-12}$, and $8.33 \times 10^{-4}$ for height, WHR, BMD, and BMI, respectively. When selecting the same number of SNPs using pvalue-diff, enrichment $p$-values were $2.25 \times 10^{-2}$, $1.56 \times 10^{-28}$, $5.54 \times 10^{-8}$, and $1.93 \times 10^{-1}$ for height, WHR, BMD, and BMI, respectively.

To provide more information about how genetic regions bearing G×S interactions may impact gene expression in specific tissues, we determined whether focal SNPs at $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$ are enriched in tissue-specific eQTL (Fig 2.5). For height, BMD, and WHR, such SNPs showed significant eQTL enrichment in at least one tissue, using a conservative Bonferroni-corrected enrichment $p$-value threshold of $2.6 \times 10^{-4}$ (0.05 divided by 192 tests in total; 48 tissues and 4 traits). Interestingly, BMD's G×S signals are very strongly enriched in eQTL with associated eGenes (including ABO and CYP3A5) expressed in the adrenal gland, among other tissues. For height, we observed small enrichment $p$-values across many tissues since G×S focal SNPs are enriched in eQTL with associated eGenes (including LOC101927975 and CNDP2) expressed across many tissues. Lastly, for WHR, we observed G×S detections to be heavily enriched in eQTL with associated eGenes expressed in fibroblast, adipose, and skin tissues.

2.5 Discussion

We have investigated the degree to which sex-specific genetic architectures differ at local regions, using large biobank data (N ≈ 119,000 males and ≈ 140,000 females) and Bayesian multiple regression techniques that estimate sex-specific marker effects accounting for local LD patterns. The flexibility of the Bayesian approach enables multi-resolution inference of sex-specific effects: from individual SNP effects to window variances that aggregate SNP effects within chromosome segments. These inferences can all be drawn using the results of the same model fit (equation 2.1) but different post-processing of samples of SNP effects from the posterior distribution.
The Bayesian multiple regression technique performed in this study, along with estimation of window variances, was largely inspired by Fernando et al. [26]. In that study, windows were defined using disjoint, fixed intervals. In contrast, for each SNP we define a window based on local LD patterns, resulting in heavily overlapping, dynamically sized windows. The methods presented here also bear resemblance to those of Vilhjálmsson et al. [45], which utilized point-normal priors to estimate human SNP effects after accounting for local LD patterns. In that study, posterior means of SNP effects were estimated for the purposes of prediction, while in this study we numerically derive the full posterior distribution, allowing for inference of non-null SNP effects and window variances.

Figure 2.5: Evidence that LBR-identified G×S interactions are enriched in tissue-specific eQTL. Plotted on the x-axis is the $p$-value obtained from a hypergeometric test providing evidence that focal SNPs selected using $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$ are enriched in tissue-specific eQTL. The dashed line represents a Bonferroni-corrected significance threshold of $2.6 \times 10^{-4}$.

Through simulations, we showed that local Bayesian regressions (LBR) provide superior power and precision to detect causal variants and those specifically bearing G×S interactions. We rationalize improvements in power upon traditional SMR methods by noting that the magnitude of a typical causal variant or G×S interaction is exceedingly small and can elude hypothesis testing partly due to the burden of multiple test correction. We also note that the resolution (peak size) in SMR signals is relatively large when using large sample sizes (due to not fully accounting for local LD patterns). To overcome this problem, we provided evidence that LBR, by estimating true marker effects or by aggregating true marker effects within relatively small regions, can achieve improved resolution when working with large sample sizes such as biobank-level data.
When using LBR to analyze real human traits, we have provided credence to our posterior probability-based discoveries by determining that LBR-detected G×S interactions are generally more enriched in eQTL than SMR-detected interactions. For BMD, we provided new evidence that sex-specific effects differ within ABO and that G×S interactions are highly enriched in adrenal gland-specific eQTL. This encourages the hypothesis that some G×S interactions are eQTL that may modulate gene expression in the adrenal gland, with gene function dependent on the presence or absence of sex hormones. This was also an intriguing finding given that ABO blood groups have been known to associate with osteoporosis and osteoporosis severity [59, 60]. For WHR, we detected previously known, large-magnitude G×S interactions that were discovered using WHR or WHRadjBMI [48, 16, 49, 50], but additionally discovered novel, small-magnitude G×S interactions near genes such as ESR1 and ABCA1. In a previous work analyzing WHRadjBMI, ABCA1 showed a significant female-specific genetic effect only; however, the test for G×S interaction failed to reach significance [50].

For traits like height and BMI, large-effect loci are estimated to have very similar effects between males and females, and loci with evidence of G×S interactions were those possessing relatively small sex-specific effects. As seen in Fig 2.4B, many relatively large window variances for height are estimated to be slightly higher for males than for females, albeit not reaching a $\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} \geq 0.9$ threshold. This is consistent with the fact that the global genomic variance for height was estimated to be higher in males than in females in a previous study using the interim release of the UK Biobank [14]. Similarly, the same prior study estimated the global genomic variance of BMI to be higher in females than in males, and we observe, if anything, evidence of sex-specific window variances leading to the same conclusion. These observations may indicate that relatively large causal variants have slightly different sex-specific effects for traits like height and BMI; however, if that is the case, we are still underpowered to confidently detect such interactions.
It is important to acknowledge that while the methods presented here appear useful for deciphering sex-specific genetic architectures from large human samples, additional work will be required to determine how these techniques may infer heterogeneous genetic effects in other contexts (other types of gene-by-covariate interactions), or when using different sample sizes or samples from different populations. With large sample sizes, the increased power and flexibility of LBR come at the cost of a significantly larger computational burden than the one involved in the traditional SMR approach; however, working with large datasets can be made manageable by adjusting the size of each fitted segment (Fig 2.1) and parallel processing the fitting of each segment. Alternatively, LBR may be used as a follow-up to traditional SMR tests, using pre-selected regions of interest. Another limitation inherent to aggregating SNP effects using window variances is that the sign of the effect is lost. In this way, when inferring G×S interactions through window variance differences, we cannot comment on whether sex-specific effects had the same sign or differing signs.

To conclude, we have demonstrated the powerful and flexible use of local Bayesian regressions for GWA to infer sex-specific genetic effects and G×S interactions using the UK Biobank. This was largely done by showing various means to utilize estimates of true (accounting for local LD), sex-specific SNP marker effects for GWA, even when causal variants are not on the SNP panel for analysis. We anticipate that many more traits will be analyzed with this method to learn more about what is contributing to differences between males and females in human populations.
2.6 Methods

2.6.1 Genotype data

Individuals from the UK Biobank [61] were genotyped using the custom UK Biobank Axiom Array (http://www.ukbiobank.ac.uk/scientists-3/uk-biobank-axiom-array/) containing ≈800,000 SNPs. SNP quality control proceeded with the Caucasian cohort (N = 409,700); SNPs with a minor allele frequency < 0.01 and missing call rate > 0.05 were removed. SNPs from sex chromosomes and the mitochondrial chromosome were not considered in this study, resulting in 607,497 autosomal SNPs. Individuals with a coefficient of relatedness of 0.03 or greater were removed from analysis, resulting in 258,928 distantly related genotyped individuals for use in this study.

2.6.2 Phenotype data

All phenotypic data were collected using baseline measurements of UK Biobank participants. For height, the description "Standing height" from the UK Biobank was used. Individuals with heights (cm) less than 147 or more than 210 were removed from analysis. For BMD, the descriptions "Heel bone mineral density (BMD)", "Heel bone mineral density (BMD) (left)", and "Heel bone mineral density (BMD) (right)" were used in conjunction; for individuals with missing "Heel bone mineral density (BMD)" records, either the (left), the (right), or, if available, the average between (left) and (right) was used. For BMI, the description "Body mass index (BMI)" was used, and for WHR, the ratio of "Waist circumference" to "Hip circumference" was used. Prior to model fitting, all traits were pre-corrected for sex, age, batch, genotyping center, and the first 5 principal components derived from genomic data. The adjusted phenotypes consisted of least-squares residuals from a model that included the effects listed above. For each trait, sample sizes and within-sex summary statistics are provided in Table A.1.

2.6.3 LBR hyperparameters

Hyperparameters used in the LBR model (equation 2.1) were error variances for each sex, the proportion of nonzero effects for each SNP effect component, and the variances of nonzero effects for each SNP effect component: $\{\sigma^2_{\varepsilon_m}, \sigma^2_{\varepsilon_f}, \pi_0, \pi_m, \pi_f, \sigma^2_{b_0}, \sigma^2_{b_m}, \sigma^2_{b_f}\}$. Variances (of either SNP effect components or sex-specific errors) were given a scaled-inverse chi-square prior, parameterized by a degrees-of-freedom parameter $df$ (set to 5) and scaling parameter $S$.
$S$ is set according to built-in rules of the BGLR package using a prior model R-squared of 0.03 for main effects and 0.01 for the sex-interaction terms. More detail on how the scale parameter $S$ is calculated can be found in Perez and de los Campos, 2014 [52]. $\pi_k$ was given a beta prior with both shape parameters equal to 2. An example of how to implement LBR (equation 2.1) using BGLR with the above hyperparameter specifications is provided at https://github.com/funkhou9/LBR-sex-interactions.

2.6.4 Inference using post-processing of posterior samples

BGLR uses Markov chain Monte Carlo (MCMC) to sample from the posterior distribution of sex-specific effects. For each MCMC sample we derived male and female effects using $\beta_{mj}(s) = b_{0j}(s) + b_{mj}(s)$ and $\beta_{fj}(s) = b_{0j}(s) + b_{fj}(s)$, where $s = 1, \ldots, 4{,}350$ indexes MCMC samples. Here, results were obtained using three separate MCMC chains. Each chain was obtained using 3,400 MCMC samples; the first 500 samples were discarded as burn-in and the remaining samples were thinned by an interval of 2, leading to 1,450 samples per chain.

Estimates of sex-specific SNP effects $\hat{\beta}_{mj}$ and $\hat{\beta}_{fj}$ were obtained from their posterior means. We estimated the posterior probability of a female-specific nonzero SNP effect using $\mathrm{PPF}_{\mathrm{SNP}_j} = \max\left[\Pr(\beta_{fj} > 0 \mid D), \Pr(\beta_{fj} < 0 \mid D)\right]$, where $D$ represents the observed data. This was done by counting the proportion of $\beta_{fj}$ samples above zero and below zero. This was repeated for inferring the male-specific SNP effect. The posterior probability of sex difference at individual SNP effects was estimated using

$$
\mathrm{PPDiff}_{\mathrm{SNP}_j} = \max\left[\Pr(\beta_{mj} > \beta_{fj} \mid D), \Pr(\beta_{mj} < \beta_{fj} \mid D)\right],
$$

where again these probabilities were estimated using the corresponding frequencies from the posterior distribution samples.
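The SNP-level posterior probabilities above reduce to counting proportions of MCMC samples, which can be sketched as follows. The function names and the toy sample values are hypothetical and serve only to illustrate the counting; the functions actually used in this study are those in the repository cited above.

```python
def kept_samples(n_iter, burn_in, thin):
    """Number of MCMC samples retained per chain after burn-in and thinning."""
    return (n_iter - burn_in) // thin

def prob_nonzero(samples):
    """max[Pr(beta > 0 | D), Pr(beta < 0 | D)], estimated by sample proportions."""
    n = len(samples)
    above = sum(1 for s in samples if s > 0) / n
    below = sum(1 for s in samples if s < 0) / n
    return max(above, below)

def prob_sex_difference(male_samples, female_samples):
    """max[Pr(beta_m > beta_f | D), Pr(beta_m < beta_f | D)] from paired samples."""
    n = len(male_samples)
    pairs = list(zip(male_samples, female_samples))
    m_greater = sum(1 for m, f in pairs if m > f) / n
    m_smaller = sum(1 for m, f in pairs if m < f) / n
    return max(m_greater, m_smaller)

# Sample bookkeeping described in the text: 3 chains, 3,400 iterations each,
# 500 burn-in iterations, thinning interval of 2.
per_chain = kept_samples(3400, 500, 2)
total = 3 * per_chain

# Toy posterior samples for one SNP (values are illustrative only).
beta_m = [0.010, 0.012, -0.001, 0.008]
beta_f = [0.002, 0.015, -0.004, 0.001]
ppm = prob_nonzero(beta_m)
ppf = prob_nonzero(beta_f)
ppdiff = prob_sex_difference(beta_m, beta_f)
```

Note that samples drawn from the spike (exactly zero) count toward neither tail, so a SNP whose effect is frequently excluded from the model cannot reach a high posterior probability of a nonzero effect.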
For each MCMC sample we also aggregated SNP effects within window $j^*$ using $u_{mj^*}(s) = X_{j^*}\beta_{mj^*}(s)$ and $u_{fj^*}(s) = X_{j^*}\beta_{fj^*}(s)$. For this calculation we used a common genotype matrix $X_{j^*}$ consisting of all $N$ male and female genotypes to avoid differences in additive genetic values arising from allele frequency differences between males and females occurring by random sampling. Samples of sex-specific window variances were obtained using the sample variance:

$$
\sigma^2_{g_{mj^*}}(s) = (N - 1)^{-1} \sum_{i=1}^{N} \left( u_{m_i j^*}(s) - \bar{u}_{mj^*}(s) \right)^2
\quad \text{and} \quad
\sigma^2_{g_{fj^*}}(s) = (N - 1)^{-1} \sum_{i=1}^{N} \left( u_{f_i j^*}(s) - \bar{u}_{fj^*}(s) \right)^2 .
$$

Estimates of sex-specific window variances were obtained from their posterior means. Inferring sex-specific window variances was done by estimating $\mathrm{PPM}_{\sigma^2_{g_{j^*}}} = \Pr(\sigma^2_{g_{mj^*}} > 0 \mid D)$ and $\mathrm{PPF}_{\sigma^2_{g_{j^*}}} = \Pr(\sigma^2_{g_{fj^*}} > 0 \mid D)$, and inferring a G×S interaction at window $j^*$ was done by estimating

$$
\mathrm{PPDiff}_{\sigma^2_{g_{j^*}}} = \max\left[ \Pr\left(\sigma^2_{g_{mj^*}} - \sigma^2_{g_{fj^*}} > t_{j^*} \mid D\right), \Pr\left(\sigma^2_{g_{mj^*}} - \sigma^2_{g_{fj^*}} < -t_{j^*} \mid D\right) \right],
$$

where $t_{j^*}$ was used to exert judgment about how different sex-specific window variances must be to declare a meaningful G×S interaction. Here, $t_{j^*}$ was one-tenth of the mean of all posterior samples of $\sigma^2_{g_{mj^*}}$ and $\sigma^2_{g_{fj^*}}$. Functions to process posterior samples to estimate and infer non-null sex-specific effects and G×S interactions are provided at https://github.com/funkhou9/LBR-sex-interactions.

2.6.5 Defining local, LD-based windows

To define SNPs contained within window $j^*$, a region of LD centered on SNP $j$, we collected all SNP $j'$ immediately surrounding SNP $j$ for which $\mathrm{cor}(x_j, x_{j'})^2 \geq 0.1$. We allowed up to two consecutive SNPs in which $\mathrm{cor}(x_j, x_{j'})^2 < 0.1$ to allow for potential mapping errors or other unexplained instances where LD with SNP $j$ dips only briefly. The function getWindows(), which provides windows given a genotype matrix $X$, is provided in https://github.com/funkhou9/LBR-sex-interactions.
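The LD-based window rule of section 2.6.5 can be sketched as below. This is not the getWindows() function from the repository: the function names, the column-wise genotype representation, and in particular the choice to exclude the tolerated low-LD SNPs from the returned window are assumptions made for illustration.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype (dosage) vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: treat as uncorrelated
    return sxy * sxy / (sxx * syy)

def ld_window(columns, j, min_r2=0.1, max_gap=2):
    """Indices of SNPs in the LD window centered on focal SNP j.

    Walks outward from j in both directions, keeping SNPs with r^2 >= min_r2
    and stopping once more than max_gap consecutive SNPs fall below min_r2.
    Assumption: SNPs inside a tolerated gap are skipped, not included.
    """
    window = [j]
    for step in (-1, 1):
        gap = 0
        k = j + step
        while 0 <= k < len(columns):
            if r_squared(columns[j], columns[k]) >= min_r2:
                window.append(k)
                gap = 0
            else:
                gap += 1
                if gap > max_gap:
                    break
            k += step
    return sorted(window)

# Toy genotypes: "b" columns are in perfect LD with the focal SNP, "u" columns are uncorrelated.
b = [0, 2, 0, 2, 0, 2, 0, 2]
u = [0, 0, 2, 2, 0, 0, 2, 2]
columns = [u, b, u, u, b, u, b, u, u, u, b]
window = ld_window(columns, j=4)
```

In this toy example the walk to the right stops after three consecutive uncorrelated SNPs, so the final correlated SNP is not reached, whereas the two-SNP dip on the left is tolerated.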
2.6.6 Single marker regression

We also performed single-marker regression analyses using the following model:

$$
\begin{bmatrix} y_m \\ y_f \end{bmatrix}
= \begin{bmatrix} \mathbf{1}\mu_m \\ \mathbf{1}\mu_f \end{bmatrix}
+ \begin{bmatrix} x_{mj} \\ x_{fj} \end{bmatrix} \beta_j
+ \begin{bmatrix} x_{mj} \\ 0 \end{bmatrix} \beta_j^{G \times S}
+ \begin{bmatrix} \varepsilon_m \\ \varepsilon_f \end{bmatrix}.
\tag{2.2}
$$

As with the LBR model (equation 2.1), we assume sex-specific errors are distributed normally with zero mean and sex-specific variances. SNP effects and interactions were estimated using weighted least squares. To test for a G×S interaction at SNP $j$, a t-test is used: $\hat{\beta}_j^{G \times S} / SE(\hat{\beta}_j^{G \times S}) \sim t_{N-3}$. The $p$-value from such a test is referred to as pvalue-diff. To test for any association (either among males, females, or both), we used an F-test, comparing a restricted model,

$$
\begin{bmatrix} y_m \\ y_f \end{bmatrix}
= \begin{bmatrix} \mathbf{1}\mu_m \\ \mathbf{1}\mu_f \end{bmatrix}
+ \begin{bmatrix} \varepsilon_m \\ \varepsilon_f \end{bmatrix},
$$

against the unrestricted model in equation 2.2.

2.6.7 Simulations

Simulated traits were developed using 60,000 genotyped SNPs (the first 6,000 SNPs from the first ten chromosomes) from 119,190 males and 139,738 females. Using these SNP genotypes, each trait was simulated as follows:

1. A total of 150 causal variants (CVs) were randomly sampled from 60,000 SNPs. Let $Z_m = \{z_{m_{ik}}\}$ ($i = 1, \ldots, N_m = 119{,}190$; $k = 1, \ldots, q = 150$) and $Z_f = \{z_{f_{ik}}\}$ ($i = 1, \ldots, N_f = 139{,}738$; $k = 1, \ldots, q = 150$) denote matrices of male and female genotypes at sampled CVs.

2. Additive CV effect sizes were randomly sampled from the gamma distribution. 90 CVs (those with homogeneous effects) were sampled from $\mathrm{Gamma}(k = 10, \theta = 1)$ and were made negative with a probability of 0.5. Of the 60 CVs with differing sex-specific effects, 30 had nonzero effects in both sexes but with differing magnitudes: at random, one sex's effects were sampled from $\mathrm{Gamma}(k = 5, \theta = 1)$ and the other's from $\mathrm{Gamma}(k = 20, \theta = 1)$. For the remaining 30 CVs, at random, one sex's effects were exactly zero while the other sex's effects were sampled from $\mathrm{Gamma}(k = 10, \theta = 1)$. Let $\beta_m = \{\beta_{m_k}\}$ and $\beta_f = \{\beta_{f_k}\}$ ($k = 1, \ldots, q = 150$) denote vectors of male-specific and female-specific CV effects, respectively, for all 150 CVs.

3. Error variances for males $\sigma^2_m$ and females $\sigma^2_f$ were adjusted such that the proportion of phenotypic variance explained by all QTL is 0.05 for both males and females (on the complete genome scale this corresponds to a heritability of about 0.5). Let $\varepsilon_{m_i} \sim N(0, \sigma^2_m)$ and $\varepsilon_{f_i} \sim N(0, \sigma^2_f)$ denote residual error for the $i$th male and $i$th female.

4. Male traits $\phi_m = \{\phi_{m_i}\}$ ($i = 1, \ldots, N_m = 119{,}190$) and female traits $\phi_f = \{\phi_{f_i}\}$ ($i = 1, \ldots, N_f = 139{,}738$) were simulated from a linear combination of QTL genotypes plus a residual error: $\phi_m = Z_m \beta_m + \varepsilon_m$ and $\phi_f = Z_f \beta_f + \varepsilon_f$.

5. Steps 1-4 are repeated for 30 Monte Carlo replicates.

2.7 Acknowledgments

Enrichment analysis performed in this manuscript was done using data from the Genotype-Tissue Expression (GTEx) Project. Single-tissue cis-eQTL data was downloaded from https://gtexportal.org/home/datasets on 02/01/19.

CHAPTER 3

EVIDENCE FOR TRANSCRIPTOME-WIDE RNA EDITING AMONG SUS SCROFA PRE-1 SINE ELEMENTS

This chapter has been published previously [62]. The manuscript was prepared alongside co-authors Juan P Steibel, Ronald O Bates, Nancy E Raney, Darius Schenk, and Catherine W Ernst.
3.1 Abstract

RNA editing by ADAR (adenosine deaminase acting on RNA) proteins is a form of transcriptional regulation that is widespread among humans and other primates. Based on high-throughput scans used to identify putative RNA editing sites, ADAR appears to catalyze a substantial number of adenosine to inosine transitions within repetitive regions of the primate transcriptome, thereby dramatically enhancing genetic variation beyond what is encoded in the genome. Here, we demonstrate the editing potential of the pig transcriptome by utilizing DNA and RNA sequence data from the same pig. We identified a total of 8550 mismatches between DNA and RNA sequences across three tissues, with 75% of these exhibiting an A-to-G (DNA to RNA) discrepancy, indicative of a canonical ADAR-catalyzed RNA editing event. When we consider only mismatches within repetitive regions of the genome, the A-to-G percentage increases to 94%, with the majority of these located within the swine-specific SINE retrotransposon PRE-1. We also observe evidence of A-to-G editing within coding regions that were previously verified in primates. Thus, our high-throughput evidence suggests that pervasive RNA editing by ADAR can exist outside of the primate lineage to dramatically enhance genetic variation in pigs.

3.2 Background

Eukaryotes are known for relatively complex mechanisms used to regulate gene expression. One such mechanism, RNA editing, enables the cell to alter sequences of RNA transcripts [30] such that they are no longer forced to match the genome sequence. High-throughput methods for studying targets of this mechanism transcriptome-wide have been applied to primate studies, where evidence for massive amounts of ADAR (adenosine deaminase acting on RNA) catalyzed A-to-I RNA editing has been discovered, preferentially within SINE retrotransposons such as the primate Alu [32,63,64,65,66,67,68]. Such work has yet to be performed with pig transcriptomes using the latest sequencing technology. Although little is known about pig SINE elements compared to those in primates, key features of the pig-specific PRE-1 retrotransposon make pigs an intriguing model to further elucidate transcriptome-wide patterns of ADAR targets.
ADAR can only catalyze A-to-I editing within dsRNA. The high editability of the primate-specific Alu element is attributed to its capacity to induce dsRNA; these elements have a high copy number, are short, are relatively undiverged from one another, and tend to cluster in gene-rich regions of the genome [69]. When appearing as tandem and inverted pairs within the same transcribed region, these properties facilitate intramolecular dsRNA formation that serves as an ADAR target [32,70]. Comparatively, the pig PRE-1 element possesses many of these same properties that are believed to contribute to dsRNA formation within the transcriptome. Notably, PRE-1 has the 3rd highest copy number of any SINE cataloged on SINEBase [71].

Since Alu elements are generally found within and near genes, ADAR editing in humans preferentially targets non-coding regions of many genes such as introns, UTRs, and upstream and downstream gene-proximal regions. ADAR editing of these regions is thought to be a key component of RNA processing via mechanisms that include Alu exonization [72] and RNAi pathway alteration [73]. Demonstrating that RNA editing in pigs generally targets SINE elements within non-coding regions of genes would suggest that RNA processing by way of ADAR editing of SINE elements predated the emergence of primate- and pig-specific retrotransposons. Rarely, ADAR editing occurs within coding regions to alter amino acid sequences [74]. This type of editing is particularly mysterious in that its pattern is less traceable than non-coding editing, but it is nevertheless site-specific and required for the function of essential protein-coding genes such as GluR-B in mice [33]. Therefore, in addition to the regulation of transcripts by way of editing non-coding SINE elements, editing of coding regions is an essential form of transcriptional regulation in mice, with the extent of its conservation across Mammalia yet to be fully determined.
Here, we demonstrate the pig's capacity for RNA editing. By studying this process in a species relatively distant from humans and with a distinct repetitive element repertoire, we want to determine if RNA editing patterns seen in Alu-bearing genomes can likewise be observed in pigs. RNA editing detection was done by analyzing a single pig using whole genome sequencing data and RNA sequencing data from liver, subcutaneous fat, and longissimus dorsi muscle. Based on previous studies done in primates, a bioinformatic strategy was used to find A-to-I (observed as A-to-G) DNA-to-RNA mismatches that give evidence of ADAR-catalyzed RNA editing events.

3.3 Results and discussion

3.3.1 DNA and RNA sequencing

To provide the materials needed for a transcriptome-wide survey of RNA editing candidates, genomic DNA as well as total RNA from liver, subcutaneous fat, and longissimus dorsi (LD) muscle were purified from samples obtained from a single animal, similar to another single-animal editome study [68]. Sequencing was done using the Illumina HiSeq 2500 to generate 150x2 paired-end reads from genomic DNA, with PolyA RNA sequencing used to generate cDNA reads in the same format. Roughly 250M pass-filter genomic DNA reads were generated with an average overall alignment rate of 89% to the Sus scrofa reference genome sequence (Sus scrofa 10.2.69). An average of 106M pass-filter strand-specific cDNA reads were obtained from each tissue, with an average overall alignment rate of 76%.
3.3.2 Identification of candidate RNA editing events

To scan the transcriptome for possible RNA editing sites, we utilized a custom pipeline influenced by previous studies done in human cell lines and primates [75,68]. To avoid utilizing bases with relatively poor base qualities at the ends of reads, raw genomic DNA and cDNA sequencing reads were trimmed for base quality at their 3' ends before aligning to the Sus scrofa 10.2.69 reference genome. Additional trimming of 6 bp from the 5' ends of cDNA reads was done to prevent misidentification of DNA-RNA mismatches due to artifacts associated with the use of random hexamers during cDNA library preparation [76]. When conducting a search for RNA editing candidates with RNA-seq, strand-specific RNA-seq libraries can be utilized to account for the strandedness of each transcript, thereby enabling A-to-G DNA-to-RNA mismatches to be distinguished from T-to-C DNA-to-RNA mismatches. In order to utilize our strand-specific cDNA alignments for variant calling while preserving the strandedness of each alignment to distinguish A-to-G from T-to-C mismatches, plus-strand alignments were separated from minus-strand alignments for each cDNA sample. From all genomic DNA and cDNA alignments, we extracted those reads that had only 1 recorded alignment in order to optimize our chances that genomic DNA and cDNA reads arising from the same locus map to the same location. Joint variant calling using SAMTools [77] was performed, combining genomic DNA alignments with cDNA plus-strand alignments from each tissue. This was repeated for all cDNA minus-strand alignments. Both resulting VCF files were analyzed using editTools, an in-house R package made to efficiently scan VCF files for DNA-RNA mismatches using C++ source code. editTools was developed to implement RNA editing detection within the R framework and to provide visualization tools; editTools was used to generate all figures in this manuscript pertaining to sequencing data. Default editTools parameters were used, in which a mismatch was considered a candidate RNA editing site if at a particular locus 1) the genotype is homozygous according to 95% of the DNA reads, 2) at least 10 reads were used to determine the genotype, 3) neither genomic DNA nor cDNA samples are indels, 4) at least 5 cDNA
reads from the same tissue differ from the genotype call, and 5) these cDNA reads must have a Phred-scaled strand-bias P-value of 20 or less. Specific thresholds for DNA and cDNA sequencing depths were determined according to a previous study that profiled the rhesus macaque editome from a single animal [68]. Using this approach, we identified a total of 6410 A-to-G mismatch events representing 75% of all mismatches found (8550 total mismatches; Fig 3.1). When we restrict our search to known swine repetitive sequences, 5993 out of 6410 A-to-G mismatches are retained, representing 93.8% of all mismatches in repetitive regions. Of the remaining mismatches in repetitive regions, 4.1% are T-to-C. It is not surprising that T-to-C mismatches are the second most common, since T-to-C artifacts could arise if, at a true A-to-G editing site, plus-strand alignments were incorrectly identified as minus-strand alignments or vice versa. Note that our observation of 6410 A-to-G mismatches is intended to be a conservative estimate of the total number of ADAR-catalyzed editing sites in these three tissues, primarily because we have restricted our search to homozygous sites; at heterozygous sites, it is not feasible to directly determine which allele is being edited, or if editing is truly occurring at either allele.

Figure 3.1: DNA to RNA mismatch counts. Comparing all mismatches found transcriptome-wide (Left) to those within the body of a repetitive element (Right). Percentages shown are out of all mismatches found in each category.

3.3.3 Tissue differences

To understand differences in candidate RNA editing sites between tissues, canonical A-to-G mismatches were aligned across tissues if they were detected at the same physical position and on the same strand. The number of candidate RNA editing events was fewer in LD compared to liver or fat (Fig 3.1), consistent with lower RNA editing activity in muscle compared to other tissues for rhesus macaque [68].

Figure 3.2: Shared A-to-G mismatches between tissues. A mismatch between two or more tissues was considered shared if it occurred at the same physical position and on the same strand.

Despite candidate RNA editing sites showing strong tissue specificity, a total
of 144 A-to-G mismatches were found to be common among all three tissues, whereas 748 were found to be common between liver and fat (Fig 3.2).

One factor that may contribute to tissue specificity of RNA editing is differential expression of ADAR [78]. Using RNA samples from 33 additional pigs, a quantitative real-time PCR assay was used to infer ADAR transcript abundance differences between liver, subcutaneous fat, and LD muscle (Fig 3.3). Average ADAR expression was determined to be significantly lower in LD muscle tissue than in either fat (p<0.0003) or liver (p<0.00001) tissues, suggesting that differential ADAR expression may contribute to differences in candidate RNA editing sites between tissues.

Figure 3.3: Relative ADAR transcript abundance between tissues. Expression was measured relative to the LD muscle sample used for sequencing. Using a one-way ANOVA, a significant effect of tissue on ADAR expression was detected (p<0.0001). Pairwise comparisons of tissue means using Tukey HSD show significant differences in ADAR expression between LD and liver (p<0.00001) and between LD and fat (p<0.003), but no significant difference between fat and liver (p=0.0505563).

3.3.4 Controlling for errors due to mapping quality

After imposing such strict restrictions as excluding genomic DNA and cDNA reads that had more than one recorded alignment and trimming the ends of reads pre-alignment, we wanted to assess how well such measures protect against mapping errors, which are among the leading causes of RNA editing misidentification when using short reads [76,79]. Mapping quality is a measurement that provides a probability that a read is misaligned, given its number of possible alignments and sum of base qualities for each alignment [80]. Knowing this, and under the assumption of no RNA editing, for each mismatch locus $i$ we computed the probability of observing at least 5 misaligned reads given the cDNA sequencing depth $N_i$ and average sample mapping quality $MQ_i$. Among all 8550 repetitive and non-repetitive mismatch positions, the maximal probability of observing at least 5 misaligned reads was $\sim 6.772 \times 10^{-15}$ for a site with $N = 13$ and average $MQ = 29$.
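The calculation described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the binomial model given in the Methods (section 3.5.5). The function names are hypothetical and are not part of editTools. The upper tail is summed directly, which is algebraically the same as one minus the lower tail but avoids floating-point cancellation when the probability is very small.

```python
from math import comb


def mapping_error_prob(mq: float) -> float:
    """Convert a Phred-scaled mapping quality into a misalignment probability."""
    return 10 ** (-mq / 10)


def prob_at_least_k_misaligned(n: int, mq: float, k: int = 5) -> float:
    """P(X >= k) for X ~ Binomial(n, p), with p derived from the mapping quality.

    n  : cDNA sequencing depth at the site (hypothetical input)
    mq : average Phred-scaled mapping quality at the site (hypothetical input)
    k  : minimum number of misaligned reads (5 in the text)
    """
    p = mapping_error_prob(mq)
    # Sum the upper tail j = k..n instead of computing 1 - P(X < k).
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the pig data set.
    print(prob_at_least_k_misaligned(n=20, mq=40))
```

Under this sketch, a site would be a concern only if the returned probability approached the significance threshold chosen for the scan.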
If Bonferroni correction is used then 0.05/189,638 = 6.23x10^-7 can be used as a threshold for transcriptome-wide significance, where 189,638 was the total number of queried cDNA positions with a sequencing depth of at least 5 cDNA reads that were at the location of homozygous loci in the genomic sequence. From this evidence we conclude that our pipeline sufficiently minimizes artifacts associated with mapping quality when using the Sus scrofa 10.2.69 assembly.

3.3.5 Pig editome functional implications

Little is known about the average effect of RNA editing transcriptome-wide. For humans, one prevailing hypothesis is that the exonization of Alu SINE elements is controlled in part by A-to-G editing. An instance of this mechanism has been demonstrated, where intronic A-to-G editing events contribute to alternative splicing of nuclear prelamin A so that an Alu element is included in an exon [72]. To explore the possibility that RNA editing in pigs targets introns to affect splicing, editTools was used to synthesize mismatch data with Variant Effect Predictor data to find the location of each mismatch relative to annotated transcripts. Consistent with what has been found in humans [32], nearly half of all detected A-to-G mismatches are located in retained introns (Fig 3.4). The remaining sites are concentrated in other non-coding regions including 3' UTRs, intergenic, and gene-proximal regions. While the majority of non-coding editing events in humans are attributed to the position and orientation of SINE elements within transcripts [70], coding RNA editing occurs rarely, usually outside repetitive elements but nevertheless site-specifically. It has been suggested that site-specificity of coding RNA editing events is facilitated by nearby SINE elements, which, through their induction of long dsRNA regions, recruit ADAR in sufficient density to affect coding regions in close proximity [75]. From our data, only 49 pig A-to-G mismatches were found within coding regions and of those, 34 would result in a missense variant (Table 3.1).
It can be noted that a number of amino acid changes resulting from verified macaque DNA-RNA mismatches [68] can be found among our pig dataset: mismatches that control I/V in COPA, Y/C in BLCAP, I/V in COG3, K/R in NEIL1, and Q/R in GRIA2. Interestingly, Y/C recoding of BLCAP via RNA editing has been associated with hepatocellular carcinoma (HCC) in humans, as HCC samples were shown to express edited BLCAP in significantly higher amounts than non-HCC samples [81]. Additionally, exon 6 K/R recoding of NEIL1 by RNA editing was previously thought to be primate-specific and attributed to the K/R site's proximity to Alu-dense regions [82]; however, we observe evidence of the same K/R recoding of exon 6 via an A-to-G editing event in pigs. If in fact SINE elements recruit ADAR to affect nearby coding regions, then our data suggest the remarkable conservation of NEIL1 K/R recoding across genomes with entirely different SINE elements.

Figure 3.4: A-to-G mismatch locations relative to the nearest annotated genes. Percentages shown are out of all A-to-G mismatches.

Table 3.1: A-to-G mismatches resulting in amino acid changes

Position       Gene symbol/ID       AA    SIFT                              Tissues
1:63408856     ENSSSCG00000029003   L/P   tolerated (1)                     Fat, LD, Liver
1:125424444    ENSSSCG00000024660   Q/R   tolerated (1)                     Fat, LD, Liver
2:12622576     LDHB                 I/M   tolerated (1)                     Fat, LD, Liver
2:49316285     ARNTL                K/E   tolerated low confidence (1)      Liver
4:98044799     COPA                 I/V   deleterious (0.02)                Fat
5:42375023     KRR1                 I/T   deleterious (0.01)                Liver
6:92516721     PTPRM                K/R   tolerated (1)                     Fat
6:146168578    NDC1                 E/G   deleterious (0.01)                Liver
7:62951442     NEIL1                K/R   deleterious (0.02)                Fat, LD
7:81602273     ENSSSCG00000002045   C/R   tolerated (1)                     Fat, LD, Liver
7:102789222    ACOT4                T/A   tolerated (0.61)                  Fat
7:129322238    RPS21                C/R   -                                 Fat, LD, Liver
8:28015971     ENSSSCG00000008767   H/R   tolerated (1)                     Fat, LD, Liver
8:31629014     TLR1                 I/V   tolerated (1)                     Liver
8:32309809     RPL9                 I/V   tolerated (0.4)                   Fat
8:32309814     RPL9                 E/G   deleterious (0.01)                Fat
8:48244993     GRIA2                Q/R   tolerated (0.07)                  Fat
9:41146365     ENSSSCG00000023913   Q/R   deleterious (0.04)                Fat
9:74510703     ENSSSCG00000015294   K/R   tolerated (0.13)                  Liver
9:83273454     SLC25A13             E/G   deleterious (0.02)                LD
11:22178068    COG3                 I/V   tolerated (1)                     Fat, LD, Liver
12:20231860    AOC3                 Q/R   tolerated (1)                     Liver
13:131377159   EIF2B5               Q/R   tolerated (1)                     Fat
13:156760971   UBE2B                D/G   tolerated (0.48)                  Fat, LD, Liver
13:206979572   SON                  R/G   -                                 Fat
14:40832826    PLBD2                R/G   tolerated low confidence (0.12)   Fat
14:52398588    IGLV-3               E/G   tolerated (0.05)                  Fat
14:59613334    LYST                 S/G   -                                 LD
14:81796679    OIT3                 S/G   tolerated (1)                     Liver
15:59811585    HNRNPA2B1            L/P   tolerated (0.35)                  Fat, LD, Liver
15:98217885    ENSSSCG00000028949   R/G   tolerated low confidence (1)      Fat, LD, Liver
16:29335640    ENSSSCG00000016869   N/D   tolerated (1)                     Fat, LD
16:42512978    ELOVL7               S/G   tolerated (1)                     Fat
17:46041505    BLCAP                Y/C   deleterious (0)                   Fat, Liver

3.3.6 Pig editome association with pig-specific SINE elements

Since properties of the primate Alu element are suggested to influence RNA editing in both coding and non-coding regions, one of our primary interests was to determine which SINE elements in pigs are capable of attracting the majority of ADAR activity. Again using the functionality of editTools, we merged our mismatch data with data from RepeatMasker to determine which repetitive regions contain putative RNA editing sites. As mentioned previously, 5993 out of 6410 A-to-G mismatches are located within the body of a repetitive element. Upon closer inspection, 5715 of the 5993 are within pig SINE elements as opposed to LINE elements and others (Fig 3.5A), although SINEs occupy just 11.4% of the swine genome, while LINEs occupy 17.5% [83]. Of the 5993 repetitive A-to-G mismatches, 58.8% are found within the Pre0_SS element, a SINE element of the PRE1 family (Fig 3.5B). Little is known about Pre0_SS, but among all elements of the PRE1 family, Pre0_SS is the most similar to the consensus PRE1 sequence. In many instances, Pre0_SS elements are >99% identical to one another, indicating that it is currently actively transposing in pigs [84]. Additional members of the PRE1 family contain A-to-G mismatches, although at a much lower frequency than Pre0_SS.

Figure 3.5: Distribution of repetitive A-to-G mismatches. The distribution is shown across major repetitive element families (A) and further broken down into specific repetitive element types (B). Percentages shown are out of all repetitive A-to-G mismatches.

3.4 Conclusions

While Alu elements enable substantial RNA editing among primate genomes, we show that non-Alu-bearing genomes can also utilize RNA editing as a means to achieve a similar result.
Our high-throughput scan suggests that pig transcriptomes are highly editable among PRE-1 SINE retrotransposons. PRE-1, an element derived from an ancestral tRNA, has similar features to the primate Alu, derived from an ancestral 7SL RNA: a copy number of 1x10^6, a consensus length of 246 bp, and very little diversity among such members as Pre0_SS. These features influence the secondary structure of the transcriptome, which in turn affects ADAR editable targets. Surprisingly, conservation of specific editing sites such as those in NEIL1 and BLCAP appears evident between humans and pigs. Therefore, we hypothesize that transcriptome secondary structure may be conserved among mammals enough to preserve particular RNA editing sites, and that SINE elements, regardless of origin, may conform to certain positions and orientations in order to allow conservation to occur.

By demonstrating that pig transcriptomes have potential to be highly edited, we propose that pigs may be a valuable model to understand the patterns of ADAR-controlled RNA editing. Additionally, by shedding light on the pig editome, we can begin to understand the extent to which this phenomenon enhances pig genetic variation. Such sources of variation may one day provide valuable explanatory power for a variety of traits of interest to both biomedical and agricultural communities.
3.5 Methods

3.5.1 Sequence data

From Michigan State University's pig resource population (MSUPRP), an F2 population resulting from crosses between 4 F0 Duroc sires and 15 F0 Pietrain dams [85], a single female animal was chosen for whole genome and transcriptome sequencing. Total RNA was extracted from subcutaneous fat, liver, and LD skeletal muscle using TRIzol, and a RIN greater than 7 was determined with the Agilent 2100 Bioanalyzer. cDNA libraries were made using the Illumina TruSeq Stranded mRNA Library Preparation Kit. Sequencing was performed using the Illumina HiSeq 2500 in Rapid Run mode with 150x2 paired-end reads. Base calling was done by Illumina's Real Time Analysis v1.18.61 and the output was converted to FastQ format with Illumina's Bcl2fastq v1.8.4. Genomic DNA was purified from white blood cells using the Invitrogen Purelink Genomic DNA Mini Kit and libraries were made using the Illumina TruSeq Nano DNA Library Preparation Kit HT. Sequencing of genomic DNA was done using the Illumina HiSeq 2500 in Rapid Run mode with 150x2 paired-end reads. Real Time Analysis v.1.17.21.3 and Bcl2fastq v1.8.4 were used for base calling and FastQ conversion, respectively. Read quality of both whole genome and RNA data was assessed using the FastQC program [86].
3.5.2 Sequence preparation and mapping

DNA reads from whole genome sequencing were trimmed for quality at the 3' end using Condetri v2.2 [87] with parameters: -sc=33 -minlen=75 and -pb=fq. Resulting mate 1, mate 2, and unpaired reads were mapped to Sus scrofa 10.2.69 using Bowtie v2.2.1 [88] with parameters: -p 7 -X 1000. In order to filter out DNA reads that had more than one recorded alignment, alignments containing the XS:i:<N> tag, which Bowtie 2 reports only when a read has a second-best alignment (N being that alignment's score), were removed. Strand-specific cDNA sequencing reads from each tissue sample were trimmed with Condetri with parameters: -sc=33 -minlen=75 -pb=fq -cutfirst=6 -pb=fq. Resulting paired and unpaired cDNA reads were then mapped to Sus scrofa 10.2.69 using TopHat v2.0.12 [89] with parameters: -p7400100. Filtering out cDNA reads that had more than one recorded alignment was done by selecting alignments with the NH:i:1 tag, while separating plus-strand transcript alignments from minus-strand alignments was done by selecting alignments possessing the XS:A:+ or XS:A:- tags, respectively. The resulting DNA and cDNA alignments are the data used in downstream variant calling and mismatch detection.

3.5.3 Variant calling and mismatch detection

We utilized variant calling software Samtools v1.0 and Bcftools v1.2 to jointly call variants among DNA and cDNA reads from plus-strand transcripts using:

samtools mpileup -f <reference> -C50 -E -Q25 -ug -t DP,DV,SP <DNA.bam> <plus-strand cDNA BAM, one per tissue>

where <DNA.bam> includes all filtered DNA alignments, and the remaining BAM files are filtered cDNA alignments from plus-strand transcripts. Likewise, DNA and cDNA reads from minus-strand transcripts were processed similarly with:

samtools mpileup -f <reference> -C50 -E -Q25 -ug -t DP,DV,SP <DNA.bam> ... <LD_minusstrand.bam>.
Note that the parameter -t DP,DV,SP is required for downstream mismatch detection with editTools. Samtools output from each command was piped into bcftools with additional parameters: v. These steps produce two VCF files that are simultaneously processed with find_edits(), a function within editTools available at https://github.com/funkhou9/editTools. By default, find_edits() scans each variant site to search for candidate RNA editing sites according to the five criteria required for sufficient evidence (see Results and Discussion). Most figures in this report were generated using editTools plotting methods, which utilize the ggplot2 R package [90].

3.5.4 Quantitative real-time PCR

Total RNA was isolated from liver, LD skeletal muscle, and subcutaneous fat tissues from 34 MSUPRP pigs, including the pig chosen for sequencing, using TRIzol reagent (Ambion) according to the manufacturer's instructions. Concentrations were measured using a NanoDrop spectrophotometer (Thermo Scientific), and quality and integrity were determined using an Agilent 2100 Bioanalyzer (Agilent Technologies, Inc.). Total RNA was reverse transcribed using random primers with the High Capacity cDNA Reverse Transcription Kit with RNase Inhibitor (Applied Biosystems) according to the manufacturer's instructions. A pig ADAR Custom TaqMan Gene Expression assay was designed using the online Custom TaqMan Assay Design Tool (Thermo Fisher Scientific). The assay was designed to span exons 2-3 of the pig ADAR gene (Accession No. NC_010446.4). Assays were performed in triplicate using 50 ng cDNA and the TaqMan Gene Expression Master Mix (20 µl final volume per reaction) in a StepOnePlus Real-Time PCR System (Applied Biosystems). Cycling conditions were 52 °C for 2 min and 95 °C for 10 min, followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min. Relative expression values were obtained using the 2^-ΔΔCT method, with the muscle sample used for sequencing as a calibrator and Ubiquitin C as a reference gene (Applied Biosystems Assay No. Ss03374343_g1). Differential ADAR expression was tested by one-way ANOVA (main effect of tissue on ADAR expression) and Tukey HSD (pairwise comparisons of tissue means).
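As a brief illustration of the 2^-ΔΔCT calculation named above, the following Python sketch computes the expression of a target gene relative to a calibrator sample after normalizing to a reference gene. The function name, argument names, and CT values are hypothetical and chosen only for illustration; only the formula follows the standard method.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        ct_target_cal: float, ct_reference_cal: float) -> float:
    """Relative expression by the 2^-ddCT method.

    ct_target, ct_reference         : CT values of the target and reference
                                      genes in the sample of interest
    ct_target_cal, ct_reference_cal : the same two CT values measured in the
                                      calibrator sample
    """
    delta_ct_sample = ct_target - ct_reference              # normalize sample
    delta_ct_calibrator = ct_target_cal - ct_reference_cal  # normalize calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical CT values: the target crosses threshold one cycle earlier
    # in the sample than in the calibrator, with equal reference-gene CTs.
    print(relative_expression(24.0, 20.0, 25.0, 20.0))
```

In this scheme the calibrator itself always has a relative expression of 1, so values below 1 indicate lower normalized expression than the calibrator and values above 1 indicate higher.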
3.5.5 Calculating probability of mapping error

The average Phred-scaled mapping quality $MQ$ across all samples at mismatch site $i$ is provided by SAMTools output. From $MQ$ we can compute the probability of mapping error $p$ according to:

$$ p_i = 10^{-MQ_i / 10} $$

It follows that the probability of observing at least 5 misaligned reads at a homozygous site with a cDNA sequencing depth of $N$, assuming no RNA editing, can be modeled using the binomial distribution, where:

$$ P(X \geq 5 \mid N, p) = 1 - P(X < 5) = 1 - \sum_{j=0}^{4} \binom{N}{j} p^{j} (1 - p)^{N - j} $$

3.5.6 Incorporating RepeatMasker and Variant Effect Predictor data using editTools

The editTools function add_repeatmask() was used to merge a mismatch data object (generated with find_edits()) with susScr3, a RepeatMasker dataset available for download at: http://www.repeatmasker.org/species/susScr.html. This function utilizes a binary search algorithm implemented in C++ to process large RepeatMasker files efficiently. The function write_vep() was used to generate Variant Effect Predictor input from a mismatch data object. The output of Variant Effect Predictor was merged with the mismatch data object using add_vep(). Additional documentation for find_edits(), write_vep(), add_vep(), and add_repeatmask() is available within editTools v2.1.

3.6 Authors' Contributions

Conceived and designed the study: CWE. Contributed samples from the MSU pig resource population: ROB, CWE, NER. Isolated RNA and DNA: NER. Developed software and analysis pipeline: SAF, JPS. Performed qPCR assays and analysis: DS, NER, SAF. Wrote the manuscript: SAF. All authors read and approved the final manuscript.

3.7 Acknowledgements

Computing resources were provided by Michigan State University's High Performance Computing Cluster. Sequencing was performed at the Michigan State University Research Technology Support Facility.
CHAPTER 4

ESTIMATING THE COHERITABILITY BETWEEN SITE-SPECIFIC RNA EDITING AND ECONOMICALLY IMPORTANT TRAITS IN PIGS

4.1 Abstract

The highly conserved post-transcriptional mechanism known as adenosine to inosine (A-to-I) RNA editing impacts gene function by converting adenosine to inosine molecules within specific regions of the transcriptome. The degree to which specific sites are edited, the so-called "editing level", has been observed to vary within populations and can be considered a molecular quantitative trait hypothesized to influence higher-order phenotypes. Here we utilized 940 F2 animals and a combination of univariate and bivariate mixed models to study the shared genetic contributions to RNA editing activity in longissimus dorsi muscle tissue and economically important pig traits. We identified five RNA editing sites across four genes whose editing level variation was significantly attributed to the additive effects of all observed SNP markers (estimated genomic heritability $\hat{h}^2_g$ = 0.31 to 0.56; p-value = 8.2x10^-5 to 8.8x10^-4). Using a multi-polygenic model to localize genomic heritability estimates to a region of interest, across all five editing sites we found suggestive evidence that a portion of the genomic heritability can be attributed to SNPs flanking ADAR. When using bivariate models to estimate local genetic correlations between site-specific editing levels and 67 complex traits, we found nominally significant evidence that the ADAR locus contributes to a negative relationship between editing activity and phenotypically related growth traits including average daily gain (local genetic correlation $\hat{\rho}_{g_{local}}$ [SE] = -0.87 [0.16]; p-value = 0.029). This work suggests potential pleiotropy between RNA editing activity and complex growth traits in pigs and encourages further use of mixed models to determine if RNA editing can link genetic variation to complex trait variation.
4.2 Introduction

According to popular and prevailing theory, quantitative trait loci (QTL) influence complex traits largely by influencing gene expression. Functional geneticists have investigated this theory primarily by performing genome-wide associations (GWA) to find QTL that associate with transcript abundance, otherwise known as expression QTL or eQTL [91,92,93]. Gene expression, however, involves a complex array of processes beyond upregulating and downregulating transcript abundance. For instance, RNA splicing has been shown to be influenced by genetic effects and can explain a substantial part of complex trait and disease risk variation [29,94].

Still, additional forms of gene expression are continually being evaluated for their ability to link genetic variation to higher-order trait variation. A highly conserved post-transcriptional mechanism known as adenosine to inosine (A-to-I) RNA editing regulates gene expression by converting adenosine to inosine molecules within pre-mRNA transcripts [31], a process catalyzed by adenosine deaminase acting on RNA (ADAR). At numerous edited sites in the transcriptome, the proportion of transcripts containing the edited inosine variant, the so-called "editing level", has been shown to vary between individuals in a population [35]. In this way, site-specific RNA editing activity has been considered a heritable quantitative trait; numerous studies have performed GWA to identify editing QTL (edQTL) in such species as humans, mice, Drosophila, cattle, and pigs [95,96,97,98,99,36], with a general consensus across species and populations that genome-wide significant edQTL routinely co-localize with the RNA editing site they are associated with.
While the strongest edQTL signals mostly appear cis-acting, the degree to which additive genetic effects (genome-wide) influence editing variation remains largely unknown. Similarly, we lack an understanding of how similarity (or covariance) in RNA editing activity and complex traits may be attributed to shared additive genetic sources. Here, we will refer to the proportion of variation attributable to the additive effects of all single-nucleotide polymorphisms (SNPs) as "genomic heritability" ($h^2_g$). Likewise, we will refer to the component of covariance related to additive effects of SNPs as "genomic covariance", or what is sometimes referred to as coheritability [100]. Indeed, if RNA editing activity and complex traits possess a genomic covariance, this could indicate pleiotropic effects between the two and further the hypothesis that RNA editing can serve as a direct link from genetic variation to complex trait variation.

In this study, we have utilized animals from Michigan State University's Pig Resource Population (MSUPRP) [85] to quantify genetic contributions to RNA editing activity in longissimus dorsi muscle tissue. We utilize a combination of univariate and bivariate polygenic models to estimate the genomic heritability of site-specific RNA editing activity and estimate the genomic covariance between site-specific editing and economically important pig traits. We further decompose genomic heritability and genomic covariance estimates into local regions of interest, namely the ADAR locus, to infer how such regions may affect both RNA editing activity and higher-order traits. Using a sample of highly heritable RNA editing sites, we find suggestive evidence that SNPs near ADAR influence both RNA editing activity and complex growth traits, which encourages further study.
4.3 Results

4.3.1 Heritable RNA editing activity impacts pig longissimus dorsi muscle gene expression

To study RNA editing activity in pig longissimus dorsi (LD) muscle tissue, we utilized a "discovery cohort" consisting of three adult pigs, each with whole-genome sequencing (WGS) and LD RNA-sequencing (RNASeq), to identify high-confidence RNA editing sites that were detectable across multiple animals. We followed a standard procedure outlined previously [62] to identify DNA-to-RNA mismatches, resulting in 104 A-to-G mismatches (indicative of potential A-to-I RNA editing activity) detectable in at least two of the three discovery cohort animals.

We then utilized an "analysis cohort" consisting of a subset of animals from Michigan State University's Pig Resource Population (MSUPRP), each with LD RNAseq (N=168); for each animal we estimated editing levels at each of the 104 sites, where we define the estimated editing level to be the ratio of the number of apparently edited reads (containing G) over the total number of reads. Unsurprisingly, only 47/104 sites showed evidence that the editing level could be normally distributed across pigs, using a low-threshold Shapiro-Wilk test (p-value ≥ 1x10^-10). This may reflect that while RNA editing activity may be allowed to vary in the population at some sites, other sites show much more constraint, as shown previously [101].

For each of the 47 RNA editing sites that showed variation in editing activity among the MSUPRP, we fit a restricted maximum likelihood (REML)-based univariate genomic best linear unbiased predictor (GBLUP) model (see Methods) to decompose editing level variance into genomic and residual components. Exactly five sites showed a significant genomic variance component using a likelihood ratio test (LRT) at a Bonferroni-corrected threshold of 0.05/47 = 0.001 (Table 4.1). Variance component estimates and sample sizes are shown for all 47 editing sites in Table B.1.

Table 4.1: RNA editing sites exhibiting heritable variability in longissimus dorsi muscle tissue

Site                Strand  Gene (a)   Location (a)        N (b)  Genomic variance (SE) (c)  Residual variance (SE) (d)  h2_g   p-value (e)
Chr1:126,167,425    -       BLOC1S6    3'-UTR              165    0.31 (0.16)                0.69 (0.14)                 0.31   8.8x10^-4
Chr6:39,368,241     -       UQCRFS1    intron              166    0.41 (0.18)                0.60 (0.14)                 0.41   1.6x10^-4
Chr15:110,910,484   +       CCNYL1     3'-UTR              166    0.39 (0.17)                0.61 (0.13)                 0.39   1.7x10^-4
Chr16:26,512,555    -       OXCT1      intron/3'-UTR (f)   139    0.58 (0.22)                0.45 (0.15)                 0.56   8.2x10^-5
Chr16:26,512,645    -       OXCT1      intron/3'-UTR (f)   159    0.34 (0.17)                0.65 (0.14)                 0.34   2.3x10^-4

(a) Gene annotation and editing site location provided by Ensembl 95 predictions.
(b) Sample size (the number of animals with a detectable editing level).
(c) Genomic variance component $\hat{\sigma}^2_g$. REML estimate (standard error).
(d) Residual variance component $\hat{\sigma}^2_\varepsilon$. REML estimate (standard error).
(e) p-value from a likelihood ratio test, testing $H_0: \sigma^2_g = 0$.
(f) Multiple predicted isoforms present at editing site.

4.3.2 Genetic variants near ADAR are suspected to contribute to editing level variation across sites

For each heritable RNA editing site (Table 4.1), we sought to identify SNPs strongly associated with editing levels using mixed-model GWA methods that control for kinship among the F2 animals (see section 4.5). We identified cis-acting genome-wide significant signals (p-value ≤ 1x10^-5; 5% estimated FDR) for both editing sites within OXCT1, but surprisingly no cis-acting genome-wide significant signals were detected for the remaining three RNA editing sites (Fig 4.1A). Curiously, we observed suggestive GWA peaks (peak p-values from 3x10^-4 down to the order of 10^-5) near ADAR (otherwise known as ADAR1 in humans) for all five editing sites.

Figure 4.1: GWA for site-specific editing levels. (A) Manhattan plot for each RNA editing site. Red line indicates estimated FDR of 10%. Each facet corresponds to a different RNA editing site. For each facet, blue dashed lines indicate position of ADAR on chromosome 4 and position of the editing site. (B) Pairwise LD plot between 26 SNPs selected flanking ADAR. The SNP nearest ADAR is marked with a blue asterisk.

To investigate the suggestive ADAR-localized edQTL signals, pairwise linkage-disequilibrium estimates (R^2) were obtained for a 1 MB region surrounding ADAR (Fig 4.1B). This utilized genotypes at 26 SNPs for all genotyped MSUPRP animals (N=1015), where pairwise two-SNP haplotype frequencies used in R^2 calculations were estimated using maximum likelihood [102].
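The pairwise R^2 statistic can be illustrated with a small Python sketch. It assumes the two-SNP haplotype frequency has already been estimated, so the maximum-likelihood estimation step itself is not shown. The function name and the example frequencies are hypothetical and are not taken from the MSUPRP data.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between two biallelic SNPs.

    p_ab : estimated frequency of the haplotype carrying allele A at the
           first SNP and allele B at the second SNP
    p_a  : frequency of allele A at the first SNP
    p_b  : frequency of allele B at the second SNP
    """
    d = p_ab - p_a * p_b  # linkage-disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical frequencies for two SNPs in partial linkage disequilibrium.
    print(ld_r_squared(p_ab=0.40, p_a=0.50, p_b=0.50))
```

Values near 1 indicate that one marker is a good proxy for the other, while values near 0 indicate that the two markers carry nearly independent information.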
Intriguingly, the SNP nearest ADAR (although not within ADAR), H3GA0013586, is in relatively poor linkage disequilibrium with other SNPs in the ~1 Mb region (pairwise $R^2$ with H3GA0013586 < 0.6). Longer-range linkage disequilibrium with H3GA0013586 was observed to drop beyond this 1 Mb region (Fig B.1). Hypothetically, if causal variants within ADAR contribute to variance in RNA editing activity across sites, this suggests that a weak edQTL signal at ADAR could reflect poor linkage disequilibrium between ADAR causal variants and SNP markers.

For each RNA editing site we sought to quantify the proportion of genomic variance in editing activity explained by SNPs flanking ADAR, as well as SNPs flanking OXCT1. We defined our region of interest flanking ADAR to be the 27 SNPs identified previously (Fig 4.1B), and similarly defined our region of interest flanking OXCT1 to be 25 SNPs within 500 Kb on either side of the chr16:26512555 editing site. As before, we modeled RNA editing levels using a REML-based univariate GBLUP model, but used a "local" polygenic random effect (the aggregate effect of SNPs within a region of interest) and a "background" polygenic random effect (the aggregate effect due to all SNPs other than the SNPs of interest) (see section 4.5). With this model, we decomposed the genomic variance into local and background components and used an LRT to determine whether inclusion of the local polygenic effect fits the data any better than omitting it (Table 4.2). We observed suggestive evidence that the aggregate effect of SNPs near ADAR contributes to editing level variation (LRT $p$-values: $9.12 \times 10^{-5}$ to $1.16 \times 10^{-2}$), but were unable to confidently quantify the proportion of genomic variance it explains due to uncertainty in the local and background genomic variance estimates. Our best estimates showed that roughly 50% of the genomic variance in RNA editing activity for the BLOC1S6, UQCRFS1, and CCNYL1 sites can be explained by SNPs flanking ADAR, while that estimate reduces to roughly 20% for OXCT1 editing activity.

Table 4.2: Proportion of editing level genomic variance explained by SNPs flanking ADAR and OXCT1

| Local region | Editing site (gene) | $\hat{\sigma}^2_{g_{local}}$ (SE) (a) | $\hat{\sigma}^2_{g_{BG}}$ (SE) (b) | $\hat{\sigma}^2_{g_{local}} / \hat{\sigma}^2_g$ (c) | $p$-value (d) |
|---|---|---|---|---|---|
| ADAR | chr1:126167425 (BLOC1S6) | 0.121 (0.104) | 0.125 (0.118) | 0.49 | $8.02 \times 10^{-4}$ |
| ADAR | chr6:39368241 (UQCRFS1) | 0.192 (0.145) | 0.186 (0.129) | 0.51 | $1.70 \times 10^{-4}$ |
| ADAR | chr15:110910484 (CCNYL1) | 0.141 (0.115) | 0.167 (0.124) | 0.46 | $9.12 \times 10^{-5}$ |
| ADAR | chr16:26512555 (OXCT1) | 0.088 (0.091) | 0.342 (0.176) | 0.21 | $1.23 \times 10^{-3}$ |
| ADAR | chr16:26512645 (OXCT1) | 0.076 (0.080) | 0.254 (0.151) | 0.23 | $1.16 \times 10^{-2}$ |
| OXCT1 | chr1:126167425 (BLOC1S6) | 0.008 (0.026) | 0.313 (0.162) | 0.03 | $3.40 \times 10^{-1}$ |
| OXCT1 | chr6:39368241 (UQCRFS1) | 0.005 (0.022) | 0.412 (0.177) | 0.01 | $3.80 \times 10^{-1}$ |
| OXCT1 | chr15:110910484 (CCNYL1) | 0.000 (0.016) | 0.387 (0.170) | 0.00 | $5.00 \times 10^{-1}$ |
| OXCT1 | chr16:26512555 (OXCT1) | 0.885 (0.541) | 0.256 (0.125) | 0.78 | $7.26 \times 10^{-14}$ |
| OXCT1 | chr16:26512645 (OXCT1) | 0.338 (0.253) | 0.251 (0.141) | 0.57 | $1.37 \times 10^{-5}$ |

(a) The local genomic variance REML estimate (standard error).
(b) The background genomic variance REML estimate (standard error).
(c) The ratio $\hat{\sigma}^2_{g_{local}} / (\hat{\sigma}^2_{g_{local}} + \hat{\sigma}^2_{g_{BG}})$.
(d) $p$-value testing $H_0: \sigma^2_{g_{local}} = 0$.

4.3.3 Suggestive evidence for a shared genetic architecture between RNA editing activity and complex traits

Given that variation in RNA editing activity across sites is potentially attributable to SNPs flanking ADAR, we sought to infer whether ADAR-flanking SNPs also contribute to variation in complex traits. Using the same two-polygenic univariate model as before, and the full MSUPRP with phenotype and SNP genotype data (N = 940), we tested whether variance among 67 growth, meat quality, and carcass composition (GMQCC) traits could be attributed to the 27 SNPs flanking ADAR. Surprisingly, 15/67 traits (more than expected by chance) showed nominal evidence ($p$-value $< 0.05$) that ADAR-flanking SNPs contribute to their variance (Table B.2), with average daily gain (ADG) possessing the strongest evidence ($\hat{\sigma}^2_{g_{local}}$ [SE]: 0.14 [0.08]; $\hat{\sigma}^2_{g_{BG}}$ [SE]: 0.27 [0.05]; $p$-value $= 2.9 \times 10^{-4}$). In contrast, only 1/67 GMQCC traits showed nominal evidence that OXCT1-flanking SNPs contribute to their variance.

We next utilized a bivariate model to decompose the covariance between site-specific editing activity and GMQCC traits into genomic and residual components (see section 4.5) (Table 4.3).
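The local proportions reported in Table 4.2 follow directly from the two REML variance component estimates. The short Python sketch below recomputes them for the ADAR region; the function name and the dictionary layout are illustrative assumptions, and the inputs are the rounded estimates printed in Table 4.2, so the results agree with the published ratios only up to rounding.

```python
def local_variance_proportion(var_local: float, var_background: float) -> float:
    """Share of genomic variance assigned to the local SNP set:
    var_local / (var_local + var_background)."""
    total = var_local + var_background
    if total <= 0:
        raise ValueError("total genomic variance must be positive")
    return var_local / total


# (local, background) REML estimates for ADAR-flanking SNPs, copied from Table 4.2
adar_estimates = {
    "chr1:126167425 (BLOC1S6)": (0.121, 0.125),
    "chr6:39368241 (UQCRFS1)": (0.192, 0.186),
    "chr15:110910484 (CCNYL1)": (0.141, 0.167),
    "chr16:26512555 (OXCT1)": (0.088, 0.342),
    "chr16:26512645 (OXCT1)": (0.076, 0.254),
}

proportions = {
    site: local_variance_proportion(local, background)
    for site, (local, background) in adar_estimates.items()
}

for site, value in proportions.items():
    print(f"{site}: {value:.2f}")
```

Because each proportion is a ratio of two imprecise estimates (note the standard errors in Table 4.2), it is best read as a rough guide rather than a precise quantity.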
All animals with RNA editing records ($n_1 \sim 168$) were among the larger set of MSUPRP animals with GMQCC records ($n_2 \sim 940$), enabling us to model the residual covariance between editing activity and GMQCC traits. For each of the five RNA editing sites, we inferred the genomic covariance between editing levels and 67 GMQCC traits, totaling 335 tests. We observed no significant genomic covariances after multiple test correction ($p$-value $< 0.05/335 = 1.5 \times 10^{-4}$), but observed that $p$-values begin deviating from what is expected at around $p$-value = 0.1 (Fig 4.2). This provides subtle evidence that genomic covariances between RNA editing activity and complex traits exist, but perhaps at a magnitude that we are underpowered to detect, either due to insufficient sample size or imperfect linkage disequilibrium between SNPs and causal variants. The most significant genomic covariances estimated were between longissimus muscle moisture content (moisture) and the chr16:26,512,645 edited site within OXCT1 ($\hat{\rho}_g$ [SE] = -0.70 [0.20]; $p$-value = 0.004), and between 45-min carcass temperature (temp_45m) and the chr15:110,910,484 edited site within CCNYL1 ($\hat{\rho}_g$ [SE] = 0.70 [0.20]; $p$-value = 0.001). A full list of all genomic covariances estimated is shown in Table B.3.

Figure 4.2: Quantile-quantile plot testing for genome-wide genomic covariances between site-specific editing levels and complex traits
Interestingly, ADG, which showed the strongest evidence of local genomic variance attributable to ADAR-flanking SNPs, showed modest evidence of a genomic covariance with editing activity at chr15:110,910,484, given an LRT $p$-value of 0.1 (Table B.3). Given that genomic covariances provide no information about where in the genome genetic effects are shared between traits, we sought to infer local genomic covariances attributable to ADAR-flanking SNPs using a two-polygenic bivariate model (see section 4.5). As before, when considering multiple tests, we observed no significant local genomic covariances attributable to ADAR-flanking SNPs. As expected, however, ADG, along with phenotypically related growth traits, showed the strongest evidence of an ADAR-localized genomic covariance with editing activity (Table 4.4). All our top ADAR-localized genomic covariance estimates were negative (except for Days to 105 kg); if these are true positive signals, it suggests genetic variants near ADAR contribute to a negative relationship between RNA editing activity (particularly at chr15:110910484) and growth traits such as average daily gain, body weight at 22 weeks, empty body lipid, and total body fat tissue. Even if this is true, additional work would be needed to determine whether ADAR variants exhibit pleiotropic effects between RNA editing activity and growth traits, or whether separate causal variants affecting editing activity and growth traits are simply in linkage disequilibrium.

Table 4.3: Top genomic covariance estimates between site-specific RNA editing levels and growth, meat quality, and carcass composition traits (a)

| Editing site (gene) | Trait (b) | $\hat{\rho}_g$ (SE) (c) | $\hat{\sigma}_{g_1 g_2}$ (d) | $\hat{\sigma}_{\varepsilon_1 \varepsilon_2}$ (e) | $\hat{\sigma}_{p_1 p_2}$ (f) | $p$-value (g) |
|---|---|---|---|---|---|---|
| chr1:126,167,425 (BLOC1S6) | b | 0.63 (0.21) | 0.19 | -0.04 | 0.16 | 0.006 |
| chr1:126,167,425 (BLOC1S6) | temp_45m | 0.67 (0.22) | 0.17 | -0.02 | 0.15 | 0.015 |
| chr1:126,167,425 (BLOC1S6) | moisture | -0.48 (0.23) | -0.14 | 0.12 | -0.02 | 0.050 |
| chr15:110,910,484 (CCNYL1) | temp_45m | 0.70 (0.20) | 0.19 | -0.21 | -0.02 | 0.001 |
| chr15:110,910,484 (CCNYL1) | conn_tiss | -0.55 (0.23) | -0.14 | 0.05 | -0.08 | 0.020 |
| chr15:110,910,484 (CCNYL1) | fftoln | -0.51 (0.19) | -0.18 | 0.09 | -0.09 | 0.023 |
| chr16:26,512,555 (OXCT1) | moisture | -0.52 (0.18) | -0.22 | 0.03 | -0.19 | 0.011 |
| chr16:26,512,555 (OXCT1) | picnic | 0.62 (0.16) | 0.33 | -0.26 | 0.06 | 0.024 |
| chr16:26,512,555 (OXCT1) | color | -0.47 (0.19) | -0.18 | -0.02 | -0.19 | 0.026 |
| chr16:26,512,645 (OXCT1) | moisture | -0.70 (0.20) | -0.22 | 0.05 | -0.17 | 0.004 |
| chr16:26,512,645 (OXCT1) | lrf_22wk | 0.53 (0.21) | 0.18 | -0.01 | 0.17 | 0.012 |
| chr16:26,512,645 (OXCT1) | fat | 0.52 (0.19) | 0.21 | 0.05 | 0.26 | 0.018 |
| chr6:39,368,241 (UQCRFS1) | boston | -0.59 (0.14) | -0.32 | -0.09 | -0.40 | 0.013 |
| chr6:39,368,241 (UQCRFS1) | temp_24h | -0.48 (0.20) | -0.14 | -0.34 | -0.48 | 0.060 |
| chr6:39,368,241 (UQCRFS1) | wt_3wk | -0.69 (0.31) | -0.09 | 0.04 | -0.05 | 0.060 |

(a) Shown are the top three genomic covariance estimates (ranked by $p$-value) for each RNA editing site.
(b) b = b* objective color; temp_45m = carcass temperature at 45 min; conn_tiss = connective tissue sensory panel analysis; fftoln = fat-free total lean tissue; picnic = picnic shoulder cut weight; color = subjective color; lrf_22wk = last rib backfat at 22 weeks; fat = fat percentage; boston = boston shoulder cut weight; wt_3wk = body weight at 3 weeks.
(c) Genetic correlation estimate (standard error), where $\rho_g = \sigma_{g_1 g_2} / \sqrt{\sigma^2_{g_1} \sigma^2_{g_2}}$.
(d) Genomic covariance REML estimate.
(e) Residual covariance REML estimate.
(f) Phenotypic covariance.
(g) $p$-value testing $H_0: \sigma_{g_1 g_2} = 0$.

4.4 Discussion

To date, few RNA editing studies have estimated genome-wide parameters such as heritability; one exception is a previous study that evaluated regulation of 5-HT2C receptor editing activity [103]. Heritability estimates are valuable in that they estimate the degree to which the aggregate effect of all causal variants (including those with relatively weak effects) influences phenotypes such as RNA editing levels. In comparison, traditional GWA techniques, including those used to identify edQTL, are known to be underpowered such that only relatively large-effect loci are detectable [104].
Consistent with previous studies [95, 96, 97, 98, 99, 36], among our sample of five significantly heritable RNA editing sites we observe the strongest genetic contributions to editing activity to be cis-acting. However, only editing sites within OXCT1 were found to have genome-wide significant cis-acting edQTL, suggesting the remaining three sites are under alternative genetic control. We further provide suggestive evidence that variants near ADAR may be contributing a proportion of editing level variation across multiple editing sites, using models that decompose the aggregate effect of all causal variants to a region of interest.

Table 4.4: Top local genomic covariance estimates attributable to ADAR-flanking SNPs between site-specific RNA editing and growth, meat quality, and carcass composition traits (a)

| Editing site (gene) | Trait (b) | Trait $\hat{h}^2_{g_{local}}$ (c) | $\hat{\rho}_{g_{local}}$ (SE) (d) | $\hat{\sigma}_{g_1 g_2\,local}$ (e) | $p$-value (f) |
|---|---|---|---|---|---|
| chr15:110910484 (CCNYL1) | ADG | 0.14 | -0.87 (0.16) | -0.17 | 0.029 |
| chr15:110910484 (CCNYL1) | wt_22wk | 0.06 | -0.86 (0.21) | -0.09 | 0.046 |
| chr15:110910484 (CCNYL1) | Days | 0.09 | 0.75 (0.27) | 0.09 | 0.052 |
| chr15:110910484 (CCNYL1) | mtfat | 0.04 | -0.88 (0.22) | -0.07 | 0.054 |
| chr15:110910484 (CCNYL1) | tofat | 0.09 | -0.79 (0.24) | -0.10 | 0.061 |
| chr15:110910484 (CCNYL1) | fftoln | 0.03 | -0.86 (0.24) | -0.07 | 0.064 |
| chr15:110910484 (CCNYL1) | mtpro | 0.05 | -0.78 (0.27) | -0.07 | 0.070 |
| chr6:39368241 (UQCRFS1) | Days | 0.08 | 0.65 (0.33) | 0.08 | 0.146 |
| chr6:39368241 (UQCRFS1) | wt_22wk | 0.06 | -0.73 (0.28) | -0.09 | 0.146 |
| chr15:110910484 (CCNYL1) | bf10_22wk | 0.05 | -0.67 (0.35) | -0.06 | 0.149 |

(a) Top 10 local genomic covariance estimates, ranked by $p$-value.
(b) ADG = average daily gain; wt_22wk = body weight at 22 weeks; Days = Days to 105 kg; mtfat = empty body lipid; tofat = total body fat tissue; fftoln = fat-free total lean tissue; mtpro = empty body protein; bf10_22wk = 10th rib backfat at 22 weeks.
(c) Estimated genomic heritability of the trait, localized to ADAR-flanking SNPs, as estimated from a two-polygenic bivariate model.
(d) Local genetic correlation (standard error).
(e) Local genomic covariance REML estimate.
(f) $p$-value testing $H_0: \sigma_{g_1 g_2\,local} = 0$.
Genomic covariances provide a high-level understanding of whether two traits are under similar genetic control. Under the cascading hypothesis that genetic variation (A) contributes to editing level variation (B), which contributes to complex trait variation (C) (A → B → C; an example of "vertical pleiotropy" [100]), it is required that the genomic covariance between editing levels and complex traits be non-null (this assumes sufficient linkage disequilibrium between SNP markers and causal variants). In pursuit of such evidence, we were unable to infer non-null genomic covariances among 335 pairwise tests (5 RNA editing sites by 67 complex traits), but observed subtle evidence that genomic covariance $p$-values begin deviating from what is expected at around $p$-value = 0.1. This could indicate small-magnitude genomic covariances between RNA editing activity and complex traits, possibly attenuated by imperfect linkage disequilibrium between SNP markers and causal variants. We further localize genomic covariances to SNPs flanking ADAR to find suggestive evidence that the ADAR locus contributes to a negative relationship between RNA editing activity (particularly at the chr15:110910484 editing site) and numerous phenotypically related growth traits (Table 4.4). While many of our top genomic covariance signals did not correspond with top ADAR-localized genomic covariance signals (comparing Tables 4.3 and 4.4), this is not unusual given that genomic covariances reflect co-localization of causal variants on average across the whole genome. Curiously, we noticed that several editing-trait pairs with top ADAR-localized genomic covariance evidence had genome-wide genomic covariance $p$-values that deviated from what was expected (Table B.3; Figure 4.2). For example, ADG and the editing level at chr15:110910484 had a genome-wide genomic covariance LRT $p$-value of 0.1, while fat-free total lean tissue (fftoln) and chr15:110910484 had an LRT $p$-value of 0.02.
Despite finding suggestive evidence that variants near ADAR may influence both RNA editing activity and numerous growth traits, our inability to obtain significant evidence for this hypothesis after multiple test correction warrants further study; a larger sample size is needed to obtain sufficient statistical power and more precise estimates of local genomic variances and covariances. We also note that our analysis of RNA editing activity was limited to five sites. As suggested by an earlier study [101], perhaps relatively few RNA editing sites show considerable variation in editing levels from individual to individual, and perhaps fewer still exhibit variation attributable to genetic variation. Still, our ability to detect heritable RNA editing sites in LD muscle tissue was likely affected by two major factors: 1) skeletal muscle tissue is known to exhibit relatively low RNA editing activity [35], and 2) the RNA sequencing depth of our analysis cohort (N = 168) was relatively modest (~63M reads for 24/168 animals and ~23M reads for the remaining 144 animals). This second point meant we were limited to surveying edited genes that are relatively highly expressed in LD muscle tissue. Another limitation affecting this study was that genotypes at edited sites within the analysis cohort were unobserved. However, for each of the five sites considered, it is unlikely that variation in the editing level (proportion of reads containing a G) simply reflected allele content variation at a SNP because 1) all five sites possessed suggestive edQTL signals at ADAR and 2) editing level genomic heritability estimates were relatively low (< 0.6); by definition, the heritability of allele content at a SNP is exactly one [105].

In this study we have utilized a combination of univariate and bivariate mixed models to decompose variation in RNA editing activity and variation in complex traits into shared genetic sources. We suggest that this approach can improve our understanding of RNA editing regulation and of the degree to which genetically regulated RNA editing activity could influence complex traits. As data from more animals and more tissues become available, we foresee the answer to whether RNA editing can directly link genetic variation to complex trait variation becoming clearer.
4.5 Materials and Methods

4.5.1 Sequencing data

To discover candidate RNA editing sites for downstream genetic analysis, a "discovery cohort" consisting of three adult animals with matched whole-genome sequencing and RNA sequencing from LD muscle tissue was used. Two animals were Yorkshire pigs from the Functional Annotation of Animal Genomes Project (FAANG; https://www.animalgenome.org/community/FAANG/) and the third was an $F_2$ pig from Michigan State University's Pig Resource Population (MSUPRP), originating from four $F_0$ Duroc sires and 15 $F_0$ Pietrain dams [85].

Whole-genome sequencing of the two FAANG animals was done using 100 bp paired-end reads; one animal was sequenced at a depth of ~489M reads and the other at ~564M reads. Whole-genome sequencing of the $F_2$ pig was done using 150 bp paired-end reads, totaling ~249M reads. LD muscle RNA sequencing from the FAANG animals was done using 100 bp paired-end strand-specific reads, with one animal sequenced at a depth of ~56M cDNA reads and the other at ~42M cDNA reads. Finally, LD RNAseq from the $F_2$ pig consisted of ~104M 150 bp paired-end, strand-specific cDNA reads. More details regarding library preparation and sequencing of the $F_2$ animal can be found in Funkhouser et al. [62].

At each candidate RNA editing site, editing levels were estimated among a subset of the MSUPRP possessing LD muscle RNA sequencing (N = 168). RNA sequencing of these animals is detailed in full in Velez-Irizarry et al. [106], resulting in a depth of ~63M strand-specific cDNA reads for 24/168 animals and ~23M strand-specific cDNA reads for the remaining 144 animals.

4.5.2 Genotyping data

To analyze the genetic architecture of each RNA editing site, we utilized the MSUPRP, a population that has been genotyped using the Illumina PorcineSNP60 BeadChip with SNPs mapped to the Sscrofa11.1 genome assembly. Genotype data pre-processing is detailed in an earlier study [106]. Briefly, using all MSUPRP genotyped animals (N = 940), the following SNPs were removed from analysis: monomorphic and non-autosomal SNPs, SNPs with evident Mendelian errors, and SNPs with a minor allele frequency less than 0.01. This resulted in 43,130 markers used in all genetic analyses.
4.5.3 Phenotypes

Phenotypes from the MSUPRP have been described in previous studies [85, 106, 107]. A total of 67 traits (29 growth traits, 20 carcass composition traits, and 18 meat quality traits) were tested for genetic correlation with the 5 heritable editing levels. Brief descriptions and summary statistics for each trait can be found in [106].

4.5.4 Sequencing data preparation and RNA editing detection

Using the "discovery cohort" consisting of three animals with matched WGS and RNA sequencing, preparation of sequencing data and discovery of RNA editing sites was largely consistent with Funkhouser et al. [62]. For both WGS and RNAseq, Trimmomatic version 0.38 [108] was used to remove low-quality bases at the 3' end, retaining sequences with a minimum length of 56 bp. Six bases at the 5' end were also removed from cDNA reads to remove any artifactual bases introduced during cDNA synthesis [76]. Trimmed DNA reads were mapped to the Sscrofa11.1 reference assembly with Bowtie version 2.3.2 and trimmed RNA reads were mapped to the same reference with TopHat version 2.1.1. DNA and RNA reads that had more than one recorded alignment were removed from further analysis. Prior to variant calling, RNAseq alignments were split such that plus-strand transcript alignments were separated from minus-strand transcript alignments.
To detect RNA editing sites for downstream genetic analysis, variant calling was performed using Samtools version 1.7 and bcftools 1.9.64, whereby, for each animal, variants were jointly called between DNA and strand-specific RNA alignments. The resulting variant calling data were then processed by editTools (https://github.com/funkhou9/editTools), a suite of compiled R functions designed to rapidly screen variant calling data for RNA editing evidence. A DNA-to-RNA mismatch, indicative of an RNA editing event, was detected according to the following criteria: 1) the genotype is homozygous according to 95% of DNA reads, 2) 10 or more reads were used to call the genotype, 3) at least 5 cDNA reads differed from the genotype, and 4) the cDNA reads have a Phred-scaled strand-bias $p$-value of 20 or less. Once LD muscle DNA-to-RNA mismatches were identified in each of the three discovery animals, editing sites were retained for downstream analysis if 1) they were of the canonical A-to-G form indicative of ADAR activity, 2) they were detectable in both the $F_2$ animal and at least one of the FAANG animals, and 3) the genotype at the RNA editing site was homozygous reference. This resulted in 104 putative ADAR-catalyzed RNA editing sites for downstream analysis.

4.5.5 Editing level estimation

At each of the 104 putative RNA editing sites previously identified, editing levels were estimated within the "analysis cohort", a subset of MSUPRP pigs with LD muscle RNA sequencing data (N = 168). RNA sequencing from the analysis cohort was trimmed and mapped in the same way as RNA sequencing from the discovery cohort. For each of the 168 animals, variant calling at each of the 104 putative RNA editing sites was performed with Samtools version 1.7 and bcftools 1.9.64; at each site, editing levels were obtained by dividing the number of high-quality (base quality >= 25) reads supporting the edited allele by the total number of high-quality reads. Editing levels were discarded if they were estimated with fewer than 10 high-quality reads.
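The detection criteria and the editing level definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names are hypothetical and are not part of editTools, the read counts are assumed to have been extracted from variant calls beforehand, and criterion 1 is interpreted here as "at least 95% of DNA reads support the called homozygous genotype".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SiteCounts:
    """Per-animal read summary at one candidate site (hypothetical layout)."""
    dna_depth: int             # DNA reads used to call the genotype
    dna_genotype_reads: int    # DNA reads supporting the homozygous genotype
    cdna_mismatch_reads: int   # cDNA reads that differ from the genotype
    strand_bias_phred: float   # Phred-scaled strand-bias p-value of cDNA reads


def is_dna_rna_mismatch(site: SiteCounts) -> bool:
    """Apply the four detection criteria described in section 4.5.4."""
    if site.dna_depth < 10:                       # criterion 2
        return False
    homozygous_fraction = site.dna_genotype_reads / site.dna_depth
    return (
        homozygous_fraction >= 0.95               # criterion 1 (assumed ">=")
        and site.cdna_mismatch_reads >= 5         # criterion 3
        and site.strand_bias_phred <= 20          # criterion 4
    )


def editing_level(edited_reads: int, total_reads: int,
                  min_reads: int = 10) -> Optional[float]:
    """Edited high-quality reads over all high-quality reads.

    Returns None when fewer than `min_reads` high-quality reads are available,
    mirroring the rule that such estimates are discarded.
    """
    if total_reads < min_reads:
        return None
    return edited_reads / total_reads


example = SiteCounts(dna_depth=40, dna_genotype_reads=39,
                     cdna_mismatch_reads=6, strand_bias_phred=3.0)
print(is_dna_rna_mismatch(example))   # all four criteria are met
print(editing_level(3, 12))           # 3 edited reads out of 12
```

Returning `None` rather than zero for low-coverage sites keeps "not estimable" distinct from "unedited", which matters when counting the animals with a detectable editing level.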
To further identify RNA editing sites with variable editing levels across animals, we retained an RNA editing site for analysis if 1) it was detectable in at least 10/168 MSUPRP animals, and 2) the editing level was not fixed across animals and showed at least weak evidence of Gaussian variation (Shapiro-Wilk $p$-value $> 1 \times 10^{-10}$). This resulted in 47 RNA editing sites suitable for genomic heritability estimation.

4.5.6 Univariate variance component estimation and GWA

A genomic best linear unbiased prediction (GBLUP) model was used to decompose site-specific editing level variance into genomic and residual components. The model can be expressed as:

$$\mathbf{y} = \mathbf{1}\mu + \mathbf{x}\beta_{sex} + \mathbf{g} + \boldsymbol{\varepsilon} \tag{4.1}$$

where $\mathbf{y}$ is a vector of estimated editing levels for each animal (centered and scaled to mean 0 and unit variance) at one of 47 RNA editing sites, $\mu$ is an overall mean, $\mathbf{x}$ is an indicator of the sex of each animal, $\mathbf{g}$ is a vector of polygenic values, and $\boldsymbol{\varepsilon}$ are residuals. The overall mean and the sex-specific deviation from the mean ($\beta_{sex}$) are assumed to be fixed values, while polygenic values and residuals are assumed random and distributed as $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \mathbf{I}\sigma^2_\varepsilon)$ and $\mathbf{g} \sim N(\mathbf{0}, \mathbf{G}\sigma^2_g)$, where the genomic relationships are

$$\mathbf{G} = \frac{\mathbf{Z}\mathbf{Z}'}{\mathrm{tr}(\mathbf{Z}\mathbf{Z}')/n},$$

and $\mathbf{Z}$ is a genotype matrix (with animals in rows and SNPs in columns) centered by subtracting from each column its sample mean. Estimates of interest, $\hat{\sigma}^2_g$ and $\hat{\sigma}^2_\varepsilon$, and standard errors thereof were derived using REML and the inverse of the information matrix, respectively.

For each model fit (eq. 4.1), GWA was performed by transforming predicted additive genomic values $\hat{\mathbf{g}}$ to estimates of additive SNP effects $\hat{\mathbf{b}}$ and their (co)variances $\mathrm{Var}(\hat{\mathbf{b}})$ [109, 110]. The test statistic used to test the null hypothesis of no association between allele dosage at SNP $j$ and editing level was obtained as

$$T_j = \frac{\hat{b}_j}{\sqrt{\mathrm{Var}(\hat{b}_j)}} \sim N(0, 1).$$

$P$-values from such a test have been shown to be equivalent to those from a test in which a single SNP is associated with the trait while modeling a random polygenic effect (akin to EMMAX) [109, 110, 111]. Functions to fit equation 4.1, and to transform genomic values to SNP effects and variances, are provided in the gwaR package (https://github.com/steibelj/gwaR).
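To make the definition of $\mathbf{G}$ concrete, the following pure-Python sketch builds a genomic relationship matrix from a small genotype matrix. It is a minimal illustration, not the gwaR implementation: the function name and the toy genotypes are assumptions, and a real analysis with 43,130 markers would use a numerical library rather than nested lists.

```python
from typing import List


def genomic_relationship_matrix(genotypes: List[List[float]]) -> List[List[float]]:
    """Compute G = ZZ' / (tr(ZZ') / n).

    `genotypes` holds allele dosages with animals in rows and SNPs in columns.
    Z is obtained by subtracting each column's sample mean.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    z = [[row[j] - col_means[j] for j in range(m)] for row in genotypes]
    zzt = [
        [sum(z[i][k] * z[l][k] for k in range(m)) for l in range(n)]
        for i in range(n)
    ]
    scale = sum(zzt[i][i] for i in range(n)) / n   # tr(ZZ') / n
    if scale == 0:
        raise ValueError("all SNPs are monomorphic; G is undefined")
    return [[value / scale for value in row] for row in zzt]


# Toy example: 4 animals genotyped at 3 SNPs (dosages 0, 1, or 2)
toy_genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 2],
]
G = genomic_relationship_matrix(toy_genotypes)
mean_diagonal = sum(G[i][i] for i in range(len(G))) / len(G)
print(round(mean_diagonal, 6))
```

Dividing by $\mathrm{tr}(\mathbf{Z}\mathbf{Z}')/n$ scales $\mathbf{G}$ so that its diagonal averages one, which places $\sigma^2_g$ on the same scale as the phenotypic variance. The same function applied to a subset of columns would give $\mathbf{G}_{local}$ or $\mathbf{G}_{BG}$ for the two-polygenic model.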
To estimate the genomic variance localized to a site of interest, we used a two-polygenic effect model:

$$\mathbf{y} = \mathbf{1}\mu + \mathbf{x}\beta_{sex} + \mathbf{g}_{local} + \mathbf{g}_{BG} + \boldsymbol{\varepsilon} \tag{4.2}$$

where $\mathbf{g}_{local}$ are genomic values arising from selected SNP markers, and $\mathbf{g}_{BG}$ are background genomic values arising from all markers except the selected SNP markers. It is assumed that $\mathbf{g}_{local} \sim N(\mathbf{0}, \mathbf{G}_{local}\sigma^2_{g_{local}})$ and $\mathbf{g}_{BG} \sim N(\mathbf{0}, \mathbf{G}_{BG}\sigma^2_{g_{BG}})$, where the genomic relationships $\mathbf{G}_{local}$ and $\mathbf{G}_{BG}$ are derived using their corresponding SNP sets.

To test for non-null variance components of interest (such as $\sigma^2_g$ or $\sigma^2_{g_{BG}}$), a chi-squared test statistic from a likelihood ratio test is computed:

$$LRT_{\sigma^2_k} = 2\left(L - L_{\sigma^2_k = 0}\right),$$

where $L$ is the log likelihood evaluated at the REML estimate for the full model (either equation 4.1 when testing global genomic variance components or equation 4.2 when testing localized genomic variance components) and $L_{\sigma^2_k = 0}$ is the log likelihood for a reduced model in which the $k$th variance component of interest is removed. It has been shown that $LRT_{\sigma^2_k}$ asymptotically follows a mixture of $\chi^2_1$ and $\chi^2_0$ distributions [112]; therefore the $p$-value from such a test was

$$\frac{1 - F_1\left(LRT_{\sigma^2_k}\right)}{2},$$

where $F_1(\cdot)$ is the chi-squared cumulative distribution function with 1 degree of freedom.
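As a small numerical companion to the likelihood ratio test above, the sketch below computes the halved $p$-value from two log likelihoods using only the Python standard library. The function names are illustrative assumptions; the chi-squared survival function with 1 degree of freedom is obtained from the identity $1 - F_1(x) = \mathrm{erfc}(\sqrt{x/2})$.

```python
import math


def chi2_1df_survival(x: float) -> float:
    """1 - F_1(x) for a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def variance_component_pvalue(loglik_full: float, loglik_reduced: float) -> float:
    """P-value for H0: sigma^2_k = 0 under the chi2_1 / chi2_0 mixture.

    The statistic is truncated at zero because a REML fit of the full model
    can only match or exceed the reduced model up to numerical error.
    """
    lrt = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return chi2_1df_survival(lrt) / 2.0


# Hypothetical log likelihoods whose difference gives LRT = 3.8415
p_example = variance_component_pvalue(-100.0, -100.0 - 3.841458820694124 / 2.0)
print(f"{p_example:.4f}")

# When the estimate sits on the boundary the statistic is 0 and p = 0.5
p_boundary = variance_component_pvalue(-100.0, -100.0)
print(p_boundary)
```

The boundary case illustrates why a variance component estimated at zero is reported with a $p$-value of 0.5 rather than 1: half of the null distribution is a point mass at zero.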
4.5.7 Bivariate analysis to estimate genomic covariances

Jointly modeling site-specific editing levels and higher-order phenotypes was done using trait-specific means, sex-specific effects, polygenic effects, and residuals:

$$\begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix} = \begin{bmatrix} \mathbf{1}\mu_1 \\ \mathbf{1}\mu_2 \end{bmatrix} + \begin{bmatrix} \mathbf{x}_1\beta_{sex_1} \\ \mathbf{x}_2\beta_{sex_2} \end{bmatrix} + \begin{bmatrix} \mathbf{g}_1 \\ \mathbf{g}_2 \end{bmatrix} + \begin{bmatrix} \boldsymbol{\varepsilon}_1 \\ \boldsymbol{\varepsilon}_2 \end{bmatrix},$$

where $\mathbf{y}_1 = \{y_{1i}\}_{i=1}^{n_1 \sim 168}$ are editing levels at one of 5 editing sites (those pre-determined to possess a significant polygenic effect) and $\mathbf{y}_2 = \{y_{2i}\}_{i=1}^{n_2 \sim 940}$ are phenotypes at one of 67 growth, carcass composition, or meat quality traits. The two traits (an editing level and a higher-order phenotype) are modeled to be jointly distributed as:

$$\begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix} \sim N\left( \begin{bmatrix} \mathbf{1}\mu_1 + \mathbf{x}_1\beta_{sex_1} \\ \mathbf{1}\mu_2 + \mathbf{x}_2\beta_{sex_2} \end{bmatrix}, \begin{bmatrix} \mathbf{G}_1\sigma^2_{g_1} + \mathbf{I}_{n_1}\sigma^2_{\varepsilon_1} & \mathbf{G}_{1,2}\sigma_{g_1 g_2} + \mathbf{B}_{n_1,n_2}\sigma_{\varepsilon_1 \varepsilon_2} \\ \mathbf{G}_{2,1}\sigma_{g_1 g_2} + \mathbf{B}_{n_2,n_1}\sigma_{\varepsilon_1 \varepsilon_2} & \mathbf{G}_2\sigma^2_{g_2} + \mathbf{I}_{n_2}\sigma^2_{\varepsilon_2} \end{bmatrix} \right),$$

where, for example, $\mathbf{I}_{n_1}$ is the $n_1$-by-$n_1$ identity matrix and $\mathbf{B}_{n_1,n_2}$ is an $n_1$-by-$n_2$ logical matrix consisting of 0s and 1s used to link common animals between $\mathbf{y}_1$ and $\mathbf{y}_2$ records. $\mathbf{G}_{1,2}$ are the genomic relationships between $\mathbf{y}_1$ animals (in rows) and $\mathbf{y}_2$ animals (in columns), and $\mathbf{G}_1$ are the genomic relationships between $\mathbf{y}_1$ animals only. The genetic correlation is defined as

$$\rho_g = \frac{\sigma_{g_1 g_2}}{\sqrt{\sigma^2_{g_1}\sigma^2_{g_2}}}.$$

Estimates of individual variance components and covariances were obtained using REML, and standard errors of genetic correlation estimates were approximated using the delta method [113]. Inferring non-null genomic covariances or correlations was done using

$$LRT_{\sigma_{g_1 g_2}} = 2\left(L - L_{\sigma_{g_1 g_2} = 0}\right),$$

with a $p$-value calculated from $1 - F_1\left(LRT_{\sigma_{g_1 g_2}}\right)$. Estimation of local genomic covariances was performed similarly, except that the (co)variance structures used to estimate trait-specific variance components and cross-trait covariances utilized "local" and "background" genomic relationship matrices.
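Two ingredients of the bivariate model lend themselves to a short illustration: the linking matrix $\mathbf{B}_{n_1,n_2}$ and the genetic correlation computed from estimated (co)variance components. The Python sketch below is a hedged example only; the function names, animal identifiers, and numeric values are hypothetical and are not taken from the analysis.

```python
import math
from typing import List, Sequence


def linking_matrix(ids_1: Sequence[str], ids_2: Sequence[str]) -> List[List[int]]:
    """Build the n1-by-n2 matrix B with 1 where a y1 animal (row) is the
    same animal as a y2 record (column), and 0 elsewhere."""
    return [[1 if a == b else 0 for b in ids_2] for a in ids_1]


def genetic_correlation(cov_g12: float, var_g1: float, var_g2: float) -> float:
    """rho_g = sigma_g1g2 / sqrt(sigma^2_g1 * sigma^2_g2)."""
    if var_g1 <= 0 or var_g2 <= 0:
        raise ValueError("both genomic variances must be positive")
    return cov_g12 / math.sqrt(var_g1 * var_g2)


# Hypothetical animals: every animal with an editing record also has a phenotype
editing_ids = ["pig_A", "pig_C"]
phenotype_ids = ["pig_A", "pig_B", "pig_C"]
B = linking_matrix(editing_ids, phenotype_ids)
print(B)

# Hypothetical variance components for illustration
rho = genetic_correlation(cov_g12=-0.3, var_g1=0.25, var_g2=1.0)
print(round(rho, 2))
```

Because all animals with editing records are a subset of the animals with trait records, every row of $\mathbf{B}$ contains exactly one 1, which is what allows the residual covariance $\sigma_{\varepsilon_1 \varepsilon_2}$ to be estimated.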
CHAPTER 5

CONCLUSION

5.1 Gene-by-sex interactions and ideas for future analyses

The four complex traits studied in chapter 2 broadly fall into two categories: i) traits with subtle G × S interactions (such as height, BMI, and bone-mineral density), and ii) traits with large-magnitude G × S interactions (waist-to-hip ratio). Measurements in the second category may be considered a different trait when measured in males than when measured in females. In other words, if the factors (genetic or not) that contribute to a measurable outcome are dramatically different between sexes, then one can properly define the male outcome differently from the female outcome. To date, few additional human traits are known to fall into the second category. As a caveat to this discussion, it is possible that small sex-specific effects at genomic regions possessing suggestive G × S interactions could simply reflect poor linkage disequilibrium between SNPs and QTL. This can be checked using higher-density (~13 million SNP) imputed data.

To further explore how G × S interactions may influence population-level phenotypic variance by creating mean and variance differences between sexes, it will be interesting to examine minor allele frequencies (MAF) at each detected G × S interaction. Under a single causal variant model in which sex-specific environmental variances are identical, and assuming Hardy-Weinberg equilibrium, a G × S interaction with a high MAF will mainly create differences in variances between sexes. On the other hand, a lower-MAF G × S interaction will create differences in means between sexes (as well as differences in variances). It will be worthwhile to formulate the exact relationship between G × E causal variant allele frequency and its effect on mean and variance differences between environments, which will depend on certain assumptions but be applicable to any G × E scenario where allele frequencies are identical between environments.
Rawlik et al. [14] used a bivariate mixed model and data from the UK Biobank to estimate genome-wide sex-specific additive genetic variances. If additive genetic variances differ between sexes, this indicates that additive SNP effects differ between sexes (because autosomal allele frequencies are assumed to be the same between sexes). This alone does not indicate whether male-specific and female-specific SNP effects uniformly differ by a proportionality constant, nor does it indicate where in the genome sex-specific effects differ. It would be interesting to formally decompose sex-specific additive variances into approximately independent genomic regions [114], or decompose genetic covariances to such regions, as an alternative means to map G × S interactions.

One plausible biological mechanism explaining why G × S interactions create observable differences between males and females is that some eQTL may have sex-dependent function [115] (only regulating transcript abundance in males, or only in females). This is consistent with the observation that many large-effect G × S interactions have an effect in one sex but little to no effect in the other. To further explore this hypothesis, it would be worthwhile to determine whether G × S interactions are enriched in G × S interacting eQTL.

5.2 Present limitations to modeling RNA editing activity and ideas for future functional genetics studies

In this work, we assess site-specific RNA editing activity by estimating editing levels at each RNA editing site. Here we define the true editing level at an RNA editing site to be the proportion of transcripts (in a population of transcripts transcribed from the same locus) containing the edited inosine variant. In practice, we only estimate editing levels using sequencing data, with each estimate subject to measurement error. If measurement error is substantial, this could negatively impact power to detect global and local genomic variance components contributing to editing level variance. This encourages RNA editing levels to be estimated using relatively deep sequencing to limit editing level estimation error.
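One simple way to see why deeper sequencing limits estimation error is to treat the reads covering a site as independent draws, so that the number of edited reads is binomial. This binomial model is an assumption introduced here for illustration (it ignores overdispersion, mapping bias, and base-calling error), and the function name is hypothetical.

```python
import math


def editing_level_standard_error(level: float, depth: int) -> float:
    """Approximate standard error of an estimated editing level,
    assuming edited read counts are Binomial(depth, level)."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("editing level must lie between 0 and 1")
    if depth <= 0:
        raise ValueError("depth must be a positive number of reads")
    return math.sqrt(level * (1.0 - level) / depth)


# Standard error for a true editing level of 0.2 at increasing read depths
for depth in (10, 40, 160):
    se = editing_level_standard_error(0.2, depth)
    print(f"depth={depth:4d}  SE={se:.3f}")
```

Under this sketch the standard error shrinks with the square root of depth, so quadrupling the number of reads covering a site halves the sampling error of its editing level estimate.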
In chapter 4, we report only modest evidence that covariance between editing activity (as measured in skeletal muscle) and complex traits is driven by a shared genetic architecture. One potential reason for this could be that the growth, carcass composition, and meat quality traits studied are under differing molecular control. For instance, it is possible that variation in these traits can be attributed to RNA editing activity in a tissue other than skeletal muscle (RNA editing activity measured in skeletal muscle may be only weakly correlated with RNA editing activity in other tissues [35]). It will be particularly interesting to revisit the hypothesis that genetic variation contributes to covariance between RNA editing activity and complex traits as data from more (related) individuals and tissues become increasingly available.

In functional genetics studies such as RNA editing studies, it is fairly common to identify eQTL (or edQTL [99]) that co-localize with phenotypic QTL. This provides evidence that a DNA segment contributes to variation in gene expression and variation in phenotype; however, it does not necessarily imply that the DNA segment contributes to covariance between gene expression and phenotype. In chapter 4, we find evidence that SNPs flanking ADAR contribute to variation in RNA editing activity and variation in growth traits such as average daily gain, but only detect modest evidence that ADAR-flanking SNPs contribute to covariance between RNA editing activity and growth traits. Ultimately, it may be worth revisiting many co-localizations between phenotypic QTL and edQTL (or sQTL, eQTL, etc.) using bivariate models to determine whether the co-localized DNA region contributes to covariance.

5.3 Overall conclusions

Here, multiple perspectives were used to address the long-standing question: how does genetics contribute to phenotypic variation? From a quantitative genetic perspective, we have provided evidence that numerous small-magnitude G × S interactions may together contribute to broad-sense heritability among traits such as human height, BMI, bone-mineral density, and waist-to-hip ratio.
From a functional genetic perspective, we have investigated the degree to which RNA editing, as measured from skeletal muscle tissue, may serve as a plausible biological mechanism linking genetic variation with complex trait variation in pig populations.

This work illustrates the well-accepted belief that traditional GWA by single-marker regression (SMR) is severely underpowered to detect many QTL; in chapter 2, we provided evidence that SMR may be underpowered to detect typical G × S interactions, even with relatively large sample sizes (N ~ 250,000). In chapter 4, we found RNA editing activity for numerous sites to be highly heritable, yet only a couple of editing sites showed genome-wide significant editing QTL (edQTL), suggesting numerous RNA editing sites may be influenced by small additive genetic effects undetectable by SMR (given current sample sizes and SNP markers).

In total, this work encourages the use of local Bayesian regressions to further study G × S and other G × E interactions among unstructured human populations for which large sample sizes exist. It also encourages molecular phenotypes such as RNA editing to be studied using multivariate models to decompose covariance between molecular phenotypes and complex traits into genetic sources.
66 APPENDICES 67 APPENDIXA CHAPTER2SUPPLEMENTARYMATERIAL TableA.1:Sex-speci˝cphenotypestatistics traitstatisticmalefemale heightSamplesize119190139738 heightSamplemean176163 heightSampleSD6.766.2 height25%Quantile172159 height75%Quantile181167 WHRSamplesize119153139681 WHRSamplemean0.9350.816 WHRSampleSD0.06510.07 WHR25%Quantile0.8920.766 WHR75%Quantile0.9740.861 bhmdSamplesize106662124970 bhmdSamplemean0.5740.516 bhmdSampleSD0.1450.118 bhmd25%Quantile0.4820.434 bhmd75%Quantile0.6470.587 BMISamplesize119061139591 BMISamplemean27.827 BMISampleSD4.225.13 BMI25%Quantile24.923.4 BMI75%Quantile3029.6 Heightunits:cm BMDunits:g/cm2 BMIunits:Kg/m2 68 TableA.2:InferredG Ö Sinteractionsusingsex-speci˝cwindowvariances.Listedareall windowswith ˙ 2 g j 0 : 9 FocalSNP a trait ^ ˙ 2 g m j b ^ ˙ 2 g f j b PPM ˙ 2 g j c PPF ˙ 2 g j c ˙ 2 g j d nearbygenes e rs1535515height0.00002110.00011740.81862070.99885060.9563218LRRC8CLRRC8D rs580251height0.00001340.00009010.74643680.99586210.936092LRRC8CLRRC8D rs519989height0.00001250.00008260.72321840.99034480.9165517LRRC8CLRRC8D rs12064668height0.00001340.00009010.74643680.99586210.936092LRRC8CLRRC8D rs10737711height0.00001340.00009010.74643680.99586210.936092LRRC8CLRRC8D rs6688061height0.00001940.00009070.83379310.9979310.9225287LRRC8CLRRC8D rs55668929height0.00001940.00009070.83379310.9979310.9225287LRRC8CLRRC8D rs1544926height0.00007630.00000340.98321840.41816090.9554023COL23A1 rs10903280height0.00007650.00000420.9839080.45862070.9514943COL23A1 rs72819017height0.0000760.00000320.9820690.39471260.9549425COL23A1 rs57478839height0.00007650.00000420.9839080.45862070.9514943COL23A1 rs35519588height0.00006930.0000010.96436780.15149430.9455172COL23A1 rs890802height0.00007630.00000340.98321840.41816090.9554023COL23A1 rs61739424height0.00006930.0000010.96436780.15149430.9455172COL23A1 rs2913847height0.00007630.00000340.98321840.41816090.9554023COL23A1 rs1388358height0.00012180.00001140.99310340.50344830.9445977ANXA1RORB 
rs2172162  height  0.0001218  0.0000114  0.9931034  0.5034483  0.9445977  ANXA1, RORB
rs76907378  height  0.0001169  0.0000098  0.9903448  0.4004598  0.9388506  ANXA1, RORB
rs7020553  height  0.0001169  0.0000098  0.9903448  0.4004598  0.9388506  ANXA1, RORB
rs11143787  height  0.0001169  0.0000098  0.9903448  0.4004598  0.9388506  ANXA1, RORB
rs11021216  height  0.0000955  0.0000087  0.9848276  0.5333333  0.9397701  SESN3, FAM76B
rs11021219  height  0.0000955  0.0000087  0.9848276  0.5333333  0.9397701  SESN3, FAM76B
rs10831376  height  0.0000857  0.0000076  0.9751724  0.4908046  0.923908  SESN3, FAM76B
rs2636063  height  0.0000026  0.0000664  0.1724138  0.9326437  0.9135632  FAM189A1
rs2672705  height  0.0000048  0.0000669  0.3411494  0.9505747  0.9137931  FAM189A1
rs79512105  height  0.0000025  0.0000655  0.1868966  0.9250575  0.9032184  FAM189A1
rs77268983  height  0.0000938  0.0000028  0.9657471  0.2570115  0.9434483  SMAD6, SMAD3
rs12593707  height  0.0000938  0.0000028  0.9657471  0.2570115  0.9434483  SMAD6, SMAD3
rs1895886  height  0.0000653  0.0000073  0.9845977  0.6068966  0.9041379  FAM69C, CNDP2
rs747175  height  0.0000738  0.0000086  0.9931034  0.6682759  0.9241379  FAM69C, CNDP2
rs1365249  height  0.0000653  0.0000073  0.9845977  0.6068966  0.9041379  CNDP2
rs2278161  height  0.0000653  0.0000073  0.9845977  0.6068966  0.9041379  CNDP2
rs653004  height  0.0000906  0.0000026  0.9712644  0.2965517  0.9298851  SIK1, FLJ41733, LINC00322
rs4818928  height  0.0000906  0.0000026  0.9712644  0.2965517  0.9298851  SIK1, FLJ41733, LINC00322
rs1003792  height  0.0000906  0.0000026  0.9712644  0.2965517  0.9298851  SIK1, FLJ41733, LINC00322
rs12627203  height  0.0000906  0.0000026  0.9712644  0.2965517  0.9298851  SIK1, FLJ41733, LINC00322
rs2071931  WHR  0.0000898  0.0002293  0.9783908  1  0.9229885  H6PD
rs7517657  WHR  0.0000451  0.0002563  0.9144828  0.9993103  0.962069  LOC284688, METTL11B
rs1332955  WHR  0.0000647  0.0002944  0.9696552  0.9997701  0.9726437  LOC284688, METTL11B
rs80290375  WHR  0.0000744  0.0003468  0.9374713  0.9995402  0.9390805  LOC284688, GORAB
rs6427245  WHR  0.0000647  0.0002944  0.9696552  0.9997701  0.9726437  LOC284688, GORAB
rs7537355  WHR  0.0001064  0.00041  0.9887356  0.9997701  0.9650575  LOC284688, GORAB
rs12139302  WHR  0.0001064  0.00041  0.9887356  0.9997701  0.9650575  LOC284688, GORAB
rs7522128  WHR  0.0000815  0.0003638  0.96  0.9997701  0.9537931  GORAB, LOC284688
rs61838774  WHR  0.0000013  0.0001621  0.1308046  0.9751724  0.965977  LYPLAL1, RNU5F-1
rs2168333  WHR  0.0000075  0.0006972  0.4475862  1  1  LYPLAL1, RNU5F-1
rs12747505  WHR  0.0000029  0.0002979  0.2473563  0.9965517  0.9908046  LYPLAL1, RNU5F-1
rs12724708  WHR  0.0000058  0.0004579  0.3565517  0.9983908  0.9931034  LYPLAL1, RNU5F-1
rs6541227  WHR  0.000007  0.0006346  0.4273563  1  0.9997701  LYPLAL1, RNU5F-1
rs17005614  WHR  0.000002  0.0002048  0.1797701  0.9641379  0.9448276  LYPLAL1, RNU5F-1
rs12030989  WHR  0.000007  0.0006346  0.4273563  1  0.9997701  LYPLAL1, RNU5F-1
rs2820436  WHR  0.0000076  0.0007028  0.4574713  1  1  LYPLAL1, RNU5F-1
rs2605100  WHR  0.0000077  0.0007022  0.4671264  1  1  LYPLAL1, RNU5F-1
rs1538749  WHR  0.0000079  0.0007128  0.4802299  1  1  LYPLAL1, RNU5F-1
rs12022722  WHR  0.000008  0.0007184  0.4898851  1  1  LYPLAL1, RNU5F-1
rs2605110  WHR  0.0000077  0.0007022  0.4671264  1  1  LYPLAL1, RNU5F-1
rs2061154  WHR  0.000008  0.0007184  0.4898851  1  1  LYPLAL1, RNU5F-1
rs2791545  WHR  0.000013  0.0004375  0.5735632  0.9993103  0.9942529  LYPLAL1, RNU5F-1
rs3923113  WHR  0.0000126  0.0010649  0.5643678  1  1  GRB14, COBLL1
rs10195252  WHR  0.0000126  0.0010649  0.5643678  1  1  COBLL1, GRB14
rs1128249  WHR  0.0000132  0.0010696  0.6135632  1  1  COBLL1, GRB14
rs75297654  WHR  0.0000085  0.0007424  0.403908  1  1  COBLL1
rs17244632  WHR  0.0000096  0.0007438  0.4790805  1  1  COBLL1
rs13067911  WHR  0.0000027  0.0001533  0.1885057  0.9977011  0.9931034  PPARG, TSEN2
rs4684859  WHR  0.0000039  0.000157  0.3298851  0.9983908  0.9937931  PPARG, TSEN2
rs73029213  WHR  0.0000024  0.0001479  0.3204598  0.9958621  0.9914943  PPARG, TSEN2
rs17036788  WHR  0.0000059  0.0001633  0.5314943  0.9990805  0.9935632  PPARG, TSEN2
rs6795735  WHR  0.000017  0.0005836  0.7310345  1  1  ADAMTS9-AS2
rs4132228  WHR  0.0000164  0.0005834  0.6977011  1  1  ADAMTS9-AS2, MIR548A2
rs4607103  WHR  0.0000195  0.0005915  0.8091954  1  1  ADAMTS9-AS2, MIR548A2
rs7433808  WHR  0.0000195  0.0005915  0.8091954  1  1  ADAMTS9-AS2, MIR548A2
rs7638389  WHR  0.0000195  0.0005915  0.8091954  1  1  ADAMTS9-AS2, MIR548A2
rs2194094  WHR  0.0000164  0.0005834  0.6977011  1  1  ADAMTS9-AS2, MIR548A2
rs60960425  WHR  0.0000171  0.000287  0.7967816  0.9951724  0.983908  RPL32P3
rs79763737  WHR  0.0000015  0.0002008  0.111954  0.9708046  0.9590805  EFCAB12
rs16861373  WHR  0.0000066  0.0004297  0.3889655  0.9995402  0.9947126  PLXND1
rs79870266  WHR  0.0000066  0.0004297  0.3889655  0.9995402  0.9947126  PLXND1
rs9833879  WHR  0.0000066  0.0004297  0.3889655  0.9995402  0.9947126  PLXND1, TMCC1
rs2306374  WHR  0.000001  0.0000661  0.1227586  0.9222989  0.9022989  MRAS
rs9818870  WHR  0.000001  0.0000661  0.1227586  0.9222989  0.9022989  MRAS
rs4301033  WHR  0.0000008  0.0000786  0.0763218  0.9501149  0.9367816  TSC22D2, LOC646903
rs73162462  WHR  0.0000008  0.0000786  0.0763218  0.9501149  0.9367816  TSC22D2, LOC646903
rs62271364  WHR  0.0000008  0.0000786  0.0763218  0.9501149  0.9367816  TSC22D2, LOC646903
rs4450871  WHR  0.0000002  0.0001683  0.0266667  1  1  CYTL1, MSX1
rs13133548  WHR  0.0000019  0.0002404  0.1754023  0.9687356  0.9558621  FAM13A
rs3822072  WHR  0.0000019  0.0002404  0.1754023  0.9687356  0.9558621  FAM13A
rs13147493  WHR  0.0000019  0.0002404  0.1754023  0.9687356  0.9558621  FAM13A
rs974801  WHR  0.000005  0.0000716  0.3668966  0.9616092  0.9064368  TET2
rs9884482  WHR  0.000005  0.0000716  0.3668966  0.9616092  0.9064368  TET2
rs2285720  WHR  0.000005  0.0000716  0.3668966  0.9616092  0.9064368  TET2
rs10488872  WHR  0.0000742  0.0000025  0.9774713  0.2917241  0.9448276  PAPSS1, DKK2
rs10000444  WHR  0.0000742  0.0000025  0.9774713  0.2917241  0.9448276  PAPSS1, DKK2
rs17037679  WHR  0.0000742  0.0000025  0.9774713  0.2917241  0.9448276  PAPSS1, DKK2
rs6818614  WHR  0.0000742  0.0000025  0.9774713  0.2917241  0.9448276  PAPSS1, DKK2
rs28399230  WHR  0.0000742  0.0000025  0.9774713  0.2917241  0.9448276  PAPSS1, DKK2
rs13156948  WHR  0.0000016  0.000066  0.0788506  0.9696552  0.9570115  IRX1, LOC340094
rs6867983  WHR  0.0000192  0.0003815  0.4404598  1  0.9981609  MAP3K1, ANKRD55
rs3936510  WHR  0.0000188  0.0003812  0.4091954  1  0.9981609  MAP3K1, ANKRD55
rs9687846  WHR  0.0000188  0.0003812  0.4091954  1  0.9981609  MAP3K1, ANKRD55
rs37521  WHR  0.0000033  0.000093  0.2901149  0.9645977  0.9243678  PLK2, ACTBL2
rs10073521  WHR  0.0000139  0.0001164  0.4147126  0.9701149  0.9002299  TNFAIP8
rs17145265  WHR  0.0000163  0.0001422  0.5262069  0.9935632  0.9363218  TNFAIP8
rs55682871  WHR  0.0000156  0.0001184  0.4924138  0.9811494  0.9050575  TNFAIP8
rs7704120  WHR  0.0000049  0.0001374  0.476092  0.9983908  0.9912644  STC2, NKX2-5
rs6879065  WHR  0.0000049  0.0001374  0.476092  0.9983908  0.9912644  STC2, NKX2-5
rs1023617  WHR  0.0000049  0.0001374  0.476092  0.9983908  0.9912644  STC2, NKX2-5
rs3836828  WHR  0.0000029  0.0001343  0.3045977  0.9981609  0.9908046  STC2
rs9502498  WHR  0.0000141  0.0001057  0.4783908  0.9954023  0.9301149  RREB1, LY86
rs4960245  WHR  0.0000136  0.0001  0.4445977  0.9931034  0.9133333  RREB1, LY86
Affx-37047069  WHR  0.0000133  0.0000978  0.4186207  0.9921839  0.9057471  RREB1, LY86
rs56005336  WHR  0.0000853  0.0006116  0.9726437  1  0.9503448  GRM4, HMGA1
rs76412020  WHR  0.0000855  0.0006109  0.9703448  1  0.9494253  GRM4, HMGA1
rs114355919  WHR  0.0000853  0.0006116  0.9726437  1  0.9503448  GRM4, HMGA1
rs7742369  WHR  0.0000853  0.0006116  0.9726437  1  0.9503448  GRM4, HMGA1
rs10947487  WHR  0.0000734  0.0005684  0.9452874  1  0.9349425  HMGA1, GRM4
rs117525671  WHR  0.0000734  0.0005684  0.9452874  1  0.9349425  HMGA1, GRM4
rs1776897  WHR  0.000087  0.0006141  0.9758621  1  0.9503448  HMGA1, GRM4
rs2780226  WHR  0.000087  0.0006141  0.9758621  1  0.9503448  HMGA1, GRM4
rs114344942  WHR  0.0000853  0.0006116  0.9726437  1  0.9503448  HMGA1
rs139876191  WHR  0.0000734  0.0005684  0.9452874  1  0.9349425  HMGA1
rs35381162  WHR  0.0000734  0.0005684  0.9452874  1  0.9349425  HMGA1
rs1150781  WHR  0.000087  0.0006141  0.9758621  1  0.9503448  C6orf1
rs6918981  WHR  0.0000767  0.0005756  0.9517241  1  0.9363218  C6orf1, NUDT3
rs4711750  WHR  0.0000625  0.0022238  0.9533333  1  1  VEGFA, LOC100132354
rs6899540  WHR  0.0000455  0.0011492  0.8533333  1  1  VEGFA, LOC100132354
rs6905288  WHR  0.0000567  0.0022249  0.92  1  1  VEGFA, LOC100132354
rs9472126  WHR  0.0000455  0.0011492  0.8533333  1  1  VEGFA, LOC100132354
rs1885659  WHR  0.0000194  0.0004076  0.5958621  0.9997701  0.9744828  VEGFA, LOC100132354
rs2396081  WHR  0.0000231  0.0007352  0.6774713  1  1  VEGFA, LOC100132354
rs12526378  WHR  0.0000194  0.0004076  0.5958621  0.9997701  0.9744828  VEGFA, LOC100132354
rs36184164  WHR  0.0000058  0.000163  0.3032184  0.9864368  0.9370115  VEGFA, LOC100132354
rs4236084  WHR  0.0000194  0.0004076  0.5958621  0.9997701  0.9744828  VEGFA, LOC100132354
rs10046368  WHR  0.0000005  0.00042  0.0604598  1  1  VEGFA, LOC100132354
rs17789218  WHR  0.0000259  0.0002546  0.5377011  0.9825287  0.9121839  SIM1, LOC728012
rs2503097  WHR  0.0000434  0.000459  0.8036782  1  1  SIM1, LOC728012
rs743011  WHR  0.0000259  0.0002546  0.5377011  0.9825287  0.9121839  SIM1, LOC728012
rs2073267  WHR  0.0000434  0.000459  0.8036782  1  1  SIM1, LOC728012
rs7756047  WHR  0.0000431  0.000418  0.7921839  1  0.9990805  SIM1, LOC728012
rs6937293  WHR  0.0000457  0.0004656  0.8390805  1  1  SIM1, LOC728012
rs972275  WHR  0.0000301  0.0006924  0.7956322  1  1  CENPW, RSPO3
rs2800725  WHR  0.0000301  0.0006924  0.7956322  1  1  CENPW, RSPO3
rs2800734  WHR  0.0000289  0.0006845  0.757931  1  1  CENPW, RSPO3
rs1936799  WHR  0.0000306  0.0006836  0.7983908  1  1  CENPW, RSPO3
rs1936801  WHR  0.0000284  0.0006838  0.7462069  1  1  CENPW, RSPO3
rs9491696  WHR  0.0000284  0.0006838  0.7462069  1  1  RSPO3
rs2745353  WHR  0.0000284  0.0006838  0.7462069  1  1  RSPO3
rs6932207  WHR  0.0000284  0.0006838  0.7462069  1  1  RSPO3
rs41285262  WHR  0.0000293  0.0006924  0.7671264  1  1  RSPO3
rs1892172  WHR  0.0000284  0.0006838  0.7462069  1  1  RSPO3
rs13202608  WHR  0.0000267  0.0006791  0.6834483  1  1  RSPO3
rs4620145  WHR  0.0000284  0.0006838  0.7462069  1  1  RSPO3
rs6569474  WHR  0.0000284  0.0006838  0.7462069  1  1  RSPO3
rs72961013  WHR  0.0003262  0.0018148  1  1  1  RNF146, RSPO3
rs7766444  WHR  0.0000109  0.0003721  0.4668966  0.997931  0.9917241  LOC645434, LOC100132735
rs9376422  WHR  0.0000107  0.0004179  0.5241379  1  1  LOC645434, LOC100132735
rs74623604  WHR  0.0000107  0.0004179  0.5241379  1  1  LOC645434, LOC100132735
rs2908521  WHR  0.0000103  0.0004166  0.497931  1  1  LOC645434, LOC100132735
rs651837  WHR  0.0000103  0.0004166  0.497931  1  1  LOC645434, LOC100132735
rs632057  WHR  0.0000105  0.0004162  0.5285057  1  1  LOC645434, LOC100132735
rs668459  WHR  0.0000103  0.0004166  0.497931  1  1  LOC645434, LOC100132735
rs72976928  WHR  0.0000103  0.0004166  0.497931  1  1  LOC645434, LOC100132735
rs628751  WHR  0.0000086  0.0003786  0.4236782  1  0.9993103  LOC645434, LOC100132735
rs592423  WHR  0.0000093  0.0003973  0.4496552  1  0.9993103  LOC645434, LOC100132735
rs6570354  WHR  0.0000091  0.0003796  0.4696552  1  0.9993103  LOC645434, LOC100132735
rs12526447  WHR  0.000002  0.0000758  0.1816092  0.9806897  0.9616092  ESR1
rs7772579  WHR  0.0000078  0.0000771  0.3942529  0.9873563  0.9344828  ESR1
rs1999805  WHR  0.0000019  0.0000756  0.1606897  0.9788506  0.9611494  ESR1
rs2152750  WHR  0.0000019  0.0000756  0.1606897  0.9788506  0.9611494  ESR1
rs1361024  WHR  0.0000022  0.000076  0.2034483  0.9818391  0.962069  ESR1
rs2504069  WHR  0.0000099  0.0000913  0.4554023  0.991954  0.9452874  ESR1
rs1055144  WHR  0.0000032  0.0003506  0.2645977  1  0.9988506  MIR148A, NPVF, RNU6-16P
rs12700666  WHR  0.0000032  0.0003506  0.2645977  1  0.9988506  MIR148A, RNU6-16P
rs12700667  WHR  0.0000032  0.0003506  0.2645977  1  0.9988506  MIR148A, RNU6-16P
rs73068463  WHR  0.0000068  0.0004224  0.4613793  1  1  SNX10
rs74979045  WHR  0.0000046  0.0004204  0.3036782  1  1  SNX10
rs1534696  WHR  0.000005  0.0004217  0.3416092  1  1  SNX10
rs7798433  WHR  0.0000046  0.0004204  0.3036782  1  1  SNX10
rs1358503  WHR  0.0000021  0.0000716  0.3089655  0.9885057  0.9662069  SEMA3C, HGF
rs35736598  WHR  0.0000015  0.0000669  0.2282759  0.9816092  0.9581609  SEMA3C, HGF
Affx-30952281  WHR  0.0000021  0.0000716  0.3089655  0.9885057  0.9662069  SEMA3C, HGF
rs10091014  WHR  0.000009  0.0002402  0.5581609  0.9954023  0.9777011  NKX2-6, STC1
rs568890  WHR  0.0000129  0.0003106  0.8089655  1  1  NKX2-6, STC1
rs6983481  WHR  0.0000045  0.0002015  0.2570115  0.9634483  0.9333333  NKX2-6, STC1
rs67846476  WHR  0.000009  0.0002402  0.5581609  0.9954023  0.9777011  NKX2-6, STC1
rs445114  WHR  0.0000065  0.0000898  0.3170115  0.9793103  0.916092  LOC727677, POU5F1B, PCAT1
rs622856  WHR  0.0000065  0.0000898  0.3170115  0.9793103  0.916092  LOC727677, POU5F1B, PCAT1
rs444318  WHR  0.0000081  0.0000953  0.4547126  0.9889655  0.9326437  LOC727677, POU5F1B, PCAT1
rs17464492  WHR  0.0000065  0.0000898  0.3170115  0.9793103  0.916092  LOC727677, POU5F1B, PCAT1
rs12541832  WHR  0.0000065  0.0000898  0.3170115  0.9793103  0.916092  LOC727677, POU5F1B, PCAT1
rs13281615  WHR  0.000007  0.0000919  0.3597701  0.9827586  0.9236782  LOC727677, POU5F1B, PCAT1
rs11783615  WHR  0.0000072  0.0000919  0.3935632  0.9845977  0.9245977  LOC727677, POU5F1B, PCAT1
rs10991415  WHR  0.0000026  0.0001057  0.1963218  0.9590805  0.9347126  ABCA1
rs2472377  WHR  0.000012  0.000134  0.5333333  0.991954  0.9636782  ABCA1
rs2515609  WHR  0.0000023  0.0001059  0.162069  0.957931  0.9354023  ABCA1
rs10991417  WHR  0.0000048  0.0001225  0.3388506  0.9862069  0.9664368  ABCA1
rs62568211  WHR  0.0000193  0.000143  0.5604598  0.9885057  0.9537931  ABCA1
rs10760322  WHR  0.0000027  0.0000812  0.2818391  0.9857471  0.9675862  LHX2, NEK6
rs943484  WHR  0.0000042  0.000082  0.3450575  0.9882759  0.9636782  LHX2, NEK6
rs76204549  WHR  0.0000026  0.0000792  0.2103448  0.9827586  0.9636782  LHX2, NEK6
rs117790707  WHR  0.0000002  0.0000771  0.0294253  0.9728736  0.965977  LHX2, NEK6
rs12002771  WHR  0.0000038  0.0000811  0.3131034  0.9862069  0.9643678  LHX2, NEK6
rs10986225  WHR  0.000003  0.0000803  0.242069  0.9841379  0.9641379  LHX2, NEK6
rs1998951  WHR  0.0000046  0.0000823  0.3813793  0.9889655  0.9634483  LHX2, NEK6
rs78393198  WHR  0.0000026  0.0000792  0.2103448  0.9827586  0.9636782  LHX2, NEK6
Affx-2640174  WHR  0.000003  0.0001563  0.2595402  0.971954  0.948046  FGFR2, MIR5694
rs2244506  WHR  0.0000101  0.0002068  0.4526437  0.9977011  0.9845977  FGFR2, MIR5694
rs7907754  WHR  0.000003  0.0001563  0.2595402  0.971954  0.948046  FGFR2, MIR5694
rs2254069  WHR  0.000003  0.0001563  0.2595402  0.971954  0.948046  FGFR2, MIR5694
rs4436487  WHR  0.000003  0.0001563  0.2595402  0.971954  0.948046  FGFR2, MIR5694
rs56297542  WHR  0.0000073  0.0000971  0.3986207  0.9563218  0.9009195  FGFR2, MIR5694
rs1907282  WHR  0.0000022  0.0000957  0.1898851  0.9404598  0.9068966  FGFR2, MIR5694
rs7089185  WHR  0.0000022  0.0000957  0.1898851  0.9404598  0.9068966  FGFR2, MIR5694
rs10788149  WHR  0.0000058  0.0000981  0.3445977  0.9542529  0.905977  FGFR2, MIR5694
rs578270  WHR  0.0000182  0.0001261  0.7381609  0.9887356  0.9252874  FRMD8
rs2073800  WHR  0.0000182  0.0001261  0.7381609  0.9887356  0.9252874  FRMD8
rs512715  WHR  0.0000249  0.0001482  0.8351724  0.9990805  0.9478161  NEAT1
rs673753  WHR  0.0000183  0.0001264  0.7450575  0.9891954  0.9264368  NEAT1, MIR612
rs1784859  WHR  0.0000182  0.0001261  0.7381609  0.9887356  0.9252874  NEAT1, MALAT1, MIR612
rs1783210  WHR  0.0000182  0.0001261  0.7381609  0.9887356  0.9252874  MALAT1, MIR612
rs1784100  WHR  0.0000182  0.0001261  0.7381609  0.9887356  0.9252874  MALAT1, MIR612
rs11263641  WHR  0.0000207  0.0002343  0.7227586  0.9997701  0.9908046  CCND1, MYEOV
rs74471298  WHR  0.0000075  0.0001749  0.3406897  0.9963218  0.9777011  CCND1, MYEOV
rs10160464  WHR  0.0000075  0.0001727  0.3533333  0.9963218  0.976092  LOC100505834, CCND1, MYEOV
rs4980785  WHR  0.0000083  0.0001809  0.4335632  0.9990805  0.9832184  LOC100505834, CCND1, MYEOV
rs7105934  WHR  0.0000079  0.00018  0.3885057  0.9990805  0.9832184  LOC100505834, CCND1, MYEOV
rs11233117  WHR  0.0000084  0.0000803  0.4733333  0.9765517  0.9112644  ANO1, FGF3
rs2343876  WHR  0.0000243  0.0003416  0.5390805  1  0.9914943  SSPN, ITPR2
rs112251480  WHR  0.0000229  0.0003321  0.4482759  1  0.9885057  SSPN, ITPR2
rs718314  WHR  0.0000232  0.0003606  0.5234483  1  0.997931  ITPR2, SSPN
rs931384  WHR  0.0000232  0.0003606  0.5234483  1  0.997931  ITPR2, SSPN
rs2171522  WHR  0.0000241  0.0003647  0.5606897  1  0.9981609  ITPR2, SSPN
rs10842713  WHR  0.0000233  0.0003561  0.5321839  1  0.9981609  ITPR2, SSPN
rs598322  WHR  0.0000156  0.0001329  0.6898851  0.9967816  0.9528736  HCAR1, KNTC1, HCAR2
rs4759364  WHR  0.0000175  0.000169  0.7521839  0.9990805  0.9756322  HCAR1, KNTC1, HCAR2
rs10847452  WHR  0.0000166  0.0001329  0.7222989  0.9974713  0.9494253  HCAR1, KNTC1, HCAR2
rs10773433  WHR  0.0000166  0.0001329  0.7222989  0.9974713  0.9494253  HCAR1, KNTC1, HCAR2
rs1798219  WHR  0.0000175  0.000169  0.7521839  0.9990805  0.9756322  HCAR1, KNTC1, HCAR2
rs586573  WHR  0.0000166  0.0001329  0.7222989  0.9974713  0.9494253  HCAR1, KNTC1, HCAR2
rs118091133  WHR  0.0000166  0.0001329  0.7222989  0.9974713  0.9494253  HCAR1, HCAR3
rs1798192  WHR  0.0000166  0.0001329  0.7222989  0.9974713  0.9494253  HCAR1, HCAR3
rs1696352  WHR  0.0000166  0.0001329  0.7222989  0.9974713  0.9494253  HCAR1, HCAR3
Affx-7035109  WHR  0.0000166  0.0001329  0.7222989  0.9974713  0.9494253  HCAR1, HCAR3
rs71456771  WHR  0.0000175  0.000169  0.7521839  0.9990805  0.9756322  HCAR1, HCAR3
rs10847570  WHR  0.0000156  0.0001367  0.677931  0.9967816  0.9443678  HCAR1, HCAR3
rs1316952  WHR  0.0000438  0.0006033  0.8462069  1  0.9997701  DNAH10
rs11057396  WHR  0.0000438  0.0006033  0.8462069  1  0.9997701  DNAH10, CCDC92
rs11057401  WHR  0.0000438  0.0006033  0.8462069  1  0.9997701  CCDC92
rs10773048  WHR  0.0000315  0.0005557  0.7068966  1  0.9997701  CCDC92
rs4765127  WHR  0.0000438  0.0006033  0.8462069  1  0.9997701  ZNF664, ZNF664-FAM101A
rs74551816  WHR  0.0001551  0.0000048  0.9818391  0.382069  0.9641379  NOVA1, STXBP6
rs12432376  WHR  0.0001741  0.0000074  1  0.5524138  0.9942529  NOVA1, STXBP6
rs11627916  WHR  0.0001741  0.0000074  1  0.5524138  0.9942529  NOVA1, STXBP6
rs7161009  WHR  0.0001673  0.0000056  0.9997701  0.437931  0.9926437  NOVA1, STXBP6
rs4983099  WHR  0.0001546  0.000004  0.9811494  0.2878161  0.9627586  NOVA1, STXBP6
rs8021667  WHR  0.0001673  0.0000056  0.9997701  0.437931  0.9926437  NOVA1, STXBP6
rs1955872  WHR  0.0001587  0.0000058  0.9850575  0.431954  0.9645977  NOVA1, STXBP6
rs116145925  WHR  0.0000014  0.0000747  0.1425287  0.9632184  0.9356322  UBE2I
rs115466201  WHR  0.0000014  0.0000747  0.1425287  0.9632184  0.9356322  UBE2I
rs2286973  WHR  0.0000339  0.0001226  0.8763218  0.9986207  0.916092  CLEC16A
rs12930396  WHR  0.000004  0.0000937  0.2655172  0.9501149  0.9209195  LITAF, LOC388210, RMI2
rs4131548  WHR  0.000004  0.0000937  0.2655172  0.9501149  0.9209195  LITAF, LOC388210, RMI2
rs17777180  WHR  0.0000031  0.0005946  0.2905747  1  1  CMIP
rs4889326  WHR  0.0000019  0.000591  0.2151724  1  1  CMIP
rs2925979  WHR  0.0000016  0.0005902  0.177931  1  1  CMIP
rs11865332  WHR  0.0000016  0.0005902  0.177931  1  1  CMIP
rs9892297  WHR  0.0000262  0.0001187  0.8524138  0.9997701  0.9370115  TNFSF12, POLR2A
rs62070804  WHR  0.0000004  0.0000887  0.0521839  0.9689655  0.9609195  ABHD15
rs3110647  WHR  0.0000023  0.0000731  0.2696552  0.9563218  0.9285057  HNF1B, DDX52
rs34064336  WHR  0.0000007  0.0000694  0.0556322  0.9273563  0.9158621  HNF1B, DDX52
rs4080890  WHR  0.0000153  0.0001631  0.5944828  0.9986207  0.9747126  KCNJ2
rs1605750  WHR  0.0000188  0.0001683  0.654023  0.9990805  0.9717241  KCNJ2
rs16975820  WHR  0.0000119  0.0001567  0.4781609  0.9981609  0.9731034  KCNJ2
rs17779747  WHR  0.0000188  0.0001683  0.654023  0.9990805  0.9717241  KCNJ2
rs1594476  WHR  0.0000119  0.0001567  0.4781609  0.9981609  0.9731034  KCNJ2
rs3810068  WHR  0.0000026  0.0003594  0.174023  1  1  EMILIN2, SMCHD1
rs684320  WHR  0.0000048  0.0003602  0.3062069  1  0.9997701  EMILIN2
rs9954931  WHR  0.0000048  0.0003602  0.3062069  1  0.9997701  EMILIN2
rs679153  WHR  0.0000048  0.0003602  0.3062069  1  0.9997701  EMILIN2
rs623561  WHR  0.0000052  0.0003603  0.34  1  0.9997701  EMILIN2
rs4800269  WHR  0.0000093  0.0000589  0.577931  0.988046  0.9036782  AQP4-AS1, AQP4, LOC728606
rs12454712  WHR  0.0000087  0.0001024  0.3604598  0.9958621  0.9648276  BCL2
rs4940576  WHR  0.0000087  0.0001024  0.3604598  0.9958621  0.9648276  BCL2
rs11661511  WHR  0.0000087  0.0001024  0.3604598  0.9958621  0.9648276  BCL2
rs2288404  WHR  0.0000024  0.0000727  0.268046  0.9698851  0.934023  INSR
rs1799815  WHR  0.0000006  0.0000702  0.0512644  0.9455172  0.9268966  INSR
rs10408374  WHR  0.000001  0.0000713  0.1367816  0.9563218  0.9312644  INSR
rs34465381  WHR  0.0000126  0.0000984  0.6216092  0.9947126  0.9236782  HAUS8, CPAMD8
rs7259285  WHR  0.0000182  0.0001711  0.7668966  1  0.9889655  HAUS8, CPAMD8
rs7260259  WHR  0.0000098  0.0000928  0.5264368  0.9903448  0.9163218  HAUS8
Affx-15620245  WHR  0.0000158  0.0001675  0.7227586  1  0.9885057  HAUS8
rs6512161  WHR  0.0000158  0.0001675  0.7227586  1  0.9885057  HAUS8
rs7259348  WHR  0.0000098  0.0000928  0.5264368  0.9903448  0.9163218  HAUS8
rs6102059  WHR  0.0000011  0.0000756  0.0827586  0.9588506  0.9303448  MAFB, LOC339568
rs1936963  WHR  0.0000168  0.0002126  0.5498851  1  0.9749425  TSHZ2
rs2741366  WHR  0.0000135  0.0002118  0.4832184  1  0.9765517  TSHZ2
rs73140232  WHR  0.0000112  0.0001844  0.3977011  0.9990805  0.9462069  TSHZ2
rs2000339  WHR  0.000016  0.000208  0.483908  1  0.968046  TSHZ2
rs6013630  WHR  0.0000134  0.0001853  0.5002299  0.9990805  0.9455172  TSHZ2
rs2800999  WHR  0.0000201  0.0002224  0.6908046  1  0.9786207  TSHZ2
rs1293430  WHR  0.0000177  0.0001917  0.6374713  0.9993103  0.9450575  TSHZ2
rs2256596  BMD  0.0002546  0.0000714  1  0.9432184  0.9374713  RREB1
rs35742417  BMD  0.0002546  0.0000714  1  0.9432184  0.9374713  RREB1
rs2714341  BMD  0.0002347  0.0000711  0.9981609  0.9406897  0.9117241  RREB1, SSR1
rs9403141  BMD  0.0001358  0.0000005  0.9581609  0.0855172  0.9468966  MCHR2, PRDM13
rs7451306  BMD  0.0001358  0.0000005  0.9581609  0.0855172  0.9468966  MCHR2, PRDM13
rs17428220  BMD  0.0002729  0.0001293  0.9933333  0.8126437  0.9045977  EVX1, HIBADH
rs776746  BMD  0.0000308  0.000213  0.7701149  1  0.9457471  CYP3A5
rs45442295  BMD  0.0000128  0.0001952  0.4517241  0.9995402  0.9344828  CYP3A7, CYP3A7-CYP3AP1
rs45446698  BMD  0.0000091  0.000193  0.177931  0.9995402  0.9342529  CYP3A4, CYP3A7
rs6945984  BMD  0.0000308  0.000213  0.7701149  1  0.9457471  CYP3A4, CYP3A7
rs12333983  BMD  0.0000308  0.000213  0.7701149  1  0.9457471  CYP3A4, CYP3A7
rs3735451  BMD  0.0000308  0.000213  0.7701149  1  0.9457471  CYP3A4
rs6956344  BMD  0.0000308  0.000213  0.7701149  1  0.9457471  CYP3A4
rs2242480  BMD  0.0000308  0.000213  0.7701149  1  0.9457471  CYP3A4
rs8176719  BMD  0.0006001  0.0000182  1  0.794023  1  ABO
rs687621  BMD  0.0005978  0.0000176  1  0.7783908  1  ABO
rs657152  BMD  0.0006001  0.0000182  1  0.794023  1  ABO
rs514659  BMD  0.0005978  0.0000176  1  0.7783908  1  ABO
rs643434  BMD  0.0006001  0.0000182  1  0.794023  1  ABO
rs612169  BMD  0.0005978  0.0000176  1  0.7783908  1  ABO
rs581107  BMD  0.0005984  0.0000115  1  0.623908  1  ABO
rs505922  BMD  0.0005978  0.0000176  1  0.7783908  1  ABO
rs507666  BMD  0.0005833  0.000007  1  0.4397701  1  ABO
rs651007  BMD  0.0005833  0.000007  1  0.4397701  1  ABO, SURF6
rs579459  BMD  0.0005833  0.000007  1  0.4397701  1  ABO, SURF6
rs495828  BMD  0.0005833  0.000007  1  0.4397701  1  ABO, SURF6
rs635634  BMD  0.0005833  0.000007  1  0.4397701  1  ABO, SURF6
rs56196860  BMD  0.0001283  0.0000016  0.9590805  0.0643678  0.9365517  FKBP4
rs1038196  BMD  0.000028  0.0001236  0.8227586  0.997931  0.9165517  HMGA2
rs1351394  BMD  0.000028  0.0001236  0.8227586  0.997931  0.9165517  HMGA2
rs1042725  BMD  0.000028  0.0001236  0.8227586  0.997931  0.9165517  HMGA2
rs8756  BMD  0.000028  0.0001236  0.8227586  0.997931  0.9165517  HMGA2
rs12424086  BMD  0.0000265  0.0001235  0.7912644  0.9970115  0.9197701  HMGA2, LLPH
rs34029815  BMD  0.0000265  0.0001235  0.7912644  0.9970115  0.9197701  HMGA2, LLPH
rs947211  BMI  9.52E-05  0.0000153  0.9898851  0.7687356  0.9022989  SLC41A1, RAB7L1
rs1775146  BMI  9.52E-05  0.0000153  0.9898851  0.7687356  0.9022989  SLC41A1, RAB7L1
rs13119835  BMI  2.31E-05  0.000109  0.8822989  0.9983908  0.9057471  NDST3
rs4833565  BMI  2.31E-05  0.000109  0.8822989  0.9983908  0.9057471  NDST3
rs10781293  BMI  2.00E-06  0.0001038  0.1848276  0.9448276  0.9121839  PIP5K1B
rs10869623  BMI  2.00E-06  0.0001038  0.1848276  0.9448276  0.9121839  PIP5K1B
rs941714  BMI  6.10E-06  0.0000667  0.514023  0.9763218  0.9094253  MIR656, MEG9
rs3742407  BMI  7.00E-06  0.0000687  0.5871264  0.9841379  0.9151724  MEG9
rs2295654  BMI  7.00E-06  0.0000687  0.5871264  0.9841379  0.9151724  MEG9
rs4906037  BMI  6.10E-06  0.0000667  0.514023  0.9763218  0.9094253  DIO3OS, MEG9
rs2400968  BMI  7.00E-06  0.0000687  0.5871264  0.9841379  0.9151724  DIO3OS, MEG9
rs235348  BMI  2.30E-06  0.0000694  0.2967816  0.9604598  0.9275862  TSPEAR, UBE2G2
rs690333  BMI  1.16E-05  0.0000728  0.5243678  0.9694253  0.9006897  TSPEAR, UBE2G2

a Focal SNP is defined as the center SNP $j$ in window $j$.
b Proportion of variance explained by sex-specific SNP effects within window $j$.
c Posterior probability that sex-specific effects are nonzero.
d Posterior probability that effects differ between sexes.
e Nearest genes identified through Axiom UKB WCSG annotations, release 34.

Figure A.1: LD statistics across distances

Figure A.2: Estimated power and false-discovery rate for discovering observed SNPs with effects in at least one sex
Estimated power (left) and FDR (right) shown as a function of the number of SNPs selected. Each point represents a sample average and error bars represent 95% confidence intervals, each derived using 30 Monte Carlo replicates. LBR (SNP): local Bayesian regression, utilizing $PP\,SNP_j$. SMR: single-marker regression, utilizing the F-test-based $p$-value.

Figure A.3: Power vs false-discovery rate for discovering genomic regions containing masked causal variants
Here power is defined as the expected proportion of causal variants that are being tagged by at least one selected SNP $j$ or window $j$. False discovery rate is defined as the proportion of selected SNPs or windows that are not tagging any causal variants. Each point is an estimate and error bars for both axes represent 95% confidence intervals. Point estimates and intervals were derived using 30 Monte Carlo replicates. Each facet corresponds to a different fixed width around each causal variant that defines the set of SNPs effectively tagging it. LBR (SNP): uses the $PP\,SNP_j$ metric spanning 1-0. LBR (Window): uses the maximum between PPM $\sigma^2_{g_j}$ and PPF $\sigma^2_{g_j}$ spanning 1-0. SMR: uses the F-test-based $p$-value spanning (on the $-\log_{10}$ scale) 30-0.

Figure A.4: Comparison between SMR and LBR for discovering G×S interactions
Manhattan plot showing pvalue-diff for each analyzed SNP. SNPs are colored yellow if they were focal SNPs with a posterior probability of a sex difference in $\sigma^2_{g_j}$ of at least 0.9, and colored red if that posterior probability was at least 0.95. The dashed horizontal lines denote p-diff thresholds of $1 \times 10^{-5}$ and $5 \times 10^{-8}$.
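The power and false-discovery-rate definitions given in the Figure A.3 caption can be sketched in a few lines. This is only an illustration, not the code used in the dissertation: the function name `power_and_fdr`, the use of base-pair positions on a single chromosome, and the reading of the "fixed width" as a maximum distance on either side of a position are all assumptions.

```python
from bisect import bisect_left


def power_and_fdr(selected_pos, causal_pos, width):
    """Return (power, fdr) for one simulated replicate.

    power: share of causal variants lying within `width` bp of at least
           one selected position (i.e. "tagged").
    fdr:   share of selected positions lying within `width` bp of no
           causal variant.
    Assumes all positions are on the same chromosome (hypothetical setup).
    """
    selected = sorted(selected_pos)
    causal = sorted(causal_pos)

    def has_neighbor(x, sorted_pos):
        # First position >= x - width; it is a neighbor if it is <= x + width.
        i = bisect_left(sorted_pos, x - width)
        return i < len(sorted_pos) and sorted_pos[i] <= x + width

    tagged = sum(has_neighbor(c, selected) for c in causal)
    false_hits = sum(not has_neighbor(s, causal) for s in selected)

    power = tagged / len(causal) if causal else float("nan")
    fdr = false_hits / len(selected) if selected else 0.0
    return power, fdr


# Toy replicate: two of three causal variants are tagged,
# and one of three selected positions tags nothing.
power, fdr = power_and_fdr([100, 5000, 20000], [150, 5100, 90000], width=200)
```

Averaging `power` and `fdr` over Monte Carlo replicates, for a decreasing selection threshold, would trace out one curve of the kind the caption describes.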
Figure A.5: eQTL enrichment as a function of the number of SNPs selected
LBR (Window): uses the posterior probability that $\sigma^2_{g_j}$ differs between sexes. SMR: uses the pvalue-diff metric.

APPENDIX B

CHAPTER 4 SUPPLEMENTARY MATERIAL

Table B.1: Heritability estimates for all RNA editing sites with sample size

Columns: Editing site (position:strand); N (a); $\hat{\sigma}^2_g$ (b); SE $\hat{\sigma}^2_g$ (c); $\hat{\sigma}^2_\varepsilon$ (d); SE $\hat{\sigma}^2_\varepsilon$ (e); $\hat{h}^2$ (f); $p$-value (g)

16:26512555:minus  139  0.576  0.221  0.454  0.149  0.56  0.0000819
15:110910484:plus  166  0.386  0.171  0.609  0.134  0.388  0.0000859
6:39368241:minus  166  0.411  0.176  0.599  0.135  0.407  0.000157
16:26512645:minus  159  0.335  0.166  0.647  0.139  0.341  0.000234
1:126167425:minus  165  0.31  0.16  0.686  0.138  0.311  0.00088
X:60918926:plus  34  0.696  0.551  0.23  0.442  0.752  0.0302
6:115902177:plus  76  0.325  0.278  0.689  0.254  0.32  0.0662
8:67587714:minus  89  0.333  0.241  0.689  0.218  0.326  0.0698
14:91530951:plus  119  0.112  0.14  0.842  0.171  0.117  0.0752
2:30295839:minus  117  0.193  0.175  0.812  0.181  0.192  0.0758
6:93743264:minus  141  0.189  0.152  0.816  0.159  0.188  0.0786
10:56281853:minus  150  0.122  0.129  0.884  0.152  0.122  0.136
7:96470770:plus  34  0.512  0.524  0.378  0.44  0.575  0.137
6:48557105:minus  150  0.0979  0.123  0.901  0.15  0.098  0.151
X:60918925:plus  32  0.467  0.586  0.526  0.542  0.47  0.209
18:11662642:plus  54  0.196  0.324  0.816  0.338  0.194  0.263
1:126837057:minus  11  0.904  1.74  0.00000325  1.64  1  0.272
16:26513687:minus  117  0.067  0.132  0.919  0.174  0.068  0.288
6:62205750:minus  86  0.115  0.53  0.93  0.214  0.11  0.291
5:57668385:plus  23  0.243  0.633  0.67  0.614  0.266  0.3
15:77692835:plus  81  0.0832  0.186  0.899  0.228  0.0847  0.325
6:75654298:minus  36  0.166  0.325  0.565  0.345  0.227  0.358
18:11664433:plus  36  0.106  0.416  0.909  0.468  0.105  0.362
13:140729295:minus  68  0.069  0.219  0.933  0.265  0.0688  0.365
4:91383948:minus  10  0.876  2.06  0.126  1.49  0.874  0.366
14:87895203:minus  54  0.057  0.259  0.9  0.301  0.0595  0.379
6:159410531:minus  29  0.0676  0.609  0.95  0.599  0.0664  0.438
14:39851072:minus  33  0.0375  0.429  0.994  0.53  0.0363  0.474
18:17492309:plus  150  0.00431  0.089  0.998  0.146  0.0043  0.477
1:128090247:minus  74  0.0083  0.163  0.974  0.241  0.00845  0.48
10:29217567:plus  34  0.00000698  0.399  1.02  0.473  0.00000685  0.5
13:111022773:minus  60  0.00000731  0.221  0.999  0.288  0.00000732  0.5
13:111022791:minus  79  1.97E-09  0.172  1.01  0.237  1.94E-09  0.5
13:111023101:minus  80  1.86E-09  0.154  0.927  0.214  2.01E-09  0.5
13:146223066:plus  18  0.000000764  0.613  0.819  0.688  0.000000933  0.5
13:39922407:plus  16  0.000229  0.831  1.03  0.919  0.000222  0.5
13:86543564:minus  68  1.82E-09  0.182  0.943  0.248  1.92E-09  0.5
14:111666396:plus  42  7.92E-08  0.319  1.02  0.392  7.76E-08  0.5
15:110909979:plus  117  1.93E-09  0.111  1  0.173  1.92E-09  0.5
15:59690693:plus  88  0.000000047  0.135  0.904  0.194  0.000000052  0.5
16:23629281:minus  138  1.83E-09  0.094  0.997  0.154  1.84E-09  0.5
6:168677788:plus  15  0.382  0.971  0.472  1.05  0.448  0.5
6:62207065:minus  43  1.51E-09  0.322  0.978  0.384  1.55E-09  0.5
6:93743195:minus  124  2.01E-09  0.099  0.965  0.159  2.09E-09  0.5
7:3053788:plus  40  1.49E-09  0.365  1.03  0.436  1.45E-09  0.5
9:115531653:minus  31  4.66E-08  0.411  1.02  0.499  4.59E-08  0.5
X:60916145:plus  24  1.48E-09  0.578  1  0.665  1.47E-09  0.5

a Sample size (the number of animals with a detectable editing level)
b REML estimated genomic variance component
c Standard error of genomic variance estimate
d REML estimated residual variance component
e Standard error of residual variance estimate
f Genomic heritability estimate
g $p$-value from a likelihood ratio test, testing $H_0: \sigma^2_g = 0$.
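As a reading aid for Table B.1, the sketch below shows how the heritability column follows from the two variance components, and one common way to obtain a p-value for $H_0: \sigma^2_g = 0$ when the parameter lies on the boundary of its space. The function names are hypothetical, and the 50:50 mixture of $\chi^2_0$ and $\chi^2_1$ is an assumption: the source does not state the reference distribution, although the p-values capped at 0.5 in the table are consistent with it.

```python
import math


def genomic_heritability(var_g, var_e):
    """h^2 = var_g / (var_g + var_e), from REML variance component estimates."""
    return var_g / (var_g + var_e)


def lrt_pvalue_boundary(lrt_stat):
    """p-value for H0: var_g = 0 under an ASSUMED 50:50 chi2_0 / chi2_1 mixture.

    For one degree of freedom, P(chi2_1 > x) = erfc(sqrt(x / 2)); the mixture
    halves that tail, so a statistic of 0 gives p = 0.5.
    """
    stat = max(lrt_stat, 0.0)  # guard against tiny negative values from optimisation
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))


# First row of Table B.1: var_g = 0.576, var_e = 0.454
h2 = genomic_heritability(0.576, 0.454)
```

Rounded to two decimals, `h2` matches the tabulated value for site 16:26512555:minus; rows whose genomic variance estimate is effectively zero correspond to a likelihood ratio statistic near zero and hence to p = 0.5 under this assumed mixture.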
Table B.2: ADAR-localized genomic variance estimates for 67 carcass composition, meat quality, and growth traits

Columns: trait (a); N (b); $\hat{\sigma}^2_{g\,local}$ (c); SE $\hat{\sigma}^2_{g\,local}$ (d); $\hat{\sigma}^2_{g\,BG}$ (e); SE $\hat{\sigma}^2_{g\,BG}$ (f); $p$-value (g)

ADG  936  0.144  0.0849  0.266  0.0545  0.000293
last_lum  932  0.0265  0.0237  0.358  0.0615  0.000459
tofat  936  0.0773  0.051  0.247  0.0513  0.000774
bf10_22wk  940  0.0325  0.0259  0.315  0.0528  0.00118
car_bf10  927  0.0275  0.0227  0.287  0.0495  0.00264
mtpro  936  0.0355  0.0281  0.3  0.0539  0.0032
belly  933  0.0343  0.0284  0.205  0.0515  0.00912
wt_22wk  940  0.0541  0.0395  0.252  0.0548  0.00967
Days  940  0.0677  0.0467  0.231  0.0528  0.0109
WBS  923  0.0145  0.0166  0.259  0.0582  0.0154
lrf_22wk  940  0.0166  0.0168  0.368  0.0576  0.0219
fftoln  936  0.0328  0.0278  0.212  0.0532  0.0268
mtfat  936  0.0383  0.0309  0.221  0.0538  0.0276
car_wt  934  0.0254  0.0226  0.142  0.0442  0.0424
bf10_16wk  940  0.0163  0.0171  0.402  0.0621  0.0483
lrf_16wk  940  0.00887  0.0126  0.411  0.0659  0.0651
farm_wt  934  0.0211  0.0202  0.189  0.0498  0.0685
lma_22wk  940  0.0124  0.0153  0.339  0.064  0.0967
bf10_19wk  940  0.00681  0.0105  0.393  0.0602  0.0972
lrf_13wk  940  0.00817  0.0122  0.382  0.0648  0.115
wt_19wk  940  0.0169  0.0182  0.325  0.0626  0.123
bf10_10wk  940  0.00545  0.00957  0.362  0.0632  0.126
first_rib  845  0.00615  0.0104  0.203  0.0529  0.133
lrf_10wk  940  0.00541  0.0101  0.321  0.0622  0.164
lrf_19wk  940  0.0041  0.00833  0.49  0.0679  0.181
lma_19wk  940  0.0107  0.014  0.315  0.0618  0.185
wt_16wk  940  0.0095  0.0129  0.255  0.0564  0.204
temp_24h  931  0.00549  0.00953  0.179  0.0488  0.205
lma_16wk  940  0.00488  0.0096  0.274  0.0583  0.207
off_flavor  928  0.00274  0.00605  0.0312  0.0254  0.211
ph_24h  913  0.00315  0.00748  0.192  0.0512  0.219
conn_tiss  928  0.00364  0.00779  0.135  0.0435  0.248
juiciness  928  0.00428  0.00767  0.0695  0.0337  0.252
tenderness  928  0.00439  0.0092  0.265  0.0582  0.262
last_rib  933  0.00259  0.00753  0.251  0.0538  0.284
bf10_13wk  940  0.00225  0.00717  0.363  0.0608  0.325
overtend  928  0.00246  0.00764  0.273  0.0591  0.354
firm  918  0.00147  0.00604  0.146  0.0453  0.397
lma_13wk  940  0.00147  0.00693  0.332  0.0637  0.409
wt_birth  940  0.000834  0.00557  0.213  0.0534  0.464
b  887  0.000294  0.00596  0.356  0.0645  0.476
temp_45m  933  0.00027  0.00534  0.2  0.0521  0.476
wt_3wk  940  2.06E-09  0.00388  0.0603  0.0313  0.5
wt_6wk  939  0.00000685  0.00484  0.166  0.0473  0.5
wt_10wk  940  2.06E-09  0.00515  0.225  0.0535  0.5
lma_10wk  940  2.06E-09  0.00545  0.3  0.0598  0.5
wt_13wk  940  0.0000122  0.00553  0.283  0.0596  0.5
dress_ptg  934  0.00000018  0.00542  0.25  0.0572  0.5
color  931  0.00000206  0.00531  0.227  0.0549  0.5
L  887  2.06E-09  0.00586  0.342  0.0658  0.5
a  887  0.00000111  0.00572  0.408  0.0673  0.5
cook_yield  924  4.91E-08  0.00552  0.286  0.0598  0.5
marb  932  2.06E-09  0.0059  0.372  0.0671  0.5
driploss  932  0.00000275  0.00524  0.239  0.0551  0.5
ph_45m  920  2.06E-09  0.00442  0.107  0.0399  0.5
pH_dec  900  0.000000122  0.00418  0.0735  0.0347  0.5
car_length  933  0.000000163  0.00595  0.393  0.0684  0.5
num_ribs  655  0.00000781  0.00743  0.361  0.0811  0.5
car_lma  928  2.06E-09  0.00565  0.469  0.0687  0.5
ham  933  2.06E-09  0.00545  0.265  0.0582  0.5
loin  933  2.06E-09  0.0054  0.254  0.0571  0.5
boston  933  2.06E-09  0.0064  0.403  0.0726  0.5
picnic  933  2.06E-09  0.00697  0.519  0.0825  0.5
spareribs  930  2.06E-09  0.00601  0.389  0.0688  0.5
moisture  922  6.58E-08  0.00558  0.312  0.0616  0.5
fat  922  2.06E-09  0.00592  0.483  0.0717  0.5
protein  921  2.06E-09  0.00575  0.354  0.065  0.5

a More information about each trait can be found in Velez-Irizarry et al. [106]
b Sample size
c REML estimated ADAR-localized genomic variance component
d Standard error of ADAR-localized genomic variance estimate
e REML estimated background genomic variance component
f Standard error of background genomic variance estimate
g $p$-value from a likelihood ratio test, testing $H_0: \sigma^2_{g\,local} = 0$.
87 TableB.3:Genomiccovarianceestimatesbetweensite-speci˝cRNAediting levelsand67carcasscomposition,meatquality,andgrowthtraits trait a editingsite ^ ˆ g b SE ^ ˆ g c ^ ˙ g 1 g 2 d ^ ˙ " 1 " 2 e ^ ˙ p 1 p 2 f p -value g temp_45m15:110910484:plus0.69785320.20066190.1947621-0.2112079-0.01644580.0013573 moisture16:26512645:minus-0.70007330.2000639-0.21684110.0504626-0.16637860.0041762 b1:126167425:minus0.62897660.21388390.194918-0.0364930.1584250.0064947 moisture16:26512555:minus-0.51795040.177277-0.21960510.02667-0.19293510.0113916 lrf_22wk16:26512645:minus0.53144820.20584770.1795119-0.01024090.16927090.0121452 boston6:39368241:minus-0.59338860.1416324-0.3154157-0.0886732-0.40408890.0133171 temp_45m1:126167425:minus0.6678270.21641990.1734387-0.02486640.14857230.0148532 fat16:26512645:minus0.51612890.18975340.20864950.04945720.25810670.0179607 last_lum16:26512645:minus0.51455070.19527520.1913932-0.00392130.18747190.0194878 conn_tiss15:110910484:plus-0.55274520.2268368-0.13748180.0535365-0.08394530.0197101 ˙toln15:110910484:plus-0.51081370.1923533-0.18187490.0911418-0.09073310.0228199 picnic16:26512555:minus0.61747210.1627870.326789-0.26183340.06495560.0236605 color16:26512555:minus-0.47119090.1923727-0.1755983-0.0175018-0.19310010.0256422 L16:26512555:minus0.47812280.18218230.2071758-0.04367770.16349810.0263383 last_rib16:26512645:minus0.48593470.22667580.13535870.05877710.19413580.0391113 wt_3wk16:26512555:minus-0.68932250.283316-0.11895980.0715114-0.04744850.0476917 mtfat15:110910484:plus-0.44720250.2039263-0.15225250.0757822-0.07647030.0477578 car_bf1016:26512645:minus0.38056010.20668790.12162960.06580330.18743280.0481823 moisture1:126167425:minus-0.47507480.2306446-0.14330860.1190167-0.02429190.0498548 pH_dec16:26512645:minus0.51506110.33275870.0772333-0.1163237-0.03909040.0498863 ph_45m1:126167425:minus0.59166690.29816620.1010036-0.1692225-0.06821890.0500024 num_ribs16:26512555:minus-0.46535060.200344-0.20250830.1169612-0.08554710.0500832 
b           15:110910484:plus    0.4728684   0.2177794   0.1524001  -0.0375163   0.1148838   0.0508777
pH_dec      1:126167425:minus    0.545542    0.3473278   0.0745804  -0.0602521   0.0143283   0.0517325
car_wt      16:26512645:minus    0.5843297   0.2535415   0.1309091  -0.1174667   0.0134424   0.0572019
dress_ptg   15:110910484:plus    0.3978575   0.2058523   0.136633   -0.0464345   0.0901985   0.0594525
temp_24h    6:39368241:minus    -0.4792902   0.2032275  -0.1400908  -0.3415478  -0.4816387   0.0599582
wt_3wk      6:39368241:minus    -0.6890714   0.3143074  -0.094669    0.0404951  -0.0541739   0.0599692
driploss    15:110910484:plus   -0.3867181   0.2107094  -0.1298377   0.1519669   0.0221292   0.0630321
wt_10wk     15:110910484:plus   -0.4181734   0.2175119  -0.1252656  -0.0017598  -0.1270255   0.0632891
belly       16:26512645:minus    0.5505696   0.2374497   0.1403632   0.0305953   0.1709586   0.0667876
marb        16:26512645:minus    0.4073535   0.2089169   0.1462183   0.0804837   0.226702    0.0783371
temp_24h    16:26512645:minus    0.5809202   0.2723509   0.1205994  -0.1429541  -0.0223547   0.0813252
b           16:26512555:minus    0.3182276   0.1857647   0.1465983  -0.0996418   0.0469565   0.0820527
lma_19wk    15:110910484:plus   -0.3544394   0.1928252  -0.1416331   0.0641738  -0.0774593   0.0831723
lma_22wk    15:110910484:plus   -0.3548259   0.185235   -0.1522617   0.0392898  -0.1129718   0.0841729
wt_3wk      15:110910484:plus   -0.5972737   0.3112136  -0.0873063  -0.0039575  -0.0912638   0.0886598
ADG         15:110910484:plus   -0.3313927   0.1919747  -0.134748    0.1008835  -0.0338645   0.1009422
bf10_19wk   6:39368241:minus     0.2911484   0.181434    0.1274551  -0.1708141  -0.043359    0.1041676
bf10_22wk   16:26512645:minus    0.3444655   0.2183995   0.1121352   0.0114755   0.1236107   0.104852
a           1:126167425:minus   -0.4000349   0.2477763  -0.1218053   0.1353083   0.013503    0.1066612
ph_45m      15:110910484:plus    0.5515001   0.1747256   0.0979464  -0.1123954  -0.014449    0.1081383
L           15:110910484:plus   -0.3501275   0.1985502  -0.1386411   0.1738201   0.035179    0.1085107
a           16:26512555:minus   -0.2911679   0.1825653  -0.1408192   0.147053    0.0062338   0.109653
ham         15:110910484:plus    0.3728721   0.2201093   0.1199787  -0.1438646  -0.0238859   0.1128409
wt_6wk      6:39368241:minus    -0.4467673   0.2535047  -0.1049723   0.0962471  -0.0087252   0.1140483
wt_6wk      15:110910484:plus   -0.3815983   0.2228853  -0.1076214  -0.0741062  -0.1817275   0.1148975
first_rib   16:26512645:minus    0.399654    0.2403603   0.111272    0.0607663   0.1720384   0.1233643
WBS         16:26512645:minus   -0.3636745   0.2209916  -0.1186282  -0.0154713  -0.1340995   0.1266919
farm_wt     16:26512645:minus    0.4520142   0.2501198   0.1157098  -0.0985005   0.0172093   0.129934
wt_22wk     15:110910484:plus   -0.3195801   0.2055025  -0.1153862   0.0797386  -0.0356475   0.1379282
mtpro       16:26512645:minus    0.3398009   0.2242867   0.1089872  -0.0133729   0.0956143   0.1434817
num_ribs    1:126167425:minus   -0.3870412   0.2455344  -0.1281953   0.1351493   0.0069539   0.1510288
wt_6wk      16:26512645:minus   -0.3778147   0.2514977  -0.0942609  -0.0085057  -0.1027666   0.1524548
a           6:39368241:minus    -0.2674827   0.1846067  -0.1169186  -0.0956302  -0.2125488   0.1528527
lrf_19wk    16:26512645:minus    0.3072594   0.2080163   0.1187616   0.0643692   0.1831308   0.1605653
wt_10wk     16:26512645:minus   -0.3241411   0.2409166  -0.0928159   0.1103734   0.0175575   0.1652651
Days        15:110910484:plus    0.3004014   0.2046399   0.1084417  -0.0099376   0.0985042   0.1662346
protein     1:126167425:minus    0.2926056   0.2187942   0.10401    -0.0876923   0.0163177   0.1663039
firm        16:26512645:minus    0.3378975   0.2632971   0.080186   -0.0424556   0.0377304   0.1711678
ph_45m      6:39368241:minus     0.3912599   0.2785512   0.079414   -0.0531326   0.0262814   0.1741369
bf10_13wk   15:110910484:plus    0.28291     0.2116243   0.100724   -0.1323884  -0.0316644   0.1787045
b           16:26512645:minus    0.28794     0.2252757   0.0978759  -0.0264436   0.0714323   0.1879715
a           15:110910484:plus   -0.2780321   0.206455   -0.105727   -0.0030079  -0.1087349   0.1933367
conn_tiss   16:26512555:minus   -0.3209072   0.2378448  -0.0917414   0.0243247  -0.0674166   0.1940133
car_lma     16:26512555:minus    0.2720148   0.1734739   0.1442266  -0.1099755   0.0342512   0.2045772
bf10_19wk   16:26512645:minus    0.2837844   0.2172744   0.0988265   0.0141858   0.1130123   0.2048029
off_flavor  1:126167425:minus   -0.4994814   0.4187853  -0.0537841   0.0470309  -0.0067532   0.2088798
juiciness   1:126167425:minus   -0.3695545   0.3075291  -0.0655989   0.0782349   0.012636    0.2111407
ph_45m      16:26512645:minus    0.4301392   0.3073378   0.0739628  -0.0103846   0.0635782   0.2132017
ph_24h      15:110910484:plus    0.2956059   0.2298783   0.0887444  -0.1305047  -0.0417603   0.2174268
last_lum    1:126167425:minus    0.2688917   0.2150534   0.0982519  -0.0501248   0.0481271   0.2227989
pH_dec      6:39368241:minus     0.3892706   0.3144992   0.0639188  -0.0038634   0.0600554   0.2259253
bf10_16wk   16:26512645:minus    0.2581885   0.2112177   0.0957501  -0.0083846   0.0873655   0.2270376
lrf_16wk    6:39368241:minus     0.2205575   0.1900252   0.0964238  -0.0872119   0.0092119   0.2331844
car_bf10    6:39368241:minus     0.2082403   0.1949637   0.0770797  -0.0462581   0.0308216   0.2338157
tenderness  15:110910484:plus   -0.2758242   0.2151229  -0.0944788   0.0445027  -0.049976    0.2353996
last_lum    15:110910484:plus    0.2533347   0.2031338   0.0979362  -0.0214882   0.076448    0.237471
wt_16wk     15:110910484:plus   -0.2590085   0.2138979  -0.0880929  -0.0200387  -0.1081316   0.2385872
protein     16:26512555:minus    0.2240531   0.1945021   0.1001846  -0.1833907  -0.083206    0.2408811
overtend    16:26512645:minus    0.288481    0.2282291   0.0928696  -0.0039821   0.0888875   0.2416114
driploss    6:39368241:minus    -0.2436748   0.2204878  -0.0803892   0.1009261   0.0205368   0.2466252
overtend    15:110910484:plus   -0.2732181   0.2152852  -0.0938704   0.0415702  -0.0523002   0.2468283
belly       15:110910484:plus    0.28095     0.2248886   0.0866608   0.0313898   0.1180506   0.24926
fftoln      6:39368241:minus    -0.272801    0.2156793  -0.0927051   0.0060356  -0.0866695   0.251919
belly       1:126167425:minus    0.2893782   0.2463631   0.0800935   0.0049732   0.0850667   0.2525818
car_wt      16:26512555:minus    0.2947202   0.2305088   0.0920975  -0.0953904  -0.0032929   0.2643088
car_length  16:26512555:minus    0.2231852   0.1880947   0.1072539  -0.0550028   0.0522511   0.2646737
tofat       15:110910484:plus   -0.217367    0.2033011  -0.079918    0.0369009  -0.0430171   0.2704865
temp_24h    1:126167425:minus   -0.3140209   0.2579774  -0.0743201  -0.3131702  -0.3874903   0.2727324
pH_dec      15:110910484:plus    0.4554498   0.347066    0.0635814  -0.0123335   0.0512479   0.2730641
loin        16:26512555:minus    0.2595433   0.2109098   0.0980741  -0.0656056   0.0324685   0.2744468
ph_24h      16:26512645:minus   -0.2828384   0.2572533  -0.0736194   0.110557    0.0369376   0.2748747
cook_yield  15:110910484:plus    0.2489873   0.2145452   0.0866483  -0.0049773   0.0816709   0.2786352
wt_19wk     16:26512555:minus   -0.2211012   0.190224   -0.1007887  -0.0151277  -0.1159164   0.2793971
fat         16:26512555:minus    0.2073381   0.179037    0.1033649   0.1796151   0.28298     0.2803456
car_wt      15:110910484:plus    0.2998882   0.2578546   0.0736429  -0.1189525  -0.0453096   0.2828107
wt_19wk     15:110910484:plus   -0.2276545   0.2056989  -0.0870598   0.0222916  -0.0647682   0.2887758
tenderness  16:26512645:minus    0.25802     0.2312266   0.0818423   0.0116808   0.093523    0.293002
lma_13wk    6:39368241:minus     0.2182071   0.2074296   0.0825468  -0.0036772   0.0788697   0.293991
marb        15:110910484:plus    0.232211    0.2061084   0.0898224   0.0101539   0.0999763   0.2948951
spareribs   6:39368241:minus    -0.238048    0.1937929  -0.1013694  -0.0328957  -0.1342652   0.2970763
dress_ptg   16:26512645:minus    0.2471977   0.2438372   0.0725209  -0.0376742   0.0348467   0.2976703
lrf_16wk    16:26512645:minus    0.2246823   0.2145423   0.083247    0.0754287   0.1586757   0.2983286
spareribs   15:110910484:plus   -0.2544918   0.2070737  -0.0984231  -0.0042541  -0.1026772   0.3023104
pH_dec      16:26512555:minus    0.3315993   0.308353    0.0647022  -0.0946616  -0.0299593   0.3060452
wt_6wk      1:126167425:minus   -0.2834482   0.2751467  -0.0637884   0.0046975  -0.0590909   0.3072649
num_ribs    16:26512645:minus   -0.272252    0.2404619  -0.0944863  -0.0739905  -0.1684768   0.308162
farm_wt     16:26512555:minus    0.2696127   0.2195012   0.0935755  -0.0893406   0.0042349   0.3085265
bf10_13wk   1:126167425:minus    0.2158451   0.2287056   0.0713476  -0.1296212  -0.0582736   0.3131989
car_wt      1:126167425:minus    0.2779068   0.2741068   0.064099   -0.0963426  -0.0322436   0.3171421
first_rib   6:39368241:minus     0.2480959   0.2370649   0.0738355  -0.0609145   0.012921    0.3209069
num_ribs    6:39368241:minus    -0.2255063   0.2144594  -0.0900992  -0.175821   -0.2659202   0.3276659
last_rib    1:126167425:minus   -0.2083497   0.2375975  -0.0618438   0.0742384   0.0123946   0.3291292
mtfat       6:39368241:minus    -0.2193661   0.2128225  -0.0763235  -0.0632297  -0.1395532   0.3333426
L           1:126167425:minus    0.2341815   0.2315646   0.07841    -0.0489712   0.0294389   0.3353028
belly       16:26512555:minus    0.2350459   0.2183797   0.0834237   0.008415    0.0918387   0.3360095
ADG         6:39368241:minus    -0.1929044   0.1931165  -0.0792743  -0.0961642  -0.1754385   0.3402255
lrf_19wk    6:39368241:minus     0.1750989   0.1847332   0.0811234  -0.0362444   0.044879    0.3415369
color       15:110910484:plus    0.2234357   0.2261328   0.0715226  -0.1235983  -0.0520756   0.3437511
L           16:26512645:minus    0.2289738   0.2281236   0.0786197  -0.031651    0.0469687   0.3448456
lrf_10wk    6:39368241:minus     0.1852931   0.2079282   0.0697556   0.0491578   0.1189133   0.3502217
conn_tiss   6:39368241:minus    -0.2302646   0.2528379  -0.0566242  -0.0961793  -0.1528035   0.3524339
lma_22wk    16:26512645:minus    0.2412204   0.2342659   0.0794976   0.002409    0.0819066   0.3548819
tofat       16:26512645:minus    0.2179066   0.2352458   0.06675    -0.0054048   0.0613452   0.3563514
bf10_22wk   15:110910484:plus    0.1791731   0.2057897   0.0644749  -0.0200097   0.0444652   0.36103
mtfat       16:26512555:minus   -0.208964    0.2079981  -0.0811153  -0.0141015  -0.0952168   0.3614119
lma_10wk    6:39368241:minus     0.1840969   0.2106209   0.0668068  -0.0048988   0.061908    0.3640684
WBS         16:26512555:minus    0.1969153   0.2084988   0.0783108  -0.047497    0.0308138   0.3677198
off_flavor  15:110910484:plus   -0.3819488   0.4347374  -0.040543    0.0357121  -0.0048309   0.3743531
farm_wt     1:126167425:minus    0.2414641   0.2603512   0.0622988  -0.0848597  -0.0225609   0.3754812
last_lum    6:39368241:minus     0.183068    0.2002154   0.0732312   0.0313499   0.1045812   0.3781264
loin        15:110910484:plus   -0.217189    0.2285626  -0.0687994   0.0482158  -0.0205836   0.3820069
wt_birth    16:26512555:minus    0.2111275   0.2151731   0.0768658  -0.0217238   0.055142    0.3831235
firm        6:39368241:minus     0.2050648   0.2530462   0.0529088  -0.0544969  -0.0015881   0.3873636
wt_10wk     1:126167425:minus   -0.2133004   0.2557446  -0.0561465  -0.0435573  -0.0997038   0.3892369
protein     15:110910484:plus    0.1782856   0.2122012   0.0661004  -0.0170241   0.0490763   0.3906322
bf10_10wk   1:126167425:minus    0.1713848   0.2254814   0.0593093  -0.0821068  -0.0227975   0.3966245
Days        16:26512555:minus    0.1830439   0.1978819   0.0770067   0.0044836   0.0814903   0.4004955
bf10_22wk   6:39368241:minus     0.149477    0.1947515   0.0586909  -0.0851445  -0.0264536   0.406966
marb        16:26512555:minus    0.1619657   0.1868475   0.0774346   0.1086784   0.1861131   0.4127097
protein     6:39368241:minus    -0.1594778   0.1967758  -0.0666801   0.1319136   0.0652335   0.4145938
mtpro       1:126167425:minus    0.1737786   0.2237524   0.0577126  -0.0202291   0.0374835   0.4149834
juiciness   16:26512645:minus    0.2575322   0.3225344   0.0436639   0.0137643   0.0574282   0.4175586
lma_10wk    15:110910484:plus   -0.1654816   0.2136592  -0.0581469  -0.0701406  -0.1282875   0.4184943
tenderness  6:39368241:minus    -0.1774821   0.20296    -0.0658282  -0.1548923  -0.2207204   0.4194451
lma_22wk    6:39368241:minus    -0.1695973   0.2034665  -0.0680204   0.084712    0.0166917   0.426322
color       1:126167425:minus   -0.2001555   0.2596138  -0.052825   -0.0123827  -0.0652077   0.4275635
bf10_19wk   1:126167425:minus    0.1635779   0.2164784   0.0586532  -0.0306371   0.0280161   0.4306564
b           6:39368241:minus     0.1636274   0.207116    0.0624275   0.0542659   0.1166934   0.4306727
conn_tiss   1:126167425:minus   -0.2156408   0.2938451  -0.0441841   0.0627977   0.0186136   0.4372271
bf10_16wk   6:39368241:minus     0.1472387   0.1921934   0.062496   -0.0589849   0.0035111   0.4383716
first_rib   1:126167425:minus    0.2100715   0.2651923   0.0536197   0.0957624   0.1493821   0.4442219
mtfat       16:26512645:minus   -0.1913746   0.2368362  -0.0587899  -0.0399691  -0.098759    0.4466127
lma_10wk    16:26512645:minus   -0.1650294   0.2328298  -0.0534743   0.0766793   0.023205    0.4503357
ham         1:126167425:minus    0.1889276   0.2504659   0.0542388  -0.097682   -0.0434432   0.4528543
car_wt      6:39368241:minus     0.1986988   0.2475285   0.0531838  -0.1582138  -0.10503     0.4536542
wt_3wk      1:126167425:minus   -0.2665715   0.3630597  -0.0362631  -0.080621   -0.1168841   0.4542367
bf10_10wk   16:26512645:minus   -0.1467791   0.2142593  -0.0549716   0.0570213   0.0020497   0.4571014
picnic      16:26512645:minus    0.2417437   0.2162563   0.1000864  -0.0471375   0.0529489   0.4578164
car_bf10    1:126167425:minus    0.1445919   0.225077    0.0449817  -0.021583    0.0233987   0.4589563
wt_13wk     15:110910484:plus   -0.15945     0.2136491  -0.0559987  -0.1231484  -0.1791472   0.4590514
ham         16:26512645:minus    0.1812063   0.2405252   0.0553937  -0.0554881  -0.0000944   0.4606909
ph_24h      16:26512555:minus   -0.1701174   0.2279145  -0.0569059   0.1328594   0.0759534   0.4657904
lrf_22wk    6:39368241:minus     0.1288604   0.1883409   0.0541613  -0.1375891  -0.0834278   0.4674333
car_length  15:110910484:plus   -0.1524818   0.2069982  -0.0605154   0.0543489  -0.0061665   0.467747
lma_19wk    6:39368241:minus    -0.1558365   0.2129435  -0.0575692   0.1094468   0.0518776   0.4682166
ADG         16:26512555:minus   -0.1478755   0.1897236  -0.0682718  -0.0023605  -0.0706322   0.475393
driploss    16:26512645:minus   -0.1686144   0.2485945  -0.0478864  -0.0228969  -0.0707833   0.4754865
lma_10wk    1:126167425:minus   -0.1626418   0.2430544  -0.0488912  -0.0108961  -0.0597873   0.4764479
lrf_10wk    1:126167425:minus    0.1544691   0.2357872   0.0501495  -0.0215955   0.028554    0.4768412
ADG         16:26512645:minus    0.1692704   0.2251502   0.0586575  -0.0609666  -0.0023091   0.4814227
boston      16:26512555:minus    0.204255    0.1877892   0.1017839  -0.0454032   0.0563807   0.4899681
juiciness   15:110910484:plus   -0.2069683   0.3062559  -0.0382808   0.0854201   0.0471394   0.4907493
bf10_13wk   16:26512645:minus    0.140599    0.2168903   0.0502488   0.0249588   0.0752076   0.4923723
wt_birth    1:126167425:minus    0.1949795   0.2633105   0.0507076  -0.1353645  -0.0846569   0.495174
ham         16:26512555:minus    0.1482028   0.2081321   0.0590085  -0.0477163   0.0112922   0.4961451
ham         6:39368241:minus    -0.1564239   0.2163433  -0.0546501  -0.0164437  -0.0710938   0.4984195
last_lum    16:26512555:minus    0.1355909   0.189248    0.0634519   0.0217243   0.0851762   0.5004708
juiciness   16:26512555:minus    0.2011498   0.2909916   0.042879    0.0443604   0.0872393   0.5095759
fat         15:110910484:plus    0.1393573   0.1957201   0.0610172  -0.0312756   0.0297416   0.5107377
lrf_13wk    1:126167425:minus    0.1289383   0.2167077   0.0479946  -0.0905471  -0.0425525   0.5122901
juiciness   6:39368241:minus     0.1987993   0.3046339   0.0372065  -0.0603744  -0.0231679   0.5155103
wt_22wk     6:39368241:minus    -0.1303561   0.200148   -0.0504709  -0.1340778  -0.1845487   0.5275101
fat         1:126167425:minus    0.1411743   0.2170744   0.0547474  -0.0664766  -0.0117293   0.5383842
last_rib    15:110910484:plus   -0.1229844   0.2166575  -0.0410274   0.2114687   0.1704413   0.5401597
fftoln      16:26512555:minus   -0.1432472   0.2129389  -0.0549381   0.0163072  -0.0386309   0.540564
WBS         1:126167425:minus    0.1624697   0.2543049   0.046345   -0.0686191  -0.0222741   0.5423175
lma_10wk    16:26512555:minus   -0.1152547   0.2033807  -0.0476796   0.0312974  -0.0163822   0.5474477
bf10_19wk   15:110910484:plus    0.1205604   0.1997647   0.0473525   0.0171634   0.0645159   0.5543059
moisture    6:39368241:minus    -0.132149    0.2170971  -0.0469339   0.0282772  -0.0186566   0.555355
boston      1:126167425:minus    0.2069861   0.2260322   0.0756742  -0.0572619   0.0184123   0.5576765
ADG         1:126167425:minus    0.1323108   0.2277962   0.0449251   0.0049455   0.0498706   0.5673039
car_length  16:26512645:minus   -0.1282113   0.2235696  -0.0467479   0.0245737  -0.0221742   0.5676653
boston      15:110910484:plus   -0.1702409   0.2073931  -0.0688635  -0.0423637  -0.1112272   0.571351
WBS         15:110910484:plus    0.1363886   0.2286853   0.0444778   0.0055971   0.0500749   0.5722162
farm_wt     6:39368241:minus     0.1446576   0.2332404   0.0436757  -0.147997   -0.1043213   0.5722967
temp_45m    6:39368241:minus    -0.1502822   0.2395513  -0.0432328  -0.1496475  -0.1928803   0.5736385
spareribs   16:26512645:minus    0.1481132   0.2231766   0.054075    0.0254644   0.0795395   0.5754834
ph_24h      6:39368241:minus     0.1386448   0.2404243   0.0403208  -0.0342431   0.0060777   0.5763378
off_flavor  6:39368241:minus    -0.2306106   0.4246238  -0.0262392   0.0078805  -0.0183587   0.5811339
bf10_13wk   6:39368241:minus     0.1057578   0.2033075   0.040972   -0.0966894  -0.0557173   0.5837899
wt_13wk     16:26512555:minus   -0.1132874   0.2060728  -0.0461116   0.0090628  -0.0370488   0.584874
ph_24h      1:126167425:minus   -0.1496535   0.2723411  -0.0361461  -0.069358   -0.1055041   0.5885506
last_rib    16:26512555:minus    0.105973    0.209809    0.0395416   0.1143323   0.1538739   0.5890126
wt_22wk     16:26512555:minus   -0.1136835   0.1989365  -0.0482446  -0.0234481  -0.0716926   0.5930439
Days        1:126167425:minus   -0.1283173   0.2409273  -0.0392281   0.0860409   0.0468128   0.5943441
cook_yield  6:39368241:minus     0.1223936   0.2198685   0.0425102  -0.0943446  -0.0518344   0.5966366
ph_45m      16:26512555:minus    0.1516595   0.275161    0.0361816   0.0260771   0.0622587   0.6006761
bf10_22wk   1:126167425:minus    0.1042059   0.2219571   0.0347694  -0.006106    0.0286634   0.6009449
fftoln      16:26512645:minus   -0.1379222   0.2432724  -0.0414282  -0.0112717  -0.0526999   0.6021818
dress_ptg   16:26512555:minus    0.107475    0.2133221   0.0413156  -0.0195704   0.0217452   0.6025657
lma_16wk    1:126167425:minus    0.1251931   0.2412788   0.0382801  -0.0461075  -0.0078274   0.604718
temp_45m    16:26512555:minus    0.1280856   0.2263802   0.0439548  -0.0667698  -0.0228149   0.6057148
car_length  1:126167425:minus   -0.1158889   0.2305331  -0.0404105   0.0402431  -0.0001674   0.609722
cook_yield  16:26512645:minus    0.1276      0.2390478   0.0398805   0.0180708   0.0579513   0.6121718
lrf_13wk    6:39368241:minus     0.0937985   0.201529    0.0380085   0.0321128   0.0701214   0.6156588
loin        1:126167425:minus   -0.1354001   0.2568535  -0.0373797   0.0560479   0.0186681   0.6203779
wt_22wk     1:126167425:minus    0.1193795   0.239603    0.0368454  -0.0031671   0.0336783   0.6246045
Days        16:26512645:minus   -0.1222784   0.2384499  -0.0380758   0.0480155   0.0099397   0.6274328
a           16:26512645:minus   -0.0996815   0.2155526  -0.0377495  -0.0624146  -0.1001641   0.6303711
bf10_19wk   16:26512555:minus   -0.0830604   0.1810327  -0.0403962   0.0630358   0.0226396   0.6310913
wt_16wk     16:26512555:minus   -0.0992865   0.208538   -0.0391869   0.001633   -0.037554    0.6396938
cook_yield  1:126167425:minus   -0.1149566   0.2451556  -0.0344387  -0.038007   -0.0724457   0.6446724
first_rib   15:110910484:plus    0.1225542   0.2467426   0.035025    0.0586411   0.0936661   0.6450172
lrf_13wk    16:26512555:minus    0.081697    0.1877211   0.039355   -0.0173667   0.0219883   0.6457789
overtend    16:26512555:minus   -0.1016838   0.2043616  -0.0421201   0.0801851   0.038065    0.6469964
picnic      15:110910484:plus    0.1352035   0.2066383   0.0585721  -0.1663812  -0.107809    0.649712
temp_24h    15:110910484:plus   -0.1186655   0.248899   -0.0320029  -0.0082482  -0.0402512   0.650646
lrf_22wk    16:26512555:minus    0.0813195   0.1835097   0.0382327  -0.00958     0.0286527   0.6523264
overtend    6:39368241:minus    -0.1002467   0.2041207  -0.0376138  -0.1831348  -0.2207486   0.6534165
fat         6:39368241:minus     0.0884439   0.1819734   0.042917   -0.1453352  -0.1024182   0.6542608
bf10_10wk   6:39368241:minus     0.0846975   0.2031756   0.0336485  -0.0978635  -0.064215    0.6545824
firm        16:26512555:minus   -0.1025534   0.2486785  -0.0295026  -0.0389664  -0.068469    0.6583117
moisture    15:110910484:plus   -0.1024474   0.2222102  -0.0351406  -0.0149755  -0.0501161   0.6635611
car_lma     1:126167425:minus   -0.1052799   0.2144879  -0.040782    0.0543422   0.0135603   0.6707484
wt_13wk     16:26512645:minus   -0.1023502   0.2422701  -0.0316575   0.1145418   0.0828843   0.6745644
dress_ptg   6:39368241:minus     0.0924685   0.2311743   0.0294511   0.0188937   0.0483448   0.6779402
picnic      6:39368241:minus     0.1272447   0.2107339   0.0538243  -0.1748191  -0.1209947   0.6805866
car_lma     15:110910484:plus    0.0969451   0.1988965   0.0408374  -0.0609388  -0.0201014   0.6816932
dress_ptg   1:126167425:minus    0.0973208   0.2539051   0.0275009  -0.0244175   0.0030834   0.6855967
lma_13wk    1:126167425:minus    0.0914911   0.2352427   0.0300887  -0.0351742  -0.0050855   0.6868016
lma_13wk    16:26512555:minus   -0.0776127   0.1993232  -0.0341269   0.0150029  -0.0191239   0.6931593
lrf_19wk    15:110910484:plus    0.0751761   0.1931471   0.0329649   0.0100871   0.0430521   0.6976633
lma_19wk    1:126167425:minus   -0.0927744   0.2403673  -0.0294269   0.0811806   0.0517537   0.6982174
off_flavor  16:26512645:minus   -0.176229    0.4665122  -0.0175972   0.0548617   0.0372645   0.6997954
Days        6:39368241:minus     0.0813298   0.2041485   0.0306457   0.2161869   0.2468326   0.703174
lma_16wk    16:26512645:minus    0.0955141   0.2430472   0.0290555  -0.0345935  -0.005538    0.7050698
lrf_10wk    15:110910484:plus    0.0773913   0.2197702   0.0272958   0.0709857   0.0982815   0.7103721
boston      16:26512645:minus   -0.1117029   0.2243104  -0.0419084  -0.00132    -0.0432284   0.7234848
tofat       6:39368241:minus    -0.0664997   0.2014655  -0.0251202  -0.1065948  -0.131715    0.7289428
spareribs   16:26512555:minus    0.0797391   0.1926834   0.0379814  -0.0006944   0.037287    0.7310941
wt_19wk     6:39368241:minus    -0.0719486   0.2067507  -0.0279894  -0.0852592  -0.1132486   0.7346876
bf10_16wk   15:110910484:plus    0.0661742   0.200787    0.0264741  -0.0186399   0.0078342   0.741028
driploss    1:126167425:minus   -0.0763835   0.2559055  -0.0210051   0.0749105   0.0539053   0.7420244
lrf_22wk    1:126167425:minus    0.0650115   0.2145676   0.0233606  -0.0474056  -0.0240449   0.7437884
farm_wt     15:110910484:plus    0.082839    0.2468476   0.0231778  -0.0810389  -0.0578611   0.7580161
lrf_13wk    16:26512645:minus    0.0650751   0.2297954   0.0221296   0.1644947   0.1866243   0.7593929
tofat       1:126167425:minus    0.0656985   0.2358232   0.0203366   0.00887     0.0292066   0.7643431
fftoln      1:126167425:minus    0.0820297   0.25393     0.0231438   0.046007    0.0691508   0.7648345
car_bf10    15:110910484:plus    0.0536883   0.2037739   0.0189283   0.0332653   0.0521937   0.7687679
mtfat       1:126167425:minus    0.0775479   0.2526576   0.022045    0.0260355   0.0480805   0.768823
car_lma     16:26512645:minus   -0.07139     0.2071697  -0.0288831  -0.087058   -0.1159411   0.7717547
wt_birth    16:26512645:minus   -0.0847345   0.2572912  -0.0227705  -0.1079413  -0.1307119   0.7728507
bf10_22wk   16:26512555:minus    0.0517128   0.1864257   0.0232568   0.0192372   0.0424941   0.7736674
lrf_13wk    15:110910484:plus   -0.0524319   0.2019387  -0.02117    -0.0445553  -0.0657253   0.7801575
wt_10wk     6:39368241:minus    -0.0618979   0.2273147  -0.0196546  -0.0692724  -0.0889271   0.7841681
loin        16:26512645:minus   -0.0707188   0.2471477  -0.0208961   0.0286673   0.0077712   0.7893274
belly       6:39368241:minus     0.0630998   0.2358162   0.0192672   0.057424    0.0766912   0.7901433
wt_6wk      16:26512555:minus   -0.0645583   0.2382651  -0.0199841  -0.0065554  -0.0265395   0.7965679
tenderness  16:26512555:minus   -0.0545148   0.2082312  -0.0219981   0.0527379   0.0307398   0.8063238
cook_yield  16:26512555:minus   -0.0539747   0.2087991  -0.0217456   0.0336472   0.0119016   0.8082364
lrf_19wk    1:126167425:minus    0.0494301   0.2086892   0.0199124  -0.0196591   0.0002533   0.8088257
lma_22wk    16:26512555:minus    0.0509056   0.1974898   0.0229958   0.039394    0.0623897   0.8113698
tofat       16:26512555:minus   -0.0431893   0.1957682  -0.0184248  -0.0064753  -0.0249      0.8219577
lma_22wk    1:126167425:minus    0.0515857   0.2333947   0.0174564  -0.0087097   0.0087467   0.8279009
marb        6:39368241:minus     0.045541    0.2009511   0.0188274  -0.1487871  -0.1299597   0.8307879
mtpro       6:39368241:minus     0.0403107   0.1954781   0.0160683  -0.1077932  -0.0917249   0.8311801
lma_16wk    6:39368241:minus    -0.0469076   0.2236622  -0.0158017   0.0891573   0.0733556   0.8366246
lrf_16wk    1:126167425:minus   -0.0424866   0.2216924  -0.0154942  -0.0039458  -0.0194401   0.8391594
wt_birth    15:110910484:plus    0.0520304   0.2502773   0.0142693  -0.1784088  -0.1641395   0.8517117
num_ribs    15:110910484:plus   -0.046148    0.2352905  -0.0169746  -0.1309972  -0.1479718   0.856189
temp_24h    16:26512555:minus   -0.0414201   0.2307567  -0.0136441   0.1769013   0.1632572   0.858759
mtpro       15:110910484:plus   -0.0351686   0.2054784  -0.0130801   0.0281308   0.0150508   0.8607926
wt_13wk     6:39368241:minus     0.0382724   0.2164188   0.0136476  -0.1217133  -0.1080657   0.861163
protein     16:26512645:minus   -0.0381909   0.2288531  -0.012975   -0.1924175  -0.2053924   0.8628197
wt_16wk     16:26512645:minus   -0.0406553   0.242841   -0.0123087  -0.0012041  -0.0135129   0.8666131
driploss    16:26512555:minus    0.0356692   0.2150485   0.0133817   0.0546035   0.0679852   0.8666922
temp_45m    16:26512645:minus    0.048772    0.263157    0.0127654   0.0254079   0.0381732   0.8706103
wt_birth    6:39368241:minus    -0.0440749   0.2429699  -0.0127384  -0.0959784  -0.1087168   0.872635
bf10_10wk   16:26512555:minus   -0.026251    0.1902752  -0.0122417   0.0740019   0.0617602   0.8791091
bf10_13wk   16:26512555:minus    0.026578    0.1840945   0.0125376   0.1185115   0.1310492   0.8799535
car_lma     6:39368241:minus    -0.0329626   0.1913465  -0.0147553   0.0382447   0.0234894   0.8824579
bf10_16wk   1:126167425:minus   -0.0300976   0.2187554  -0.0108983  -0.0144716  -0.02537     0.8859281
color       16:26512645:minus   -0.0344413   0.2520494  -0.0097509  -0.0440156  -0.0537665   0.8901388
car_bf10    16:26512555:minus   -0.0224438   0.1941115  -0.0090602   0.0960906   0.0870304   0.8950399
lma_19wk    16:26512645:minus    0.0319783   0.2340198   0.0106262   0.044185    0.0548111   0.8971709
first_rib   16:26512555:minus   -0.0277208   0.2235411  -0.009995    0.0842241   0.0742292   0.9048467
WBS         6:39368241:minus     0.025388    0.2102882   0.0093911   0.1383409   0.147732    0.9072824
firm        1:126167425:minus    0.0285841   0.2869703   0.0062276   0.1133976   0.1196252   0.9114846
lma_19wk    16:26512555:minus    0.0226767   0.2007018   0.0098648   0.0242411   0.0341059   0.9115242
picnic      1:126167425:minus    0.039421    0.2252794   0.0156001  -0.0449129  -0.0293128   0.9124973
color       6:39368241:minus    -0.0237271   0.2353243  -0.0073135  -0.0191298  -0.0264433   0.9183845
wt_22wk     16:26512645:minus    0.0247414   0.2337634   0.0079986  -0.0444655  -0.0364669   0.92508
lrf_16wk    16:26512555:minus    0.016846    0.1865787   0.0081234   0.1381554   0.1462788   0.9295823
lrf_22wk    15:110910484:plus   -0.0162969   0.1992203  -0.0063776  -0.0185724  -0.02495     0.9318165
last_rib    6:39368241:minus     0.0174823   0.2285796   0.0054738   0.0868618   0.0923356   0.9339904
wt_19wk     1:126167425:minus   -0.0182972   0.2358962  -0.006036    0.0065993   0.0005632   0.9379897
conn_tiss   16:26512645:minus   -0.0176857   0.287292   -0.003862   -0.000251   -0.004113    0.9486608
wt_13wk     1:126167425:minus    0.0154587   0.2464528   0.0046351  -0.0459512  -0.0413161   0.9505577
marb        1:126167425:minus    0.0147371   0.232828    0.0050407  -0.0251173  -0.0200765   0.953837
L           6:39368241:minus    -0.0124359   0.2145004  -0.0046694   0.0849619   0.0802924   0.9555233
firm        15:110910484:plus   -0.0138502   0.2668092  -0.0033419   0.0652463   0.0619045   0.9562916
lma_13wk    15:110910484:plus    0.0127701   0.2155959   0.0046818  -0.0368351  -0.0321533   0.9581677
lrf_10wk    16:26512645:minus    0.0120285   0.2349942   0.0039431   0.150634    0.1545772   0.9589063
lrf_19wk    16:26512555:minus   -0.0091436   0.1793327  -0.0047481   0.0902383   0.0854902   0.9596086
tenderness  1:126167425:minus    0.0132823   0.2489693   0.0039214  -0.0261136  -0.0221922   0.9600551
wt_19wk     16:26512645:minus   -0.0088705   0.2327081  -0.0030016   0.0538658   0.0508643   0.9702132
wt_16wk     6:39368241:minus    -0.0086343   0.2199931  -0.0029587  -0.0682485  -0.0712071   0.9718213
lma_16wk    16:26512555:minus    0.0074161   0.209505    0.0029484   0.0505676   0.053516    0.9731017
bf10_16wk   16:26512555:minus    0.005716    0.1846633   0.002747    0.0755197   0.0782668   0.9751567
wt_16wk     1:126167425:minus   -0.0061446   0.2494703  -0.0017801   0.0408541   0.039074    0.9826452
bf10_10wk   15:110910484:plus    0.0038918   0.2092594   0.0014812  -0.0027238  -0.0012427   0.9846833
lma_13wk    16:26512645:minus   -0.0047349   0.235922   -0.0015644   0.0485605   0.0469961   0.9847688
lma_16wk    15:110910484:plus    0.0045142   0.224057    0.0015134  -0.0696241  -0.0681107   0.9853733
spareribs   1:126167425:minus    0.0101337   0.2309478   0.0035499   0.0160492   0.0195991   0.9861807
lrf_10wk    16:26512555:minus    0.0053121   0.2030586   0.0022666   0.079236    0.0815026   0.9871669
loin        6:39368241:minus    -0.0031743   0.2258024  -0.0010487  -0.056781   -0.0578296   0.9910243
car_length  6:39368241:minus     0.0023048   0.2052394   0.0009366  -0.0153793  -0.0144427   0.9913493
off_flavor  16:26512555:minus    0.0018807   0.4125968   0.0002467  -0.0378749  -0.0376282   0.9967745
wt_10wk     16:26512555:minus    0.0026681   0.2191706   0.0009652   0.0077715   0.0087367   0.9970458
lrf_16wk    15:110910484:plus    0.001788    0.2008969   0.0007346  -0.0522624  -0.0515279   1
mtpro       16:26512555:minus    0.0015277   0.1899246   0.0006824   0.0161665   0.0168489   1
overtend    1:126167425:minus   -0.0024682   0.2488897  -0.0007309   0.032218    0.0314872   1

a  More information about each trait can be found in Velez-Irizarry et al. [106]
b  Genetic correlation estimate, where $\rho_g = \sigma_{g_1 g_2} / \sqrt{\sigma^2_{g_1} \sigma^2_{g_2}}$
c  Standard error of genetic correlation estimate
d  Genomic covariance REML estimate
e  Residual covariance REML estimate
f  Phenotypic covariance estimate
g  P-value testing $H_0: \sigma_{g_1 g_2} = 0$

Figure B.1: Pairwise LD plot between SNPs flanking ADAR

BIBLIOGRAPHY

REFERENCES

[1] Peter Donnelly. Progress and challenges in genome-wide association studies in humans. Nature, December 2008.
[2] Lucia A. Hindorff, Praveen Sethupathy, Heather A. Junkins, Erin M. Ramos, et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proceedings of the National Academy of Sciences, June 2009.
[3] Brendan Maher. Personal genomes: The case of the missing heritability. Nature, November 2008.
[4] Teri A. Manolio, Francis S. Collins, Nancy J. Cox, David B. Goldstein, et al. Finding the missing heritability of complex diseases. Nature, October 2009.
[5] Jian Yang, Beben Benyamin, Brian P. McEvoy, Scott Gordon, et al. Common SNPs explain a large proportion of the heritability for human height. Nature Genetics, 2010.
[6] Gustavo de los Campos, Daniel Sorensen, and Daniel Gianola. Genomic heritability: What is it? PLoS Genetics, 2015.
[7] Felix R. Day, Hannes Helgason, Daniel I. Chasman, Lynda M. Rose, et al. Physical and neurobehavioral determinants of reproductive onset and success. Nature Genetics, 623, June 2016.
[8] John R. Shaffer, Jinxi Li, Myoung Keun Lee, Jasmien Roosenboom, et al. Multiethnic GWAS reveals polygenic architecture of earlobe attachment. The American Journal of Human Genetics, December 2017.
[9] Barbara Schormair, Chen Zhao, Steven Bell, Erik Tilch, et al. Identification of novel risk loci for restless legs syndrome in genome-wide association studies in individuals of European ancestry: a meta-analysis. The Lancet Neurology, November 2017.
[10] Joëlle A. Pasman, Karin J. H. Verweij, Zachary Gerring, Sven Stringer, et al. GWAS of lifetime cannabis use reveals new risk loci, genetic overlap with psychiatric traits, and a causal effect of schizophrenia liability. Nature Neuroscience, September 2018.
[11] Loic Yengo, Julia Sidorenko, Kathryn E. Kemper, Zhili Zheng, et al. Meta-analysis of genome-wide association studies for height and body mass index in ~700000 individuals of European ancestry. Human Molecular Genetics, October 2018.
[12] M. C. Zillikens, M. Yazdanpanah, L. M. Pardo, F. Rivadeneira, et al. Sex-specific genetic effects influence variation in body composition. Diabetologia, 2008.
[13] Jian Yang, Andrew Bakshi, Zhihong Zhu, Gibran Hemani, et al. Genome-wide genetic homogeneity between sexes and populations for human height and body mass index. Human Molecular Genetics, 2015.
[14] Konrad Rawlik, Oriol Canela-Xandri, and Albert Tenesa. Evidence for sex-specific genetic architectures across a spectrum of human complex traits. Genome Biology, 17(1):166, December 2016.
[15] Bruce Walsh and Michael Lynch. Genetics and Analysis of Quantitative Traits. Sinauer Associates, Inc, 1 edition, 1998.
[16] Joshua C. Randall, Thomas W. Winkler, Zoltán Kutalik, Sonja I. Berndt, et al. Sex-stratified genome-wide association studies including 270,000 individuals show sexual dimorphism in genetic loci for anthropometric traits. PLoS Genetics, 9(6):e1003500, June 2013.
[17] Ching-Ti Liu, Karol Estrada, Laura M. Yerges-Armstrong, Najaf Amin, et al. Assessment of gene-by-sex interaction effect on bone mineral density. Journal of Bone and Mineral Research: the official journal of the American Society for Bone and Mineral Research, October 2012.
[18] Andrew DeWan, Mugen Liu, Stephen Hartman, Samuel Shao-Min Zhang, et al. HTRA1 promoter polymorphism in wet age-related macular degeneration. Science, 992, 2006.
[19] Kristin L. Ayers and Heather J. Cordell. SNP selection in genome-wide and candidate gene studies via penalized logistic regression. Genetic Epidemiology, December 2010.
[20] Yongtao Guan and Matthew Stephens. Bayesian variable selection regression for genome-wide association studies and other large-scale problems. Annals of Applied Statistics, 2011.
[21] Hui Yi, Patrick Breheny, Netsanet Imam, Yongmei Liu, et al. Penalized multimarker vs. single-marker regression methods for genome-wide association studies of quantitative traits. Genetics, January 2015.
[22] Jeremy A. Sabourin, William Valdar, and Andrew B. Nobel. A permutation approach for selecting the penalty parameter in penalized model selection. Biometrics, December 2015.
[23] T. H. E. Meuwissen, B. J. Hayes, and M. E. Goddard. Prediction of total genetic value using genome-wide dense marker maps. Genetics, 2001.
[24] David Habier, Rohan L. Fernando, Kadir Kizilkaya, and Dorian J. Garrick. Extension of the Bayesian alphabet for genomic selection. BMC Bioinformatics, 12, 2011.
[25] James G. Scott and James O. Berger. Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem. The Annals of Statistics, October 2010.
[26] Rohan Fernando, Ali Toosi, Anna Wolc, Dorian Garrick, et al. Application of whole-genome prediction methods for genome-wide association studies: A Bayesian approach. Journal of Agricultural, Biological, and Environmental Statistics, 2017.
[27] Peter M. Visscher, Matthew A. Brown, Mark I. McCarthy, and Jian Yang. Five years of GWAS discovery. The American Journal of Human Genetics, January 2012.
[28] Dan L. Nicolae, Eric Gamazon, Wei Zhang, Shiwei Duan, et al. Trait-associated SNPs are more likely to be eQTLs: Annotation to enhance discovery from GWAS. PLoS Genetics, 6(4):e1000888, April 2010.
[29] Yang I. Li, Bryce van de Geijn, Anil Raj, David A. Knowles, et al. RNA splicing is a primary link between genetic variation and disease. Science, 2016.
[30] Rob Benne, Janny Van Den Burg, Just P. J. Brakenhoff, Paul Sloof, et al. Major transcript of the frameshifted coxII gene from trypanosome mitochondria contains four nucleotides that are not encoded in the DNA. Cell, September 1986.
[31] Brenda L. Bass and Harold Weintraub. An unwinding activity that covalently modifies its double-stranded RNA substrate. Cell, December 1988.
[32] Alekos Athanasiadis, Alexander Rich, and Stefan Maas. Widespread A-to-I RNA editing of Alu-containing mRNAs in the human transcriptome. PLoS Biology, 2(12):e391, November 2004.
[33] Miyoko Higuchi, Stefan Maas, Frank N. Single, Jochen Hartner, et al. Point mutation in an AMPA receptor gene rescues lethality in mice deficient in the RNA-editing enzyme ADAR2. Nature, 2000.
[34] Yukio Kawahara, Adda Grimberg, Sarah Teegarden, Cedric Mombereau, et al. Dysregulated editing of serotonin 2C receptor mRNAs results in energy dissipation and loss of fat mass. J Neurosci, 2008.
[35] Meng How Tan, Qin Li, Raghuvaran Shanmugam, Robert Piskol, et al. Dynamic landscape and regulation of RNA editing in mammals. Nature, 2017.
[36] Zishuai Wang, Xikang Feng, Zhonglin Tang, and Shuai Cheng Li. Genome-wide investigation and functional analysis of Sus scrofa RNA editing sites across eleven tissues. Genes, 10(5):327, April 2019.
[37] Yuebo Zhang, Longchao Zhang, Jingwei Yue, Xia Wei, et al. Genome-wide identification of RNA editing in seven porcine tissues by matched DNA and RNA high-throughput sequencing. Journal of Animal Science and Biotechnology, 2019.
[38] Lauren A. Weiss, Lin Pan, Mark Abney, and Carole Ober. The sex-specific genetic architecture of quantitative traits in humans. Nature Genetics, 2006.
[39] Thomas J. Hoffmann, Georg B. Ehret, Priyanka Nandakumar, Dilrini Ranatunga, et al. Genome-wide association analyses using electronic health records identify new loci influencing blood pressure variation. Nature Genetics, 2017.
[40] Gustavo de los Campos, Daniel Gianola, and David B. Allison. Predicting genetic predisposition in humans: the promise of whole-genome markers. Nature Reviews Genetics, December 2010.
[41] Gustavo de los Campos, Ana I. Vazquez, Rohan Fernando, Yann C. Klimentidis, et al. Prediction of complex human traits using the genomic best linear unbiased predictor. PLoS Genetics, 9(7), 2013.
[42] Louis Lello, Steven G. Avery, Laurent Tellier, Ana I. Vazquez, et al. Accurate genomic prediction of human height. Genetics, October 2018.
[43] NengjunYi,VargheseGeorge,andDavidB.Allison.Stochasticsearchvariableselection foridentifyingmultiplequantitativetraitloci. Genetics ,164(3):11291138,July2003. [44] JianZeng,MarcinPszczola,AnnaWolc,TomaszStrabel,etal.Genomicbreedingvalue predictionandqtlmappingofqtlmas2011datausingbayesianandgblupmethods. BMC Proceedings ,6(SUPPL.2):S13,2012. [45] BjarniJ.Vilhjálmsson,JianYang,HilaryK.Finucane,AlexanderGusev,etal.Modeling linkagedisequilibriumincreasesaccuracyofpolygenicriskscores. AmericanJournalof HumanGenetics ,2015. [46] GustavodelosCampos,YogasudhaVeturi,AnaI.Vazquez,ChristinaLehermeier,etal. Incorporatinggeneticheterogeneityinwhole-genomeregressionsusinginteractions. Journal ofAgricultural,Biological,andEnvironmentalStatistics ,2015. [47] YogasudhaVeturi,GustavodelosCampos,NengjunYi,WenHuang,etal.Modeling heterogeneityinthegeneticarchitectureofethnicallydiversegroupsusingrandome˙ect interactionmodels. Genetics ,2019. [48] IrisM.Heid,AnneU.Jackson,JoshuaC.Randall,ThomasW.Winkler,etal.Meta-analysis identi˝es13newlociassociatedwithwaist-hipratioandrevealssexualdimorphisminthe geneticbasisoffatdistribution. NatureGenetics ,2010. [49] ThomasW.Winkler,AnneE.Justice,MariaelisaGra˙,LlildaBarata,etal.Thein˛uenceof ageandsexongeneticassociationswithadultbodysizeandshape:Alarge-scalegenome- wideinteractionstudy. PLOSGenetics ,11(10):e1005378,October2015. [50] DmitryShungin,ThomasW.Winkler,DamienC.Croteau-Chonka,TeresaFerreira,etal. Newgeneticlocilinkadiposeandinsulinbiologytobodyfatdistribution. Nature , 2015. [51] DavidKarasikandS.L.Ferrari.Contributionofgender-speci˝cgeneticfactorstoosteo- porosisrisk. AnnalsofHumanGenetics ,2008. [52] PaulinoPérezandGustavoDeLosCampos.Genome-wideregressionandpredictionwith thebglrstatisticalpackage. Genetics ,2014. 102 [53] SerenaSanna,AnneU.Jackson,RamaiahNagaraja,CristenJ.Willer,etal.Common variantsinthegdf5-uqccregionareassociatedwithvariationinhumanheight. Nature Genetics ,2008. 
[54] MaríaCorrea-Rodríguez,JacquelineSchmidtRio-Valle,andBlancaRueda-Medina.Akap11 genepolymorphismisassociatedwithbonemassmeasuredbyquantitativeultrasoundin youngadults. InternationalJournalofMedicalSciences ,2018. [55] Dong-LiZhu,Xiao-FengChen,Wei-XinHu,Shan-ShanDong,etal.Multiplefunctional variantsat13q14risklocusforosteoporosisregulateranklexpressionthroughlong-range super-enhancer. JournalofBoneandMineralResearch ,33(7),2018. [56] BenjaminH.Mullin,JohnP.Walsh,HouFengZheng,SuzanneJ.Brown,etal.Genome- wideassociationstudyusingfamily-basedcohortsidenti˝esthewlsandccdc170/esr1loci asassociatedwithbonemineraldensity. BMCGenomics ,2016. [57] BenjaminH.Mullin,JingHuaZhao,SuzanneJ.Brown,JohnR.B.Perry,etal.Genome-wide associationstudymeta-analysisforquantitativeultrasoundparametersofboneidenti˝es˝ve novellociforbroadbandultrasoundattenuation. Humanmoleculargenetics , 2802,2017. [58] AdamE.Locke,BratatiKahali,SonjaI.Berndt,AnneE.Justice,etal.Geneticstudiesof bodymassindexyieldnewinsightsforobesitybiology. Nature ,2015. [59] JongWeonChoiandSooHwanPai.Associationsbetweenabobloodgroupsandosteoporosis inpostmenopausalwomen. AnnalsofClinicalandLaboratoryScience , 2004. [60] B.B.LuandK.H.Li.Associationbetweenabobloodgroupsandosteoporosisseverity inchineseadultsaged50yearsandover. JournalofInternationalMedicalResearch , 2011. [61] CathieSudlow,JohnGallacher,NaomiAllen,ValerieBeral,etal.Ukbiobank:Anopen accessresourceforidentifyingthecausesofawiderangeofcomplexdiseasesofmiddleand oldage. PLOSMedicine ,12(3):e1001779,2015. [62] ScottA.Funkhouser,JuanP.Steibel,RonaldO.Bates,NancyE.Raney,DariusSchenk, etal.Evidencefortranscriptome-widernaeditingamongsusscrofapre-1sineelements. BMCGenomics ,18:360,2017. [63] MatthewBlow,P.AndrewFutreal,RichardWooster,andMichaelR.Stratton.Asurveyof rnaeditinginhumanbrain. GenomeResearch ,November2004. [64] ErezY.Levanon,EliEisenberg,RodrigoYelin,SergeyNemzer,etal.Systematicidenti˝- cationofabundanta-to-ieditingsitesinthehumantranscriptome. 
NatureBiotechnology , 2004. [65] EliEisenberg,SergeyNemzer,YaronKinar,RotemSorek,etal.Isabundanta-to-irna editingprimate-speci˝c? TrendsinGenetics ,2005. 103 [66] YossefNeeman,ErezY.Levanon,MichaelF.Jantsch,andEliEisenberg.Rnaeditinglevel inthemouseisdeterminedbythegenomicrepeatrepertoire. RNA ,2006. [67] LilyBazak,AmiHaviv,MichalBarak,JasmineJacob-Hirsch,etal.A-to-irnaeditingoccurs atoverahundredmilliongenomicsites,locatedinamajorityofhumangenes. Genome Research ,March2014. [68] JiaYuChen,ZhiyuPeng,RongliZhang,XinZhuangYang,etal.Rnaeditomeinrhesus macaqueshapedbypurifyingselection. PLoSGenetics ,2014. [69] EricS.Lander,LaurenM.Linton,BruceBirren,ChadNusbaum,etal.Initialsequencing andanalysisofthehumangenome. Nature ,2001. [70] LilyBazak,ErezY.Levanon,andEliEisenberg.Genome-wideanalysisofalueditability. NucleicAcidsResearch ,June2014. [71] NikitaS.VassetzkyandDmitriA.Kramerov.Sinebase:Adatabaseandtoolforsineanalysis. NucleicAcidsResearch ,2013. [72] GalitLev-Maor,RotemSorek,ErezY.Levanon,NuritPaz,etal.Rna-editing-mediatedexon evolution. GenomeBiology ,8(2):R29,2007. [73] A.D.J.ScaddenandChristopherW.J.Smith.Rnaiisantagonizedbyhyper-editing. EMBOreports ,December2001. [74] ClaudiaL.Kleinman,VéroniqueAdoue,andJacekMajewski.Rnaeditingofprotein sequences:Arareeventinhumantranscriptomes. RNA ,September2012. [75] GokulRamaswami,WeiLin,RobertPiskol,MengHowTan,etal.Accurateidenti˝cation ofhumanaluandnon-alurnaeditingsites. NatureMethods ,June2012. [76] WeiLin,RobertPiskol,MengHowTan,andJinBillyLi.Commenton"widespreadrnaand dnasequencedi˙erencesinthehumantranscriptome". Science ,335(6074):1302,March 2012. [77] HengLi.Astatisticalframeworkforsnpcalling,mutationdiscovery,associationmap- pingandpopulationgeneticalparameterestimationfromsequencingdata. Bioinformatics , 2011. [78] SarahN.De˚tandHeatherA.Hundley.Toeditornottoedit:Regulationofadarediting speci˝cityande˚ciency. WileyInterdisciplinaryReviews:RNA ,2016. 
[79] JosephK.Pickrell,YoavGilad,andJonathanK.Pritchard.Commenton"widespreadrna anddnasequencedi˙erencesinthehumantranscriptome". Science ,335(6074):1302,2012. [80] HengLi,JueRuan,andRichardDurbin.MappingshortDNAsequencingreadsandcalling variantsusingmappingqualityscores. GenomeResearch ,November 2008. 104 [81] XuedaHu,ShengqingWan,YingOu,BopingZhou,etal.Rnaover-editingofblcapcon- tributestohepatocarcinogenesisidenti˝edbywhole-genomeandtranscriptomesequencing. CancerLetters ,2015. [82] ChammiranDaniel,GiladSilberberg,MikaelaBehm,andMarieÖhman.Aluelementsshape theprimatetranscriptomebycis-regulationofrnaediting. GenomeBiology ,15(2):R28, 2014. [83] A.F.A.Smit,R.Hubley,andP.Green.Repeatmaskeropen-4.0,2013. [84] J.Jurka,V.V.Kapitonov,A.Pavlicek,P.Klonowski,etal.Repbaseupdate,adatabase ofeukaryoticrepetitiveelements. CytogeneticandGenomeResearch , 2005. [85] D.B.Edwards,C.W.Ernst,R.J.Tempelman,G.JMRosa,etal.Quantitativetraitloci mappinginanf2durocxpietrainresourcepopulation:I.growthtraits. JournalofAnimal Science ,2008. [86] SAndrews.Fastqc:aqualitycontroltoolforhighthroughputsequencedata,2010. [87] LinnéaSmedsandAxelKünstner.Condetri-acontentdependentreadtrimmerforillumina data. PLoSONE ,6(10):e26314,October2011. [88] BenLangmeadandStevenL.Salzberg.Fastgapped-readalignmentwithbowtie2. Nature methods ,2012. [89] ColeTrapnell,LiorPachter,andStevenL.Salzberg.Tophat:Discoveringsplicejunctions withrna-seq. Bioinformatics ,2009. [90] HadleyWickham. ggplot2:ElegantGraphicsforDataAnalysis .Springer-VerlagNewYork, 2016. [91] RachelB.Brem,GaëlYvert,RebeccaClinton,andLeonidKruglyak.Geneticdissectionof transcriptionalregulationinbuddingyeast. Science ,April2002. [92] RitsertC.JansenandJ.P.Nap.Geneticalgenomics:theaddedvaluefromsegregation. Trendsingenetics:TIG ,July2001. [93] K.G.Ardlie,D.S.Deluca,A.V.Segre,T.J.Sullivan,etal.Thegenotype-tissueexpression (gtex)pilotanalysis:Multitissuegeneregulationinhumans. Science , May2015. 
[94] AlexisBattle,SaraMostafavi,X.Zhu,J.B.Potash,etal.Characterizingthegeneticbasis oftranscriptomediversitythroughrna-sequencingof922individuals. GenomeResearch , January2014. [95] GokulRamaswami,PatriciaDeng,RuiZhang,MaryAnnaCarbone,etal.Geneticmap- pinguncoverscis-regulatorylandscapeofrnaediting. NatureCommunications ,6(1):8194, November2015. 105 [96] YerbolZ.Kurmangaliyev,SammiAli,andSergeyV.Nuzhdin.Geneticdeterminantsof rnaeditinglevelsofadartargetsindrosophilamelanogaster. Genes|Genomes|Genetics , February2016. [97] TongjunGu,DanielM.Gatti,AnujSrivastava,ElizabethM.Snyder,etal.Geneticar- chitecturesofquantitativevariationinrnaeditingpathways. Genetics , 2016. [98] EddiePark,JiguangGuo,ShihaoShen,LevonDemirdjian,etal.Populationandallelic variationofa-to-irnaeditinginhumantranscriptomes. GenomeBiology ,2017. [99] ThomasJ.Lopdell,VictoriaHawkins,ChristineCouldrey,KathrynTiplady,etal. Widespreadcis-regulationofrnaeditinginalargemammal. RNA ,March 2019. [100] WoutervanRheenen,WouterJ.Peyrot,AndrewJ.Schork,S.HongLee,etal.Genetic correlationsofpolygenicdiseasetraits:fromtheorytopractice. NatureReviewsGenetics , 2019. [101] PetrDanecek,Christo˙erNellåker,RebeccaE.McIntyre,JorgeE.Buendia-Buendia,etal. Highlevelsofrna-editingsiteconservationamongst15laboratorymousestrains. Genome Biology ,13(4),2012. [102] J.-H.Shin,S.Blay,B.McNeney,andJ.Graham.Ldheatmap:Anrfunctionforgraphical displayofpairwiselinkagedisequilibriabetweensinglenucleotidepolymorphisms. JStat Soft ,16:CodeSnippet3,2006. [103] ElizabethA.Hackler,DavidC.Airey,CaitlinC.Shannon,MonsheelS.Sodhi,etal.5-ht2c receptorrnaeditingintheamygdalaofc57bl/6j,dba/2j,andbalb/cjmice. Neuroscience Research ,2006. [104] E.LanderandN.Schork.Geneticdissectionofcomplextraits. Science , 2048,September1994. [105] NataliaS.Forneris,AndresLegarra,ZulmaG.Vitezica,ShogoTsuruta,etal.Quality controlofgenotypesusingheritabilityestimatesofgenecontentatthemarker. Genetics , 2015. 
[106] DeborahVelez-Irizarry,SebastianCasiro,KaitlynR.Daza,RonaldO.Bates,etal.Ge- neticcontroloflongissimusdorsimusclegeneexpressionvariationandjointanalysiswith phenotypicquantitativetraitlociinpigs. BMCGenomics ,20(1):3,December2019. [107] S.Casiró,D.Velez-Irizarry,C.W.Ernst,N.E.Raney,etal.Genome-wideassociationstudy inanf2durocxpietrainresourcepopulationforeconomicallyimportantmeatqualityand carcasstraits. JournalofAnimalScience ,2017. [108] AnthonyM.Bolger,MarcLohse,andBjoernUsadel.Trimmomatic:a˛exibletrimmerfor illuminasequencedata. Bioinformatics ,August2014. 106 [109] JoseL.GualdrónDuarte,RodolfoJ.C.Cantet,RonaldO.Bates,CatherineW.Ernst,etal. Rapidscreeningforphenotype-genotypeassociationsbylineartransformationsofgenomic evaluations. BMCBioinformatics ,15(1):246,2014. [110] Y.L.BernalRubio,J.L.GualdrónDuarte,R.O.Bates,C.W.Ernst,etal.Implementing meta-analysisfromgenome-wideassociationstudiesforporkqualitytraits1. Journalof AnimalScience ,December2015. [111] MinKangHyun,NoahA.Zaitlen,ClaireM.Wade,AndrewKirby,etal.E˚cientcontrolof populationstructureinmodelorganismassociationmapping. Genetics , 2008. [112] StevenG.SelfandKung-yeeLiang.Asymptoticpropertiesofmaximumlikelihoodesti- matorsandlikelihoodratiotestsundernonstandardconditions. JournaloftheAmerican StatisticalAssociation ,1987. [113] TroyM.Fischer,ArthurR.Gilmour,andJuliusWerf.Computingapproximatestandard errorsforgeneticparametersderivedfromrandomregressionmodels˝ttedbyaverage informationreml. GeneticsSelectionEvolution ,36(3):363,2004. [114] TomazBerisaandJosephK.Pickrell.Approximatelyindependentlinkagedisequilibrium blocksinhumanpopulations. Bioinformatics ,January2016. [115] ChenYao,RobyJoehanes,AndrewD.Johnson,TianxiaoHuan,etal.Sex-andage- interactingeqtlsinhumancomplexdiseases. HumanMolecularGenetics , April2014. 107