EVIDENCE FOR THE ASSOCIATED PRODUCTION OF A W BOSON AND A TOP QUARK AT ATLAS

By

James Koll

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Physics - Doctor of Philosophy

2013

ABSTRACT

EVIDENCE FOR THE ASSOCIATED PRODUCTION OF A W BOSON AND A TOP QUARK AT ATLAS

By

James Koll

This thesis discusses a search for the Standard Model single top W t-channel process. An analysis has been performed searching for the W t-channel process using 4.7 fb−1 of integrated luminosity collected with the ATLAS detector at the Large Hadron Collider. A boosted decision tree is trained using machine learning techniques to increase the separation between signal and background. A profile likelihood fit is used to measure the cross-section of the W t-channel process to be σ(pp → W t + X) = 16.8 ± 2.9 (stat) ± 4.9 (syst) pb, consistent with the Standard Model prediction. This fit is also used to generate pseudoexperiments to calculate the significance, finding an observed (expected) 3.3σ (3.4σ) excess over background.

ACKNOWLEDGMENTS

I am immensely thankful for the amazing people who have helped and encouraged me as I worked on this dissertation. I'd like to express deep gratitude to my advisor, Jim Linnemann, for all of his mentorship. I'd also like to give special thanks to Huaqiao Zhang for his patience in working with me on this analysis and teaching me so much about HEP. I would also like to thank Reinhard Schwienhorst for working with me from day one of my time at MSU.

It is almost impossible to name all of my fellow students and coworkers who have helped me both as physicists and friends. Thank you to James Kraus, for showing me the ropes while we muddled through L1Calo upgrade planning. A massive debt is also owed to the students who have graduated before me for all the times they helped me, and for raising the bar so very high, especially Sarah Heim, Jenny Holzbauer, and Jeremiah Holzbauer. I'd also like to thank Emily Johnson, Brad Schoenrock, and Patrick True, who volunteered their time to review my dissertation, to act as a sounding board for my ideas, and, most importantly, to listen to me complain.

I'd like to acknowledge the people who made this dissertation possible in a myriad of indirect ways. Thank you to Dennis Hewett, for giving me the passion for physics that led me here. Thank you to my favorite cat Elly for putting up with the upsetting lack of belly rubs over the last year. Most importantly, I cannot thank my parents enough for instilling in me a love for science and supporting me in everything I do.
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter 1 Introduction
Chapter 2 Theory
  2.1 Standard Model
    2.1.1 Feynman diagrams
    2.1.2 Electroweak theory
    2.1.3 Quantum Chromodynamics
  2.2 Top quark physics
    2.2.1 Wt-channel
    2.2.2 Backgrounds
Chapter 3 The LHC and the ATLAS Experiment
  3.1 The Large Hadron Collider
  3.2 The ATLAS detector
    3.2.1 Detector basics
    3.2.2 Magnet systems
    3.2.3 Inner detector tracking
    3.2.4 Calorimetry
    3.2.5 Muon spectrometer
    3.2.6 Triggering and data acquisition
    3.2.7 Pile-up
Chapter 4 Object Reconstruction and Definition
  4.1 Electrons
  4.2 Muons
  4.3 Jets
  4.4 Missing Transverse Energy
Chapter 5 Event Selection
  5.1 Selecting events from data
  5.2 Selecting dilepton events
  5.3 Event yields
Chapter 6 Background Estimation
  6.1 Monte Carlo modeling
  6.2 Fake dilepton data-driven estimate
  6.3 Drell-Yan data-driven estimate
  6.4 Z → ττ data-driven estimate
Chapter 7 Multivariate Analysis
  7.1 Boosted decision trees
  7.2 BDT variable kinematics
    7.2.1 Thrust
    7.2.2 Centrality
    7.2.3 Motivation for variable choice
  7.3 Optimization and cross checks
Chapter 8 Significance, Cross-Section Measurement, and Systematic Errors
  8.1 Systematic uncertainties
  8.2 Cross-section and significance measurement
    8.2.1 The likelihood function
    8.2.2 Cross-section measurement
    8.2.3 Significance calculation
  8.3 Measurement of top quark width and lifetime
Chapter 9 Conclusion
APPENDICES
  Appendix A Data/MC Agreement in Control Regions
    A.1 2-jet events
    A.2 3-jet inclusive events
    A.3 Dilepton subchannels
  Appendix B b* search
    B.1 Introduction to b*
    B.2 Simulation
    B.3 Object definition
    B.4 Event selection
    B.5 Background estimation
    B.6 Discriminant variable selection
    B.7 Measurement
BIBLIOGRAPHY

LIST OF TABLES

Table 2.1: List of particles and their properties in the Standard Model. *The Higgs described here uses the mass of the Higgs candidate discovered at the LHC. [1, 2] For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation.
Table 2.2: The cross-sections of the single top processes at the LHC at √s = 7 TeV [3, 4, 5].
Table 5.1: The observed and predicted event yields in the selected dilepton sample with at least one jet and for an integrated luminosity of 2.05 fb−1. Uncertainties represent the effect of MC statistics for the MC-based estimates and the total uncertainty for the data-driven estimates.
Table 6.1: The simulated samples and their respective cross-sections.
Table 6.2: The simulated samples and their respective cross-sections.
Table 6.3: Fake dilepton background estimated for a luminosity of 2.05 fb−1. Both statistical and systematic uncertainties are included.
Table 6.4: Drell-Yan background estimates for selected events in the 1-jet, 2-jet and 3-jet and higher bins, obtained using the ABCDEF method with 2.05 fb−1 of data. The combined statistical and systematic uncertainty is shown.
Table 6.5: Z → ττ background estimates for selected events in the 1-jet, 2-jet and 3-jet and higher bins. The errors include statistical and systematic uncertainties.
Table 7.1: A listing of the variables (see text for definition) used in the BDT and their respective definitions.
Table 7.2: A listing of the variables (see text for definition) used in the BDT and their respective separation power.
Table 7.3: The parameters used in the final optimized BDT.
Table 8.1: The effect of the individual systematic uncertainties on the acceptance for selected events in the 1-jet bin. This is evaluated by calculating the change in the overall yield of a process when subjected to a ±1σ shift of the nuisance parameter. The uncertainties from the shape of the systematics are not covered in this Table.
Table 8.2: The effect of the individual systematic uncertainties on the acceptance for selected events in the 2-jet bin and the 3-jet bin. In other words, the change in the overall yield of a process when subjected to a ±1σ shift of the nuisance parameter. The uncertainties from the shape of the systematics are not covered in this Table.
Table 8.3: Breakdown of the full uncertainty on the W t-channel cross-section measurement. Unlike Tables 8.1 and 8.2, the percentages listed here represent the uncertainty from both the normalization and the shape of the distribution. The uncertainties from the parton shower and generator systematics are calculated independently as described in the text.
Table 8.4: The fitted nuisance parameters and their uncertainties.
Table B.1: The total cross-section of b∗ → W t in a mass range of 300 GeV to 1400 GeV.
Table B.2: b∗ simulated samples for the analysis. The cross-section column includes branching ratios. All b∗ simulated samples are generated with at least one leptonic W boson decay.
Table B.3: b∗ simulated samples for the analysis. The cross-section column includes branching ratios. All b∗ simulated events are generated with at least one leptonic W boson decay.
Table B.4: Top quark event simulated samples for the analysis. The cross-section column includes k-factors and branching ratios. All NLO simulated samples have been simulated with pile-up corresponding to 50 ns bunch trains.
Table B.5: Background simulated samples. Cross-sections include k-factors. All NLO simulated samples have been simulated with pile-up corresponding to 50 ns bunch trains.
Table B.6: Background simulated samples. Cross-sections include k-factors. All NLO simulated samples have been simulated with pile-up corresponding to 50 ns bunch trains (tag r2920).
Table B.7: The triggers for the electrons and muons for each data-taking period.
Table B.8: Observed and predicted event yields in the 1-jet bin after the preselection with an integrated luminosity of 2.05 fb−1. Fake dilepton and Z + jets background event yields are estimated from the data-driven techniques applied to the 1-jet bin. The errors shown include statistical error only (top pair, signal, dibosons) or statistical + systematic uncertainties (Drell-Yan, fakes).
LIST OF FIGURES

Figure 2.1: An example of a basic Feynman diagram [6].
Figure 2.2: An example of a Next to Leading Order diagram.
Figure 2.3: An example of a Feynman diagram with ISR.
Figure 2.4: The top quark typically decays into a W boson and a bottom quark.
Figure 2.5: Feynman diagrams illustrating (a) the t-channel process and (b) the s-channel process.
Figure 2.6: The W t-channel process.
Figure 2.7: The decay chain of an example W t-channel event.
Figure 2.8: The tt̄ process. It has a final state with two b-quarks, two oppositely signed leptons, and two neutrinos.
Figure 2.9: Feynman diagrams of diboson processes with dilepton final states. (a) and (b) are W W processes. (c) and (d) are W Z processes. (e) and (f) are ZZ processes.
Figure 2.10: The Drell-Yan background involves a photon or Z boson.
Figure 2.11: One contributing process to the multijet background is W+jets.
Figure 3.1: The delivered luminosity to the ATLAS experiment in the years 2010, 2011, and 2012 [7].
Figure 3.2: The mean number of interactions per crossing taken in 2011 and between April 4th and November 26th in 2012 [7].
Figure 3.3: Relationship between η and θ.
Figure 3.4: A diagram of the ATLAS detector and its subdetectors. Image of people added to the left side to illustrate scale. [8]
Figure 3.5: The predicted bending power through the MDT layers as a function of |η| for infinite momentum muons [8].
Figure 3.6: A diagram of the three subdetectors of the inner detector and their relative sizes [8].
Figure 3.7: A diagram of the layers of the calorimeter [8].
Figure 3.8: A diagram of the muon detector systems [8].
Figure 5.1: The impact of the triangle cut on signal and background: (a) the angle between the leading lepton and E_T^miss, (b) the angle between the second lepton and E_T^miss. The simulated events are represented by the solid regions, while the data are represented with a black dot.
Figure 5.2: The effect of the triangle Z → ττ veto cut in two dimensions.
Figure 5.3: Histograms of the selected sample with combined ee, eµ and µµ channels. The simulated events are represented by the solid regions, while the data are represented with a black dot. (a) Jet multiplicity, (b) Leading jet pT, (c) HT(jet), (d) E_T^miss, (e) Leading lepton pT.
Figure 6.1: Histograms of the number of primary vertices in data and simulated events for (a) the selected sample and (b) the signal enhanced region. The simulated events are represented by the solid regions, while the data are represented with a black dot.
Figure 6.2: A scatter plot illustrating the division of phase space into six regions and their relative population sizes. A larger dot indicates a higher density of events.
Figure 7.1: An example of a decision tree.
Figure 7.2: The top five variables in the BDT ranked by separation power. In these histograms the data are compared to the simulated background estimate in the 1-jet bin.
Figure 7.3: The 6th-10th top variables in the BDT ranked by separation power. In these histograms the data are compared to the simulated background estimate in the 1-jet bin.
Figure 7.4: The 11th-15th top variables in the BDT ranked by separation power. In these histograms the data are compared to the simulated background estimate in the 1-jet bin.
Figure 7.5: The 16th-20th top variables in the BDT ranked by separation power. In these histograms the data are compared to the simulated background estimate in the 1-jet bin.
Figure 7.6: The 21st and 22nd top variables in the BDT ranked by separation power. In these histograms the data are compared to the simulated background estimate in the 1-jet bin.
Figure 7.7: The decay chain of an example W t-channel event. It has a final state with one b-quark, two oppositely signed leptons, and two neutrinos.
Figure 7.8: The tt̄ process. It has a final state with two b-quarks, two oppositely signed leptons, and two neutrinos.
Figure 7.9: The classifier output for the training and test samples for signal (in blue) and background (red). The signal has a K-S test value of 0.866 while the background has a K-S test value of 0.941.
Figure 7.10: The signal selection efficiency vs total background rejection using the BDT classifier output. The solid blue line is from the BDT, while the long dotted line is from a simple cut-based optimization using the two most powerful variables. The short dotted line is the effect of a cut from a hypothetical variable with zero separation power to show a worst case scenario.
Figure 7.11: The BDT classifier output (a) in the 1-jet bin, (b) in the 2-jet bin, (c) in the 3-jet inclusive bin. The simulated events are represented by the solid regions, while the data are represented with a black dot.
Figure 8.1: An example of a Feynman diagram with ISR.
Figure 8.2: The BDT classifier output for selected events (a) in the 1-jet bin, (b) in the 2-jet bin, (c) in the 3-jet inclusive bins. The simulated events are represented by the solid regions, while the data are represented with a black dot.
Figure 8.3: Expected likelihood ratio with only statistical uncertainties (red dashed) and profile likelihood ratio with statistical and a subset of the systematic uncertainties (blue solid) for the Wt cross-section measurement. The full set of systematic uncertainties cannot be included because the PLR will not have a smooth shape. The horizontal green lines show the 1σ, 1.6σ, and 2σ thresholds. This Figure is not used in the final cross-section measurement.
Figure 8.4: Observed likelihood ratio with only statistical uncertainties (red dashed) and profile likelihood ratio with statistical and a subset of the systematic uncertainties (blue solid) for the Wt cross-section measurement. The full set of systematic uncertainties cannot be included because the PLR will not have a smooth shape. The horizontal green lines show the 1σ, 1.6σ, and 2σ thresholds. This Figure is not used in the final cross-section measurement.
Figure 8.5: Observed distribution of fitted µ values for the pseudoexperiments generated while fixing all profiled nuisance parameters to their fitted values. The mean and RMS of the distribution are used to calculate the data statistical uncertainty. The histogram is normalized to unit area.
Figure 8.6: Expected distribution of fitted µ values for the pseudoexperiments generated while fixing all systematic nuisance parameters to their fitted values. The mean and RMS of the distribution are used to calculate the data statistical uncertainty. The plot is normalized to unit area.
Figure 8.7: Significance estimation using pseudo-experiments as described in the text. The continuous line is the qµ distribution of background only pseudo-experiments, the dashed line curve is the qµ distribution of Standard Model hypothesis pseudo-experiments, and the red line is the qµ of data.
Figure A.1: The top five variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 2-jet bin.
Figure A.2: The 6th-10th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 2-jet bin.
Figure A.3: The 11th-15th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 2-jet bin.
Figure A.4: The 16th-20th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 2-jet bin.
Figure A.5: The 21st and 22nd top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 2-jet bin.
Figure A.6: The top five variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 3-jet inclusive bin.
Figure A.7: The 6th-10th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 3-jet inclusive bin.
Figure A.8: The 11th-15th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 3-jet inclusive bin.
Figure A.9: The 16th-20th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 3-jet inclusive bin.
Figure A.10: The 21st and 22nd top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 3-jet inclusive bin.
Figure A.11: Distributions of variables comparing the signal and background estimate to the data in the ee channel. (a) Jet multiplicity, (b) Leading jet pT, (c) HT(jet), (d) E_T^miss, (e) Leading lepton pT.
Figure A.12: Distributions of variables comparing the signal and background estimate to the data in the eµ channel. (a) Jet multiplicity, (b) Leading jet pT, (c) HT(jet), (d) E_T^miss, (e) Leading lepton pT.
Figure A.13: Distributions of variables comparing the signal and background estimate to the data in the µµ channel. (a) Jet multiplicity, (b) Leading jet pT, (c) HT(jet), (d) E_T^miss, (e) Leading lepton pT.
Figure B.1: A correction to the Higgs mass from the top quark.
Figure B.2: A Feynman diagram illustrating the b∗ decay investigated in this analysis.
Figure B.3: Kinematic distributions of the signal region comparing data and background. (a) Leading lepton pT, (b) Leading lepton η, (c) Subleading lepton pT and (d) Subleading lepton η.
Figure B.4: Kinematic distributions of the signal region comparing data and background. (a) Leading jet pT, (b) Leading jet η, (c) ∆φ between the two leptons and (d) ∆R between the two leptons.
Figure B.5: The variables considered to be the discrimination template for the b∗ search.
Figure B.6: (a) Comparison of data and predicted background HT. (b) Comparison of data and predicted background HT at log scale.
Figure B.7: Data and predicted background HT are shown. In addition, several signal-only HT distributions at Mb∗ = 300, 700, 1100 GeV are shown.
Figure B.8: Comparison of JES shifted background HT with data.
Figure B.9: b∗ mass limit from the combined analysis, with an observed limit of Mb∗ > 870 GeV and expected limit of Mb∗ > 910 GeV.
Figure B.10: The two dimensional coupling and mass limits for left-handed coupling b∗.
Figure B.11: The two dimensional coupling and mass limits for right-handed coupling b∗.
Figure B.12: The two dimensional coupling and mass limits for a combined left and right-handed coupling b∗.

Chapter 1

Introduction

Science never rests. It constantly drives the boundaries of knowledge to new and unexpected realms. Through human history we have seen this knowledge progress from a practical, intuitive, and frequently incorrect understanding of the world to more rigorous models with greater predictive power than our ancestors could have ever dreamed.

One of the themes seen throughout the history of science is the push to understand the basic building blocks of the universe. Ancient models posited four or five basic elements, made up of the most common materials found. In the 19th century, atomic theory was developed, which drove the smallest objects down to the atomic level, and then later even further when scientists discovered that atoms were made of protons, neutrons, and electrons. In the mid 20th century, scientists discovered that protons and neutrons were made of even smaller particles, which were named quarks [9]. Through the scientific process we probe the smallest scales, trying to understand the list of particles that we now consider fundamental.
Investigating these particles can be difficult, as the proton is tightly bound and high energies are required to break it apart. Even more energy is necessary to create the most massive particles we have discovered. To reach these energies an accelerator 27 kilometers in circumference, the Large Hadron Collider (LHC), has been constructed. At the LHC the proton is broken apart by accelerating two sets of protons to near the speed of light and colliding them. These collisions can create new particles, the products of which are detected by massive detectors built around the collision points. Through these collisions we study the properties of the known particles and, if we are lucky, discover new ones.

This dissertation will detail the search for a special kind of production of the most massive fundamental particle known, the top quark. This kind of production is known as the W t-channel. In the following pages the workings of the LHC and the ATLAS detector will be discussed. From there I will explain the efforts required to go from a set of raw observations to a complete picture of the results of a collision. I will discuss how systematic uncertainties impact our measurement, and the steps we take to reduce them. Finally, the experimental and statistical methodology used to extract the measurements will be detailed and the results will be shown.

Chapter 2

Theory

This chapter will cover the theoretical background needed to motivate and perform this analysis. Not only will it introduce the basics of the Standard Model of high energy physics, but it will also discuss the signal and background processes in this analysis. Here the signal is the W t-channel process, while the backgrounds are the set of processes that can appear similar to the signal in the detector. In addition, it will communicate an understanding of where this result fits in the broader scope of the field of high energy physics.

2.1 Standard Model

The Standard Model describes the fundamental particles and how they interact [10, 11]. A listing of particles and their properties is given in Table 2.1. These particles can be separated into two categories based on their spin: fermions and bosons. Fermions, which include leptons and quarks, have half-integer spin, and no two identical fermions can occupy the same quantum mechanical state. Electrons are a common example of a fermion. Bosons have integer spin, and any number of identical bosons can occupy the same state. They are often carriers of force, and the photon is the most ubiquitous example of a boson. Frequently in this document a particle name indicates both itself and its anti-particle. For example, when reference is made to the W t-channel, this descriptor refers not only to the W−t final state, but also the W+t̄ final state.

There are three generations of fermions. Almost all observable matter is made up of fermions from the first generation. There are two families of fundamental fermions: leptons and quarks. Protons and neutrons are examples of composite fermions, and their quark components, up and down quarks, are also fermions. The second and third generation particles tend to have larger masses and shorter lifetimes and will quickly decay into less massive particles. The exceptions to this are the second and third generation neutrinos, whose mass hierarchy is not known and which are stable (although they can oscillate between neutrino flavor states).
Family  | Name              | Symbol | Mass      | Charge | Spin
Quarks  | Up                | u      | 2.4 MeV   | 2/3    | 1/2
        | Down              | d      | 4.8 MeV   | −1/3   | 1/2
        | Charm             | c      | 1.27 GeV  | 2/3    | 1/2
        | Strange           | s      | 104 MeV   | −1/3   | 1/2
        | Top               | t      | 172 GeV   | 2/3    | 1/2
        | Bottom            | b      | 4.2 GeV   | −1/3   | 1/2
Leptons | Electron          | e      | 511 keV   | −1     | 1/2
        | Electron Neutrino | νe     | <2.2 eV   | 0      | 1/2
        | Muon              | µ      | 105.7 MeV | −1     | 1/2
        | Muon Neutrino     | νµ     | <0.17 MeV | 0      | 1/2
        | Tau               | τ      | 1.78 GeV  | −1     | 1/2
        | Tau Neutrino      | ντ     | <15.5 MeV | 0      | 1/2
Bosons  | Photon            | γ      | 0         | 0      | 1
        | W± Boson          | W±     | 80.4 GeV  | ±1     | 1
        | Z Boson           | Z      | 91.2 GeV  | 0      | 1
        | Gluon             | g      | 0         | 0      | 1
        | Higgs*            | H      | 125 GeV   | 0      | 0

Table 2.1: List of particles and their properties in the Standard Model. *The Higgs described here uses the mass of the Higgs candidate discovered at the LHC. [1, 2] For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation.

The Standard Model describes the interaction of three of the four known fundamental forces: electromagnetic, weak, and strong. The fourth, gravity, is not described by the Standard Model. The strong force is the force that holds protons, neutrons, and the atomic nucleus together. Quantum chromodynamics (QCD) models this force by describing the interactions between particles with a "color" charge (see Section 2.1.3). The weak force describes interactions mediated by the W and Z bosons. An example of the weak interaction is beta decay, where an atomic neutron decays into a proton, releasing an electron and a neutrino from the atom. The electromagnetic force describes the interactions between electrically charged particles.

The interactions of the Standard Model can be defined by a Lagrangian. A Lagrangian can possess different symmetries under transformations. For example, a Lagrangian can be symmetric under changes in coordinate system: using a different coordinate system does not change the physics of the Lagrangian. The Standard Model Lagrangian has many gauge symmetries, meaning that the Standard Model Lagrangian is invariant under classes of gauge transformations with these symmetries. The consequence of each invariance is that an additional conservation law must be respected by the interaction. For example, the symmetry of the electromagnetic force leads to conservation of electric charge.

The language of group theory describes these symmetries. A group is defined as an abstract set of elements with a defined operator that obeys certain rules. An important concept in group theory is that of generators. A set of generators A of group G is a collection of elements such that every element in G can be formed through group operations using only elements in set A. For example, for the natural numbers under addition, 1 is a complete generator set, as any natural number n can be represented as the sum of n 1s.

The Standard Model is based on a Lagrangian with many symmetries, three of which are associated with the three forces the Standard Model describes. These three symmetries are the SU(3) × SU(2) × U(1) gauge symmetries, meaning that the Lagrangian is invariant under transformations of SU(3) × SU(2) × U(1). The SU(3) group describes the interactions of the strong force, while the SU(2) × U(1) group describes the unified electroweak force. These forces are mediated by boson force carrier particles. When two particles act on each other, a virtual force carrier particle, called a propagator, is exchanged. This propagator carries the momentum and energy that gets traded between the two interacting particles.
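The notion of a generator set can be made concrete with a quick numerical check. The sketch below is an illustration added for this discussion (not part of the original analysis): exponentiating the Pauli matrices, the standard generators of the SU(2) group mentioned above, produces 2×2 unitary matrices of unit determinant, i.e. elements of the group, in much the same way that repeated addition of 1 generates the natural numbers.

```python
# Illustration: the Pauli matrices generate SU(2), in the sense that
# exponentiating them yields group elements (2x2 unitary matrices with
# determinant 1). This parallels 1 generating the naturals under addition.
import numpy as np
from scipy.linalg import expm

pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),     # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),  # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),    # sigma_z
]

theta = 0.7  # an arbitrary transformation angle
for sigma in pauli:
    U = expm(1j * theta * sigma / 2)  # group element generated by sigma
    assert np.allclose(U.conj().T @ U, np.eye(2))  # U is unitary
    assert np.isclose(np.linalg.det(U), 1.0)       # and has determinant 1
print("exp(i theta sigma/2) lands in SU(2) for each generator")
```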
2.1.1 Feynman diagrams

In collider physics the concept of a cross-section is critical to making predictions. The cross-section represents the likelihood of a process given some initial conditions, measured in units of area. A barn (b) is the accepted unit for cross-section, with one barn being equal to 10−24 cm2. At the LHC cross-sections are frequently described in picobarns (pb), which are 10−36 cm2, or femtobarns (fb), which are 10−39 cm2. As described in greater detail in Section 3.1, if the cross-section and the amount of data collected are both known, the number of events expected can be calculated by Nevents = Lσ. In this equation L is the luminosity, a measure of how much data has been collected, and σ is the cross-section of the process of interest. In the case of this analysis, our initial condition is a proton-proton collision at 7 TeV. From there, the cross-sections of interesting processes can be calculated. These theoretically predicted cross-sections can be compared to experimentally observed cross-sections as tests of the models.

Often when discussing interactions in the Standard Model a Feynman diagram is used to illustrate the process [11]. Feynman diagrams not only have great utility for understanding the physics at work in a process, they also inform the resulting cross-section calculation, and consequently are common in both high energy theory and experiment as explanatory devices. In this analysis, the Feynman diagrams are drawn with space on the y-axis and time on the x-axis. For example, Fig. 2.1 describes a process in which a quark-antiquark pair interact and form a gluon, which then splits into a tt̄ pair. The points at which particles connect are called vertices, and are identified by the particles involved. For example, the rightmost vertex in this diagram is a gtt̄ vertex.

Figure 2.1: An example of a basic Feynman diagram [6].

In general, the Feynman diagrams shown in this analysis reflect the most basic interactions that result in the observed final state by showing only the tree-level diagrams. The tree-level diagrams are constructed such that there are a minimal number of vertices. These tree-level diagrams do not represent the only way that such a final state could occur. For every tree-level diagram, there are infinitely many higher-level diagrams with more vertices that contribute to the total cross-section. Physically these higher-level diagrams can have loops or additional radiation modifying the tree-level diagram. Mathematically, these diagrams represent expansion terms in a perturbative cross-section calculation. For example, in Fig. 2.1 the interacting gluon could split into two gluons and reform in the middle of the interaction, as shown in Fig. 2.2. This higher order diagram would be called a Next to Leading Order (NLO) diagram. There are also Next To Next To Leading Order (NNLO) diagrams and so on. The contributions by these higher order diagrams generally decrease with their complexity, but a critical part of correctly calculating the total cross-section of a process involves estimating the contribution of the higher order diagrams omitted in a given computation. These contributions are often given as a k-factor, a scaling factor which can be applied to the calculated cross-section to give the estimated cross-section at a higher order.

At the LHC we collide protons on protons, but those collisions have a high enough energy to break the proton, allowing the component particles to interact instead.
Constructing Feynman diagrams involving all components of the proton would be a difficult task, so instead we take advantage of the fact that at high energies we can factorize the problem into two problems. The momentum contribution from each parton is measured to construct parton distribution functions (PDFs), which give the initial states for quark collisions. These initial states are used to solve the second problem of constructing Feynman diagrams for the processes of interest.

Figure 2.2: An example of a Next to Leading Order diagram.

Another diagram correction that must be added for a realistic cross-section estimate is taking into account the effect of initial and final state radiation (ISR and FSR). These are diagrams in which the initial and/or final state particles radiate off an additional particle. These diagrams do give a new unique tree-level diagram from a theoretical standpoint, but from the experimental standpoint these diagrams are processes we will physically see in our detector. A single top event with final state radiation is still considered a single top event, thus these effects must be included in any cross-section calculation or simulation. An example of a tt̄ event with initial state radiation is given in Fig. 2.3.

Figure 2.3: An example of a Feynman diagram with ISR.

2.1.2 Electroweak theory

The Standard Model describes the unification of two of the four fundamental forces, the electromagnetic force and the weak force, into one force, the electroweak force. This section will first discuss the electromagnetic and weak forces separately, and then discuss their unification in the Standard Model.

The electromagnetic force is mediated by the photon. The photon interacts with electromagnetically charged particles, which are all known fundamental particles excluding the Z boson, the gluon, the neutrinos, the Higgs boson, and the photon itself.

The weak force describes interactions mediated by the W± and Z bosons. All quarks and leptons can participate in weak interactions. In addition, all force carriers except gluons can also participate. The weak force allows for several quantum number conservation laws to be "broken" in ways that the electromagnetic and strong force cannot. One symmetry that is broken by the weak force is chirality, a quantum number that represents the right- or left-handedness of a particle. If the spin of a massless particle is in the same direction as its momentum it is right-handed, and if the spin is in the opposite direction to the momentum it is left-handed. Parity describes a possible symmetry in which the physics is identical if the coordinate system is inverted. The strong and electromagnetic interactions both interact with right- and left-handed particles in exactly the same way, but the weak force violates parity by only acting on left-handed particles (and their respective right-handed antiparticles) [10]. Charge-parity is another conservation law, requiring that the product of the charge and parity of the initial state is conserved. However, this conservation has also been discovered to be violated in some weak processes such as kaon decay. Most relevant to this analysis, however, is the weak force's ability to change the generation of quarks. For example, the up and down quarks make up the first generation, while the top and bottom quarks make up the third generation. A non-weak interaction cannot change an up quark to a bottom quark.
The weak force is capable of changing generations because the weak interaction eigenstates of the quarks are not the same as their flavor eigenstates. This allows the weakly interacting quarks to change not only their momentum and energy, but also the generation of particles. This quark flavor mixing is described by the Cabibbo-Kobayashi-Maskawa (CKM) matrix [11, 10]

\begin{pmatrix} d' \\ s' \\ b' \end{pmatrix} = V_{\mathrm{CKM}} \begin{pmatrix} d \\ s \\ b \end{pmatrix} = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix} \begin{pmatrix} d \\ s \\ b \end{pmatrix} \qquad (2.1)

where d, s, and b are the down, strange, and bottom quarks. The matrix V_CKM can also be parametrized with three mixing angles (θ12, θ23, θ13) and a CP-violating phase (δ):

V_{\mathrm{CKM}} = \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\ -s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13} \end{pmatrix} \qquad (2.2)

Here c_ij and s_ij represent cos(θij) and sin(θij), respectively. Each element represents the mixing between flavor eigenstates under the weak interaction. If there were no mixing between cross-generational eigenstates the matrix would be the identity matrix. For example, Vtb represents the relative strength of the W tb vertex (the coupling between W, t, and b), shown in Fig. 2.4. If Vub were zero, there would be no W ub vertex in the Standard Model and the u quark could not change flavor to or from the b-quark through the weak interaction.

The Standard Model takes the mixing angles and δ as inputs that must be measured experimentally. From these measurements we know the diagonal elements are close to one, while the off-diagonal elements are close to zero. An interpretation of the matrix elements is that the interaction and flavor eigenstates for the quarks are almost identical, and consequently the weak force typically conserves quark generation. The measurement of each of these elements is an active field of research. The current best measured magnitudes for the CKM matrix elements are [10]:

|V_{\mathrm{CKM}}| = \begin{pmatrix} 0.97425 \pm 0.00022 & 0.2252 \pm 0.0009 & (4.15 \pm 0.49) \times 10^{-3} \\ 0.230 \pm 0.011 & 1.006 \pm 0.023 & (40.9 \pm 1.1) \times 10^{-3} \\ (8.4 \pm 0.6) \times 10^{-3} & (42.9 \pm 2.6) \times 10^{-3} & 0.89 \pm 0.07 \end{pmatrix} \qquad (2.3)

One of the goals of this analysis is to make a direct measurement of the Vtb matrix element (the lower right hand element). In this analysis we look at a class of processes in which only one top quark is produced, referred to as single top processes. Single top processes uniquely allow for a simple direct measurement of Vtb because they contain a W tb vertex. Consequently their cross-section (discussed in Section 2.1.1) is proportional to the magnitude of the Vtb matrix element squared, σsgtop ∝ |Vtb|2 [12]. Without a direct measurement, an analysis must assume the unitarity of the CKM matrix and the existence of exactly three generations of quarks to make an indirect measurement of Vtb. While in general the experimental evidence is consistent with a unitary 3×3 CKM matrix and exactly three quark generations, measurements with a minimum number of assumptions are preferable. In addition, direct measurements of Vtb can be sensitive to new physics that violates these assumptions. This is why it is critical to make direct measurements of Vtb and why single top analyses are important.
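As a back-of-the-envelope illustration of the proportionality σsgtop ∝ |Vtb|2 (a sketch only; the actual extraction in Chapter 8 propagates the full uncertainties), one can compare the measured W t cross-section quoted in the abstract with the Standard Model prediction used in this thesis:

```python
# Rough |Vtb| estimate from the Wt cross-section, using sigma ~ |Vtb|^2.
# Both inputs are numbers quoted elsewhere in this thesis; uncertainties
# are ignored here for simplicity.
import math

sigma_obs = 16.8  # pb, measured Wt cross-section (abstract)
sigma_sm = 15.6   # pb, SM prediction at sqrt(s) = 7 TeV (Table 2.2),
                  # computed assuming |Vtb| = 1

v_tb = math.sqrt(sigma_obs / sigma_sm)
print(f"|Vtb| ~ {v_tb:.2f}")  # ~ 1.04, consistent with |Vtb| close to 1
```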
The weak and electromagnetic forces unify at high energy (approximately the scale of the mass energy of the weak force carriers [10]). This unification is the manifestation of the gauge group SU(2) × U(1) with four massless gauge bosons. Three of these gauge bosons come from the generators of the SU(2) symmetry, while the remaining one comes from the generator of the U(1) symmetry. The Standard Model also posits a Higgs potential of the form:

V(φ) = µ2 φ†φ + (λ2/2)(φ†φ)2   (2.4)

If µ2 is negative, then this potential has a symmetric minimum away from the central value. Once a point in the minimum is selected the symmetry is broken. In the Standard Model this leaves the massive Z and W± bosons that we observe. The W± bosons come from the SU(2) group, while the Z boson and the photon originate from a mixing of the SU(2) and U(1) groups' bosons. This symmetry breaking also implies the existence of a scalar Higgs boson, which prior to the LHC had not been observed. The search for the Higgs boson was one of the driving arguments to build the LHC and the ATLAS and CMS experiments. As of this writing, a Higgs-like particle has been observed with a mass of approximately 125 GeV [1, 2].
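To spell out the minimization step behind the statement that µ2 < 0 moves the minimum away from the origin (a brief aside using the notation of equation 2.4, writing ρ ≡ φ†φ):

dV/dρ = µ2 + λ2 ρ = 0  ⟹  ρ0 = φ†φ = −µ2/λ2

The stationary point ρ0 is positive, and hence a physical minimum away from φ = 0, only when µ2 is negative. The field value at this minimum is the vacuum expectation value, and choosing a particular direction for it in SU(2) × U(1) space is the symmetry-breaking step described above.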
2.1.3 Quantum Chromodynamics

QCD defines the interactions between particles with color charge (the origin of the "chromo" in chromodynamics) and is mediated by the gluons. The strong force has a much larger coupling strength than the other forces, and as a result the cross-section (discussed in Section 2.1.1) of strong force interactions is generally larger than the cross-section of electroweak interactions.

The strong force is represented by an SU(3) group symmetry, and as a result there are three types of color charge, referred to conventionally as red, green, and blue. The selection of colors from the visible electromagnetic spectrum to represent the conserved quantities is to give some intuition to the concept of color charge, but there is no connection in the theory between the color red and the strong force color charge red. Like electric charge, these color charges can each have negative values, referred to as anti-red, anti-green, and anti-blue. Each quark has a color charge, and each anti-quark has an anti-color charge, while each of the gluons carries a superposition of color and anti-color states. One superposition with all three color anti-color combinations is a colorless state, which does not correspond to a gluon. Consequently, there are a total of eight gluons we observe.

Isolated color charge is disfavored by the strong force, and as a result, stable states must be color neutral, possessing an equal amount of red, green, and blue color charge or color and anti-color charges that sum to zero. This favoring of color neutral states is called color confinement, and as a consequence quarks cannot exist alone in nature, instead grouping into bound states with other quarks. Mesons are two-quark bound states with color-anticolor pairs; for example the π+ particle is made up of an (up, anti-down) pair with a color state such as (blue, anti-blue). Baryons are three-particle bound states with a red, a blue, and a green component particle. States that are not color neutral, for example any bare quark or gluon, will quickly hadronize, creating quarks and antiquarks which combine to form color neutral baryons and mesons. If the quark has significant momentum, such as in a collider experiment, this hadronization manifests as a spray of hadrons, called a jet. These jets are how quarks are seen from the perspective of a detector, as described in more detail in Section 4.3.

Although the Standard Model includes a complete theory of QCD, the theory does not give a set of computations to calculate all quantities to arbitrary precision in closed form. As a result, many phenomena in QCD are modeled using both experimental and theoretical inputs. For example, the modeling of the hadronization of quarks and gluons is strongly dependent on experimental data. Another example is the use of experimental data for parton distribution functions (PDFs), the modeling of the interior momentum distribution of the components (also called partons) of particles such as the proton. In addition to the three valence quarks that make up the proton, there are many gluons and other quarks within that exist on short time scales. At high energies these other partons can have significant amounts of momentum, making them important to include in the PDFs. Because the calculations required for short range QCD are beyond present simulation capabilities, the composition of the proton cannot be computed from first principles and must be modeled using experimental data as an input.

2.2 Top quark physics

The top quark was first observed in 1995 at the Tevatron at Fermilab [13, 14]. The top quark's high mass makes it of great interest to high energy physics. Understanding the properties of the top quark and its associated production processes is critical to probing the Standard Model and searching for new physics. The mass of the top quark, along with the mass of the W boson, can constrain the mass of the Standard Model Higgs. This argument leads to the conclusion that the Higgs is relatively low mass (less than 200 GeV), a prediction that turned out to be correct. From the perspective of a detector, top quark processes often appear similar to processes in many new physics models as well as rare Standard Model processes, such as the signal in the analysis described in Appendix B.

Although it was predicted well before observation, the top quark's high mass of 172 GeV made detecting it difficult. Due to the top quark's large width, it has the interesting quality of being the only quark with an observed decay lifetime (~10−25 s) much shorter than the strong force timescale (~10−24 s, the timescale for the quark to hadronize and turn into a jet). As a result, it decays instead of hadronizing and forming a colorless bound state, and the detector does not see a jet; instead it sees the top quark's decay products. Because the CKM matrix element Vtb is close to 1, the top quark almost exclusively decays to a W boson and a bottom quark, as shown in Fig. 2.4.

Figure 2.4: The top quark typically decays into a W boson and a bottom quark.

The W boson can decay to either a lepton and its corresponding anti-neutrino or hadronically into quarks which will produce jets. Approximately 32% of the time it decays leptonically and the remaining 68% of the time it decays to a pair of quarks. "Leptonically" includes tau leptons here, although when we talk about leptonic top decays from an experimental perspective we usually mean electron and muon decays, as those are directly detected by our detector.

The top quark was initially discovered by searching for tt̄ pair production, shown in Fig. 2.1, in which two top quarks are formed in the same QCD-mediated process. This production channel has a relatively high cross-section compared to processes in which only one top quark is formed. The relatively large cross-section for top pair production may be surprising, as the high mass of the top quark would lead one to expect that creating two simultaneously would be much less favorable than a single top quark because it requires much more energy.
However, because tt̄ production can occur through the strong force while single top processes only happen through electroweak mechanisms (see Fig. 2.6), the cross-section of tt̄ processes is much higher.

The top quark has two related properties that we will be measuring in this analysis: the top quark width and lifetime [15]. In this analysis we indirectly measure the top quark width by taking advantage of the linear dependence of the signal cross-section on the width, shown in equation 2.5.

Γt^obs = Γt^SM × (σWt^obs / σWt^SM)   (2.5)

Here σWt^obs is the measured cross-section of the W t-channel process, σWt^SM is the predicted Standard Model cross-section of the W t-channel process, and Γt^obs and Γt^SM are the measured and predicted top quark widths. The lifetime is a measure of the decay time of the top quark and can be calculated directly from the top quark width, as shown in equation 2.6.

τt = ℏ/Γt   (2.6)
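As a numerical sketch of equations 2.5 and 2.6 (the Standard Model width used below is an assumed illustrative input of roughly 1.3 GeV for a 172 GeV top quark, not a value taken from this thesis; the cross-sections are the ones quoted in this document):

```python
# Hedged numerical illustration of equations 2.5 and 2.6.
HBAR_GEV_S = 6.582e-25  # hbar in GeV*s

sigma_obs = 16.8  # pb, measured Wt cross-section (abstract)
sigma_sm = 15.6   # pb, predicted Wt cross-section (Table 2.2)
gamma_sm = 1.3    # GeV, assumed SM top quark width (illustrative only)

gamma_obs = gamma_sm * sigma_obs / sigma_sm  # equation 2.5
tau_obs = HBAR_GEV_S / gamma_obs             # equation 2.6

print(f"Gamma_t ~ {gamma_obs:.2f} GeV")  # ~ 1.40 GeV
print(f"tau_t   ~ {tau_obs:.1e} s")      # ~ 4.7e-25 s
```

The result is of order 10−25 s, matching the lifetime scale quoted earlier in this section.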
Single top processes, with their much lower cross-section than that for tt̄ production, are also important to particle physics, and were first observed in 2009 at the Tevatron [16]. The existence and properties of single top processes reflect testable predictions of the Standard Model, and studying single top processes allows physicists to test these predictions. In addition, the single top processes uniquely allow for a direct measurement of Vtb, while previous measurements all required the assumption of three generations of quarks.

Three main channels of single top processes are studied at the collider experiments: the t-channel, the s-channel, and associated production (also referred to as the W t-channel). The t-channel is the highest cross-section contributor, and has been observed independently of the other two channels at the LHC [17]. Its Feynman diagram is shown in Fig. 2.5(a). The s-channel cross-section is relatively small compared to the t-channel. At the LHC, it is even smaller than the W t-channel, for reasons that will be discussed below. It has not been observed independently as of this writing, but it is an important channel with sensitivity to new physics. Its Feynman diagram is shown in Fig. 2.5(b). The W t-channel is the signal this analysis is searching for. The Feynman diagram for the W t-channel process is given in Fig. 2.6.

The cross-sections for the different single top production processes at a pp collider with √s = 7 TeV are given in Table 2.2. Here √s is the total center of mass energy of the proton-proton collision. The cross-sections are given in units of pb. Examining the initial states of these three processes provides insight into the hierarchy of the cross-sections shown. The t-channel has the highest cross-section as it requires only an energetic gluon in addition to a quark. The W t-channel process requires both an energetic gluon and an energetic b-quark. The s-channel is disfavored due to the energetic anti-quark required in addition to the quark. At the Tevatron the s-channel had a significantly higher cross-section than the W t-channel because the Tevatron was a pp̄ collider, making energetic anti-quarks much more common, while the lower energies made energetic gluons less common.

Process     | Cross-section
t-channel   | 64.2 pb
W t-channel | 15.6 pb
s-channel   | 4.6 pb

Table 2.2: The cross-sections of the single top processes at the LHC at √s = 7 TeV [3, 4, 5].

Figure 2.5: Feynman diagrams illustrating (a) the t-channel process and (b) the s-channel process.

2.2.1 Wt-channel

The signal in this analysis is the associated production of a W boson and a top quark, referred to as the W t-channel. The process occurs primarily through the two diagrams shown in Fig. 2.6. This process has not previously been observed independently of other single top measurements due to its relatively low cross-section at the Tevatron. The LHC's higher energy provides many more gluons with much more energy, significantly increasing the cross-section of the process: while the W t-channel has a lower cross-section than the s-channel at the Tevatron, at the LHC its cross-section is significantly higher. The LHC therefore provides the first opportunity to observe the W t-channel.

Figure 2.6: The W t-channel process.

Figure 2.7 shows W t-channel production and decay. In this analysis we are looking in the dilepton subchannel, which means that both of the W bosons must decay leptonically to electrons or muons. This gives three lepton final states: two electron (ee), two muon (µµ), and electron muon (eµ). Despite the reduction of the size of the signal by an order of magnitude, the dilepton final state is much cleaner than final states that include hadronic W boson decays. Not only are leptons better measured in the detector, but the backgrounds to the dilepton final state are much better understood than the backgrounds to the single lepton final state. Note that the final state contains two oppositely signed leptons, two neutrinos, and a jet from the bottom quark. The neutrinos, while not directly detected, are observed as missing energy in the transverse direction, denoted E_T^miss and described in more detail in Section 4.4.

Figure 2.7: The decay chain of an example W t-channel event.

2.2.2 Backgrounds

The major backgrounds for this analysis are tt̄, diboson, Drell-Yan, and fake dilepton. The background processes that contaminate this measurement each mimic the final state of the signal in some way. The tt̄ background is by far the largest background to our signal. Although the other backgrounds are much smaller, together they contribute about the same number of events as the W t-channel signal itself.

The tt̄ background is shown in Fig. 2.8. This is the top quark production channel through which the top quark was initially observed. The final state is similar to the W t-channel, the only significant difference being an extra b-quark. However, this extra jet can be lost during the detection and reconstruction (discussed in Sections 3.2 and 4), giving a reconstructed final state that matches the signal. In addition, the kinematics of these two processes are similar, making it difficult to design kinematic cuts that remove tt̄ without also removing the signal. Furthermore, the tt̄ cross-section is approximately an order of magnitude higher than that for the W t-channel.

Figure 2.8: The tt̄ process. It has a final state with two b-quarks, two oppositely signed leptons, and two neutrinos.

The diboson backgrounds are shown in Fig. 2.9. Although they are referred to as a single background, many processes contribute. There are two potential final states to consider. The first is a two lepton, two neutrino final state. For this to be mistaken for the W t-channel process, an additional jet will need to be added to the event through ISR/FSR or pile-up (discussed in Section 3.2.7). The other final state contains two leptons and two jets.
Here one of the jets must be lost during reconstruction and there must be significant fake E_T^miss (E_T^miss not corresponding to a neutrino) added. E_T^miss is how neutrinos can be indirectly observed in the detector, discussed in greater detail in Section 4.4. The combined cross-section of these processes is marginally larger than the W t-channel signal cross-section, and even after the decrease in events due to the difference in final state, the diboson background is the second largest background after tt̄.

The Drell-Yan background, shown in Fig. 2.10, makes up a significant fraction of the background contamination. It occurs when a Z boson or γ is created and then produces a lepton anti-lepton pair. For the kinematic region relevant to the W t-channel, this background is strongly dominated by the case where the mediating particle is a Z boson, thus it is often referred to as the Z + jets background. The final state of this process does not strictly match the final state of the signal due to its lack of a jet and neutrinos. However, additional reconstructed jets can be added to an event in various ways, such as from ISR and FSR, and E_T^miss can be added through reconstruction errors. Although most of the Z + jets events do not pass the jet requirement, the process remains a significant background because of its large cross-section relative to the W t-channel signal cross-section.

The fake dilepton background is a difficult background to quantify, representing a wide range of processes. These processes are events where many jets are formed, but only one or zero leptons. A common example of this background is a W + jets process containing many jets, but only one lepton, illustrated in Fig. 2.11. The actual final state of these processes does not contain two real leptons, making them different from the signal final state. For a fake dilepton event to look similar to the signal in the detector, at least one jet must be misreconstructed as a lepton. The ATLAS lepton reconstruction algorithms have a low rate of false positives, hence jets faking leptons are uncommon (< 1% for high energy jets). Despite the rarity of faking a lepton, the fake dilepton events are so numerous that many still meet the selection criteria by chance.

Figure 2.9: Feynman diagrams of diboson processes with dilepton final states. (a) and (b) are W W processes. (c) and (d) are W Z processes. (e) and (f) are ZZ processes.

Figure 2.10: The Drell-Yan background involves a photon or Z boson.

Figure 2.11: One contributing process to the multijet background is W+jets.

Chapter 3

The LHC and the ATLAS Experiment

A vast experimental apparatus is required to investigate the physics of the single top W t-channel process. A large and powerful accelerator must be designed to bring particles to near light speed and collide them. Also, a sensitive detector must be built around a collision point to study the collision products. An experiment of this scope rests on decades of planning, construction, and testing. This analysis uses proton-proton collisions from the Large Hadron Collider (LHC) measured by the ATLAS (A Toroidal LHC ApparatuS) detector.
3.1 The Large Hadron Collider

The Large Hadron Collider (LHC) is a particle accelerator and collider 27 km in circumference situated on the French-Swiss border near Geneva, Switzerland [18]. It was designed to be the next generation high energy collider, surpassing the previous highest energy collider, the Tevatron [19]. The Tevatron, which ran from 1983-2011, was the world's premier particle collider prior to the LHC. It made numerous discoveries, the most critical to this analysis being the first observation of the top quark [13, 14], the first observation of single top quark production [16, 20], and evidence for the Higgs [21].

The LHC is a circular accelerator which uses superconducting magnets and accelerating cavities to accelerate beams of particles to high energies and collide them together. Its primary function is to collide proton beams with proton beams. The total center of mass energy in each proton collision is a critically important quantity that determines the kind of physics one can study. The LHC was designed to run at 14 TeV, but due to technical problems with the superconducting magnets a collision center of mass energy of 7 TeV was used from 2009 through 2011. For 2012 the center of mass energy was increased to 8 TeV, and after a brief set of runs with lead ions, the LHC was shut down in 2013 until approximately 2015 to upgrade the collision energy to 14 TeV. This analysis uses data collected between February 2011 and August 2011, and thus only 7 TeV center of mass collisions.

The actual acceleration of protons to their final collision energy is performed in several steps. They are first accelerated to 50 MeV in the LINAC 2 linear accelerator, then the Proton Synchrotron Booster, a small circular accelerator, further accelerates them to 1.4 GeV. The 1.4 GeV protons are delivered to the Proton Synchrotron, which boosts them to 25 GeV. These protons are fed into the Super Proton Synchrotron and accelerated to 450 GeV. Finally, they are delivered to the LHC ring, which accelerates them to their final collision energy. During this injection process, starting in the Proton Synchrotron, the beam is divided into separated groups of protons called bunches. These bunches can collide at eight interaction points around the ring. Currently, four interaction points are active, with bunches separated by 50-75 nanoseconds, depending on the current running conditions. The LHC is host to seven major experiments:

1. ALICE (A Large Ion Collider Experiment) [22]
2. TOTEM (TOTal Elastic and diffractive cross-section Measurement) [23]
3. LHCb (Large Hadron Collider beauty) [24]
4. LHCf (Large Hadron Collider forward) [25]
5. MoEDAL (Monopole and Exotics Detector At the LHC) [26]
6. CMS (Compact Muon Solenoid) [27]
7. ATLAS (A Toroidal LHC ApparatuS)

An important concept in high energy physics experiments is integrated luminosity, a measure of the interactions per unit cross-section. It is a measure of how much data has been collected. It can also be described as a rate, referred to as instantaneous luminosity, related to integrated luminosity by

L_integrated = ∫ L_inst(t) dt.

The relationship between luminosity, cross-section, and number of events is described by the following equation:

N_events = Lσ, (3.1)

where N_events is the number of events of some process over some period of time, σ is the cross-section of the process, and L is the integrated luminosity over the same period.
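As a concrete illustration of Eq. 3.1, the short sketch below does the unit bookkeeping; the function name is ours, chosen for illustration, and the cross-section is the W t prediction quoted later in this document.

# Minimal sketch of Eq. 3.1: N = L * sigma, with unit conversion.
# Assumes sigma is given in picobarns and L in inverse femtobarns.
def expected_events(sigma_pb, lumi_fb):
    sigma_fb = sigma_pb * 1000.0   # 1 pb = 1000 fb
    return sigma_fb * lumi_fb      # fb * fb^-1 = dimensionless count

# Example: the Wt-channel cross-section (~15.74 pb) with the 2.05 fb^-1
# dataset used in this analysis gives roughly 32k produced Wt events,
# before any branching ratios or selection efficiencies.
print(expected_events(15.74, 2.05))   # ~32267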
The LHC is designed for a peak instantaneous luminosity of 10^34 cm^-2 s^-1, or 10^-5 fb^-1 s^-1, although it will almost certainly reach even higher luminosities as the operators become more experienced and as its hardware is upgraded. Fig. 3.1 shows the measured delivered luminosity for the ATLAS experiment. Note the significant gains in rate that have been made in each year of running. The rapidly increasing luminosity from the LHC is a strong driver for the output of physics results from the experiments.

While a higher rate of events is typically desired, especially by the larger experiments, there are difficulties when the rates get too high. At the high instantaneous luminosities at the LHC, multiple proton-proton interactions are likely to occur in each bunch crossing. This phenomenon is referred to as in-time pile-up, discussed in greater detail in Section 3.2.7.

Figure 3.1: The delivered luminosity to the ATLAS experiment in the years 2010, 2011, and 2012 [7].

Figure 3.2: The mean number of interactions per crossing taken in 2011 and between April 4th and November 26th in 2012 [7].

3.2 The ATLAS detector

This analysis uses data collected by the ATLAS detector [28]. The ATLAS detector is the largest LHC experiment by volume, at approximately 22,000 m^3, has a mass of approximately 7,000 tons, and has over 100 million electronic readout channels. It is maintained, and its data analyzed, by a world-spanning collaboration of over 2900 scientists as of July 2012. It is able to detect a variety of particles, including photons, electrons, muons, and the products of quark hadronization. These particles are detected using many different technologies, which are discussed in the following Sections.

3.2.1 Detector basics

There are a number of general concepts that must be discussed to understand the functioning of the detector. First consider the coordinate system describing the location of objects in the detector. The origin is defined as the interaction point. The proton beamline runs along the z-axis. The positive z direction is counterclockwise around the LHC ring as viewed from above. The x-axis points towards the center of the ring, and the y-axis points up vertically. Typically, however, positions are not given in Cartesian coordinates, instead using coordinates of z, η, and φ, with z remaining the same as in the Cartesian system. The angle φ is defined as the azimuthal angle from the x-axis in the x-y plane, while η is a more complex variable used for reasons described below. The vector r sometimes represents the vector from the origin to the point.

The η coordinate, also known as pseudorapidity, is derived from the more intuitive polar angle θ, the angle between r and the z-axis. In high energy experiments, θ is not a useful variable because the difference ∆θ between two objects is not invariant under boosts along the z-axis. Instead, angles are better measured using rapidity, defined as:

y = (1/2) ln[(E + pz)/(E − pz)]. (3.2)

In this equation we use natural units in which c = 1. The rapidity transformation under a Lorentz boost β = v/c along the z-axis is given below, where it is shown that the difference between rapidities is invariant under these transformations.
y → y' = y − tanh^−1(β), (3.3)

y'_1 − y'_2 = [y_1 − tanh^−1(β)] − [y_2 − tanh^−1(β)] = y_1 − y_2. (3.4)

Although the invariance of rapidity differences is very useful, rapidity as a measurement of angle is problematic, as E depends not only on the momentum of the particle, but also on its mass. In other words, two particles with identical momentum traveling in identical directions but with different masses will have two different rapidities. There is also the practical concern that the mass of a given particle is not always known, so the rapidity cannot always be calculated even when it is desirable. As a compromise, pseudorapidity is used instead, defined as:

η = (1/2) ln[(|p| + pz)/(|p| − pz)] = −ln tan(θ/2). (3.5)

This quantity has the benefit that ∆η is invariant under boosts along the z-axis for massless particles, while being independent of mass. Note that in the case m << E, the equation for rapidity reduces to that for pseudorapidity. Since at ATLAS we often deal with particles with energies much higher than their mass, pseudorapidity proves to be a useful approximation for rapidity. The relationship between η and θ is shown in Fig. 3.3.

Often we consider the angular difference between two objects in the detector. Calculating this difference is straightforward if they lie in the η − z or φ − z planes, but for the general case we need to define something more robust. This variable is called ∆R, and is defined as:

∆R = sqrt((∆φ)^2 + (∆η)^2). (3.6)
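To make Eqs. 3.2-3.6 concrete, here is a minimal numerical sketch of these angular variables; the function names are illustrative only and do not correspond to any ATLAS software.

import math

# Rapidity (Eq. 3.2) and pseudorapidity (Eq. 3.5), natural units (c = 1).
def rapidity(E, pz):
    return 0.5 * math.log((E + pz) / (E - pz))

def pseudorapidity(p, pz):
    return 0.5 * math.log((p + pz) / (p - pz))  # equals -ln(tan(theta/2))

# Angular separation Delta R (Eq. 3.6), wrapping Delta phi into [-pi, pi].
def delta_r(eta1, phi1, eta2, phi2):
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

# For a 50 GeV pion (m ~ 0.14 GeV) at theta = 45 degrees, rapidity and
# pseudorapidity agree to better than one part in a thousand.
theta = math.radians(45.0)
p, m = 50.0, 0.1396
E, pz = math.hypot(p, m), p * math.cos(theta)
print(rapidity(E, pz), pseudorapidity(p, pz))  # ~0.8813 vs ~0.8814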
A concept often encountered in detector design is the radiation length. The radiation length is a material property that reflects the amount of energy lost by an EM particle passing through it. When designing an experiment's EM calorimeter, it is important to maximize the number of radiation lengths in the calorimeter while minimizing the number of radiation lengths the particle will encounter before reaching the calorimeter. A similar concept, called the interaction length, exists for hadronic objects interacting with nuclei through the strong force. The number of interaction lengths in the hadronic calorimeter must be maximized to capture all of the remaining energy of the hadronic shower.

A diagram of the ATLAS detector featuring the major subsystems is shown in Fig. 3.4. Each of these subsystems is discussed briefly below.

• The magnet systems change the direction of charged particles, giving more information on their mass and momentum. There are two magnet systems: the solenoid magnet, used by the inner detector, and the toroid magnets, used by the muon detectors. These systems are discussed in more detail in Section 3.2.2.

• The tracking systems observe the path that particles take through the solenoid's magnetic field to determine a particle's momentum and to aid in particle identification. The technical details of the tracking are discussed in Section 3.2.3.

• The calorimeters measure the energy of particles and help with particle identification. They use a number of different technologies, described in Section 3.2.4.

• The muon systems are the largest system by volume. They detect and measure muons with the aid of the toroidal magnets. More detail is given in Section 3.2.5.

Figure 3.3: Relationship between η and θ.

Figure 3.4: A diagram of the ATLAS detector and its subdetectors. An image of people is added to the left side to illustrate scale. [8]

3.2.2 Magnet systems

The magnet systems in ATLAS curve the paths of charged particles. By measuring the deflection a particle experiences in a known magnetic field, its momentum can be determined. These systems use superconducting magnets made of niobium-titanium, requiring them to be cooled to low temperature: liquid helium at 4.5 K is used for this cooling, while the critical temperature of the superconductor is 1.9-2.7 K above that.

The solenoid magnet system generates a two Tesla magnetic field for use by the inner detector. Minimizing the number of radiation lengths presented by the magnet system's structure is a critical constraint to maximize the sensitivity of the detectors. The solenoid is designed to present a maximum of 0.66 radiation lengths to an incoming particle. The magnetic field generated is axial along the z-axis, which means that it will cause a charged particle to bend in the x-y plane.

There are three sets of toroidal magnets at ATLAS, one in the barrel and one at each end-cap. These magnets bend the path of muons passing through the muon detectors. The barrel magnet provides a 0.5 T average magnetic field and 1.5-5.5 Tm of bending power, while the end cap magnets each provide a 1.0 T average magnetic field with 1-7.5 Tm of bending power. The barrel services the |η| < 1.4 region, while the end caps service the 1.6 < |η| < 2.7 regions. The region 1.4 < |η| < 1.6 is covered by a combination of the two. The magnetic field generated is inhomogeneous, but mostly perpendicular to the path of muons. Extensive testing was done to construct a detailed map of the magnetic fields created by the toroidal magnet systems. An example of the bending power of the magnetic field as a function of |η| is shown in Fig. 3.5. The bending power measures the amount of deflection of a charged particle as it passes through, and it is an important quantity because it, along with the resolution of the detectors, determines what ranges of momenta can be measured and with what precision.

Figure 3.5: The predicted bending power through the MDT layers as a function of |η| for infinite momentum muons [8].

3.2.3 Inner detector tracking

The inner detector tracking system gives high resolution information about the paths particles take through the detector as they pass through the magnetic field of the inner solenoid [29]. Combined with information from other detectors, the inner detector is a powerful tool for correctly identifying particles, determining their momenta, and locating their origin. The inner detector has sensitivity in the range |η| < 2.5. The ATLAS tracking system uses three different subdetectors to accomplish this task, as illustrated in Fig. 3.6.

Figure 3.6: A diagram of the three subdetectors of the inner detector and their relative sizes [8].

The highest resolution tracking system is the pixel detector [30]. It is made up of three barrel layers and six end-cap disk layers, three on each side of the detector. These layers contain approximately 80 million silicon sensors, giving a resolution of up to 10 µm in R-φ space and 115 µm along the z-axis. High resolution tracking so close to the interaction point allows for accurate measurement of the origin of each particle, which is useful in verifying that different particles originate from the same interaction and also provides discrimination power for particle identification. Due to its close proximity to the interaction point, the pixel detector is designed to withstand the large amounts of radiation expected.

The next system out from the pixel detector is the semiconductor tracker (SCT) [31].
These four cylindrical double-layers of sensors function similarly to the pixel detector, but instead of small pixels they use long strips stretching in the z-direction. The pairs of sensors are angled slightly with respect to the z-axis to allow measurement of the z-coordinate. This design makes the SCT more cost effective than simply extending the pixel detector while still fulfilling the physics requirements, as the high resolution of the pixel detector is not necessary farther from the beamline. The spatial resolution of the SCT is 17 µm in R-φ space and 580 µm along the z-axis.

The final inner detector system is the transition radiation tracker (TRT), which takes advantage of transition radiation, the radiation emitted when a particle crosses the border between two materials with differing dielectric constants [32, 33]. It is formed from 73 (barrel) or 163 (endcap) layers of 4 mm diameter drift tubes containing a mixture of 70% xenon, 27% carbon dioxide, and 3% O2. When a particle passes through the surrounding layer, made of a polypropylene-polyethylene fiber mat, it produces transition radiation which ionizes the gas in the tube. The signal is picked up by a wire that runs through the middle of each straw, which is then interpreted as a hit. The energy released by transition radiation depends on the β of the particle, so examining the energy profile as a particle passes through the TRT allows particles to be identified. In particular, the TRT is critical for discriminating electrons from charged pions, giving a rejection factor greater than 20 for pions at 90% electron efficiency [28]. The TRT is the largest of the three tracking detectors, and even though its absolute resolution of approximately 170 µm per straw is the lowest, the number of hits it provides makes it critical for particle identification and momentum measurements.

3.2.4 Calorimetry

The calorimetry systems measure the energy of certain particles in the detector and help identify particles. There are two layers of calorimetry: an EM (electromagnetic) calorimeter, which is sensitive to low mass particles that interact electromagnetically, such as electrons and photons, and a hadronic calorimeter, which is sensitive to hadrons. The calorimeters make up the second to last layer of the detector and should stop nearly all of the remaining outgoing particles, with the exception of muons, neutrinos, and possibly exotic undiscovered particles. Figure 3.7 shows the layout of the layers of calorimeters.

Figure 3.7: A diagram of the layers of the calorimeter [8].

The EM calorimeter is made up of a barrel section and two end-cap sections (EMEC) covering the region |η| < 2.5. It contains sections of lead plates and electrodes with liquid argon as a sampling medium. High energy electrons emit Bremsstrahlung radiation while interacting with the lead plates [34]. These high energy photons then pair produce to form lower energy electrons and positrons. The cycle repeats until the remaining photons and leptons have low enough energy to ionize the liquid argon. These ionization electrons are then detected by the electrodes. The EM calorimeter is designed to be thick enough to stop the propagation of all but the most energetic photons and electrons. Corrections are applied to account for energy lost in the previous layers of the detector to get an accurate estimate of the total energy. Note that since the mechanism for generating radiation is Bremsstrahlung, there is a mass dependence of 1/m^4.
This is why these calorimeters are so sensitive to electrons but not to muons, which are approximately 200 times more massive, resulting in a (1/200)^4 = 1/1,600,000,000 reduction in sensitivity.

The hadronic tile calorimeters operate in the range |η| < 1.7, using steel as the absorber and scintillator tiles as the active medium [35]. The iron in the steel has an interaction length much larger than its radiation length. Scintillating tiles are used here and not in the interior layers because scintillating tiles are not nearly as radiation hard as liquid argon systems but are much more affordable. Unlike the EM calorimeter, which relies on electromagnetic interactions, hadronic calorimeters create cascades which rely primarily on the strong force. The basic concept of sampling is similar to the EM calorimeter, where the passive medium initiates cascades which are then measured in the active medium. In the case of the tile calorimeters, the showers originate mostly through inelastic interactions with nuclei in the steel layers. The charged particles passing through the scintillating tiles excite the molecules to a higher energy state. Upon returning to their ground state, the molecules emit ultraviolet photons that are read out through fibers to photomultiplier tubes. Because the cascades created in the hadronic calorimeter are driven primarily by the strong force, muons pass through this layer with minimal interaction. The hadronic end cap calorimeter (HEC) uses similar principles, but with copper as the absorber and liquid argon as the active medium.

The forward calorimeter (FCal) covers the extreme η region of the detector, 3.1 < |η| < 4.9. Due to its proximity to the beamline, it is sensitive to the pile-up effects described in Section 3.2.7. It is composed of three modules projecting away from the interaction point. The module closest to the interaction point is designed for EM interactions, using a copper absorber, while the other two use a tungsten absorber to create hadronic interactions. The interactions are sampled by thin layers of liquid argon. Copper was chosen to give high resolution for the EM interactions, and its high conductivity allows for quick heat removal. Tungsten was chosen because it creates showers with small lateral spread, giving better containment in the laterally thin FCal.

3.2.5 Muon spectrometer

The muon spectrometer is the outermost layer of the ATLAS detector. It tracks muons as they bend through the toroidal magnetic field in the region |η| < 2.7, allowing their momenta to be measured. The amount of bending is determined by the magnets discussed in Section 3.2.2. The detection occurs in four subsystems: the monitored drift tubes (MDT) and the cathode strip chambers (CSC) make detailed measurements, while the resistive plate chambers (RPC) and the thin gap chambers (TGC) are primarily designed to allow quick trigger decisions to be made. Figure 3.8 illustrates the layout of the muon system.

Figure 3.8: A diagram of the muon detector systems [8].

The MDTs are installed to cover |η| < 2.7. They are made up of many pressurized drift tubes approximately 3 cm in diameter running in the z direction. Muons ionize the gas as they pass through, releasing electrons, and these electrons are attracted to a central wire at high positive potential. As they approach the wire they pick up enough energy to ionize the surrounding gas.
This ionization creates an avalanche of electrons hitting the wire, and this signal is then propagated to the electronics. These chambers are located throughout the η space of the detector, and their geometry varies with location. The placement of the tubes and the deformation of their internal geometry are well known due to monitoring by built-in optical systems, allowing an optimal resolution of 50 µm for tracked muons.

The Cathode Strip Chambers (CSCs) give a high resolution view of the region 2 < |η| < 2.7. Similar to the MDT, the CSC is made up of chambers filled with pressurized gas, which muons ionize as they pass through. In the CSC, instead of a single central wire, the chambers are strips filled with many wires. The wires induce a charge onto cathodes on the side of the strip. These cathodes are segmented, giving additional information about the angular coordinates of the muon. The CSCs are divided into smaller and larger wedge chambers which alternate in the φ direction in each of the endcap regions. As a muon leaves the detector in the appropriate η range, it passes through four planes of CSCs, giving up to four measurements of its η and φ coordinates. The CSC subsystem has a resolution of 40 µm in the R direction and 5 mm in the φ direction.

The Resistive Plate Chambers (RPCs) are used for triggering in the barrel region |η| < 1.05. The RPCs are made of parallel resistive plates 2 mm apart. An electric field of 4.9 kV/mm applied to these plates causes discharges along the ionized tracks as muons pass through. The discharge signal is read out by conducting strips attached to the plates. The plates are resistive so that the discharge is localized and does not immediately discharge the rest of the plate while the charge replenishes. The discharge is quick, and consequently the RPCs are useful for triggering. There are three layers of RPCs in the barrel, each layer containing two detectors. Therefore, a muon going through the barrel region will be detected up to six times by the RPC, allowing a reasonable estimate of its path through the region. The distance between the RPCs determines the observable energies of the muons, as the muon energy determines the amount of bending between layers, and the bending must be large enough to be observable given the resolution of the RPC. The design of the ATLAS RPCs allows muons in the range of 6-35 GeV to be selected, with a spatial resolution of 10 mm in the z direction and 10 mm in the φ direction.

The Thin Gap Chambers (TGCs) are used for triggering in the end-cap region 1.05 < |η| < 2.4. Additionally, they add information about the φ coordinate to measurements from the MDTs. The TGCs are made up of many wires enclosed in a gas volume between two plates separated by 2.8 mm, read out by conductive strips perpendicular to the wires. The information from the strips can also determine the φ coordinate. The TGCs are constructed in sets of doublets and triplets, the number of each depending on the location in the detector. The TGCs have a resolution of 2-6 mm in the R direction and 3-7 mm in the φ direction. Although their resolution is coarse compared to the MDTs and the CSCs, the TGCs have a fast response time and are critical for the triggering discussed in the next section.
3.2.6 Triggering and data acquisition

Given the bunch crossing rate of the LHC combined with the size and complexity of the detector systems, it is clear that ATLAS collects information at a rate that is unreasonable to store in real-time. A back of the envelope calculation reveals that with 25 ns bunch crossings and at least one event occurring per crossing, there are 40 million events every second. Given that the average event is 1.3 Mbytes in size [8], 40 million events per second is an impossible rate at which to collect and record data, thus a triggering system has been developed that decides which events to keep and which to discard. There are three levels of decision making that occur for each selected event: level 1, level 2, and the event filter.

The level 1 (L1) trigger system makes use of the calorimeter and muon systems to make rough and quick judgments about which events to keep. It is the level that first evaluates each event, and so it must make a decision for every crossing. It accepts only one in one thousand events, reducing the incoming rate for the level 2 system to 40k events per second. Due to the short timescale available for decisions, the L1 system does minimal processing on the data. Typically it is limited to local phenomena, such as determining how many clumps of energy are in various detectors, but the calorimeter has additional hardware designed to also calculate the global missing transverse energy and the total transverse energy. The L1 system calls these clumps of energy Regions of Interest (RoIs), and this list of regions is used as a seed for the level 2 system.

The level 2 (L2) trigger system has a factor of one thousand more time than the L1 trigger to make decisions, and as a result is able to use information from more of the detector subsystems, including the tracking, to construct a better picture of the events. It can evaluate criteria like the shape of showers, is able to do much better particle identification than the L1 trigger, and reconstructs the RoI energies with much better resolution. It has an input rate of 40 kHz from the L1 system, and outputs to the event filter at a rate below 3.5 kHz.

The event filter (EF or L3 trigger) uses advanced offline reconstruction techniques utilizing the full capabilities of the detector to make its decision. The events being processed by the EF are temporarily written to memory, so that even a failure in a computing node will not cause the loss of an event. It also classifies the events that pass into various streams. The most obvious stream is the set of events collected for physics analysis, but there are also streams for detector calibration and other such tasks. It outputs events to be saved at approximately 200 Hz.
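The back of the envelope numbers above can be strung together into a short worked calculation; the rates and event size are those quoted in this section, and the variable names are illustrative only.

# Rough data-rate arithmetic for the trigger chain, using the rates
# quoted above (40 MHz crossings, L1 -> 40 kHz, L2 -> 3.5 kHz,
# EF -> 200 Hz) and an average event size of 1.3 MB.
event_size_mb = 1.3
rates_hz = {"no trigger": 40e6, "after L1": 40e3, "after L2": 3.5e3, "after EF": 200}

for stage, rate in rates_hz.items():
    print(f"{stage:>10}: {rate * event_size_mb / 1e3:10.2f} GB/s")
# Without a trigger ATLAS would have to record ~52,000 GB/s; after the
# event filter the rate is a manageable ~0.26 GB/s.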
3.2.7 Pile-up

Often an interaction can interfere with the detector readout of another unrelated interaction. This phenomenon is referred to as pile-up. There are two types of pile-up, called in-time pile-up and out-of-time pile-up. In-time pile-up is caused by multiple interactions in the same bunch crossing creating many hits in detectors from events other than the one selected by the trigger, leading to objects and tracks being assigned to the wrong interactions. Sometimes this only impacts the lowest level triggers, but this pile-up can also affect the offline reconstruction, especially at high instantaneous luminosities. Out-of-time pile-up occurs when interactions from an earlier or later bunch crossing bleed into the readings of the current bunch crossing. This pile-up is caused by detectors that have a response time longer than the bunch crossing time. Out-of-time pile-up has the most impact on the LAr calorimeters due to their long electronics response time (up to ~400 ns). The muon gas chambers also have a long electronics response time, but they are not as sensitive to pile-up because the rate of detected particles is lower. Out-of-time pile-up can also occur in the inner detector when particles spiral in the detector for longer than the bunch crossing time. Any simulation of interactions in the detector must take into account both of these types of pile-up.

Chapter 4

Object Reconstruction and Definitions

The ATLAS detector is immensely complicated, and therefore the raw data received from the subdetectors must be refined before it can be analyzed properly. The goal of object reconstruction is to turn the disparate set of observed interactions into recognizable and well defined physical objects. The end object is considered likely to be a member of the class we assign it. For example, in our detector we may combine measurements with electron-like activity into one reconstructed electron object. This is not guaranteed to be an electron in reality due to the detector resolution and error, but is likely to be (>90% in the case of electrons). In this section, details are given about the reconstruction and selection of the objects important to this analysis: electrons, muons, jets, and neutrinos (missing transverse energy). The definitions given here are defined by the ATLAS collaboration.

4.1 Electrons

Identifying and reconstructing electrons is critical to creating an accurate picture of a collision event. Measurements from both of the calorimeters and the inner detector tracking systems are used to reconstruct electrons.

Electrons are reconstructed primarily using a seeding algorithm [36]. Seeding algorithms start by finding candidate objects and building a fully reconstructed object around each. The particular algorithm used in the calorimeter is called a sliding window algorithm, which works by constructing a rectangular region of the calorimeter of a fixed size and moving it to maximize the energy within. Once a candidate region is defined, a matching track is searched for, which must be consistent not only with the observed location in the detector, but also with the measured energy. Having both of these checks ensures accurate track matching.
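The sliding window search can be illustrated with a minimal two-dimensional sketch; the window size and grid here are arbitrary stand-ins, not the actual ATLAS calorimeter geometry or reconstruction code.

import numpy as np

# Toy sliding-window seed finder: scan a fixed-size window over a grid of
# calorimeter cell energies and keep the position that maximizes the sum.
def find_seed_window(energies, window=(3, 3)):
    ny, nx = energies.shape
    wy, wx = window
    best, best_pos = -np.inf, None
    for iy in range(ny - wy + 1):
        for ix in range(nx - wx + 1):
            e_sum = energies[iy:iy + wy, ix:ix + wx].sum()
            if e_sum > best:
                best, best_pos = e_sum, (iy, ix)
    return best_pos, best

# A toy 8x8 "calorimeter" with one energetic cluster injected.
rng = np.random.default_rng(0)
grid = rng.exponential(0.1, size=(8, 8))
grid[4:6, 2:4] += 10.0  # injected cluster
print(find_seed_window(grid))  # the window lands on the injected cluster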
Electrons can also be reconstructed by the "softe" algorithm, which uses seeds from the tracking system instead of from the calorimeter. The seeding track must have pT > 2 GeV and a significant number of hits in the inner detector. This track is then matched to an EM cluster in the calorimeter. The softe algorithm is more sensitive to low pT electrons and electrons in jets, while the standard algorithm is better at detecting high pT isolated electrons.

Electrons are given an associated quality level that indicates how likely it is that the reconstructed electron object was the result of a real electron in the detector. A loose electron is defined by a set of cuts using information about hadronic leakage and the shower shape from the middle layer of the EM calorimeter. Loose electrons have high acceptance, but poor background rejection. A tight electron is defined by a complex set of cuts using the full information from the calorimeter layers and the inner detector tracking. It is designed to have high acceptance and good background rejection. Analysis level selection uses the most stringent tight electrons, while the loose electron definition is used for sideband cross checks and background estimates.

Electrons in this analysis are required to pass an additional set of stringent selection cuts beyond the tight electron definition. They must have been identified as an electron by either the calorimeter seeding algorithm alone or by both the calorimeter seeding algorithm and the tracking algorithm. The transverse energy of the electron uses the energy of the seeding calorimeter cluster and is defined as ET = (cluster E)/cosh(track η), where cluster E is the energy of the electron as measured by the calorimeter. The transverse energy of the cluster must satisfy the threshold ET > 25 GeV. Reconstructed electrons are also required to lie within the high efficiency region, excluding the calorimeter crack region:

|η(cluster)| < 2.47, excluding 1.37 < |η(cluster)| < 1.52. (4.1)

A jet can often look like an electron if a significant amount of energy is deposited in the EM calorimeter. These cases are identified and rejected by looking at the isolation variable, or the surrounding energy, both in the nearby areas of the electromagnetic calorimeter and in the hadronic calorimeter behind the selected region. As jets leave a much wider and deeper footprint in the detector, excess energy in these regions is indicative of a jet. A useful variable for evaluating isolation is Etcone20, defined as the transverse energy deposited in the calorimeter in a cone of half opening angle 0.2, minus the energy due to the electron. An isolation criterion of Etcone20 < 3.5 GeV is required of all selected electrons.

During the data-taking period there was a significant problem in one particular region of the LAr calorimeters, leaving a hole in the detector. This malfunction was eventually repaired, and later events in our dataset do not have this problem, leaving approximately 43% of our data affected. The region affected was 0.20 < η < 1.65 and −0.99 < φ < −0.39. To compensate for this malfunction, events recorded during this period that have jets in this region are rejected. Additionally, simulated events are chosen randomly in proportion to the time this malfunction was present, and in these events, if an electron or jet is in the affected region, the event is discarded.
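As a summary of the electron requirements just listed, here is a minimal selection function; the argument names are invented for illustration and do not correspond to the actual ATLAS data format.

import math

# Toy electron selection implementing the cuts described above:
# ET = cluster_E / cosh(track_eta) > 25 GeV, the cluster-eta fiducial
# region excluding the crack, and the Etcone20 isolation requirement.
def passes_electron_selection(cluster_e, track_eta, cluster_eta, etcone20):
    et = cluster_e / math.cosh(track_eta)
    in_crack = 1.37 < abs(cluster_eta) < 1.52
    return (et > 25.0
            and abs(cluster_eta) < 2.47 and not in_crack
            and etcone20 < 3.5)

# Example: a 60 GeV cluster at eta = 1.0 has ET ~ 38.9 GeV and passes.
print(passes_electron_selection(60.0, 1.0, 1.0, 2.0))  # True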
4.2 Muons

Muons are the other lepton of interest to this analysis. As explained in Section 3.2.5, a large portion of the ATLAS detector is dedicated to identifying and measuring the properties of muons. As a result, muons are reconstructed accurately. ATLAS uses many algorithms to reconstruct muons, but for this analysis just one of these algorithms is used for selection. The algorithm, "MuId combined" [37], identifies pairs of inner detector and muon spectrometer tracks using a global fitting procedure. This fit takes into account both the magnetic fields in the detector and the material effects of the detector. Each selected muon must pass a stringent set of track quality cuts:

• Either the number of hits in the B layer, the innermost layer of the pixel detector, must be greater than 0, or the B layer module corresponding to this muon must have been disabled.

• The number of pixel hits plus the number of crossed dead pixel sensors must be greater than one.

• The number of SCT hits plus the number of crossed dead SCT sensors must be greater than five.

• The number of pixel holes plus the number of SCT holes must be less than three.

• A more complicated TRT cut is required (see the sketch at the end of this section). Let n = nTRTHits + nTRTOutliers, where a TRT outlier is either a straw tube with a signal that is not crossed by the nearby track, or a set of TRT measurements that do not match smoothly with the pixel and SCT measurements [37].

• If |η| < 1.9, require that n > 5 and nTRTOutliers/n < 0.9.

• If |η| ≥ 1.9 and n > 5, require that nTRTOutliers/n < 0.9; if n ≤ 5, no further requirement is applied.

An isolation requirement is applied to ensure high purity muons. The muon isolation uses variables that evaluate the amount of energy surrounding the muon object inside a ∆R = 0.3 cone. In this calculation we use two values: Etcone30 measures the transverse energy from the calorimeter, while Ptcone30 measures the transverse momentum from the inner detector tracking. To pass the isolation cut it is required that Ptcone30 < 4 GeV and Etcone30 < 4 GeV. In addition, all muons within ∆R < 0.4 of a jet with pT > 20 GeV are removed from the selection, as these are likely jet fragments reconstructed as independent muons. For this analysis, we apply an additional cut requiring the remaining muons to have pT > 20 GeV and |η| < 2.5.
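The TRT portion of the quality requirements is the least straightforward, so a minimal sketch of that logic follows; the function and variable names are ours, chosen for readability rather than taken from ATLAS software.

# Toy version of the TRT track-quality cut described above.
# n_hits, n_outliers: TRT hits and outliers on the muon track.
def passes_trt_cut(eta, n_hits, n_outliers):
    n = n_hits + n_outliers
    if abs(eta) < 1.9:
        # Central muons must have TRT coverage and few outliers.
        return n > 5 and n_outliers < 0.9 * n
    # Forward muons: only apply the outlier fraction cut if the
    # track actually has TRT information (n > 5).
    return n <= 5 or n_outliers < 0.9 * n

print(passes_trt_cut(0.5, 20, 2))   # True: 22 hits, ~9% outliers
print(passes_trt_cut(0.5, 2, 1))    # False: too few TRT hits centrally
print(passes_trt_cut(2.3, 0, 0))    # True: no TRT requirement forward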
4.3 Jets

Jets are complex composite objects made up of many particles, typically representing an originating quark or gluon. As discussed in Section 2.1, the decay time of most quarks and gluons is much longer than the interaction timescale of the strong force. As a result, a bare quark or gluon will hadronize instead of decaying, leaving a shower of hadrons in the detector. This shower can then be reconstructed back into a 4-vector representing the originating parton. Because of the complexity of the jet structure, it is difficult to reconstruct jets accurately, and there are several different methods to approach this problem. The jets in this analysis are clustered using the anti-kt [38] algorithm.

The anti-kt algorithm was developed to avoid known problems with other standard algorithms. Naive algorithms give results that are collinearly unsafe: if the hard seed of a jet happens to radiate a high pT gluon, it will no longer be the seed, changing the entire structure of the reconstructed jet even though the underlying jet is the same in both cases. As a result, it is difficult to do theoretical calculations using such algorithms, and this motivated the development of a more sophisticated algorithm, defined below.

Consider the set of all objects and all pairs of objects, and calculate di for each object and dij for each pair using their angular coordinates and their transverse momentum relative to the beam:

dij = min(pT,i^2p, pT,j^2p) (∆ηij^2 + ∆φij^2)/R^2, (4.2)

di = pT,i^2p. (4.3)

Here pT is the transverse momentum of the object, η is the pseudorapidity, φ is the azimuthal coordinate, p is a parameter which scales the dependence on pT, discussed in more detail below, and R is the characteristic size of the jet. In this analysis we use R = 0.4. These elements dij and di are combined to form a list and the lowest value dmin is chosen. If it is a dij, both objects i and j are removed from the list, combined, and the combination is inserted into the list. If dmin is a single object di, that object is declared a jet and removed from the list. The process repeats until all objects are removed from the list. As a result, every input object becomes either part of a jet or a jet itself.

The choice of the value of p significantly changes the behavior of the algorithm. The case where p = 1 is called the kT algorithm, and in this case low pT objects are the first to combine, making the shape of the jet sensitive to these low pT objects. This sensitivity means that the algorithm is infrared unsafe: the result can change significantly if soft radiation is added. Instead, for this analysis we use the anti-kt algorithm, where p = −1, which causes the highest momentum objects to be the first to combine with their neighbors. We can immediately see that two nearly collinear jet fragments will be among the first pairs to combine, which means that this algorithm is collinearly safe. In addition to being collinearly safe, the anti-kt algorithm is also infrared safe: any soft radiation near a real jet will quickly be combined with a high energy neighbor, resulting in a single high energy potential jet fragment. The anti-kt algorithm is also defined such that two jets cannot share a detected object, as could be the case in the naive example given at the beginning of this section. Furthermore, the anti-kt algorithm is implemented in a computationally efficient way [39]. These benefits create a compelling case to use the anti-kt algorithm.
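The clustering procedure of Eqs. 4.2 and 4.3 can be written down compactly. The sketch below is a toy O(N^3) version for illustration only (the production implementations of Ref. [39] are far more efficient), with inputs given as (pT, η, φ) triples and a simple pT-weighted recombination scheme, both our own simplifications.

import math

# Toy anti-kt clustering (p = -1) following Eqs. 4.2-4.3.
def antikt_cluster(objects, R=0.4, p=-1):
    objs = list(objects)
    jets = []
    while objs:
        # Beam distances d_i (Eq. 4.3).
        best = min((objs[i][0] ** (2 * p), ("beam", i)) for i in range(len(objs)))
        # Pairwise distances d_ij (Eq. 4.2).
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                (pti, etai, phii), (ptj, etaj, phij) = objs[i], objs[j]
                dphi = (phii - phij + math.pi) % (2 * math.pi) - math.pi
                dij = (min(pti ** (2 * p), ptj ** (2 * p))
                       * ((etai - etaj) ** 2 + dphi ** 2) / R ** 2)
                best = min(best, (dij, ("pair", i, j)))
        if best[1][0] == "beam":
            jets.append(objs.pop(best[1][1]))  # declare a final jet
        else:
            i, j = best[1][1], best[1][2]
            (pti, etai, phii), (ptj, etaj, phij) = objs[i], objs[j]
            pt = pti + ptj  # pT-weighted recombination of the pair
            merged = (pt, (pti * etai + ptj * etaj) / pt,
                      (pti * phii + ptj * phij) / pt)
            objs = [o for k, o in enumerate(objs) if k not in (i, j)] + [merged]
    return jets

# Two nearly collinear fragments merge into one hard jet, while a
# far-away soft object ends up as its own (soft) jet.
print(antikt_cluster([(50.0, 0.0, 0.0), (20.0, 0.1, 0.1), (1.0, 3.0, 2.0)]))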
Once a list of jet candidates is created, a number of quality selection criteria are applied to ensure that they are physical jets that match the possible kinematics of the signal. There are several quality corrections applied, each of which is a small effect [40].

• Jets with negative energy are unphysical and are removed from the event.

• Jets that overlap electrons can represent a misreconstruction of an electron as a jet, thus a procedure is applied to reduce this effect. For each electron, the ∆R between the electron and every positive energy jet is calculated. If there exists at least one jet with ∆R < 0.2 with respect to the electron, the jet closest to that electron is removed.

• Any event which contains a jet marked as "bad" by the jet quality criteria is removed. The "bad" label indicates jets that are likely to be from background events or detector effects, or in regions of the calorimeter that are not well understood. Overall this is a small effect.

In addition, for this analysis jets are required to have pT > 30 GeV and |η| < 2.5. The η cut removes the forward portions of the detector from consideration; this region is more sensitive to the effects of pile-up, increasing the jet energy scale systematic effects described in Section 8.1.

4.4 Missing transverse energy

Neutrinos originating from the collision interact with the ATLAS detector only through the weak force, with an effectively zero cross-section. Consequently, it is impossible to detect them directly. Fortunately, the global kinematics of the interaction can be used to reconstruct the portion of the momentum of the neutrino that is transverse to the beam line. Consider the colliding protons: almost the entirety of their initial momentum is along the beam line, and little of the colliding system's momentum is transverse to the beam line. Since momentum is conserved, the sum of the momenta perpendicular to the beam line must be zero. The momenta of all detected objects are summed vectorially, and the 2-vector that cancels this sum is called the MET (Missing Transverse Energy), or E_T^miss. Note that because this assumption only holds transverse to the beam, there is no information on what the pz of the neutrino may be.

The E_T^miss is sensitive to confounding factors. Any energy not measured by the detector will lead to a mismeasurement of the E_T^miss. As a result, a distinction is often made between E_T^miss originating from a neutrino or other non-interacting particle and E_T^miss from other sources, such as a missed interacting particle, energy measurement errors, or pile-up effects. At ATLAS the E_T^miss is calculated starting from the raw calorimeter measurements, with corrections from the following categories of objects: electrons, jets, muons, and cell-out [41]. Jets are split into two types: hard jets, with pT > 20 GeV, and soft jets, with 20 GeV > pT > 7 GeV. All calorimeter energy fragments that were not reconstructed as part of an object are considered cell-out objects.

In this analysis E_T^miss is a useful tool for separating our signal from the backgrounds, as the signal has high E_T^miss due to its two neutrinos. All of our backgrounds which do not have a neutrino in their final state see a significant reduction from an E_T^miss cut, and in this analysis a hard cut of E_T^miss > 50 GeV is placed on the events.

Chapter 5

Event Selection

In this section the selection criteria applied to the data and simulated events, and the reasoning behind them, are described. These selection cuts are chosen because they keep clean signal events while rejecting background and poorly reconstructed signal events. These cuts come from the Top Working Group, although two of our cuts, the Z veto cut and the E_T^miss angular correlations cut, are specific to this analysis.

5.1 Selecting events from data

The data used are 7 TeV proton-proton collision data from between February 2011 and August 2011. Unprescaled single electron and muon triggers are used to choose event candidates, and the event is required to be flagged as having taken place during a period of running where the LHC had stable beams and all detectors were running without issue. These quality criteria are applied using a list of sections of runs, called a Good Runs List (GRL). These data represent 2.05 fb−1 of integrated luminosity.

5.2 Selecting dilepton events

To select dilepton events and reject our backgrounds, a chain of cuts is applied to both the data and the simulated events. The cuts applied in this analysis are:

• Primary vertex cut.
• Reject events with a "Bad" jet.
• Reject events that may be contaminated by the LAr hole problem.
• Reject events with an electron overlapping a muon.
• Reject events with two muons satisfying ∆φ > 3.10.
• Require two selected leptons.
• Trigger selection of an electron (ee subchannel only).
• Trigger matched to reconstructed electron (ee subchannel only).
• E_T^miss > 50 GeV.
• Z veto cut, 81 GeV < Mll < 101 GeV (ee and µµ only).
• ∆φ(l1, E_T^miss) + ∆φ(l2, E_T^miss) > 2.5.

I will now go into more detail on each of these selection cuts and discuss the rationale behind them. A number of event quality cuts are applied to eliminate events that have been poorly reconstructed or otherwise do not represent a good collision event. These cuts are determined by the Top Working Group, and the implementation and rationale for each is well documented [42]. To ensure that the event is from a collision, a cut is applied requiring that the first primary vertex in the event have at least four tracks.

Another cut is applied to remove events that contain "Bad" jets. These are jets that have been determined to not be associated with a real in-time energy deposit.
Because the presence of a single high pT "Bad" jet can pollute the event kinematics, any event with a bad jet with pT > 10 GeV is removed from the selection. A third cut rejects events if they may have been impacted by the LAr hardware issue discussed in Section 4.1. The fourth cut rejects events in which a selected muon and a selected electron share a track. This would indicate that the electron was erroneously reconstructed from energy deposits left in the calorimeter by the muon, and consequently these events are discarded.

In the µµ channel, there is an additional veto to remove coincident cosmic ray events. Cosmic muons can appear as two muons in the detector, with one track from the muon going in and another back to back track of the muon leaving. As a result, this cut rejects events with pairs of muon tracks that match up closely. Specifically, the muons must have been reconstructed with opposite charge, both must have an impact parameter greater than 0.5 mm, and they must have ∆φ > 3.1.

After these cuts are applied, the remaining events have no obvious reconstruction errors and originate from collisions in our detector. These events are subjected to further cuts to enhance the signal to background ratio as much as possible. This analysis is divided into three channels with differing lepton combinations. Events are selected that have two electrons (ee channel), an electron and a muon (eµ channel), or two muons (µµ channel). Each of these channels requires that the leptons selected meet the quality criteria defined in Sections 4.1 and 4.2.

In the ee channel it is also ensured that the EF E20 electron trigger fired for this event. Furthermore, this triggering object must be consistent with at least one of our selected electrons by meeting the requirement ∆R(electron, trigger object) < 0.15. Due to a bug in simulating the trigger conditions for the muons in the 2010 simulated events, the same procedure cannot be repeated for muons.

We consider three regions of our analysis defined by the number of jets: 1-jet, 2-jet, and 3+jet. As the largest background, tt̄, contains two jets in the final state, 1-jet events are considered the primary signal region. Since the tt̄ background yield dominates in the 2+jet bin, events with more than one jet are used to constrain the uncertainty in the tt̄ normalization. One distinguishing characteristic of the W t signal is the presence of two neutrinos, hence it is required that events have E_T^miss > 50 GeV. This cut eliminates much of the fake dilepton background.

Even after all of the previous cuts, the ee and µµ channels suffer from large contamination from Z → ee and Z → µµ events due to their relatively large cross-sections. To reduce the impact of these backgrounds, an additional cut is made on events with a dilepton invariant mass near the Z boson mass, 81 GeV < Mll < 101 GeV. This cut is the same for the ee and µµ channels, because in this energy regime the energy resolutions of reconstructed electrons and muons are similar.

A powerful cut reduces the Z → ττ background significantly. This cut is performed by taking the sum of the ∆φ of both leptons with the missing transverse energy vector. The cut value is optimized to maximize background rejection while minimizing signal loss. This triangle cut is defined as:

∆φ(l1, E_T^miss) + ∆φ(l2, E_T^miss) > 2.5. (5.1)

The resulting impact on the events is shown in Figs. 5.1 and 5.2. Although there is some discrimination power in the individual distributions, when they are summed together the reason for the triangle cut becomes obvious, as we are able to eliminate many background events without losing much signal.
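Eq. 5.1 can be written as a short function; as before, the names are illustrative, and azimuthal differences are folded into [0, π].

import math

# Toy implementation of the triangle cut of Eq. 5.1: the sum of the
# azimuthal separations between each lepton and the missing transverse
# energy vector must exceed 2.5.
def delta_phi(phi1, phi2):
    return abs((phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi)

def passes_triangle_cut(phi_lep1, phi_lep2, phi_met, threshold=2.5):
    return delta_phi(phi_lep1, phi_met) + delta_phi(phi_lep2, phi_met) > threshold

# In Z -> tau tau events the MET tends to point along the visible leptons,
# giving a small sum; in Wt events the neutrinos are less aligned.
print(passes_triangle_cut(0.1, 0.3, 0.2))    # False: MET along the leptons
print(passes_triangle_cut(0.0, 2.8, -2.0))   # True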
Although there is some discrimination power in the individual distributions, when they are summed together the reason for the triangle cut becomes obvious, as we are able to eliminate many background 62 Events / 0.16 Events / 0.16 events without losing much signal. 220 ATLAS -1 200 L dt = 2.05 fb 180 1+ jets s = 7 TeV 160 140 120 100 80 60 40 20 0 0 1 2 3 ∫ ∆ΦLep1,Emiss) T 180 160 ATLAS L dt = 2.05 fb-1 140 1+ jets s = 7 TeV 120 100 80 60 40 20 0 0 1 2 3 miss ) ∆Φ(Lep2,E ∫ T (a) (b) Data JES uncertainty Wt tt WW/ZZ/WZ Z(ee/µµ)+jets Z(ττ)+jets Fake dilepton Figure 5.1: The impact of the triangle cut on signal and background: (a) the angle between miss miss leading lepton and ET (b) the angle between the second lepton and ET . The simulated events are represented by the solid regions, while the data are represented with a black dot. An event selection is applied that divides the events into three exclusive bins: dielectron, dimuon, and electron-muon. These bins are examined separately in the control regions to make sure that the backgrounds are well modeled. Plots of selected variables in these bins are shown in Appendix A. Examining them in the bins independently gives us a useful tool for diagnosing the cause of disagreements between the data and the simulated events. Since the kinematics of these three subchannels are similar, for the final analysis they are merged 63 ∆φ(Lepton2,MET) ATLAS Preliminary Wt Drell-Yan 3 2.5 2 1.5 1 0.5 0 0 0.5 1 1.5 2 2.5 3 ∆φ(Lepton1,MET) Figure 5.2: The effect of the triangle Z → τ τ veto cut in two dimensions. 64 together. 5.3 Event yields Table 5.1 shows the resulting yields after selection along with the simulated statistical and data-driven uncertainties. We expect 3003.5 events in our signal region and observe 3059. The data are in reasonable agreement with our background and signal estimates given the data statistical uncertainty and the simulated event yield uncertainty. In addition, agreement is also observed individually in the ee, eµ, and µµ channels. Distributions of relevant kinematic variables are shown in Fig. 5.3 for the combined channel. Similar Figs. are available for the three individual ee, eµ and µµ channels in Appendix A. Process Wt t¯ t WW WZ ZZ Z → ee (DD) Z → µµ (DD) Z → τ τ (DD) Fake dilepton (DD) Total expected Data Observed ee µµ 38.6 ± 0.8 65.3 ± 1.0 438.1 ± 4.5 738.5 ± 5.8 16.7 ± 2.4 29.0 ± 2.9 4.9 ± 0.7 13.8 ± 1.2 0.9 ± 0.1 4.5 ± 0.3 35.7 ± 2.5 — — 69.5 ± 3.1 1.1 ± 0.6 5.7 ± 3.4 9.0 ± 9.0 — 542.0 ± 10.7 926.3 ± 8.1 573 ± 24 905 ± 30 eµ All combined 119.7 ± 1.3 223.6 ± 1.8 1336.0 ± 7.8 2509.6 ± 10.7 55.3 ± 4.1 101.0 ± 5.6 8.1 ± 0.9 26.8 ± 1.7 0.4 ± 0.1 5.8 ± 0.3 — 35.7 ± 2.5 — 69.5 ± 3.1 2.6 ± 1.6 9.4 ± 3.8 6.9 ± 6.9 15.9 ± 15.9 1529.0 ± 11.4 2997.3 ± 17.6 1581 ± 40 3059 ± 55 Table 5.1: The observed and predicted event yields in the selected dilepton sample with at least one jet and for an integrated luminosity of 2.05 fb−1 . Uncertainties represent the effect of MC statistics for the MC-based estimates and the total uncertainty for the data-driven estimates. 
Figure 5.3: Histograms of the selected sample with combined ee, eµ and µµ channels. The simulated events are represented by the solid regions, while the data are represented with black dots. (a) Jet multiplicity, (b) leading jet pT, (c) HT(jet), (d) E_T^miss, (e) leading lepton pT.

Chapter 6

Signal and Background Estimation

The signal and background processes are modeled using a variety of techniques. Primarily they are based on Monte Carlo models using a pseudo-random number generator (PRNG) to simulate many events. These simulations contain many steps, chaining together several pieces of software to arrive at a complete simulated event. Different software is used to simulate different processes, as some software is known to simulate certain classes of processes better than others. In addition, the same process is simulated using several different software combinations to investigate the dependence of the result on the software. These estimates are done by the analyzers themselves; in particular, I performed the Z → ττ estimate.

1. The events are generated at the parton level and simulated through the initial interaction, which takes into account the parton distribution function of the proton and the underlying Standard Model physics. This analysis uses the physics generators AcerMC 3.5 [43], ALPGEN 2.13 [44], POWHEG 1.0 patch 4 [45, 46], and MC@NLO 3.41 [47, 48]. The processes created by each generator are detailed in Tables 6.1 and 6.2.

2. Bare quarks and gluons are showered into jets using hadronization and parton showering software. The two hadronization software packages are Pythia 6.423 [49] and HERWIG 6.510 [50].

3. The detector is simulated using the GEANT4 3.5 [51] software package. This simulates the geometry of the ATLAS detector in detail, including the energy resolution of detector elements and pile-up effects.

4. For the remainder of the chain, the same reconstruction steps are applied to the simulated events as to the data.

6.1 Monte Carlo modeling

Good simulation of high energy physics events is difficult. It is for this reason that many of the systematic uncertainties shown in Section 8.1 are related to the simulation steps discussed above. Additionally, many cross checks are done to ensure that the simulation is an accurate model of the physics and detector. Simulated samples are shared across the collaboration, therefore many of the cross checks are done at the collaboration or physics group level. However, we independently compare our data with the simulation to verify that the modeling is good. The simulated samples are discussed in detail below and summarized in Tables 6.1 and 6.2.

The W t signal is calculated to have a cross-section approximately 20% of the total single top cross-section at 7 TeV [3, 4, 5], with a theoretical cross-section of σWt = 15.74 pb [4].
It has been simulated using a variety of generator and hadronization model combinations. The nominal sample uses AcerMC 3.5 as the generator and Pythia 6.423 as the hadronization model. The top quark decays almost exclusively to a W boson and a b-quark, while the resulting two W bosons follow the decay branching ratios of the W boson. For the purposes of this analysis, we examine final states in which both of the W bosons decay leptonically into either an electron/neutrino pair or a muon/neutrino pair. This occurs for approximately 5% of the W t events [10]. The tau lepton decays of the W boson are also simulated, and some such events may pass the selection, but they are a small fraction of the total yield due to the approximately 35% branching ratio of τ to electrons and muons.

The tt̄ background makes up the largest background in this analysis. The total cross-section at 7 TeV is σtt̄ = 161 +11 −16 pb [52], approximately ten times larger than the W t signal. Like the W t-channel, the top quarks in the tt̄ process almost exclusively decay into a W boson and b-quark pair, and in this analysis we are interested in the case where both of the W bosons decay leptonically. The major difference is the second b-quark in the final state, but a second b-quark can go undetected if it has low energy or is reconstructed incorrectly. For example, particles with significant momentum may diverge from the cone of the jet and be left out of the reconstruction, giving a reconstructed jet energy lower than the selection threshold. It is for this reason that the tt̄ background is by far the most significant background for a W t-channel analysis. The nominal sample uses the MC@NLO generator with the HERWIG hadronization model.

Additional simulated events have been generated to analyze the contributions from several different systematic uncertainties. For more information on the systematics, refer to Section 8.1. For comparison in generator and hadronization studies, two W t samples have been created, one using MC@NLO as the generator and HERWIG for the hadronization, and a second using AcerMC as the generator and Pythia for the hadronization. Additionally, two tt̄ samples have been created, one using POWHEG as the generator and HERWIG for hadronization, and another using POWHEG as the generator and Pythia for hadronization. For both the tt̄ and W t processes, six different samples have been created exploring a range of ISR/FSR parameter phase space. This scheme allows us to probe the ISR and FSR contributions independently and in combination with each other.

The Z + jets background is significant. While its tree level final state is not similar to the W t signal (it has no real neutrinos to provide E_T^miss), its cross-section is over sixty times higher. Our selection keeps the events where the Z boson decays to two leptons. The Z + jets background is divided into several different samples, depending on the number of jets in the final state. These samples are used to determine the shape of the Z + jets distributions, and the overall normalization is provided by a data-driven method, described in Sections 6.3 and 6.4, to minimize the impact of systematic uncertainties. They are generated with ALPGEN and hadronized with HERWIG. Their respective cross-sections are given in Table 6.2.
The W + jets background is similar to the Z + jets background in that its final state does not resemble the final state of the W t signal, but its cross-section is higher still, approximately 10 times as large as the Z + jets background. This simulated sample is not used directly as an estimate, but is instead used to provide a shape to the data-driven estimate of the fake dilepton background. This background's normalization must be estimated from data because simulating it well is much more difficult than using data-driven methods. Due to its large cross-section and low acceptance, it would require generating many orders of magnitude more events than the other backgrounds. In addition, generating these events accurately would be difficult, as the low acceptance means that the software would have to accurately simulate even rare events. The W + jets sample is generated using ALPGEN and hadronized with HERWIG. The samples are generated based on how many additional partons are involved in the interaction, and additional samples are constructed specifically for the heavier quark flavors [53]. The samples and their respective cross-sections are given in Table 6.2.

The diboson backgrounds W W , W Z, and ZZ are simulated with at least one of the bosons decaying leptonically. These backgrounds were generated using ALPGEN and hadronized with HERWIG. The NLO k-factors were calculated with MCFM for W W and ZZ, and extrapolated from calculations for √s = 14 TeV [54] for W Z.

The simulated events are weighted to a total integrated luminosity of 2.05 fb−1. The simulation accounts for the effect of pile-up by reweighting individual events to compensate for the variation in the mean number of interactions per collision observed in the data. The accuracy of this simulation is evaluated by producing histograms of the number of primary vertices detected, as in Fig. 6.1, and verifying that the simulation agrees with the data within the expected uncertainty.

Figure 6.1: Histograms of the number of primary vertices in data and simulated events for (a) the selected sample and (b) the signal enhanced region. The simulated events are represented by the solid regions, while the data are represented with a black dot.
Description | σ [pb] | Lint [fb−1] | NMC | Generator+Hadronization
W t all decays | 15.74 | 9.5 | 150k | AcerMC+HERWIG
W t all decays | 15.74 | 19 | 300k | AcerMC+Pythia
W t all decays | 15.74 | 19 | 300k | MC@NLO+HERWIG
W t ISR up | 15.74 | 2.2 | 32k | AcerMC+Pythia
W t ISR down | 15.74 | 2.2 | 32k | AcerMC+Pythia
W t FSR up | 15.74 | 2.2 | 32k | AcerMC+Pythia
W t FSR down | 15.74 | 2.2 | 32k | AcerMC+Pythia
W t ISR/FSR up | 15.74 | 2.2 | 32k | AcerMC+Pythia
W t ISR/FSR down | 15.74 | 2.2 | 32k | AcerMC+Pythia
tt̄ not fully hadronic | 89.7 | 2.2 | 200k | MC@NLO+HERWIG
tt̄ not fully hadronic | 89.4 | 2.2 | 200k | POWHEG+HERWIG
tt̄ not fully hadronic | 89.4 | 2.2 | 200k | POWHEG+Pythia
tt̄ not fully had. ISR + | 89.1 | 2.2 | 200k | AcerMC+Pythia
tt̄ not fully had. ISR − | 89.1 | 2.2 | 200k | AcerMC+Pythia
tt̄ not fully had. FSR + | 89.1 | 2.2 | 200k | AcerMC+Pythia
tt̄ not fully had. FSR − | 89.1 | 2.2 | 200k | AcerMC+Pythia
tt̄ not fully had. ISR/FSR + | 89.1 | 2.2 | 200k | AcerMC+Pythia
tt̄ not fully had. ISR/FSR − | 89.1 | 2.2 | 200k | AcerMC+Pythia
single top t-channel (e) | 7.09 | 28 | 200k | AcerMC+Pythia
single top t-channel (µ) | 7.09 | 28 | 200k | AcerMC+Pythia
single top t-channel (τ) | 7.09 | 28 | 200k | AcerMC+Pythia
single top s-channel (e) | 0.47 | 21 | 10k | MC@NLO+HERWIG
single top s-channel (µ) | 0.47 | 21 | 10k | MC@NLO+HERWIG
single top s-channel (τ) | 0.47 | 21 | 10k | MC@NLO+HERWIG

Table 6.1: The simulated samples and their respective cross-sections.

Description | σ [pb] | Lint [fb−1] | NMC | Generator+Hadronization
Z → ℓℓ + 0 partons | 827.4 | 8.0 | 6,600k | ALPGEN+HERWIG
Z → ℓℓ + 1 parton | 166.6 | 8.0 | 1,340k | ALPGEN+HERWIG
Z → ℓℓ + 2 partons | 50.4 | 5.7 | 285k | ALPGEN+HERWIG
Z → ℓℓ + 3 partons | 14.0 | 7.9 | 110k | ALPGEN+HERWIG
Z → ℓℓ + 4 partons | 3.4 | 8.8 | 30k | ALPGEN+HERWIG
Z → ℓℓ + 5 partons | 1.0 | 9 | 9k | ALPGEN+HERWIG
W → ℓν + 0 partons | 8,296 | 2.0 | 3,500k | ALPGEN+HERWIG
W → ℓν + 1 parton | 1,551 | 1.5 | 2,500k | ALPGEN+HERWIG
W → ℓν + 2 partons | 452 | 6.1 | 3,770k | ALPGEN+HERWIG
W → ℓν + 3 partons | 121 | 8.3 | 1,000k | ALPGEN+HERWIG
W → ℓν + 4 partons | 30.3 | 8.3 | 250k | ALPGEN+HERWIG
W → ℓν + 5 partons | 8.3 | 8.4 | 70k | ALPGEN+HERWIG
W → ℓν + bb̄ + 0 partons | 54.7 | 8.7 | 475k | ALPGEN+HERWIG
W → ℓν + bb̄ + 1 parton | 40.4 | 5.1 | 205k | ALPGEN+HERWIG
W → ℓν + bb̄ + 2 partons | 20.0 | 8.8 | 175k | ALPGEN+HERWIG
W → ℓν + bb̄ + 3 partons | 7.6 | 9.2 | 70k | ALPGEN+HERWIG
W → ℓν + c + 0 partons | 517.6 | 1.7 | 860k | ALPGEN+HERWIG
W → ℓν + c + 1 parton | 192.1 | 1.7 | 318k | ALPGEN+HERWIG
W → ℓν + c + 2 partons | 51.0 | 1.7 | 85k | ALPGEN+HERWIG
W → ℓν + c + 3 partons | 11.9 | 1.7 | 20k | ALPGEN+HERWIG
W → ℓν + c + 4 partons | 2.8 | 1.8 | 5k | ALPGEN+HERWIG
W W | 17.0 | 15 | 250k | ALPGEN+HERWIG
W Z | 5.5 | 45 | 250k | ALPGEN+HERWIG
ZZ | 1.3 | 192 | 250k | ALPGEN+HERWIG

Table 6.2: The simulated samples and their respective cross-sections.

6.2 Fake dilepton data-driven estimate

The contributions from W + jets and multijet events are difficult to model correctly in simulation. Instead, a data-driven method is employed to more accurately model this background. These backgrounds are significantly reduced in magnitude by the requirement of two leptons, as the tree level diagrams for these processes have one or fewer leptons. In order for these backgrounds to pass the event selection criteria, one of the quark or gluon jets in the event must be mistakenly reconstructed as a lepton. This misreconstructed jet is referred to as a "fake", and this data-driven method relies on estimates of the prevalence of these fakes using a sideband of the data. A sideband region is a set of data with selection criteria orthogonal to the signal selection.

The matrix method is used in this estimate. It defines a selection of the data using a loose electron and muon requirement and divides the events into one of four categories (NTT, NTL, NLT, NLL) depending on which of the leptons pass the loose definition or the tight definition. The loose selection is made so as to select events with an increased contribution from mis-reconstructed leptons. From these regions we can use equations 6.1 and 6.2 below to estimate the prevalence of real and fake leptons in the analysis selected sample.
In these equations, r represents the real-to-tight efficiency, the probability that a real lepton that passes the loose cut will be identified as a tight lepton, and f represents the fake-to-tight efficiency, the probability that a jet that passes the loose lepton cut will be identified as a tight lepton.

(N_TT, N_TL, N_LT, N_LL)ᵀ = E (N_RR, N_RF, N_FR, N_FF)ᵀ   (6.1)

      ( rr          rf          fr          ff         )
E =   ( r(1−r)      r(1−f)      f(1−r)      f(1−f)     )   (6.2)
      ( (1−r)r      (1−r)f      (1−f)r      (1−f)f     )
      ( (1−r)(1−r)  (1−r)(1−f)  (1−f)(1−r)  (1−f)(1−f) )

Both leptons are selected using looser requirements which are a subset of the tight selection requirements. This allows us to look at leptons which pass the loose requirement and see how they compare to leptons that pass the more stringent tight analysis selection. Electrons are selected by replacing the "isEM tight" and track match requirement from the analysis with an "isEM medium", track match, and b-layer hit requirement. Additionally, the isolation requirement is removed. The muon selection is modified by removing the ID hit, the Etcone, and the Ptcone isolation cut requirements.

Two methods are used to estimate r and f separately. The real-to-tight efficiency is measured in a sample enriched in Z + jets events. Events which have one tight and one oppositely signed loose lepton with an invariant mass within 5 GeV of Mℓℓ = 91 GeV are selected. This selection is dominated by Z + jets events decaying to two leptons, as the events selected have leptons with an invariant mass close to the Z boson mass of 91 GeV. As a result, it provides a high probability that the loose lepton is a real lepton. These loose leptons can be divided into those that pass the tight selection and those that do not, giving the efficiency of a real lepton passing the tight lepton selection.

The fake-to-tight efficiencies are estimated by selecting events with a single loose lepton and ETmiss < 10 GeV. Although this selection is primarily made up of QCD events, it still has significant contamination from real leptons from W + jets and Z + jets events. An iterative procedure has been developed to remove these events [53]. The initial step assumes no contamination, giving a first estimate of the fake-to-tight efficiency. This estimate is used to extract a scale factor between the total number of events and the numbers of W + jets and Z + jets events that pass selection without an ETmiss cut:

k_{W/Z+jets} = (N^tight − N_fake^tight) / (N_{W+jets,MC}^tight + N_{Z+jets,MC}^tight)   (6.3)

The estimate of the fake-to-tight efficiency is then repeated, this time subtracting off the scale factor adjusted W + jets and Z + jets contributions and the MC estimated tt̄ contribution. This procedure is iterated until the efficiency converges to a stable value. The results of these calculations are

re = 85.40 ± 0.10%    fe = 4.86 ± 0.01%
rµ = 98.27 ± 0.03%    fµ = 21.07 ± 0.05%.   (6.4)

Once these values are known, the matrix given in equation 6.1 is inverted to give the estimated composition of the analysis selection, the tight-tight contribution. The dataset used to estimate the yield contains a luminosity of 0.7 fb−1, and the resulting estimate is rescaled by 2.05 fb−1/0.7 fb−1.
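As an illustration of how equations 6.1 and 6.2 are used, the following is a minimal numerical sketch, not the analysis code, of the matrix-method unfolding for a single eµ channel. The efficiencies are the values quoted in equation 6.4; the category counts are placeholder values.

import numpy as np

# Real-to-tight and fake-to-tight efficiencies from equation 6.4
# (electron first, muon second); the event counts below are placeholders.
r_e, f_e = 0.8540, 0.0486
r_m, f_m = 0.9827, 0.2107

def efficiency_matrix(r1, f1, r2, f2):
    """Build the 4x4 matrix E of equation 6.2 for a lepton pair with
    real/fake-to-tight efficiencies (r1, f1) and (r2, f2)."""
    cols = []
    for p1, p2 in [(r1, r2), (r1, f2), (f1, r2), (f1, f2)]:
        cols.append([p1 * p2,                  # both tight       (TT)
                     p1 * (1 - p2),            # tight, loose     (TL)
                     (1 - p1) * p2,            # loose, tight     (LT)
                     (1 - p1) * (1 - p2)])     # both loose       (LL)
    return np.array(cols).T                    # columns: RR, RF, FR, FF

# Observed counts in the loose/tight categories (N_TT, N_TL, N_LT, N_LL).
n_obs = np.array([1200.0, 150.0, 90.0, 40.0])

E = efficiency_matrix(r_e, f_e, r_m, f_m)
rr, rf, fr, ff = np.linalg.solve(E, n_obs)     # (N_RR, N_RF, N_FR, N_FF)

# Fake contribution to the tight-tight selection: the part of the TT row
# that comes from pairs with at least one fake lepton.
fake_tt = r_e * f_m * rf + f_e * r_m * fr + f_e * f_m * ff
print(f"estimated fakes in the tight-tight sample: {fake_tt:.1f}")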
This estimate is affected by several systematic and statistical uncertainties. The statistical uncertainties come from the event counts in the regions used to estimate the efficiencies and the data statistics in the regions used to estimate the final yield. Systematic uncertainty contributions arise from four sources:

• Choice of parametrization of the real and fake efficiencies.
• Additional unmodeled contamination from backgrounds ignored in the signal enhanced regions.
• Differing composition of the control regions from the signal region.
• Differing data-taking conditions between the first 0.7 fb−1 and the full 2.05 fb−1.

Another single top ATLAS analysis has done a thorough estimate of these uncertainties [53]. Instead of repeating these studies in detail for such a small background, we use a conservative estimate based on the findings of the other single-top analysis. We use a normalization uncertainty of 100% to account for these systematic uncertainties. Although this means that the lower end is not modeled properly, the yield of this background is small (< 1%), and the effort required to better understand the systematic does not provide significant gain, as it has little impact on the measured cross-section. Overall it is found that, due to the strict muon selection, the only significant contributions come from the ee and eµ channels. The shape for this background is estimated using W + jets simulated events as described in Section 6.1. The estimated fake dilepton background and its associated uncertainties are given in Table 6.3.

Channel | 1-jet | 2-jet and higher
ee | 6.6 ± 6.6 | 2.4 ± 2.4
µµ | negl. | negl.
eµ | 4.5 ± 4.5 | 3.6 ± 3.6

Table 6.3: Fake dilepton background estimated for a luminosity of 2.05 fb−1. Both statistical and systematic uncertainties are included.

6.3 Drell-Yan data-driven estimate

There is a significant background from Drell-Yan events, in which a Z boson or a virtual photon decays into a pair of leptons. A diagram of these processes is shown in Fig. 2.10. A data-driven procedure called the ABCDEF method estimates the magnitude of this background for the dielectron and dimuon decays. This method uses independent, uncorrelated regions in phase space to divide the data into signal and background enriched regions (shown in Fig. 6.2), and then estimates the ratio of the background population across one of the cuts using the two background enriched regions. This ratio is used to extrapolate the contamination of a third region into the signal region, as shown in equations 6.5 and 6.6:

N_A^predicted = N_D^data × (N_B^data / N_E^data)   (6.5)

N_C^predicted = N_F^data × (N_B^data / N_E^data)   (6.6)

Two variables must be selected which are uncorrelated and have good separation between signal and background. The regions of phase space thus created must have enough events so that the statistical uncertainty on the estimate will be small. The variables chosen for this estimate are the dilepton invariant mass Mℓℓ and the missing transverse energy ETmiss. Typically this method uses only four regions, but because the dilepton invariant mass is used as one of the variables, the cut applied, 81 GeV < Mℓℓ < 101 GeV, gives a total of six regions. These regions and their relative populations, A(1497), B(1514), C(2185), D(12200), E(135626), and F(10350), are shown in Fig. 6.2.

Figure 6.2: A scatter plot illustrating the division of phase space into six regions and their relative population sizes. A larger dot indicates a higher density of events.
A simplistic model would allow the Drell-Yan contribution to the signal regions A and C to be estimated from the populations of the other regions according to equations 6.5 and 6.6 above. This simple model neglects several potential sources of error, and a more robust model must be used. The more sophisticated method must take into account possible contamination from non-Drell-Yan backgrounds in the background control regions B, D, E, and F. In fact, the simulated sample estimates predict a significant contamination in region B, hence this must certainly be modeled. Also, although we have selected two variables minimally correlated with each other, even a weak correlation can cause uncertainty in the estimate. To model the effect of these two systematics, two additional scale factors are added, one as an overall scale factor, and one as a k-factor modifying the non-Drell-Yan simulated background estimate, which is then subtracted from the total event count in each region:

N_A^predicted = N_f^A × (N_D^data − k_A × N_D^MCBG) × (N_B^data − k_A × N_B^MCBG) / (N_E^data − k_A × N_E^MCBG)   (6.7)

N_C^predicted = N_f^C × (N_F^data − k_C × N_F^MCBG) × (N_B^data − k_C × N_B^MCBG) / (N_E^data − k_C × N_E^MCBG)   (6.8)

These parameters are found by constructing a likelihood function and fitting. To make the fit more robust, the likelihood functions for several possible ETmiss cuts, ranging from 10 to 50 GeV in 5 GeV increments, are combined. The event counts are modeled as Poisson distributions and the following likelihood function is maximized:

L(N_f, k) = ∏_{ETmiss cut ∈ 10..50 GeV} Pois(N^obs | N_MC^exp + N_DY^est(ETmiss cut)).   (6.9)

This fit is computed independently for regions A and C, since the contaminating backgrounds in these regions may have a strong dependence on the two selection cuts. An additional variable modeling a linear dependence on the ETmiss was considered, but an analysis showed that having no such dependence was more consistent with the data, and hence the final estimate is done assuming no dependence on ETmiss. The overall scale factors derived from this fit are N_f^A = 1.0 ± 0.1 for region A and N_f^C = 1.2 ± 0.1 for region C. For the final computation, these two fits were combined into an average value of N_f = 1.1 ± 0.1. The background contamination scale factor k was fitted to region C and determined to be k = 1.3 ± 0.2 for ee and 1.4 ± 0.2 for µµ. Region D was excluded due to the contaminating presence of the multijet background in the low Mℓℓ and low ETmiss region.

The systematic uncertainty is estimated by independently varying the fitted N_f and k parameters by 1σ and calculating the change in the background estimate. These variations are considered to be independent and are added in quadrature to give an overall uncertainty for the estimate. This procedure is repeated for each of the 1-jet, 2-jet, and 3-jet inclusive bins, and the results are displayed in Table 6.4.

Channel | 1-jet | 2-jet | 3-jet and higher
ee | 20.1 ± 2.0 | 29.1 ± 3.3 | 4.9 ± 2.0
µµ | 10.7 ± 2.0 | 28.4 ± 3.1 | 12.0 ± 3.1

Table 6.4: Drell-Yan background estimates for selected events in the 1-jet, 2-jet and 3-jet and higher bins, obtained using the ABCDEF method with 2.05 fb−1 of data. The combined statistical and systematic uncertainty is shown.

It can be seen that the overall yield is largest in the 1-jet bin, where it makes up approximately 10% of the overall background. As the jet multiplicity rises, the relative contribution from Drell-Yan decreases. The shape for these backgrounds is modeled using the simulation samples described in Section 6.1.
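As an illustration, applying the simple, uncorrected extrapolation of equations 6.5 and 6.6 to the region populations shown in Fig. 6.2 (deliberately ignoring the contamination and correlation corrections of equations 6.7 and 6.8) looks as follows:

# Region populations from Fig. 6.2; the k-factor corrections of
# equations 6.7 and 6.8 are omitted in this simple illustration.
N_B, N_D, N_E, N_F = 1514.0, 12200.0, 135626.0, 10350.0

ratio = N_B / N_E              # background transfer factor across the MET cut
N_A_pred = N_D * ratio         # equation 6.5
N_C_pred = N_F * ratio         # equation 6.6
print(f"uncorrected Drell-Yan prediction: A = {N_A_pred:.0f}, C = {N_C_pred:.0f}")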
6.4 Z → ττ data-driven estimate

A data-driven estimate was also performed for the Z → ττ background. This background is much less significant than the other backgrounds, especially given the powerful discrimination against it during selection. As a result, after selection Z → ττ makes up approximately 1% of the total background. This estimate uses a method similar to the Drell-Yan estimate, using a background enriched region B to estimate the contamination in the signal region A. Again the Drell-Yan rejection window is chosen as the discriminating variable. The other contaminating backgrounds are subtracted from the yields using their simulation estimated yields, and then the Z → ττ contribution to region A is estimated using the following formula:

DY_A^EST = (DY_A^MC / DY_B^MC) × (Data_B − MC_B^Backgrounds).   (6.10)

The uncertainty is taken to be the difference between the data-driven estimate and the simulation estimate, giving an overall uncertainty of 60%. The estimate is done separately for the ee, eµ, and µµ channels in the 1-jet, 2-jet, and 3-jet inclusive bins. The shape of the distributions is provided by the simulated events discussed in Section 6.1.

Channel | 1-jet | 2-jet | 3-jet and higher
ee | 1.1 ± 0.6 | 5.7 ± 3.4 | 2.6 ± 1.6
µµ | 1.1 ± 0.6 | 1.7 ± 1.0 | 1.2 ± 0.7
eµ | 0.0 ± 0.6 | 0.7 ± 0.4 | 0.8 ± 0.5

Table 6.5: Z → ττ background estimates for selected events in the 1-jet, 2-jet and 3-jet and higher bins. The errors include statistical and systematic uncertainties.
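A minimal numerical sketch of equation 6.10 for a single channel and jet bin; all yields below are invented placeholders for illustration:

# Placeholder yields for one channel/jet bin (illustration only).
dy_a_mc, dy_b_mc = 2.0, 40.0        # simulated Z->tautau in regions A and B
data_b, mc_b_other = 55.0, 12.0     # data and non-Z->tautau MC in region B

dy_a_est = (dy_a_mc / dy_b_mc) * (data_b - mc_b_other)  # equation 6.10
print(f"data-driven Z->tautau estimate in region A: {dy_a_est:.2f}")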
Chapter 7

Multivariate Analysis

After event selection it is clear that while there is excellent background rejection, there still remains a poor signal to background ratio of less than 20% in the 1-jet bin. To increase the statistical significance of the analysis, machine learning techniques are utilized, specifically multivariate analysis (MVA) techniques. Multivariate machine learning is a powerful tool in high energy physics, where there is a large amount of data and many variables with intricate correlations. In a typical cut-based analysis, a small set of variables is chosen and cuts are optimized one at a time. Using multivariate techniques, the amount of data that can be used and the sophistication of the analysis are increased significantly, allowing the analysis to gain much greater sensitivity than without an MVA. In addition, multivariate techniques take many variables as input and combine them into one strongly discriminating variable, making analysis much more straightforward for the human analyst. The construction and optimization of the boosted decision tree is one of my major contributions to this analysis.

7.1 Boosted decision trees

In this analysis boosted decision trees (BDT) [55] are trained using machine learning techniques implemented by the Toolkit for Multivariate Data Analysis with ROOT (TMVA) [56]. To understand what a BDT is, first we will discuss a simpler classifier, the decision tree. A decision tree defines a series of cuts to classify events into signal enhanced regions and background enhanced regions. In Fig. 7.1 an example decision tree is illustrated. In this example, consider an event with Jet1 pT = 45 GeV and MET = 75 GeV. The first node compares its Jet1 pT with the threshold value and as a result moves the event to the node on the right. At the next node its MET is evaluated against the cut, and as a result the event moves right to a signal-enhanced end node. This event is then assigned a numerical value based on the purity of that end node. The purity of the node is defined as the fraction of events in the node that are signal events.

Figure 7.1: An example of a decision tree is shown.

A decision tree can be trained with machine learning techniques. A set of variables is input to the algorithm, and at each decision node the cut that gives the best separation is chosen. This process is repeated recursively until an end point is reached, such as a minimum number of events to create an end node. By using machine learning, a much larger set of variables can be examined over their entire phase space. This machine learning process is referred to as training.

A single decision tree is a useful tool, but it does have drawbacks: a decision tree can be strongly dependent on the input dataset, small changes in this dataset can lead to large changes in the output distribution, and a single decision tree may not be very powerful on its own. These problems can be addressed by training multiple trees, each with a random selection of the input variables. This method is called a random forest.

An additional step is added to the process to improve signal/background separation even further by implementing a boosting procedure, which assigns larger weights to the events that are most important to classify correctly. Boosting is performed between each tree training cycle. Each event initially starts with a weight w corresponding to its contribution to the estimated normalization. After each tree is trained, events which are commonly misclassified have their weights increased, and a new decision tree is trained using these new weights. This machine learning procedure is repeated iteratively until the specified number of trees have been trained. This procedure promotes the correct classification of even the most signal-like background events, and consequently achieves greater separation than a single decision tree. In this analysis the events are reweighted using the AdaBoost algorithm. Let

err_m = (sum of weights of misclassified events) / (total weight of events)   (7.1)

for a tree. Then the weight of each misclassified event is multiplied by a boosting factor to give a new weight:

w_new(i) = ((1 − err_m) / err_m)^β × w_old(i),   (7.2)

where β is a constant. The misclassification rate is less than 0.5 because of the initial reweighting of the background and signal events, and consequently the boosting factor will always be greater than 1. The new set of weighted events is then renormalized so that the overall weight remains the same. The β parameter is varied to find the optimal value for a given analysis.

The Gini index G determines the splitting cut on each node. This is defined for each node as:

G = Σ_{i=1}^{n} W_i P(1 − P).   (7.3)

Here W_i is the weight of each event and P is the fraction of the events belonging to the signal. The value ΔG = G_parent − G_left child − G_right child is then maximized over all possible choices of cuts.

The number of variables that are available to each tree in training is also restricted, to allow weaker variables an opportunity to participate in the overall MVA. Much like the boosting procedure, this can increase the overall separation power by increasing the discrimination on the most difficult to classify events. In practice this is done by randomly selecting a subset of the provided variables for each training iteration.
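The following is a small self-contained sketch of the two pieces just described, the AdaBoost reweighting of equations 7.1 and 7.2 and the Gini index of equation 7.3; the toy events and the β value are illustrative only.

import numpy as np

def gini(weights, is_signal):
    """Gini index of equation 7.3 for one node: sum(W_i) * P * (1 - P),
    with P the weighted signal fraction of the node."""
    w_tot = weights.sum()
    p = weights[is_signal].sum() / w_tot
    return w_tot * p * (1 - p)

def adaboost_step(weights, misclassified, beta=1.0):
    """One boosting cycle: equation 7.2 applied to misclassified events,
    followed by a global renormalization to preserve the total weight."""
    err = weights[misclassified].sum() / weights.sum()   # equation 7.1
    alpha = ((1 - err) / err) ** beta                    # boost factor > 1
    new_w = weights.copy()
    new_w[misclassified] *= alpha
    new_w *= weights.sum() / new_w.sum()                 # renormalize
    return new_w

# Toy example: six unit-weight events, two misclassified by the current tree.
w = np.ones(6)
mis = np.array([False, True, False, False, True, False])
print(adaboost_step(w, mis))
print(gini(w, np.array([True, True, True, False, False, False])))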
There are a number of conditions applied to determine when to stop training a given tree. One of the criteria is a minimum node size: if a node has fewer events than this limit, it becomes an end node. Eventually all nodes are split until they have fewer than the threshold number of events, and the training ends. Another stopping condition is a limit on the depth of the tree.

The boosted decision tree has been chosen over other multivariate techniques for several reasons. It is a proven technique in the field, used by the single top discovery papers at DØ [16] and CDF [20]. The trained classifier that is created is human readable, which allows a more intuitive understanding of the results. It is insensitive to poorly discriminating input variables and scales well with the number of input variables used, permitting a large number of variables to be used simultaneously.

7.2 BDT variable kinematics

Approximately 70 variables were selected as candidate variables and evaluated by training a BDT on the simulated events. The top 22 variables were selected based on their separation power and how well modeled they were. The separation power S of these variables is defined as [56]:

<S²> = (1/2) ∫ (Y_S(y) − Y_B(y))² / (Y_S(y) + Y_B(y)) dy,   (7.4)

where Y_S(y) is the probability that a signal event has a value y for the variable and Y_B(y) is the probability that a background event has a value y for the variable. Table 7.1 lists the variables and their definitions. Table 7.2 shows the variables' respective separation power. Figs. 7.2, 7.3, and 7.4 show data-background agreement of these variables in the 1-jet bin. Distributions of these variables in the 2-jet and 3+jet inclusive bins are shown in Sections A.1 and A.2. The 2-jet and 3-jet inclusive bins are dominated by the tt̄ background, and we take advantage of this to constrain the uncertainty on the normalization of tt̄.

Variable | Definition
pT^sys | pT of the leading jet, the leptons, and the MET summed vectorially
σ_pT^sys | pT^sys / √(HT + ΣEt), where ΣEt is the scalar sum of all of the energy observed in the calorimeter
Centrality(Lep1Lep2Jets) | Centrality of the selected jets and leptons. Centrality is defined in Section 7.2.2
η^Thrust | η of the thrust. Refer to Section 7.2.1 for details on thrust
η^Lep1Lep2 | η of the dilepton system
η^Lep1Lep2Jet1 | η of the dilepton and leading jet system
η^Lep1 | η of the leading lepton
E^Lep1Lep2 | Energy of the dilepton system
HT^jets | Scalar sum of the pTs of the selected jets
pT^Lep1Lep2Jet1 | Transverse momentum of the system composed of the two leptons and the leading jet
Thrust | The thrust of the event. Refer to Section 7.2.1 for a definition of this variable
M^Lep2Jet1 | Invariant mass of the subleading lepton and the leading jet
η^Lep1Jet1 | η of the system of the leading lepton with the leading jet
η^Lep2 | η of the subleading lepton
η^Jet1 | η of the leading jet
∆φ(Lep, Jet1)_min | The minimum difference in the phi coordinate between each of the leptons and the leading jet
M^Lep1Jet1 | Invariant mass of the leading lepton and leading jet system
∆φ(Lep1Jet1, Lep2) | The difference in the phi coordinate between the leading lepton plus leading jet system and the subleading lepton
ETmiss | The missing transverse energy, discussed in Section 4.4
∆η(Lep1, Jet1) | The difference in the eta coordinate between the leading lepton and the leading jet
∆R(Lep2, Jet1) | The opening angle between the subleading lepton and the leading jet
M(LepJet1)_max | The maximum invariant mass of each of the leptons with the leading jet

Table 7.1: A listing of the variables used in the BDT and their respective definitions (see text for details).
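Equation 7.4 can be estimated directly from binned, unit-normalized response histograms; a minimal sketch, assuming two histogram densities on a common binning:

import numpy as np

def separation(y_s, y_b, bin_width):
    """Discrete estimate of <S^2> from equation 7.4. y_s and y_b are
    unit-normalized histogram densities on a common binning."""
    num = (y_s - y_b) ** 2
    den = y_s + y_b
    mask = den > 0                      # skip empty bins
    return 0.5 * np.sum(num[mask] / den[mask]) * bin_width

# Toy example: two overlapping Gaussians histogrammed on [-5, 5].
edges = np.linspace(-5, 5, 51)
width = edges[1] - edges[0]
rng = np.random.default_rng(1)
y_s, _ = np.histogram(rng.normal(+0.5, 1, 100000), bins=edges, density=True)
y_b, _ = np.histogram(rng.normal(-0.5, 1, 100000), bins=edges, density=True)
print(f"separation = {separation(y_s, y_b, width):.3%}")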
Figure 7.2: The top five variables in the BDT ranked by separation power. In these histograms the data are compared to the simulated background estimate in the 1-jet bin.

Figure 7.3: The 6th-10th top variables in the BDT ranked by separation power. In these histograms the data are compared to the simulated background estimate in the 1-jet bin.

Figure 7.4: The 11th-15th top variables in the BDT ranked by separation power. In these histograms the data are compared to the simulated background estimate in the 1-jet bin.

Figure 7.5: The 16th-20th top variables in the BDT ranked by separation power. In these histograms the data are compared to the simulated background estimate in the 1-jet bin.
Figure 7.6: The 21st and 22nd top variables in the BDT ranked by separation power. In these histograms the data are compared to the simulated background estimate in the 1-jet bin.

Variable | Separation
pT^sys | 6.76%
σ_pT^sys | 6.17%
Centrality(Lep1Lep2Jets) | 3.82%
η^Thrust | 3.43%
η^Lep1Lep2 | 3.19%
η^Lep1Lep2Jet1 | 2.94%
η^Lep1 | 2.58%
E^Lep1Lep2 | 2.56%
HT^jets | 2.38%
pT^Lep1Lep2Jet1 | 2.31%
Thrust | 2.01%
M^Lep2Jet1 | 1.42%
η^Lep1Jet1 | 1.32%
η^Lep2 | 1.24%
η^Jet1 | 1.13%
∆φ(Lep, Jet1)_min | 1.05%
M^Lep1Jet1 | 0.88%
∆φ(Lep1Jet1, Lep2) | 0.84%
ETmiss | 0.76%
∆η(Lep1, Jet1) | 0.71%
∆R(Lep2, Jet1) | 0.56%
M(LepJet1)_max | 0.55%

Table 7.2: A listing of the variables used in the BDT and their respective separation power (see text for definitions).

7.2.1 Thrust

The thrust variable is defined as a vector whose direction represents the axis that maximizes the sum of the positive parallel components of the momenta of the selected leptons and jets, and whose magnitude represents the fraction of the momentum in the event along this direction. It is calculated by first searching for the thrust axis. Eta and phi are scanned in 0.05 increments, each combination defining a potential thrust axis. For each selected lepton and jet, the momentum parallel to the axis is calculated. To reduce the impact of back-to-back objects, we consider only the positive contributions to the thrust vector: if the parallel momentum is positive, it is summed with the others, and otherwise it is discarded. After the summation is complete, the thrust is calculated by dividing this value by the scalar sum of all selected objects' momenta. The eta and phi combination that maximizes this value defines the thrust vector.

7.2.2 Centrality

Centrality is a measure of the fraction of the momentum of the jets and leptons that is transverse to the beam line. It is defined by taking the selected lepton and jets and calculating the scalar sum of the transverse momenta divided by the scalar sum of the total momenta.
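A minimal sketch of the thrust-axis grid search and the centrality calculation defined above; the object kinematics, the η scan range, and the exact scan bounds are illustrative assumptions, not the analysis configuration.

import numpy as np

def p3(pt, eta, phi):
    """Cartesian momentum from (pT, eta, phi)."""
    return np.array([pt * np.cos(phi), pt * np.sin(phi), pt * np.sinh(eta)])

def axis(eta, phi):
    """Unit vector pointing in the (eta, phi) direction."""
    return np.array([np.cos(phi), np.sin(phi), np.sinh(eta)]) / np.cosh(eta)

def thrust(objects, d_eta=0.05, d_phi=0.05):
    """Grid search for the thrust axis: maximize the sum of positive
    momentum components along the axis, divided by the scalar momentum sum."""
    momenta = [p3(*o) for o in objects]
    p_scalar = sum(np.linalg.norm(p) for p in momenta)
    best = 0.0
    for eta in np.arange(-2.5, 2.5, d_eta):          # assumed scan range
        for phi in np.arange(-np.pi, np.pi, d_phi):
            u = axis(eta, phi)
            # only positive parallel components contribute
            s = sum(max(p @ u, 0.0) for p in momenta)
            best = max(best, s / p_scalar)
    return best

def centrality(objects):
    """Scalar pT sum over scalar |p| sum of the selected leptons and jets."""
    pts = sum(o[0] for o in objects)
    ps = sum(np.linalg.norm(p3(*o)) for o in objects)
    return pts / ps

# Toy event: two leptons and one jet, given as (pT [GeV], eta, phi).
event = [(45.0, 0.3, 0.1), (30.0, -0.8, 2.5), (60.0, 1.2, -2.0)]
print(f"thrust = {thrust(event):.3f}, centrality = {centrality(event):.3f}")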
7.2.3 Motivation for variable choice

This section discusses the reasoning that went into selecting the candidate variables for the BDT. Figures 7.7 and 7.8 show the two most important processes: W t-channel and tt̄. The final state for these two processes is similar, except that tt̄ has an extra jet. If one of its jets is lost during reconstruction, it becomes similar to a W t-channel event. Almost all of the variables chosen were selected to differentiate between the subtle differences between these two processes.

First, all of the kinematic information from each of the final state objects is considered as a variable. Although none by itself provides good separation, together with one of the more complex variables it may prove to be useful to the BDT. This is shown to be true for ETmiss and the η of the leptons and jet.

Two of the variables, pT^sys and σ_pT^sys, measure the vector sum of the pT of the hard interaction and the ETmiss. If the second jet of a tt̄ event interacted with the calorimeter but did not meet the jet selection criteria, the event would have a high pT^sys and σ_pT^sys. On the other hand, all of the W t-channel's final state particles must be detected to meet the selection criteria. As a result, it should have relatively low pT^sys and σ_pT^sys, since there are no high pT objects failing the selection criteria. The pT^Lep1Lep2Jet1 variable is also chosen to discriminate between W t-channel and tt̄, on the basis of the difference in the pT distributions.

Another set of variables considered is the angular correlations between the final state particles. The two leptons and the jet in the tt̄ and W t-channel final states will have different angular correlations due to the existence of the second top decay in tt̄. For this reason the variable list contains many different η and φ correlations between final state particles and combinations of final state particles.

Due to the two neutrinos in the final state, the reconstruction of the invariant mass of the W bosons or the top quark is not possible. A sophisticated method trying to use invariant mass constraints to reconstruct the neutrinos was attempted, but did not provide accurate results. Instead, the only information we have is the estimate of the vector sum of their pTs, the ETmiss. Consequently, there are no variables using neutrino kinematic information and only a few variables using information from the less powerful ETmiss.

Calculations of the invariant mass of a lepton with the jet, however, are useful in identifying which of the leptons originates from the top quark. The lack of information about the neutrinos means that these invariant masses do not have great resolution, but they still provide some information. Although both the tt̄ and W t-channel processes have a top quark decaying, once the lepton associated with the top has been identified, variables associated with the other lepton may improve separation. This is the kind of physics that MVAs are useful for: by themselves these variables provide little information, but in combination with other variables they help provide good separation.

Figure 7.7: The decay chain of an example W t-channel event. It has a final state with one b-quark, two oppositely signed leptons, and two neutrinos.

Figure 7.8: The tt̄ process. It has a final state with two b-quarks, two oppositely signed leptons, and two neutrinos.

7.3 Optimization and cross checks

Overtraining is caused by an MVA which has been trained to the point where it is sensitive to statistical fluctuations in the simulated events. The result is that if the same trained MVA is used to evaluate a new set of simulated events generated under identical conditions, it would output a different distribution. The consequences of using an overtrained MVA can range from using a poorly-optimized MVA in the analysis, resulting in lower significance, to an outright bias in the results. When training a BDT, it is important to ensure that one does not overtrain on the available simulated events. To evaluate whether a prospective BDT is overtrained, only half of the input simulated events are used for the training, and after the training is complete both halves are run through the BDT independently. The resulting distributions are compared in a Kolmogorov-Smirnov test [57]. If the K-S test shows disagreement, defined as a K-S test result < 0.5 for either the signal or background distribution, the trained BDT is determined to be overtrained and is discarded. The overtraining plot and K-S test values for the final BDT are shown in Fig. 7.9. The solid areas represent the sets of events used for testing, while the dots represent the sets of events used for training. It is seen from the K-S test values that these are in good agreement.
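A minimal sketch of this style of overtraining check, assuming SciPy is available; the arrays stand in for the BDT response values of the training and testing halves, and SciPy's two-sample test is used here as a stand-in for the K-S calculation performed inside TMVA:

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(7)
# Stand-ins for the signal BDT response on the two independent halves.
sig_train = rng.normal(0.1, 0.15, 5000)
sig_test = rng.normal(0.1, 0.15, 5000)

# Two-sample Kolmogorov-Smirnov test; a small probability flags a
# significant train/test difference (threshold 0.5 in the text).
stat, p_value = ks_2samp(sig_train, sig_test)
overtrained = p_value < 0.5
print(f"K-S value = {p_value:.3f}, overtrained: {overtrained}")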
The MVA is optimized by maximizing the separation of signal and background while avoiding overtraining. A number of parameters are adjusted in the course of this optimization. An iterative procedure is performed to optimize this BDT, in which the BDT parameters in Table 7.3 are adjusted based on whether the current training resulted in an overtrained BDT or not. The procedure is repeated until further iterations result in no improvement in the significance. The values selected for the BDT parameters in the final optimized BDT, and their step sizes, are listed in Table 7.3.

Figure 7.9: The classifier output for the training and test samples for signal (in blue) and background (red). The signal has a K-S test value of 0.866 while the background has a K-S test value of 0.941.

Parameter | Value | Step size
Number of trees trained (number of boosting cycles) | 300 | 20
Minimum number of events in an end node | 500 | 20
Maximum depth of tree | 2 | 1
AdaBoost parameter (β) | 1.0 | 0.1
Number of cuts sampled | 8 | 1

Table 7.3: The parameters used in the final optimized BDT.

The depth of the trees ended up being surprisingly shallow. My interpretation of this is that the BDT for this set of variables is sensitive to pairs of variables. For example, in the invariant mass example above, one variable gives information about which lepton is associated with the top, and the next makes a separating cut based on that information. Unfortunately, it seems that the BDT is not sensitive to deeper relationships between variables.

Figure 7.10: The signal selection efficiency vs total background rejection using the BDT classifier output. The solid blue line is from the BDT, while the long dotted line is from a simple cut-based optimization using the two most powerful variables. The short dotted line is the effect of a cut from a hypothetical variable with zero separation power, to show a worst case scenario.

The signal selection efficiency vs background rejection is shown in Fig. 7.10. This figure contains not only the performance of the BDT, it also contains the efficiency of a simple cut-based analysis done using the top two discriminating variables. Although this is a difficult classification problem, the gains from using the BDT are seen by comparing it with the cut-based analysis. In Fig. 7.11 the difficulty of classifying events is seen even more clearly. Few of the signal events have a BDT response of > 0.2, and none have a response > 0.4.

Figure 7.11: The BDT classifier output (a) in the 1-jet bin, (b) in the 2-jet bin, (c) in the 3-jet inclusive bin. The simulated events are represented by the solid regions, while the data are represented with a black dot.
The BDT by itself does not give good separation between signal and background. However, in the next chapter we take advantage of the improved separation of the binned BDT response distribution by modeling it with a likelihood function.

Chapter 8

Significance, Cross-Section Measurement, and Systematic Errors

In this chapter we discuss the methods used to estimate systematic uncertainties and the statistical techniques used to measure the cross-section and determine the statistical significance of the result. To calculate the significance and cross-section, a template fit is performed using the BDT distribution in the 1-jet, 2-jet and 3-jet inclusive bins. Although only the 1-jet bin has a good signal to background ratio, the 2-jet and 3-jet inclusive bins are included to constrain the backgrounds, particularly tt̄. The systematic uncertainties evaluated in this fit are discussed below.

8.1 Systematic uncertainties

The primary sources of systematic error have been estimated using a variety of means. The methods to estimate the systematic effects have been provided by the ATLAS collaboration and the top working group [58]. Many of the uncertainties are experimental in nature, such as the jet energy resolution (JER), jet reconstruction efficiency, the lepton identification efficiency, the lepton energy scale, the lepton energy resolution, and the effect of pile-up and the soft jet cutoff on the missing transverse energy. There are also theoretical sources of uncertainty, such as the Monte Carlo generator choice, the hadronization and parton showering modeling, the parton distribution function, and the uncertainty of the cross-section calculation for tt̄ and diboson production. Our data-driven backgrounds also have uncertainties associated with their yields, as was previously discussed in Sections 6.2-6.4. The impact of these systematics is evaluated for both the shape of the BDT response distribution and the acceptance. The list of systematics and their impact on the cross-section measurement is shown in Table 8.3.

Jet energy scale

The jet energy scale uncertainty incorporates several possible sources of uncertainty related to properly measuring the energy of jets [59, 60, 61]. The JES uncertainty has both experimental and theoretical components. The experimental components include the uncertainty in the JES calibration method, the calorimeter response, the simulation of the ATLAS detector, and the effect of pile-up. The theoretical components are evaluated by comparing two different simulation chains. The ATLAS collaboration produces a software tool, JESUncertaintyProvider [62], that is applied to simulated events to simulate a 1σ variation. To evaluate the uncertainty, a 1σ shift is applied to each reconstructed event in both the positive and negative direction, creating two additional sets of simulated events. The jet energy scale uncertainty is one of the largest uncertainties in this analysis, due to both the magnitude of the jet energy scale uncertainty and the importance of jet variables in the discrimination against the backgrounds.

Jet energy resolution

The precision with which a given jet's energy is measured has some uncertainty associated with it, referred to here as the jet energy resolution (JER) [61]. A mismodeling of this energy resolution can lead to differences in the acceptance rate of events and changes in the final state event kinematics.
A software tool, JERProviderTop, applies an additional smearing of the jet energy beyond the nominal energy smearing. To estimate the effect of this systematic, an additional set of simulated events is created by applying this tool to the simulated events prior to event selection. The yields of these simulated events are compared to the nominal simulated events, and half of the difference is taken as a symmetric uncertainty about the nominal value.

Jet reconstruction efficiency

The efficiency with which the ATLAS reconstruction algorithm correctly identifies jets is another source of systematic uncertainty [61, 63]. Jet reconstruction can fail for a variety of reasons, and such failures can cause an event that would be rejected to be accepted, or an event that would be accepted to be rejected. To estimate this uncertainty, the top working group provides a software tool, JetEfficiencyEstimator, which is used to construct an alternate set of simulated events by randomly removing reconstructed jets prior to event selection. The yields of these simulated events are compared to the nominal simulated events, and half of the difference is taken as a symmetric uncertainty about the nominal value.

Initial and final state radiation

The initial and final state radiation uncertainty is a result of difficulties in modeling events which have radiated particles prior to or immediately after the hard interaction vertex of interest, as shown in Fig. 8.1 and discussed in Section 2.1.1. For example, a gluon could radiate off an interacting quark immediately prior to a W t-channel event, creating a second jet in the event which causes the event to end up in the 2-jet bin instead of the 1-jet bin. These effects occur in both single top and top pair processes. The procedure for estimating this effect is to use several independently created simulated samples generated with different ISR/FSR parameters. For each process studied, six simulated datasets are constructed. These datasets are then filtered through the same event selection process as the nominal datasets.

Figure 8.1: An example of a Feynman diagram with ISR.

Background cross-sections

Two backgrounds have significant theoretical cross-section uncertainties that must be accounted for. The diboson background is given a symmetric 5% cross-section uncertainty to account for the associated theoretical uncertainties [54]. The largest background, tt̄, uses an estimated cross-section of 164.57 +11.45/−15.78 pb [64].
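Several of the one-sided comparisons above (for example JER and jet reconstruction efficiency) are turned into symmetric uncertainties by taking half of the observed yield difference; a trivial sketch with placeholder yields:

def symmetric_uncertainty(nominal, shifted):
    """Half of the nominal-vs-shifted yield difference, taken as a
    symmetric relative uncertainty about the nominal value."""
    half_diff = 0.5 * abs(shifted - nominal)
    return half_diff / nominal

# Placeholder yields: nominal sample vs. the JER-smeared sample.
print(f"{symmetric_uncertainty(1000.0, 976.0):.1%}")  # -> 1.2%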
The effect of this choice is estimated by generating an additional set of simulated samples for both the W t ¯ and tt processes. The difference between the two sets of simulated events is then used ¯ as a systematic uncertainty. For tt, the generator dependence is calculated using MCNLO+Herwig and POWHEG+Herwig. The parton showering uncertainty is estimated by comparing POWHEG+Pythia and POWHEG+Herwig. For the W t signal process, the generator uncertainty compares AcerMC+Herwig and MCNLO+Herwig and the parton showering uncertainty compares AcerMC+Pythia and AcerMC+Herwig. Lepton selection efficiency scale factors The leptons go through several layers of selection before reaching the analysis level. The modeling of these various layers is not perfect, and so each layer has an associated selection efficiency uncertainty. The layers considered in this systematic include the triggering 107 efficiency, the offline reconstruction efficiency, and the identification efficiency. The ATLAS collaboration uses detector performance information to create a set of correction factors to be applied to the nominal dataset. The systematic error corresponds to the uncertainty in these correction factors. In addition, the single top group uses its own isolation criteria which also affects the selection efficiency, and a similar process is applied to the nominal dataset using the results from single top isolation studies. These scale factors are calculated separately for electrons and muons. In general, the selection efficiency for leptons is good, and as a result the effect of this uncertainty is relatively small. Lepton energy scale and resolution The uncertainty in the lepton energy originates from both the estimation of the scale of the energy and in the energy resolution of the ATLAS detector. The ATLAS collaboration provides software which can apply a 1σ shift up or down to the pT scale of the leptons to represent the systematic errors associated with lepton energy. For the electrons, the e/gamma performance group provides the egammaAnalysisUtils [68] for the energy scale and resolution. The scaling applied depends on the electron’s E, Et , η, and φ. The energy resolution is estimated by modifying the Gaussian smear that is applied during the event selection using a sigma that is a function of the electron’s E and η. The MCP (Muon Combined Performance) group’s MuonMomentumCorrections software package [69] is used for both the energy scaling and resolution for the muons. The scaling is applied to the muon spectrometer (MS) and inner detector (ID) components of the measurement separately using the muon’s MS pT , ID pT , CB pT , and η. The smearing is also applied to the MS and ID components independently using the same input information from the muon. Like the electrons, the muon smearing is applied by modifying sigma of the momentum smearing 108 that is normally applied to the nominal dataset. miss ET and Pile-up uncertainties miss The soft jet and cell-out components of the ET calculation (previously discussed in Section 4.4) have been investigated and a software tool (METTool) [70] has been developed by the Jet/EtMiss working group that can apply the uncertainty as seen by detector studies. The uncertainty in the cell-out and soft jet components are evaluated simultaneously with a 10% uncertainty, and the systematic shift is used to create a new dataset derived from the nominal dataset. 
An additional systematic representing the uncertainty of the effect of miss pile-up on the ET measurement is assessed using the same tool. Luminosity The luminosity and its associated uncertainty is determined centrally by the ATLAS collaboration [71]. A normalization uncertainty of 3.7% is applied to the simulated background estimates to cover this uncertainty. Additionally, an uncertainty of 3.7% is added to the final cross-section measurement. This is because the luminosity is used to scale the excess signal observed, thus the measured cross-section is directly dependent on the measured luminosity of the data used. Summary table The impact of the various systematic uncertainties on the acceptance of the signal and background processes is shown in Tables 8.1 and 8.2. These tables only contain the effect of the uncertainty on the acceptance, not the full effect on the cross-section measurement. The ¯ largest systematic uncertainties on the dominant tt background are the jet energy scale, the 109 ¯ choice of generator software, the choice of parton shower software, and the tt cross-section uncertainty. In general, the overall uncertainty increases as the number of jets increases, which is to be expected given that one of our dominant uncertainties is the jet energy scale. Although the Z → τ τ and fake dilepton backgrounds have the largest percentage uncertainty, they have little impact on the final result because of their small yields compared to the signal and the other background processes. W t-channel 1-jet +1.3 % Jet Energy Scale −2.4 Jet Energy Resolution ± 1.2% Jet Reconstruction ± 1% Lepton Scale Factor ± 3.0% Lepton Resolution ± 0.5% +5.9 ISR/FSR −4.2 % Generator ± 2.0% Parton Shower ± 1.4% Normalization to data − Normalization to theory − +7.4 % Total −6.4 ¯ tt exclusive +7.7 −8.2 % ± 0.3% ± 1% ± 3.2% ± 0.4% +4.8 −5.6 % ± 8.1% ± 9.1% − ± 8.3% +18 % −18 Diboson events +6.7 −5.4 % ± 8.7% ± 1% ± 3.3% ± 1.3% − − − − ± 5% +13 % −12 Z→ τ τ − − − − − − − − ± 60% − ± 60% Drell-Yan Fakes − − − − − − − − − − − − − − − − ± 6.2% ± 100% − − ± 6.2% ± 100% Table 8.1: The effect of the individual systematic uncertainties on the acceptance for selected events in the 1-jet bin. This is evaluated by calculating the change in the overall yield of a process when subjected to a ± 1σ shift of the nuisance parameter. The uncertainties from the shape of the systematics are not covered in this Table. 8.2 Cross-section and significance measurement The primary goal of this analysis is to search for the existence of the single top W tchannelprocess and to measure its cross-section. The statistical method used to perform the cross-section measurement is a profile likelihood fit. 
Profiling is a tool which allows us to use the observed data to estimate the nuisance parameters [72, 73], thus reducing their uncertainty and its effects on the cross-section measurement. We construct a model of the bins of the BDT response distribution using a likelihood function, parametrizing our systematics as nuisance parameters. The BDT response distributions in the nominal and systematic-shifted datasets are used to estimate the nuisance parameters. The likelihood function is then fit to find the optimal value of the signal strength and to constrain the nuisance parameters. From this model we can extract a fitted cross-section and use pseudoexperiments, simulated experiments constructed using the model, to estimate the associated uncertainty. The modeled likelihood function and constrained nuisance parameters are used to generate pseudoexperiments that are compared to the observed data to give a calculated significance. The details of this procedure are described in depth below.

(2-jet bin) | W t-channel | tt̄ exclusive | Diboson | Z → ττ | Drell-Yan | Fakes
Jet Energy Scale | +9.5/−8.4% | −0.7/−0.8% | +30.3/−23.7% | − | − | −
Jet Energy Resolution | ± 5.5% | ± 1.1% | ± 16.8% | − | − | −
Jet Reconstruction | ± 1% | ± 1% | ± 1% | − | − | −
Lepton Scale Factor | ± 3.0% | ± 3.0% | ± 2.5% | − | − | −
Lepton Resolution | ± 0.4% | ± 0.4% | ± 0.7% | − | − | −
ISR/FSR | +1.8/−8.3% | +6.9/−1.3% | − | − | − | −
Generator | ± 5.3% | ± 6.9% | − | − | − | −
Parton Shower | ± 5.6% | ± 2.5% | − | − | − | −
Normalization to data | − | − | − | ± 60% | ± 9.4% | ± 100%
Normalization to theory | − | ± 8.3% | ± 5% | − | − | −
Total | +14/−15% | +14/−12% | +35/−30% | ± 60% | ± 9.4% | ± 100%

(3-jet inclusive bin) | W t-channel | tt̄ exclusive | Diboson | Z → ττ | Drell-Yan | Fakes
Jet Energy Scale | +17.7/−14.7% | +8.7/−6.3% | +40.9/−12.9% | − | − | −
Jet Energy Resolution | ± 2.5% | ± 2.3% | ± 47.0% | − | − | −
Jet Reconstruction | ± 1% | ± 1% | ± 1% | − | − | −
Lepton Scale Factor | ± 3.4% | ± 3.1% | ± 1.8% | − | − | −
Lepton Resolution | ± 0.5% | ± 0.4% | ± 1.0% | − | − | −
ISR/FSR | +2.7/−19.1% | +7.5/−13.4% | − | − | − | −
Generator | ± 17.3% | ± 0.5% | − | − | − | −
Parton Shower | ± 14.1% | ± 0.8% | − | − | − | −
Normalization to data | − | − | − | ± 60% | ± 22% | ± 100%
Normalization to theory | − | ± 8.3% | ± 5% | − | − | −
Total | +29/−33% | +15/−17% | +63/−49% | ± 60% | ± 22% | ± 100%

Table 8.2: The effect of the individual systematic uncertainties on the acceptance for selected events in the 2-jet bin and the 3-jet inclusive bin, i.e., the change in the overall yield of a process when subjected to a ± 1σ shift of the nuisance parameter. The uncertainties from the shape of the systematics are not covered in this table.

8.2.1 The likelihood function

The first step is to construct a likelihood function to model the experiment. The likelihood function is a probability distribution function modeling the probability of seeing the observed dataset as a function of some parametrization of the uncertainties. By maximizing the likelihood function, the set of parameters most consistent with the observed data is obtained. Since we have modeled our systematic uncertainties as parameters in our likelihood function, these uncertainties will be profiled away during the fit. In other words, the likelihood function, which depends on µ, L, and α, will become a profile likelihood function which depends only on µ. This profile likelihood function is then maximized to find the most likely value of µ. The likelihood function is:

L(µ, L, α) = G(L0 | L, σL) × ∏_{k=1..Njet} ∏_{i=1..Nbin} Pois(N_{i,k}^obs | N_{i,k}^exp(µ, α)) × ∏_{j ∈ systematics} G(αj | 0, 1).   (8.1)
In the above, µ is defined as the signal strength (the ratio σ_Wt^obs / σ_Wt^SM), α is the set of nuisance parameters modeling the strength of the systematic uncertainties (including luminosity), and L is the luminosity. There are three indices which are iterated over. The index k represents the 1-jet, 2-jet, and 3-jet inclusive channels. The index i represents the i-th bin of the BDT response template. The nominal distributions of the BDT response in the 1-jet, 2-jet and 3-jet inclusive channels are shown in Fig. 8.2. Finally, the index j iterates over each of the systematic uncertainties, with three exceptions. The luminosity is covered separately in the profile likelihood function, and the generator and parton shower uncertainties are not continuous, hence they cannot be modeled as Gaussian distributions and must be handled independently, as described further below.

The profile likelihood function contains a Poisson term that represents the probability of seeing the observed number of events given our expectation of the yield. The expected yield is calculated by modeling the total signal and background contribution as a function of the signal strength and nuisance parameters: N_{i,k}^{exp}(µ, α) = s_{i,k}^{exp}(µ, α) + b_{i,k}^{exp}(α). The fit to the data (N^{obs}) is made by adjusting the expected signal and background contribution. There is also a Gaussian term that models the probability of observing a luminosity L_0 given the measured luminosity L and its associated uncertainty σ_L.

Figure 8.2: The BDT classifier output for selected events (a) in the 1-jet bin, (b) in the 2-jet bin, and (c) in the 3-jet inclusive bins. The simulated events are represented by the solid regions, while the data are represented by black dots.

The final set of terms, for the systematic uncertainties, are Gaussian distributions which model the probability of a nuisance parameter having a particular value. A value of zero corresponds to the nominal value, a value of one corresponds to a +1σ shift, a value of negative one corresponds to a −1σ shift, and linear interpolation determines the rest of the distribution. These terms penalize improbably large nuisance parameters, even if they make the expected yields match the observed yields closely.
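To make the structure of Eq. 8.1 concrete, a minimal numerical sketch is given below. This is an illustration only, not the RooStats model used in the analysis: the templates, the observed counts, and the single nuisance parameter α are invented, the luminosity term is dropped, and only one channel is modeled. The ±1σ background templates are interpolated linearly in α, as described above.

```python
import numpy as np
from scipy.stats import poisson, norm
from scipy.optimize import minimize

# Invented BDT-response templates for a single jet bin (four bins).
sig_nom = np.array([2.0, 5.0, 12.0, 20.0])    # nominal signal yields per bin
bkg_nom = np.array([40.0, 30.0, 18.0, 8.0])   # nominal background yields per bin
bkg_up  = np.array([44.0, 32.0, 19.0, 8.5])   # background template at alpha = +1
bkg_dn  = np.array([36.0, 28.0, 17.0, 7.5])   # background template at alpha = -1
n_obs   = np.array([45, 37, 28, 27])          # pseudo-observed counts

def expected(mu, alpha):
    """N_exp(mu, alpha) = mu*s + b(alpha), with b interpolated linearly in alpha."""
    shift = np.where(alpha >= 0.0, bkg_up - bkg_nom, bkg_nom - bkg_dn)
    return mu * sig_nom + bkg_nom + alpha * shift

def neg_log_likelihood(params):
    """Negative log of Eq. 8.1 for one channel and one nuisance parameter."""
    mu, alpha = params
    n_exp = np.clip(expected(mu, alpha), 1e-9, None)
    return (-poisson.logpmf(n_obs, n_exp).sum()   # Poisson term per bin
            - norm.logpdf(alpha, 0.0, 1.0))       # Gaussian penalty G(alpha | 0, 1)

fit = minimize(neg_log_likelihood, x0=[1.0, 0.0], method="Nelder-Mead")
mu_hat, alpha_hat = fit.x
print(f"fitted mu = {mu_hat:.2f}, fitted alpha = {alpha_hat:.2f}")
```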
8.2.2 Cross-section measurement

With the experiment modeled, the cross-section is calculated. This is done by finding the minimum of the negative log likelihood function. During this fitting procedure, all nuisance parameters are allowed to float. The signal strength at this minimum is our measured signal strength. The software used for these fitting procedures is RooStats [74].

The profile likelihood fitting procedure determines a fitted parameter value and uncertainty for each of the profiled nuisance parameters. We use these fitted values to assign a new data-driven mean and standard deviation. Naively, one may expect this to give similar results to the +1σ and −1σ shifts calculated with the methods described above. However, there are reasons to expect that this may lead to more constrained values. A nuisance parameter may have been estimated too conservatively, or the event selection criteria may produce a signal region that is less sensitive to a given systematic uncertainty than is estimated using selection-independent procedures. The shape of the template distribution itself, in this case the BDT distribution (especially in the background-dominated 2-jet and 3+ jet regions), may also provide additional constraints on the nuisance parameters. This potential constraining of the nuisance parameters is what makes profiling effective.

The impact of the uncertainties on the cross-section measurement must be assessed. In this analysis we initially used a Profile Likelihood Ratio (PLR), but ultimately a different method utilizing pseudoexperiments was selected, because the PLR fit was sensitive to fitting failures in which the fitting procedure does not converge to a stable set of values. The PLR is constructed as a model to calculate the uncertainty of the cross-section. The PLR is defined as:

\mathrm{PLR}(\mu) = -2 \ln \frac{\mathcal{L}(\mathrm{data} \,|\, \mu, \tilde{\alpha}_{\mu})}{\mathcal{L}(\mathrm{data} \,|\, \hat{\mu}, \tilde{\alpha}_{\hat{\mu}})}, \qquad \hat{\mu} > 0. \qquad (8.2)

Here L is the likelihood function as defined above. The denominator is the value of the likelihood function with the parameters set to the fitted values from the cross-section measurement. The numerator is also the likelihood function, but is not maximized for the optimal value of µ. Instead, various values of µ are chosen, and for each value of µ the likelihood function is minimized. During this minimization all nuisance parameters are allowed to float except for the generator and parton shower nuisance parameters. This set of floating nuisance parameters are the profiled systematics.

The construction of the PLR profiles the nuisance parameters out of the distribution, yielding a likelihood ratio that is no longer a function of our nuisance parameters: for each signal strength, it uses the most likely configuration of nuisance parameters. The resulting PLR function shows the relative likelihood of each µ compared to the globally fitted µ̂. Note that the denominator must always be greater than or equal to the numerator, and as a result the minimum of the PLR must be at the measured cross-section value.
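The profiling in Eq. 8.2 can be sketched in the same toy spirit: for each fixed µ the nuisance parameter is re-minimized, and the resulting curve has its minimum of zero at the fitted µ̂. The one-bin counting model below (signal 20µ, background 100(1 + 0.05α), 125 observed events) is an invented stand-in, not the analysis model.

```python
import numpy as np
from scipy.stats import poisson, norm
from scipy.optimize import minimize_scalar

n_obs = 125  # invented single-bin count

def nll(mu, alpha):
    n_exp = 20.0 * mu + 100.0 * (1.0 + 0.05 * alpha)
    return -poisson.logpmf(n_obs, n_exp) - norm.logpdf(alpha, 0.0, 1.0)

def profiled_nll(mu):
    # The tilde-alpha_mu of Eq. 8.2: minimize over the nuisance parameter at fixed mu.
    return minimize_scalar(lambda a: nll(mu, a), bounds=(-5.0, 5.0), method="bounded").fun

mus = np.linspace(0.2, 2.4, 45)
curve = np.array([profiled_nll(m) for m in mus])
plr = 2.0 * (curve - curve.min())          # -2 ln of the ratio; zero at mu-hat
inside = mus[plr <= 1.0]                   # points below the 1 sigma threshold
print(f"mu_hat ~ {mus[np.argmin(plr)]:.2f}, "
      f"1 sigma interval ~ [{inside[0]:.2f}, {inside[-1]:.2f}]")
```

The 1σ uncertainty is read off where the curve crosses PLR = 1, which is exactly how the thresholds drawn in Figs. 8.3 and 8.4 are used.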
Figure 8.3 shows the expected shape of the PLR distribution. Expected means that all calculations were done without data, instead using the nominal Monte Carlo as the "data" in the calculation. The red dotted curve is the PLR with only statistical uncertainties included. The solid blue curve is the PLR with all systematic and statistical uncertainties included. The width is proportional to the uncertainty, as discussed in greater detail below. As one would expect, when the systematic uncertainties are added to the PLR, the distribution becomes wider. A similar plot for the observed PLR distribution is shown in Fig. 8.4. This is the distribution with the observed ATLAS data that is used for the cross-section measurement. Although it is not identical to the expected distribution, the difference is clearly within the uncertainty in the cross-section measurement.

Figure 8.3: Expected likelihood ratio with only statistical uncertainties (red dashed) and profile likelihood ratio with statistical and a subset of the systematic uncertainties (blue solid) for the Wt cross-section measurement. The full set of systematic uncertainties cannot be included because the PLR will not have a smooth shape. The horizontal green lines show the 1σ, 1.6σ, and 2σ thresholds. This figure is not used in the final cross-section measurement.

Figure 8.4: Observed likelihood ratio with only statistical uncertainties (red dashed) and profile likelihood ratio with statistical and a subset of the systematic uncertainties (blue solid) for the Wt cross-section measurement. The full set of systematic uncertainties cannot be included because the PLR will not have a smooth shape. The horizontal green lines show the 1σ, 1.6σ, and 2σ thresholds. This figure is not used in the final cross-section measurement.

The uncertainty in the measurement can be calculated by examining the shape of the PLR distribution [75]. In Figs. 8.3 and 8.4 the 1σ, 1.6σ, and 2σ thresholds are shown with horizontal green lines. The uncertainty is calculated by locating the intersections of the PLR with the 1σ line [75]. However, these figures do not represent a final result and only contain a subset of the systematic uncertainties, as including all systematics leads to the fitting failures discussed previously. These figures have been left in for illustrative purposes.

In this analysis the PLR fitting algorithm often fails, leaving a non-smooth curve from which an uncertainty cannot be extracted. Instead of using the PLR, we use another method. The uncertainty on the cross-section value is estimated from the profiled nuisance parameters by constructing pseudoexperiments, using the model of our experiment to construct simulated experiments seeded by a random number generator. These pseudoexperiments are used to examine the impact of the varied nuisance parameters on the measured cross-section. To construct the pseudoexperiments, each of the systematic uncertainties profiled is modeled as a Gaussian with a mean and width determined by the constraining procedure. The data and simulated event statistical uncertainties are modeled as Poisson distributions. The full profile likelihood fit is applied to each pseudoexperiment to determine a µ_PE value. The mean and RMS of the distribution of the fitted µ_PE values are used as an estimate of the uncertainty of the cross-section from all systematic and statistical uncertainties (except for the parton shower and generator uncertainties, which are discussed further below).

Once this procedure is established, the contribution to the total uncertainty from the individual uncertainties is estimated. For the data statistical uncertainty, the method is applied while fixing the systematic nuisance parameters to their profiled values. A plot of the distribution of the fitted µ_PE values for these pseudoexperiments is shown for the observed data in Fig. 8.5 (expected is shown in Fig. 8.6). The individual systematic uncertainty contributions are determined by repeating the fit while the uncertainty in question has its nuisance parameter fixed, then subtracting the resulting uncertainty from the total uncertainty in quadrature.
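A stripped-down version of this pseudoexperiment procedure might look like the sketch below. The templates, the profiled mean and width of the nuisance parameter, and the number of pseudoexperiments are invented stand-ins; the structure (Gaussian draws of the constrained nuisance parameter, Poisson fluctuation of the yields, a refit of µ per pseudoexperiment, and subtraction in quadrature) follows the text.

```python
import numpy as np
from scipy.stats import poisson, norm
from scipy.optimize import minimize

rng = np.random.default_rng(7)
sig = np.array([2.0, 5.0, 12.0, 20.0])          # hypothetical signal template
bkg = np.array([40.0, 30.0, 18.0, 8.0])         # hypothetical background template
shift = 0.05 * bkg                              # +-1 sigma effect of one nuisance parameter
mu_fit, alpha_fit, alpha_width = 1.0, 0.0, 0.5  # stand-ins for the profiled fit results

def fit_mu(n_toy):
    """Refit the signal strength on one pseudo-dataset."""
    def nll(p):
        mu, a = p
        n_exp = np.clip(mu * sig + bkg + a * shift, 1e-9, None)
        return -poisson.logpmf(n_toy, n_exp).sum() - norm.logpdf(a, 0.0, 1.0)
    return minimize(nll, x0=[1.0, 0.0], method="Nelder-Mead").x[0]

def pe_rms(n_pe=400, float_alpha=True):
    mus = []
    for _ in range(n_pe):
        # Draw the nuisance parameter, or freeze it to isolate the statistical part.
        a = rng.normal(alpha_fit, alpha_width) if float_alpha else alpha_fit
        n_toy = rng.poisson(mu_fit * sig + bkg + a * shift)
        mus.append(fit_mu(n_toy))
    return np.std(mus)

total = pe_rms(float_alpha=True)                # statistical + this systematic
stat = pe_rms(float_alpha=False)                # nuisance parameter frozen
syst = np.sqrt(max(total**2 - stat**2, 0.0))    # quadrature subtraction, as in the text
print(f"total {total:.3f}, stat {stat:.3f}, systematic contribution {syst:.3f}")
```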
Uncertainties less than 5% are denoted as < 5%, as this method does not give accurate results for small uncertainties. This procedure gives only the uncertainty on the cross-section measurement, as the measured cross-section itself comes from the profile likelihood fitting. Consequently, the mean of the µ_PE distributions may differ slightly from the fitted cross-section value.

Figure 8.5: Observed distribution of fitted µ values for the pseudoexperiments generated while fixing all profiled nuisance parameters to their fitted values (2000 entries; mean 0.9974, RMS 0.1678). The mean and RMS of the distribution are used to calculate the data statistical uncertainty. The histogram is normalized to unit area.

Figure 8.6: Expected distribution of fitted µ values for the pseudoexperiments generated while fixing all systematic nuisance parameters to their fitted values (2000 entries; mean 1.002, RMS 0.1674). The mean and RMS of the distribution are used to calculate the data statistical uncertainty. The plot is normalized to unit area.

∆σ/σ [%]
Source                          | all jets: observed | all jets: expected | 1-jet: observed | 1-jet: expected
Data statistics                 | +17/−17            | +17/−17            | +15/−15         | +18/−18
MC statistics                   | <5                 | <5                 | <5              | <5
Lepton energy scale/resolution  | <5                 | <5                 | <5              | +6/−6
Lepton efficiencies             | +7/−7              | +6/−6              | +11/−11         | +11/−11
Jet energy scale                | +16/−16            | +14/−14            | +28/−28         | +16/−16
Jet energy resolution           | <5                 | <5                 | <5              | +6/−6
Jet reconstruction efficiency   | <5                 | <5                 | <5              | +6/−6
Generator                       | +10/−10            | +12/−12            | +11/−11         | +13/−13
Parton shower                   | +15/−15            | +14/−14            | +6/−6           | +9/−9
ISR/FSR                         | +5/−5              | +6/−6              | +18/−18         | +17/−17
PDF                             | <5                 | +6/−6              | <5              | <5
Pileup                          | +10/−10            | +7/−7              | +10/−10         | +10/−12
tt̄ cross-section                | +6/−6              | +6/−6              | +14/−14         | +12/−12
Diboson cross-section           | +6/−6              | +5/−5              | <5              | <5
Drell-Yan estimate              | <5                 | <5                 | <5              | <5
Fake dilepton estimate          | <5                 | <5                 | <5              | <5
Z → ττ estimate                 | <5                 | <5                 | <5              | <5
Luminosity                      | +7/−7              | +7/−7              | +13/−13         | +8/−8
All systematics                 | +29/−29            | +29/−29            | +40/−40         | +30/−30
Total                           | +34/−34            | +33/−33            | +43/−43         | +35/−35

Table 8.3: Breakdown of the full uncertainty on the Wt-channel cross-section measurement. Unlike Tables 8.1 and 8.2, the percentages listed here represent the uncertainty from both the normalization and the shape of the distribution. The uncertainties from the parton shower and generator systematics are calculated independently as described in the text.

The contributions from the parton shower and generator systematic uncertainties must be calculated independently, as these uncertainties are not continuous and cannot be profiled. Instead, a procedure recommended by ATLAS [76] is used. For each discrete systematic, the full profile likelihood fit is performed for each of its options. The difference between the fitted cross-sections is taken as the cross-section uncertainty associated with this systematic.

The cross-section uncertainty breakdown is shown in Table 8.3. The largest systematic uncertainty contributions come from the JES, generator, and parton shower uncertainties. The fitted nuisance parameters are shown in Table 8.4. A fit value of zero indicates the nuisance parameter remains at the nominal value. An uncertainty of less than one indicates the profiling has constrained the uncertainty. The fake dilepton nuisance parameter is fitted to a value that, combined with its large (100%) uncertainty, leads to a nearly 0% normalization.
Although this is not ideal, the fact that the fake dilepton yield contributes < 1% of the overall yield in the 1-jet bin, and even less in the 2-jet and 3-jet inclusive bins, means that its contribution to the cross-section uncertainty is negligible. Because the impact of this uncertainty is so small, it is not investigated further.

The largest improvement is gained by constraining the JES, tt̄ normalization, and ISR/FSR uncertainties. Although these uncertainties are significantly constrained by the fitting procedure, they are still among the largest uncertainties, as shown in Table 8.3, particularly the JES with a 16% observed uncertainty.

Nuisance parameter            | Fitted value
ISR/FSR                       | 0.75 ± 0.52
PDF                           | 0.01 ± 0.99
JES                           | −0.47 ± 0.42
JER                           | −0.01 ± 0.67
Jet Reco. eff.                | 0.01 ± 0.74
LSF                           | 0.01 ± 0.92
tt̄ normalization              | 0.16 ± 0.68
DY normalization              | −0.75 ± 0.93
VV normalization              | −0.13 ± 0.99
Fake dilepton normalization   | −0.95 ± 0.99
Zττ normalization             | −0.64 ± 0.78
MC stat.                      | 0.00 ± 0.99
Lumi                          | 0.00 ± 0.99

Table 8.4: The fitted nuisance parameters and their uncertainties. The uncertainty contribution from each of these systematics is shown in Table 8.3.

In Table 8.3, the right-hand side shows the uncertainties from a fit using only the 1-jet bin, while the left-hand side shows the uncertainties from fitting all of the jet bins. By comparing the two sides, the benefit of including the 2-jet and 3-jet inclusive bins is clear, reducing the overall uncertainty from 43% to 34%. Although the parton shower uncertainty is increased by adding these bins, the decrease in the jet energy scale, tt̄ normalization, and ISR/FSR uncertainties has a greater impact on the overall uncertainty.

The uncertainty contributed by the generator and parton shower systematic uncertainties is added in quadrature to the systematic uncertainty calculated from the profile likelihood fit to give the overall uncertainty, shown below. The uncertainty from the luminosity is applied not only to the final cross-section measurement, but also to the normalization of the simulated backgrounds. Consequently, the impact of this uncertainty on the cross-section measurement is larger than the 3.7% applied to the luminosity.

\sigma(pp \to Wt + X) = 16.8\,^{+2.9}_{-2.9}\,(\mathrm{stat})\,^{+4.9}_{-4.9}\,(\mathrm{syst})\ \mathrm{pb} \qquad (8.3)

This value is consistent with the Standard Model prediction for the cross-section:

\sigma(pp \to Wt + X)_{NNLL} = 15.7 \pm 1.1\ \mathrm{pb}. \qquad (8.4)

8.2.3 Significance calculation

In addition to the fitted cross-section measurement, we also measure the statistical significance with which we can claim rejection of the null hypothesis (i.e. the background-only hypothesis). To determine this significance, pseudoexperiments (PEs) are generated with both the Standard Model and background-only hypotheses. The nuisance parameters discussed previously are modeled as Gaussian distributions with a mean equal to their nominal value and a standard deviation equal to 1σ. The parton shower and generator systematic uncertainties are now also modeled using a Gaussian distribution, with a mean equal to their nominal value and a standard deviation equal to the difference between the nominal and the alternate set of simulated events generated. For the profiled nuisance parameters, the means and standard deviations used are the ones derived from the profile likelihood fitting procedure, allowing the advantages of the profiling discussed above to be applied to the significance calculation.
For each PE, the log likelihood function is minimized while allowing the nuisance parameters to float, excluding the parton shower and generator uncertainties (which are set to their nominal values). A test statistic q_µ is defined:

q_{\mu} = -2 \ln \frac{\mathcal{L}(\mathrm{data} \,|\, \mu, \alpha_{\mu})}{\mathcal{L}(\mathrm{data} \,|\, 0, \alpha_{0})}. \qquad (8.5)

Here α_µ and α_0 are the maximum likelihood estimators for the Standard Model and background-only hypotheses. The results of these PEs are shown in Fig. 8.7. The curve on the right-hand side is made up of the PEs from the background-only hypothesis. The curve on the left-hand side is made up of the PEs from the signal+background hypothesis. The two vertical lines that are close to each other are the observed and expected q_µ values. The expected and observed q_µ values are both in the center of the left curve, consistent with the signal+background hypothesis.

Figure 8.7: Significance estimation using pseudoexperiments as described in the text (N_PE = 120,000). The continuous line is the q_µ distribution of background-only pseudoexperiments, the dashed curve is the q_µ distribution of Standard Model hypothesis pseudoexperiments, and the red line is the q_µ of the data.

The p-value is then computed by evaluating the fraction of the background-only PEs that have a value more extreme than the one observed. This p-value is the estimate of the probability, given the background-only hypothesis, that an experiment gives a result greater than or equal to the one observed in the data. The p-value is used to calculate the significance in standard deviations Z using the Gaussian probability distribution:

p = \int_{Z}^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-x^{2}/2)\, dx. \qquad (8.6)

Using this method we calculate an expected p-value of 0.00036, corresponding to a 3.4σ significance. The final observed p-value is 0.00044, with an associated significance of 3.3σ. This is greater than 3σ, making this the first analysis with evidence of the Wt-channel.

The significance without profiling was not calculated, but we estimate how much of an impact the profiling made by examining the ratio of the cross-section to the uncertainty on the cross-section. With profiling this ratio is 3.0σ. We compare this value to the same ratio with the JES constraint removed. This removal is done by scaling the JES uncertainty contribution by the constraint factor (1.0/0.42 in this case) and using this new estimated uncertainty to calculate the total uncertainty. This gives a ratio of 2.1σ, much less than the profiled result of 3.0σ.
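Before moving on, note that the conversion between the p-value and the significance Z in Eq. 8.6 amounts to inverting a one-sided Gaussian integral, which is a one-liner with the survival function; the short check below reproduces the quoted numbers.

```python
from scipy.stats import norm

# Eq. 8.6 inverted: Z is the point where the Gaussian upper-tail probability equals p.
for label, p in [("expected", 0.00036), ("observed", 0.00044)]:
    print(f"{label}: p = {p:.5f}  ->  Z = {norm.isf(p):.2f} sigma")
# Gives Z = 3.38 and 3.33, matching the quoted 3.4 sigma and 3.3 sigma.
```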
8.3 Measurement of top quark width and lifetime

We also measure three other Standard Model parameters. The first is the CKM matrix element |V_tb|. To make this measurement, it is assumed that the off-diagonal CKM matrix elements |V_ts| and |V_td| are much smaller than |V_tb|; no assumption about the top quark decay is required. This is a well motivated assumption, consistent with other measurements of these matrix elements [10]. The |V_tb| element is calculated by dividing the measured cross-section by the theoretical cross-section, calculated using a top quark mass of 172.5 GeV. Using σ_Wt = 15.7 × |V_tb|² pb [4], a value for |V_tb| is obtained:

|V_{tb}| = 1.03\,^{+0.16}_{-0.19}. \qquad (8.7)

In this calculation the experimental and theoretical uncertainties have been added in quadrature. This measurement has a slightly larger uncertainty than other direct measurements, such as the ATLAS t-channel analysis result of |V_tb| = 1.13^{+0.14}_{−0.19} [17]. However, our result is consistent with them and with the current world average of direct and indirect measurements of 0.89 ± 0.07 [10].

The top quark width and lifetime can also be determined from the Wt-channel cross-section measurement [15]. Using the linear dependence of the top quark width on the single top Wt-channel cross-section, the top quark width is related to the cross-section measurement by

\Gamma_t^{obs} = \Gamma_t^{SM} \times \frac{\sigma_{Wt}^{obs}}{\sigma_{Wt}^{SM}}.

Here Γ_t^SM = 1.3 GeV has been calculated with uncertainties negligible relative to the cross-section measurement uncertainties [77]. From this we calculate the top quark width as

\Gamma_t^{obs} = 1.4 \pm 0.5\ \mathrm{GeV}. \qquad (8.8)

From this measurement we can also calculate the top quark lifetime, which is simply related to the width:

\tau_t = \hbar / \Gamma_t, \qquad (8.9)

\tau_t = (4.7\,^{+1.2}_{-1.2}) \times 10^{-25}\ \mathrm{s}. \qquad (8.10)

Prior to this analysis, D0 and CDF made direct measurements of the top width [78, 79]. CDF measured a width of 0.3 GeV < Γ_t < 4.4 GeV at 68% CL, and D0 measured a width of 1.99^{+0.69}_{−0.55} GeV. Our indirectly measured values are consistent with the values observed at CDF and D0.
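The central values quoted in this section follow from simple arithmetic on the measured and predicted cross-sections; the sketch below reproduces them (the quoted uncertainties require the full error propagation, which is not attempted here). The value of ℏ used is the standard constant expressed in GeV s.

```python
hbar_gev_s = 6.582e-25                 # hbar in GeV*s
sigma_obs, sigma_sm = 16.8, 15.7       # measured and NNLL cross-sections [pb]

vtb = (sigma_obs / sigma_sm) ** 0.5    # Eq. 8.7 central value: |Vtb| = 1.03
gamma_t = 1.3 * sigma_obs / sigma_sm   # Eq. 8.8: Gamma_SM = 1.3 GeV scaled by the ratio
tau_t = hbar_gev_s / gamma_t           # Eq. 8.10: ~4.7e-25 s
print(f"|Vtb| = {vtb:.2f}, Gamma_t = {gamma_t:.2f} GeV, tau_t = {tau_t:.1e} s")
```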
Chapter 9

Conclusion

We have analyzed 2.05 fb−1 of data collected with the ATLAS detector. In our search for the Wt-channel we have seen a statistically significant excess of 3.3σ. This is sufficient to claim evidence, and although it does not meet the > 5σ criterion required to claim observation, it is a significant step toward verifying the Standard Model prediction. The cross-section is also extracted from the data, giving a result of σ(pp → Wt + X) = 16.8^{+2.9}_{−2.9} (stat) ^{+4.9}_{−4.9} (syst) pb.

This analysis also allowed us to make measurements of other Standard Model parameters. The CKM matrix element V_tb is measured to be |V_tb| = 1.03^{+0.16}_{−0.19}. The width of the top quark is measured to be Γ_t^obs = 1.4 ± 0.5 GeV (note the increase in the percent uncertainty due to the |V_tb|² dependence), giving a lifetime of τ_t = (4.7^{+1.2}_{−1.2}) × 10^{−25} s. These measurements are all consistent with theoretical Standard Model predictions and other experimental measurements. This analysis is published in Physics Letters B [80].

In this analysis I implemented the BDT used, which includes the variable selection and testing, the training procedure, and the parameter optimization. I implemented the ATLAS and top group recommendations for the object definitions and event selection, and studied most of the systematics (the jet energy scale, jet reconstruction, jet ID, lepton ID, lepton resolution, E_T^miss, and pile-up uncertainties). I also estimated the data-driven Z → ττ normalization and prepared the plots of the BDT response and of the variables used. During the preparation of the paper and the associated note, I gave many single top working group talks and the approval talk to the top working group. I also collaborated with Huaqiao Zhang to perform many cross-checks while going through review.

With time the systematic uncertainties will be better understood, and in the future this analysis will be repeated with more data. However, there is ample room for improvement in the analysis procedure itself. Note that the BDT optimization is done using only the nominal Monte Carlo, and a look at the uncertainty composition of the final cross-section measurement reveals that this analysis is quite systematically limited. A BDT optimization using information from the systematically shifted datasets could therefore bring significant improvement to the result as a whole. This is not a trivial undertaking, as the existing toolsets are not equipped to do this kind of optimization out of the box; however, implementing a systematics-sensitive optimization has the potential to greatly increase the significance.

This evidence for the existence of the Wt-channel was also confirmed independently by the CMS collaboration [81]. Both the CMS and ATLAS collaborations will continue to update these analyses with better analysis techniques, a better understanding of the systematic uncertainties, and more data. The discovery of the Wt-channel is not the end, of course. Precision measurements of V_tb and the top quark properties, and searches for new physics in the Wt-channel signal region, are all exciting new analyses waiting to be explored.

The LHC era is already showing its promise, giving exciting results like the recent Higgs discovery [1, 2] and confirming the predictions of the Standard Model. Even with the Higgs boson discovered, there remains much discovery ahead. The LHC will be running for years, pushing our understanding forward. With each collision we strive for a better understanding of our universe, and with time and hard work, these efforts will be rewarded.

APPENDICES

Appendix A

Data/MC Agreement in Control Regions

This appendix shows the BDT variables in the background-enhanced 2-jet and 3-jet regions. The 2-jet and 3-jet regions clearly show how dominant a background tt̄ is for this analysis. Due to the strong tt̄ contribution we are able to use these regions to constrain the tt̄ normalization, which would otherwise be a dominating uncertainty. Selected variables are also shown in the three dilepton channels: ee, eµ, and µµ. The dilepton subchannels show that the good data-simulation agreement does not break down when these subchannels are examined independently.

A.1 2-jet events

Figure A.1: The top five variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 2-jet bin.
Figure A.2: The 6th-10th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 2-jet bin.

Figure A.3: The 11th-15th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 2-jet bin.

Figure A.4: The 16th-20th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 2-jet bin.

Figure A.5: The 21st and 22nd top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 2-jet bin.
A.2 3-jet inclusive events

Figure A.6: The top five variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 3-jet inclusive bin.

Figure A.7: The 6th-10th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 3-jet inclusive bin.

Figure A.8: The 11th-15th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 3-jet inclusive bin.

Figure A.9: The 16th-20th top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 3-jet inclusive bin.
Figure A.10: The 21st and 22nd top variables in the BDT ranked by separation power, comparing the signal and background estimate to the data in the 3-jet inclusive bin.

A.3 Dilepton subchannels

This section contains selected variables of the different dilepton final states. This illustrates that our backgrounds are well modeled for each of the final states individually.

Figure A.11: Distributions of variables comparing the signal and background estimate to the data in the ee channel. (a) Jet multiplicity, (b) leading jet pT, (c) HT(jet), (d) E_T^miss, (e) leading lepton pT.

Figure A.12: Distributions of variables comparing the signal and background estimate to the data in the eµ channel. (a) Jet multiplicity, (b) leading jet pT, (c) HT(jet), (d) E_T^miss, (e) leading lepton pT.

Figure A.13: Distributions of variables comparing the signal and background estimate to the data in the µµ channel. (a) Jet multiplicity, (b) leading jet pT, (c) HT(jet), (d) E_T^miss, (e) leading lepton pT.

Appendix B

b* search

This appendix describes another analysis I worked on. In this analysis I implemented the object definitions, the event selection, and most of the systematic uncertainties.
I studied the potential templates we considered using, and attempted to reconstruct the neutrinos using invariant mass constraints, although this was not effective enough to make it into the paper. This analysis has been accepted for publication in Physics Letters B and will be published in the near future (preprint [82]). It is a search for a hypothetical b* excited state using 4.7 fb−1 of integrated luminosity. This search uses ATLAS data in the same final state as the Wt-channel analysis, hence the object definitions and event selection criteria are similar to those of the Wt-channel analysis. This appendix will give an overview of the analysis, with the focus being the significant differences between the two; some of the details in common with the Wt-channel analysis will be glossed over. For a full description of this search, please consult the ATLAS note for this analysis [83].

B.1 Introduction to b*

This analysis is motivated in part by the fine-tuning problem, which is illustrated by examining the Standard Model Higgs mass with a one-loop correction [10]:

m_H^2 = m_{H_0}^2 + \frac{k g^2 \Lambda^2}{16\pi^2}, \qquad (B.1)

where m_H is the observed Higgs mass, m_{H0} is an unmeasured fundamental parameter, g is the electroweak coupling, k is a constant expected to be O(1), and Λ is the energy scale of new physics. If Λ is large, such as the Planck scale, then the m_{H0} parameter must be carefully balanced against the second term to cancel it out and give the observed Higgs mass. This is referred to as the fine-tuning problem in high energy physics. This amount of fine-tuning seems unnatural, thus it is suspected that there is other physics at work here.

Theorists have made significant efforts to address this problem with models that modify the Standard Model to avoid the fine-tuning. Supersymmetry models describing massive supersymmetric partners [10] for every particle currently in the Standard Model are an example of such efforts. Instead of a new family of massive particles, smaller additions to the Standard Model are often considered [84]. Because the largest corrections to the Higgs mass arise from the top quark in loops such as that shown in Fig. B.1, an excited state of the top quark can cancel out those corrections. In addition, if an excited top quark is added, an associated excited bottom quark should also exist. We may expect that the mass hierarchy of these excited states would mirror the hierarchy we see in the Standard Model, hence in this analysis we search for a single theoretical excited state of the bottom quark that will be referred to as b*.

Figure B.1: A correction to the Higgs mass from the top quark.

The experimental constraints on this b* state require it to be much more massive than the Standard Model particles. Due to this high mass, some of the b*-state's most common decays lead to high mass final states. In general, the most common decay modes are expected to be b* → Zb, b* → bg, b* → bH, and b* → Wt. This analysis searches for the decay mode b* → Wt, illustrated in Fig. B.2. This decay mode varies in branching ratio from about 20% at low mass (200 GeV) to approximately 40% at high b* masses (400 GeV). The theoretical cross-sections for pp → b* → Wt production in the model [84] at the LHC at 7 TeV are shown in Table B.1.
mass point [GeV] | cross-section [pb]
300              | 181.2
400              | 69.21
500              | 24.45
600              | 9.366
700              | 3.884
800              | 1.719
900              | 0.804
1000             | 0.394
1100             | 0.201
1200             | 0.106
1300             | 0.057
1400             | 0.031

Table B.1: The total cross-section of b* → Wt in a mass range of 300 GeV to 1400 GeV.

This analysis is constructed to be sensitive to generic resonances in the Wt final state, and observed deviations from the Standard Model may also be caused by other resonances. In addition, coupling limits are calculated for three potential b* models: a b*-state with only left-handed couplings, a b*-state with only right-handed couplings, and a vector b*-state with both right- and left-handed couplings of equal magnitude. These limits are calculated on a two-dimensional plane along with the mass of the b*-state. An example of this plane can be seen in Fig. B.12 in Section B.7.

Figure B.2: A Feynman diagram illustrating the b* decay investigated in this analysis.

Like the Wt-channel analysis, this analysis looks at the dilepton final state. This analysis uses the full 2011 dataset with updated simulation and systematic implementations. Another analysis was performed by a second group looking at the lepton+jets final state [85]. These two analyses then collaborated to produce a unified result. The methods used to combine these two analyses will be discussed in Section B.7.

B.2 Simulation

Because the final state in this analysis is the same as the final state in the Wt-channel dilepton analysis, the backgrounds for these analyses are identical, except that the Wt-channel is a Standard Model background to the b* process. The signal in this analysis is simulated using MadGraph5 [86] for the generation and Pythia [49] for the hadronization. In total, 12 simulated samples are generated, representing b* with masses from 300 GeV to 1400 GeV in 100 GeV increments. The cross-section of b* production depends on the mass point; these cross-sections are given in Table B.1. In addition, dedicated simulation samples are generated to study the impact of the uncertainty in the initial and final state radiation modeling. The backgrounds are modeled using the same general scheme as the Wt analysis, but updated to match the full 2011 ATLAS recommendations, described in the note [83]. The full list of simulated samples is shown in Tables B.4, B.5, and B.6.

Description              | σ [pb] | Lint [fb−1] | NMC  | Generator+Shower
b* → Wt, Mb* = 300 GeV   | 61.6   | 3.2         | 200k | MadGraph+Pythia
b* → Wt, Mb* = 400 GeV   | 23.5   | 8.5         | 200k | MadGraph+Pythia
b* → Wt, Mb* = 500 GeV   | 8.31   | 24          | 200k | MadGraph+Pythia
b* → Wt, Mb* = 600 GeV   | 3.18   | 63          | 200k | MadGraph+Pythia
b* → Wt, Mb* = 700 GeV   | 1.32   | 150         | 200k | MadGraph+Pythia
b* → Wt, Mb* = 800 GeV   | 0.58   | 350         | 200k | MadGraph+Pythia
b* → Wt, Mb* = 900 GeV   | 0.27   | 740         | 200k | MadGraph+Pythia
b* → Wt, Mb* = 1000 GeV  | 0.13   | 1500        | 200k | MadGraph+Pythia
b* → Wt, Mb* = 1100 GeV  | 0.07   | 2900        | 200k | MadGraph+Pythia
b* → Wt, Mb* = 1200 GeV  | 0.04   | 5000        | 200k | MadGraph+Pythia
b* → Wt, Mb* = 1300 GeV  | 0.02   | 10000       | 200k | MadGraph+Pythia
b* → Wt, Mb* = 1400 GeV  | 0.01   | 20000       | 200k | MadGraph+Pythia

Table B.2: b* simulated samples for the analysis. The cross-section column includes branching ratios. All b* simulated samples are generated with at least one leptonic W boson decay.

B.3 Object definition

As in the Wt analysis, the same basic object types are considered: electrons, muons, jets, and missing transverse energy.
These objects are constructed in the same manner as described in the main text, with some refinements that will be discussed below.

Description                             | σ [pb] | Lint [fb−1] | NMC  | Generator+Shower
b* → Wt, Mb* = 300 GeV, ISRFSR−/+       | 61.6   | 3.3         | 200k | MadGraph+Pythia
b* → Wt, Mb* = 400 GeV, ISRFSR−/+       | 23.5   | 8.5         | 200k | MadGraph+Pythia
b* → Wt, Mb* = 500 GeV, ISRFSR−/+       | 8.31   | 23          | 200k | MadGraph+Pythia
b* → Wt, Mb* = 600 GeV, ISRFSR−/+       | 3.18   | 63          | 200k | MadGraph+Pythia
b* → Wt, Mb* = 700 GeV, ISRFSR−/+       | 1.32   | 150         | 200k | MadGraph+Pythia
b* → Wt, Mb* = 800 GeV, ISRFSR−/+       | 0.58   | 340         | 200k | MadGraph+Pythia
b* → Wt, Mb* = 900 GeV, ISRFSR−/+       | 0.27   | 740         | 200k | MadGraph+Pythia
b* → Wt, Mb* = 1000 GeV, ISRFSR−/+      | 0.13   | 1500        | 200k | MadGraph+Pythia
b* → Wt, Mb* = 1100 GeV, ISRFSR−/+      | 0.07   | 2900        | 200k | MadGraph+Pythia
b* → Wt, Mb* = 1200 GeV, ISRFSR−/+      | 0.04   | 5000        | 200k | MadGraph+Pythia
b* → Wt, Mb* = 1300 GeV, ISRFSR−/+      | 0.02   | 10000       | 200k | MadGraph+Pythia
b* → Wt, Mb* = 1400 GeV, ISRFSR−/+      | 0.01   | 20000       | 200k | MadGraph+Pythia

Table B.3: b* simulated samples for the ISR/FSR systematic studies. Each mass point has both an ISRFSR− and an ISRFSR+ sample with identical parameters. The cross-section column includes branching ratios. All b* simulated events are generated with at least one leptonic W boson decay.

Description                       | σ [pb] | Lint [fb−1] | NMC    | Generator+Shower
Wt all decays                     | 15.74  | 13          | 200k   | MC@NLO+Herwig
Wt Less ISRFSR                    | 15.74  | 19          | 300k   | ACERMC+Pythia
Wt More ISRFSR                    | 15.74  | 19          | 300k   | ACERMC+Pythia
tt̄ no fully hadronic              | 89.71  | 17          | 1,500k | MC@NLO+Herwig
tt̄ no fully hadronic              | 89.4   | 34          | 3,000k | POWHEG+Herwig
tt̄ no fully hadronic              | 89.4   | 34          | 3,000k | POWHEG+Pythia
tt̄ no fully hadronic Less ISRFSR  | 89.4   | 11          | 1,000k | ACERMC+Pythia
tt̄ no fully hadronic More ISRFSR  | 89.4   | 11          | 1,000k | ACERMC+Pythia

Table B.4: Top quark event simulated samples for the analysis. The cross-section column includes k-factors and branching ratios. All NLO simulated samples have been simulated with pile-up corresponding to 50 ns bunch trains.

The electron definition remains mostly the same, with a few exceptions. A new electron identification criterion is used, called "tightPP" (tight plus plus). This is the result of reoptimizing the same tight algorithms using more data and a better understanding of the ATLAS triggering systems, giving an overall increase in detection efficiency. An additional step has also been added to the jet-electron overlap removal algorithm. After applying the old jet-electron cut of removing a single jet if one exists within ∆R < 0.2 of an electron, electrons within ∆R < 0.4 of any jet are rejected. This makes the electron signal cleaner by removing electrons that may be contaminated by nearby jets.

The muon definition remains the same, with optimizations to the quality definitions using new performance data.

The jet definition adds a cut on the jet vertex fraction (JVF). This variable corresponds to how certain we are that a jet originated from the primary vertex. As jets are sensitive to pile-up, this cut reduces the impact of pile-up on the analysis. While pile-up was not a problem in the Wt analysis, the data added when considering the full 2011 dataset contain many runs with much higher instantaneous luminosity, which increases the impact of the pile-up systematic uncertainty.
This variable corresponds to how certain we are that a jet originated from the primary vertex. As jets are sensitive to pile-up, this cut reduces the impact of pile-up on the analysis. While pile-up was not a problem in the W t analysis, the data added when considering the full 2011 dataset contains many runs with much higher instantaneous luminosity, which increases the impact of the pile-up systematic uncertainty. While implementing this cut we also add a scale factor to 155 σ [pb] Lin [f b−1 ] NM C Z → ℓℓ + 0 parton 827.4 8.0 6,600k Z → ℓℓ + 1 partons 166.6 8.0 1,340k Z → ℓℓ + 2 partons 50.4 5.7 285k Z → ℓℓ + 3 partons 14.0 7.9 110k Z → ℓℓ + 4 partons 3.4 8.8 30k Z → ℓℓ + 5 partons 1.0 9.0 9k W → ℓν + 0 parton 8,296 0.4 3,500k W → ℓν + 1 partons 1,551 1.6 2,500k W → ℓν + 2 partons 452 8.3 3,770k W → ℓν + 3 partons 121 8.3 1,000k W → ℓν + 4 partons 30.3 8.3 250k W → ℓν + 5 partons 8.3 8.4 70k W → ℓν + b¯ + 0 parton b 54.7 8.7 475k W → ℓν + b¯ + 1 partons b 40.4 5.1 205k ¯ + 2 partons W → ℓν + bb 20.0 8.8 175k W → ℓν + b¯ + 3 partons b 7.6 9.2 70k Description W W W W W → ℓν + c → ℓν + c → ℓν + c → ℓν + c → ℓν + c + + + + + 0 1 2 3 4 parton partons partons partons partons 517.6 192.1 51.0 11.9 2.8 1.7 1.7 1.7 1.7 1.8 Generator+Shower ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG ALPGEN+HERWIG 860k ALPGEN+HERWIG 318k ALPGEN+HERWIG 85k ALPGEN+HERWIG 20k ALPGEN+HERWIG 5k ALPGEN+HERWIG Table B.5: Background simulated samples. Cross-sections include k-factor. All NLO simulated samples have been simulated with pile-up corresponding to 50 ns bunch trains. 156 σ [pb] Lint [f b−1 ] NM C Generator+Shower W W → lνlν + 0 parton 2.0950 95 200k ALPGEN+Herwig W W → lνlν + 1 partons 0.9962 100 100k ALPGEN+Herwig W W → lνlν + 2 partons 0.4547 130 60k ALPGEN+Herwig W W → lνlν + 3 partons 0.1758 230 40k ALPGEN+Herwig W Z → ℓνℓℓ + 0 parton 0.6718 89 60k ALPGEN+Herwig W Z → ℓνℓℓ + 1 partons 0.4138 97 40k ALPGEN+Herwig W Z → ℓνℓℓ + 2 partons 0.2249 89 20k ALPGEN+Herwig W Z → ℓνℓℓ + 3 partons 0.0950 210 20k ALPGEN+Herwig ZZ → inclusive + ℓℓ + 0 parton 0.5086 79 40k ALPGEN+Herwig ZZ → inclusive + ℓℓ + 1 partons 0.2342 85 20k ALPGEN+Herwig ZZ → inclusive + ℓℓ + 2 partons 0.0886 230 20k ALPGEN+Herwig ZZ → inclusive + ℓℓ + 3 partons 0.0314 320 10k ALPGEN+Herwig Description Table B.6: Background simulated samples. Cross-sections include K-factor. All NLO simulated samples have been simulated with a pile-up corresponding to a 50 ns bunch trains (tag r2920). renormalize the simulated samples. These scale factors are calculated using a tag and probe method choosing a selection which results in a high likelihood of having a high pT jet from the primary interaction. The difference between the predicted efficiency and the observed efficiency in this region are parametrized as a scale factor as a function of jet pT . This scale factor also comes with a corresponding additional systematic uncertainty, described in Section B.7. miss The ET definition is also updated with the new data, taking into account the changes in the identification of the electrons, muons, and jets. B.4 Event selection This analysis uses 4.7 f b−1 of data at √ s = 7 T eV collected with the ATLAS detector. The data are filtered to select only events during which all detectors were functioning normally 157 with stable beam from the LHC. 
Like the W t analysis, events are selected from dielectron (ee), dimuon (µµ), and electron-muon (eµ) channels, and then eventually combined into one channel for the final analysis. The same general event quality filtering is applied to the events as in the W t analysis, but several of the details have been updated in the full 2011 dataset. The cut due to malfunction in the LAr detectors during data taking is no longer explicitly made in the selection cuts, instead being accounted for in the generation of the simulated events. The trigger selection and matching has been updated to account for the changing triggering conditions while running, and also to add trigger selection and matching criteria for the muons. The triggers for various periods are given in Table B.7. There is also an additional selection cut of Mℓℓ > 15 GeV added to the analysis. This cut has little impact on the selected events, but is required to allow an improvement in the fake dilepton estimation technique discussed in Section B.5. Electrons Before period K Period K After period K Muons Before period J Period J and later EF e20 medium EF e22 medium EF e22VHF medium1 OR EF e45 medium1 EF mu18 EF mu18 medium Table B.7: The triggers for the electrons and muons for each data-taking period. B.5 Background estimation In this analysis the backgrounds were simulated using the same software as the W t-channel ¯ analysis, with updated simulations of the ATLAS running conditions. The tt, W t, and di- 158 boson backgrounds remain estimated using Monte Carlo techniques, while the fake dilepton, ¯ Z → ℓℓ, and Z → τ τ backgrounds use data-driven estimates to determine the normalization and simulated events to estimate the distribution shapes. The methodology used for the ¯ Z → ℓℓ and Z → τ τ backgrounds is identical to that used for the W t-channel analysis, but with an updated input dataset using the full 4.7f b−1 luminosity. The fake dilepton estimation procedure is almost identical, but is improved by adding an additional requirement of Mℓℓ > 15 GeV to minimize contamination from J/Ψ and Y . After selection in the 1-jet bin, 2190 events are expected and 2259 are observed, a good agreement between data and simulation within two σ of data statistical uncertainty. This agreement also extends to each of the ee, and µµ subchannels, as shown in Table B.8. The µ+ µ− channel has some disagreement, but it is consistent when data statistical uncertainties ¯ and tt theoretical modeling systematic uncertainties are considered (the generator, parton shower, and normalization uncertainties). Agreement in the kinematics of the event is also good, as shown in Figs. B.3 and B.4. B.6 Discriminant variable selection After selection a discrimination template is chosen to analyze. For the W t-channel analysis the template was the BDT distribution histogram, but this analysis does not use MVA techniques. This analysis is intended to be quicker and more straightforward than the W tchannel analysis and adding a MVA technique requires a lot of cross-checks. It also is more difficult to do a MVA analysis when there are multiple mass points for the signal. Instead of training on a single signal sample, either a different methodology has to be developed to train for each mass point, or only one mass point is trained on, decreasing overall sensitivity. 
Figure B.3: Kinematic distributions of the signal region comparing data and background. (a) Leading lepton pT, (b) leading lepton η, (c) sub-leading lepton pT, and (d) sub-leading lepton η.

Figure B.4: Kinematic distributions of the signal region comparing data and background. (a) Leading jet pT, (b) leading jet η, (c) ∆φ between the two leptons, and (d) ∆R between the two leptons.

Process              | ee           | µµ            | eµ            | all combined
b* 400 GeV           | 187.1 ± 3.6  | 394.5 ± 5.5   | 663.8 ± 6.9   | 1245.5 ± 9.6
b* 600 GeV           | 34.4 ± 0.6   | 70.3 ± 0.9    | 105.9 ± 1.0   | 210.7 ± 1.4
b* 800 GeV           | 6.9 ± 0.1    | 13.6 ± 0.2    | 20.1 ± 0.2    | 40.6 ± 0.3
b* 1000 GeV          | 1.5 ± 0.0    | 3.0 ± 0.0     | 4.4 ± 0.0     | 8.9 ± 0.1
b* 1200 GeV          | 0.4 ± 0.0    | 0.7 ± 0.0     | 1.1 ± 0.0     | 2.1 ± 0.0
Wt                   | 42.8 ± 1.8   | 97.6 ± 2.9    | 152.7 ± 3.5   | 293.2 ± 4.8
tt̄                   | 196.5 ± 2.3  | 470.2 ± 3.6   | 713.0 ± 4.4   | 1379.7 ± 6.1
Diboson              | 31.6 ± 1.2   | 96.6 ± 2.2    | 126.3 ± 2.5   | 254.6 ± 3.5
Z → ee               | 41.1 ± 4.1   | negl.         | negl.         | 41.1 ± 4.1
Z → µµ               | negl.        | 118.0 ± 11.8  | negl.         | 118.0 ± 11.8
Z → ττ               | 1.5 ± 0.7    | 3.7 ± 0.9     | 7.8 ± 1.3     | 14.2 ± 1.8
Fake lepton          | 78.0 ± 78.0  | 8.6 ± 8.6     | 3.2 ± 3.2     | 89.8 ± 89.8
Total Bkg. Expected  | 391.5 ± 78.2 | 794.9 ± 13.3  | 1003.0 ± 10.6 | 2190.5 ± 91.1
Total Observed       | 347.0 ± 18.6 | 805.0 ± 28.4  | 1107.0 ± 33.3 | 2259.0 ± 47.5

Table B.8: Observed and predicted event yields in the 1-jet bin after the preselection, with an integrated luminosity of 4.7 fb−1. Fake dilepton and Z + jets background event yields are estimated from the data-driven techniques applied to the 1-jet bin. The errors shown include statistical error only (top pair, signal, dibosons) or statistical + systematic uncertainties (Drell-Yan, fakes).

The choice of variable is critical to maximizing sensitivity, as its bins will be the only information the statistical tools will have as input. Consequently, we want to choose a variable with good signal/background separation. For the b* signal, the most obvious feature that stands out is the high mass of the resonance particle. Though the resonance itself is not directly detected by the ATLAS detector, this high mass is seen indirectly as a high transverse mass of the system. However, calculating the transverse mass of the system requires information about each individual particle in the system, which is not available for the neutrinos. As a result, we can only choose variables that approximate the transverse mass. Five of the most promising candidates for the discriminant are defined below, in order of increasing complexity (a short numerical sketch follows the list):

1. HT is defined as the scalar sum of all of the pT of the jets, the leptons, and the E_T^miss.
The choice of variable is critical to maximizing sensitivity, as its bins are the only information the statistical tools receive as input. Consequently, we want to choose a variable with good signal/background separation. For the b∗ signal, the most distinctive feature is the high mass of the resonance. Though the resonance itself is not directly detected by the ATLAS detector, its high mass is seen indirectly as a high transverse mass of the system. However, calculating the transverse mass of the system requires the momentum of each individual particle in the system, which is not available for the neutrinos. As a result, we can only choose variables that approximate the transverse mass. Five of the most promising candidates for the discriminant are defined below, in order of increasing complexity:

1. HT, defined as the scalar sum of the pT of the jets, the leptons, and the E_T^miss. This is the same variable as one of the inputs to the BDT in the W t-channel analysis.

2. M_T^1 = √( H_T² − (p_T^sys)² )

3. M_T^2 = √( (p_T^{leptons+jet} + E_T^miss)² − (p_T^sys)² )

4. M_T^3 = √( (E_T^{leptons+jet} + E_T^miss)² − (p_T^sys)² ), where E_T^{leptons+jet} = √( (p_T^{leptons+jet})² + (M^{leptons+jet})² ) and "leptons+jet" denotes the system composed of both leptons and the jets.

5. M_T^4 = √( (p_T^{lep1} + p_T^{lep2} + p_T^{jet} + E_T^miss cos ∆φ(lep1, E_T^miss) + E_T^miss cos ∆φ(lep2, E_T^miss))² − (p_T^sys)² )

Here p_T^sys denotes the transverse momentum of the full system. These five variables are shown in Fig. B.5. The sensitivity of each of these templates is evaluated using the template fitting procedure described in Section B.7. None of the M_T^n variables improves on HT, and since HT is straightforward and has an intuitive physical interpretation, it is used as the discrimination variable.

Figure B.5: The variables considered as the discrimination template for the b∗ search.

B.7 Measurement

The systematic uncertainties in this analysis were evaluated with procedures similar to those of the W t-channel analysis. For details specific to this analysis, see the supporting note [83]. This analysis has one additional systematic uncertainty that did not exist in the W t-channel analysis; it is described below.

Jet Vertex Fraction

The jet vertex fraction (JVF) is an estimate of the probability that a given jet originated from the primary vertex. If it did not originate from the primary vertex, it is assumed to be a pile-up effect and is ignored. When the JVF cut is applied to our events, an additional scale factor must be applied to match the simulated events to the observed data. This scale factor has an associated uncertainty, calculated by the TopJetUtils package. These uncertainty scale factors are applied to the nominal sample, creating an alternate set of JVF systematic events.

In this analysis a template shape fitting procedure is used to set limits on mass points and couplings. We perform a binned likelihood analysis using the Bayesian Analysis Toolkit software package [87]. The HT distribution is shown on both linear and logarithmic scales in Fig. B.6. Figure B.7 compares the HT signal distribution to the background distribution for selected b∗ mass points, and Fig. B.8 shows the effect of the JES systematic on the background compared to the observed data. The likelihood function is constructed by taking the product of the likelihoods for each bin, as shown in equation B.2:

L(data | σ_{pp→b∗→Wt}, θ_i) = ∏_{k=1}^{N_bin} ( µ_k^{n_k} e^{−µ_k} / n_k! ) × ∏_{i=1}^{N_sys} G(θ_i; 0, 1), where µ_k = s_k + b_k.   (B.2)

Here the index k runs over the bins of the HT distribution, µ_k = s_k + b_k is the sum of the expected signal and background yields, n_k is the number of observed events in bin k, the index i runs over the systematics, and G(θ_i; 0, 1) is a unit Gaussian constraint for each systematic. The prior probability for the cross-section is taken to be uniform.
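To make equation B.2 concrete, the sketch below evaluates the same Poisson-times-Gaussian likelihood for a given cross-section and set of nuisance parameters. It is a simplified stand-in for the Bayesian Analysis Toolkit implementation, and it assumes the systematics shift the background template linearly in θ, which is an illustrative convention rather than the exact interpolation used in the analysis.

```python
import numpy as np
from scipy.stats import norm, poisson

def log_likelihood(n_obs, sig, bkg, bkg_shifts, sigma, theta):
    """Log of equation B.2: a Poisson term per bin times a unit Gaussian
    constraint G(theta_i; 0, 1) per systematic.

    n_obs      : observed counts per bin (n_k), integer array
    sig        : signal template per bin for unit cross-section
    bkg        : nominal background prediction per bin (b_k)
    bkg_shifts : (N_sys, N_bin) array of one-sigma template shifts
                 (assumed to act linearly in theta; illustrative only)
    sigma      : the b* cross-section parameter
    theta      : nuisance parameters, one per systematic
    """
    mu = sigma * sig + bkg + theta @ bkg_shifts   # mu_k = s_k + b_k(theta)
    mu = np.clip(mu, 1e-9, None)                  # keep the Poisson mean positive
    log_l = poisson.logpmf(n_obs, mu).sum()       # product over bins -> sum of logs
    log_l += norm.logpdf(theta).sum()             # Gaussian constraint terms
    return log_l
```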
By integrating over the systematic nuisance parameters, the likelihood function becomes parametrized in terms of only the b∗ cross-section, as shown in equation B.3:

L(data | σ_{pp→b∗→Wt}) = ∫ L(data | σ_{pp→b∗→Wt}, θ_1, ..., θ_N) dθ_1 ⋯ dθ_N.   (B.3)

Figure B.6: (a) Comparison of data and predicted background HT. (b) The same comparison on a logarithmic scale.

Figure B.7: Data and predicted background HT, together with signal-only HT distributions at Mb∗ = 300, 700, and 1100 GeV.

Figure B.8: Comparison of the JES-shifted background HT with data.

This likelihood function is converted to a posterior probability density using Bayes' theorem, under our assumption that the prior probability π of the cross-section is uniform. The posterior probability density is shown in equation B.4:

P(σ_{pp→b∗→Wt} | data) ∝ L(data | σ_{pp→b∗→Wt}) π(σ_{pp→b∗→Wt}).   (B.4)

This posterior probability density has its maximum at the most likely cross-section given the data. However, in this analysis we do not expect to see a signal, and instead want to set exclusion limits. To do this we take the ratio of the integral of the posterior probability density from zero to σ′ to its integral from zero to infinity, and find the value of σ′ for which this ratio equals our exclusion criterion, in this case 0.95:

0.95 = ∫₀^{σ′} L(data|σ) π(σ) dσ / ∫₀^∞ L(data|σ) π(σ) dσ, where σ ≡ σ_{pp→b∗→Wt}.   (B.5)

This gives a 95% C.L. cross-section limit for each mass point. These cross-section limits are interpolated using the theoretical relationship between the cross-section and the b∗ mass. The procedure is performed using both the observed dataset and ensembles of pseudoexperiments drawn from the background estimates, giving observed and expected limits, and it combines the results of the dilepton and lepton+jets analyses. The intersection between the observed (expected) cross-section limit and the theoretical cross-section gives the observed (expected) b∗-quark mass limit. For a maximal left-handed coupling, the resulting mass limit is 870 GeV observed (910 GeV expected); the associated exclusion plot is shown in Fig. B.9.

Figure B.9: b∗ mass limit from the combined analysis, with an observed limit of Mb∗ > 870 GeV and an expected limit of Mb∗ > 910 GeV.

The limits are also calculated for the case where the b∗ has only a maximal right-handed coupling and for the case where it couples maximally both left- and right-handed. The mass limit in the right-handed case is 920 GeV observed (950 GeV expected); for maximal left- and right-handed couplings it is 1030 GeV observed (1030 GeV expected).
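Numerically, the limit of equation B.5 is just the 95% quantile of the flat-prior posterior: the running integral of the marginalized likelihood is normalized and then inverted at 0.95. A minimal sketch, assuming that likelihood has already been evaluated on a grid of cross-section values (the toy shape below is purely illustrative):

```python
import numpy as np

def upper_limit_95(sigma_grid, marginal_likelihood):
    """Solve equation B.5 on a uniform grid: find sigma' such that the
    posterior integral from 0 to sigma' is 95% of the total integral."""
    posterior = np.asarray(marginal_likelihood, dtype=float)  # flat prior: posterior ~ likelihood
    cdf = np.cumsum(posterior)                                # running integral (rectangle rule)
    cdf /= cdf[-1]                                            # normalize so cdf[-1] = 1
    return np.interp(0.95, cdf, sigma_grid)                   # invert the CDF at 0.95

# Toy usage: a posterior peaked at zero signal gives a limit near 1.96 * width.
sigma = np.linspace(0.0, 10.0, 1001)
toy_likelihood = np.exp(-0.5 * (sigma / 1.5) ** 2)            # illustrative shape only
print(upper_limit_95(sigma, toy_likelihood))                  # ~2.9
```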
We can also make our limits more general by allowing the b∗bg coupling (κ^b_{L/R}) and the b∗W t coupling (g_{L/R}) to vary independently. Here we investigate three cases: one where we assume only left-handed couplings, one where we assume only right-handed couplings, and one where we assume equal right- and left-handed couplings. The two-dimensional coupling and mass limits for each of these cases are given in Figs. B.10, B.11, and B.12.

Figure B.10: The two-dimensional coupling and mass limits for a left-handed-coupling b∗.

Figure B.11: The two-dimensional coupling and mass limits for a right-handed-coupling b∗.

Figure B.12: The two-dimensional coupling and mass limits for a b∗ with combined left- and right-handed couplings.

BIBLIOGRAPHY

[1] ATLAS Collaboration, Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC, Phys. Lett. B 716 (2012) no. 1, 1–29, http://www.sciencedirect.com/science/article/pii/S037026931200857X.

[2] CMS Collaboration, Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC, Phys. Lett. B 716 (2012) no. 1, 30–61, http://www.sciencedirect.com/science/article/pii/S0370269312008581.

[3] N. Kidonakis, Next-to-next-to-leading-order collinear and soft gluon corrections for t-channel single top quark production, Phys. Rev. D 83 (2011) 091503, arXiv:1103.2792 [hep-ph].

[4] N. Kidonakis, Two-loop soft anomalous dimensions for single top quark associated production with a W− or H−, Phys. Rev. D 82 (2010) 054018, arXiv:1005.4451 [hep-ph].

[5] N. Kidonakis, NNLL resummation for s-channel single top quark production, Phys. Rev. D 81 (2010) 054028, arXiv:1001.5034 [hep-ph].

[6] D. Binosi and L. Theußl, JaxoDraw: A graphical user interface for drawing Feynman diagrams, Comput. Phys. Commun. 161 (2004) 76–86, http://www.sciencedirect.com/science/article/pii/S0010465504002115.

[7] ATLAS Collaboration, ATLAS Luminosity Public Results, https://twiki.cern.ch/twiki/bin/view/AtlasPublic/LuminosityPublicResults#Multiple_Year_Collision_Plots.

[8] ATLAS Collaboration, The ATLAS Experiment at the CERN Large Hadron Collider, JINST 3 (2008) S08003.

[9] W. Heisenberg and F. Northrop, Physics and Philosophy: The Revolution in Modern Science, vol. 18, Prometheus Books, New York, 1999.

[10] Particle Data Group, K. Nakamura et al., J. Phys. G 37 (2010) 075021.

[11] D. Griffiths, Introduction to Elementary Particles, Wiley-VCH, Weinheim, 2004.

[12] A. Heinson, A. Belyaev, and E. Boos, Single top quarks at the Fermilab Tevatron, Phys. Rev. D 56 (1997) 3114.

[13] CDF Collaboration, Observation of Top Quark Production in p̄p Collisions with the Collider Detector at Fermilab, Phys. Rev. Lett. 74 (1995) 2626–2631, http://link.aps.org/doi/10.1103/PhysRevLett.74.2626.

[14] D0 Collaboration, Observation of the Top Quark, Phys. Rev. Lett. 74 (1995) 2632–2637, http://link.aps.org/doi/10.1103/PhysRevLett.74.2632.

[15] D0 Collaboration, Determination of the width of the top quark, Phys. Rev. Lett. 106 (2011) 022001.
[16] D0 Collaboration, Observation of Single Top Quark Production, Phys. Rev. Lett. 103 (2009) 092001, arXiv:0903.0850 [hep-ex].

[17] ATLAS Collaboration, Observation of t-Channel Single Top-Quark Production in pp Collisions at √s = 7 TeV with the ATLAS detector, Tech. Rep. ATLAS-CONF-2011-088, CERN, Geneva, June 2011.

[18] T. S. Pettersson and P. Lefèvre, The Large Hadron Collider: conceptual design, Tech. Rep. CERN-AC-95-05 LHC, CERN, Geneva, October 1995.

[19] H. T. Edwards, The Tevatron energy doubler: a superconducting accelerator, Annu. Rev. Nucl. Part. Sci. 35 (1985) 605–660.

[20] CDF Collaboration, First Observation of Electroweak Single Top Quark Production, Phys. Rev. Lett. 103 (2009) 092002, arXiv:0903.0885 [hep-ex].

[21] D0 Collaboration, Evidence for a particle produced in association with weak bosons and decaying to a bottom-antibottom quark pair in Higgs boson searches at the Tevatron, Phys. Rev. Lett. 109 (2012) 071804.

[22] ALICE Collaboration, The ALICE experiment at the CERN LHC, JINST 3 (2008) S08002.

[23] TOTEM Collaboration, The TOTEM Experiment at the CERN Large Hadron Collider, JINST 3 (2008) S08007.

[24] LHCb Collaboration, The LHCb detector at the LHC, JINST 3 (2008) S08005.

[25] LHCf Collaboration, The LHCf detector at the CERN Large Hadron Collider, JINST 3 (2008) S08006.

[26] MoEDAL Collaboration, Technical design report of the MoEDAL experiment, Tech. Rep. CERN-LHCC-2009-006, MoEDAL-TDR-001, CERN, Geneva, 2009.

[27] CMS Collaboration, G. Bayatian et al., The Compact Muon Solenoid Technical Proposal, CERN/LHCC 94-38 (1994).

[28] ATLAS Collaboration, Expected performance of the ATLAS experiment: detector, trigger and physics, Tech. Rep., CERN, 2008.

[29] ATLAS Collaboration, ATLAS inner detector technical design report, CERN/LHCC 97-16 (1997).

[30] ATLAS Collaboration, ATLAS pixel detector electronics and sensors, JINST 3 (2008) P07007.

[31] A. Abdesselam et al., The barrel modules of the ATLAS semiconductor tracker, Nucl. Instrum. Meth. A 568 (2006) 642–671.

[32] ATLAS Collaboration, The ATLAS TRT barrel detector, JINST 3 (2008) P02014.

[33] ATLAS Collaboration, The ATLAS TRT end-cap detectors, JINST 3 (2008) P10003.

[34] T. Ferbel, Experimental Techniques in High Energy Physics, Addison-Wesley Publishing Company, Inc., 1987.

[35] ATLAS Collaboration, ATLAS tile calorimeter: Technical design report, CERN, 1996.

[36] ATLAS Collaboration, Electron reconstruction, https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/ElectronReconstruction.

[37] ATLAS Collaboration, Muon reconstruction efficiency in reprocessed 2010 LHC proton-proton collision data recorded with the ATLAS detector, ATLAS-CONF-2011-063 (2011).

[38] M. Cacciari, G. Salam, and G. Soyez, The anti-kt jet clustering algorithm, JHEP 04 (2008) 063.

[39] M. Cacciari and G. P. Salam, Dispelling the N³ myth for the kt jet-finder, Phys. Lett. B 641 (2006) 57–61, arXiv:hep-ph/0512210.
[40] ATLAS Collaboration, Data-Quality Requirements and Event Cleaning for Jets and Missing Transverse Energy Reconstruction with the ATLAS Detector in Proton-Proton Collisions at a Center-of-Mass Energy of √s = 7 TeV, ATLAS-CONF-2010-038 (2010), https://cdsweb.cern.ch/record/1277678.

[41] ATLAS Collaboration, Performance of Missing Transverse Momentum Reconstruction in Proton-Proton Collisions at 7 TeV with ATLAS, Eur. Phys. J. C 72 (2012) 1844.

[42] ATLAS Collaboration, Top common object selection criteria, https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/TopCommonObjects2011rel16.

[43] B. P. Kersevan and E. Richter-Was, The Monte Carlo Event Generator AcerMC version 3.5 with interfaces to PYTHIA 6.4, HERWIG 6.5 and ARIADNE 4.1, arXiv:hep-ph/0405247 (2008).

[44] M. L. Mangano, M. Moretti, F. Piccinini, R. Pittau, and A. D. Polosa, ALPGEN, a generator for hard multiparton processes in hadronic collisions, JHEP 07 (2003) 001.

[45] P. Nason, A new method for combining NLO QCD computations with parton shower simulations, JHEP 11 (2004) 040, arXiv:hep-ph/0409146.

[46] S. Frixione, P. Nason, et al., Positive weight next-to-leading-order Monte Carlo, JHEP 11 (2007) 126 and JHEP 09 (2007) 111, arXiv:0709.2092 and arXiv:0707.3088.

[47] S. Frixione, B. Webber, and P. Nason, Single-top production in MC@NLO, arXiv:hep-ph/0512250 and arXiv:0805.3067.

[48] S. Frixione, B. R. Webber, and P. Nason, MC@NLO Generator version 3.4, arXiv:hep-ph/0204244 and arXiv:hep-ph/0305252.

[49] T. Sjostrand, S. Mrenna, and P. Skands, PYTHIA Generator version 6.418, JHEP 05 (2006) 026.

[50] G. Corcella et al., HERWIG 6.5: an event generator for Hadron Emission Reactions With Interfering Gluons (including supersymmetric processes), JHEP 01 (2001) 010.

[51] GEANT4 Collaboration, S. Agostinelli et al., GEANT4: A simulation toolkit, Nucl. Instrum. Meth. A 506 (2003) 250–303.

[52] U. Langenfeld, S. Moch, and P. Uwer, New results for tt̄ production at hadron colliders, arXiv:0907.2527 [hep-ph].

[53] B. Alvarez et al., Measurement of Single Top-Quark Production in the Lepton+Jets Channel in pp Collisions at √s = 7 TeV, Tech. Rep. ATL-COM-PHYS-2011-058, CERN, Geneva, January 2011.

[54] J. Campbell and R. Ellis, Update on vector boson pair production at hadron colliders, Phys. Rev. D 60 (1999).

[55] B. P. Roe et al., Boosted Decision Trees as an Alternative to Artificial Neural Networks for Particle Identification, arXiv:physics/0408124v2 (2004).

[56] A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E. von Toerne, and H. Voss, TMVA: Toolkit for Multivariate Data Analysis, PoS ACAT (2007) 040, arXiv:physics/0703039.

[57] F. J. Massey, The Kolmogorov-Smirnov Test for Goodness of Fit, J. Am. Stat. Assoc. 46 (1951) no. 253, 68–78.

[58] ATLAS Collaboration, Top systematic uncertainties, https://twiki.cern.ch/twiki/bin/view/AtlasProtected/TopSystematicUncertainties2011rel16.

[59] ATLAS Collaboration, Jet energy scale and its systematic uncertainty in proton-proton collisions at √s = 7 TeV in ATLAS 2010 data, Tech. Rep. ATLAS-CONF-2011-032, CERN, Geneva, March 2011.

[60] ATLAS Collaboration, Jet energy measurement with the ATLAS detector in proton-proton collisions at √s = 7 TeV, arXiv:1112.6426 [hep-ex], submitted to Eur. Phys. J. C.

[61] ATLAS Collaboration, Jet uncertainties, https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/JetUncertainties2011.

[62] ATLAS Collaboration, Jet Energy Scale uncertainty provider, https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/JESUncertaintyProvider?rev=54.
[63] ATLAS Collaboration, Jet Reconstruction Efficiency, https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/TopJetReconstructionEfficiency.

[64] ATLAS Collaboration, Measurement of the top quark pair production cross-section with ATLAS in pp collisions at 7 TeV, Eur. Phys. J. C 71 (2011) 1577.

[65] J. Pumplin et al., New generation of parton distributions with uncertainties from global QCD analysis, JHEP 07 (2002) 012.

[66] A. Martin, W. Stirling, R. Thorne, and G. Watt, Uncertainties on αs in global PDF analyses and implications for predicted hadronic cross sections, Eur. Phys. J. C 64 (2009) 653–680.

[67] F. Demartin, S. Forte, E. Mariani, J. Rojo, and A. Vicini, Impact of parton distribution function and αs uncertainties on Higgs boson production in gluon fusion at hadron colliders, Phys. Rev. D 82 (2010) 014002.

[68] ATLAS Collaboration, Energy Rescaler, https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/EnergyRescaler.

[69] ATLAS Collaboration, Muon energy and momentum systematics, https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/MCPAnalysisGuidelinesEPS2011.

[70] ATLAS Collaboration, MissingEt calculation recommendation, https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/TopETmissLiaison_EPS#Recommendations_for_Calculating.

[71] ATLAS Collaboration, Updated Luminosity Determination in pp Collisions at √s = 7 TeV using the ATLAS Detector, ATLAS-CONF-2011-011 (2011).

[72] G. Cowan, K. Cranmer, E. Gross, and O. Vitells, Asymptotic formulae for likelihood-based tests of new physics, Eur. Phys. J. C 71 (2011) no. 2, 1–19.

[73] ATLAS Collaboration, Top Profiling Checks, https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/TopProfilingChecks.

[74] L. Moneta, K. Belasco, K. Cranmer, S. Kreiss, and M. Wolf, The RooStats Project, arXiv:1009.1003.

[75] O. E. Barndorff-Nielsen and D. R. Cox, Inference and Asymptotics, vol. 52, Chapman & Hall/CRC, 1994.

[76] W. Verkerke, TopProfilingChecks, https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/TopProfilingChecks, 2011.

[77] M. Jezabek and J. Kuhn, Top quark width: Theoretical update, Phys. Rev. D 48 (1993).

[78] CDF Collaboration, Direct Top-Quark Width Measurement at CDF, Phys. Rev. Lett. 105 (2010) 232003.

[79] D0 Collaboration, Determination of the width of the top quark, Phys. Rev. Lett. 106 (2011) 022001.

[80] ATLAS Collaboration, Evidence for the associated production of a W boson and a top quark in ATLAS at √s = 7 TeV, Phys. Lett. B (2012).

[81] CMS Collaboration, Evidence for associated production of a single top quark and W boson in pp collisions at 7 TeV, arXiv:1209.3489 (2012).

[82] ATLAS Collaboration, Search for single b*-quark production with the ATLAS detector at √s = 7 TeV, arXiv:1301.1583 (2013).

[83] J. Koll, H. Zhang, R. Schwienhorst, and J. Nutter, Search for single B′ production in the model of decay to Wt dilepton final states at √s = 7 TeV, Tech. Rep. ATL-COM-PHYS-2011-1705, CERN, Geneva, December 2011.

[84] J. Nutter, R. Schwienhorst, D. G. Walker, and J.-H. Yu, Single Top Production as a Probe of B-prime Quarks, Phys. Rev. D 86 (2012) 094006, arXiv:1207.5179 [hep-ph].

[85] H. Lee, D. Geerts, R. van der Geer, D. Ta, P. Ferrari, M. Vreeswijk, and S. Bentvelsen, Search for single B′ production in the decay to Wt lepton+jets final states at √s = 7 TeV, Tech. Rep. ATL-COM-PHYS-2012-1040, CERN, Geneva, July 2012.
[86] J. Alwall, M. Herquet, F. Maltoni, O. Mattelaer, and T. Stelzer, MadGraph 5: Going Beyond, JHEP 1106 (2011) 128, arXiv:1106.0522 [hep-ph].

[87] A. Caldwell, D. Kollar, and K. Kroninger, BAT - The Bayesian Analysis Toolkit, Comput. Phys. Commun. 180 (2009) 2197–2209, arXiv:0808.2552 [physics.data-an].