STUDY OF HIGGS BOSON PRODUCTION AT HIGH TRANSVERSE MOMENTUM
                IN THE B-QUARK PAIR DECAY MODE
                                      By
                        José Gabriel Reyes Rivera
                            A DISSERTATION
                                Submitted to
                        Michigan State University
                 in partial fulfillment of the requirements
                              for the degree of
                      Physics—Doctor of Philosophy
                                    2023


                                        ABSTRACT
    This document presents constraints on Higgs boson production at high transverse
momentum using the bb̄ channel. The study is based on data collected at √s = 13 TeV with
the ATLAS detector, corresponding to an integrated luminosity of 136 fb⁻¹. The events
of interest consist of two large-radius jets recoiling against each other. The Higgs boson
decaying to b-quarks is identified using b-tagging techniques, exploiting the experimental
signature of b-hadron decays while the other jet is a fully hadronic system. Z → bb̄ events
are used to validate experimental techniques. Upper limits at the 95% confidence level on
the Higgs boson production cross section are established for transverse momenta above 450
GeV and above 1 TeV. Studies aimed at improving these results by reducing
the uncertainties are also discussed, such as the use of modern jet definitions like UFO jets
and the development of jet substructure taggers using machine learning techniques.


                                ACKNOWLEDGMENTS
    I would like to start by thanking my advisor Joey Huston, who has always been a source
of guidance and support in all the avenues of research I wanted to explore. Thank you for the
valuable feedback as well as the great stories. I would like to also thank the MSU ATLAS
team of professors and postdocs for always providing me with feedback and suggestions
on how to proceed with my projects. I am also thankful for the many friends I’ve made
throughout my life. I appreciate all the support, and thank you for the good times.
    I end by thanking my loved ones. To my whole family: everything I am and have
achieved I owe to you. Finally, to María: without your support during this process,
completing this goal would not have been possible. You inspire me to be a better human being, to believe
in myself, and to never give up.


                             TABLE OF CONTENTS
Chapter 1 Introduction
Chapter 2 Theoretical Background
  2.1 Relativistic Quantum Mechanics
  2.2 Quantum Field Theory
  2.3 Standard Model
  2.4 Quantum Chromodynamics
  2.5 Electroweak Theory
  2.6 Higgs Boson Phenomenology
  2.7 Simulation of proton-proton Collisions
Chapter 3 Experimental Apparatus
  3.1 Large Hadron Collider
  3.2 ATLAS Detector
  3.3 Coordinate System
  3.4 Tracking
  3.5 Calorimetry
  3.6 Muon system
  3.7 Magnet system
  3.8 Trigger system
Chapter 4 Object Reconstruction
  4.1 Track and Vertex Reconstruction
  4.2 Jets
  4.3 b-hadron Identification
  4.4 Muons
Chapter 5 Boosted H → bb̄ Analysis
  5.1 Introduction
  5.2 Samples
  5.3 Object Definition
  5.4 Event Selection
  5.5 Higgs Boson Modeling
  5.6 Background Process Modeling
  5.7 Statistical Analysis
Chapter 6 Boosted H → bb̄ Results
  6.1 Inclusive Region
  6.2 Fiducial Region
  6.3 Differential Regions
Chapter 7 Unified Flow Objects
  7.1 Introduction
  7.2 Jet Substructure Variables
  7.3 Machine Learning
  7.4 Binary Taggers for Boosted UFO jets
  7.5 High pT Scale Factor Extrapolation
  7.6 Multiclass Tagger for Boosted UFO jets
Chapter 8 Conclusions
BIBLIOGRAPHY
APPENDIX A: H → bb̄ Analysis
APPENDIX B: Unified Flow Objects


Chapter 1
Introduction
Humanity’s quest to understand our place in the universe has led us to the study of the
most fundamental representation of our reality: particles. Even though the idea of
elementary particles has existed for millennia, it wasn’t until the 19th century that a “modern”
view of particles was defined with the discovery of atoms. Atoms are the basic particles
of chemical elements, and it didn’t take physicists long to discover that atoms are in
fact composite particles made of protons, neutrons and electrons. With the
development of quantum physics to explain nuclear phenomena, coupled with technological
advancements in accelerator physics and particle colliders, we soon found ourselves within a
“particle zoo” of supposedly elementary particles during the 1950s. It wasn’t until the 1960s,
when physicists formulated what we today call the Standard Model (SM), that the origin of
so many particles was explained as combinations of a smaller number of truly fundamental
particles.
    High energy physics (also known as particle physics) attempts to create a robust mathe-
matical framework that models all the fundamental interactions observed in nature through
experimental observations. The SM is constantly being tested and re-tested by continuous
analysis of particle collisions produced in the largest and most complex machines ever built
by humanity. Teams of scientists and engineers perform a multitude of studies to confirm
with greater accuracy the established SM and test theories beyond the Standard Model
(BSM). This document presents one of those measurements in one of those experiments for
one specific particle, the Higgs boson. The Higgs boson plays a fundamental role in the SM,
as it is the particle responsible for the generation of the W, Z and fermion masses.
    The discovery of a particle with the properties of the Higgs boson in 2012 by ATLAS [1]
and CMS [2] concluded one of the main goals of the Large Hadron Collider (LHC) program
[3]. In subsequent years, with a larger dataset, more studies have been made putting the
measured resonance on more solid grounds [4][5]. One of those was the measurement of
H → bb̄ published in 2018 [6]. This measurement uses final states that limit a specific Higgs
boson production channel: gluon-gluon fusion (ggF), which in itself is a window to BSM
physics effects [7][8].
    This document explores fully inclusive Higgs boson production using the H → bb̄
decay mode at very high energies. A boosted all-hadronic H → bb̄ search requires an extra
hadronic jet. Therefore, the analysis will focus on H(→ bb̄) + j where both jets must have a
boosted topology. It is the first study in the ATLAS collaboration that targets Higgs boson
production cross-sections with transverse momenta above 1 TeV.
    An analysis in a particle collision experiment requires multiple preparatory steps. One
needs a collider to create the collisions, a detector to collect signals and procedures to turn
those signals into representations of physical objects. At the same time, one cannot extract
much information without first simulating what process is being produced and how it
interacts with the detector. Only after that can we define the scope of the measurement
and complete an analysis that measures a physical observable. Figure 1.1 presents a
diagrammatic representation of the steps required to perform a collider experiment analysis by
combining collider data and simulations.


Figure 1.1: Diagram that shows the steps required to perform an analysis in a collider
experiment [9].
    This thesis is divided into three parts. The first part starts with the theoretical
background behind the SM, the Higgs boson, jets and hadronic collisions in Chapter 2. Then,
the experimental apparatus, comprising the Large Hadron Collider and the ATLAS
detector, is described in Chapter 3. Chapter 4 describes the reconstruction algorithms used to
define the physics objects used within the analysis, mainly large-R jets and the techniques
to identify b-hadrons (b-tagging).
    The second part of this thesis, composed of Chapters 5 and 6, presents the boosted all-
hadronic H → bb̄ analysis and the results obtained, where the author contributed to various
studies, including uncertainties of the signal modeling, multijet background modeling and
combination studies. The analysis and the results shown in this document were published
in the paper “Constraints on Higgs boson production with large transverse momentum
using H → bb̄ decays in the ATLAS detector” by the ATLAS Collaboration [10].
    Finally, the third part of this thesis, consisting of Chapter 7, presents the studies per-
formed by the author as a member of the ATLAS Jet Tagging and Scale Factor Derivation
group. This chapter explores modern jet definitions and tagging techniques using Unified
Flow Objects [11]. These projects support ATLAS efforts to reduce the systematic uncer-
tainties associated with jet reconstruction as well as the development of tools to identify
boosted hadronic jets, which will be useful not just for the next iterations of the analysis
presented in this thesis, but for any analysis that aims to make measurements using hadronic
jets in the boosted regime.


Chapter 2
Theoretical Background
To give a theoretical description of the Higgs boson, the subject of this thesis, we must first
explore the underlying framework that is used to describe fundamental interactions. The
framework is based on relativistic quantum mechanics and quantum field theories (QFT) that
respect certain symmetry transformations. The introduction given here for these subjects
is based on “The Quantum Theory of Fields” by Steven Weinberg [12]; for a more detailed
treatment, refer to the original source.
    After the introduction, a summary of the QFTs that compose the Standard Model is
presented. The specifics of the Electroweak (EW) interaction, where the introduction of
the Higgs field becomes a necessity to explain experimental observations of certain physical
processes, is explored. Finally, Quantum Chromodynamics is discussed in order to
understand the origin of hadronic jets and how we model hadronic collisions.
2.1      Relativistic Quantum Mechanics
Any physical state is represented by a ray in a complex vector space known as Hilbert
space [13]. A ray R is a set of normalized state vectors where two states Ψ, Ψ′ belong to the
same ray if Ψ′ = ξΨ, with ξ an arbitrary complex number that satisfies |ξ| = 1.
    An observable is represented by a Hermitian operator that satisfies the reality condition
A† = A. Such operators represent mappings Ψ → AΨ of Hilbert space into itself. A state
represented by a ray R has a definite value α for the observable represented by A
if its vectors are eigenvectors of A with eigenvalue α:
                                    AΨ = αΨ for Ψ in R.                                 (2.1)
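If a system is in a state represented by R and an experiment is performed to test whether it is in
one of the different states represented by mutually orthogonal rays $R_n$, then the probability
of finding it in the state represented by $R_n$ is given by
\[ P(R \to R_n) = |(\Psi, \Psi_n)|^2, \tag{2.2} \]
where Ψ and $\Psi_n$ are vectors belonging to R and $R_n$, respectively.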
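As a minimal numerical sketch of Eq. (2.2), the following Python snippet evaluates the transition probabilities of a state onto an orthonormal basis and checks that they are unchanged when the state vector is multiplied by a phase ξ with |ξ| = 1, that is, that they depend only on the ray. The two-dimensional state, the phase and the function names are arbitrary choices made for illustration; they are not taken from the analysis.

```python
import numpy as np


def normalize(vector):
    """Return the vector scaled to unit norm."""
    vector = np.asarray(vector, dtype=complex)
    return vector / np.linalg.norm(vector)


def transition_probability(psi, psi_n):
    """|(Psi, Psi_n)|^2 for two normalized state vectors."""
    # np.vdot complex-conjugates its first argument.
    return abs(np.vdot(psi_n, psi)) ** 2


# Arbitrary illustrative state and an orthonormal basis of rays R_n.
psi = normalize([1.0, 1.0j])
basis = [normalize([1.0, 0.0]), normalize([0.0, 1.0])]

probabilities = [transition_probability(psi, b) for b in basis]

# Multiply by a phase xi with |xi| = 1: same ray, same probabilities.
xi = np.exp(0.7j)
probabilities_same_ray = [transition_probability(xi * psi, b) for b in basis]

print(probabilities, probabilities_same_ray)
```

Because the basis is complete, the probabilities also sum to one, as expected for mutually orthogonal rays that exhaust the possible outcomes.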
Symmetries
A symmetry transformation is a change of point of view that does not change the results of
an experiment [12]. That is, two observers O and O′ looking at the same system, represented
by rays R and R′ respectively, must find the same probabilities
\[ P(R \to R_n) = P(R' \to R'_n). \tag{2.3} \]
Any transformation R → R′ is defined by an operator U on Hilbert space such that if a
state Ψ is in ray R then UΨ is in ray R′. The operator U must satisfy U† = U⁻¹.
    Symmetry transformations have certain properties that define them as mathematical
groups [14]. A group is a set together with an operation that combines any two elements of the set into
an element of the same set. The operation must be associative, the set must contain an
identity element, and every element must have an inverse. For particle physics the
symmetry groups of interest for this thesis are the Lie groups. In particular, SU(n), the Lie
group of n × n unitary matrices with determinant 1, and the Poincaré group, the Lie group
of Minkowski spacetime isometries.
Lie Group
A Lie group is a group of transformations T(θ) that can be described by a finite set of
real continuous parameters $\theta^a$. On the Hilbert space the unitary operator U(T(θ)) can be
represented by a power series
\[ U(T(\theta)) = 1 + i\theta^a t_a + \frac{1}{2}\theta^b\theta^c t_{bc} + \cdots, \tag{2.4} \]
where $t_a$, $t_{bc}$ are Hermitian operators. The operators $t_a$ are known as the generators of the
group. Higher order terms of the expansion are related to the generators by the equation
\[ t_{bc} = -t_b t_c - i f^a{}_{bc}\, t_a. \tag{2.5} \]
It is required that the generators satisfy a set of commutation relations known as the Lie
algebra:
\[ [t_b, t_c] = i C^a{}_{bc}\, t_a, \tag{2.6} \]
where $C^a{}_{bc} \equiv -f^a{}_{bc} + f^a{}_{cb}$ are known as structure constants. For the special case where the
generators commute, the group is abelian rather than non-abelian, and the unitary operator can
be expressed simply as
\[ U(T(\theta)) = \exp(i t_a \theta^a). \tag{2.7} \]
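The commutation relations of a Lie algebra can be checked numerically for a concrete case. The sketch below assumes the group SU(2), with generators taken to be half the Pauli matrices and structure constants given by the Levi-Civita symbol; this choice, the parameter values and the helper names are illustrative assumptions. It verifies the algebra and builds a group element by exponentiating a linear combination of the generators, confirming that the result is unitary with determinant 1.

```python
import numpy as np

# Pauli matrices; the SU(2) generators are taken as t_a = sigma_a / 2.
sigma = np.array(
    [
        [[0, 1], [1, 0]],
        [[0, -1j], [1j, 0]],
        [[1, 0], [0, -1]],
    ],
    dtype=complex,
)
t = sigma / 2


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2


def commutator(x, y):
    return x @ y - y @ x


def algebra_holds():
    """Check [t_a, t_b] = i * eps_abc * t_c for all index pairs."""
    for a in range(3):
        for b in range(3):
            expected = sum(1j * levi_civita(a, b, c) * t[c] for c in range(3))
            if not np.allclose(commutator(t[a], t[b]), expected):
                return False
    return True


def group_element(theta):
    """exp(i theta^a t_a), computed by diagonalizing the Hermitian exponent."""
    exponent = np.einsum("a,aij->ij", theta, t)
    eigenvalues, eigenvectors = np.linalg.eigh(exponent)
    return eigenvectors @ np.diag(np.exp(1j * eigenvalues)) @ eigenvectors.conj().T


U = group_element(np.array([0.3, -1.2, 0.5]))
print(algebra_holds())
print(np.allclose(U @ U.conj().T, np.eye(2)), np.isclose(np.linalg.det(U), 1.0))
```

The determinant equals 1 because the generators are traceless, so the exponent has zero trace.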
Poincaré Group
The Poincaré group, also known as the inhomogeneous Lorentz group, is a 10-dimensional
non-Abelian Lie group represented by the set of transformations of the form T(Λ, a). A
Lorentz transformation connects coordinate systems in different inertial frames in a linear
form:
\[ x'^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu. \tag{2.8} \]
The constant matrix Λ satisfies
\[ \eta_{\mu\nu} \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma = \eta_{\rho\sigma}, \tag{2.9} \]
where $\eta_{\mu\nu}$ is the Minkowski metric tensor. In Hilbert space, for an infinitesimal Lorentz
transformation with $\Lambda = 1 + \omega$ and $a = \epsilon$, the unitary operator U(Λ, a) can be expanded in the form
\[ U(1 + \omega, \epsilon) = 1 + \frac{1}{2} i \omega_{\rho\sigma} J^{\rho\sigma} - i \epsilon_\rho P^\rho. \tag{2.10} \]
Commutation relations of the $J^{\mu\nu}$ and $P^\mu$ with themselves and each
other define the Lie algebra of the Poincaré group.
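The defining condition on Λ can be illustrated numerically. The sketch below assumes the component ordering (t, x, y, z) and the metric signature (−, +, +, +); both are conventions chosen for this example, as are the function names and parameter values. It checks that a boost, a rotation and their product preserve the metric, and that the interval of an arbitrary four-vector is left invariant.

```python
import numpy as np

# Minkowski metric; ordering (t, x, y, z) and signature (-, +, +, +) are assumed.
ETA = np.diag([-1.0, 1.0, 1.0, 1.0])


def boost_x(beta):
    """Boost along x with velocity beta (in units of c)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = gamma
    lam[0, 1] = lam[1, 0] = -gamma * beta
    return lam


def rotation_z(angle):
    """Rotation by the given angle around the z axis."""
    lam = np.eye(4)
    c, s = np.cos(angle), np.sin(angle)
    lam[1, 1] = lam[2, 2] = c
    lam[1, 2] = -s
    lam[2, 1] = s
    return lam


def is_lorentz(lam):
    """Check eta_{mu nu} Lam^mu_rho Lam^nu_sigma = eta_{rho sigma}."""
    return np.allclose(lam.T @ ETA @ lam, ETA)


combined = boost_x(0.6) @ rotation_z(0.3)
x = np.array([1.0, 0.2, -0.4, 0.7])
x_prime = combined @ x

print(is_lorentz(boost_x(0.6)), is_lorentz(rotation_z(0.3)), is_lorentz(combined))
print(np.isclose(x @ ETA @ x, x_prime @ ETA @ x_prime))
```

The product of two Lorentz transformations passing the same check is a direct illustration of the group closure property.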
    The Hamiltonian operator is given by $P^0$ and is the generator of time translations. The
momentum three-vector $\mathbf{P} = \{P^1, P^2, P^3\}$ is the generator of space translations. These
form a subgroup of the Poincaré group. In Hilbert space pure translations are represented
by
\[ U(1, a) = \exp(-i P^\mu a_\mu). \tag{2.11} \]
In the same fashion, the angular-momentum three-vector $\mathbf{J} = \{J^{23}, J^{31}, J^{12}\}$ is the generator
of rotations
\[ U(R_\theta, 0) = \exp(i \mathbf{J} \cdot \boldsymbol{\theta}). \tag{2.12} \]
Finally, the other generators form the boost three-vector $\mathbf{K} = \{J^{10}, J^{20}, J^{30}\}$.
Particles
A general one-particle state $\Psi_{p,\sigma}$, with momentum p and additional degrees of freedom σ, transforms under a
Lorentz transformation Λ as
\[ U(\Lambda)\Psi_{p,\sigma} = \frac{N(p)}{N(\Lambda p)} \sum_{\sigma'} D_{\sigma'\sigma}(W(\Lambda, p))\, \Psi_{\Lambda p,\sigma'}, \tag{2.13} \]
where N is a normalization factor, W is an element of the little group (the subgroup of Lorentz
transformations that leaves a chosen standard momentum invariant) and D(W) are the coefficients that form a representation
of the little group.
    Finding irreducible representations of the little group is how we classify physical states
and thus how we define the different types of particles. For example, for particles with
positive-definite mass, an irreducible representation $D^{(j)}_{\sigma\sigma'}$ of dimensionality 2j + 1 with j =
0, 1/2, 1, ..., can be built using the standard rotation matrices. In this case σ runs over
the values j, j − 1, ..., −j for a particle with spin j.
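To make the dimensionality 2j + 1 concrete, the sketch below constructs the angular-momentum matrices of the spin-j representation in the basis σ = j, j − 1, ..., −j, using the standard ladder-operator matrix elements. These matrices are the generators from which the rotation matrices are obtained by exponentiation. The function name and the spin values used in the loop are illustrative choices.

```python
import numpy as np


def spin_matrices(spin):
    """Return (Jx, Jy, Jz) for the spin-j representation, basis m = j, ..., -j."""
    dim = int(round(2 * spin + 1))
    m = spin - np.arange(dim)
    jz = np.diag(m).astype(complex)

    # Raising operator: <m+1| J+ |m> = sqrt(j(j+1) - m(m+1)).
    j_plus = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        j_plus[k - 1, k] = np.sqrt(spin * (spin + 1) - m[k] * (m[k] + 1))
    j_minus = j_plus.conj().T

    jx = (j_plus + j_minus) / 2
    jy = (j_plus - j_minus) / (2 * 1j)
    return jx, jy, jz


for spin in (0.5, 1.0, 1.5):
    jx, jy, jz = spin_matrices(spin)
    closes = np.allclose(jx @ jy - jy @ jx, 1j * jz)
    print(spin, jx.shape, closes)
```

For spin 1/2 the construction reproduces half the Pauli matrices, and in every case the matrices satisfy the rotation algebra [Jx, Jy] = iJz.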
Experimental Observables
The Hamiltonian (H) can be divided into two terms, a free-particle Hamiltonian H0 and an
interaction term V :
                                             H = H0 + V.                                     (2.14)
The free-particle Hamiltonian has eigenstates $\Phi_\alpha$ with eigenvalues $E_\alpha$, and the full
Hamiltonian has eigenstates $\Psi_\alpha^\pm$ with the same eigenvalues as the free-particle Hamiltonian. These
are known as “in” (+) and “out” (−) states and can be written in terms of the free-particle
eigenstates:
\[ \Psi_\alpha^\pm = \Omega(\mp\infty)\Phi_\alpha, \tag{2.15} \]
where
                               Ω(τ ) = exp(+iHτ )exp(−iH0 τ ).                          (2.16)
The in and out states contain the particles described by the label α if observations are made
at τ → −∞ and τ → +∞, respectively.
    The probability amplitude for a transition of states α → β is governed by the scalar
product of the “in” (+) and “out” (−) states, known as the S-matrix:
\[ S_{\beta\alpha} = (\Psi_\beta^-, \Psi_\alpha^+) \equiv (\Phi_\beta, S\Phi_\alpha), \tag{2.17} \]
where S is the S-operator defined as
\[ S = \Omega(\infty)^\dagger \Omega(-\infty). \tag{2.18} \]
The master formula to interpret calculations of S-matrix elements in terms of predictions for
actual experiments is
\[ S_{\beta\alpha} \equiv -2\pi i\, \delta^4(p_\beta - p_\alpha) M_{\beta\alpha}. \tag{2.19} \]
Here the delta function ensures the conservation of total energy and momentum and $M_{\beta\alpha}$
represents the non-trivial scattering matrix elements.
Decay Rate
The decay rate for a single particle state α into a general multi-particle state β is given by
\[ d\Gamma(\alpha \to \beta) = 2\pi |M_{\beta\alpha}|^2 \delta^4(p_\beta - p_\alpha)\, d\beta. \tag{2.20} \]
When multiple decay modes are available to a specific particle, the total decay rate is
the sum over all of the individual modes:
\[ \Gamma_{\text{total}} = \sum_{i=1}^{n} \Gamma_i. \tag{2.21} \]
It is then useful to define branching fractions to quantify the probability of each specific
decay mode. The branching fraction of mode i is given by
\[ B_i = \frac{\Gamma_i}{\Gamma_{\text{total}}}. \tag{2.22} \]
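The relation between partial widths and branching fractions can be sketched in a few lines of Python. The mode labels and width values below are hypothetical placeholders in arbitrary units, chosen only to illustrate Eqs. (2.21) and (2.22); they are not measured or predicted values for any particle.

```python
def branching_fractions(partial_widths):
    """Map each decay mode to Gamma_i / Gamma_total."""
    total_width = sum(partial_widths.values())
    if total_width <= 0:
        raise ValueError("the total decay width must be positive")
    return {mode: width / total_width for mode, width in partial_widths.items()}


# Hypothetical partial widths in arbitrary units (illustration only).
example_widths = {"mode_a": 2.0, "mode_b": 1.5, "mode_c": 0.5}
example_fractions = branching_fractions(example_widths)

for mode, fraction in example_fractions.items():
    print(mode, fraction)
```

By construction the branching fractions sum to one, which provides a quick consistency check whenever the list of decay modes is meant to be complete.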
Cross-Section
When α is a two particle state, we can calculate the transition rate per flux, known as the
differential cross section, using the decay rate
\[ d\sigma(\alpha \to \beta) = d\Gamma(\alpha \to \beta)/\Phi_\alpha = (2\pi)^4 u_\alpha^{-1} |M_{\beta\alpha}|^2 \delta^4(p_\beta - p_\alpha)\, d\beta, \tag{2.23} \]
where Φα is defined as the product of the density and the relative velocity uα between the
two particles in state α. When the differential cross section is integrated over all the possible
configurations i, we call it the total or inclusive cross-section:
\[ \sigma_{\text{total}} = \sum_{i=1}^{n} \sigma_i. \tag{2.24} \]
Interactions
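To account for the interaction term, we write the S-operator as a Dyson series in the
time-ordered products of the interaction Hamiltonian density $\mathcal{H}(x)$:
\[ S = 1 + \sum_{n=1}^{\infty} \frac{(-i)^n}{n!} \int d^4x_1 \cdots d^4x_n\, T\{\mathcal{H}(x_1) \cdots \mathcal{H}(x_n)\}, \tag{2.25} \]
where T denotes time ordering and the Hamiltonian density is defined through
\[ V(\tau) \equiv \exp(iH_0\tau)\, V \exp(-iH_0\tau) = \int d^3x\, \mathcal{H}(\mathbf{x}, \tau). \tag{2.26} \]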
    Then it is possible to write an asymptotic expansion of the S-operator in powers of whatever
coupling constants appear in the interaction terms of the Hamiltonian density. This
technique is known as perturbation theory.
2.2      Quantum Field Theory
For the Hamiltonian density to satisfy both Lorentz invariance and the cluster decomposition
principle, it must be constructed as a function of creation and
annihilation fields. A creation field contains a creation operator ($a^{c\dagger}(\mathbf{p}, \sigma)$) that
adds a particle to the list of particles in a physical state. The annihilation field contains the
annihilation operator ($a(\mathbf{p}, \sigma)$) and does the opposite: it removes a particle from any state
on which it acts. A general quantum field in the irreducible (A, B) representation of the
homogeneous Lorentz group, where A and B are the spins, is defined by a linear combination
of creation and annihilation fields:
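\[ \psi_{ab}(x) = (2\pi)^{-3/2} \sum_\sigma \int d^3p \left[ u_{ab}(\mathbf{p}, \sigma)\, a(\mathbf{p}, \sigma)\, e^{ip\cdot x} + (-)^{2B} v_{ab}(\mathbf{p}, \sigma)\, a^{c\dagger}(\mathbf{p}, \sigma)\, e^{-ip\cdot x} \right], \tag{2.27} \]
where the indices a, b are integers or half-integers running over the values
\[ a = -A, -A+1, \ldots, +A \quad \text{and} \quad b = -B, -B+1, \ldots, +B. \tag{2.28} \]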
A field in the (A, B) representation has components that rotate like objects of
spin j with
\[ j = A+B,\ A+B-1,\ \ldots,\ |A-B|. \tag{2.29} \]
Fields with integer spin commute with each other and are classified as bosons.
Bosons do not obey the Pauli exclusion principle and are described by Bose-Einstein
statistics. On the other hand, half-integer spin fields, known as fermions, anticommute with
each other. Fermions obey the Pauli exclusion principle, and therefore a system of fermions
follows Fermi-Dirac statistics.
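The spin content of a field in the (A, B) representation, and the resulting classification into bosons and fermions, can be enumerated with a short Python sketch. It uses exact fractions to represent half-integer labels; the function names are illustrative, and the three representations in the loop are simply convenient examples.

```python
from fractions import Fraction


def spin_content(A, B):
    """List the spins j = A+B, A+B-1, ..., |A-B| of an (A, B) field."""
    A, B = Fraction(A), Fraction(B)
    for label in (A, B):
        # Each label must be a non-negative integer or half-integer.
        if label < 0 or (2 * label).denominator != 1:
            raise ValueError("A and B must be non-negative integers or half-integers")
    spins = []
    j = A + B
    while j >= abs(A - B):
        spins.append(j)
        j -= 1
    return spins


def spin_statistics(j):
    """Integer spin gives a boson, half-integer spin a fermion."""
    return "boson" if Fraction(j).denominator == 1 else "fermion"


half = Fraction(1, 2)
for A, B in [(0, 0), (half, half), (half, 0)]:
    spins = spin_content(A, B)
    print((A, B), spins, [spin_statistics(j) for j in spins])
```

The (0, 0) case yields only spin 0, the (1/2, 1/2) case yields spins 1 and 0, and the (1/2, 0) case yields spin 1/2.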
Lagrangians
In practice it is preferable to work with Lagrangians (L) instead of Hamiltonians (H).
The two quantities are related by a Legendre transformation:
\[ H = \sum_l \int d^3x\, \Pi_l(\mathbf{x}, t)\, \dot\Psi^l(\mathbf{x}, t) - L[\Psi(t), \dot\Psi(t)], \tag{2.30} \]
where Ψ is a set of generic fields, Π are the conjugate fields and the dotted variables represent
time derivatives. The conjugate field is defined using variational derivatives:
\[ \Pi_l(\mathbf{x}, t) = \frac{\delta L[\Psi(t), \dot\Psi(t)]}{\delta \dot\Psi^l(\mathbf{x}, t)}. \tag{2.31} \]
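The equations of motion, known as the Euler-Lagrange equations, are given by the time derivatives
$\dot\Pi_l(\mathbf{x}, t) = \delta L/\delta \Psi^l(\mathbf{x}, t)$. Defining the Lagrangian density $\mathcal{L}$,
\[ L[\Psi(t), \dot\Psi(t)] = \int d^3x\, \mathcal{L}(\Psi(\mathbf{x}, t), \nabla\Psi(\mathbf{x}, t), \dot\Psi(\mathbf{x}, t)), \tag{2.32} \]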
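we can express the Euler-Lagrange equations in their usual form
\[ \partial_\mu \frac{\partial \mathcal{L}}{\partial(\partial_\mu \Psi^l)} = \frac{\partial \mathcal{L}}{\partial \Psi^l}. \tag{2.33} \]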
Scalar Fields
A scalar field is a field of type (0, 0) in the irreducible representation of the homogeneous
Lorentz group and is therefore a spin 0 field. A general Lagrangian density $\mathcal{L}$ for a massive
free scalar field Φ is given by
\[ \mathcal{L} = -\frac{1}{2} \partial_\mu \Phi\, \partial^\mu \Phi - \frac{m^2}{2} \Phi^2. \tag{2.34} \]
The Euler-Lagrange equation is then
\[ (\partial_\mu \partial^\mu - m^2)\Phi = 0, \tag{2.35} \]
which is the usual Klein-Gordon equation.
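For completeness, the intermediate step between the scalar Lagrangian density (2.34) and the Klein-Gordon equation (2.35) can be written out explicitly. The short derivation below simply inserts the two partial derivatives of the Lagrangian density into the Euler-Lagrange equation (2.33), with Φ playing the role of the generic field; it assumes nothing beyond the sign conventions of Eq. (2.34).

```latex
\begin{align*}
\frac{\partial \mathcal{L}}{\partial(\partial_\mu \Phi)} &= -\partial^\mu \Phi ,
&
\frac{\partial \mathcal{L}}{\partial \Phi} &= -m^2 \Phi , \\
\partial_\mu\!\left(-\partial^\mu \Phi\right) &= -m^2 \Phi
\quad\Longrightarrow\quad
(\partial_\mu \partial^\mu - m^2)\,\Phi = 0 .
\end{align*}
```

The factor of 1/2 in the kinetic term cancels because the derivative of Φ appears twice in it, and the same happens for the mass term.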
Vector Fields
A vector field is a field of type (1/2, 1/2) in the irreducible representation of the homogeneous
Lorentz group. Therefore it can have spin 0 or spin 1. The spin 0 vector field is just the
derivative of a spin 0 scalar field ($\partial_\mu \Phi$). For a massive spin 1 vector field $A_\mu$ and no
external currents the Lagrangian density $\mathcal{L}$ is
\[ \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - \frac{1}{2} m^2 A_\mu A^\mu, \tag{2.36} \]
where $F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu$ is the field strength tensor. In conjunction with $\partial_\mu A^\mu = 0$, the
Euler-Lagrange equation takes the form:
\[ (\partial_\mu \partial^\mu - m^2) A^\nu = 0, \tag{2.37} \]
which is known as the Proca equation. This implies that each component of the field fulfils
the Klein-Gordon equation.
Dirac Fields
Dirac fields represent particles of spin 1/2 and are of the type (1/2, 0) ⊕ (0, 1/2) in the irreducible
representation of the homogeneous Lorentz group. A general Lagrangian density for Dirac
fields is of the form
\[ \mathcal{L} = -\bar\psi (\gamma^\mu \partial_\mu + m) \psi, \tag{2.38} \]
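where $\bar\psi = \psi^\dagger \gamma^0$ is the Dirac adjoint. The Euler-Lagrange equation for ψ is known as the
Dirac equation
\[ (\gamma^\mu \partial_\mu + m)\psi(x) = 0. \tag{2.39} \]
Taking the Hermitian conjugate of the Dirac equation and multiplying on the right by $\gamma^0$,
the adjoint Dirac equation can be derived. Applying the operator $(\gamma^\nu \partial_\nu - m)$ to the Dirac
equation and using the anticommutation relation $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$, we arrive at
\[ (\partial_\mu \partial^\mu - m^2)\psi(x) = 0. \tag{2.40} \]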
Therefore, each component of the Dirac field satisfies the Klein-Gordon equation.
2.3       Standard Model
The Standard Model (SM) is the collection of quantum field theories (QFTs) that describe
three of the four fundamental forces of the universe and classifies all the elementary par-
ticles currently known. The three forces it describes are: the strong interaction, the weak
interaction and electromagnetism. The strong interaction is described by quantum chro-
modynamics (QCD). Electromagnetism (EM) and the weak interaction are unified into the
same theory, called the electroweak interaction (EW). In its entirety, the SM respects the
symmetry under the non-abelian SU(3)C × SU(2)L × U(1)Y gauge group.
    The SM contains both types of particles, fermions and bosons. The fermions can be
divided into quarks and leptons and they exist in three generations; each one with increasing
mass. There are six types of quarks: up, down, strange, charm, bottom and top. There are
also six leptons: electron, muon, tau and their respective neutrinos. The bosons are divided
into the spin 1 (vector) force carriers and the spin 0 (scalar) Higgs boson. The force carriers
are the photon γ for the electromagnetic interaction, the gluon g for the strong interaction
and the W ± , Z bosons that mediate the weak interaction. Figure 2.1 summarizes all the
SM fundamental particles.
Figure 2.1: Standard Model of elementary particles showing the twelve fundamental fermions
and five fundamental bosons [15].
2.4      Quantum Chromodynamics
Quantum Chromodynamics (QCD) is the non-abelian gauge field theory that describes the
strong interaction, a force felt only by quarks and gluons. Quarks are in the fundamental
representation of the SU(3) color group. They are represented by quark field spinors ψ_i,
where i is the color index that runs from 1 to N_c = 3 (the number of colors). The gluon is a
vector field, A^a_µ, where a runs from 1 to N_c^2 − 1 = 8. Gluons transform under the adjoint
representation of the SU(3) color group. The eight 3 × 3 matrices t^a_ij are the generators of
SU(3), which can be represented explicitly by the Gell-Mann matrices λ^a through t^a = λ^a/2.
The QCD Lagrangian is given by
    \mathcal{L} = \bar{\psi}_i \left( i\gamma^\mu \partial_\mu \delta_{ij} - g_s \gamma^\mu t^a_{ij} A^a_\mu - m\,\delta_{ij} \right) \psi_j - \frac{1}{4} G^a_{\mu\nu} G^{a\,\mu\nu}.        (2.41)
The QCD coupling constant is α_s = g_s^2/4π. The gluon field strength tensor G^a_µν is given by
    G^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu - g_s f_{abc} A^b_\mu A^c_\nu        (2.42)
where f_abc are the structure constants of the SU(3) group. The coupling constant α_s is a
function of the scale at which the process happens. Figure 2.2 shows different experimental
measurements of the coupling constant αs as a function of energy scale Q. Quarks and
gluons cannot be isolated; only color-singlet (color-neutral) combinations of them can be
observed as free particles. Because the coupling is large at low energies, quarks and gluons
are confined into hadrons, a non-perturbative process called hadronization. On the other
hand, for hard processes the strong coupling is small and the theory becomes amenable to
perturbation theory techniques, a phenomenon known as asymptotic freedom.
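The scale dependence of αs described above can be sketched numerically. The snippet below is
a minimal illustration, assuming one-loop running with a fixed number of active flavors
(n_f = 5) and a reference value αs(M_Z) = 0.118. The function name, the reference values and
the neglect of flavor thresholds and higher-order terms are assumptions made for this sketch
and are not taken from the analysis.

```python
import math

def alpha_s_one_loop(q_gev, alpha_s_mz=0.118, m_z_gev=91.19, n_f=5):
    """One-loop running of the strong coupling from M_Z to the scale Q (assumed fixed n_f)."""
    b0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    log_ratio = math.log(q_gev**2 / m_z_gev**2)
    return alpha_s_mz / (1.0 + b0 * alpha_s_mz * log_ratio)

# The coupling is larger at low Q and smaller at high Q.
for q in (10.0, 91.19, 1000.0):
    print(f"alpha_s({q:g} GeV) ~ {alpha_s_one_loop(q):.4f}")
```

In this approximation the coupling grows toward low Q and shrinks toward high Q, which is the
qualitative trend shown in Figure 2.2.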
   Before hadronization occurs, a hard scattering event involving a QCD interaction starts
with the interacting particles radiating more gluons and quarks (parton showering) until the
parton energy gets to the hadronization scale (Λ). This leads to the formation of collimated
sprays of energetic hadrons, which we call jets [17]. Chapter 4 discusses the particular type
of jets used in this analysis, how we define them, the rules to group particles and how to
determine their momenta.
Figure 2.2: Measurements of αs as a function of the energy scale Q. The degree of QCD
perturbation theory used in the extraction of αs is indicated in parentheses [16].
2.5       Electroweak Theory
The standard model of electroweak interactions is based on the gauge field theory that
respects the symmetry of the SU(2)L × U(1)Y product group, with gauge bosons W^i_µ, i =
1, 2, 3 and B_µ, and their corresponding gauge coupling constants g and g′. The fundamental
quantities are the SU(2)L weak isospin and the U(1)Y weak hypercharge. The right-handed
fields are singlets in SU(2). On the other hand, the left-handed fermion fields transform as
doublets
    \Psi_i = \begin{pmatrix} \nu_i \\ \ell_i^- \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} u_i \\ d'_i \end{pmatrix}        (2.43)
where d′_i = Σ_j V_ij d_j and V is the Cabibbo-Kobayashi-Maskawa (CKM) mixing matrix [18].
The CKM matrix is a unitary 3 × 3 matrix and it describes the quark flavor mixing in weak
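interactions. In one of the standard parametrizations it can be expressed as
    V_{\mathrm{CKM}} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_{23} & s_{23} \\ 0 & -s_{23} & c_{23} \end{pmatrix} \begin{pmatrix} c_{13} & 0 & s_{13} e^{-i\delta} \\ 0 & 1 & 0 \\ -s_{13} e^{i\delta} & 0 & c_{13} \end{pmatrix} \begin{pmatrix} c_{12} & s_{12} & 0 \\ -s_{12} & c_{12} & 0 \\ 0 & 0 & 1 \end{pmatrix}        (2.44)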
where sij = sin θij , cij = cos θij and δ is the phase responsible for CP-violating phenomena
in flavor-changing processes.
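As a small numerical check of the parametrization in Eq. (2.44), the sketch below builds the
matrix as a product of the three rotations and verifies that the result is unitary. The
helper names and the mixing angles and phase are illustrative assumptions (rough magnitudes,
not fitted values), and plain nested lists are used so that no external library is needed.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def ckm_matrix(theta12, theta23, theta13, delta):
    """Build V_CKM as the product of the three rotations in Eq. (2.44)."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    r23 = [[1, 0, 0], [0, c23, s23], [0, -s23, c23]]
    r13 = [[c13, 0, s13 * cmath.exp(-1j * delta)],
           [0, 1, 0],
           [-s13 * cmath.exp(1j * delta), 0, c13]]
    r12 = [[c12, s12, 0], [-s12, c12, 0], [0, 0, 1]]
    return matmul(matmul(r23, r13), r12)

def unitarity_deviation(v):
    """Largest |(V V^dagger)_ij - delta_ij|; zero for an exactly unitary matrix."""
    v_dagger = [[complex(v[j][i]).conjugate() for j in range(3)] for i in range(3)]
    product = matmul(v, v_dagger)
    return max(abs(product[i][j] - (1 if i == j else 0))
               for i in range(3) for j in range(3))

# Illustrative mixing angles and CP phase in radians (assumed values, not a fit).
V = ckm_matrix(theta12=0.227, theta23=0.042, theta13=0.0037, delta=1.2)
print("unitarity deviation:", unitarity_deviation(V))
print("|V_us| =", abs(V[0][1]))
```

Because each factor is itself unitary, the product is unitary for any choice of angles and
phase; the deviation printed above only reflects floating-point rounding.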
Higgs Mechanism
For the EW theory to be consistent with observations we require a mechanism that makes the
W and Z bosons massive to render the weak interaction short range. This can be achieved
by the introduction of a scalar field Φ, called the Higgs field, that causes a spontaneous
breaking of the electroweak gauge symmetry (EWSB) [16]. The Higgs potential is of the
form:
                                    V (Φ) = µ2 Φ† Φ + λ(Φ† Φ)2                           (2.45)
The Higgs field is a self-interacting SU(2) complex doublet with weak hypercharge Y = 1
(normalized such that the electric charge is Q = T_3L + Y/2):
    \Phi = \frac{1}{\sqrt{2}} \begin{pmatrix} \sqrt{2}\,\phi^+ \\ \phi^0 + i a^0 \end{pmatrix}        (2.46)
where φ^0 and a^0 are the CP-even and CP-odd neutral components and φ^+ is the complex
charged component. If the quadratic term of V(Φ) is negative, the neutral component of the
scalar doublet acquires a nonzero vacuum expectation value (VEV)
    \langle \Phi \rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix}        (2.47)
with φ^0 = H + ⟨φ^0⟩ and ⟨φ^0⟩ = v, inducing the spontaneous breaking of the gauge symmetry
SU(3)C × SU(2)L × U(1)Y into SU(3)C × U(1)em. Three of the four generators of SU(2)L ×
U(1)Y are spontaneously broken; this implies the existence of three massless Goldstone bosons,
which can be identified with three of the four Higgs field degrees of freedom. Figure 2.3
illustrates the fact that the Higgs field vacuum is not a single state with an energy of 0;
instead, it has degenerate vacua with a VEV of v.
    The kinetic term of the Higgs Lagrangian shows how the Higgs field couples to the Wµ
and Bµ gauge fields of the SU(2)L × U(1)Y local symmetry:
                                LHiggs = (Dµ Φ)† (Dµ Φ) − V (Φ)                        (2.48)
Figure 2.3: Illustration of the Higgs potential V (Φ). After EWSB, the Higgs field VEV is
not a single state with an energy of 0 (represented by A), it has degenerate vacua with a
VEV of v (represented by B) [19].
The covariant derivative is:
    D_\mu = \partial_\mu + i g \sigma^a W^a_\mu / 2 + i g' Y B_\mu / 2        (2.49)
where g and g′ are the SU(2)L and U(1)Y couplings and σ^a are the Pauli matrices. Expanding
the kinetic term and rearranging, we can see that the presence of the Higgs field gives
mass to the gauge bosons. Examining the mass term of the Lagrangian:
    \mathcal{L}_m = \frac{g^2 v^2}{8} \left( W_1^2 + W_2^2 \right) + \frac{v^2}{8} \left( g W_3^\mu - g' B^\mu \right)^2        (2.50)
we can see that the physical massive bosons are combinations of the original gauge bosons.
Rewriting in terms of the physical fields:
    \mathcal{L}_m = m_W^2 W^+_\mu W^{-\mu} + \frac{1}{2} m_Z^2 Z_\mu Z^\mu        (2.51)
where
    W^\pm_\mu = \frac{W^1_\mu \mp i W^2_\mu}{\sqrt{2}}, \qquad Z_\mu = \frac{1}{\sqrt{g^2 + g'^2}} \left( g W^3_\mu - g' B_\mu \right)        (2.52)
and
    m_W^2 = \frac{g^2 v^2}{4}, \qquad m_Z^2 = \frac{(g'^2 + g^2) v^2}{4}        (2.53)
There is one combination of W3 and B, orthogonal to Z, that is not present in the mass
Lagrangian; it corresponds to the photon:
    A_\mu = \frac{1}{\sqrt{g^2 + g'^2}} \left( g' W^3_\mu + g B_\mu \right)        (2.54)
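The mass relations in Eq. (2.53) and the orthogonality of the Z and photon combinations in
Eqs. (2.52) and (2.54) can be checked with a few lines of code. This is only a sketch: the
function names are hypothetical and the coupling values g = 0.65 and g′ = 0.35 are rough,
assumed numbers chosen for illustration.

```python
import math

def gauge_boson_masses(g, g_prime, v):
    """Return (m_W, m_Z) in GeV from Eq. (2.53)."""
    m_w = g * v / 2.0
    m_z = math.sqrt(g**2 + g_prime**2) * v / 2.0
    return m_w, m_z

def neutral_mixing(g, g_prime):
    """Coefficients of (W3, B) in the Z and photon fields, Eqs. (2.52) and (2.54)."""
    norm = math.hypot(g, g_prime)
    z_coeffs = (g / norm, -g_prime / norm)
    a_coeffs = (g_prime / norm, g / norm)
    return z_coeffs, a_coeffs

# Assumed, approximate couplings and VEV (GeV); illustrative only.
g, g_prime, v = 0.65, 0.35, 246.0
m_w, m_z = gauge_boson_masses(g, g_prime, v)
z_coeffs, a_coeffs = neutral_mixing(g, g_prime)
overlap = z_coeffs[0] * a_coeffs[0] + z_coeffs[1] * a_coeffs[1]

print(f"m_W ~ {m_w:.1f} GeV, m_Z ~ {m_z:.1f} GeV")
print(f"Z-photon overlap = {overlap:.1e}")
```

The vanishing overlap shows that the photon combination is orthogonal to the Z, and the
ratio m_W/m_Z depends only on g and g′.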
Of the initial four degrees of freedom of the Higgs field, three were absorbed by
the W± and Z bosons and the remaining degree of freedom, H, becomes the physical Higgs
boson. The Higgs boson is a CP-even spin 0 (scalar) particle with a mass given by m_H =
√(2λ) v, where λ is the self-coupling parameter. The Higgs field expectation value is fixed by
the Fermi coupling constant G_F: v = (√2 G_F)^(−1/2) ≈ 246 GeV.
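The two relations quoted above, v = (√2 G_F)^(−1/2) and m_H = √(2λ) v, can be evaluated
directly. The sketch below assumes a reference value of the Fermi constant that is not given
in the text, and uses the Higgs boson mass of 125.25 GeV quoted in Section 2.6; the function
names are hypothetical.

```python
import math

G_F = 1.1663787e-5  # Fermi constant in GeV^-2 (assumed reference value)
M_H = 125.25        # Higgs boson mass in GeV, as quoted in Section 2.6

def higgs_vev(g_f):
    """VEV in GeV from v = (sqrt(2) G_F)^(-1/2)."""
    return (math.sqrt(2.0) * g_f) ** -0.5

def self_coupling(m_h, v):
    """Self-coupling from inverting m_H = sqrt(2 lambda) v."""
    return m_h**2 / (2.0 * v**2)

v = higgs_vev(G_F)
lam = self_coupling(M_H, v)
print(f"v ~ {v:.1f} GeV, lambda ~ {lam:.3f}")
```

With these inputs the self-coupling comes out at roughly 0.13, at tree level and under the
stated assumptions.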
    Fermions acquire mass through interactions with the Higgs field, also known as Yukawa
interactions. Yukawa couplings respect all the symmetries of the SM but generate fermion
masses after EWSB occurs.
    \mathcal{L}_{\mathrm{Yukawa}} = -\hat{h}_{d_{ij}} \bar{q}_{L_i} \Phi\, d_{R_j} - \hat{h}_{u_{ij}} \bar{q}_{L_i}\, i\sigma_2 \Phi^* u_{R_j} - \hat{h}_{l_{ij}} \bar{l}_{L_i} \Phi\, e_{R_j} + \mathrm{h.c.}        (2.55)
After the Higgs field acquires a VEV, the fermions acquire a mass of the form m_fi =
h_fi v/√2, where h_fi is the Yukawa coupling and i = 1, 2, 3 refers to the three families of the
up-quark, down-quark or charged lepton sectors.
    The coupling of the Higgs boson to other fundamental particles is dictated by how massive
the particle is. The interaction is strongest with heavy particles such as the W and Z bosons
and the top quark. For fermions the coupling is linearly proportional to the fermion mass
(g_Hff̄ = m_f/v) and for bosons it is proportional to the square of the boson mass
(g_HVV = 2m_V^2/v).
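To make the mass dependence of the couplings concrete, the following sketch evaluates the
tree-level relations g_Hff̄ = m_f/v, g_HVV = 2m_V^2/v and h_f = √2 m_f/v. The particle masses
used here are approximate, assumed values chosen for illustration, and the function names are
hypothetical.

```python
import math

V_EV = 246.0  # Higgs VEV in GeV

def yukawa_coupling(m_f, v=V_EV):
    """Yukawa coupling h_f from m_f = h_f v / sqrt(2)."""
    return math.sqrt(2.0) * m_f / v

def higgs_fermion_coupling(m_f, v=V_EV):
    """Dimensionless coupling g_Hff = m_f / v."""
    return m_f / v

def higgs_vector_coupling(m_v, v=V_EV):
    """Coupling g_HVV = 2 m_V^2 / v, in GeV."""
    return 2.0 * m_v**2 / v

# Approximate masses in GeV (assumed values for illustration).
fermions = {"b": 4.18, "t": 172.5}
bosons = {"W": 80.4, "Z": 91.2}

for name, mass in fermions.items():
    print(f"{name}: g_Hff = {higgs_fermion_coupling(mass):.4f}, "
          f"h_f = {yukawa_coupling(mass):.4f}")
for name, mass in bosons.items():
    print(f"{name}: g_HVV = {higgs_vector_coupling(mass):.1f} GeV")
```

The top-quark Yukawa coupling comes out close to one, while the b-quark coupling is smaller
by the ratio of the two masses.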
2.6      Higgs Boson Phenomenology
This section explores the Higgs boson production modes and the branching ratios for all of
its decay channels [20]. We will finish the section by exploring how a boosted Higgs boson
can be used as a probe for beyond the Standard Model (BSM) physics.
Production and Decays
Experimentally, the Higgs boson mass is measured to be mH = 125.25 ± 0.17 GeV. To
produce a Higgs boson we require a collider experiment with a large center-of-mass (CoM)
energy such as the Fermilab Tevatron [21] or the CERN LHC [3]. Given that we are studying
the Higgs boson at the LHC, we need to first understand the production mechanisms in
hadron (in this case proton-proton (pp)) collisions. The principal production mode at the
LHC is the gluon-gluon fusion (ggF) process, followed by weak-boson (vector-boson) fusion
(VBF). Other production modes include associated production with a gauge boson (VH),
associated production with a tt̄ quark pair (tt̄H) or associated production with a single top
quark (tHq). Figure 2.4 illustrates the leading order Feynman diagrams for some of the
Higgs boson production modes at the LHC.
Figure 2.4: Leading order Feynman diagrams that contribute to Higgs boson production in
(a) ggF, (b) VBF, (c) Higgs-strahlung, (d) associated production with a gauge boson and
(e) associated production with a pair of top-quarks [16].
    Figure 2.5 shows the Higgs boson production cross section for pp collisions at a CoM
energy of √s = 13 TeV as a function of Higgs boson mass. For a Higgs boson mass of mH =
125 GeV, the total production cross section is 55.1 pb. The production mode breakdown as
a percentage is as follows: ggF is the largest contribution with 88% of the total production
cross section, VBF accounts for around 7%, VH (WH and ZH combined) sums to 4% and
tt̄H is close to 1%.
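The cross section and the production-mode fractions quoted above translate into produced
event counts once an integrated luminosity is fixed. The sketch below uses the 136 fb−1
quoted in the abstract; the counts are Higgs bosons produced, before any branching ratio,
detector acceptance or selection efficiency, and the names are hypothetical.

```python
SIGMA_TOTAL_PB = 55.1   # total production cross section at mH = 125 GeV
INT_LUMI_FB = 136.0     # integrated luminosity quoted in the abstract
FRACTIONS = {"ggF": 0.88, "VBF": 0.07, "VH": 0.04, "ttH": 0.01}

def expected_events(sigma_pb, int_lumi_fb):
    """Number of produced events, N = sigma * integrated luminosity (1 pb = 1000 fb)."""
    return sigma_pb * 1000.0 * int_lumi_fb

total = expected_events(SIGMA_TOTAL_PB, INT_LUMI_FB)
print(f"total produced Higgs bosons ~ {total:.3g}")
for mode, fraction in FRACTIONS.items():
    print(f"{mode}: ~ {fraction * total:.3g}")
```

Only a small fraction of these events is produced at high transverse momentum, which is the
regime studied in this thesis.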
    Detecting the Higgs boson requires an understanding of all the relevant decay channels.
The Higgs boson has a natural width of 4 MeV, meaning it has a lifetime of the order of
10^−22 seconds. Figure 2.6 shows the Higgs boson decay branching ratios as a function of
Higgs boson mass. The dominant decay mode is H → bb̄, with a branching fraction of
about 58%. This decay mode is the focus of the study presented in this thesis. Even though
it is the most common decay mode, the channel suffers from large backgrounds, primarily
from bb̄ production. The two channels with the best mass resolution, used to measure the
Higgs boson mass, are H → γγ and H → ZZ → 4l, which have clean signals despite their low
branching ratios. These two channels were used for the original discovery of the Higgs boson
in 2012 [23][24].
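The connection between the natural width and the lifetime quoted above is τ = ħ/Γ. A minimal
check, assuming the standard value of ħ in MeV·s (a constant not given in the text) and a
hypothetical function name:

```python
HBAR_MEV_S = 6.582119569e-22  # reduced Planck constant in MeV*s (assumed reference value)

def lifetime_from_width(width_mev):
    """Mean lifetime in seconds from the natural width, tau = hbar / Gamma."""
    return HBAR_MEV_S / width_mev

tau = lifetime_from_width(4.0)
print(f"tau ~ {tau:.2e} s")
```

A width of 4 MeV therefore corresponds to a lifetime of order 10^−22 seconds.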
    H → bb̄ is a promising channel to study the Higgs field coupling to quarks. For the
direct observation of the Higgs boson decaying to a pair of b-quarks, the production mode
used in the original studies was the VH channel. The presence of a vector boson reduces the
relative background because the leptonic decays of the W and Z bosons enable efficient
triggering and a significant reduction of the multijet background. The Higgs candidate was
reconstructed from two b-tagged jets in the event. Both ATLAS and CMS observed excesses
with a significance greater than 5σ when combining Run 1 and Run 2 data [26][27].
[Plot: σ(pp → H+X) [pb] versus MH [GeV] at √s = 13 TeV (LHC HIGGS XS WG 2016). Curves:
pp → H (N3LO QCD + NLO EW), pp → qqH (NNLO QCD + NLO EW), pp → WH (NNLO QCD + NLO EW),
pp → ZH (NNLO QCD + NLO EW), pp → ttH (NLO QCD + NLO EW), pp → bbH (NNLO QCD in 5FS,
NLO QCD in 4FS), pp → tH (NLO QCD).]
Figure 2.5: Standard Model Higgs boson production cross sections at √s = 13 TeV as a
function of Higgs boson mass for pp collisions [22].
   Sensitivity for an inclusive search for H → bb̄ in the ggF production mode is limited
because of the large amount of background from the inclusive production of pp → bb̄ + X.
From the Run 1 dataset, no meaningful results exist. The analysis presented in this document
is the first ever performed by the ATLAS collaboration that attempts to do this with the full
Run 2 data, with the sensitivity increased by focusing on Higgs boson production at high
transverse momentum.
[Plot: branching ratio versus MH [GeV] (LHC HIGGS XS WG 2016). Curves: bb, WW, gg, ττ, cc,
ZZ, γγ, Zγ, µµ.]
Figure 2.6: Standard Model Higgs boson decay branching ratios as a function of Higgs boson
mass [25].
Boosted Higgs boson
For a Higgs boson to be boosted to high momentum, an extra jet is required in the event for
the Higgs boson to recoil against. Figure 2.7 contains a couple of examples of diagrams that
contribute to the H + j production cross section.
Figure 2.7: Examples of gluon-gluon fusion Feynman diagrams that contribute to the H + j
process [28].
   A Higgs boson with high transverse momentum can be used to set constraints on physics
beyond the Standard Model (BSM) [29][30]. The inclusion in the SM Lagrangian of a set of
dimension-six operators [31] that describe physics at a scale Λ above the EW scale modifies
the Yukawa operator, provides a contact interaction of the Higgs boson with gluons and
introduces the chromomagnetic dipole moment operator [32]. All of these interactions have
an impact on the Higgs boson pT distribution. In particular, the chromomagnetic dipole
moment operator has been shown to have a large impact at high pT in the case of single
Higgs production [33]. Figure 2.8 illustrates these results. The extra term related to the
chromomagnetic dipole moment in this context is of the form
    \frac{c_3}{\Lambda^2} O_3 = c_3 \frac{g_s m_t}{2 v^3} (v + h)\, G^A_{\mu\nu} \left( \bar{\psi}_L \sigma^{\mu\nu} t^A \psi_R + \mathrm{h.c.} \right)        (2.56)
where c3 is the Wilson coefficient, σ^µν = (i/2)[γ^µ, γ^ν], t^A are the SU(3) generators and ψ is
the spinor representing the top quark.
2.7      Simulation of proton-proton Collisions
Any cross section that involves QCD interactions of initial-state hadrons is inherently not
calculable in perturbative QCD. Structure functions are needed to describe these complex
objects. The structure functions are given in terms of non-perturbative parton distribution
functions (PDFs). A PDF f_q/p(x) represents the number density of quarks of type q inside
a hadron that carry a fraction x of its longitudinal momentum. A typical hadron-hadron
(h1, h2) collision cross-section is of the form
    \sigma(h_1 h_2 \to X) = \sum_{n=0}^{\infty} \alpha_s^n(\mu_R^2) \sum_{i,j} \int dx_1\, dx_2\, f_{i/h_1}(x_1, \mu_F^2)\, f_{j/h_2}(x_2, \mu_F^2) \times \hat{\sigma}^{(n)}_{ij \to X}        (2.57)
where s is the squared center-of-mass energy of the collision, µR is the renormalization scale
and µF is the factorization scale, the scale below which emissions (in transverse momentum)
are accounted for within the PDFs. The parton-level cross-section σ̂_ij→X(x1 x2 s, µR^2, µF^2)
can be calculated using perturbative QCD.
Figure 2.8: Impact of the chromomagnetic operator on the pT spectrum of the Higgs boson.
The bottom panel shows the ratio with respect to the SM prediction [33].
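The structure of Eq. (2.57), a partonic cross section folded with two parton densities, can
be illustrated with a toy numerical convolution. Everything in this sketch is an assumption
made for illustration: the parton density f(x) = 2(1 − x) is not a fitted PDF, a single
parton species and a single order n are used, the partonic cross section is a step function
in arbitrary units, and the integral is done with a simple midpoint rule.

```python
def toy_pdf(x):
    """Toy parton density f(x) = 2(1 - x), normalized to one parton (not a fitted PDF)."""
    return 2.0 * (1.0 - x)

def sigma_hat_step(s_hat, threshold=125.0**2):
    """Toy partonic cross section: 1 (arbitrary units) above threshold, 0 below."""
    return 1.0 if s_hat >= threshold else 0.0

def convolve(pdf1, pdf2, sigma_hat, s, n=200):
    """Midpoint-rule estimate of the double integral over x1 and x2 in Eq. (2.57)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x1 = (i + 0.5) * h
        for j in range(n):
            x2 = (j + 0.5) * h
            # The partonic process sees the reduced energy squared x1 * x2 * s.
            total += pdf1(x1) * pdf2(x2) * sigma_hat(x1 * x2 * s)
    return total * h * h

s = 13000.0**2  # squared center-of-mass energy in GeV^2
print("toy hadronic cross section:", convolve(toy_pdf, toy_pdf, sigma_hat_step, s))
```

With a constant partonic cross section the integral reduces to the product of the PDF
normalizations; a threshold removes the region where x1 x2 s is too small to produce the
final state.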
    PDFs are determined empirically by fitting a large number of cross section data points
from many experiments, including Deep Inelastic Scattering experiments (DIS) and hadron
collider experiments. To evolve those functions to different energy scales, the Dokshitzer-
Gribov-Lipatov-Altarelli-Parisi (DGLAP) [34] equation is employed. Usually the default
choice of the scales is µR = µF = Q. Figure 2.9 shows the CT18 parton distribution
functions at different energy scales.
Figure 2.9: The CT18 parton distribution functions at Q = 2 GeV and Q = 100 GeV for
u, ū, d, d̄, s = s̄, and g [35].
     The parton-hadron transition is non-perturbative, so quantities like the energy spectrum
of hadrons in high-energy collisions cannot be calculated with perturbation theory alone.
Nevertheless, it is possible to factorize the perturbative and non-perturbative behaviours
using the concept of fragmentation functions. Similarly to PDFs, they depend on a
factorization scale and satisfy the DGLAP evolution equation.
     To create simulations of this entire process, we use parton-shower Monte Carlo (MC)
event generators such as Pythia [36], Herwig [37] and Sherpa [38]. They provide a full
simulation of QCD events at the level of measurable particles. Figure 2.10 shows a sketch
of a pp collision as simulated by an MC generator. There are MC generators that only
produce the matrix elements, such as MadGraph5_aMC@NLO [39], which are then passed to
a shower/hadronization program such as Pythia. The parton shower MC programs model
the gluon emissions and gluon splittings simulating a cascade of particles. Each emission
is generated at a lower scale, with the emissions stopping at a scale of the order of 1 GeV.
At this point a hadronization model is used to combine the resulting particles into hadrons.
There are different hadronization/shower models which might have slight differences in the
end result. In practice multiple programs are considered when generating MC predictions
for an analysis and the differences are quantified as a source of uncertainty.
    The remnants of hadron collisions also have to be modeled; this is referred to as the
underlying event (UE). The UE is usually implemented by introducing multiple parton in-
teractions (MPI) at a scale of a few GeV. Similarly, pile-up also has to be simulated. Pile-up
refers to any other pp collisions in addition to the collision of interest.
Figure 2.10: Sketch of a proton-proton collision as simulated by a multi-purpose Monte Carlo
event generator [40].
    As a last step, all of the generated particles and jets and their kinematic variables, at
“truth-level”, are subjected to a detector simulation. The interactions of the particles with
the different detector modules are simulated with Geant4 [41], a toolkit for simulating
the passage of particles through matter. After the detector simulation is completed, the
modified kinematic variables are referred to as being at “reconstructed-level”.
Chapter 3
Experimental Apparatus
3.1      Large Hadron Collider
To study physics at small scales it is necessary to accelerate particles to high energies and
have them interact, that is, make them collide. This is done by the Large Hadron Collider
(LHC) [3] at the European Organization for Nuclear Research, known as CERN, located
on the French-Swiss border near Geneva. The LHC is the largest and most powerful particle
accelerator ever built and is part of the CERN accelerator complex shown in Figure
3.1. The process starts with a cylinder of hydrogen gas. The hydrogen atoms are ionized
to obtain protons. These protons are then accelerated in bunches by using a series of ac-
celerators, first a linear accelerator (LINAC), then the proton synchrotron (PS), the super
proton synchrotron (SPS) and finally the LHC. The collider itself consists of two rings with
a circumference of approximately 26.7 km, where the two counter-rotating proton beams
are accelerated to an energy of 6.5 TeV per beam, leading to a center-of-mass energy of
√s = 13 TeV. To maintain the beams along the trajectory, the LHC uses superconducting
dipole magnets which are cooled to a temperature below 2 K using superfluid helium. The
superconducting magnets produce magnetic fields with a strength of about 8 T. Quadrupole
magnets are used to squeeze the beams as they enter the interaction points. The LHC is
designed to run with 2808 bunches per beam separated by a 25 ns gap, with each bunch
containing 100 billion (10^11) protons. This translates to a crossing rate of 40 MHz with
typically 50 collisions per crossing. There are four distinct interaction points where the
beams cross and the protons collide. At these sites the main detectors are placed: ALICE (A
Large Ion Collider Experiment) [42], LHCb (LHC-beauty) [43], CMS (Compact Muon Solenoid) [44]
and ATLAS (A Toroidal LHC ApparatuS) [45].
Figure 3.1: Illustration of the CERN accelerator complex. The LHC is the last ring in a
complex chain of particle accelerators [46].
    The number of events per second generated in the LHC can be described by the equation:
                                       Nevent = Lσevent                                   (3.1)
where σevent is the cross section for a certain process under study and L is the machine
luminosity. Given that the luminosity depends only on the beam parameters, it can be used
as a measure of the performance of the collider. The full definition for a Gaussian beam
distribution is:
    L = \frac{N_b^2\, n_b\, f_{\mathrm{rev}}\, \gamma}{4\pi\, \epsilon_n\, \beta^*} F        (3.2)
where N_b is the number of particles per bunch, n_b the number of bunches per beam, f_rev the
revolution frequency, γ the relativistic gamma factor, ε_n the normalized transverse beam
emittance (area occupied by the beam), β* the beta function (function of the transverse size
of the beam) at the collision point, and F the geometric luminosity reduction factor due to
the crossing angle at the interaction point. Integrating (with respect to time) the luminosity
over the different runs would then give us a measure of the amount of data that was delivered
by the LHC. Figure 3.2 shows the total integrated luminosity for Run 2 (2015-2018) of the
LHC, as well as the data recorded by the ATLAS detector that was deemed good for physics.
Figure 3.2: Total Integrated Luminosity and Data Quality of the LHC during Run 2 (2015-
2018) [47].
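Equations (3.1) and (3.2) can be combined into a rough estimate of the instantaneous
luminosity and of the Higgs boson production rate. The bunch population, number of bunches,
ring circumference and beam energy below are the figures quoted in this chapter; the
normalized emittance (3.75 µm), β* (0.55 m) and reduction factor F (0.84) are assumed,
design-like values that are not given in the text, so the result is only an order-of-magnitude
sketch and does not describe actual Run 2 operating conditions.

```python
import math

C_LIGHT = 299_792_458.0     # speed of light in m/s
PROTON_MASS_GEV = 0.938272  # proton mass in GeV
PB_TO_CM2 = 1e-36           # 1 pb expressed in cm^2

def instantaneous_luminosity(n_protons, n_bunches, f_rev_hz, gamma,
                             emittance_n_m, beta_star_m, reduction_factor):
    """Eq. (3.2), returned in m^-2 s^-1."""
    return (n_protons**2 * n_bunches * f_rev_hz * gamma
            / (4.0 * math.pi * emittance_n_m * beta_star_m)) * reduction_factor

f_rev = C_LIGHT / 26.7e3          # revolution frequency for a ~26.7 km ring
gamma = 6500.0 / PROTON_MASS_GEV  # Lorentz factor of a 6.5 TeV proton

lumi_m2 = instantaneous_luminosity(
    n_protons=1e11, n_bunches=2808, f_rev_hz=f_rev, gamma=gamma,
    emittance_n_m=3.75e-6,  # assumed normalized emittance
    beta_star_m=0.55,       # assumed beta function at the collision point
    reduction_factor=0.84,  # assumed geometric factor F
)
lumi_cm2 = lumi_m2 * 1e-4                      # convert m^-2 to cm^-2
higgs_rate_hz = lumi_cm2 * 55.1 * PB_TO_CM2    # Eq. (3.1) with sigma = 55.1 pb

print(f"L ~ {lumi_cm2:.2e} cm^-2 s^-1")
print(f"Higgs production rate ~ {higgs_rate_hz:.2f} per second")
```

Under these assumptions the luminosity is of order 10^34 cm−2 s−1, which corresponds to a
Higgs boson being produced every few seconds.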
3.2      ATLAS Detector
ATLAS [45] is a multi-purpose particle detector of 25 meters in height and 44 meters in
length that weighs about 7000 tonnes. It consists of various layers that perform specific
measurements of the particles from the collision. ATLAS is located 100 meters below the
surface at the CERN LHC Point 1. The detector was designed to have forward-backward
symmetry along the beam pipe with a large azimuthal angle coverage. It contains a
superconducting solenoid that surrounds the inner detector, immersing it in a 2 T solenoid field.
ATLAS also contains electromagnetic and hadronic calorimeters that are surrounded by su-
perconducting air-core toroids arranged with an azimuthal symmetry. A muon spectrometer
is located within the toroids. Figure 3.3 shows an ATLAS schematic of the different detector
modules.
Figure 3.3: Schematic of the ATLAS detector showing its main sub-components. The people
in the diagram indicate the scale of the detector [48].
3.3      Coordinate System
To describe the ATLAS detector in detail, we must first describe the conventions regarding
the coordinate system used. The clockwise direction of the beam defines the z-axis while
the x-y plane lies transverse to the beam direction. The positive x-axis points towards the
center of the LHC ring and the positive y-axis points upwards. The azimuthal angle φ is
defined around the beam axis and the polar angle θ is the angle from the beam axis. Rapidity
is then defined as y = (1/2) ln[(E + p_z)/(E − p_z)], which for massless particles becomes the
pseudorapidity η = − ln tan(θ/2). With these quantities the distance in the pseudorapidity-
azimuthal angle space can be defined as ΔR = √(Δη^2 + Δφ^2). Other quantities of interest
are the kinematic variables defined in the transverse (x-y) plane: the transverse momentum
p_T, transverse energy E_T and missing transverse energy E_T^miss.
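The coordinate definitions above map directly onto a few helper functions. This is a minimal
sketch with hypothetical function names; the wrapping of Δφ into the interval (−π, π] is an
added convention that the text does not spell out but that is needed for ΔR to be meaningful
across the φ = ±π boundary.

```python
import math

def rapidity(energy, pz):
    """Rapidity y = 0.5 * ln((E + pz) / (E - pz))."""
    return 0.5 * math.log((energy + pz) / (energy - pz))

def pseudorapidity(theta):
    """Pseudorapidity eta = -ln(tan(theta / 2)) for polar angle theta in radians."""
    return -math.log(math.tan(theta / 2.0))

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance dR = sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

# For a massless particle (E = |p|) rapidity and pseudorapidity coincide.
theta, p = 0.5, 10.0
print(rapidity(p, p * math.cos(theta)), pseudorapidity(theta))
print(delta_r(0.0, 3.0, 0.0, -3.0))
```

The last line shows the effect of the wrapping: two directions at φ = 3.0 and φ = −3.0 are
close to each other, not almost 2π apart.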
3.4      Tracking
Because of the large number of particles that emerge from the collision point, the inner
detector (ID) must have fine granularity in order to make high precision measurements. It
is also designed to provide hermetic and robust pattern recognition. The ID achieves this
with its 3 sub-detectors: the pixel detector [49], the silicon microstrip tracker, also known as
the semiconductor tracker (SCT), and the transition radiation tracker (TRT). The ID covers
the region |η| < 2.5, extends to 1.15 m radially and has a length of 6.2 m. It is contained in
a solenoid that immerses its 3 sub-detectors in a 2 T magnetic field, which allows charge and
momentum measurements. With its track reconstruction capabilities, the ID is the main
system used to construct primary and secondary interaction vertices and to identify
heavy-flavor jets (i.e. b-tagging). Figure 3.4 shows the overall layout of the inner detector.
A schematic view of the overall path of a charged particle in the inner detector is shown in
Figure 3.5.
Figure 3.4: Schematic diagram of the ATLAS inner detector showing its different sub-
components [50].
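The momentum measurement in the 2 T solenoid field relies on the curvature of charged-particle
tracks in the transverse plane. The sketch below uses the standard relation
pT [GeV] ≈ 0.3 · B [T] · R [m] for a unit-charge particle; it assumes a perfectly uniform
field and ignores energy loss and multiple scattering, and the function names are
hypothetical.

```python
CURVATURE_CONSTANT = 0.299792458  # GeV / (T * m) for a unit-charge particle

def curvature_radius_m(pt_gev, b_tesla):
    """Radius of the circular track in the transverse plane, R = pT / (0.3 * B)."""
    return pt_gev / (CURVATURE_CONSTANT * b_tesla)

def min_pt_to_reach_gev(radius_m, b_tesla):
    """Smallest pT for which a track starting at the beam line reaches a given radius.

    A circle through the origin extends out to twice its radius, so 2R >= radius_m.
    """
    return CURVATURE_CONSTANT * b_tesla * radius_m / 2.0

print(f"R(pT = 1 GeV, B = 2 T) ~ {curvature_radius_m(1.0, 2.0):.2f} m")
print(f"min pT to reach 1.15 m ~ {min_pt_to_reach_gev(1.15, 2.0):.2f} GeV")
```

In this idealized picture, tracks below a few hundred MeV of transverse momentum curl up
before reaching the outer radius of the inner detector.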
Pixel detector
The innermost part of the ID is the pixel detector [49]. The pixel detector provides the
highest granularity around the vertex region with a total of 1744 modules. Each module is
composed of oxygenated silicon sensors, front-end electronics and flex-hybrids with control
circuits. The silicon sensors are the sensitive part of the pixel detector and function as
solid-state ionization chambers. The pixel sensor is an array of bipolar diodes placed on
a silicon wafer. The p-n junctions operate under a reverse bias. Ionizing particles passing
through the active volume create drifting electron-hole pairs that produce an electrical signal
that can be measured. The bulk contains oxygen impurities to increase tolerance of the
silicon against damage caused by charged hadrons [51].
    The modules are arranged in three concentric cylinders around the beam axis and as
three disks in the end-cap regions. Three layers allow an effective reconstruction of tracks
by requiring a minimum of 3 hits. An Insertable B-Layer (IBL) [52] was installed between
the beam pipe and the pixel detector during the first long shutdown of the LHC (2013-2014)
to maintain robust tracking in the presence of increased pileup and radiation, while also
providing improved precision for vertexing and tagging.
Semiconductor Tracker
Surrounding the pixel detector is the semiconductor tracker (SCT), arranged in four concentric
cylinders around the beam axis and nine disks in the end-cap regions. Instead of pixels the
SCT contains silicon strip sensors. Each module is composed of two sensors glued together.
Eight strip layers are crossed by each track. Small-angle stereo strips consisting of two 6.4
cm long daisy-chained sensors measure both coordinates in the barrel region. In the end-cap
region, the strips run radially with a set of stereo strips at an angle of 40 mrad. The SCT
has a resolution of 16 µm in φ and 580 µm in z.
Transition Radiation Tracker
At a larger radius, the straw tubes of the Transition Radiation Tracker (TRT) provide
information on particle tracking and identification. The TRT consists of gas-filled (Xe, CO2, O2)
drift tubes with a gold-plated tungsten wire inside. In the barrel region, these straws are
parallel to the beam axis, while in the end-cap region, they are arranged radially in wheels.
In the barrel region the TRT achieves a resolution of 130 µm while in the end-cap region
it provides an accuracy of 30-50 µm. Each of the 3 cylindrical layers contains 32 modules,
and each module is composed of a carbon-fiber laminated shell with an internal array of the
straws embedded in a matrix of polypropylene fibers that serve as the transition radiation
material.
Figure 3.5: Schematic diagram of the structural elements traversed by a charged particle in
the barrel inner detector [53].
3.5      Calorimetry
Calorimeters are used to measure the energy of particles. Due to their segmented nature
they also can provide directional information about energetic charged leptons and hadrons
and even neutral particles that don’t interact with the trackers. When particles enter the
calorimeter they initiate a particle shower. These lower energy particles are absorbed by the
calorimeters and consequently produce a signal that allows the measurement of the energy
deposited. The ATLAS calorimeters cover the range |η| < 4.9. They can be divided into two
main categories: the EM calorimeters and hadronic calorimeters. A diagram of the calorimeter
system is shown in Figure 3.6. Both calorimeter systems must provide good containment
for electromagnetic and hadronic showers and also limit the punch-through into the muon
system. The EM calorimeter has a finer granularity that is suited for precision measurements
of electrons and photons. The hadronic calorimeter has coarser granularity, which is sufficient
to satisfy the physics requirements for jet reconstruction and ET^miss measurements.
Figure 3.6: Schematic diagram of the ATLAS calorimeter system showing all of its sub-
components [54].
    The energy resolution of a sampling calorimeter is parametrized as
                     \frac{\sigma}{E} = \frac{a}{\sqrt{E}} \oplus \frac{b}{E} \oplus c,                     (3.3)
where a is the stochastic term, b is the noise term and c corresponds to a constant term.
The stochastic term represents the random nature of the showering process and is dependent
on the active and absorber materials in the calorimeter as well as the number of layers and
their thickness. The noise term describes the electronic noise of the readout system. The
constant term reflects local non-uniformities in the response of the calorimeter.
    The EM calorimeter was measured [55] to have an energy resolution of
                     \frac{\sigma}{E} = \frac{2.8\%}{\sqrt{E}} \oplus \frac{0.12\ \mathrm{GeV}}{E} \oplus 0.3\%.                     (3.4)
    For the hadronic calorimeter the electronic noise was found to be negligible and thus not
included. The energy resolution measured [55] was
                     \frac{\sigma}{E} = \frac{52.9\%}{\sqrt{E}} \oplus 5.7\%.                     (3.5)
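    To illustrate how these parametrizations behave, the short Python sketch below evaluates
Equations 3.3 to 3.5 at a few energies. It assumes that E is expressed in GeV and that ⊕ denotes
addition in quadrature. The function and variable names are illustrative only and are not part of
any ATLAS software.

```python
import math


def quadrature_sum(*terms):
    """Combine independent terms in quadrature (the circled-plus operator)."""
    return math.sqrt(sum(t * t for t in terms))


def relative_resolution(energy_gev, a, b=0.0, c=0.0):
    """Return sigma/E for a stochastic term a, noise term b (GeV) and constant term c."""
    return quadrature_sum(a / math.sqrt(energy_gev), b / energy_gev, c)


# Coefficients as quoted in Eqs. (3.4) and (3.5); energies in GeV are an assumption.
EM_PARAMS = dict(a=0.028, b=0.12, c=0.003)
HAD_PARAMS = dict(a=0.529, c=0.057)  # noise term neglected, as in the text

for energy in (10.0, 100.0, 1000.0):
    em = relative_resolution(energy, **EM_PARAMS)
    had = relative_resolution(energy, **HAD_PARAMS)
    print(f"E = {energy:6.0f} GeV: EM sigma/E = {em:.4f}, hadronic sigma/E = {had:.4f}")
```

At high energy the stochastic and noise terms fall away and the constant term dominates, which
is why the relative resolution flattens rather than improving indefinitely.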
LAr Electromagnetic Calorimeter
The EM calorimeter has a barrel part (|η| < 1.475) and two end-cap components (1.375 <
|η| < 3.2). It is a liquid argon (LAr) detector with accordion-shaped electrodes and lead
absorber plates. This geometry provides a full φ symmetry without azimuthal cracks and
leads to a uniform performance in terms of linearity and resolution as a function of φ. The
absorbers have two stainless-steel sheets glued on either side using a resin-impregnated
glass-fiber fabric to provide mechanical strength. The readout electrodes, consisting of conductive
copper layers separated by insulating polyimide sheets, are located in the gaps between the
absorbers.
    The barrel EM calorimeter is composed of two half-barrels, each with a length of 3.2 m
and a weight of 57 tonnes. One half-barrel consists of 1024 accordion-shaped absorbers
interleaved with readout electrodes. For the EM calorimeter, one parameter of interest is
the radiation length X0, defined as the mean distance over which a high-energy electron
loses all but 1/e of its energy through bremsstrahlung. Each half-barrel is divided into 16
modules, each with a thickness of at least 22 X0 and covering ∆φ = 22.5°. These modules have
three layers in depth. The front layer is read out at the low-radius side of the electrode while the middle
and back layers are read out at the high-radius side of the electrode. A sketch of the different
layers of the EM barrel module is provided in Figure 3.7.
Figure 3.7: Schematic diagram showing the different layers of the EM calorimeter barrel
module [45].
Hadronic Calorimeter
The hadronic calorimeter system consists of the tile calorimeter (TileCal), the hadronic
end-cap calorimeter (HEC) and the forward calorimeter (FCal).
    TileCal is located outside the EM calorimeter envelope. The barrel covers |η| < 1.0 and
the extended barrels cover the range 0.8 < |η| < 1.7. It is a sampling calorimeter that
uses scintillating tiles as the active medium and steel as the absorber. The tile calorimeter
extends radially from an inner radius of 2.28 m to an outer radius of 4.25 m. Azimuthally,
TileCal is divided into 64 modules. It is segmented in three longitudinal layers with different
interaction lengths. The interaction length λ is defined as the mean free path of a hadronic
particle before undergoing an inelastic nuclear interaction. The three segments have 1.5, 4.1
and 1.8 λ for the barrel, and 1.5, 2.6 and 3.3 λ for the extended barrel. The scintillating tiles
are read out by wavelength-shifting fibers into two photomultiplier tubes (PMTs). When an
ionising particle crosses the tiles, it induces the production of blue scintillation light that
is then converted to green light by the wavelength-shifting fluors in the fibers. A schematic
drawing of a TileCal module with its components is shown in Figure 3.8.
    The HEC module is a copper/liquid-argon sampling calorimeter with a flat-plate design
that covers the range 1.5 < |η| < 3.2. It consists of two cylindrical wheels, each with two
longitudinal sections. HEC shares the end-cap cryostats with FCal and the electromagnetic
end-cap calorimeter. Each of the HEC wheels is constructed of 32 identical wedge-shaped
modules. The modules of the front wheels have 24 copper plates with a thickness of 25
mm. For the rear wheels, the modules are made of 16 copper plates with a thickness of 50
mm making its sampling fraction coarser. Figure 3.9 depicts the HEC module views from
different angles. Seven stainless-steel tie-rods provide the structural strength of the modules.
Honeycomb sheets are used to fill the space between three electrodes that divide the gaps
into four separate LAr drift zones. Each of these drift zones is supplied with a high voltage.
The middle electrode serves as the readout electrode and the other two carry surfaces of high
resistivity to which high voltage is applied, forming an electrostatic transformer.
Figure 3.8: Schematic diagram of a TileCal module, showing the slots in the steel for scin-
tillating tiles and the method of light collection by wavelength-shifting fibers to PMTs [56].
     The FCal system provides coverage over 3.1 < |η| < 4.9. The FCal modules are located
at a distance of 4.7 m from the interaction point and are exposed to high particle fluxes. To
avoid ion build-up problems it is designed with very small liquid-argon gaps. These gaps
are constructed by using an electrode structure of small-diameter rods centered in tubes
that are oriented parallel to the beam direction. Three modules make up the FCal: an
electromagnetic module (FCal1) and two hadronic modules (FCal2, FCal3). Figure 3.10
provides a schematic diagram of the FCal modules. FCal1 uses copper as an absorber to
optimize resolution and heat removal. FCal2 and FCal3, on the other hand, use mainly
tungsten. Extra shielding behind FCal3 is employed to reduce backgrounds in the end-cap
muon system.
Figure 3.9: Schematic of the R-φ (left) and R-z (right) views of the hadronic end-cap
calorimeter (HEC) module [57].
Figure 3.10: Schematic diagram showing the three forward calorimeter (FCal) modules [58].
3.6       Muon system
The outer part of the ATLAS detector is the muon spectrometer. It is designed to detect
muons exiting the barrel and end-cap calorimeters in the range |η| < 2.7. Most muons
pass through the inner detector without much interaction and thus it is necessary to have a
dedicated system for them. The muon spectrometer is based on the magnetic deflection of
the muon tracks in the large superconducting air-core toroid magnets. Magnetic bending is
provided by the large barrel toroid for |η| < 1.4. Two smaller end-cap magnets inserted at
the ends of the barrel toroid provide the track bending for 1.6 < |η| < 2.7. A combination
of these two fields provides the magnetic deflection in the transition region 1.4 < |η| < 1.6.
In the barrel region, located between the eight coils of the superconducting barrel toroid
magnet, there are eight precision-tracking chambers. In the end-cap, they are in front and
behind the two end-cap toroid magnets. Each octant is divided in the azimuthal direction
into two sectors (a large and a small sector). The chambers are arranged in three concentric
cylindrical shells around the beam axis at radii of approximately 5, 7.5 and 10 m. In the two end-cap
regions, the muon chambers form large wheels that are perpendicular to the z-axis at a
distance of |z| = 7.4, 10.8, 14 and 21.5 m.
    The momentum measurement is performed by the Monitored Drift Tube chambers (MDT)
that cover the range |η| < 2.7. The chambers consist of three to eight layers of drift tubes
with an average resolution of 80 µm per tube (35 µm per chamber). In the forward region
(2 < |η| < 2.7), the Cathode-Strip Chambers (CSC) are used due to their higher rate capa-
bility and time resolution. The CSCs are multiwire chambers with cathode planes segmented
into strips in orthogonal directions. This configuration allows the measurement of both
coordinates using the induced-charge distribution. The resolution of these chambers is 40 µm
in the bending plane and 5 mm in the transverse plane.
    The muon system also has the capability to trigger on muon tracks. The precision-tracking
chambers are complemented by a system of fast trigger chambers capable of delivering track
information within a few tens of nanoseconds after the passage of a particle. Resistive Plate Chambers (RPC) and
Thin Gap Chambers (TGC) were chosen for this, in the barrel and end-cap respectively.
Both chamber types deliver signals with a spread below 25 ns, thus they provide the ability
to tag the beam crossing. They also measure both coordinates of the track, one in the bend-
ing (η) plane and one in the non-bending (φ) plane. Muons can be measured in the inner
detector and in the muon system. Figure 3.11 shows the elements of the muon system as
they are arranged in the ATLAS detector.
Figure 3.11: Schematic view of the ATLAS muon spectrometer system showing its sub-
components [59].
3.7       Magnet system
ATLAS features a unique system of four large superconducting magnets. This system con-
sists of a solenoid, a barrel toroid and two end-cap toroids. Figure 3.12 depicts the magnet
system layout in the detector. The powerful magnetic fields produced enable the momentum
measurement of electrically charged particles generated in the collisions.
    The solenoid is aligned on the beam axis and provides a 2 T magnetic field for the inner
detector. It was designed to keep the material thickness in front of the calorimeter as low
as possible. It has an inner diameter of 2.46 m, an outer diameter of 2.56 m and an axial
length of 5.8 m. The material used is Al-stabilised NbTi conductor, which achieves a high
field with a reduced thickness.
Figure 3.12: Schematic representation of the ATLAS magnet system, showing the central
solenoid and the toroids [60].
    The barrel toroid system produces a magnetic field that fills the cylindrical volume sur-
rounding the calorimeter and both end-cap toroids. It consists of eight coils encased indi-
vidually in racetrack-shaped stainless-steel vacuum vessels. It has a length of 25.3 m with
inner and outer diameters of 9.4 m and 20.1 m respectively. The technology used for
the entire toroid system is based on a conductor of pure Al-stabilised Nb/Ti/Cu wound
into “pancakes” followed by vacuum impregnation.
3.8       Trigger system
Only a fraction of all the events that ATLAS detects contain interesting and useful infor-
mation. For this reason a system is needed to ensure a proper selection of events for study.
The Trigger and Data Acquisition (TDAQ) systems, the timing and trigger-control logic and
the Detector Control System (DCS) achieve this goal [61]. The trigger system [62][63] has
three distinct levels: L1, L2 and the event filter. Each one refines the decisions the previous
trigger made by applying additional selection criteria. The LHC has a bunch-crossing rate of 40
MHz and ATLAS collects about 60 TB/s of data. The TDAQ and DCS systems reduce the
rate of events to the order of 1 kHz and save around 1.5 GB/s to permanent storage.
    The L1 trigger searches for high transverse momentum muons, photons and jets. Its
selection is based on information from a subset of detectors. Muons are identified using
trigger chambers in the barrel and end-cap regions of the muon spectrometer. Calorimeter
selections are based on reduced-granularity information from all the calorimeters. L1 also
identifies Regions-of-Interest (RoI) where the detector has identified interesting features.
This first level of triggers makes a decision in less than 2.5 µs and reduces the rate from
40 MHz to 75 kHz. Events passing the L1 selection are transferred to the next stages
of the detector-specific electronics and to the data acquisition. The L2 selection uses all
the available detector data within the RoIs. It is designed to reduce the trigger rate to
approximately 3.5 kHz, processing an event in about 40 ms. The final stage is the event
selection carried out by the event filter. The event filter reduces the event rate to about 200
Hz and is implemented using offline analysis procedures, with an average event processing time of four seconds.
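    The rate reductions quoted above can be cross-checked with a few lines of Python. The sketch
below uses only the rates stated in the text (40 MHz, 75 kHz, 3.5 kHz and 200 Hz); the names
STAGES and rejection_factors are illustrative.

```python
# Output rate of each stage in Hz, as quoted in the text.
STAGES = [
    ("bunch crossings", 40e6),
    ("L1", 75e3),
    ("L2", 3.5e3),
    ("event filter", 200.0),
]


def rejection_factors(stages):
    """Return (stage name, input rate / output rate) for every trigger level."""
    return [
        (name, previous_rate / rate)
        for (_, previous_rate), (name, rate) in zip(stages, stages[1:])
    ]


factors = rejection_factors(STAGES)
for name, factor in factors:
    print(f"{name}: keeps about 1 event in {factor:.1f}")

overall = STAGES[0][1] / STAGES[-1][1]
print(f"overall: about 1 event in {overall:.0f}")
```

Most of the rejection happens at L1, which is why that level has to run in custom electronics
within a fixed latency, while the later levels can afford progressively longer processing times.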
Chapter 4
Object Reconstruction
Before we can perform any type of analysis we have to transform the electrical signals
recorded by the TDAQ system from particle interactions with the detector to actual physical
objects. This thesis is focused on the identification of high-momentum H → bb̄ decays. To
study this specific process we have to discuss how we reconstruct hadronic jets and how we
identify b-hadrons. Muon reconstruction will also be discussed, as muons are used in the
analysis to correct the Higgs boson mass spectrum and to trigger the events that populate
a control region for tt̄ events.
4.1      Track and Vertex Reconstruction
Tracking is performed by the inner detector, except for muons where the outer detector may
also be involved. Track reconstruction using the ID covers two sequences, a main inside-out
track reconstruction and a consecutive outside-in track reconstruction [64]. The pattern
recognition sequence (inside-out) starts with the formation of a seed from at least 3 hits in
the inner silicon tracker. This is done with the creation of three-dimensional representations
of the silicon detector measurements. From these, track seeds are built. Then, through a
window search, using the seed direction, the track candidates are built. Kalman filtering [65]
and smoothing are applied to the nearby hits from the detector elements to decide whether they
are added to or rejected from the track candidates. There is a dedicated module for resolving and
cleaning the initial track collection to avoid ambiguity due to the presence of fake tracks or
overlapping track segments with shared hits. The ambiguity solving module is based on a
scoring algorithm that is optimised for each sub-detector. After this, two modules perform
a track extension from the silicon detectors to the TRT. The extension to the TRT improves
momentum resolution and particle identification. The final fit of the track is done using a
maximum likelihood approach that involves minimizing a global χ2 .
    Not all tracks can be found using an inside-out procedure. Some ambiguous hits survive
the ambiguity-solving process, and tracks coming from secondary decay vertices may
not have any silicon hits for the inside-out sequence to proceed. This can occur for
kaon (KS) decays or photon conversions. The outside-in procedure starts with the
identification of tracks in the TRT using a Hough transform mechanism [66]. An association
tool prevents double counting of hits that have been assigned already to tracks in the inside-
out procedure. The TRT segments are then traced back into the silicon detector, which
allows one to find small track segments that were missed in the initial inside-out stage.
Figure 4.1 provides an example of an event showing the two track reconstruction methods.
    The primary vertex is reconstructed by using an iterative vertex finding algorithm [67].
Looking at the reconstructed tracks, vertex seeds are obtained. A χ2 fit is made using the
seeds and nearby tracks. Each possible track gets a weight associated with it which quantifies
the compatibility with the fitted vertex. Any track that has a displacement larger than 7σ
from the vertex is used to seed a new vertex. The algorithm is iterated until no more vertices
are found.
Figure 4.1: Example of an event showing the two possible TRT hit associations. Red shows
extensions using the inside-out method and black shows extensions using the outside-in
method [64].
4.2      Jets
Jets are collimated sprays of particles coming from a single hard interaction. A jet, in an
experimental context, will be detected by its interaction with the different detector modules,
creating tracks in the inner detector and depositing energy in the calorimeters. To define
a jet, we need a set of rules for grouping these particles and the calorimeter deposits. Jet
clustering algorithms are the main way of performing this task. For ATLAS, and this thesis,
the clustering is done by using the anti-kt algorithm [68], an algorithm in the same family as
the kt and Cambridge/Aachen sequential recombination algorithms.
    To define the clustering algorithms, we consider two distances: dij between entities i and
j, and diB between entity i and the beam B. Entities refer to particles, energy deposits or
pseudo-jets. The distance metrics are defined as
                     d_{ij} = \min\left(k_{ti}^{2p}, k_{tj}^{2p}\right) \frac{\Delta_{ij}^{2}}{R^{2}},                     (4.1)
                     d_{iB} = k_{ti}^{2p},                     (4.2)
where ∆ij² = (yi − yj)² + (φi − φj)², kti is the transverse momentum, yi is the rapidity and φi is
the azimuthal angle of entity i, R is the jet radius parameter and p is a parameter that governs
the relative power of the energy versus geometrical (∆ij) scales. A value of p = −1 results in
the anti-kt algorithm, while p = 1 is the usual kt algorithm and p = 0 corresponds to the
Cambridge/Aachen algorithm. The algorithm proceeds by identifying the smallest of all the
distances. If it is a dij, then i and j are combined into one pseudo-jet. If it is a diB, then the
entity i is classified as a jet and removed from the list. The procedure is repeated until every
entity has been assigned to a jet. With the anti-kt algorithm, soft particles tend to cluster with hard particles before they
cluster among themselves. A hard particle that doesn’t have another hard particle close
to it, will just accumulate all the soft particles within the radius R, resulting in a conical
jet. Figure 4.2 presents how a particular event is clustered into jets with four different jet
algorithms.
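    The recombination procedure described above can be sketched in a few lines of Python. This
is a deliberately simple, brute-force illustration and not the optimized implementation used in
practice. It assumes massless input particles, four-momentum addition when two entities are
merged, and a toy event whose values are invented for the example. All function names are
illustrative.

```python
import math


def massless(pt, y, phi):
    """Build a massless four-vector (px, py, pz, E) from pT, rapidity and azimuth."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y), pt * math.cosh(y))


def kt2(p):
    """Squared transverse momentum of a four-vector."""
    return p[0] ** 2 + p[1] ** 2


def rapidity(p):
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))


def delta2(a, b):
    """Squared distance in the rapidity-azimuth plane, with the azimuth wrapped."""
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (rapidity(a) - rapidity(b)) ** 2 + dphi ** 2


def cluster(particles, radius=0.4, p=-1):
    """Sequential recombination: p=-1 anti-kt, p=0 Cambridge/Aachen, p=1 kt."""
    entities = [tuple(v) for v in particles]
    jets = []
    while entities:
        # Smallest beam distance d_iB = kt^(2p).
        beam_index = min(range(len(entities)), key=lambda i: kt2(entities[i]) ** p)
        smallest = kt2(entities[beam_index]) ** p
        pair = None
        # Look for a pairwise distance d_ij, Eq. (4.1), that is even smaller.
        for i in range(len(entities)):
            for j in range(i + 1, len(entities)):
                a, b = entities[i], entities[j]
                d_ij = min(kt2(a) ** p, kt2(b) ** p) * delta2(a, b) / radius ** 2
                if d_ij < smallest:
                    smallest, pair = d_ij, (i, j)
        if pair is None:
            # A beam distance is the minimum: the entity becomes a final jet.
            jets.append(entities.pop(beam_index))
        else:
            # A pairwise distance is the minimum: merge i and j into a pseudo-jet.
            i, j = pair
            merged = tuple(x + y for x, y in zip(entities[i], entities[j]))
            del entities[j], entities[i]  # j > i, so delete j first
            entities.append(merged)
    return sorted(jets, key=kt2, reverse=True)


# Toy event (invented values): two hard particles, each with soft neighbours.
event = [
    massless(100.0, 0.0, 0.0), massless(1.0, 0.1, 0.1), massless(2.0, -0.2, 0.1),
    massless(80.0, 0.5, 3.0), massless(1.5, 0.6, 2.9),
]
for jet in cluster(event):
    print(f"jet pT = {math.sqrt(kt2(jet)):.1f} GeV")
```

In this toy event the soft particles are absorbed by the nearest hard particle before the hard
particles are declared jets, which is the behaviour that gives anti-kt jets their regular shape.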
    Of these algorithms, only the anti-kt algorithm is simultaneously simple, Infrared-Collinear (IRC)
safe and soft-resilient in terms of jet shape. Infrared safety describes algorithms that
are robust under the addition of soft radiation. Collinear safety means that
the result of the algorithm is unchanged when a particle is replaced by two collinear particles
(moving together in the same direction) that share its momentum. The impact of the underlying
event (UE) and pile-up on the momentum resolution for jets is close to zero, which is crucial for
high luminosity experiments such as those at the LHC. This can be observed by looking at the average jet area at a
Figure 4.2: Sample parton-level event clustered with four different jet algorithms with a
radius parameter value of R = 1 [68].
given pT for dijet events clustered using different algorithms when including the underlying
event and pile-up. When the ratio of the jet area and πR2 is calculated as a function of pT
using different jet clustering algorithms, only the anti-kt clustered jets stay close to 1 [68].
Topological Clustering
Jets deposit their energies into the calorimeters. Before applying the anti-kt algorithm we
must first find the energy clusters deposited in the detector. There are many algorithms that
have been used to construct the clusters. The fixed-sized sliding window algorithm [69] was
used in the early years of the ATLAS experiment but currently a more complex dynamical
topological cell clustering approach is employed [70].
    Topological clustering consists of finding topologically connected calorimeter signals due
to a specific collision event in an attempt to extract a significant signal from a noisy back-
ground. The metric used for the formation of topo-clusters is the cell signal significance σcell ,
defined as the ratio of the cell signal Ecell to the average noise in the cell σnoise,cell :
                     \sigma_{\mathrm{cell}} = \frac{E_{\mathrm{cell}}}{\sigma_{\mathrm{noise,cell}}}.                     (4.3)
Topo-clusters are then formed starting from a calorimeter cell with a highly significant seed
signal. Three parameters (S,N,P) control how the algorithm evolves and define signal thresh-
olds for seeding, growth and boundary features of the topological clustering. To begin, proto-
cluster seeds from calorimeter cells with σcell > S are selected. Then all the neighboring cells
satisfying σcell > N around the seeds are added. Finally the neighboring cells with σcell > P
are also added to the cluster. The optimised configuration for ATLAS is: (S=4,N=2,P=0)
making the resulting clusters 4-2-0 topo-clusters. Figure 4.3 shows an example of the stages
of topo-cluster formation.
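    The seed, growth and boundary steps can be illustrated with a small Python sketch. It is a
simplified toy and makes several assumptions that differ from the real detector: the cells sit on
a two-dimensional grid, each cell has only four neighbours, the thresholds are applied exactly as
written above (σcell > S, N, P), and no cluster splitting is performed. The grid values are
invented for the example.

```python
from collections import deque


def topo_clusters(significance, seed=4.0, grow=2.0, boundary=0.0):
    """Group cells of a 2D significance grid into clusters (sets of (row, col))."""
    rows, cols = len(significance), len(significance[0])

    def neighbours(r, c):
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                yield rr, cc

    # Seeds are processed in order of decreasing significance.
    seeds = sorted(
        ((r, c) for r in range(rows) for c in range(cols) if significance[r][c] > seed),
        key=lambda rc: significance[rc[0]][rc[1]],
        reverse=True,
    )
    assigned = set()
    clusters = []
    for start in seeds:
        if start in assigned:
            continue  # already absorbed by a cluster grown from a stronger seed
        cluster = {start}
        assigned.add(start)
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for nb in neighbours(*cell):
                if nb in assigned:
                    continue
                value = significance[nb[0]][nb[1]]
                if value > boundary:
                    cluster.add(nb)
                    assigned.add(nb)
                    if value > grow:
                        queue.append(nb)  # growth cells keep expanding the cluster
        clusters.append(cluster)
    return clusters


# Invented significance map with two seeds: 5.0 at (2, 1) and 4.5 at (3, 4).
grid = [
    [0.1, 0.5, 0.2, 0.0, 0.1],
    [0.3, 2.5, 1.0, 0.0, 0.2],
    [0.4, 5.0, 2.2, 0.0, 0.3],
    [0.2, 1.5, 0.6, 0.0, 4.5],
    [0.0, 0.1, 0.0, 0.0, 0.8],
]
for index, cells in enumerate(topo_clusters(grid)):
    print(f"cluster {index}: {len(cells)} cells")
```

Cells above the growth threshold extend the cluster to their own neighbours, whereas boundary
cells are attached but do not propagate the cluster any further.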
Large Radius Jets
When highly boosted massive particles decay, their decay products tend to become colli-
mated, resulting in high levels of overlap between them. For a quasi-collinear splitting [17]
into two objects i and j, the total mass is given by m² ≃ pTi pTj ∆Rij². Defining the total
momentum pT = pTi + pTj and z = pTj/pT, then
                     m^{2} \simeq z(1 - z)\, p_{T}^{2}\, \Delta R_{ij}^{2}.                     (4.4)
Figure 4.3: Stages of topo-cluster formation in the first module of the FCAL calorimeter for
a simulated dijet event. Shown in (a) are the cells with signal significance σcell > 4 that can
seed topo-clusters, in (b) cells with σcell > 2 controlling the topo-cluster growth, and in (c)
all clustered cells and the outline of topo-clusters in this module [71].
In the case of a Higgs boson decaying to a b-quark pair, the momentum is typically shared
evenly between the two quarks (z ≈ 0.5). Therefore the angular separation of its decay products is approximately
                     \Delta R \simeq \frac{2 m_{H}}{p_{T}}.                     (4.5)
For massive particles with high pT , the ability to resolve individual hadronic decay products
using standard narrow-radius jets begins to degrade. The b-quark pair coming from a Higgs
boson at a pT ≃ 250 GeV would be separated by approximately ∆R ≃ 1. Reconstructing
these objects in a single large-radius (large-R) jet is advantageous in order to maximize
efficiency [72]. Figure 4.4 contains an illustration of the degree of collimation of the decay
products of a massive Z′ boson when the pT increases.
Figure 4.4: Diagram showing the degree of collimation of the decay products of a massive
particle as pT increases [73].
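    Equation 4.5 can be evaluated numerically to see how quickly the two b-quarks merge as the
Higgs boson pT grows. The sketch below assumes mH = 125 GeV and evaluates the separation at
250 GeV and at the 450 GeV and 1 TeV thresholds quoted in the abstract. The function name and
the constant M_HIGGS are illustrative.

```python
def opening_angle(mass_gev, pt_gev, z=0.5):
    """Approximate Delta R from Eq. (4.4): m^2 ~ z (1 - z) pT^2 Delta R^2."""
    return mass_gev / (pt_gev * (z * (1.0 - z)) ** 0.5)


M_HIGGS = 125.0  # GeV, assumed value of the Higgs boson mass

for pt in (250.0, 450.0, 1000.0):
    print(f"pT = {pt:6.0f} GeV -> Delta R ~ {opening_angle(M_HIGGS, pt):.2f}")
```

For an even momentum split the expression reduces to 2 mH / pT, so above 450 GeV both b-quarks
are expected to fall well within a single jet of radius R = 1.0.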
    A single jet containing all the decay products of a massive particle has different properties
than a jet originating from a light quark. These large-R jets are rich with multi-pronged
substructure, properties that are absent in jets formed from gluons and light quarks. In
ATLAS, large-R jets are reconstructed using the anti-kt clustering algorithm with a radius
parameter R = 1.0.
Jet Trimming
When a hard scattering event occurs, the detector records more than just the final states.
Initial state radiation (ISR), multiple parton interactions (MPI), underlying event (UE)
remnants and pile-up all contribute to the final state. This complicates the jet definition as
it is often important to discriminate between these types of energy and the jet of interest.
In the case of large-R jets, the subtle substructure differences of jets formed from a massive
particle decay products and jets coming from quarks and gluons can be resolved more clearly
by removing soft QCD radiation from them [72]. The process of removing soft radiation
during the jet reconstruction is referred to as jet grooming.
    One of these grooming procedures is known as jet trimming [74]. The trimming algo-
rithm starts by clustering cells into jets with any clustering algorithm, for example, the
anti-kt algorithm, and calling them seed jets. For each seed jet, all of its constituents are
then reclustered using another jet algorithm into subjets with a characteristic radius Rsub.
Subjets from the original seed jet are discarded if they have pTi < fcut ·Λhard , where fcut is a
fixed dimensionless parameter and Λhard is a hard scale chosen depending on the kinematics
of the event. Finally, the remaining subjets are assembled into the new trimmed jet. Figure
4.5 contains a diagrammatic representation of how the trimming procedure is performed.
    The analysis presented in this document uses trimmed large-R jets with parameters
Rsub = 0.2 and fcut = 0.05. The scale Λhard chosen is the original jet pT , and therefore the
subjets with a pT of less than 5% of the original jet pT are removed.
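    The trimming step can be sketched in Python as follows. The example assumes that the
reclustering into subjets of radius Rsub has already been done and that each subjet is
available as a four-vector (px, py, pz, E). The subjet values are invented, and the function
names are illustrative.

```python
import math


def pt(p):
    """Transverse momentum of a four-vector (px, py, pz, E)."""
    return math.hypot(p[0], p[1])


def mass(p):
    m2 = p[3] ** 2 - (p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
    return math.sqrt(max(m2, 0.0))


def add(vectors):
    """Four-vector sum; returns a null vector for an empty list."""
    if not vectors:
        return (0.0, 0.0, 0.0, 0.0)
    return tuple(sum(component) for component in zip(*vectors))


def trim(subjets, f_cut=0.05):
    """Drop subjets with pT < f_cut * pT(untrimmed jet) and sum the survivors."""
    hard_scale = pt(add(subjets))  # Lambda_hard is taken to be the original jet pT
    kept = [s for s in subjets if pt(s) >= f_cut * hard_scale]
    return add(kept), kept


# Invented subjets: two hard prongs plus two soft contributions.
subjets = [
    (300.0, 0.0, 0.0, 305.0),
    (180.0, 20.0, 0.0, 183.0),
    (12.0, -5.0, 0.0, 13.5),
    (8.0, 3.0, 0.0, 9.0),
]
untrimmed = add(subjets)
trimmed, kept = trim(subjets)
print(f"kept {len(kept)} of {len(subjets)} subjets")
print(f"pT: {pt(untrimmed):.1f} -> {pt(trimmed):.1f} GeV")
print(f"mass: {mass(untrimmed):.1f} -> {mass(trimmed):.1f} GeV")
```

Removing the soft subjets changes the jet pT only slightly but shifts the jet mass noticeably,
which is the reason grooming matters for mass-based selections.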
Figure 4.5: Diagram that depicts the jet trimming procedure employed in this analysis [75].
Jet Calibration
Before applying a jet clustering algorithm, cell clusters need calibration to correct for the
effects of a non-compensating calorimeter response to hadrons, to accidental signal losses
and to energy lost in the inactive material. The calibration strategy is referred to as “lo-
cal hadronic cell weighting” (LCW) [70]. Topo-clusters calibrated using this method are
transformed to be at the LCW scale. After the calibration a large-R jet is defined with the
topo-clusters using the anti-kt algorithm and subsequently trimmed.
    The energy, pseudorapidity and mass calibration of the LCW jets are corrected for resid-
ual detector effects, using energy and pseudorapidity dependent calibration factors derived
from simulation [76]. The correction restores the average reconstructed calorimeter jet en-
ergy scale (JES) to that of particle-level jets. These scale factors are applied as multiplicative
weights that correct the distributions to the proper scales. The reconstructed large-R jet
energy, mass, pseudorapidity and transverse momentum become
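          E_{\mathrm{reco}} = c_{\mathrm{JES}} E_{0}, \quad m_{\mathrm{reco}} = c_{\mathrm{JES}} m_{0},          (4.6)
          \eta_{\mathrm{reco}} = \eta_{0} + \Delta\eta, \quad p_{T}^{\mathrm{reco}} = c_{\mathrm{JES}} |\vec{p}_{0}| / \cosh(\eta_{0} + \Delta\eta),          (4.7)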
where E0, m0, η0 and \vec{p}_0 are the jet properties before any calibration. The correction factors
cJES and ∆η are smooth functions of the large-R jet kinematics. The JES factor cJES is
parametrized by a Gaussian fit of the average jet energy response RE = ⟨Ereco/Etruth⟩.
    An extra jet mass scale (JMS) calibration is performed after the energy scale calibra-
tion. The correction factor cJMS is applied as a function of Ereco , η and log (mreco /Ereco ).
The definition of the correction factor is determined using the same procedure as the jet
energy calibration but using the jet mass response, Rm = ⟨mreco/mtruth⟩, instead. The
reconstructed kinematic variables corrected are then
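          m_{\mathrm{reco}} = c_{\mathrm{JES}} c_{\mathrm{JMS}} m_{0}, \quad p_{T}^{\mathrm{reco}} = c_{\mathrm{JES}} \sqrt{E_{0}^{2} - c_{\mathrm{JMS}}^{2} m_{0}^{2}} / \cosh(\eta_{0} + \Delta\eta).          (4.8)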
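    As a numerical illustration of Equations 4.6 to 4.8, the Python sketch below applies the
energy-scale and mass-scale corrections to a single jet. The correction factors in the example
are invented constants. In the actual calibration they are smooth functions of the jet
kinematics, so this is only a sketch of how the factors enter, with illustrative names.

```python
import math


def calibrate(e0, m0, eta0, c_jes, delta_eta, c_jms=1.0):
    """Apply Eqs. (4.6)-(4.8) to the uncalibrated energy, mass and pseudorapidity."""
    eta = eta0 + delta_eta
    energy = c_jes * e0
    jet_mass = c_jes * c_jms * m0
    # Momentum magnitude after the mass-scale correction. With c_jms = 1 this is
    # c_jes * |p0|, which reproduces Eq. (4.7).
    momentum = c_jes * math.sqrt(e0 ** 2 - (c_jms * m0) ** 2)
    return {"E": energy, "m": jet_mass, "eta": eta, "pt": momentum / math.cosh(eta)}


# Hypothetical jet and correction factors, chosen only for illustration.
jet = calibrate(e0=1000.0, m0=100.0, eta0=0.5, c_jes=1.05, delta_eta=0.01, c_jms=0.97)
for name, value in jet.items():
    print(f"{name} = {value:.3f}")
```

Writing the transverse momentum in terms of the corrected energy and mass keeps the calibrated
four-momentum self-consistent, so that E² = |p|² + m² still holds after both corrections.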
    The final step of the calibration is the in situ calibration method [77], which brings data into
agreement with MC using response measurements in pp collision data of well-known objects,
such as dijet events, that work as a reference. Scale factors are derived in the same fashion
as the JES and JMS calibration but the response is defined by the ratio of jet properties of
data and MC. At the end the groomed jets should be at the proper jet energy scale (JES)
and jet mass scale (JMS). Uncertainties are also derived through this process. An overview
of all the calibration steps is shown in Figure 4.6.
 Figure 4.6: Overview of the reconstruction and calibration procedure for large-R jets [77].
Jet Mass
One of the most powerful tools to distinguish jets that contain the decay products of massive
particles from the multijet background is the jet mass. Jet mass is considered one of the most
important jet substructure (JSS) variables for large-R jets. The jet mass used in the analysis
is referred to as combined jet mass mcomb [77]. It is called the combined jet mass because it
is a smooth interpolation between two other mass definitions: a calorimeter-based jet mass
mcalo and track-assisted jet mass mTA [78].
    For a large-R jet J with calorimeter-cell cluster constituents i the mcalo is defined as:
          m_{\mathrm{calo}} = \sqrt{\left(\sum_{i \in J} E_{i}\right)^{2} - \left(\sum_{i \in J} \vec{p}_{i}\right)^{2}}.          (4.9)
Given that the angular spread of the decay products of a boosted massive particle scales
as 1/pT , the spread is comparable with the calorimeter granularity at high values of pT .
It is possible to include tracking information to maintain performance at high pT . The
track-assisted mass is defined as:
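\[
m_{\mathrm{TA}} = \frac{p_T^{\mathrm{calo}}}{p_T^{\mathrm{track}}} \times m_{\mathrm{track}} \tag{4.10}
\]
where $p_T^{\mathrm{calo}}$ is the transverse momentum of the large-R calorimeter jet, $p_T^{\mathrm{track}}$ is the transverse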
momentum of the four-vector sum of tracks associated to the large-radius calorimeter jet,
and mtrack is the invariant mass of the four-vector sum. This mass measurement has a better
resolution for high-pT jets with low values of m/pT .
    The combined jet mass smoothly interpolates between mcalo at low pT and mTA at high
pT . A weighted least-squares combination is performed to define mcomb :
                                mcomb = wcalo mcalo + wTA mTA                           (4.11)
where the weights are determined by the mass resolutions σcalo , σTA of the calorimeter and
track-assisted measurements. These are derived using the jet mass response distribution in
dijet events. They are defined as
                                        −2
                                       σcalo                  −2
                                                            σTA
                           wcalo = −2        −2
                                                   wTA = −2       −2
                                                                                        (4.12)
                                   σcalo + σTA          σcalo + σTA
with the constraint wcalo + wTA = 1.
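    The three mass definitions of Eqs. (4.9) to (4.12) can be sketched in a few lines. This is
an illustration only: the function names are hypothetical, constituents are passed as plain
(E, px, py, pz) tuples, and the resolutions σcalo and σTA are assumed to be provided by the
caller rather than derived from the jet mass response.

```python
import math


def calo_mass(constituents):
    """Eq. (4.9): invariant mass of the summed constituent four-vectors.

    constituents: iterable of (E, px, py, pz) tuples.
    """
    e = px = py = pz = 0.0
    for ci_e, ci_px, ci_py, ci_pz in constituents:
        e += ci_e
        px += ci_px
        py += ci_py
        pz += ci_pz
    m_squared = e * e - (px * px + py * py + pz * pz)
    # Guard against small negative values from rounding
    return math.sqrt(max(m_squared, 0.0))


def track_assisted_mass(pt_calo, pt_track, m_track):
    """Eq. (4.10): track mass scaled by the calo-to-track pT ratio."""
    return pt_calo / pt_track * m_track


def combined_mass(m_calo, m_ta, sigma_calo, sigma_ta):
    """Eqs. (4.11) and (4.12): inverse-variance weighted combination."""
    inv_calo = sigma_calo ** -2
    inv_ta = sigma_ta ** -2
    w_calo = inv_calo / (inv_calo + inv_ta)
    w_ta = inv_ta / (inv_calo + inv_ta)
    return w_calo * m_calo + w_ta * m_ta
```

When the two resolutions are equal the combination reduces to the arithmetic mean, and the
measurement with the smaller resolution always receives the larger weight.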
Variable Radius Track Jets
Track jets are formed by applying jet clustering on tracks in the inner detector from charged
particles originating from the hard scattering vertex. They are crucial for finding b-hadrons
and are used in this analysis to integrate b-tagging methods with large-R groomed jets.
    Using a fixed radius size approach for reconstructing track jets from highly boosted
massive particles presents problems in the identification of more than one charged jet given
the high degree of collimation. A variable radius (VR) approach is necessary to maintain
acceptable levels of efficiency.
    A VR jet is a jet that has been reconstructed with the use of a pT dependent effective
radius Reff [79]. It requires the definition of a parameter ρ that controls how the radius
changes as a function of pT ,
\[
R_{\mathrm{eff}}(p_T) = \min\left[\frac{\rho}{p_T},\, R_{\max}\right]. \tag{4.13}
\]
    In this analysis the VR track jets are reconstructed using the anti-kt algorithm, with
ρ = 30 GeV and the lower and upper bounds of the track-jet radius being Rmin = 0.02 and
Rmax = 0.4. The value of Rmin is dictated by the detector resolution. These provide the
optimal performance for high-pT Higgs jets decaying to b-quarks [80]. Figure 4.7 shows the
efficiency of subjet double b-labelling of track jets associated with a Higgs jet (in MC) using
different track jet clustering algorithms. The efficiency using the standard track jets with
R = 0.2 degrades sharply for Higgs jets with pT > 1 TeV.
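    The pT dependence of the effective radius can be illustrated with a short function. It is a
sketch rather than the clustering code itself: it evaluates Eq. (4.13) with the parameter values
quoted above and additionally applies the lower bound Rmin mentioned in the text. The function
name is an assumption made for this example.

```python
def vr_radius(pt_gev, rho_gev=30.0, r_min=0.02, r_max=0.4):
    """Effective radius of a VR track jet for a given pT in GeV.

    Eq. (4.13) gives min(rho / pT, R_max); the lower bound R_min quoted
    in the text is applied on top of it.
    """
    return max(r_min, min(rho_gev / pt_gev, r_max))
```

For example, a track jet with pT = 50 GeV sits at the upper bound of 0.4, one with
pT = 150 GeV has Reff = 0.2, and one with pT = 3 TeV is held at the lower bound of 0.02.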
Figure 4.7: Efficiency of subjet double b-labelling at the truth level of a Higgs jet as a
function of pT using VR track jets with Rmin = 0.02 and Rmax = 0.4 for different values of
ρ [80].
    Track jets are matched to large-R jets using a process called ghost-association [81]. This
procedure consists of treating track jets as infinitely soft particles by setting their pT to 1 eV.
This is done so that the reconstruction of the calorimeter jets is not affected. The ghosts are
added to the list of inputs of the jet-finding algorithm, which makes it possible to identify
which of them were clustered into which jets. The same technique is used to measure the ghost
area, the effective area of a jet. Instead of identifying tracks associated with the resulting
jets, the number of ghost particles present in the jet after reconstruction defines the effective
area of that jet.
4.3       b-hadron Identification
ATLAS uses various tagging algorithms to identify b-jets [82]. These are referred to as
b-tagging algorithms [83], and they exploit the long lifetime of b-hadrons as well as the
properties of the b-quark fragmentation. Measurable b-hadrons have a significant mean
flight length in the detector before decaying. This leads to an extra vertex displaced from
the hard-scatter collision point. An illustration of a track jet with a displaced secondary
vertex is shown in Figure 4.8.
b-tagging
Taggers can be divided into two main categories, low-level taggers and high-level taggers.
Low-level taggers are traditional track-based impact parameter taggers. Examples of these
include IP2D and IP3D, SV1 [85] and JetFitter [86]. They are based on a log-likelihood
ratio (LLR) discriminant that separates tracks associated to jets according to their compati-
bility to the primary vertex. IP2D and IP3D use the transverse impact parameter significance
as discriminating variables. The other low-level taggers, SV1 and JetFitter, are secondary
vertex-based b-tagging algorithms. All the discriminating variables produced by the low-level
taggers are used as inputs for the high-level algorithms. The high-level taggers used in this
thesis are MV2 [87] and DL1 [87].
Figure 4.8: Diagram of a track jet with a secondary vertex displaced from the primary vertex
[84].
    MV2 consists of a boosted decision tree (BDT) algorithm. It is trained using the ROOT
Toolkit for Multivariate Data Analysis (TMVA) [88] on a hybrid tt̄ + Z′ sample. The kinematic
properties of the jets are included in the training in order to exploit the correlations
with the other input variables. However, for b-jets and c-jets, pT and |η| are reweighted to
match the spectrum of the light-flavor jets.
    DL1 is based on an Artificial Deep Neural Network (DNN) trained using Keras [89]
with a Theano [90] backend. DL1 has a multidimensional output corresponding to the
probabilities for a jet to be a b, c or light-flavor jet. Similar to MV2, a reweighting of pT
and |η| is performed.
Figure 4.9: Distribution of the output discriminant for (a) MV2 and (b) DL1 b-tagging
algorithms [83].
DL1’s final discriminant is defined as:
\[
D_{\mathrm{DL1}} = \ln\left(\frac{p_b}{f_c \cdot p_c + (1 - f_c) \cdot p_{\mathrm{light}}}\right) \tag{4.14}
\]
where pb , pc and plight are the b-jet, c-jet and light-flavor jet probabilities, and fc is the
effective c-jet fraction in the background training sample. Figure 4.9 shows the distributions
of the MV2 and DL1 discriminants for the different types of jets.
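    Eq. (4.14) translates directly into code. The sketch below assumes the three network
outputs are already available as probabilities; the function name and any value of fc passed
to it are illustrative and are not the ones used by the ATLAS tagger.

```python
import math


def dl1_discriminant(p_b, p_c, p_light, f_c):
    """Eq. (4.14): log-ratio of the b-jet probability to the
    f_c-weighted mixture of the c-jet and light-flavor probabilities."""
    background = f_c * p_c + (1.0 - f_c) * p_light
    return math.log(p_b / background)
```

A jet for which the b-jet probability equals the weighted background probability has a
discriminant of zero, and more b-like jets take larger positive values.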
    The b-tagging algorithms are calibrated in terms of their efficiency working points (WP)
by making a cut on the discriminant values. The b-tagging efficiency is defined as
\[
\epsilon_{b\text{-tagging}} = \frac{N_{b\text{-tagged } b\text{-jets}}}{N_{\text{total } b\text{-jets}}}. \tag{4.15}
\]
    A cut on the corresponding discriminant is chosen such that the overall efficiency WP
in a desired kinematic range stays constant. The WP chosen for the study shown in this
thesis is the 77% efficiency WP. For every efficiency, there is also a background rejection rate
associated with it. The background rejection is calculated as the inverse of the b-mistag rate
(1/εbkg ). The light-flavor and c-flavor rejections are shown in Figure 4.10 as a function of
b-tagging efficiency for multiple b-taggers.
Figure 4.10: The (a) light-flavor jet (b) c-jet rejections versus the b-tagging efficiency [83].
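    The relation between a working point, the cut on the discriminant and the background
rejection can be sketched as follows. The helper names are hypothetical and the procedure is a
simplified stand-in for the ATLAS calibration: the cut is taken as the discriminant value
above which the target fraction of true b-jets in a labelled sample lies.

```python
def cut_for_efficiency(b_scores, target_eff):
    """Return the cut such that 'score >= cut' keeps about target_eff
    of the b-jet discriminant values in b_scores."""
    ordered = sorted(b_scores, reverse=True)
    n_keep = max(1, round(target_eff * len(ordered)))
    return ordered[n_keep - 1]


def efficiency(scores, cut):
    """Eq. (4.15) applied to any jet sample: fraction passing the cut."""
    return sum(score >= cut for score in scores) / len(scores)


def rejection(background_scores, cut):
    """Background rejection, the inverse of the mistag rate."""
    mistag = efficiency(background_scores, cut)
    return float("inf") if mistag == 0.0 else 1.0 / mistag
```

Applying `cut_for_efficiency` with a target of 0.77 to a sample of true b-jets gives the cut
for a 77% WP in that sample; `rejection` evaluated at the same cut on light-flavor or c-jets
gives the corresponding point on a curve like those of Figure 4.10.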
Double b-tagging
Since the target is H → bb̄, two b-jets must be identified. Several schemes for tagging
two b-jets have been developed [91]. The benchmarks are: double, asymmetric, single and
leading single b-tagging. Double b-tagging requires both of the two highest-pT track jets
to pass the b-tagging requirement. For asymmetric b-tagging, the track jet that
is more consistent with the interpretation of being a b-jet must pass a fixed efficiency working
point, while the b-tagging requirement on the second track jet is varied. In single b-tagging
at least one of the two highest-pT track jets must pass the b-tagging requirement, while in
leading single b-tagging only the highest-pT track jet must pass the b-tagging requirement.
Figure 4.11: The multijet rejection as a function of the Higgs tagging efficiency for large-R
jets with pT above (a) 250 GeV and above (b) 1000 GeV [91].
    The scheme adopted by the study performed in this thesis is double b-tagging as it is
the scheme with the largest multijet and top-jet rejection across a large range of Higgs
efficiencies. This can be seen in Figure 4.11.
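    The four benchmark schemes differ only in which of the two highest-pT track jets must pass
which requirement, as the sketch below illustrates. Track jets are represented as (pT, score)
tuples and a jet is considered tagged when its score reaches the cut; the function names and
this data layout are assumptions made for the example.

```python
def _two_leading(track_jets):
    """Two highest-pT track jets, ordered by decreasing pT."""
    return sorted(track_jets, key=lambda jet: jet[0], reverse=True)[:2]


def passes_scheme(track_jets, cut, scheme):
    """Evaluate the double, single or leading-single scheme."""
    tags = [score >= cut for _, score in _two_leading(track_jets)]
    if scheme == "double":
        return len(tags) == 2 and all(tags)
    if scheme == "single":
        return any(tags)
    if scheme == "leading_single":
        return bool(tags) and tags[0]
    raise ValueError(f"unknown scheme: {scheme}")


def passes_asymmetric(track_jets, fixed_cut, varied_cut):
    """Asymmetric scheme: the more b-like of the two leading track jets
    must pass the fixed cut, the other one the varied cut."""
    scores = sorted((score for _, score in _two_leading(track_jets)), reverse=True)
    return len(scores) == 2 and scores[0] >= fixed_cut and scores[1] >= varied_cut
```

Only the two leading track jets enter the decision, so a high score on a third, softer track
jet does not change the outcome in any of the schemes.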
4.4       Muons
Muon reconstruction [92] is performed both in the Inner Detector (ID) and in the Muon
Spectrometer (MS). This information is then combined to form the muon tracks that are
used in the analyses. The Monitored Drift Tubes (MDT) segments are reconstructed with a
straight-line fit to the hits of each layer. The Resistive Plate Chambers (RPC) and Thin Gap
Chambers (TGC) are used to measure the coordinate orthogonal to the bending plane. For
segments in the Cathode Strip Chambers (CSC), a separate combinatorial search in the η, φ
detector planes is performed. Muon track candidates are then built by combining together
all the segments in the different layers.
    There are four muon types defined depending on which subdetectors are used in their
reconstruction. They are: Combined (CB) muons, Segment-tagged (ST) muons, Calorimeter-tagged
(CT) muons and Extrapolated (ME) muons. Combinations of these are used to select
and identify muons.
    Muon identification selection is divided into four categories: Medium, Loose, Tight and
High-pT . In this analysis, Medium muons are used, the default ATLAS selection. This
selection uses only tracks from CB muons and ME muons and it minimizes the systematic
uncertainties associated with reconstruction and calibration.
    For CB muons, reconstruction follows an outside-in pattern recognition; the muons are
first reconstructed in the MS and then are extrapolated inward by matching to an ID track.
The independent tracks from the ID and MS are combined using a global fit. ME muon
reconstruction is based only on the MS track and a requirement on compatibility with the
interaction point (IP). ME muons are mainly used to extend the acceptance into the region
not covered by the ID (2.5 < |η| < 2.7).
    Muon isolation criteria optimized for different analyses are also defined [92]. Two variables
are used to make the isolation cuts, $p_T^{\mathrm{varcone30}}$ and $E_T^{\mathrm{topocone20}}$. The track-based
variable, $p_T^{\mathrm{varcone30}}$, is the scalar sum of the transverse momenta of the tracks with pT > 1
GeV in a cone of size ∆R = min(10 GeV/$p_T^{\mu}$, 0.3) around the muon of transverse momentum
$p_T^{\mu}$. The calorimeter-based isolation variable, $E_T^{\mathrm{topocone20}}$, is the sum of the transverse
energy of the topological clusters in a cone of size ∆R = 0.2 around the muon direction. The
contribution of the muon itself to the energy deposits is subtracted.
    The isolation working point (WP) used in this analysis is referred to as the FixedCutLoose
WP. The cuts on the discriminating variables are
\[
p_T^{\mathrm{varcone30}} / p_T^{\mu} < 0.15 \quad \text{and} \quad E_T^{\mathrm{topocone20}} / p_T^{\mu} < 0.30. \tag{4.16}
\]
Figure 4.12 shows the efficiency for the FixedCutLoose muon isolation working point as a
function of pT .
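    The isolation requirement can be written as a simple predicate. The sketch below assumes
that the two isolation sums have already been computed for the muon and are given in the same
units as its transverse momentum; the function names are illustrative.

```python
def varcone_radius(pt_mu_gev):
    """Cone size of the track-based isolation variable for a muon pT in GeV."""
    return min(10.0 / pt_mu_gev, 0.3)


def passes_fixed_cut_loose(pt_mu, ptvarcone30, topoetcone20):
    """Eq. (4.16): both relative isolation cuts of the FixedCutLoose WP."""
    return ptvarcone30 / pt_mu < 0.15 and topoetcone20 / pt_mu < 0.30
```

The track cone shrinks with increasing muon pT : it stays at 0.3 up to roughly 33 GeV and is
0.1 for a 100 GeV muon.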
 Figure 4.12: Isolation efficiency for the FixedCutLoose muon isolation working point [92].
Chapter 5
Boosted H → bb̄ Analysis
5.1       Introduction
After describing how the different physics objects are constructed from collisions in the de-
tectors we can now perform the analysis. The analysis has the goal of extracting information
by examining the events from real data gathered by the ATLAS detector using guidance from
MC based collision simulations. For this particular analysis, the goal is to measure the Higgs
boson production cross section inclusively, in the fiducial volume and differentially (in pT ),
using its H → bb̄ decay. The measurement of the Higgs boson cross section for pT > 1 TeV
is of particular interest and is the first of its kind made by the ATLAS Collaboration.
    Before this study, the measurement of the Higgs boson cross section using its H → bb̄
decay exploited the semileptonic decays of vector bosons in the V H production modes by
requiring additional triggers involving muons and missing energy [26], or in the VBF produc-
tion mode by requiring additional jets [93]. In this analysis, we do not impose any restrictions
on the production mode and simply require the presence of an extra jet accompanying the
Higgs. This selection allows for access to a large Higgs boson pT range using triggering with
large-R jets. Without this requirement, the large QCD backgrounds would require single
b-jet triggering which limits the mass range for which the two jets can be reconstructed [94],
eliminating the possibility of measuring the 125 GeV Higgs boson.
Signal Measurement
The measurements presented in this thesis use the signal strength µH extracted through a
binned likelihood fit. The signal strength is defined as the ratio of the observed yield of the
signal (NH ) over the Standard Model prediction (NSM ):
\[
\mu_H = \frac{N_H}{N_{\mathrm{SM}}}. \tag{5.1}
\]
    The measurement of the signal strength µH is then used to calculate the Higgs boson
production cross-sections for different phase space regions. The measured cross-section is
given by
\[
\sigma_H = \frac{N_H}{c \times L}, \tag{5.2}
\]
where c is a correction factor, L is the integrated luminosity and NH = µH NSM is the yield
of signal events. The correction factor is defined as c = Aε, where A is the fiducial acceptance
and ε is the selection efficiency. By this definition, the signal strength can also be defined as
the ratio between the measured and the expected signal cross sections, µH = σH /σSM .
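    The arithmetic connecting Eqs. (5.1) and (5.2) is shown in the sketch below. The function
name is hypothetical and any numbers passed to it are placeholders, not results of this
analysis.

```python
def cross_section(mu_h, n_sm, acceptance, efficiency, luminosity):
    """Eq. (5.2) with N_H = mu_H * N_SM and c = A * epsilon.

    The result is in the inverse units of the luminosity
    (fb if the luminosity is given in fb^-1).
    """
    n_h = mu_h * n_sm
    return n_h / (acceptance * efficiency * luminosity)
```

Because the cross section is linear in µH , the ratio of the value obtained for a fitted µH to
the value obtained for µH = 1 returns the signal strength itself, which is the statement
µH = σH /σSM .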
Maximum Likelihood Method
Suppose you have a set of random variables x that are distributed following a probability
density function (p.d.f.) f (x; Θ) with unknown parameters Θ = (Θ1 , . . . , Θm ). By choosing
a specific functional form of f (x; Θ) we can use the maximum likelihood method to estimate
the values of the parameters given a finite sample of data. The likelihood function is defined
as
\[
L(\Theta) = \prod_{i=1}^{n} f(x_i; \Theta) \tag{5.3}
\]
where xi are the different outcomes of repeated measurements. Then the estimators for the
parameters are those that maximize the likelihood function,
\[
\frac{\partial L}{\partial \Theta_i} = 0, \qquad i = 1, \ldots, m. \tag{5.4}
\]
    In our analysis the random variable xi is the number of events in a specific mass bin
given the signal and background models. Therefore xi ≡ xi (µ, θ) where µ is the set of
all the signal strengths for the signals and scale factors for the backgrounds, and θ are nuisance
parameters used for uncertainty estimation. In each bin, the expected number of events is
given by
\[
x_i(\mu, \theta) = \sum_{s \in \mathrm{sig}} \mu_s\, x_{i,s}(\theta) + \sum_{b \in \mathrm{bkg}} \mu_b\, x_{i,b}(\theta). \tag{5.5}
\]
The quantities xi,s and xi,b are taken from the Asimov datasets constructed using MC mass
templates of the resonances, or in the case of the QCD background, a parametric function.
When a maximum likelihood fit is performed on an Asimov dataset, the results are the
expected signal strengths and their expected uncertainties µ̂H = 1 ± σ̂(µH ). The hat on the
parameters is used to identify them as estimators (parameter values for which the likelihood
function is a maximum) of the real parameters. The specifics of the MC templates for the
signal and backgrounds are explored in the modeling sections 5.5 and 5.6.
    The functional form of the p.d.f. is a Poisson distribution. The Poisson distribution
expresses the probability of a given number of events occurring in a fixed interval. The
binned likelihood function therefore takes the following form:
\[
L(\mu, \theta) = \prod_{i \in \mathrm{bins}} \frac{\left(x_i(\mu, \theta)\right)^{N_i}}{N_i!}\, e^{-x_i(\mu, \theta)}, \tag{5.6}
\]
where Ni is the number of data events in bin i.
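    A toy version of the binned likelihood fit illustrates Eqs. (5.5) and (5.6). The sketch makes
strong simplifying assumptions that do not hold in the analysis: a single signal strength, a
fixed background template and no nuisance parameters. The minimization uses a golden-section
search purely to keep the example free of external dependencies; it is not the fitting
machinery used for the results in this thesis, and all names and templates are illustrative.

```python
import math


def expected_yields(mu, signal, background):
    """Eq. (5.5) for one signal with strength mu and a fixed background."""
    return [mu * s + b for s, b in zip(signal, background)]


def negative_log_likelihood(mu, data, signal, background):
    """Minus the logarithm of the Poisson likelihood of Eq. (5.6)."""
    total = 0.0
    for n, x in zip(data, expected_yields(mu, signal, background)):
        # log of x^n e^{-x} / n!, with lgamma(n + 1) = log(n!)
        total -= n * math.log(x) - x - math.lgamma(n + 1.0)
    return total


def fit_signal_strength(data, signal, background, low=0.0, high=5.0, tol=1e-8):
    """Golden-section search for the mu that minimizes the NLL on [low, high]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = low, high
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if negative_log_likelihood(c, data, signal, background) < \
                negative_log_likelihood(d, data, signal, background):
            b = d
        else:
            a = c
    return 0.5 * (a + b)
```

Fitting an Asimov dataset built with `expected_yields(1.0, signal, background)` returns a
signal strength of one within the numerical tolerance, which is the behaviour described above
for the expected result.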
    The nuisance parameters θ = (θ1 , . . . , θm ) encode the dependence on systematic uncertainties.
The prior knowledge on these parameters is used to disfavor displacements from the
nominal value. They enter the likelihood function as multiplicative constraint terms, which are
Gaussian distributions centered at 0 with a standard deviation of 1:
\[
g(\theta_m) = \frac{1}{\sqrt{2\pi}}\, e^{-\theta_m^2/2}. \tag{5.7}
\]
Confidence Levels
To test the compatibility of a background-only model with the data or derive confidence
limits on our observations we use a test statistic. According to the Neyman-Pearson lemma,
we can consider the function constructed from a ratio of likelihood functions for a set of
parameters that describe the null hypothesis H0 and a set of parameters that describe the
alternate hypothesis H1 as the best hypothesis discriminator
\[
q = -2 \ln \frac{L(H_0)}{L(H_1)}. \tag{5.8}
\]
This reduces to a χ2 distribution in the large sample limit. The fit result is obtained by
maximizing the log-likelihood function with respect to all the parameters. For a discovery,
the null hypothesis H0 is the likelihood function where µ = 0 (no signal). More specifically
in our case, where the observed signal is small, we can build confidence limits using the CLs
[95] method based on the test statistic
\[
q_\mu = -2 \ln \frac{L(\mu, \hat{\hat{\theta}}(\mu))}{L(\hat{\mu}, \hat{\theta})}, \tag{5.9}
\]
where µ̂, θ̂ are the parameters that maximize the constrained (0 ≤ µ̂ ≤ µ) likelihood and
$\hat{\hat{\theta}}(\mu)$ are the nuisance parameters which maximize the likelihood function for a given value
of µ. The variance of µ̂ can be directly calculated using the test statistic
\[
\sigma_{\hat{\mu}}^2 = \frac{(\mu - \hat{\mu})^2}{q_\mu}. \tag{5.10}
\]
The p-value of a hypothesized µ and the corresponding significance are given by
\[
p_\mu = 1 - \Phi(\sqrt{q_\mu}) \quad \text{and} \quad Z_\mu = \sqrt{q_\mu}, \tag{5.11}
\]
where Φ is the cumulative distribution function for the standard Gaussian. Finally the upper
limit of an estimator at a 1 − α confidence level is given by
\[
\mu_{\mathrm{up}} = \hat{\mu} + \sigma_{\hat{\mu}}\, \Phi^{-1}(1 - \alpha). \tag{5.12}
\]
Therefore, for a 95% upper limit, α would be 0.05.
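    Eqs. (5.10) to (5.12) can be evaluated with the standard normal distribution from the Python
standard library. The sketch below only illustrates these asymptotic formulas; it does not
implement the full CLs construction, and the function names are assumptions made for this
example.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def sigma_mu_hat(mu, mu_hat, q_mu):
    """Eq. (5.10): standard deviation of mu-hat from the test statistic."""
    return abs(mu - mu_hat) / math.sqrt(q_mu)


def p_value(q_mu):
    """Eq. (5.11): p-value of the hypothesized mu."""
    return 1.0 - STANDARD_NORMAL.cdf(math.sqrt(q_mu))


def upper_limit(mu_hat, sigma, alpha=0.05):
    """Eq. (5.12): upper limit at the 1 - alpha confidence level."""
    return mu_hat + sigma * STANDARD_NORMAL.inv_cdf(1.0 - alpha)
```

For α = 0.05 the quantile Φ−1 (0.95) is approximately 1.645, so the 95% upper limit lies about
1.645 standard deviations above the best-fit value.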
5.2      Samples
Data
The data used were collected by the ATLAS detector at √s = 13 TeV during Run 2 (2015-2018)
of the LHC. All the events come from the Good Runs Lists (GRL) [96]. The events
in the GRL have collisions with bad detector performance removed. Table 5.1 lists all the
GRL datasets used.
Table 5.1: Summary of the Good Run List (GRL) datasets used in this analysis. The data
were collected during Run 2 of the LHC.
  Year   Dataset Name
  2015   data15_13TeV.periodAllYear_DetStatus-v89-pro21-02_Unknown_PHYS_StandardGRL_All_Good_25ns.xml
  2016   data16_13TeV.periodAllYear_DetStatus-v89-pro21-01_DQDefects-00-02-04_PHYS_StandardGRL_All_Good_25ns.xml
  2017   data17_13TeV.periodAllYear_DetStatus-v99-pro22-01_Unknown_PHYS_StandardGRL_All_Good_25ns_Triggerno17e33prim.xml
  2018   data18_13TeV.periodAllYear_DetStatus-v102-pro22-04_Unknown_PHYS_StandardGRL_All_Good_25ns_Triggerno17e33prim.xml
    The GRL list of events sums up to an integrated luminosity of 139 fb−1 [97]. This quantity
is less than the total integrated luminosity (156 fb−1 ) that was delivered by the LHC due
to the quality requirements and trigger efficiencies. Table 5.2 summarizes the integrated
luminosity and its uncertainties for each year of data taking period included in the Run 2
dataset. The uncertainty sources can be correlated between all years, correlated between a
subset of years or completely uncorrelated. Therefore the relative uncertainty on the total is
not simply a weighted sum of the relative uncertainties on the individual years. It is obtained
from the covariance matrix of the absolute luminosity uncertainties for the different years, in
which correlated sources are represented by non-zero off-diagonal entries.
Table 5.2: Summary of the integrated luminosities and uncertainties for the Run 2 pp data
sample at √s = 13 TeV [97].
                                  Data sample         Luminosity         Uncertainty
                                   2015-2016            36.2 fb−1            2.1 %
                                       2017             44.3 fb−1            2.4 %
                                       2018             58.5 fb−1            2.0 %
                                      Total            139.0 fb−1            1.7 %
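    The role of the correlations can be made explicit with a small calculation based on the
per-period values of Table 5.2. The correlation matrices used below are hypothetical limiting
cases, not the correlation model of [97], so the sketch is not expected to reproduce the quoted
total uncertainty; the function name is likewise an assumption.

```python
import math


def total_relative_uncertainty(luminosities, relative_uncertainties, correlation):
    """Relative uncertainty on the summed luminosity.

    The covariance matrix of the absolute uncertainties is built as
    corr[i][j] * sigma_i * sigma_j and summed over all entries.
    """
    absolute = [l * r for l, r in zip(luminosities, relative_uncertainties)]
    variance = 0.0
    for i, sigma_i in enumerate(absolute):
        for j, sigma_j in enumerate(absolute):
            variance += correlation[i][j] * sigma_i * sigma_j
    return math.sqrt(variance) / sum(luminosities)


# Per-period values from Table 5.2 (fb^-1 and fractional uncertainty)
LUMINOSITIES = [36.2, 44.3, 58.5]
RELATIVE = [0.021, 0.024, 0.020]

# Hypothetical limiting cases for the correlations between periods
UNCORRELATED = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
FULLY_CORRELATED = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
```

With these inputs the fully uncorrelated case gives about 1.3% and the fully correlated case
about 2.2%. The 1.7% quoted in Table 5.2 lies between the two, as expected for a mixture of
correlated and uncorrelated sources.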
     In this analysis, the data amounts to an integrated luminosity of 136 fb−1 due to the
specific large-R jet and muon trigger requirements. Each event satisfies a trigger that
requires a large-R (R = 1.0) jet reconstructed using the anti-kt algorithm. For each year of
data taking, the jet pT and mass thresholds for triggers differ due to changes in luminosity
profiles, inclusion of new techniques [98] and generally different beam conditions. The jet
pT thresholds range from 360 to 460 GeV and the mass threshold is either 0, 30 or 35 GeV. The
additional muon triggering is used to fill a control region for top quarks that requires a muon
with pT > 50 GeV [99]. Table 5.3 summarizes the triggers used in this analysis. Plots for
the efficiency curves for the triggers used in this analysis are presented in Appendix B.
Table 5.3: Summary of the triggers used in this analysis. Large-R jet triggers are used to
identify candidate jets, while the muon trigger is used to fill the tt̄ control region.
    Year        Trigger                                    Threshold                    Luminosity [fb−1 ]
  Large-R jet triggers
    2015        HLT_j360_a10_lcw_sub_L1J100                pT > 360 GeV                 3.2
    2016        HLT_j420_a10_lcw_L1J100                    pT > 420 GeV                 33.0
    2017        HLT_j390_a10t_lcw_jes_30smcINF_L1J100      pT > 390 GeV, mJ > 30 GeV    41.0
                HLT_j440_a10t_lcw_jes_L1J100               pT > 440 GeV                 41.2
                HLT_j420_a10t_lcw_jes_35smcINF_L1J100      pT > 420 GeV, mJ > 35 GeV    58.5
    2018        HLT_j420_a10t_lcw_jes_35smcINF_L1SC111     pT > 420 GeV, mJ > 35 GeV    55.4
                HLT_j460_a10t_lcw_jes_L1J100               pT > 460 GeV                 58.5
  Muon triggers
    All years   HLT_mu50_L1_MU20                           $p_T^{\mu}$ > 50 GeV                139
Simulated Samples
Monte Carlo (MC) programs are used to simulate events which then are used to model the
signal and backgrounds pertinent to this analysis. The signal consists of the Higgs production
processes ggF, VBF, VH and tt̄H. The background samples include W +jets, Z+jets, top
quark production and dijet events. Table 5.4 has a summary of all the simulated samples
used in the analysis.
Signal Samples
Higgs production through ggF is simulated at NLO QCD accuracy, including the finite
top mass effects, with the Hj-MINLO (Multi-Scale Improved NLO) [100] prescription using
Powheg Box v2 [101] [102]. In a similar manner, NLO accuracy in QCD is achieved for
VBF and tt̄H production [103] [104]. gg → V H production at LO accuracy is calculated
using Powheg Box v2 as calculations at NLO required new developments [105] not pub-
lished at the time this analysis was performed. qq → V H [106] is calculated at NLO accuracy
using GoSam [107]. Electroweak (EW) NLO corrections are also applied as a function of
the Higgs boson momentum for the VBF, VH and tt̄H production modes using HAWK
[108] (see Appendix B). Finally the branching ratios are calculated using Hdecay [109] and
Prophecy4F [110].
Background Samples
Vector boson + jets was simulated using Sherpa [111] with NLO QCD accuracy. NLO EW
approximate corrections were applied, which reduced the predicted yield by 10-20%. The
NNLOJET group provided NNLO QCD custom corrections as a function of the generated
vector-boson momentum (pVT ) that are then applied on top of the NLO EW corrections.
    Every top quark production mode was modeled using Powheg Box v2 at NLO QCD.
Top quark pair production, tW and single-top t-channel and s-channel are all included [112]
[113].
    QCD multijet events were modeled using a parametric model. The MC used to study
the model was generated by Pythia 8.230 [114].
    After generation, hadronization and showering every event is put through an ATLAS
detector simulation that is based on Geant4 [41]. Pileup and multi-particle interactions
were also modeled using Pythia 8.186 with the A3 tuning [115] and were also fed through
the same ATLAS detector simulation.
Table 5.4: Summary of the simulated samples for the signal and background processes [10].
  Process                 ME generator                    ME PDF         PS and hadronization   UE model tune   Cross-section order
  Higgs Boson
  gg → H → bb̄            Powheg Box v2 + MINLO           NNPDF3.0NNLO   Pythia 8.212           AZNLO           NLO(QCD) + LO(EW)
  qq → H → q′q′bb̄        Powheg Box v2                   NNPDF3.0NLO    Pythia 8.230           AZNLO           NLO(QCD) + NLO(EW)
  qq → W H → q′q′bb̄      Powheg Box v2 + MINLO + GoSam   NNPDF3.0NLO    Pythia 8.240           AZNLO           NNLO(QCD) + NLO(EW)
  qq → W H → lνbb̄        Powheg Box v2 + MINLO + GoSam   NNPDF3.0NLO    Pythia 8.212           AZNLO           NNLO(QCD) + NLO(EW)
  qq → ZH → q q̄bb̄        Powheg Box v2 + MINLO + GoSam   NNPDF3.0NLO    Pythia 8.240           AZNLO           NNLO(QCD) + NLO(EW)
  qq → ZH → ννbb̄         Powheg Box v2 + MINLO + GoSam   NNPDF3.0NLO    Pythia 8.212           AZNLO           NNLO(QCD) + NLO(EW)
  qq → ZH → llbb̄         Powheg Box v2 + MINLO + GoSam   NNPDF3.0NLO    Pythia 8.212           AZNLO           NNLO(QCD) + NLO(EW)
  gg → ZH → q q̄bb̄        Powheg Box v2                   NNPDF3.0NLO    Pythia 8.240           AZNLO           LO + NLL(QCD)
  gg → ZH → ννbb̄         Powheg Box v2                   NNPDF3.0NLO    Pythia 8.212           AZNLO           LO + NLL(QCD)
  gg → ZH → llbb̄         Powheg Box v2                   NNPDF3.0NLO    Pythia 8.212           AZNLO           LO + NLL(QCD)
  gg → tt̄H → all         Powheg Box v2                   NNPDF3.0NLO    Pythia 8.230           AZNLO           NLO(QCD) + NLO(EW)
  gg → tt̄H → all         Powheg Box v2                   NNPDF3.0NLO    Pythia 8.230           AZNLO           NLO(QCD) + NLO(EW)
  Vector Boson + jets
  W → q q̄                Sherpa 2.2.8                    NNPDF3.0NNLO   Sherpa 2.2.8           Default         NNLO(QCD) + approx NLO(EW)
  Z → q q̄                Sherpa 2.2.8                    NNPDF3.0NNLO   Sherpa 2.2.8           Default         NNLO(QCD) + approx NLO(EW)
  Top quark
  tt̄ → all               Powheg Box v2                   NNPDF3.0NLO    Pythia 8.230           A14             NNLO + NNLL
  tW                      Powheg Box v2                   NNPDF3.0NLO    Pythia 8.230           A14             NLO
  t t-channel             Powheg Box v2                   NNPDF3.0NLO    Pythia 8.230           A14             NLO
  t s-channel             Powheg Box v2                   NNPDF3.0NLO    Pythia 8.230           A14             NLO
  Multijet
  Dijets                  Pythia 8.230                    NNPDF2.3LO     Pythia 8.230           A14             LO
5.3               Object Definition
Object Reconstruction
A Lorentz boosted Higgs boson event has a topology of the form: pp → H(→ bb̄) + j.
Therefore the events of interest are best described by two large-R (R = 1.0) jets, one of
which contains the decay products of two b-hadrons. The large-R jets are defined
by applying the anti-kt algorithm, using the software package FastJet [116], to topological
clusters of calorimeter energy deposits. A jet trimming procedure is employed with parameters
R = 0.2 and $p_T^{\mathrm{subjet}}/p_T^{\mathrm{jet}}$ < 0.05. The jet mass mJ is defined as the combined mass, a
weighted combination of the calorimeter based mass and the track-assisted jet mass.
       Variable radius (VR) track jets are formed using the anti-kt algorithm with parameters
Reff = ρ/pT where ρ = 30 GeV and have an upper bound of Rmax = 0.4. Ghost association
is used to match large-R jets (before trimming) to VR track jets. For simulation events,
track jets are labeled as having b, c or light (u, d, s and g) flavor by truth matching hadrons
with pT > 5 GeV within ∆R = 0.3 of the jet axis [87].
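As an illustration of the variable-radius definition above, the effective radius of a track jet can be sketched as below. This is only a minimal sketch: the function name is illustrative, and the lower bound r_min = 0.02 is an assumption that is not stated in this section.

```python
def vr_track_jet_radius(pt_gev, rho_gev=30.0, r_max=0.4, r_min=0.02):
    """Effective radius R_eff = rho / pT, bounded above by r_max.

    r_min is an assumed lower bound (it is not quoted in the text above).
    """
    if pt_gev <= 0:
        raise ValueError("pT must be positive")
    return min(max(rho_gev / pt_gev, r_min), r_max)


if __name__ == "__main__":
    for pt in (10.0, 75.0, 150.0, 600.0):
        print(f"pT = {pt:6.1f} GeV -> R_eff = {vr_track_jet_radius(pt):.3f}")
```

Low-pT track jets saturate at the upper bound of 0.4, while the radius shrinks as 1/pT for harder track jets, which helps to resolve the two b-hadron decays inside a highly boosted large-R jet.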
     The b-tagger MV2 is used to tag VR track jets containing a b-hadron decay. Track jets
must have pT > 10 GeV and |η| < 2.5. At least two track jets per event are considered. The
working point is tuned to have an average b-tagging efficiency of 77% for b-jets in simulated
tt̄ events. The misidentification efficiencies are 0.9% for light-jets and 25% for c-jets. To
prevent overlap between VR track jets, if the ∆R between any two track jets with pT > 5
GeV associated with a large-R jet [91] is less than their respective radii, then the jet is not
considered for b-tagging.
     Muons satisfy the Medium quality criterion. Muons have to satisfy |η| < 2.5 and
pT^µ > 10 GeV. Isolated muons also have to satisfy loose track and calorimeter based isolation
conditions [92].
     For a more detailed view of the object reconstruction, algorithms, calibrations and ref-
erences used refer to Chapter 4.
Analysis Object Definitions
Reconstructed jets that have the properties compatible with a H → bb̄ decay are labeled
candidate jets. Theoretically the Higgs boson and the hadronic recoil system both have
the same pT . In practice, the reconstructed jet pT is affected by final state radiation, jet
resolution and any other activity outside the jet cone like pile-up. From simulation it is
estimated that roughly 50% of the Higgs jets are leading jets (jet with the highest pT )
and 47% are subleading jets (jet with the second largest pT ). Figure 5.1 illustrates this
phenomenon using simulated ggF events. For this reason, candidate jets are defined as
either the leading or sub-leading jet that has a pT > 250 GeV, has |η| < 2.0 and satisfies the
boosted condition: 2mJ /pT < 1. Each candidate jet must contain at least two track jets.
A candidate jet is classified as double-tagged if its two leading track jets are b-tagged, or
as anti-tagged if neither of them is b-tagged.
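The candidate jet requirements and the tagging categories above can be summarized in a short sketch. The class and function names are hypothetical and are not taken from the analysis software; the requirement that the jet be the leading or subleading jet of the event is assumed to be checked elsewhere.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LargeRJet:
    pt: float    # GeV
    eta: float
    mass: float  # GeV
    # b-tag decisions of the associated VR track jets, ordered by decreasing pT
    track_jet_btags: List[bool] = field(default_factory=list)


def is_candidate(jet: LargeRJet) -> bool:
    """Kinematic and boosted-topology requirements quoted in the text."""
    return (
        jet.pt > 250.0
        and abs(jet.eta) < 2.0
        and 2.0 * jet.mass / jet.pt < 1.0
        and len(jet.track_jet_btags) >= 2
    )


def tag_category(jet: LargeRJet) -> Optional[str]:
    """Return 'double-tagged', 'anti-tagged', or None for any other case."""
    if not is_candidate(jet):
        return None
    first, second = jet.track_jet_btags[:2]
    if first and second:
        return "double-tagged"
    if not first and not second:
        return "anti-tagged"
    return None  # exactly one of the two leading track jets is b-tagged
```

A jet with exactly one b-tag among its two leading track jets falls into neither category in this sketch, since the text defines only the double-tagged and anti-tagged cases.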
Figure 5.1: Difference between the pT of the Higgs matched large-R jet and the recoil jet.
Higgs jets are leading pT if the difference is positive (right of the dashed line) or subleading
if the difference is negative (left of the dashed line).
    The presence of semileptonic b-hadron decays motivates the application of a correction
to candidate jets. The ‘muon in jet’ correction uses the leading-pT muon found within
∆R = min(0.4, 0.04 + 10/pT^µ) of a b-tagged VR track jet. The correction consists of removing
the energy deposited by the muon in the calorimeter and adding its four-momentum to the
trimmed large-R jet. The correction is of the order of 13% for leading Higgs jets and 33%
for subleading Higgs jets in simulated ggF events. The mJ width is reduced by 5% and 12%
respectively. The uncorrected momentum and mass in candidate jets are denoted as p0T and
m0J respectively.
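The pT-dependent matching cone can be sketched as follows, assuming the form ∆R = min(0.4, 0.04 + 10 GeV/pT^µ), which is the same pT-dependent term that appears in the muon requirement of Table 5.6. The function name is illustrative only.

```python
def muon_in_jet_cone(muon_pt_gev):
    """Matching cone for the muon-in-jet correction (muon pT in GeV)."""
    if muon_pt_gev <= 0:
        raise ValueError("muon pT must be positive")
    return min(0.4, 0.04 + 10.0 / muon_pt_gev)


if __name__ == "__main__":
    for pt in (10.0, 50.0, 200.0):
        print(f"muon pT = {pt:5.1f} GeV -> cone = {muon_in_jet_cone(pt):.3f}")
```

The cone is capped at 0.4 for soft muons and narrows for hard muons, which are more collimated with the b-hadron direction.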
5.4      Event Selection
Events are classified into three separate regions: a signal region (SR), a control region (CRtt̄ )
and a validation region (VR). The SR is used to calculate the signal strength. The CRtt̄ is
used as a control region to study top quark events. The VR is used to test the multijet and
V +jets models.
   For both the SR and the VR at least one jet with p0T > 450 GeV and m0J > 60 GeV is
required. A second jet with p0T > 200 GeV is also required. At least one of these two jets
has to satisfy the candidate jet criteria. The SR and VR are subdivided into
subregions according to the pT ordering of the candidate jets. Figure 5.2 summarizes the
regions used for event categorization in the analysis.
[Diagram: leading large-R jet versus subleading large-R jet, each classified as not a candidate or by its number of b-tags; the resulting regions are labeled VRL, VRS, SRL and SRS.]
Figure 5.2: Diagram showing the event categorization criteria. Candidate jets are categorized
by the number of ghost-associated b-tagged track jets as well as their pT ordering in the event.
Signal Region
For an event to be assigned to the SR it must contain a double-tagged candidate jet. If the
double-tagged jet is the leading jet, the event populates the leading-jet signal region (SRL).
If the double-tagged jet is not the leading jet but the subleading jet, the event populates
the subleading-jet signal region (SRS). Figure 5.3 shows the mass distributions for all the
processes that contribute to the signal region.
Figure 5.3: Jet mass distributions for the Higgs boson, Z+jets, W+jets, and top quark
contributions from the SM prediction as well as the multijet jet mass distribution extracted
from data in the signal region (SR) defined by the leading (a) and subleading (b) jets [10].
SR Configurations
The signal is extracted in three different SR configurations, providing three measurements of
the Higgs cross-section. First, for the inclusive measurement, the Higgs boson signal strength
µH is extracted from the signal region containing candidate jets with pT > 250 GeV. Second,
a fiducial measurement is performed on the fiducial volume where the candidate jets have
pT > 450 GeV and |yH | < 2, as defined by the acceptance cuts of this analysis. Finally, in the
differential measurement, the signal strength extraction is performed for candidate
jets in the pT ranges 250-450 GeV, 450-650 GeV, 650-1000 GeV and > 1 TeV. The bin with
250 < pT < 450 GeV is populated only by candidate jets from the subleading signal region
(SRS).
Table 5.5: Summary of the candidate jet pT requirements for the three Signal Region
configurations [10].
                          Candidate jet pT (GeV)
      Region          SRL                              SRS
      Inclusive       > 450                            > 250
      Fiducial        > 450                            > 450
      Differential    450 − 650, 650 − 1000, > 1000    250 − 450, 450 − 650, 650 − 1000
Validation Region
Similarly to the SR, the VR is subdivided into the leading-jet validation region (VRL)
and the subleading-jet validation region (VRS). The main difference lies in the b-tagging
requirements. Every event in the VR must contain an anti-tagged candidate jet.
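The assignment of events to the four regions can be sketched as below. The order of precedence used here (SR before VR, leading jet before subleading jet) is an assumption made for illustration and is not spelled out in the text; the event-level requirements on p0T and m0J are assumed to have been applied already, and the function name is hypothetical.

```python
from typing import Optional


def assign_region(leading_tag: Optional[str],
                  subleading_tag: Optional[str]) -> Optional[str]:
    """Map the tag categories of the two leading large-R jets to a region.

    Each argument is 'double-tagged', 'anti-tagged', or None when the jet is
    not a candidate jet or falls into neither tagging category.
    """
    if leading_tag == "double-tagged":
        return "SRL"
    if subleading_tag == "double-tagged":
        return "SRS"
    if leading_tag == "anti-tagged":
        return "VRL"
    if subleading_tag == "anti-tagged":
        return "VRS"
    return None  # the event enters neither the SR nor the VR


if __name__ == "__main__":
    print(assign_region(None, "double-tagged"))   # SRS
    print(assign_region("anti-tagged", None))     # VRL
```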
Control Region
To constrain the tt̄ background, a dedicated control region CRtt̄ was defined. A high purity
of top quark pair events is achieved by using the muon trigger to select events in which one of
the top quarks decays semileptonically (t → lνb) while the other decays hadronically (t → qq′b).
Each large-R jet in this region must have at least one b-tagged VR track jet associated with
it. The large-R jet that has a nearby isolated muon with pT > 52.5 GeV is labeled Jb
and is associated with the semileptonically decaying top quark. On the other hand, for the
hadronically decaying top, the large-R jet Jt is required to have at least three associated
VR track jets. These two jets (Jb , Jt ) must have an angular separation of ∆φ > 2π/3 to ensure
a back-to-back topology. Table 5.6 summarizes the selection criteria used for the tt̄ control
region CRtt̄ .
Table 5.6: Summary of the CRtt̄ selection criteria for the semileptonically decaying top quark
(Jb ) and the hadronically decaying top quark (Jt ) [10].
      Jet    N track-jets    N b-tags    Angular selection                      Jet mass [GeV]
      Jb     ≥ 1             1           0.04 + 10/pT^µ < ∆R(µ, Jb) < 1.5       −
      Jt     ≥ 3             1           ∆φ(Jb, Jt) > 2π/3                      140 − 200
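The angular requirements of Table 5.6 can be illustrated with a small sketch. The helper names are hypothetical, the muon pT is taken in GeV, and ∆R is computed here from pseudorapidity and azimuthal angle, which is an assumption about the convention used.

```python
import math


def delta_phi(phi1, phi2):
    """Absolute azimuthal separation, wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def passes_crtt_angles(muon, jb, jt):
    """muon = (pT, eta, phi); jb and jt = (eta, phi) of the two large-R jets."""
    mu_pt, mu_eta, mu_phi = muon
    dr_mu_jb = delta_r(mu_eta, mu_phi, jb[0], jb[1])
    muon_near_jb = 0.04 + 10.0 / mu_pt < dr_mu_jb < 1.5
    back_to_back = delta_phi(jb[1], jt[1]) > 2.0 * math.pi / 3.0
    return muon_near_jb and back_to_back
```

Wrapping ∆φ into [0, π] matters for jets on either side of the φ = ±π boundary, where a naive difference would overestimate the separation.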
5.5       Higgs Boson Modeling
The selection criteria chosen in this analysis provide an inclusive view of the Higgs boson
in terms of its four main production modes. Therefore the production modes considered are
ggF, V H, VBF and tt̄H. Considering Higgs bosons near the mass peak (105 < mJ < 140
GeV) and pT > 450 GeV, the largest contribution for Higgs production comes from the ggF
process. On the other hand, in the SRS (pT < 450 GeV) the largest production mode is
tt̄H. In this bin, highly energetic hadronic tops can satisfy the trigger requirements in events
where the Higgs boson has low pT due to the nature of three body decays. Table 5.7 shows
the relative contribution of the main production modes in the SR. The Higgs boson also
contributes to the VR to a lesser extent. The breakdown of the contribution of the Higgs
boson to the SR and VR as a function of mass is shown in Figure 5.4 and as a function of
pT in Figure 5.5.
Table 5.7: Fractional contribution of each production mode to the SR configurations around
the Higgs boson mass peak (105 < mJ < 140 GeV) [10].
                                Jet pT range (GeV)
      Process    250 − 450    450 − 650    650 − 1000    > 1000
      SRL
      ggF            −           0.56          0.50         0.39
      VBF            −           0.17          0.16         0.17
      VH             −           0.14          0.18         0.25
      tt̄H            −           0.13          0.16         0.19
      SRS
      ggF           0.28         0.46          0.43          −
      VBF           0.07         0.19          0.21          −
      VH            0.26         0.24          0.26          −
      tt̄H           0.39         0.11          0.10          −
    Multiple modeling systematics were considered for the Higgs boson: factorization and
renormalization scale variations, cross-section and acceptance, PDF uncertainty,
jet shower systematics and EW correction uncertainties. For the factorization and
renormalization scale variations, a 7-point variation of the scales µR and µF was performed.
The variation was found to have a flat effect on the cross section of 2% for ggF, 0.5% for
VBF, 5% for V H and 13% for tt̄H. For the shower systematics, Pythia and Herwig samples at
the truth level were compared. From these samples, pT and mJ dependent reweighting maps were
constructed, applied to the reconstructed level samples and used to estimate the uncertainty.
The shower systematics were found to be negligible, as no substantial differences were seen
in the mJ shape in the resonance peak region. Appendix A contains plots related to the studies
of the jet parton showers as well as the EW correction systematics [117].
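For reference, the scale combinations of a conventional 7-point variation can be enumerated as in the sketch below. The factor of two and the use of an envelope around the nominal value are the usual conventions and are assumptions here, since the text does not state how the variations were combined. The function names and the toy cross sections are illustrative only.

```python
from itertools import product


def seven_point_variations(factor=2.0):
    """Return the (kR, kF) scale multipliers of a 7-point variation.

    The factor of 2 is the conventional choice and an assumption here.
    """
    steps = (1.0 / factor, 1.0, factor)
    # Drop the two combinations where the scales move in opposite directions.
    return [(kr, kf) for kr, kf in product(steps, repeat=2)
            if not (kr != 1.0 and kf != 1.0 and kr != kf)]


def scale_envelope(xsec_by_scale):
    """Relative (up, down) envelope around the nominal (1, 1) cross section."""
    nominal = xsec_by_scale[(1.0, 1.0)]
    values = list(xsec_by_scale.values())
    return (max(values) - nominal) / nominal, (nominal - min(values)) / nominal


if __name__ == "__main__":
    points = seven_point_variations()
    # Toy cross sections (arbitrary units), not values from the analysis.
    toy = {p: 1.0 for p in points}
    toy[(0.5, 0.5)] = 1.02
    toy[(2.0, 2.0)] = 0.98
    print(points)
    print(scale_envelope(toy))
```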
Figure 5.4: Breakdown of the Higgs boson contributions to the mass peak for the different
production modes for signal (a,b) and validation regions (c,d). The plots on the left (a,c)
show the contribution to the leading regions and on the right (b,d) to the subleading regions.
Higgs Boson Resolution
The pT and mass resolutions of the Higgs jet are studied by truth matching candidate jets to
a Higgs boson. A Gaussian is then fitted to the difference between the reconstructed jet and
the truth jet values, and the resolution is taken as the standard deviation of the
fitted Gaussian. The impact of pile-up (PU) on the mass resolution was also studied and
shown to be small, given that the large-R jets are subject to a trimming procedure. The pT
resolution is shown in Table 5.8 and the mass resolution is shown in Table 5.9.
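The resolution extraction can be illustrated with a minimal sketch. It uses an unbinned maximum-likelihood Gaussian fit, for which the fitted mean and width are simply the sample mean and standard deviation of the residuals; the analysis itself may have used a binned fit, so this is an assumption. The quoted uncertainty on the width uses the large-sample approximation σ/√(2n), and the toy inputs are not analysis values.

```python
import math
import random
import statistics


def gaussian_resolution(reco, truth):
    """Fit a Gaussian to (reco - truth) by unbinned maximum likelihood.

    Returns (mean, sigma, sigma_err); sigma is the resolution estimate.
    """
    residuals = [r - t for r, t in zip(reco, truth)]
    mean = statistics.fmean(residuals)
    sigma = statistics.pstdev(residuals, mean)
    sigma_err = sigma / math.sqrt(2.0 * len(residuals))
    return mean, sigma, sigma_err


if __name__ == "__main__":
    random.seed(1)
    # Toy sample: truth mass of 125 GeV smeared with an 11 GeV Gaussian.
    truth = [125.0] * 20000
    reco = [random.gauss(125.0, 11.0) for _ in truth]
    mean, sigma, sigma_err = gaussian_resolution(reco, truth)
    print(f"bias = {mean:.2f} GeV, resolution = {sigma:.2f} +- {sigma_err:.2f} GeV")
```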
Figure 5.5: Breakdown of the contributions for the different Higgs boson production modes
for the (a) SRL and (b) SRS as a function of pT .
Table 5.8: Momentum resolution of the candidate jets truth matched to a Higgs boson for
ggF events.
                              pT Resolution [GeV]
      pT [GeV]              Leading        Subleading
      250 < pT < 450           −           38.3 ± 1.2
      450 < pT < 650       29.3 ± 0.3      39.9 ± 1.2
      650 < pT < 1000      38.3 ± 0.9      56.1 ± 2.1
Table 5.9: Mass resolution of the candidate jets truth matched to a Higgs boson for ggF
events.
                              Mass Resolution [GeV]
      pT [GeV]              Leading        Subleading
      250 < pT < 450           −           17.7 ± 0.7
      450 < pT < 650       11.3 ± 0.1      13.4 ± 0.2
      650 < pT < 1000      10.8 ± 0.3      13.5 ± 0.4
5.6      Background Process Modeling
The dominant background process is QCD multijet production, which presents itself as a
non-resonant monotonically decreasing spectrum. The resonant backgrounds, the V +jets
process and the top quark, peak outside the Higgs boson signal window but still contribute
in a minor way. Within the Higgs mass peak (105 < mJ < 140 GeV), the V +jets process
represents approximately 1% of the total background, top quarks represent about 3%, and
the rest of the background is due to QCD multijets. Figure 5.6 shows the mass distributions
of the expected MC estimates (Asimov datasets) of the signal and backgrounds for the signal
regions.
Figure 5.6: Mass distributions in the (a) SRL and (b) SRS for the MC estimates of the signal
and backgrounds [10].
Top Quark Modeling
A top quark event is characterized by the presence of a b-quark and two hadronic decay
products of a W boson. The tt̄ control region CRtt̄ was defined to estimate the top quark
contributions to the signal regions. The jet mass distributions in both the CRtt̄ and the
SR are comparable given that both regions probe a similar phase space. Figure 5.7 shows
the breakdown of the tt̄ contribution to the candidate jet mass distribution in the signal
and control regions. The shape of the spectrum is taken from MC but the normalization
is extracted from data. Adjustments performed on the CRtt̄ are directly applied to the SR
by including it in the global likelihood fit. To extract the scale factor, a simultaneous fit is
performed on the CRtt̄ and SR together. Given that the CRtt̄ has a tt̄ purity of 97% (with
similar levels in the fiducial and differential regions), the normalization was determined from
data with a precision of 10% or better.
    In the SR, single top events contribute between 2-3% (tW ) and 1-5% (t-channel) of
the candidate jets relative to the total tt̄ yield. These events have a candidate jet mass
distribution similar to that of the tt̄ events. The s-channel contribution was found to be
negligible. To account for this contribution in the likelihood fits, the tt̄ MC was scaled
accordingly to match the number of events in the tW and t-channel MC samples for each pT bin.
A 50% normalization uncertainty was applied to the estimated number of single top quark
events, based on comparisons between the diagram subtraction and diagram removal schemes [118]
in tW events.
Figure 5.7: Breakdown of the tt̄ contribution to the candidate jet mass for the inclusive (a)
SRL, (b) SRS and (c) CRtt̄ .
    Standalone fits were also performed for the CRtt̄ using Asimov datasets. The measure-
ment of the scale factor had greater uncertainty in the high pT region due to the lower
number of events. It was found that the scale factor between the data and the MC was
about 0.8. For the global fit of the SR with the control region, the data and simulation seem
to agree. Figure 5.8 shows the results of the fit for the differential analysis regions.
    Systematic uncertainties were estimated using simulated samples with alternative
parton shower models (Powheg vs Herwig), finding a 6-19% difference in yield across the
analysis regions. Similarly, uncertainties due to the matrix element calculation (Madgraph5
vs Powheg Box v2) were evaluated and found to give a 1-19% difference in yield. Weight
variations on the nominal sample associated with initial and final state radiation (ISR and
FSR) produced uncertainties of 1-7%. Renormalization and factorization scale variations
were found to be negligible. The two largest uncertainties on the tt̄ normalization were
from the b-tagging efficiency and the JMS.
Figure 5.8: The post-fit CRtt̄ Jt mass distribution in the four pT regions used in the global
likelihood of the differential fit. The W (lν) contribution is flat in jet mass and, for events
with pT < 1 TeV, it is estimated to be 1-3% of the total. The pT > 1 TeV region is shown
in 10 GeV jet mass bins. The ratio of the data to the background prediction is shown in the
lower panel. The shaded areas indicate the 68% CL for all background processes [10].
V + jets Modeling
Vector boson production offers a unique opportunity to validate the signal measurement
procedure for the Higgs boson, given that their decay structure, mass peak and resolution are
similar. The Z boson in particular, which populates the signal region with around 20
times more events than the Higgs boson, can be used to study experimental effects that
would not be apparent with the statistically limited Higgs boson measurement. Therefore,
to have a proper measurement of H, a well-understood Z background is necessary.
    In the SR, the number of Z+jets events is more than 3 times that of W +jets because
of the large branching ratio of the b-quark pair decay (Z → bb̄) coupled with our b-tagging
selection criteria. For approximately 90% of the candidate jets in Z+jets events, the decay
products of the Z boson are fully contained within the jet. On the other hand, only 40%
of the candidate jets in W +jets events contain the W boson decay products. This is due to
the low misidentification rate for b-tagging. The remaining candidates in W +jets events come
from the recoiling hadronic system, resulting in a broader mJ distribution in the SR.
    In the VR, due to the requirement that the candidate jets must be anti-tagged, the W +jets
yield is three times larger than the Z+jets yield. Both have comparable acceptance, but the W
has a larger cross-section. The decay products of the vector bosons are reconstructed within
the candidate jets in only 60% of events. This results in a non-resonant mass distribution, with
a shape that is similar to the QCD multijet background.
    Given that the Z+jets normalization is directly extracted from the data with the global
likelihood fit, the systematic uncertainties of the modeling are limited to changes in accep-
tance in the different regions and to the mass distribution shape. For the W +jets cross
section, a 10% uncertainty in the signal region is assigned [119]. The semi-leptonic W +jets
decays (W → lν) contribution in the CRtt̄ has a total uncertainty of 30%. Systematic un-
certainties due to renormalization and factorization scale variation represent a 3-20% error
to the acceptance across the different regions. Other variations were studied, but found to
have a negligible impact. These include an alternative PDF (MMHT2015NLO [120]), αs
variations in the nominal PDF and alternative cluster fragmentation modeling (Lund string
model [121]). For the normalization, the largest experimental uncertainties are associated
with the JMR and JMS. Agreement between simulation and data in the leading jet VR is
shown in Figure 5.10.
V + jets Resolution
It was found that the fitted Z+jets normalization in the SR was correlated with the
reconstructed mass resolution. This is due to the flexibility of the Z+jets template and the
multijet model (discussed in the next section). In some cases, the best-fit value of the JMR
parameter broadened the Z+jets peak, which corresponds to an increase of the Z+jets
normalization and a decrease of the multijet contribution compared to the expected values. A
dedicated control region rich in large-R jets containing W bosons from
semileptonic tt̄ decays was created to constrain the JMR systematic in conjunction with
the VRL. This control region is denoted as the WCRtt̄ .
    The WCRtt̄ requires the presence of two top quarks in different hemispheres where one
top quark decays leptonically while the other top quark decays hadronically. The decay
products of the W boson from the hadronically decaying top quark must be isolated in
the large-R jet. This region provides a high purity reconstructed W peak with pT from
200-600 GeV. Similar to the CRtt̄ , an isolated medium quality muon is used. The selection
requires at least one large-R jet (leading will be the W candidate) with pT > 200 GeV and
at least 2 VR track jets with pT > 10 GeV. Both VR track jets must pass the b-tagging
requirements. One of the b-tagged VR track jets has to be close to the muon, satisfying
0.04 + 10/pT^µ < ∆Rbtag1,muon < 1.5, and must also have pT > 25 GeV. This b-tagged jet
must be well separated from the W candidate (∆Rbtag1,Wcand > 2.0). The second b-tagged
VR track jet is required within 1.0 < ∆Rbtag2,Wcand < 1.5 of the W candidate. Figure 5.9
shows a diagram of the topology of the WCRtt̄ events. The mass and pT distributions of
the inclusive WCRtt̄ can be seen in Figure 5.11.
Figure 5.9: Diagram depicting the topology of a WCRtt̄ event. The hadronically decaying
W boson must be isolated in the large-R jet.
    The WCRtt̄ provides a good source of W bosons in the pT range below 600 GeV. The
VRL provides a clear peak in pT ranges above 450 GeV but with more multijet background.
The jet mass width of the W and Z resonances shows a slow evolution from low pT in the
WCRtt̄ to high pT in the VRL. This can be seen in Figure 5.12. The jet mass width
measurement using the WCRtt̄ has around 1/5 of the original JMR uncertainty after
the constraint is transferred to the Z → bb̄ dominated V +jets sample in the SR. The correlation
between the Z+jets normalization and the JMR is reduced when included in the global
likelihood fit. In the inclusive signal region, the correlation reduces from around 90% to 30%
when this auxiliary mass measurement is considered.
Figure 5.10: Post-fit leading-jet invariant mass distributions after the multijet background
was subtracted in the validation region for data and the V +jets (W +Z) and top quark
components for (a) 450 < pT < 650 GeV, (b) 650 < pT < 1000 GeV, and (c) pT > 1 TeV,
shown in wider 10 GeV jet mass bins. The V +jets contribution is split into five generator
‘truth’ pVT volumes labeled p0T –p4T for pVT < 300 GeV, 300–450 GeV, 450–650 GeV, 650–1000
GeV, and > 1000 GeV, respectively. The tt̄ normalization and its uncertainty are set to
the corresponding values from the CRtt̄ . The mJ range has been extended down to 60 GeV
for only this fit to show the level of agreement along the rising edge of the V +jets mJ
distribution. The ratio of the data to the background prediction is shown in the lower panel.
The shaded areas indicate the 68% CL for all background processes [10].
Figure 5.11: Inclusive WCRtt̄ (a) mass distribution and (b) pT distribution of the W candi-
dates [122].
Figure 5.12: A summary of the Z and W resonance peak reconstructed-width measurements
as a function of the jet pT using a resolved W boson in top quark decays in the WCRtt̄
region and the combined W and Z boson mass distribution in the validation region. The
continuous black curve is a fit to the measurements with resultant errors shown as a cyan
band [122].
Multijet Modeling
The QCD multijet background has a monotonically decreasing mass spectrum. It is modeled
using an exponential function of a polynomial of degree N with the form:
\[ f_N(x \mid \Theta) = \Theta_0 \exp\left( \sum_{i=1}^{N} \Theta_i x^i \right), \tag{5.13} \]
where Θi are the parameters of the fit and x = (mJ − 140 GeV)/(70 GeV). The parameters are
determined simultaneously during the signal extraction fit, independently for each region.
The number of events and the shape of the spectrum have an impact on the optimal degree
of the polynomial. Small values of N make the function too rigid and therefore
prone to bias in the resonant process yields. On the other hand, large values of N decrease the
statistical significance of the resonant process models, because the increased correlations
allow the function to create or absorb the resonances. Modified VR ensembles (which we call
hybrid VR) are used to study the optimal values of N , given that the VR contains more than
50 times as much data as the SR. Ten of these ensembles (VR slices), each with roughly the same
amount of data as the SR, are used to find the optimal parameters of the fit by taking an
average for each of the regions.
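A direct transcription of Eq. (5.13) is sketched below. The function name and the parameter values in the usage example are arbitrary illustrations, not fitted values from the analysis.

```python
import math


def multijet_model(m_j_gev, theta):
    """Exponential of a degree-N polynomial, as in Eq. (5.13).

    theta = [Theta_0, Theta_1, ..., Theta_N]; the degree is len(theta) - 1.
    """
    x = (m_j_gev - 140.0) / 70.0
    exponent = sum(t * x ** i for i, t in enumerate(theta[1:], start=1))
    return theta[0] * math.exp(exponent)


if __name__ == "__main__":
    # Arbitrary degree-4 parameters chosen only to give a falling spectrum.
    theta = [1000.0, -1.5, 0.3, -0.1, 0.02]
    for m in (70.0, 105.0, 140.0, 175.0, 210.0):
        print(f"mJ = {m:5.1f} GeV -> f = {multijet_model(m, theta):8.1f}")
```

With this choice of x the fitted mass range maps onto roughly [−1, 1], so that Θ0 is the value of the function at mJ = 140 GeV and the higher-order coefficients remain of comparable magnitude.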
    The hybrid VR is constructed by replacing the VR resonance peaks with the SM prediction
in the SR while correcting the mass spectrum to match the SR. A shape correction
factor, defined as the ratio of the SR multijet estimate (MJSR ) to the VR model (MJVR ),
is applied to the VR slices. The values of MJSR are obtained from the likelihood fit of the
SR and CRtt̄ while including all the systematic uncertainties. The MJVR is taken as the
average of likelihood fits of 10 random orthogonal subsets of the VR while including all sys-
tematic uncertainties. The resonant peak estimates for V + jets and Top (VVR and TopVR )
are extracted from the average post-fit contributions of the same 10 VR fits. Each hybrid
VR (VRihyb ) is defined as
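\[ \mathrm{VR}^{i}_{\mathrm{hyb}} = \left( \mathrm{VR}^{i} - V_{\mathrm{VR}} - \mathrm{Top}_{\mathrm{VR}} \right) \times \frac{\mathrm{MJ}_{\mathrm{SR}}}{\mathrm{MJ}_{\mathrm{VR}}} + V_{\mathrm{SR}} + t\bar{t}_{\mathrm{SR}} + H_{\mathrm{SR}}, \tag{5.14} \]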
where VRi is the mass distribution of the data events in the VR slice and the variables with
subscript SR are the nominal MC predictions for the resonant sources in the SR.
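Applied bin by bin to the jet mass histograms, Eq. (5.14) can be sketched as follows. The histograms are represented as plain lists of bin contents, and the function name and the numbers in the usage example are illustrative only.

```python
def hybrid_vr(vr_slice, v_vr, top_vr, mj_sr, mj_vr, v_sr, tt_sr, h_sr):
    """Build a hybrid VR mass histogram following Eq. (5.14), bin by bin."""
    inputs = (vr_slice, v_vr, top_vr, mj_sr, mj_vr, v_sr, tt_sr, h_sr)
    if len({len(h) for h in inputs}) != 1:
        raise ValueError("all histograms must have the same binning")
    hybrid = []
    for data, v, top, sr_mj, vr_mj, v_s, tt_s, h_s in zip(*inputs):
        # Remove the VR resonances, reshape the multijet spectrum to the SR,
        # then add the SM resonance predictions for the SR.
        hybrid.append((data - v - top) * (sr_mj / vr_mj) + v_s + tt_s + h_s)
    return hybrid


if __name__ == "__main__":
    # Two toy bins with arbitrary contents.
    print(hybrid_vr([1000.0, 800.0], [20.0, 10.0], [30.0, 40.0],
                    [50.0, 45.0], [500.0, 450.0],
                    [5.0, 2.0], [3.0, 4.0], [1.0, 0.5]))
```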
    The log-likelihood ratio (LLR) is used to compare different values of the
polynomial degree N in each VRhyb, but without the injected resonances. The null hypothesis
is defined as the fit using a polynomial of degree N , while the alternative hypothesis is the
fit using a polynomial of degree N + 1. According to Wilks' theorem [123], twice the negative
log-likelihood ratio asymptotically follows a χ2 distribution with (N + 1) − N = 1 degree of
freedom (d.o.f.). The smallest value of N
that yields a uniform distribution of p-values is selected as the optimal model. A uniform
distribution is represented by a linear increase in the corresponding cumulative distribution
function (CDF). Figure 5.13 shows the CDF as a function of p-value for the VR.
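The p-value of this test can be computed in closed form for one degree of freedom, since the χ2 survival function reduces to a complementary error function. The sketch below assumes that the minimized negative log-likelihood values of the two fits are available. The uniformity measure is a simple Kolmogorov-Smirnov-style distance, included as one possible way to quantify how linear the CDF is; it is not necessarily the criterion used in the analysis. All names are illustrative.

```python
import math


def llr_p_value(nll_degree_n, nll_degree_n_plus_1):
    """p-value of the degree-N fit against degree N+1 via Wilks' theorem.

    Inputs are the minimized negative log-likelihoods of the two nested fits.
    """
    q = max(2.0 * (nll_degree_n - nll_degree_n_plus_1), 0.0)
    # Survival function of a chi-square distribution with 1 d.o.f.
    return math.erfc(math.sqrt(q / 2.0))


def max_deviation_from_uniform(p_values):
    """Largest distance between the empirical CDF and the uniform CDF."""
    ordered = sorted(p_values)
    n = len(ordered)
    return max(max((i + 1) / n - p, p - i / n) for i, p in enumerate(ordered))


if __name__ == "__main__":
    print(llr_p_value(1000.0, 999.0))
    print(max_deviation_from_uniform([0.1, 0.3, 0.5, 0.7, 0.9]))
```

If the degree-N model is adequate, the p-values obtained from the VR slices should be uniformly distributed and the deviation small; p-values piling up near zero indicate that the extra polynomial term is needed.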
Figure 5.13: Cumulative distribution function (CDF) of the p-values of the log-likelihood
ratio of the exponential polynomial of degrees N and N + 1. Plots correspond to the (a)
VRL and (b) VRS [122].
    To look for local effects due to the resonances, tests were performed by including a free
normalization parameter µ_VR ± σ_stat^VR for either the Z + jets process or the Higgs boson in
a fit to the VRhyb with an artificial signal injection. The quantity F2σ was used to
estimate the probability that the multijet model allows artificial excesses or deficits. F2σ is
defined as the fraction of fits in which the fitted Z or H signal deviates from its injected
value by more than twice its statistical error σ_stat^VR:
\[ |\mu_{\mathrm{VR}} - 1| > 2\sigma^{\mathrm{VR}}_{\mathrm{stat}}. \tag{5.15} \]
The average ratio µ/σ = (µ_VR − 1)/σ_stat^VR quantifies the bias in the signal strength
determination and can be used to estimate the spurious signal systematic uncertainty when applied
to VRhyb without any signal. The value of N is chosen so that F2σ is compatible with a
value of 0.05 and µ/σ is stable for both Z + jets and Higgs production. The spurious signal
systematic uncertainties range from 0.01-0.33 for H and 0.15-0.65 for Z. Figure 5.14 shows
the values of F2σ for the Higgs and the Z bosons.
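Given the fitted signal strengths and their statistical errors from a set of hybrid VR fits, the two quantities defined above can be computed as in this sketch. The function names and the toy inputs are illustrative and are not analysis results.

```python
def f_two_sigma(mu_values, sigma_values):
    """Fraction of fits with |mu - 1| > 2 sigma, following Eq. (5.15)."""
    outliers = [abs(mu - 1.0) > 2.0 * sigma
                for mu, sigma in zip(mu_values, sigma_values)]
    return sum(outliers) / len(outliers)


def mean_pull(mu_values, sigma_values):
    """Average of (mu - 1) / sigma, a measure of the signal strength bias."""
    pulls = [(mu - 1.0) / sigma for mu, sigma in zip(mu_values, sigma_values)]
    return sum(pulls) / len(pulls)


if __name__ == "__main__":
    # Toy fit results for four VR slices.
    mu = [1.0, 1.5, 0.2, 1.1]
    sigma = [0.2, 0.2, 0.3, 0.1]
    print(f_two_sigma(mu, sigma), mean_pull(mu, sigma))
```

For an unbiased model with Gaussian errors, about 5% of the fits are expected to fall outside two standard deviations, which is why F2σ is compared with 0.05.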
Figure 5.14: Fraction of fitted signal in excess of twice its error for (a) H and (b) Z as a
function of the exponential polynomial degree N [122].
    The optimal values of N were found to be N = 5 in the inclusive region and between
4 and 5 for the differential pT bins. The results are summarized in Table 5.10, where the
differential bins are labeled as p0T (250 < pT < 450 GeV), p1T (450 < pT < 650 GeV), p2T
(650 < pT < 1000 GeV) and p3T (pT > 1 TeV). A comparison of the QCD multijet fits for
all the pT binned analysis regions is shown in Figure 5.15.
Table 5.10: Optimal degree N of the exponential polynomial used to model the QCD multijet
background for all the analysis regions.
                                        Differential
      Candidate jet    Inclusive    p0T    p1T    p2T    p3T
      Leading              5         −      5      4      4
      Subleading           5         5      4      5      4
Figure 5.15: Multijet jet-mass distribution from the different pT -binned analysis regions.
The solid lines show the multijet function after a fit to the SR data (gray) and VR data
(blue). The solid points are the data from VR slices with the same number of events as the
SR after the SM resonances are subtracted. The bottom panel shows the ratio of the SR
data fit to the VR data fit [10].


5.7      Statistical Analysis
The signal extraction is performed using the maximum likelihood method. In practice, this
is achieved by minimizing the negative logarithm of the likelihood function L(µ, θ) using
the RooStats framework [124] and the RooFit library [125]. The likelihood function is
defined as a product of Poisson probability density functions, as described in Sec. 5.1, with
one term for each mJ bin of the SRL, SRS and CRtt̄. The bin width for the
mass distribution was chosen to be 5 GeV. A recent RooFit extension [126] was needed
to remove an existing bias for wide-binned datasets. The nuisance parameters θ represent
the systematic uncertainties and are constrained with Gaussian or log-normal probability
density functions. The V+jets JMR constraints obtained from the WCRtt̄ and VRL are
implemented as Gaussian p.d.f. priors. The normalization of the MC templates is controlled
by free parameters for each pT region or the truth-based volume common to the SRL, SRS
and CRtt̄. For the multijet model, both the normalization and the polynomial coefficients
are treated as free parameters that are independent for each jet mass distribution. The yield
of each signal is given by its signal strength µ, defined as the ratio
between the fitted number of signal events and the corresponding SM prediction. Upper
limits on the Higgs boson signal strength µH and production cross section σH are obtained
using the CLs method, where the expected limits are determined by assuming no Higgs
boson contribution.
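
The structure of such a likelihood can be sketched with a toy example. The Python snippet below is a minimal sketch, assuming a single signal strength µ, a single nuisance parameter θ that scales the background by a hypothetical relative uncertainty `bkg_rel_unc`, and a unit Gaussian constraint on θ. The function name, the three-bin templates and the coarse scan over µ are illustrative assumptions; the analysis itself relies on RooFit and RooStats, which are not used here.

```python
import math


def nll(mu, theta, data, signal, background, bkg_rel_unc):
    """Negative log-likelihood of a binned Poisson model.

    Expected yield in bin i: mu * s_i + b_i * (1 + bkg_rel_unc * theta).
    theta is constrained by a unit Gaussian (constant terms dropped).
    """
    total = 0.5 * theta ** 2  # Gaussian constraint on the nuisance parameter
    for n, s, b in zip(data, signal, background):
        expected = mu * s + b * (1.0 + bkg_rel_unc * theta)
        if expected <= 0.0:
            return math.inf  # unphysical expectation
        # -log Poisson(n | expected); lgamma(n + 1) equals log(n!)
        total += expected - n * math.log(expected) + math.lgamma(n + 1.0)
    return total


# Toy templates: the data are set equal to the mu = 1 expectation.
signal = [2.0, 5.0, 2.0]
background = [100.0, 90.0, 80.0]
data = [102, 95, 82]

# Coarse scan of mu at theta = 0 (a real fit would minimize over both).
scan = [i / 10 for i in range(-20, 41)]
best_mu = min(scan, key=lambda m: nll(m, 0.0, data, signal, background, 0.05))
```

Because the toy data match the µ = 1 expectation exactly, the scan returns µ = 1. A full fit would profile θ rather than fix it.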
Systematic Uncertainties
Uncertainties related to a low number of events in MC samples for the background predictions
were parametrized with the Beeston-Barlow technique [127]. A smoothing procedure was
also used to remove large variations with a threshold for pruning of only 2% [128]. Table
5.11 summarizes all the systematic uncertainties considered in the likelihood fit for the H
and Z signal strength extraction.
Table 5.11: Summary of the systematic uncertainties included in the profile likelihood fit
for the signal strength extraction. The second column states the processes for which an
independent nuisance parameter is considered. The third column indicates the regions for
which the systematic uncertainty is correlated. The fourth column describes the effect of
the systematic uncertainty induced by the parameter: N denotes a normalization change
and S represents an impact on the shape. (*) tt̄ and V+jets events have two extra minor
components only applied to them. (?) This uncertainty only covers relative acceptance across
regions instead of the absolute cross section uncertainty. (•) Only applied to Z+jets when
the signal extraction performed on truth-based volumes is tested using the SR [10].
                    Description                      Processes       Category     Effect
                         Reconstructed object systematic uncertainties
                       JMR                       tt̄, V + jets, H        pT       N+S
                JMS (dominant)                   tt̄, V + jets, H        pT       N+S
                    JMS (rest)                   tt̄,V + jets +H         all      N+S
                Jet Energy Scale                        all(∗)           all      N+S
             Jet Energy Resolution                       all             all      N+S
            b-tag efficiency for b-jets                  all             all      N+S
            b-tag efficiency for c-jets                  all             all      N+S
          b-tag efficiency for light-jets                all             all      N+S
                          Process modeling systematic uncertainties
    Renormalization and factorization scale           V + jets           all      N+S
                  Cross section                       W + jets           all        N
         Cross section and acceptance                  W (lν)            all        N
              Parton shower model                         tt̄            all      N+S
           Matrix element calculation                     tt̄            all      N+S
        Initial and final state radiation                 tt̄            all      N+S
         Cross section and acceptance                      t             all        N
        Cross section and acceptance(?)                  H               all        N
              NLO EW corrections                   VBF + VH + tt̄H       all        N
                                                         H        pHT bins × LS    N
                 Spurious signal                     Z+jets(•)    pZT bins × LS    N


Chapter 6
Boosted H → bb̄ Results
Three different configurations are used to study Higgs boson production at high pT . The
inclusive region is used to measure the H signal strength, the fiducial region is used to
measure the fiducial cross section and the differential regions are used to measure the cross
section for four different pT bins.
    The signal strength extraction in the fiducial region considers the events in the fiducial
volume defined by the requirements applied to the truth Higgs boson transverse momentum
pHT and its rapidity yH. The same truth information is used for the differential regions.
The pT-y volume bins are based on the simplified template cross-section (STXS) framework
[129][130] for ggF production, with the modification of a tighter yH requirement and the
inclusion of all production modes. The STXS framework was developed to maximize the
sensitivity of the Higgs boson measurements while minimizing the theory
dependence of their determination. The same pT boundaries are used for the V+jets
production cross section measurements in the VRL and for Z+jets production in the SR, which
are used to validate the method. The fiducial and STXS volumes for these are defined by
requirements on the generator truth vector boson transverse momentum pVT. The summary
of the fiducial and differential region volumes is given in Table 6.1.


Table 6.1: Summary of the fiducial and STXS volumes used to determine the signal events
considered for the signal strength measurement [10].
                                Volume      pHT [GeV]      |yH|
                                Fiducial    > 450          < 2
                                STXS        300 − 450      < 2
                                            450 − 650
                                            650 − 1000
                                            > 1000
6.1      Inclusive Region
The inclusive region is the signal region containing candidate jets with pT > 250 GeV.
The extraction of the Higgs boson signal strength in the inclusive region yields a value of
µH = 0.8 ± 3.2 when combining the SRL, SRS and CRtt̄ information. The breakdown of
the uncertainty was found to be ± 3.2 (total) = ± 3 (stat) ± 1.1 (syst). The measurement is
dominated by the statistical uncertainty (the size of the data sample), which limits the
sensitivity to the signal. The observed signal strength corresponds to a significance of 0.29σ
(0.36σ was expected). The largest contributions to the systematic uncertainty are from the jet
mass resolution (JMR) and the jet mass scale (JMS). For the tt̄ contribution, the value was found
to be µtt̄ = 0.80 ± 0.06. The poor modeling seen for tt̄ agrees with previously published
results where the boosted top-quark pair differential cross-section in the l+jets channel was
measured [131]. The Z+jets process had a signal strength value of µZ = 1.29 ± 0.22. These
results are summarized in Table 6.2. The yields in the three regions pertinent to the Higgs
boson signal extraction are presented in Table 6.3.
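
The quoted uncertainty breakdown can be checked with a short calculation. The sketch below assumes that the statistical and systematic components are combined in quadrature, which the text does not state explicitly; the helper name `combine_in_quadrature` is introduced only for this example.

```python
import math


def combine_in_quadrature(*components):
    """Return the quadrature sum of independent uncertainty components."""
    return math.sqrt(sum(c * c for c in components))


# Inclusive-region values quoted in the text: 3 (stat) and 1.1 (syst).
inclusive_total = combine_in_quadrature(3.0, 1.1)
```

Rounded to one decimal place, the result reproduces the quoted total uncertainty of 3.2.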


Figure 6.1: The ratios of the measured fiducial phase-space absolute differential cross-sections
to (a) the predictions obtained with the Powheg+Pythia8 MC generator and (b) the NNLO
predictions, in the resolved and boosted topologies as a function of the top quark pT [131].
Table 6.2: Expected and observed values of the signal strengths for the H, Z, and tt̄
components in the inclusive fit [10].
                        Result       µH           µZ             µtt̄
                        Expected     1.0 ± 3.2    1.00 ± 0.17    1.00 ± 0.07
                        Observed     0.8 ± 3.2    1.29 ± 0.22    0.80 ± 0.06
Table 6.3: Event yields and associated uncertainties after the global likelihood fit in the
inclusive region [10].
                    Process          SRL              SRS            CRtt̄
                   Multijet    590 700 ± 4200   529 300 ± 3500         −
                    Z+ jets     16 100 ± 2800    12 000 ± 2100         −
                   W + jets      3050 ± 720       2510 ± 500           −
                       Top      16 300 ± 1900    15 900 ± 2000   3737 ± 68
                    W (lν)            −                −          53 ± 16
                        H        400 ± 1500       300 ± 1300           −
                     Total      626 530 ± 820    560 090 ± 770   3790 ± 66
                     Data          626 532          560 083          3791


Figure 6.2: Post-fit jet mass distributions for the various components in the inclusive SRL
(left) and SRS (right) regions. In the middle panels the shaded areas indicate the 68% CL for
the multijet background from the fitted parameters and normalizations of the exponentiated
polynomials. In the lower panels the shaded areas indicate the 68% CL for all background
processes [10].


6.2      Fiducial Region
In the fiducial region, the Higgs boson yield is determined using the fiducial volume defined
by the Higgs boson transverse momentum (pHT > 450 GeV) and rapidity (|yH| < 2.0).
The transverse momentum cut was chosen so that the truth spectrum is not biased by the
trigger turn-on. Therefore, this measurement does not include the SRS region below 450 GeV.
The signal acceptance times efficiency in the fiducial volume is presented in Table 6.4.
Table 6.4: Signal acceptance times efficiency within the fiducial volume used in the fiducial
region [10].
                                  Process    pHT > 450 GeV, |yH| < 2
                                  All        0.24
                                  ggF        0.26
                                  VBF        0.22
                                  VH         0.27
                                  tt̄H       0.20
    The signal outside the fiducial region is set to the SM value within uncertainties.
Therefore, the fit considers two Higgs boson mass templates in each SR. The component from
the fiducial volume accounts for more than 80% of the Higgs boson signal. The component
from outside the fiducial volume has a broader mass spectrum shifted to higher values. This
procedure was tested with W → qq̄′ and Z → qq̄ in the VR and with Z → bb̄ in the SR. With
the Higgs boson signal fixed to the SM values both inside and outside the fiducial volume, the
signal strength for V+jets was found to be µV = 1.01 ± 0.09. Similarly, for Z events in the SR
the signal strength was found to be µZ = 1.35 ± 0.25, both being in agreement with the SM.
    For the Higgs boson signal strength, the likelihood fit yields a value of µH = −0.1 ± 3.5.
The results are summarized in Table 6.5. Post-fit mass distributions are shown in Figure
6.3. No Higgs boson signal is shown, given that the signal strength was found to be
below 0. The SM prediction for Higgs boson production with pHT > 450 GeV is 18.4 fb.
Our measurement corresponds to an observed (expected) 95% CL upper limit on the Higgs
boson production cross section of
\[ \sigma_H(p_{\mathrm{T}}^{H} > 450~\mathrm{GeV}) < 115~(128)~\mathrm{fb}. \tag{6.1} \]
Table 6.5: Expected and observed values of the signal strengths for the H, Z and tt̄
components in the fiducial fits [10].
                      Result        µH            µZ             µtt̄
                      Expected      1.0 ± 3.4     1.00 ± 0.18    1.00 ± 0.08
                      Observed      −0.1 ± 3.5    1.30 ± 0.22    0.75 ± 0.06
    The statistical uncertainty is the largest contributor to the total uncertainty of the signal
strength, with the systematic uncertainty being somewhat smaller. The largest component
of the systematic uncertainty, with almost an 80% contribution, comes from the jet systematic
uncertainties, driven by jet mass scale (JMS) effects. The JMS uncertainty comes from both
the backgrounds (V+jets and tt̄ contribute 50%) and the reconstructed Higgs bosons (which
contribute the other 50%). The breakdown is shown in Table 6.6.
Table 6.6: Contributions to the systematic uncertainties for the measurement of the fiducial
volume signal strength [10].
                          Uncertainty Contribution        pHT > 450 GeV
                          Total                           3.5
                          Statistical                     2.6
                          Systematic                      2.3
                          Jet systematic uncertainties    2.2
                          Modeling and theory systs.      0.8
                          Flavor-tagging systs.           0.2


Figure 6.3: Post-fit jet mass distributions for the various components in the fiducial SRL
(left) and SRS (right) regions. In the middle panels the shaded areas indicate the 68% CL for
the multijet background from the fitted parameters and normalizations of the exponentiated
polynomials. In the lower panels the shaded areas indicate the 68% CL for all background
processes [10].


6.3      Differential Regions
The differential region measurements aim to extract the Higgs boson transverse momentum
spectrum in four pHT volumes. These are based on the STXS template and consist of the
pHT volumes with 300-450 GeV, 450-650 GeV, 650-1000 GeV and above 1 TeV. The same
procedure established for the Higgs boson measurement in the fiducial region is employed.
The Higgs boson mass template for each pHT volume is used within each pT region in the
global likelihood. Only the SRL and CRtt̄ regions are included for measurements above 1
TeV, given that the SRS expected sensitivity in this region is marginal. Outside the volumes
(pHT < 300 GeV) the components were fixed to their SM expectations. The signal acceptance
times efficiency for the STXS volumes is shown in Table 6.7. The expected yield and
percentage contributions of the Higgs boson subprocesses are shown in Figure 6.4.
Table 6.7: Signal acceptance times efficiency for the STXS volumes in the differential
measurement [10].
  Process   300 < pHT < 450 GeV   450 < pHT < 650 GeV   650 < pHT < 1000 GeV   pHT > 1 TeV
  All       1.3 × 10⁻²            0.23                  0.31                   0.23
  ggF       0.7 × 10⁻²            0.25                  0.35                   0.28
  VBF       0.4 × 10⁻²            0.21                  0.32                   0.25
  VH        1.7 × 10⁻²            0.26                  0.30                   0.20
  tt̄H      4.7 × 10⁻²            0.19                  0.24                   0.19
    As in the fiducial measurement, the signal determination method
was tested with W → qq̄′ and Z → qq̄ in the VR, and with Z → bb̄ in the SR. The VRL is
divided into 5 slices with the fit being performed independently in each slice. The results are
then averaged. The Z fit is performed in the SRL, SRS and CRtt̄ regions with the Higgs boson
contribution fixed to the SM prediction. The results of the differential fit signal strengths
for V+jets in the VRL and Z+jets in the SR are shown in Figure 6.5.


Figure 6.4: For each of the pHT differential volumes (x-axis), the expected signal event yield for
all Higgs boson events (left) and the fraction of signal in percent (right) in each reconstructed
jet pT region (y-axis) is shown. The leading jet pT in the SRL is denoted by pLT and the
subleading jet pT in the SRS is denoted by pST [10].
Figure 6.5: Comparison of differential fit signal strengths for (a) V +jets in the VRL and
(b) Z+jets in the SR. The signal strength within the STXS volumes is calculated relative to
the prediction at NLO QCD and LO EW accuracy. They are compared with the NLO EW
correction provided by SHERPA, the NNLO QCD correction provided by the NNLOJET
group, and their product. The points are located at the weighted center of the bin considering
the underlying pVT or pHT spectrum [10].


    The Higgs boson signal strengths in the STXS volumes are extracted by simultaneously
fitting the ten differential SR and CR regions defined in Tables 5.5 and 6.1. The results
are summarized in Tables 6.8 and 6.9. The four Higgs boson signal strengths are compatible
with the SM, with a p-value of 0.53.
Table 6.8: Expected and observed values of the signal strengths for the H component in the
differential fits [10].
                                pHT [GeV]      µH (Exp.)    µH (Obs.)
                                300 − 450      1.0 ± 18     −6 ± 18
                                450 − 650      1.0 ± 3.3    −3 ± 5
                                650 − 1000     1.0 ± 6      5 ± 7
                                > 1000         1.0 ± 30     18 ± 32
Table 6.9: Expected and observed values of the signal strengths for the Z and tt̄ components
in the differential fits [10].
                Jet pT [GeV]    µZ (Exp.)     µZ (Obs.)      µtt̄ (Exp.)    µtt̄ (Obs.)
                300 − 450       1.0 ± 1.1     1.8 ± 1.1      1.0 ± 0.07     0.85 ± 0.06
                450 − 650       1.0 ± 0.17    1.28 ± 0.22    1.0 ± 0.07     0.76 ± 0.06
                650 − 1000      1.0 ± 0.33    1.4 ± 0.4      1.0 ± 0.09     0.74 ± 0.08
                > 1000          1.0 ± 1.6     2.4 ± 1.7      1.0 ± 0.22     0.57 ± 0.18
    The Higgs boson production cross section for pHT > 1 TeV was found to be
\[ \sigma_H(p_{\mathrm{T}}^{H} > 1~\mathrm{TeV}) = 2.3 \pm 3.9~(\mathrm{stat}) \pm 1.3~(\mathrm{syst}) \pm 0.5~(\mathrm{theory})~\mathrm{fb}. \tag{6.2} \]
The SM prediction for this quantity is 0.13 fb. Because the sensitivity was low, upper limits
were calculated. The 95% CL upper limits on the Higgs boson differential production cross
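section were found to be
\[
\begin{aligned}
\sigma_H(300 < p_{\mathrm{T}}^{H} < 450~\mathrm{GeV}) &< 2.9~(3.1)~\mathrm{pb},\\
\sigma_H(450 < p_{\mathrm{T}}^{H} < 650~\mathrm{GeV}) &< 89~(102)~\mathrm{fb},\\
\sigma_H(650 < p_{\mathrm{T}}^{H} < 1000~\mathrm{GeV}) &< 39~(34)~\mathrm{fb},\\
\sigma_H(p_{\mathrm{T}}^{H} > 1000~\mathrm{GeV}) &< 9.6~(7.4)~\mathrm{fb}.
\end{aligned}
\tag{6.3}
\]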
    These results are shown in Figure 6.6. As for the first two results of this analysis, the
largest source of uncertainty is statistical, given the small size of the data sample. Table
6.10 summarizes the breakdown of the uncertainties associated with the measurement. The
largest contribution to the systematic uncertainties is the JMS uncertainty.
Table 6.10: Contributions to the systematic uncertainties for the differential measurements
of the signal strength [10].
  Uncertainty Contribution       300 < pHT < 450 GeV   450 < pHT < 650 GeV   650 < pHT < 1000 GeV   pHT > 1 TeV
  Total                          18                    5.0                   6.5                    32
  Statistical                    16                    3.0                   5.5                    30
  Systematic                     7                     3.9                   3.4                    10
  Jet systematic uncertainties   6                     3.8                   3.4                    9.5
  Modeling and theory systs.     4                     0.7                   0.7                    2
  Flavor-tagging systs.          0.2                   0.4                   0.4                    2
    The correlations between the differential Higgs boson signal strength measurements in
pHT bins were found to be small. This implies that few events migrate between the
reconstructed analysis bins and the STXS truth Higgs boson bins. Figure 6.7 shows the
correlations of the µH and µZ measurements. A post-fit mass distribution of all the components
in the differential leading jet signal region is shown in Figure 6.8.


Figure 6.6: Summary of the STXS volume signal strengths measured using the differential
signal regions. Within the same kinematic regimes, measurements of the Z → bb̄ process
agree with the Standard Model predictions, validating the methods. The points are located
at the weighted center of the bin considering the underlying pHT spectrum [10].


Figure 6.7: Correlations among the four Higgs boson signal strengths, and between the four
Higgs boson and Z+jets signal strengths. The Higgs boson signal strengths µH are labeled
with the corresponding pHT range as a superscript. The Z+jets signal strengths µZ are
labeled with the corresponding jet pT range as a superscript [10].


Figure 6.8: Post-fit jet mass distributions of the various components in the differential
leading-jet signal region defined by the selected candidate jet with (a) 450 < pT < 650 GeV,
(b) 650 < pT < 1000 GeV, and (c) pT > 1000 GeV shown in wider 10 GeV bins [10].


Chapter 7
Unified Flow Objects
7.1      Introduction
From the results presented in Chapter 6, it is evident that the optimization of large-radius
jet definitions could result in considerable gains in precision for our measurement.
Approximately 90% of the systematic uncertainties result from the jet definitions, in
particular the jet mass resolution and jet mass scale. In the analysis presented, large-R jet
reconstruction was based on topological cluster inputs reconstructed using calorimeter-based
energy measurements. A trimming procedure was performed and a combined mass scheme
was employed. Even though a good energy resolution is achieved, in the high pT regime
the resulting showers are so collimated that the calorimeter’s granularity is not sufficient
to spatially resolve individual particles in the jet. For this reason, the use of jet substructure
(JSS) variables is limited with these types of jet definitions. As a step to reduce these
limitations during Run 2, particle-flow (PFlow) [134] algorithms were implemented to
improve performance at low pT. For high pT, on the other hand, Track-CaloClusters (TCCs)
[135] were designed in order to reconstruct JSS variables. A new type of
jet input, called “unified flow object” (UFO) [11], was then developed using both
particle-flow objects (PFOs) and TCCs. This object combines calorimeter and inner detector based
signals in order to achieve optimal performance across a wide kinematic range. This new
definition, combined with better pile-up mitigation techniques, such as Constituent
Subtraction (CS) [136] and SoftKiller (SK) [137], as well as grooming algorithms, such as Soft-Drop
[138], motivated the re-optimization of the large-R jet definitions used by ATLAS. The
ATLAS Jet Tagging and Scale Factor Derivation group has been developing dedicated taggers
for hadronically decaying boosted objects reconstructed as UFO large-R jets. In this chapter we
explore the reconstruction algorithm behind UFO jets, the different dedicated taggers
developed and the main ideas behind their development, such as jet substructure variables and
machine learning. The strategy used to extrapolate their scale factors to higher momenta
and the manner in which we estimate the uncertainties associated with the extrapolation
procedure will be presented. The chapter contains final results of the extrapolation
uncertainties for the already calibrated UFO taggers supported by the ATLAS Collaboration.
Finally, studies regarding a multiclass tagger (MCT) that includes Higgs boson identification
will also be explored. These projects include work that started as the author’s “qualification
task” (authorship project) and the subsequent collaboration with the ATLAS Jet Tagging
and Scale Factor Derivation group.
Particle-flow Objects
Particle-flow (PFlow) [134] reconstruction combines both track and calorimeter based
measurements. Particle-flow objects (PFOs) themselves improve pile-up stability relative to jets
reconstructed from topo-clusters. The PFlow algorithm matches each selected track to a
single topo-cluster. For each track/topo-cluster system, the probability that the particle’s
energy was deposited in more than one topo-cluster is evaluated. Then, the expected
energy deposited in the calorimeter by the particle that produced the track is subtracted. Any
topo-cluster that is not matched to a track is considered to be produced by a neutral particle
and is left unmodified. The subtraction is gradually disabled for tracks with pT < 100 GeV
if the energy deposited (Eclus) in a cone of size ∆R = 0.15 around the extrapolated track satisfies
\[ \frac{E^{\mathrm{clus}} - \langle E_{\mathrm{dep}} \rangle}{\sigma(E_{\mathrm{dep}})} > 33.2 \times \log_{10}\!\left(40~\mathrm{GeV}/p_{\mathrm{T}}^{\mathrm{trk}}\right), \tag{7.1} \]
where ⟨Edep⟩ is the expected energy deposition. Any charged PFO that is not matched to the
primary vertex is removed to reduce the contribution from pile-up. This procedure is known
as “Charged Hadron Subtraction” (CHS) [139].
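
The condition in Eq. 7.1 can be written as a small predicate. The following is a minimal sketch in Python, assuming all energies and momenta are given in GeV. The function name `subtraction_disabled` and the numerical inputs are hypothetical, and the sketch treats the condition as a simple per-track yes/no decision rather than reproducing the ATLAS implementation.

```python
import math


def subtraction_disabled(e_clus, e_dep_mean, e_dep_sigma, track_pt):
    """Evaluate the Eq. 7.1 condition for one track (all inputs in GeV).

    e_clus      : energy in the cone of size 0.15 around the extrapolated track
    e_dep_mean  : expected energy deposition of the particle
    e_dep_sigma : spread of the expected energy deposition
    track_pt    : transverse momentum of the track
    """
    if track_pt >= 100.0:
        return False  # the text applies the condition only below 100 GeV
    significance = (e_clus - e_dep_mean) / e_dep_sigma
    threshold = 33.2 * math.log10(40.0 / track_pt)
    return significance > threshold


# Hypothetical 20 GeV track in a dense and in a sparse environment.
dense = subtraction_disabled(30.0, 10.0, 1.5, 20.0)
sparse = subtraction_disabled(15.0, 10.0, 1.5, 20.0)
```

For a 20 GeV track the threshold is about 10, so only the first case, with a large excess of energy around the track, disables the subtraction.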
Track-CaloClusters
Track-CaloClusters (TCCs) [135] are optimized to perform jet substructure reconstruction
for very high pT jets. TCCs use energy scale information from topo-clusters and angular
information from tracks. The algorithm matches loose tracks in an event to topo-clusters
that have been calibrated to the local hadronic scale. When a track is matched
to a topo-cluster, the pT of the TCC is taken from the topo-cluster while
its angular coordinates (η, φ) are taken from the track. If a topo-cluster is not matched to
any track, the TCC is created using the topo-cluster’s 4-vector directly. Similarly,
for a track that is not matched to any topo-cluster, the track information is
used directly to create the TCC. If multiple tracks are matched to a single topo-cluster,
each TCC object is given a fraction of the total pT of the topo-cluster. The momentum
fraction is determined using the ratios of momenta of the matched tracks. Any unmatched
topo-clusters are included as unmodified neutral objects.
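
The momentum sharing for several tracks matched to one topo-cluster can be illustrated as follows. This is a minimal Python sketch of the simplified description above, assuming each track receives a share of the cluster pT proportional to its own pT. The function `build_tccs` and the tuple layout are assumptions made for this example, and the full TCC algorithm in [135] contains details not modeled here.

```python
def build_tccs(cluster_pt, tracks):
    """Build one TCC per matched track for a single topo-cluster.

    cluster_pt : transverse momentum of the topo-cluster
    tracks     : list of (pt, eta, phi) tuples for the matched tracks
    Returns a list of (pt, eta, phi) tuples: pT taken from the cluster,
    shared in proportion to the track pT, and angles taken from the tracks.
    """
    total_track_pt = sum(pt for pt, _, _ in tracks)
    if total_track_pt <= 0.0:
        raise ValueError("matched tracks must carry positive pT")
    return [
        (cluster_pt * pt / total_track_pt, eta, phi)
        for pt, eta, phi in tracks
    ]


# Hypothetical cluster of 100 GeV matched to tracks of 30 GeV and 10 GeV.
tccs = build_tccs(100.0, [(30.0, 0.10, 1.00), (10.0, 0.12, 1.05)])
tcc_pts = [pt for pt, _, _ in tccs]
```

Here the two TCCs share the cluster pT in a 3:1 ratio while keeping the track directions.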


Pile-Up Mitigation and Grooming Algorithms
Before jet reconstruction, pile-up mitigation techniques are employed. For topo-clusters, the
techniques are applied to the entire set of inputs. On the other hand, for charged PFOs,
only the CHS method is employed, while neutral PFOs and TCCs are subject to the same
preprocessing techniques used for topo-clusters. During reconstruction, grooming algorithms
are applied to reduce contamination from soft radiation originating from the underlying event
(UE), pile-up and initial state radiation (ISR).
Constituent Subtraction
Constituent Subtraction (CS) [136] is a method that performs area subtraction on jet input
objects. Therefore, it is a local subtraction of pile-up at the level of individual jet
constituents. Each input is defined using ghost association, a process where massless particles
(ghosts) with low momentum are overlaid uniformly in the event. Ghosts have to satisfy
\[ p_{\mathrm{T}}^{g} = A_g \times \rho, \tag{7.2} \]
where Ag is the area of the ghosts and $p_{\mathrm{T}}^{g}$ is the expected contribution from pile-up radiation
in a small angular area. The pile-up energy density ρ is assumed to have a weak dependence
on azimuthal angle φ and rapidity y. ρ is estimated as the median of the pT/A distribution of
R = 0.4 kt jets in each event.
    The distance ∆Ri,k between a cluster i and ghost k is given by
\[ \Delta R_{i,k} = \sqrt{(\eta_i - \eta_k)^2 + (\phi_i - \phi_k)^2}. \tag{7.3} \]


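    The algorithm proceeds iteratively through each cluster-ghost pair, after sorting in order
of ascending ∆Ri,k, by modifying the pT of each pair as follows:
\[
\begin{aligned}
\text{if } p_{\mathrm{T},i} \geq p_{\mathrm{T},k}: &\quad p_{\mathrm{T},i} \to p_{\mathrm{T},i} - p_{\mathrm{T},k},\\
&\quad p_{\mathrm{T},k} \to 0;\\
\text{otherwise}: &\quad p_{\mathrm{T},k} \to p_{\mathrm{T},k} - p_{\mathrm{T},i},\\
&\quad p_{\mathrm{T},i} \to 0.
\end{aligned}
\tag{7.4}
\]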
The procedure continues until ∆Ri,k > ∆Rmax , where ∆Rmax is a free parameter. The
particles with zero transverse momentum are then discarded.
    In the original formulation of this procedure, a similar modification was performed for
the mass, but this is ignored in the latest ATLAS implementation given that all neutral
ATLAS jet inputs are defined to be massless.
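
The iteration defined by Eqs. 7.3 and 7.4 can be sketched in a few lines of Python. This is a minimal illustration, assuming massless inputs stored as dictionaries with hypothetical keys `pt`, `eta` and `phi`. The azimuthal difference is wrapped into [−π, π), which Eq. 7.3 does not show explicitly, and the function names are introduced only for this example. It is not the ATLAS or FastJet implementation.

```python
import math


def delta_r(a, b):
    """Angular distance of Eq. 7.3, with the phi difference wrapped."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)


def constituent_subtraction(clusters, ghosts, dr_max):
    """Apply the pT subtraction of Eq. 7.4 and drop zero-pT clusters."""
    clusters = [dict(c) for c in clusters]  # work on copies
    ghosts = [dict(g) for g in ghosts]
    # All cluster-ghost pairs, sorted by ascending angular distance.
    pairs = sorted(
        (delta_r(c, g), i, k)
        for i, c in enumerate(clusters)
        for k, g in enumerate(ghosts)
    )
    for dr, i, k in pairs:
        if dr > dr_max:
            break  # remaining pairs are even further apart
        c, g = clusters[i], ghosts[k]
        if c["pt"] >= g["pt"]:
            c["pt"] -= g["pt"]
            g["pt"] = 0.0
        else:
            g["pt"] -= c["pt"]
            c["pt"] = 0.0
    return [c for c in clusters if c["pt"] > 0.0]


# Toy event: one hard cluster and one soft cluster, each near a 1 GeV ghost.
toy_clusters = [
    {"pt": 10.0, "eta": 0.0, "phi": 0.0},
    {"pt": 0.5, "eta": 1.0, "phi": 1.0},
]
toy_ghosts = [
    {"pt": 1.0, "eta": 0.05, "phi": 0.0},
    {"pt": 1.0, "eta": 1.0, "phi": 1.05},
]
corrected = constituent_subtraction(toy_clusters, toy_ghosts, dr_max=0.25)
```

In the toy event the hard cluster loses 1 GeV to its nearby ghost, while the soft cluster is fully absorbed and discarded.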
SoftKiller
SoftKiller (SK) [137] is an algorithm that applies a $p_T$ cut to input objects. The $p_T$ cutoff,
$p_T^{\mathrm{cut}}$, is chosen such that the value of $\rho$ is approximately zero. The event is broken into
square patches in the rapidity-azimuth plane. The parameter $\rho$ is the event-wide estimate
of the transverse momentum density in an area patch
\[ \rho = \underset{i \in \text{patches}}{\mathrm{median}} \left\{ \frac{p_{T,i}}{A_i} \right\}. \tag{7.5} \]
    The cut is determined such that when the cut is applied, half of the grid spaces are empty.
Computationally, this is given by
\[ p_T^{\mathrm{cut}} = \underset{i \in \text{patches}}{\mathrm{median}} \left\{ p_{T,i}^{\max} \right\}, \tag{7.6} \]
where $p_{T,i}^{\max}$ is the $p_T$ of the hardest particle in patch $i$. This implies that half the patches
will contain only particles that have $p_T < p_T^{\mathrm{cut}}$. Therefore, after applying the cut, the value
of $\rho$ will be zero.
    The best performance is achieved when SK is applied after the CS algorithm.
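As an illustration of the median-based cut, the following Python sketch computes $p_T^{\mathrm{cut}}$ as the median over patches of the hardest particle $p_T$ and then applies it. The function name `softkiller`, the default `grid_size` and `y_max` values and the `(pt, y, phi)` tuple layout are placeholders chosen for this sketch and are not taken from the ATLAS configuration; empty patches are assumed to enter the median with a value of zero.

```python
import math
from statistics import median


def softkiller(particles, grid_size=0.6, y_max=2.5):
    """Sketch of the SoftKiller cut.

    particles: list of (pt, y, phi) tuples (hypothetical layout).
    Returns (pt_cut, kept_particles).
    """
    n_y = max(1, round(2 * y_max / grid_size))
    n_phi = max(1, round(2 * math.pi / grid_size))
    hardest = [0.0] * (n_y * n_phi)  # empty patches contribute 0
    for pt, y, phi in particles:
        if abs(y) >= y_max:
            continue  # outside the grid; ignored when deriving the cut
        iy = int((y + y_max) / (2 * y_max) * n_y)
        iphi = int((phi % (2 * math.pi)) / (2 * math.pi) * n_phi) % n_phi
        idx = iy * n_phi + iphi
        hardest[idx] = max(hardest[idx], pt)
    pt_cut = median(hardest)
    # Keep only the particles above the cut.
    kept = [p for p in particles if p[0] > pt_cut]
    return pt_cut, kept
```

With this choice roughly half of the patches are left empty after the cut, which is what drives the median-based estimate of $\rho$ to zero.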
Soft-Drop
Soft-Drop (SD) [138] is a grooming technique that removes soft and wide-angle radiation
from jets. For this procedure, the constituents of the large-R jets are reclustered using the
C/A algorithm. Using the angle-ordered jet clustering history, determined from the C/A
algorithm, the clustering sequence is traversed from the widest-angled radiation iterating to
the smallest-angle radiation. The condition
\[ \frac{\min(p_{T,1}, p_{T,2})}{p_{T,1} + p_{T,2}} > z_{\mathrm{cut}} \left( \frac{\Delta R_{12}}{R} \right)^{\beta}, \tag{7.7} \]
is evaluated for each splitting, where 1 and 2 represent the harder and softer branches of
the splitting respectively. The parameters $z_{\mathrm{cut}}$ and $\beta$ dictate how aggressive the removal of
soft and wide-angle radiation is and its dependence on the distance parameter. When a
splitting fails the condition, the lower $p_T$ branch is removed and the procedure continues
along the harder branch. If the condition is satisfied, the process ends and the remaining
constituents form the groomed jet. The use of the SD grooming algorithm allows the
calculation of certain jet substructure observables to beyond leading-logarithm (LL)
accuracy, while other grooming algorithms, such as trimming, do not [140].
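The declustering loop can be sketched in Python as follows. The sketch uses the standard Soft-Drop convention, in which a splitting passes when the momentum fraction of the softer branch exceeds $z_{\mathrm{cut}} (\Delta R_{12}/R)^{\beta}$. The `Node` class, its `harder` and `softer` fields and the default parameter values are hypothetical and only stand in for a C/A clustering history; they do not correspond to a specific ATLAS configuration.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One step of a C/A clustering history (hypothetical structure)."""
    pt: float
    eta: float
    phi: float
    harder: Optional["Node"] = None  # higher-pT branch of the splitting
    softer: Optional["Node"] = None  # lower-pT branch of the splitting


def soft_drop(jet, z_cut=0.1, beta=0.0, radius=1.0):
    """Walk from the widest-angle splitting inwards and return the groomed jet node."""
    node = jet
    while node.harder is not None and node.softer is not None:
        a, b = node.harder, node.softer
        dphi = (a.phi - b.phi + math.pi) % (2 * math.pi) - math.pi
        dr12 = math.hypot(a.eta - b.eta, dphi)
        z = min(a.pt, b.pt) / (a.pt + b.pt)
        if z > z_cut * (dr12 / radius) ** beta:
            return node  # condition satisfied: grooming stops here
        node = a  # condition failed: drop the softer branch and continue
    return node  # no splitting passed; a single constituent remains
```

Larger values of `z_cut` remove more soft radiation, while `beta` controls how strongly the threshold is relaxed for small-angle splittings.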
Unified Flow Objects
The fact that no single jet definition is optimal according to all metrics motivated the
development of a new jet input that combines all the desirable aspects of both particle-
flow objects (PFOs) and Track-CaloClusters (TCCs) reconstruction. TCCs improve tagging
performance at high pT but their performance is worse than with baseline trimmed topo-
cluster based jets at low pT , as they are more sensitive to pile-up. PFOs on the other hand
can improve on the baseline definition for the entire pT range but their tagging performance
is worse than with TCCs at high pT. By combining both approaches into a new jet
input object, called the Unified Flow Object (UFO) [11], optimal performance can be achieved
across the full kinematic range.
    Figure 7.1 contains an illustration of how the UFO reconstruction algorithm is performed.
The process starts by the application of the standard particle-flow (PFlow) algorithm. Any
charged PFO that is matched to a pile-up vertex is removed. The remaining PFOs are then
divided into three categories: neutral PFOs, charged PFOs that were used to subtract energy
from a topo-cluster, and the charged PFOs that were not used for energy subtraction. At
this point jet input pile-up mitigation algorithms (i.e. CS+SK) are applied. Then a modified
TCC splitting algorithm is applied. Tracks that have been used for the PFlow subtraction
are not considered as they have already been subtracted from the energy of the topo-clusters.
The TCC algorithm proceeds with the remaining collection of tracks to split neutral and
unsubtracted charged PFOs.
    The use of UFOs improves jet mass resolution (JMR) relative to topo-cluster-based jets,
by 40% for high pT hadronically decaying W bosons and by 26% for hadronically decaying
high pT top quarks. Figure 7.2 shows the JMR relative performance of different UFO
definitions compared to the ATLAS baseline jet definitions.
Figure 7.1: Illustration of the UFO reconstruction algorithm. The procedure starts with the
identification of particle-flow objects (PFOs) and inner-detector tracks [11].
     For tagging, UFOs bring significant improvements over the usual topo-cluster and TCC
definitions. Some studies show an increase of 120% in background rejection at a fixed signal
efficiency of 50% at high pT. One example of the performance of UFOs for tagging is shown
in Figure 7.3. In this example, using UFOs for hadronically decaying top quarks at high pT
improves the background rejection by 135% when compared with the baseline jet definitions.
     The pT resolution is degraded for large-R jets coming from UFOs compared to the baseline
topo-cluster and TCC definitions, but given the improvements in jet mass resolution and jet
tagging at high pT , it is worthwhile to proceed with the development and study of these jet
definitions and to define taggers for future ATLAS analyses.
Figure 7.2: Jet mass resolution for (a) W boson jets and (b) top quark jets as a function
of pT. The relative performance of the studied UFO definitions compared to the current
ATLAS baseline topo-cluster and TCC jets is shown [11].
[Figure 7.3 plot: background rejection versus top-tagging efficiency for (a) 500 GeV ≤ pT^true < 1000 GeV
and (b) 1000 GeV ≤ pT^true < 1500 GeV; ATLAS Simulation Preliminary, √s = 13 TeV, t → qqb, |η^true| < 1.2,
JES+JMS. Curves: LC Topo Trimming, TCC Trimming, CS+SK UFO Trimming, CS+SK UFO Soft Drop,
CS+SK UFO Recursive SD, CS+SK UFO Bottom-up SD.]
Figure 7.3: Background rejection as a function of signal efficiency for a tagger using the jet
mass and τ32 for top quark jets at (a) low pT and (b) high pT. The relative performance of
different UFO definitions is compared with the current ATLAS baseline topo-cluster and
TCC jets [11].
7.2       Jet Substructure Variables
The jet substructure techniques [141] can be summarized as a set of tools to exploit the
radiation pattern inside hadronic jets. These correlations are quantified by a set of variables
which in principle are infrared and collinear (IRC) safe [142], i.e. they are insensitive to
infinitesimally soft or collinear emissions, the presence of which presents difficulties for higher
order QCD predictions.
Angularities
The generalized angularities [143] are a family of jet shapes defined as
\[ a_{\beta}^{\kappa} = \sum_{i \in J} z_i^{\kappa} \left( \frac{\Delta R_{i,J}}{R} \right)^{\beta}, \tag{7.8} \]
where zi is the jet transverse momentum fraction carried by the constituent i of jet J, and
∆Ri,J its distance to the jet axis. Only angularities with κ = 1 are IRC safe. Angularities
can be seen as a measure of QCD radiation around the jet axis.
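A direct numerical evaluation of the generalized angularity can be sketched as follows. The function name `angularity`, the `(pt, eta, phi)` layout and the use of the scalar $p_T$ sum of the constituents as the jet transverse momentum are assumptions of this sketch.

```python
import math


def angularity(constituents, jet_axis, radius, kappa=1.0, beta=1.0):
    """Generalized angularity: sum over constituents of z_i^kappa * (Delta R_iJ / R)^beta.

    constituents: list of (pt, eta, phi); jet_axis: (eta, phi).
    The jet pT is approximated by the scalar sum of constituent pT (assumption).
    """
    pt_jet = sum(pt for pt, _, _ in constituents)
    total = 0.0
    for pt, eta, phi in constituents:
        dphi = (phi - jet_axis[1] + math.pi) % (2 * math.pi) - math.pi
        dr = math.hypot(eta - jet_axis[0], dphi)
        total += (pt / pt_jet) ** kappa * (dr / radius) ** beta
    return total
```

Setting `kappa=0` and `beta=0` simply counts the constituents, which is an example of an angularity that is not IRC safe.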
N-subjettiness
N-subjettiness [144] is an angularity-type observable. Within a jet, N -subjettiness identifies
N subjet axes, calculates the jet thrust about each and sums all of them together. For a jet
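J, with transverse momentum $p_{TJ}$ and particles $i$, each with transverse momentum $p_{Ti}$,
the N-subjettiness is given by
\[ \tau_N^{(\beta)} = \frac{1}{p_{TJ}} \sum_{i \in J} p_{Ti} \min\{R_{i,1}, R_{i,2}, \cdots, R_{i,N}\}^{\beta}, \tag{7.9} \]
where $R_{i,n}$ is the distance between particle $i$ and subjet axis $n$ in the $\eta$-$\phi$ plane, so
that the minimum selects the closest subjet axis.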
The angular exponent β controls the sensitivity to collinear radiation. By taking the ratio
of multiple N-subjettiness variables, new dimensionless quantities can be derived, which aid
in the discrimination of multi-pronged objects. For example, the ratio
\[ \tau_{N,N-1}^{(\beta)} = \frac{\tau_N^{(\beta)}}{\tau_{N-1}^{(\beta)}}, \tag{7.10} \]
can be used to identify the presence of N subjets within a jet.
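For a given set of subjet axes, the N-subjettiness and its ratios can be evaluated with a few lines of Python. The sketch below assumes that the axes have already been found by some other procedure (for example an exclusive clustering step, which is not shown) and approximates the jet $p_T$ by the scalar sum of the particle $p_T$; the function names are illustrative.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane with the azimuthal difference wrapped."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def n_subjettiness(particles, axes, beta=1.0):
    """tau_N for particles (pt, eta, phi) and N pre-computed axes (eta, phi)."""
    pt_jet = sum(p[0] for p in particles)  # scalar-sum approximation (assumption)
    total = sum(
        pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes) ** beta
        for pt, eta, phi in particles
    )
    return total / pt_jet


# Two-prong toy jet: tau_2 vanishes when the axes sit on the two prongs.
toy_jet = [(50.0, 0.4, 0.0), (50.0, -0.4, 0.0)]
tau1 = n_subjettiness(toy_jet, [(0.0, 0.0)])
tau2 = n_subjettiness(toy_jet, [(0.4, 0.0), (-0.4, 0.0)])
tau21 = tau2 / tau1
```

A small value of `tau21` indicates that the radiation is well described by two subjets rather than one.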
Generalized Energy Correlation Functions
The $n$-point energy correlation function (ECF) [145] is defined as
\[ \mathrm{ECF}(n, \beta) = \sum_{i_1 < i_2 < \cdots < i_n \in J} \left( \prod_{a=1}^{n} p_{T i_a} \right) \left( \prod_{b=1}^{n-1} \prod_{c=b+1}^{n} R_{i_b i_c} \right)^{\beta}, \tag{7.11} \]
where $n$ denotes the number of particles to be correlated, $R_{ij}$ is the distance between particle
$i$ and particle $j$ in the $\eta$-$\phi$ plane and $\beta$ is used to adjust the weighting of the distance between
particles. A dimensionless definition for the energy correlation functions can be constructed
using a ratio
\[ e_n^{(\beta)} \equiv \frac{\mathrm{ECF}(n, \beta)}{\mathrm{ECF}(1, \beta)^n}. \tag{7.12} \]
Combinations of these can be used to define other observables that have been found to
be very useful for identifying multi-pronged structure. Some examples used for 2-prong
discrimination in boosted jets include $D_2$ and $C_2$, which are defined as
\[ C_2^{(\beta)} = \frac{e_3^{(\beta)}}{(e_2^{(\beta)})^2} \quad \text{and} \quad D_2^{(\beta)} = \frac{e_3^{(\beta)}}{(e_2^{(\beta)})^3}. \tag{7.13} \]
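A brute-force evaluation of the energy correlation functions and of the ratios $C_2 = e_3/e_2^2$ and $D_2 = e_3/e_2^3$ can be sketched with `itertools.combinations`. The function names and the `(pt, eta, phi)` layout are illustrative assumptions; the cost grows combinatorially with the number of constituents, so this is only meant for small toy jets.

```python
import math
from itertools import combinations


def delta_r(p, q):
    """Distance in the eta-phi plane between particles (pt, eta, phi)."""
    dphi = (p[2] - q[2] + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(p[1] - q[1], dphi)


def ecf(particles, n, beta=1.0):
    """n-point energy correlation function, summed over all n-particle subsets."""
    total = 0.0
    for subset in combinations(particles, n):
        pt_product = math.prod(p[0] for p in subset)
        # Product of all pairwise distances inside the subset (empty product is 1).
        angle_product = math.prod(delta_r(p, q) for p, q in combinations(subset, 2))
        total += pt_product * angle_product ** beta
    return total


def e_n(particles, n, beta=1.0):
    """Dimensionless ratio ECF(n) / ECF(1)^n."""
    return ecf(particles, n, beta) / ecf(particles, 1, beta) ** n


def c2(particles, beta=1.0):
    return e_n(particles, 3, beta) / e_n(particles, 2, beta) ** 2


def d2(particles, beta=1.0):
    return e_n(particles, 3, beta) / e_n(particles, 2, beta) ** 3
```

For a jet with exactly two constituents the three-point function vanishes, so both `c2` and `d2` return zero, as expected for an ideal two-prong configuration.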
These functions probe multiple angular scales simultaneously. To isolate different physics
effects, a more general definition that identifies one scale at a time was developed. These
are the generalized energy correlation functions [146]
\[ {}_v e_n^{(\beta)} = \sum_{i_1 < i_2 < \cdots < i_n \leq n_J} z_{i_1} z_{i_2} \cdots z_{i_n} \prod_{m=1}^{v} \min^{(m)}_{s<t \in \{i_1, i_2, \cdots, i_n\}} \left\{ R_{st}^{\beta} \right\}, \tag{7.14} \]
where $z_i \equiv p_{Ti} / \sum_{j \in J} p_{Tj}$ and $\min^{(m)}$ denotes the $m$-th smallest element in the list. The
subscript $v$ represents the number of pair-wise distances entering the product. Using these
definitions new substructure discriminants have been defined and found to be useful for
boosted top tagging. Some examples of these are $L_2$ and $L_3$ defined as
\[ L_2 = \frac{{}_3 e_3^{(\beta=1)}}{\left({}_1 e_2^{(\beta=2)}\right)^{3/2}} \quad \text{and} \quad L_3 = \frac{{}_1 e_2^{(\beta=1)}}{\left({}_3 e_3^{(\beta=1)}\right)^{1/3}}. \tag{7.15} \]
Jet Charge
The jet charge is an energy-weighted sum of the electric charges of the hadrons in a jet. For
a jet with energy $E_J$ it is defined as
\[ Q_W = \sum_{i \in J} \left( \frac{E_i}{E_J} \right)^{W} q_i, \tag{7.16} \]
where $E_i$ and $q_i$ are the energy and electric charge of particle $i$. The parameter $W$ controls
how strongly each energy fraction is weighted.
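The definition translates directly into code. In the sketch below the function name `jet_charge`, the `(energy, charge)` tuple layout and the default value of the weight are illustrative choices, not values prescribed by the taggers discussed later.

```python
def jet_charge(particles, jet_energy, weight=0.5):
    """Energy-weighted jet charge.

    particles: list of (energy, charge) tuples; neutral particles contribute zero.
    weight: the exponent W (the default is an arbitrary illustrative value).
    """
    return sum((energy / jet_energy) ** weight * charge for energy, charge in particles)


# A jet whose leading particle is positively charged has a positive jet charge.
example = jet_charge([(64.0, +1), (36.0, -1)], jet_energy=100.0, weight=0.5)
```

Smaller values of `weight` give soft particles more influence on the result, while larger values make the observable track the charge of the hardest particles.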
Planar Flow
Planar flow [147] is a jet shape observable that distinguishes planar from linear configurations.
To define it, we first construct a matrix Iw for a given jet as
\[ I_w^{kl} = \frac{1}{m_J} \sum_i \frac{p_{i,k}}{w_i} \frac{p_{i,l}}{w_i}, \tag{7.17} \]
where $m_J$ is the jet mass, $w_i$ is the energy of particle $i$ in the jet, and $p_{i,k}$ is the $k$-th
component of its transverse momentum relative to the axis of the jet’s momentum. Then,
the planar flow $P$ is defined as
\[ P = \frac{4 \det(I_w)}{\mathrm{tr}(I_w)^2}. \tag{7.18} \]
7.3      Machine Learning
Machine learning is an umbrella term to describe a framework that automates statistical
models, through a set of algorithms, to make better predictions. More formally, “A computer
program is said to learn from experience E with respect to some task T and some performance
measure P, if its performance on T, as measured by P, improves with experience E.” [148],
as defined by Tom Mitchell. Therefore, machine learning problems can be described as
an optimization process where a performance measure is maximized. In practice, this is
achieved by the minimization of a loss function, which measures the model’s prediction
error. The experience part of the definition comes from the fact that you need data to train
your model to predict the outcome of interest. One pass of a training dataset through the
learning algorithm in the training process is referred to as an epoch. The number of epochs
is a hyperparameter that can be tuned, since the model’s internal parameters are updated
during each epoch as part of the loss function minimization process.
    Three categories are usually used to describe machine learning approaches, depending
on the nature of the task and structure of the dataset. These are supervised learning,
unsupervised learning and reinforcement learning. In supervised learning, the computer
is shown examples together with the desired outputs, with the goal of creating a map between
them. In unsupervised learning, the computer is left on its own to find structure in
the inputs. In reinforcement learning, on the other hand, the computer is provided with
feedback while it is navigating the problem space.
    Tagging jets is a classification problem, which is well suited to the supervised learning
approach. A supervised learning problem necessitates the inclusion of ‘labels’ associated
with the data sample, which the algorithm has to learn to predict by exploring
a set of features (referred to as ‘predictors’). In our particular case the ‘label’ is the
type of particle truth-matched to a jet. This information is known because the events are
simulated with Monte Carlo programs, so the truth information is available.
Deep Neural Networks as Classifiers
Neural networks [149] are a set of machine learning models based on a collection of connected
nodes (artificial neurons) inspired by biological neurons. An artificial neuron receives one
or more separately weighted inputs that pass through a non-linear function known as the
activation function. A neural network is composed of groups of these nodes, referred to
as layers. Each layer of nodes is fully connected to the layer preceding it and to the layer
following it. The layer that initiates the network and receives the data is known as the input
layer. The last layer is known as the output layer and it will produce the final result of the
network. Any layers between the input and output are known as the hidden layers. Any
network with at least two hidden layers is considered to be a deep neural network [150].
Figure 7.4: Fully connected neural network with a categorical output. Circular nodes
represent the artificial neurons and the arrows represent a connection from the output of one
neuron to the input of another [151].
    To construct a classifier from a deep neural network, a softmax (normalized exponential)
activation function has to be used for the output layer. The softmax function converts a
vector of values to a probability distribution. The elements of the layer are in the range
between 0 and 1 and will have a total sum of 1. The output of each node in the layer will be
associated with the probability that the inputs came from a certain class. A binary classifier
distinguishes between two classes, while a multiclass classifier can distinguish between more
than two. Figure 7.4 contains an illustration of a general deep neural network implemented
as a classifier of n classes. The labels of the classes have to be represented as numerical values
through a technique called one-hot encoding. A binary classification problem just requires
two labels, 0 and 1. For multiple classes, the one-hot encoding consists of constructing unique
vectors or matrices for each class where only one of the values is 1 and all the others are 0.
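The softmax output and the one-hot encoding can be illustrated with a short, dependency-free Python sketch. The function names are illustrative, and the cross-entropy loss shown at the end is a common choice for classifiers that is assumed here for the example; the text above does not specify which loss function is used.

```python
import math


def softmax(values):
    """Normalized exponential: maps a list of scores to probabilities that sum to 1."""
    shift = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(label, n_classes):
    """Vector with a 1 at the position of the class label and 0 elsewhere."""
    return [1 if k == label else 0 for k in range(n_classes)]


def cross_entropy(probabilities, target):
    """Cross-entropy between predicted probabilities and a one-hot target (assumed loss)."""
    return -sum(t * math.log(p) for t, p in zip(target, probabilities) if t)


probs = softmax([2.0, 1.0, 0.1])
loss = cross_entropy(probs, one_hot(0, 3))
```

The loss is small when the probability assigned to the true class is close to 1 and grows as that probability decreases.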
7.4      Binary Taggers for Boosted UFO jets
With the introduction of UFO large-R jets, dedicated tagging algorithms that identify
boosted hadronic objects using this new reconstruction technique have been developed,
with very promising results. These taggers exploit jet substructure (JSS)
variables within hadronic large-R jets, making use of machine learning techniques to
discriminate between the desired signal and the QCD multijet background.
DNN Top Taggers
The UFO top taggers [152] consist of two taggers based on a deep neural network (DNN)
[150]. One tagger is optimized to identify inclusive top quark jets while the other is optimized
for identifying fully contained top quark jets. A contained top quark jet is a jet where
the decay products of the top quark ($t \to Wb \to qq'b$) are inside the large-R jet. The
variables found to be optimal for discriminating between hadronically decaying top quarks
and quark/gluon-initiated jets are the JSS variables: N-subjettiness ($\tau_1$, $\tau_2$, $\tau_3$, $\tau_4$), the
generalized energy correlation functions and their ratios (ECF$_1$, ECF$_2$, ECF$_3$, $C_2$, $D_2$, $L_2$,
$L_3$), $Q_W$ and thrust major $T_M$ [153]. Other variables used are the splitting scales [154] $\sqrt{d_{12}}$
and $\sqrt{d_{23}}$. The splitting scale $\sqrt{d_{ij}}$ is defined as the $k_t$ scale from the $i \to j$ splitting of a
reconstructed jet.
    The ungroomed truth labeling strategy based on particle-level large-R jets was used to
distinguish between signal and background jets while training these taggers. A full
description of the strategy is included in Section 7.5. As shown in Figure 7.5, the multijet
rejection improves by 30% to 100% when comparing UFO SD jets with LCTopo trimmed
jets.
Figure 7.5: Comparison of the background rejection of the (a) contained and (b) inclusive
top tagging algorithms as a function of pT for a fixed signal efficiency of 50% and 80% [152].
W/Z boson taggers
The W/Z UFO taggers [155] are divided into three categories: a simple three variable (3-var)
tagger that uses rectangular cuts on JSS variables; a deep neural network (DNN) W tagger
and an adversarial deep neural network (ANN [156]) W/Z tagger.
    The cut-based 3-var W/Z tagger uses rectangular cuts (cut on the variables per pT bin)
on the jet mass mJ (both upper and lower cuts), the energy correlation function ratio D2
with β = 1 and ntrk , defined as the number of ID tracks with pT > 500 MeV ghost-associated
with the ungroomed jet.
    The DNN W tagger uses 3 fully connected layers of 32 nodes. The variables used are the JSS
variables $D_2$, $C_2$, $\tau_{21}$, the Fox-Wolfram moment $R_2^{FW}$ [157], planar flow $P$ [158], angularity $a_3$
[159], aplanarity $A$ [160], $Z_{\mathrm{cut}}$ [161], the splitting scale $\sqrt{d_{12}}$ and $k_t \Delta R$ [162]. The improved
JSS variable reconstruction for UFO jets improves the background rejection by a factor of
2 to 4 when comparing the DNN tagger on UFO jets with its counterpart for LCTopo jets.
The main issue with this tagger is a strong sculpting of the background jet shape in the
regions close to the signal W jets. This was resolved by decorrelating the DNN output from
the jet mass through the implementation of an ANN.
Figure 7.6: Comparison of the background rejection as a function of signal efficiency of the
DNN and ANN taggers between UFO jets and LCTopo jets in (a) 500 < pT < 1000 GeV
and (b) 1000 < pT < 2000 GeV [155].
    The ANN W/Z tagger consists of a trained adversarial network that competes with the
DNN tagger described before. The adversarial network is trained to infer the jet mass from
the outputs of the DNN. A modified loss function is minimized to train the combined ANN-DNN system.
This mass decorrelation makes it possible for this network to be trained for identifying both
W and Z bosons. Figure 7.6 shows a comparison of the performance of the W/Z taggers for
different pT ranges.
7.5      High pT Scale Factor Extrapolation
To correct the tagging efficiencies in simulation to match those observed in data, it is
necessary to define scale factors. The scale factors, defined as the ratio of the tagging efficiency
in data to the tagging efficiency in simulation, can only be determined up to the pT at which
the calibration dataset still has enough events. The data-to-simulation scale factors are
therefore limited in range, and alternative methods must be used to explore the high pT regime. A
simulation-based method [163] was used to extrapolate the scale factors up to a pT of 3 TeV
for all the supported UFO taggers. Uncertainties were also estimated by recomputing the
MC-to-MC scale factors using alternative datasets.
Strategy
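The scale factor for any tagger at a specific working point is defined as
\[ \mathrm{SF}(p_T) = \frac{\epsilon_{\mathrm{data}}}{\epsilon_{\mathrm{MC}}}, \tag{7.19} \]
where $\epsilon_{\mathrm{data}}$ is the tagging efficiency in data and $\epsilon_{\mathrm{MC}}$ is the efficiency from simulation. Every
scale factor has an associated uncertainty $\sigma(\mathrm{SF}(p_T))$. The scale factors are available for
jets with $p_T < p_{T,\mathrm{ref}}$, where $p_{T,\mathrm{ref}}$ is defined as the “reference momentum”. For the top
taggers the reference momentum is 1 TeV and for the W/Z taggers the reference momentum
is 450 GeV. To extrapolate the scale factors to higher momenta ($p_T > p_{T,\mathrm{ref}}$) we define the
multiplicative factor $R^{\mathrm{MC}}(p_T; p_{T,\mathrm{ref}})$. The model used is
\[ \mathrm{SF}(p_T) = \mathrm{SF}(p_{T,\mathrm{ref}}) \cdot R^{\mathrm{MC}}(p_T; p_{T,\mathrm{ref}}) \quad \text{for } p_T > p_{T,\mathrm{ref}}. \tag{7.20} \]
The explicit definition of the multiplicative factor is
\[ R^{\mathrm{MC}}(p_T; p_{T,\mathrm{ref}}) = \frac{\epsilon_{\mathrm{data}}(p_T)/\epsilon_{\mathrm{data}}(p_{T,\mathrm{ref}})}{\epsilon_{\mathrm{MC}}(p_T)/\epsilon_{\mathrm{MC}}(p_{T,\mathrm{ref}})}. \tag{7.21} \]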
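Assuming a weak $p_T$ dependence of the scale factors in the extrapolation region, the
extrapolated scale factor can be estimated as
\[ \mathrm{SF}(p_T) := \mathrm{SF}(p_{T,\mathrm{ref}}) \Rightarrow R^{\mathrm{MC}}(p_T; p_{T,\mathrm{ref}}) = 1 \quad \text{for } p_T > p_{T,\mathrm{ref}}. \tag{7.22} \]
The extrapolation uncertainty is then composed of two terms,
\[ \sigma^2(\mathrm{SF}(p_T)) = \sigma^2(\mathrm{SF}(p_{T,\mathrm{ref}})) + \sigma^2_{\mathrm{extrap}}(R^{\mathrm{MC}}(p_T; p_{T,\mathrm{ref}})) \quad \text{for } p_T > p_{T,\mathrm{ref}}. \tag{7.23} \]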
Extrapolation Uncertainty
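To calculate the extrapolation uncertainty, we use alternative samples (labeled MC$_i$) and
recalculate the efficiency of each tagger ($\epsilon_{\mathrm{MC}_i}$). The uncertainty is then defined as a
function of the sum in quadrature of the differences between the extrapolation factor evaluated
for the nominal MC (equal to 1) and the factor evaluated for every systematic variation
($R_i^{\mathrm{MC}}(p_T; p_{T,\mathrm{ref}})$):
\[ \sigma^2_{\mathrm{extrap}}(R_i^{\mathrm{MC}}(p_T; p_{T,\mathrm{ref}})) / \mathrm{SF}^2(p_{T,\mathrm{ref}}) \propto \sum_i \left( \Delta R_i^{\mathrm{MC}}(p_T; p_{T,\mathrm{ref}}) \right)^2, \tag{7.24} \]
where
\[ R_i^{\mathrm{MC}}(p_T; p_{T,\mathrm{ref}}) = \frac{\epsilon_{\mathrm{MC}_i}(p_T)/\epsilon_{\mathrm{MC}_i}(p_{T,\mathrm{ref}})}{\epsilon_{\mathrm{MC}}(p_T)/\epsilon_{\mathrm{MC}}(p_{T,\mathrm{ref}})}. \tag{7.25} \]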
This proportionality can also be expressed in terms of the difference between the relative
impact of a systematic uncertainty on the efficiency at $p_T$ and at $p_{T,\mathrm{ref}}$. The relative
difference between the nominal efficiency and the efficiency of an alternative sample is given by
\[ \delta^{\mathrm{MC}_i}(p_T) = \frac{\epsilon_{\mathrm{MC}_i}(p_T) - \epsilon_{\mathrm{MC}}(p_T)}{\epsilon_{\mathrm{MC}}(p_T)}. \tag{7.26} \]
Therefore the extrapolation uncertainty can be quantified by
\[ \sigma^2_{\mathrm{extrap}}(p_T; p_{T,\mathrm{ref}}) = \sum_i \max_{<p_T} \left( \delta^{\mathrm{MC}_i}(p_T) - \delta^{\mathrm{MC}_i}(p_{T,\mathrm{ref}}) \right)^2. \tag{7.27} \]
The $\max_{<p_T}$ represents an ad-hoc procedure in which the maximum uncertainty value observed
up to the jet $p_T$ is chosen. This procedure was implemented to ensure that the
extrapolation uncertainty is a monotonically increasing function.
    For each source of uncertainty, the quantities $\sigma^+$ and $\sigma^-$ are calculated using the upper
and lower envelope values of the efficiency. This is done because the total uncertainty is
asymmetric, so the upward and downward variations must be considered separately.
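The construction of the extrapolation uncertainty can be sketched numerically as follows. The sketch assumes binned efficiencies ordered in ascending $p_T$, with the reference bin at index `ref_index`, and reads the $\max_{<p_T}$ prescription as a running maximum applied per systematic source before the sum in quadrature. The function names and this reading are assumptions of the example, and the asymmetric $\sigma^+$ and $\sigma^-$ treatment is not included.

```python
import math


def relative_shift(eff_alt, eff_nom):
    """Relative difference between an alternative and the nominal efficiency."""
    return (eff_alt - eff_nom) / eff_nom


def extrapolation_uncertainty(eff_nom, eff_alts, ref_index=0):
    """Extrapolation uncertainty per pT bin.

    eff_nom: nominal MC efficiencies per pT bin (ascending pT).
    eff_alts: one list of efficiencies per alternative sample MC_i.
    Bins below ref_index are left at zero.
    """
    n_bins = len(eff_nom)
    sigma2 = [0.0] * n_bins
    for eff_alt in eff_alts:
        delta_ref = relative_shift(eff_alt[ref_index], eff_nom[ref_index])
        running_max = 0.0
        for b in range(ref_index, n_bins):
            diff2 = (relative_shift(eff_alt[b], eff_nom[b]) - delta_ref) ** 2
            running_max = max(running_max, diff2)  # largest value seen up to this pT
            sigma2[b] += running_max
    return [math.sqrt(s) for s in sigma2]


def total_sf_uncertainty(sigma_sf_ref, sigma_extrap):
    """Quadrature sum of the two terms, both given in the same units."""
    return math.hypot(sigma_sf_ref, sigma_extrap)
```

Because of the running maximum, the returned values never decrease with $p_T$, even when an alternative sample happens to agree better with the nominal one in a higher bin.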
Samples and Event Selection
The datasets consist of samples of BSM processes. For the top taggers, $Z' \to t\bar{t}$ events are
used, while for the W/Z taggers a $W' \to WZ$ sample was produced. The cross-section of
the events is reweighted to produce a flat $p_T$ distribution in order to populate the region
200 GeV < $p_T$ < 3 TeV. Only events with at least one UFO jet with $p_T$ > 200 GeV, a
reconstructed mass $m$ > 40 GeV, and $|\eta| < 2.0$ are selected.
Top Quarks
$Z' \to t\bar{t}$ events are simulated using Pythia8 with the NNPDF2.3LO PDF set and the A14
tune for the parton shower and multiple parton interactions (MPI). Only hadronically decaying
top quarks are considered. To prevent the inclusion of high $p_T$ gluons recoiling against the $t\bar{t}$ system,
the selection strategy requires that the $t\bar{t}$ pairs are well separated; they must satisfy the
condition $\Delta R(t, \bar{t}) = \sqrt{\Delta\phi_{t\bar{t}}^2 + \Delta\eta_{t\bar{t}}^2} > 2.0$.
    The truth labeling procedure starts by geometrically matching a detector-level large-R jet
($J$) to a particle-level jet ($J_{\mathrm{truth}}$) by requiring $\Delta R(J, J_{\mathrm{truth}}) < 0.75$. Then $J_{\mathrm{truth}}$ is matched
to a top quark by requiring $\Delta R(J_{\mathrm{truth}}, \mathrm{top}) < 0.75$. If these two requirements are satisfied
then the large-R jet is labeled as inclusive top.
    For contained tops, all the decay products of the top quark ($t \to Wb \to qq'b$) must be
included inside the large-R jet. The ungroomed truth labeling strategy is employed. The
strategy requires that at least one b-hadron is ghost-associated to the ungroomed truth jet
($J_{\mathrm{truth}}$). A cut on the ungroomed mass ($m_{\mathrm{ungroomed}} > 140$ GeV) is also required. Finally, the jet
also has to satisfy the ungroomed $p_T$-dependent $k_t$ splitting scale ($\sqrt{d_{23}}$) cut. The splitting
scale cut is
\[ \sqrt{d_{23}} > \exp\left( 3.3 - \frac{6.98 \cdot 10^{-4}}{\mathrm{GeV}} \, p_{T,\mathrm{ungroomed}} \right) \mathrm{GeV}. \tag{7.28} \]
W/Z Bosons
The W′ → WZ events are also simulated using Pythia8 with the NNPDF2.3LO PDF set
and the A14 tune for the parton shower and multiparton interactions (MPI). Only hadronically
decaying W/Z bosons are considered. To prevent overlap, the W and Z bosons must satisfy
the condition
        \Delta R(W, Z) = \sqrt{\Delta\phi_{WZ}^2 + \Delta\eta_{WZ}^2} > 2.0.
    The ungroomed truth labeling strategy is also applied for the W/Z boson sample. Therefore,
large-R jets must satisfy ∆R(J, Jtruth) < 0.75 and ∆R(Jtruth, W/Z) < 0.75. For W-labeled
jets, in order to remove contamination from top-quark decays, the number of ghost-associated
b-hadrons is required to be zero. The ungroomed truth mass (m_ungroomed) is required to be
above 50 GeV, and a cut on the energy scale of the first kt declustering (√d12) must also be
satisfied. The cut is defined as
        \sqrt{d_{12}} > 55.25 \cdot \exp\left[ \frac{-2.34 \cdot 10^{-3}}{\text{GeV}}\, p_{T,\text{ungroomed}} \right] \text{GeV}.              (7.29)
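The contained-top and W labeling requirements can be collected in a short sketch. The function names and argument layout are hypothetical; the ∆R matching steps are assumed to have been applied already, and all masses, momenta and splitting scales are taken to be in GeV. The numerical constants are those of Eqs. (7.28) and (7.29).

```python
import math

def sqrt_d23_threshold(pt_ungroomed):
    """Minimum sqrt(d23) in GeV for contained tops, Eq. (7.28)."""
    return math.exp(3.3 - 6.98e-4 * pt_ungroomed)

def sqrt_d12_threshold(pt_ungroomed):
    """Minimum sqrt(d12) in GeV for W-labeled jets, Eq. (7.29)."""
    return 55.25 * math.exp(-2.34e-3 * pt_ungroomed)

def is_contained_top(n_b_hadrons, m_ungroomed, sqrt_d23, pt_ungroomed):
    """Contained-top label for a truth jet already matched to a top quark."""
    return (n_b_hadrons >= 1
            and m_ungroomed > 140.0
            and sqrt_d23 > sqrt_d23_threshold(pt_ungroomed))

def is_labeled_w(n_b_hadrons, m_ungroomed, sqrt_d12, pt_ungroomed):
    """W label for a truth jet already matched to a W boson."""
    return (n_b_hadrons == 0
            and m_ungroomed > 50.0
            and sqrt_d12 > sqrt_d12_threshold(pt_ungroomed))

# Both thresholds decrease with pT, so the cuts loosen for more boosted jets.
for pt in (500.0, 1000.0, 2000.0):
    print(pt, round(sqrt_d23_threshold(pt), 2), round(sqrt_d12_threshold(pt), 2))
```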
Uncertainty Sources
Shower Variations
The variation of the unphysical scales that arise in fixed order QCD calculations can be used
to estimate the theoretical uncertainties associated with showering models.
    The Pythia event generator provides a way to evaluate these variations of the renor-
malization scale (µR ) and splitting kernel in the showering process. Standard parton shower
algorithms generate the scale of the next branching by solving an equation that is a function
of the differential branching probability [164]. Pythia uses transverse momentum-ordered
showers, where the differential branching probability is given by
        P(t, z) = \frac{\alpha_s(t)}{2\pi} \frac{P(z)}{t},                          (7.30)
where t = p⊥², z is the momentum fraction carried by a parton after the splitting and P(z) is the
DGLAP splitting kernel. The splitting kernel represents the probability of a particular parton
splitting into two daughter partons with specified momentum fractions.
For a baseline gluon emission with an NLO compensating term and a renormalization scale
variation µR = p⊥ → µ′R = k p⊥, the probability density is given by
        P'(t, z) = \frac{\alpha_s(\mu'_R)}{2\pi} \left( 1 + \frac{\alpha_s}{2\pi} \beta_0 \ln k \right) \frac{P(z)}{t},              (7.31)
where β0 = (11NC − 2nF )/3 with NC = 3 and nF the number of active flavors at the
scale µ = p⊥ . The renormalization scale (µR ) variation is performed independently for ISR
and FSR branchings given that their kernels receive different NLO corrections. The scale is
varied by factors of 2 and 1/2. A 7-point scale variation prescription is employed to calculate
the uncertainty:
        \frac{\xi}{\xi_0} = \left\{ (0.5, 0.5),\ (1, 0.5),\ (0.5, 1),\ (1, 1),\ (1, 2),\ (2, 1),\ (2, 2) \right\},        (7.32)
where ξ = (µR^ISR, µR^FSR) and ξ0 represents the nominal (central) value.
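The seven points of Eq. (7.32) are the nine combinations of the factors 0.5, 1 and 2 with the two opposite-direction extremes, (0.5, 2) and (2, 0.5), removed. A minimal sketch that generates them (the function name is hypothetical):

```python
from itertools import product

SCALE_FACTORS = (0.5, 1.0, 2.0)

def seven_point_variations():
    """(k_ISR, k_FSR) pairs, dropping the two opposite-direction extremes."""
    return [
        (k_isr, k_fsr)
        for k_isr, k_fsr in product(SCALE_FACTORS, repeat=2)
        if k_isr / k_fsr not in (0.25, 4.0)
    ]

variations = seven_point_variations()
print(variations)
```

The ratio test is exact here because all factors are powers of two.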
    In addition to the 7-point scale variation, two extra variations related to non-singular
finite terms in the splitting functions are included. In the context of a DGLAP approach,
the shower splitting kernels are modified in the following way:
        \frac{P(z)}{Q^2}\, dQ^2 \rightarrow \left( \frac{P(z)}{Q^2} + \frac{c_{NS}}{m_{\text{dip}}^2} \right) dQ^2 = \left( P(z) + \frac{c_{NS}\, Q^2}{m_{\text{dip}}^2} \right) \frac{dt}{t},  (7.33)
where mdip is the invariant mass of the dipole in which the splitting occurs and cNS is a di-
mensionless constant that parametrizes the amount of non-singular splitting kernel variation.
These two splitting kernel variations are labeled as “hardHi” and “hardLo” and correspond
to the values cNS = ±2. The 7(+2) point variation envelope (up and down) around the
nominal value is then chosen to estimate the uncertainties. Two variations due to the PDF
set chosen are also included as an independent source of uncertainty.
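The envelope step can be sketched as below. The efficiency values are invented for illustration, and expressing σ+ and σ− as deviations relative to the nominal efficiency is an assumption of this sketch rather than a definition taken from the analysis.

```python
def envelope(nominal, varied):
    """Return (sigma_up, sigma_down) as relative deviations from nominal.

    varied maps a variation label to the efficiency obtained with it.
    """
    values = list(varied.values())
    sigma_up = max(0.0, max(values) - nominal) / nominal
    sigma_down = max(0.0, nominal - min(values)) / nominal
    return sigma_up, sigma_down

# Illustrative tagging efficiencies in one pT bin (not analysis results).
nominal_efficiency = 0.500
varied_efficiencies = {
    "muR (0.5, 0.5)": 0.485,
    "muR (1, 0.5)": 0.490,
    "muR (0.5, 1)": 0.495,
    "muR (1, 2)": 0.505,
    "muR (2, 1)": 0.502,
    "muR (2, 2)": 0.510,
    "hardHi": 0.503,
    "hardLo": 0.498,
}
sigma_up, sigma_down = envelope(nominal_efficiency, varied_efficiencies)
print(f"sigma+ = {sigma_up:.3f}, sigma- = {sigma_down:.3f}")
```

Keeping the two directions separate preserves the asymmetry between the ‘up’ and ‘down’ components.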
Detector Geometry and Nuclear Interaction Models
Several alternative MC samples were produced to determine the uncertainties associated
with nuclear interactions and detector geometry variations. The samples are part of the
Geant4 “Physics List” [165] and thus modify observables at the reconstruction level. All
the samples are summarized with a short description in Tables 7.1 and 7.2.
    The detector geometry variations simulate misaligned detector geometry or other imper-
fections that could have an impact on how the detector records the collision events. The
variations include LAr distorted geometry, inner detector variations and material scaling.
    The nuclear interaction modeling involves the consideration of different models for elastic,
inelastic, capture and fission processes. Some of the nuclear interaction models used in these
alternative samples are the quark gluon string model (QGS) [166], the Bertini intranuclear
cascade model (BERT) [167], the Fritiof model of string excitation (FTF) [168] and the chiral
invariant phase space model (CHIPS) [169]. QGS uses color flow between partons to simulate
reactions of high energy hadrons with nuclei and high energy electro-nuclear reactions. BERT
is a classical model that solves the Boltzmann equation for the transport of a particle through
a “gas” of nucleons. It is valid for protons, neutrons, pions, kaons and hyperons with kinetic
energies up to 10 GeV. FTF is a high energy string model that simulates hadron-hadron,
hadron-nucleus and nucleus-nucleus interactions. It includes elastic scattering and a separate
simulation of single diffractive and non-diffractive events, and it is meant to work at energies
between 3 GeV and 1 TeV. CHIPS is used to approximate the Drell-Yan process in
hadron-nucleon interactions. In Geant4 it is used for γ-nuclear interactions, nuclear capture of
negatively charged hadrons and quasi-elastic scattering processes.
Table 7.1: Summary of the “Physics List” samples used to estimate the high pT extrapolation
uncertainties related to detector geometry variations.
   s-tag    Description
   3126     Nominal sample - Nominal detector geometry
   3373     LAr distorted geometry with material between LAr and Tile
   3374     LAr distorted geometry with all distortions before LAr EM
   3375     Inner detector systematic geometry variations
   3376     Inner detector with +5% overall material scaling
Table 7.2: Summary of the “Physics List” samples used to estimate the high pT extrapolation
uncertainties related to nuclear interaction models.
   s-tag    Description
   3126     Nominal sample - QGS, BERT
   3295     FTF, BERT with no diffraction
   3296     QGS for high energies, FTF for lower energies, BERT
   3297     FTF, BERT with no re-scattering of the final state
   3298     FTF, BERT, CHIPS for nuclear capture
   3299     FTF, BERT with a high precision data driven neutron transport package
Results
The total uncertainties for the high pT extrapolation of the scale factors due to scale
variations, PDFs, nuclear interaction modeling and detector geometry variations are summarized
in Figures 7.7 - 7.15. Appendix B contains all the efficiency plots as well as the efficiency
envelopes used to calculate the uncertainties.
    For the top taggers, the total uncertainty ranges from 0% at pT,ref = 1000 GeV (by
construction) to 1-4% at pT = 3 TeV. The ‘down’ component of the total uncertainty is
larger across all taggers and all working points. Shower scale variations are the largest
source of uncertainty, while nuclear interaction modeling, detector geometry and PDF
variations are negligible (< 1%).
    The 3-var W/Z tagger ‘up’ uncertainties range from 0.5% to 1.5% at pT = 3 TeV. On the
other hand, the ‘down’ uncertainties at the same pT range from 1.75% to 2.25%. As for
the top taggers, the largest source of uncertainty is the scale variations.
    For the DNN W tagger, the total uncertainties for the 50% WP are much larger than
the uncertainties for the 80% WP. The ‘up’ uncertainties for jets with pT = 3 TeV go from
1% for the 80% WP to 3% for the 50% WP. Similarly, the ‘down’ components are 0.5% and
1.5% for the 80% WP and 50% WP respectively.
    The ANN W/Z taggers have overall the lowest extrapolation uncertainties. Scale
variations are still the largest source in most, but not all, cases. The largest uncertainties
come from the ‘down’ component of the W tagger at the 50% WP, with a value
of around 1.75% at a pT of 3 TeV.
Figure 7.7: Total scale factor extrapolation uncertainty for the contained top DNN tagger
at the (a,b) 50% WP and (c,d) 80% WP as a function of pT . Plots (a,c) show the ‘up’ (σ + )
uncertainty component and plots (b,d) show the ‘down’ (σ − ) uncertainty component.
Figure 7.8: Total scale factor extrapolation uncertainty for the inclusive top DNN tagger at
the (a,b) 50% WP and (c,d) 80% WP as a function of pT . Plots (a,c) show the ‘up’ (σ + )
uncertainty component and plots (b,d) show the ‘down’ (σ − ) uncertainty component.
Figure 7.9: Total scale factor extrapolation uncertainty for the 3-var W tagger at the (a,b)
50% WP and (c,d) 80% WP as a function of pT . Plots (a,c) show the ‘up’ (σ + ) uncertainty
component and plots (b,d) show the ‘down’ (σ − ) uncertainty component.
Figure 7.10: Total scale factor extrapolation uncertainty for the 3-var Z tagger at the (a,b)
50% WP and (c,d) 80% WP as a function of pT . Plots (a,c) show the ‘up’ (σ + ) uncertainty
component and plots (b,d) show the ‘down’ (σ − ) uncertainty component.
Figure 7.11: Total scale factor extrapolation uncertainty for the DNN W tagger at the (a,b)
50% WP and (c,d) 80% WP as a function of pT . Plots (a,c) show the ‘up’ (σ + ) uncertainty
component and plots (b,d) show the ‘down’ (σ − ) uncertainty component.
Figure 7.12: Total scale factor extrapolation uncertainty for the ANN W tagger at the (a,b)
50% WP, (c,d) 60% WP and (e,f) 70% WP as a function of pT . Plots (a,c,e) show the ‘up’ (σ + )
uncertainty component and plots (b,d,f) show the ‘down’ (σ − ) uncertainty component.
Figure 7.13: Total scale factor extrapolation uncertainty for the ANN W tagger at the (a,b)
80% WP and (c,d) 90% WP as a function of pT . Plots (a,c) show the ‘up’ (σ + ) uncertainty
component and plots (b,d) show the ‘down’ (σ − ) uncertainty component.
Figure 7.14: Total scale factor extrapolation uncertainty for the ANN Z tagger at the (a,b)
50% WP, (c,d) 60% WP and (e,f) 70% WP as a function of pT . Plots (a,c,e) show the ‘up’ (σ + )
uncertainty component and plots (b,d,f) show the ‘down’ (σ − ) uncertainty component.
Figure 7.15: Total scale factor extrapolation uncertainty for the ANN Z tagger at the (a,b)
80% WP and (c,d) 90% WP as a function of pT . Plots (a,c) show the ‘up’ (σ + ) uncertainty
component and plots (b,d) show the ‘down’ (σ − ) uncertainty component.
7.6 Multiclass Tagger for Boosted UFO jets
Multiclass problems have more structure than binary classification problems because the
output relates the classes to each other. This section explores studies related to building
a multiclass tagger (MCT) based on a deep neural network that distinguishes between five
classes: top quarks, W bosons, Z bosons, Higgs bosons and QCD jets. This kind of tagger has
already been found to be effective by the CMS collaboration [170]. The aim is to gain insight
into the feasibility of a unified tagger for multiple types of particles, as well as the performance
difference between taggers developed for UFO jets and taggers developed for LCTopo jets.
    To train the deep neural network, Monte Carlo samples were produced. For top quarks
and W/Z bosons, the same samples and selection criteria described in Section 7.5 were
employed. For Higgs bosons, a new sample of the BSM process G → HH was used. Here G
refers to Randall-Sundrum gravitons [171]. The events are simulated using Pythia8 with
the NNPDF2.3LO PDF set and the A14 tune. An ungroomed truth labeling strategy was
applied to select events in which the truth Higgs boson is associated with the reconstructed
large-R jet. Two b-hadrons ghost-associated to the ungroomed truth jet were also required.
The events are reweighted so that the pT distribution is the same for each sample.
    The initial set of features used as input included kinematic variables, N-subjettiness
variables and their ratios, generalized energy correlation functions and their ratios, jet charge,
splitting scales, number of constituents, Fox-Wolfram moment, angularity, aplanarity, planar
flow, kinematic variables of up to three track jets associated with the large-R jet and finally
the output scores of the DL1r b-tagger. The total number of features was 74, so a strategy
to significantly reduce this number was required.
    Before any study, the input variables were standardized so that each has zero mean and
unit variance, as for a standard Gaussian distribution,
        x_{\text{scaled}} = \frac{x - \mu}{\sigma},                                        (7.34)
where µ and σ are the mean and standard deviation of the feature respectively. This procedure
reduces the impact of outliers and also ensures that all features are on the same scale,
preventing features with larger values from dominating. Class imbalances are reduced by
weighting the events such that the total weighted number of events per class is similar. Larger
weights are given to the classes that have fewer events in the samples.
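A minimal sketch of the two preprocessing steps follows. The function names and toy inputs are hypothetical, and the inverse-frequency form of the class weights is one common choice assumed here; the text above only states that rarer classes receive larger weights.

```python
from collections import Counter
from statistics import fmean, pstdev

def standardize(values):
    """Apply Eq. (7.34) to one feature: subtract the mean, divide by sigma."""
    mu = fmean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]

def class_weights(labels):
    """Inverse-frequency weights so every class has the same total weight."""
    counts = Counter(labels)
    n_classes = len(counts)
    return {label: len(labels) / (n_classes * count) for label, count in counts.items()}

# Toy inputs for illustration only.
jet_mass = [80.0, 91.0, 125.0, 173.0, 60.0, 45.0]
labels = ["QCD"] * 6 + ["H"] * 2

scaled_mass = standardize(jet_mass)
weights = class_weights(labels)
print(weights)
```

In this toy input the Higgs class has three times fewer events than the QCD class and therefore receives a weight three times larger.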
Feature Selection
The permutation feature importance algorithm was applied to a simple deep neural network
with 4 hidden layers. The permutation feature importance [172] is a model inspection
technique that measures the decrease in a model score when the values of a single feature
are randomly shuffled. Let s be the reference score (i.e. accuracy) of the model on the testing
dataset. For each feature j, the values of the feature are randomly permuted (shuffled) K times,
and for each repetition k = 1, · · · , K the score s_{k,j} of the model is recomputed. The
importance of feature j is then defined as
        i_j = s - \frac{1}{K} \sum_{k=1}^{K} s_{k,j}.                            (7.35)
The permutation feature importance tests revealed rapidly declining and overall low values
of the importance scores for the original 74 features. A falsely low importance score can
be obtained when features are strongly correlated with each other: the model still has access
to the information of the shuffled feature through its correlated features, and the test falsely
concludes that the feature is not important. Therefore, it is important to understand the
correlations between features, as the set of features selected may contain redundant
information. Figure 7.16 shows the correlation matrix of the initial set of features used as
inputs for the deep neural network. It is evident that there are high correlations between
some features, which can be seen as clusters in the correlation matrix.
Figure 7.16: Correlation matrix of the initial set of features used as inputs for the MCT.
Red represents high correlation while blue represents low correlation.
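The importance of Eq. (7.35) can be sketched in a few lines. The toy model, the data and the function names below are hypothetical stand-ins for the neural network and the testing dataset; accuracy is used as the score s.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the true label."""
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(rows)

def permutation_importance(model, rows, labels, n_repeats=10, seed=0):
    """Importance i_j = s - mean_k(s_kj) for every feature j."""
    rng = random.Random(seed)
    reference = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        scores = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [row[:j] + [value] + row[j + 1:]
                        for row, value in zip(rows, column)]
            scores.append(accuracy(model, shuffled, labels))
        importances.append(reference - sum(scores) / n_repeats)
    return importances

def toy_model(row):
    """Stand-in classifier that looks only at feature 0."""
    return 1 if row[0] > 0.0 else 0

data_rng = random.Random(1)
rows = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
labels = [toy_model(row) for row in rows]
importances = permutation_importance(toy_model, rows, labels)
print(importances)
```

Feature 1 is ignored by the toy model, so shuffling it leaves the score unchanged and its importance is zero, while shuffling feature 0 degrades the accuracy substantially.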
    The high degree of correlation can be exploited to reduce the dimensionality of the input
set by employing clustering algorithms. Hierarchical clustering is a method to build
nested clusters at different resolutions. This method requires a distance measure between the
variables X = {X1 , · · · , Xn }. In this case, Ward’s linkage [173] is used as the distance
metric d. The algorithm proceeds as follows:
    Let Tn = {C1 , C2 , · · · , Cn } where Ci = {Xi }. For j = n − 1 to 1:
        (a) Find a, b with a ≠ b that minimize d(Ca , Cb ) over all Ca , Cb ∈ Tj+1 .
        (b) Let Tj be the same as Tj+1 except that Ca and Cb are replaced with Ca ∪ Cb .
    Finally, return the sets of clusters T1 , · · · , Tn . The results can then be represented as a
dendrogram. Inspecting the dendrogram, we can then choose which feature from each cluster
to keep as a representative variable. A threshold of 2 was used for the cophenetic distance,
defined as the height of the dendrogram at which the two branches that contain two given
objects merge into a single branch. The resulting dendrogram is shown in Figure 7.17. The
feature selected within each group is required to have a generally larger importance, as well as
good agreement with data within modeling uncertainties. Appendix B contains data/MC
comparison plots for most of the jet substructure variable distributions. This method
reduced the number of features from 74 to 26 without a significant loss in the accuracy of
the classifier.
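The agglomerative procedure and the choice of one representative per cluster can be sketched as follows. Everything here is an illustrative assumption: each feature is represented by a short made-up vector (for example its row of the correlation matrix), the Ward distance uses the centroid convention under which two single points are separated by their Euclidean distance, and the threshold and importance values are invented.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(p[d] for p in points) / len(points) for d in range(len(points[0]))]

def ward_distance(points_a, points_b):
    """Ward merge distance between two clusters of vectors."""
    size_a, size_b = len(points_a), len(points_b)
    gap = math.dist(centroid(points_a), centroid(points_b))
    return math.sqrt(2.0 * size_a * size_b / (size_a + size_b)) * gap

def agglomerate(vectors, threshold):
    """Merge the closest pair of clusters until the next merge exceeds threshold."""
    clusters = [[name] for name in vectors]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = ward_distance([vectors[n] for n in clusters[a]],
                                  [vectors[n] for n in clusters[b]])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return clusters

def representatives(clusters, importance):
    """Keep the feature with the largest importance from each cluster."""
    return [max(cluster, key=lambda name: importance[name]) for cluster in clusters]

# Hypothetical feature vectors and importance scores.
vectors = {"tau32": [1.0, 0.9, 0.1], "tau21": [0.9, 1.0, 0.2], "D2": [0.1, 0.2, 1.0]}
importance = {"tau32": 0.04, "tau21": 0.01, "D2": 0.03}

clusters = agglomerate(vectors, threshold=0.5)
kept = representatives(clusters, importance)
print(clusters, kept)
```

Stopping once the cheapest merge exceeds the threshold is equivalent to cutting the dendrogram at that height, because Ward merge distances do not decrease from one step to the next. In practice a library implementation would normally be used instead of this quadratic search.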
    The final features used as input for the MCT are ntrk , pT , m, τ32 , e23 , C2 , D2 , L2 ,
L3 , L4 , A, a3 , R2^FW , √d12 , √d23 , Zcut , charge and the 3 DL1r outputs used to define the
discriminant (plight , pb , pc ) for the three leading track jets.
Figure 7.17: Dendrogram representation of the grouping of the input features correlations using hierarchical clustering with
Ward’s linkage as a distance metric.
Deep Neural Network Model
TensorFlow [174] was used as a backend to construct the model. The model was trained for
300 epochs using 75% of the total events. The other 25% is divided equally between
validation and testing. The network contains four hidden layers, each
with 10 to 20 nodes and a rectified linear unit (ReLU) [175] activation function. The output
layer uses a softmax activation function and outputs five scores, each one associated with the
probability of the jet belonging to each class (sQCD , stop , sW , sZ , sH ). The largest score is
taken as the predicted class. The loss function used was the categorical cross-entropy with an
L2 (Ridge regression) [176] regularization, and it was optimized using Adam [177], an extended
version of stochastic gradient descent. The regularization technique introduces a penalty
term into the loss function to prevent overfitting. Results for the network are shown in
Figure 7.18. The model loss as a function of epoch shows a steady decrease in value for both
the training loss and the validation loss. Conversely, the model accuracy increases slowly as a
function of epoch. This shows that the model still has the potential to learn (by training for
more epochs), as no signs of overfitting were detected.
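The output layer and loss described above can be written out for a single jet in plain Python. This is a sketch of the formulas only, not of the TensorFlow model: the logits, the weights and the regularization strength are invented for illustration.

```python
import math

CLASSES = ("QCD", "top", "W", "Z", "H")

def softmax(logits):
    """Convert raw network outputs into scores that sum to one."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(scores, true_index):
    """Loss for one jet: minus the log of the score of the true class."""
    return -math.log(scores[true_index])

def l2_penalty(weights, strength):
    """Ridge term added to the loss to discourage large weights."""
    return strength * sum(w * w for w in weights)

def predicted_class(scores):
    """The class with the largest score is taken as the prediction."""
    return CLASSES[scores.index(max(scores))]

logits = [0.2, 2.5, 0.1, -0.3, 0.4]          # hypothetical last-layer outputs
scores = softmax(logits)
loss = (categorical_cross_entropy(scores, CLASSES.index("top"))
        + l2_penalty([0.5, -1.0, 0.25], strength=1e-3))
print(predicted_class(scores), round(loss, 4))
```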
    The average accuracy considering all classes is 74%. Per class, the tagger achieves an
accuracy of 75% for QCD multijets, 87% for top quarks, 72% for W bosons, 55% for Z bosons
and 81% for Higgs bosons. The MCT does not perform well at identifying Z bosons, with
only 55% accuracy; most of the misidentified Z bosons are classified as W bosons (20%) and
Higgs bosons (13%). The ANN W/Z tagger has shown good performance when an adversarial
term is introduced into the loss function as a mass decorrelation measure. Such an extension
to the MCT could reduce the misidentification rate between these two classes.
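Per-class figures of this kind can be read off a confusion matrix. The sketch below assumes a row-normalized convention, in which each row corresponds to a true class, so the diagonal gives the fraction of that class identified correctly; the labels are toy values.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Row-normalized confusion matrix: rows are true classes."""
    index = {name: i for i, name in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        counts[index[truth]][index[guess]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

classes = ("W", "Z")
truth = ["W", "W", "W", "W", "Z", "Z", "Z", "Z"]
guess = ["W", "W", "W", "Z", "Z", "Z", "W", "W"]
matrix = confusion_matrix(truth, guess, classes)
for name, row in zip(classes, matrix):
    print(name, row)
```

In this toy example half of the true Z bosons are labeled as W bosons, which appears as the off-diagonal entry of the Z row.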
    These results are compatible with other observations from a multiclass tagger on LCTopo
large-R jets [178]. Other than for vector bosons, every class shows an improvement of 2-4%
in accuracy when trained using UFO jets. Further optimization, in the form of a deeper
network as well as hyperparameter tuning, has the potential to increase the performance of
the UFO MCT.
Figure 7.18: Final results for the DNN model used for the MCT. The training results are
presented as (a) the loss function and (b) the accuracy as a function of training epoch. (c)
Confusion matrix derived from the application of the MCT to the testing dataset.
MCT Output Scores
Even though the maximum output score indicates which class the jet is identified as,
the tagger can be calibrated in such a way that its efficiency is fixed at a given
value. For a multiclass tagger, this can be done by placing cuts on the score of the class
(particle) of interest. Distributions of the scores for the 5 classes are shown in Figure 7.19.
For a class i treated as the signal, a single cut on the score si > si,cut fixes the signal
efficiency and background rejection. Figure 7.20 shows the background (QCD, top, W, Z)
rejection as a function of the Higgs signal efficiency. Comparing with 4.11, we see that the
unoptimized MCT, as a single Higgs tagger, is comparable in performance to the double
b-tagging approach employed for the prior boosted H → bb̄ analysis. Recent studies [179] have
shown that using the log of the ratio of the MCT output scores, inspired by log-likelihood
ratios, as a discriminant instead of the score itself provides better background rejection at
all signal efficiencies.
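The working-point definition can be sketched as follows. The scores are invented, the function names are hypothetical, and the specific log-ratio shown (signal score over a single background score) is only one possible form; the text does not specify which ratio is used in [179].

```python
import math

def cut_for_efficiency(signal_scores, target_efficiency):
    """Score cut that keeps roughly the requested fraction of signal jets."""
    ordered = sorted(signal_scores, reverse=True)
    n_pass = max(1, round(target_efficiency * len(ordered)))
    return ordered[n_pass - 1]

def efficiency(scores, cut):
    """Fraction of jets passing the cut (the boundary jet is kept, >=)."""
    return sum(1 for s in scores if s >= cut) / len(scores)

def background_rejection(background_scores, cut):
    """Inverse of the background efficiency."""
    eff = efficiency(background_scores, cut)
    return math.inf if eff == 0.0 else 1.0 / eff

def log_ratio(score_signal, score_background, epsilon=1e-12):
    """Illustrative log-ratio discriminant; epsilon guards against log(0)."""
    return math.log((score_signal + epsilon) / (score_background + epsilon))

# Hypothetical Higgs-score values for true Higgs jets and for QCD jets.
higgs_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
qcd_scores = [0.6, 0.3, 0.2, 0.1, 0.05, 0.04, 0.03, 0.02, 0.01, 0.01]

cut = cut_for_efficiency(higgs_scores, target_efficiency=0.5)
print(cut, efficiency(higgs_scores, cut), background_rejection(qcd_scores, cut))
```

Scanning the target efficiency and recording the rejection at each point traces out a curve of the kind shown in Figure 7.20.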
Figure 7.19: Multiclass tagger outputs scores for UFO SD CS-SK large-R jets truth matched
to (a) QCD multijets, (b) top quarks, (c) W bosons, (d) Z bosons and (e) Higgs bosons.
Figure 7.20: Background rejection for all classes (QCD, top quarks, W bosons and Z bosons)
as a function of the Higgs boson signal efficiency.
b-tagging Importance
Most of these jet substructure variables are already being used for the binary UFO taggers,
providing supporting evidence of their usefulness for the identification of boosted jets
reconstructed with the UFO algorithm. The only input variables of the MCT not currently used
by the binary UFO taggers are the DL1r scores of the track jets associated with UFO large-R
jets. The DL1r scores are the output of a deep neural network that has been fed the outputs
of all the low-level b-taggers. The information used by the low-level algorithms includes
track-jet kinematics, number of tracks, energy fractions of the tracks, number of displaced
vertices as well as their distances, invariant masses and more [87]. Given the rich structure
provided by that information, it is important to study its impact on boosted object tagging.
Currently, only the Xbb tagger [180] uses the discriminants of the DL1r tagger as an input for
its neural network, and it has shown very promising results as a tool for H → bb̄ tagging aimed
at large-R jets constructed from topological clusters. Using the MCT, we can measure the
impact of including the DL1r scores in our UFO tagger by comparing the confusion matrices
obtained on the testing dataset with models trained with and without them. Figure
7.21 shows the confusion matrix resulting from training the MCT without any of the DL1r
output scores as input. Negligible impact is seen for the top quark tagging accuracy, and for
the vector bosons only a 2% decrease in accuracy was observed. The largest impact occurs
in the identification of Higgs bosons, where the accuracy falls from 81% to 57%, and of QCD
jets, where the accuracy falls from 75% to 66%. The rate of QCD jets mistaken
for Higgs bosons increases from 4% to 12%. This shows that the information coming from
b-tagging the track jets is crucial for building a Higgs tagger. Similarly, 11% and 15% of
jets associated with a Higgs boson are classified incorrectly as W and Z bosons respectively.
A random guessing approach would achieve roughly 20% accuracy for each class (5 classes in
total); therefore, with a 57% accuracy when relying only on jet substructure variables, we can
conclude that they also provide rich and useful information for boosted jet identification.
Figure 7.21: Confusion matrix derived from the application of the MCT to the testing dataset,
where the MCT was trained without DL1r output scores.
For Further Research
The combination of the DL1r b-tagger scores with jet substructure variables provides
the opportunity to build a versatile and powerful tool for tagging UFO large-R jets. Even
with the unoptimized small model studied and described in this document, performance
comparable to other taggers was achieved. It is possible that, with proper hyperparameter
tuning and hidden layers with hundreds of nodes (as in state-of-the-art taggers),
considerable gains in performance could be achieved.
    The rate of confusion between the vector bosons could be reduced by modifying
the model to perform mass decorrelation. This could also help with the mass sculpting that is
usually seen in the mass distribution of predicted QCD jets. Following what was implemented
for the ANN W/Z tagger, the loss function should be modified to be of the form
        L = L_C - \lambda L_{MP},                                   (7.36)
where L_C is the original loss function (categorical cross-entropy), L_MP is the loss function
of a mass predictor network and λ quantifies how strong the penalty for the mass correlation
is. The mass predictor network should be trained with the same inputs as the classifier
and has to predict the mass of the jet. When minimized, the new joint loss will improve
classification accuracy while preventing mass correlation.
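The structure of Eq. (7.36) can be illustrated numerically. The use of a mean squared error for the mass predictor and all numerical values below are assumptions of this sketch; the text does not specify the form of L_MP.

```python
def mean_squared_error(predicted, target):
    """Assumed form of the mass-predictor loss L_MP."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def joint_loss(classifier_loss, mass_predictor_loss, penalty_strength):
    """Eq. (7.36): L = L_C - lambda * L_MP."""
    return classifier_loss - penalty_strength * mass_predictor_loss

true_mass = [80.0, 91.0, 125.0]            # GeV, illustrative
good_guess = [82.0, 90.0, 123.0]           # predictor can infer the mass
poor_guess = [100.0, 100.0, 100.0]         # predictor cannot infer the mass

classifier_loss = 0.70                     # hypothetical cross-entropy value
for guess in (good_guess, poor_guess):
    l_mp = mean_squared_error(guess, true_mass)
    print(round(l_mp, 2), round(joint_loss(classifier_loss, l_mp, 1e-3), 4))
```

Because of the minus sign, the joint loss is lower when the mass predictor performs poorly, which is what pushes the classifier away from mass-correlated behavior.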
    The exploration of discriminants constructed from the MCT outputs should also be
studied further. Using the logarithms of the ratios of different scores has been shown to
provide better background rejections than just using the raw scores. Using this method,
the MCT framework is currently being tested for a VV/VH semi-leptonic analysis by the
ATLAS Collaboration [179].
Chapter 8
Conclusions
Boosted H → bb̄ Results
Higgs boson production was studied using its bb̄ decay channel at high transverse momenta.
The measurements are based on pp collisions at a center-of-mass energy of √s = 13 TeV
collected with the ATLAS detector, with a total integrated luminosity of 136 fb−1 . Large-R
jets reconstructed from topological clusters were used to reconstruct the Higgs boson, and
b-tagging techniques were employed for its identification. Signal strengths of the hadronic
decays of the W and Z bosons in the validation region and of Z → bb̄ in the signal region were
used to validate the experimental techniques, with results that agree with the Standard Model.
    Upper limits for the Higgs boson cross section at 95% CL were obtained for the fiducial
region (pT > 450 GeV and |y| < 2) and for four differential regions (300-450, 450-650,
650-1000 and > 1000 GeV).
    The Higgs boson signal strength was measured inclusively for jets with pT > 250 GeV
by a simultaneous fit to the SRL, SRS and CRtt̄, yielding a value of µH = 0.8 ± 3.2.
    For the fiducial region, the observed (expected) 95% CL limit was found to be
        \sigma_H(p_T > 450\ \text{GeV}) < 115\ (128)\ \text{fb}.                        (8.1)
    For the differential region bins, the observed (expected) 95% CL limits were found to be:
                            σH (300 < pT < 450 GeV) < 2.9 (3.1) pb,
                             σH (450 < pT < 650 GeV) < 89 (102) fb,
                             σH (650 < pT < 1000 GeV) < 39 (34) fb,
                                  σH (pT > 1000 GeV) < 9.6 (7.4) fb.                    (8.2)
    For pT > 1 TeV the Higgs boson cross section was found to be:
               σH (pT > 1 TeV) = 2.3 ± 3.9 (stat) ± 1.3 (syst) ± 0.5 (theory) fb.       (8.3)
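    For orientation, the three uncertainty components quoted in Eq. (8.3) can be combined
into a single number. The short Python sketch below assumes that the components are
uncorrelated and adds them in quadrature; this is an illustrative convention, not a statement
of how any total uncertainty in the publication was derived.

```python
import math

def combine_in_quadrature(*components):
    """Quadrature sum of uncertainty components assumed to be uncorrelated."""
    return math.sqrt(sum(c * c for c in components))

central = 2.3                                  # fb, central value from Eq. (8.3)
total = combine_in_quadrature(3.9, 1.3, 0.5)   # fb: stat, syst, theory
print(f"sigma_H(pT > 1 TeV) = {central} +- {total:.1f} fb")
print(f"central value / total uncertainty = {central / total:.2f}")
```

Under this assumption the total is about 4.1 fb and is dominated by the statistical component,
in line with the expectation that larger datasets will improve the measurement.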
    All of the results are consistent with the Standard Model predictions. This analysis
establishes a template for future analyses with larger datasets (e.g. Run 3) and provides
cross-section results for Higgs boson production in the high-pT regions, which are useful
inputs for combination analyses and for constraints on physics beyond the Standard Model.
The work was published in the ATLAS publication “Constraints on Higgs boson production
with large transverse momentum using H → bb̄ decays in the ATLAS detector” [10] and is
supported by the internal ATLAS note “Study of Higgs boson
production at High p_T^H in the H → bb̄ Decay Mode” [122].


UFO Jets Tagging
Studies supporting ATLAS efforts to develop new tagging frameworks using unified flow ob-
ject (UFO) jets were performed. First, for the currently supported and already implemented
UFO taggers, a strategy to extrapolate the scale factors to higher pT regimes inaccessible
with calibration datasets was devised and implemented. Multiple sources were used to
estimate the uncertainty associated with the extrapolation procedure. These include
renormalization scale variations for ISR and FSR branchings, alternative PDFs, detector geometry
variations and alternative nuclear interaction models. The uncertainties were found to range
from negligible in some cases up to 4% at the highest pT accessible (3 TeV). The taggers and
working points for which this method was performed were: the DNN inclusive top tagger
(50%, 80%), the DNN contained top tagger (50%, 80%), the DNN W tagger (50%, 80%),
the ANN W/Z tagger (50%, 60%, 70%, 80%, 90%) and the 3-variable W/Z taggers (50%,
80%).
    Finally, studies laying the ground for a dedicated UFO jet multi-class tagger for jets
resulting from QCD, W bosons, Z bosons, top quarks and Higgs bosons were performed.
The tagger developed was compared to an earlier topocluster-based tagger, showing overall
better or equal accuracies for all classes. The tagger is a classifier built as a multi-output
deep neural network. Hierarchical clustering based on correlations in conjunction with a
permutation importance approach was used to establish the optimal number of features that
serve as input. B-tagging and track jet information were shown to be crucial,
particularly for Higgs boson tagging. Although the calibration of the tagger was outside
the scope of this project, the tagger adds to the many results that support the inclusion of jet
substructure variables in conjunction with track jet information to achieve better tagging
performance.
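    The permutation importance step mentioned above can be sketched as follows. This is a
minimal pure-Python illustration on a toy classifier and toy data, both hypothetical; it does
not reproduce the tagger, its input features or the hierarchical clustering of correlated
features used in this work.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model prediction matches the label."""
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(rows)

def permutation_importance(model, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # break the link between feature j and the labels
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

def toy_model(row):
    """Hypothetical classifier that only looks at feature 0."""
    return 1 if row[0] > 0.5 else 0

# Toy dataset: feature 0 determines the label, feature 1 is pure noise.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [1 if row[0] > 0.5 else 0 for row in rows]

importances = permutation_importance(toy_model, rows, labels)
print(importances)
```

Features whose shuffling leaves the accuracy unchanged, such as the noise feature here, are
candidates for removal from the input set.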
    The move from trimmed topological-cluster jets to soft-drop-groomed CS-SK UFO jets
will improve any analysis that focuses on boosted objects reconstructed as large-R jets, owing
to the improved mass resolution and jet substructure reconstruction of UFO jets. This new
jet input definition, coupled with all the taggers in development, will aid future iterations
of multiple analyses, including the one presented in this thesis (boosted hadronic H → bb̄), by
providing reduced jet systematic uncertainties and higher sensitivity through enhanced background rejection.
Furthermore, by the end of the LHC’s Run 3, it is expected that a total integrated luminosity
of 350 fb−1 will be delivered, reducing even further the statistical limitations of the analysis.


                              BIBLIOGRAPHY
[1] ATLAS Collaboration. “Observation of a new particle in the search for the Standard
    Model Higgs boson with the ATLAS detector at the LHC”. In: Physics Letters B
    716.1 (Sept. 2012), pages 1–29. doi: 10.1016/j.physletb.2012.08.020.
    url: https://doi.org/10.1016%2Fj.physletb.2012.08.020 (cited on page 2).
[2] CMS Collaboration. “Observation of a new boson at a mass of 125 GeV with the CMS
    experiment at the LHC”. In: Physics Letters B 716.1 (Sept. 2012), pages 30–61. doi:
    10.1016/j.physletb.2012.08.021. url: https://doi.org/10.1016%2Fj.physletb.2012.08.
    021 (cited on page 2).
[3] L. Evans and P. Bryant. “LHC Machine”. In: Journal of Instrumentation 3.08 (Aug.
    2008), S08001–S08001. doi: 10.1088/1748-0221/3/08/s08001. url: https://doi.org/
    10.1088/1748-0221/3/08/s08001 (cited on pages 2, 24, 33).
[4] CMS Collaboration. “Combined measurements of Higgs boson couplings in proton-proton
    collisions at √s = 13 TeV”. In: The European Physical Journal C 79.5 (May
    2019). doi: 10.1140/epjc/s10052-019-6909-y. url: https://doi.org/10.1140%2Fepjc%
    2Fs10052-019-6909-y (cited on page 2).
[5] ATLAS Collaboration. “Combined measurements of Higgs boson production and decay
    using up to 80 fb−1 of proton-proton collision data at √s = 13 TeV collected with the
    ATLAS experiment”. In: Physical Review D 101.1 (Jan. 2020). doi: 10.1103/physrevd.
    101.012002. url: https://doi.org/10.1103%2Fphysrevd.101.012002 (cited on page 2).
[6] ATLAS Collaboration. “Observation of H → bb̄ decays and VH production with the
    ATLAS detector”. In: Physics Letters B 786 (Nov. 2018), pages 59–86. doi: 10.1016/
    j.physletb.2018.09.013. url: https://doi.org/10.1016%2Fj.physletb.2018.09.013 (cited
    on page 2).
[7] R. V. Harlander and T. Neumann. “Probing the nature of the Higgs-gluon coupling”.
    In: Phys. Rev. D 88 (7 Oct. 2013), page 074015. doi: 10.1103/PhysRevD.88.074015.
    url: https://link.aps.org/doi/10.1103/PhysRevD.88.074015 (cited on page 2).
[8] K. Mimasu, V. Sanz, and C. Williams. “Higher order QCD predictions for associated
    Higgs production with anomalous couplings to gauge bosons”. In: Journal of High
    Energy Physics 2016.8 (Aug. 2016). doi: 10.1007/jhep08(2016)039. url: https://doi.
    org/10.1007%2Fjhep08%282016%29039 (cited on page 2).
 [9] D. Kar. Experimental Particle Physics. 2053-2563. IOP Publishing, 2019. isbn: 978-0-
     7503-2112-9. doi: 10.1088/2053-2563/ab1be6. url: https://dx.doi.org/10.1088/2053-
     2563/ab1be6 (cited on page 3).
[10] ATLAS Collaboration. “Constraints on Higgs boson production with large transverse
     momentum using H → bb̄ decays in the ATLAS detector”. In: Physical Review D
     105.9 (May 2022). doi: 10.1103/physrevd.105.092003. url: https://doi.org/10.1103%
     2Fphysrevd.105.092003 (cited on pages 3, 81, 85–88, 91, 93, 98, 104, 106, 108–120,
     170).
[11] ATLAS Collaboration. Optimisation of large-radius jet reconstruction for the ATLAS
     detector in 13 TeV proton-proton collisions. Technical report. All figures including
     auxiliary figures are available at
     https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/CONFNOTES/ATLAS-CONF-2020-021.
     Geneva: CERN, 2020. url: https://cds.cern.ch/record/2723736 (cited on pages 4, 121,
     127–129).
[12] S. Weinberg. The Quantum Theory of Fields. Press Syndicate of the University of
     Cambridge, 1995 (cited on pages 5, 6).
[13] M. Loaiza. “A Short Introduction to Hilbert Space Theory”. In: Journal of Physics:
     Conference Series 839.1 (May 2017), page 012002. doi: 10.1088/1742-6596/839/1/
     012002. url: https://dx.doi.org/10.1088/1742-6596/839/1/012002 (cited on page 5).
[14] W. Tung. Group Theory in Physics. World Scientific Publishing, 1985 (cited on page 6).
[15] Wikimedia Commons. Standard Model of Elementary Particles. 2019. url: https://
     commons.wikimedia.org/wiki/File:Standard_Model_of_Elementary_Particles.svg (cited
     on page 17).
[16] R. L. Workman et al. “Review of Particle Physics”. In: PTEP 2022 (2022),
     page 083C01. doi: 10.1093/ptep/ptac097 (cited on pages 19, 20, 25).
[17] G. P. Salam. “Towards jetography”. In: The European Physical Journal C 67.3-4 (May
     2010), pages 637–686. doi: 10.1140/epjc/s10052-010-1314-6. url: https://doi.org/10.
     1140%2Fepjc%2Fs10052-010-1314-6 (cited on pages 18, 57).
[18] N. Cabibbo. “Unitary Symmetry and Leptonic Decays”. In: Phys. Rev. Lett. 10 (12
     June 1963), pages 531–533. doi: 10.1103/PhysRevLett.10.531. url: https://link.aps.
     org/doi/10.1103/PhysRevLett.10.531 (cited on page 20).
[19] J. Riebesell. Higgs mechanism TikZ graphic. 2021. url: https://tikz.net/higgs-potential/
     (cited on page 22).
[20] CERN. CERN Yellow Reports: Monographs, Vol 2 (2017): Handbook of LHC Higgs
     cross sections: 4. Deciphering the nature of the Higgs sector. en. 2017. doi: 10.23731/
     CYRM-2017-002. url: https://e-publishing.cern.ch/index.php/CYRM/issue/view/32
     (cited on page 24).
[21] “Design Report Tevatron 1 Project (Second Printing November 1982)”. Oct. 1982.
     doi: 10.2172/1413194. url: https://www.osti.gov/biblio/1413194 (cited on
     page 24).
[22] LHC Higgs Working Group. Higgs boson cross-sections as a function of mass. url:
     https://twiki.cern.ch/twiki/pub/LHCPhysics/LHCHWGCrossSectionsFigures/plot_13tev_H_sqrt.pdf
     (cited on page 26).
[23] ATLAS Collaboration. “Measurement of the Higgs boson mass in the H → ZZ∗ → 4l
     and H → γγ channels with √s = 13 TeV pp collisions using the ATLAS detector”. In:
     Physics Letters B 784 (2018), pages 345–366. issn: 0370-2693. doi: https://doi.org/
     10.1016/j.physletb.2018.07.050. url: https://www.sciencedirect.com/science/article/
     pii/S0370269318305884 (cited on page 25).
[24] CMS Collaboration. “A measurement of the Higgs boson mass in the diphoton decay
     channel”. In: Physics Letters B 805 (2020), page 135425. issn: 0370-2693. doi: https:
     //doi.org/10.1016/j.physletb.2020.135425. url: https://www.sciencedirect.com/
     science/article/pii/S037026932030229X (cited on page 25).
[25] LHC Higgs Working Group. Higgs boson decay branching ratios as a function of mass.
     url: https://twiki.cern.ch/twiki/pub/LHCPhysics/LHCHWGCrossSectionsFigures/
     SMHiggsBR.YR4-square.pdf (cited on page 27).
[26] ATLAS Collaboration. “Observation of H → bb̄ decays and VH production with the
     ATLAS detector”. In: Physics Letters B 786 (Nov. 2018), pages 59–86. doi: 10.1016/
     j.physletb.2018.09.013. url: https://doi.org/10.1016%2Fj.physletb.2018.09.013 (cited
     on pages 26, 73).
[27] CMS Collaboration. “Observation of Higgs Boson Decay to Bottom Quarks”. In: Phys-
     ical Review Letters 121.12 (Sept. 2018). doi: 10.1103/physrevlett.121.121801. url:
     https://doi.org/10.1103%2Fphysrevlett.121.121801 (cited on page 26).
[28] M. Buschmann et al. “Mass effects in the Higgs-gluon coupling: boosted vs. off-shell
     production”. In: Journal of High Energy Physics 2015 (Feb. 2015), page 38. doi: 10.
     1007/JHEP02%282015%29038 (cited on page 28).
[29] A. Banfi et al. “Digging for top squarks from Higgs data: from signal strengths to
     differential distributions”. In: Journal of High Energy Physics 2018.11 (Nov. 2018). doi:
     10.1007/jhep11(2018)171. url: https://doi.org/10.1007%2Fjhep11%282018%29171
     (cited on page 27).
[30] C. Grojean et al. “Very boosted Higgs in gluon fusion”. In: Journal of High Energy
     Physics 2014.5 (May 2014). doi: 10.1007/jhep05(2014)022. url: https://doi.org/10.
     1007%2Fjhep05%282014%29022 (cited on page 27).
[31] A. Biekötter et al. “Vices and virtues of Higgs effective field theories at large energy”.
     In: Physical Review D 91.5 (Mar. 2015). doi: 10.1103/physrevd.91.055029. url:
     https://doi.org/10.1103%2Fphysrevd.91.055029 (cited on page 27).
[32] M. Grazzini et al. “Modeling BSM effects on the Higgs transverse-momentum spectrum
     in an EFT approach”. In: Journal of High Energy Physics 2017.3 (Mar. 2017). doi:
     10.1007/jhep03(2017)115. url: https://doi.org/10.1007%2Fjhep03%282017%29115
     (cited on page 27).
[33] M. Grazzini, A. Ilnicka, and M. Spira. “Higgs boson production at large transverse
     momentum within the SMEFT: analytical results”. In: The European Physical Journal
     C 78,808 (Oct. 2018). doi: 10.1140/epjc/s10052-018-6261-7. url: https://doi.org/10.
     1140/epjc/s10052-018-6261-7 (cited on pages 28, 29).
[34] G. Altarelli and G. Parisi. “Asymptotic freedom in parton language”. In: Nuclear
     Physics B 126.2 (1977), pages 298–318. issn: 0550-3213. doi: https://doi.org/10.
     1016/0550-3213(77)90384-4. url: https://www.sciencedirect.com/science/article/pii/
     0550321377903844 (cited on page 30).
[35] T.-J. Hou et al. “New CTEQ global analysis of quantum chromodynamics with high-
     precision data from the LHC”. In: Physical Review D 103.1 (Jan. 2021). doi: 10.1103/
     physrevd.103.014013. url: https://doi.org/10.1103%2Fphysrevd.103.014013 (cited on
     page 30).
[36] T. Sjöstrand et al. “High-energy-physics event generation with Pythia 6.1”. In:
     Computer Physics Communications 135.2 (Apr. 2001), pages 238–259.
     doi: 10.1016/s0010-4655(00)00236-8.
     url: https://doi.org/10.1016%2Fs0010-4655%2800%2900236-8 (cited on page 30).
[37] G. Corcella et al. “HERWIG 6: an event generator for hadron emission reactions with
     interfering gluons (including supersymmetric processes)”. In: Journal of High Energy
     Physics 2001.01 (Jan. 2001), pages 010–010. doi: 10.1088/1126-6708/2001/01/010.
     url: https://doi.org/10.1088%2F1126-6708%2F2001%2F01%2F010 (cited on page 30).
[38] T. Gleisberg et al. “Event generation with SHERPA 1.1”. In: Journal of High Energy
     Physics 2009.02 (Feb. 2009), pages 007–007. doi: 10.1088/1126-6708/2009/02/007.
     url: https://doi.org/10.1088%2F1126-6708%2F2009%2F02%2F007 (cited on page 30).
[39] J. Alwall et al. “The automated computation of tree-level and next-to-leading order
     differential cross sections, and their matching to parton shower simulations”. In: Journal
     of High Energy Physics 2014.7 (July 2014). doi: 10.1007/jhep07(2014)079. url: https:
     //doi.org/10.1007%2Fjhep07%282014%29079 (cited on page 30).
[40] S. Hoche. “Introduction to parton-shower event generators”. In: 2014 (cited on page 31).
[41] S. Agostinelli et al. “Geant4—a simulation toolkit”. In: Nuclear Instruments and Meth-
     ods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associ-
     ated Equipment 506.3 (2003), pages 250–303. issn: 0168-9002. doi: https://doi.org/
     10.1016/S0168-9002(03)01368-8. url: https://www.sciencedirect.com/science/article/
     pii/S0168900203013688 (cited on pages 31, 80).
[42] ALICE Collaboration. ALICE Electromagnetic Calorimeter Technical Design Report.
     Technical report. 2008. url: https://cds.cern.ch/record/1121574 (cited on page 34).
[43] LHCb Collaboration. “The LHCb Detector at the LHC”. In: JINST 3 (2008). Also
     published by CERN Geneva in 2010, S08005. doi: 10.1088/1748-0221/3/08/S08005.
     url: https://cds.cern.ch/record/1129809 (cited on page 34).
[44] CMS Collaboration. CMS Physics: Technical Design Report Volume 1: Detector Per-
     formance and Software. Technical design report. CMS. There is an error on cover due
     to a technical problem for some items. Geneva: CERN, 2006. url: https://cds.cern.
     ch/record/922757 (cited on page 34).
[45] ATLAS Collaboration. “The ATLAS Experiment at the CERN Large Hadron Collider”.
     In: Journal of Instrumentation 3.08 (Aug. 2008), S08003–S08003. doi: 10.1088/1748-
     0221/3/08/s08003. url: https://doi.org/10.1088/1748-0221/3/08/s08003 (cited on
     pages 34, 36, 43).
[46] E. Mobs. “The CERN accelerator complex - 2019. Complexe des accélérateurs du
     CERN - 2019”. In: (July 2019). General Photo. url: https://cds.cern.ch/record/
     2684277 (cited on page 34).
[47] ATLAS Collaboration. Luminosity Public Results Run 2. 2018. url: https://twiki.
     cern.ch/twiki/bin/view/AtlasPublic/LuminosityPublicResultsRun2 (visited on 2021)
     (cited on page 35).
[48] J. Pequenao. “Computer generated image of the whole ATLAS detector”. Mar. 2008.
     url: https://cds.cern.ch/record/1095924 (cited on page 36).
[49] ATLAS Collaboration. “ATLAS pixel detector electronics and sensors”. In: Journal of
     Instrumentation 3.07 (July 2008), P07007. doi: 10.1088/1748-0221/3/07/P07007.
     url: https://dx.doi.org/10.1088/1748-0221/3/07/P07007 (cited on pages 37, 38).
[50] J. Pequenao. “Computer generated image of the ATLAS inner detector 1”. 2008. url:
     https://cds.cern.ch/record/1095926 (cited on page 38).
[51] G. Lindström et al. “Developments for radiation hard silicon detectors by defect engi-
     neering—results by the CERN RD48 (ROSE) Collaboration”. In: Nuclear Instruments
     and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors
     and Associated Equipment 465.1 (2001). SPD2000, pages 60–69. issn: 0168-9002. doi:
     https://doi.org/10.1016/S0168-9002(01)00347-3. url: https://www.sciencedirect.
     com/science/article/pii/S0168900201003473 (cited on page 38).
[52] M. Capeans et al. ATLAS Insertable B-Layer Technical Design Report. Technical re-
     port. 2010. url: https://cds.cern.ch/record/1291633 (cited on page 39).
[53] J. Pequenao. “Computer generated image of the ATLAS inner detector 2”. 2008. url:
     https://cds.cern.ch/record/1095926 (cited on page 40).
[54] J. Pequenao. “Computer Generated image of the ATLAS calorimeter”. 2008. url:
     https://cds.cern.ch/record/1095927 (cited on page 41).
[55] F. Cavallari. “Performance of calorimeters at the LHC”. In: Journal of Physics: Con-
     ference Series 293.1 (Apr. 2011), page 012001. doi: 10.1088/1742-6596/293/1/012001.
     url: https://dx.doi.org/10.1088/1742-6596/293/1/012001 (cited on page 42).
[56] ATLAS Collaboration. “The ATLAS Experiment at the CERN Large Hadron Collider”.
     In: Journal of Instrumentation 3.08 (Aug. 2008), page 122. doi: 10.1088/1748-0221/3/
     08/s08003. url: https://doi.org/10.1088/1748-0221/3/08/s08003 (cited on page 45).
[57] ATLAS Collaboration. “The ATLAS Experiment at the CERN Large Hadron Collider”.
     In: Journal of Instrumentation 3.08 (Aug. 2008), page 127. doi: 10.1088/1748-0221/3/
     08/s08003. url: https://doi.org/10.1088/1748-0221/3/08/s08003 (cited on page 46).
[58] ATLAS Collaboration. “The ATLAS Experiment at the CERN Large Hadron Collider”.
     In: Journal of Instrumentation 3.08 (Aug. 2008), page 131. doi: 10.1088/1748-0221/3/
     08/s08003. url: https://doi.org/10.1088/1748-0221/3/08/s08003 (cited on page 46).
[59] J. Pequenao. “Computer generated image of the ATLAS Muons subsystem”. 2008.
     url: https://cds.cern.ch/record/1095929 (cited on page 48).
[60] J.J. Goodson. “Search for Supersymmetry in States with Large Missing Transverse
     Momentum and Three Leptons including a Z-Boson”. Presented 17 Apr 2012. PhD
     thesis. Stony Brook University, May 2012 (cited on page 49).
[61] The ATLAS Collaboration Software and Firmware. Technical report. All figures
     including auxiliary figures are available at
     https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-SOFT-PUB-2021-001.
     Geneva: CERN, 2021. url: https://cds.cern.ch/record/2767187 (cited on page 50).
[62] ATLAS Collaboration. “Performance of the ATLAS trigger system in 2015”. In: The
     European Physical Journal C 77.5 (May 2017). doi: 10.1140/epjc/s10052-017-4852-3.
     url: https://doi.org/10.1140%2Fepjc%2Fs10052-017-4852-3 (cited on page 50).
[63] ATLAS Collaboration. “Operation of the ATLAS trigger system in Run 2”. In: Journal of
     Instrumentation 15.10 (Oct. 2020), P10004–P10004. doi: 10.1088/1748-0221/15/10/
     p10004. url: https://doi.org/10.1088%2F1748-0221%2F15%2F10%2Fp10004 (cited
     on page 50).
[64] F. Meloni. Track and vertex reconstruction in the ATLAS experiment. Technical report.
     Geneva: CERN, 2012. url: https://cds.cern.ch/record/1455427 (cited on pages 52,
     54).
[65] N. Braun. “Combinatorial Kalman Filter”. In: Combinatorial Kalman Filter and High
     Level Trigger Reconstruction for the Belle II Experiment. Cham: Springer International
     Publishing, 2019, pages 117–174. isbn: 978-3-030-24997-7. doi: 10.1007/978-3-030-24997-7_6.
     url: https://doi.org/10.1007/978-3-030-24997-7_6 (cited on page 52).
[66] R. O. Duda and P. E. Hart. “Use of the Hough Transformation to Detect Lines and
     Curves in Pictures”. In: Commun. ACM 15.1 (Jan. 1972), pages 11–15. issn: 0001-
     0782. doi: 10.1145/361237.361242. url: https://doi.org/10.1145/361237.361242 (cited
     on page 53).
[67] ATLAS Collaboration. Performance of the ATLAS Inner Detector Track and Vertex
     Reconstruction in the High Pile-Up LHC Environment. Technical report. All figures
     including auxiliary figures are available at
     https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/CONFNOTES/ATLAS-CONF-2012-042.
     Geneva: CERN, 2012. url: https://cds.cern.ch/record/1435196 (cited on page 53).
[68] M. Cacciari, G. P. Salam, and G. Soyez. “The anti-kt jet clustering algorithm”. In:
     Journal of High Energy Physics 2008.04 (Apr. 2008), pages 063–063.
     doi: 10.1088/1126-6708/2008/04/063.
     url: https://doi.org/10.1088%2F1126-6708%2F2008%2F04%2F063 (cited on pages 54, 56).
[69] ATLAS Collaboration. “Electron reconstruction and identification efficiency measure-
     ments with the ATLAS detector using the 2011 LHC proton–proton collision data”.
     In: The European Physical Journal C 74.7 (July 2014). doi: 10.1140/epjc/s10052-
     014-2941-0. url: https://doi.org/10.1140%2Fepjc%2Fs10052-014-2941-0 (cited on
     page 56).
[70] ATLAS Collaboration. Electron and photon reconstruction and performance
     in ATLAS using a dynamical, topological cell clustering-based approach.
     Technical report. All figures including auxiliary figures are available at
     https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-PHYS-
     PUB-2017-022. Geneva: CERN, 2017. url: https://cds.cern.ch/record/2298955 (cited
     on pages 56, 61).
[71] ATLAS Collaboration. “Topological cell clustering in the ATLAS calorimeters and its
     performance in LHC Run 1”. In: The European Physical Journal C 77.7 (July 2017).
     doi: 10.1140/epjc/s10052-017-5004-5.
     url: https://doi.org/10.1140%2Fepjc%2Fs10052-017-5004-5 (cited on page 58).
[72] ATLAS Collaboration. Optimisation of large-radius jet reconstruction for the ATLAS
     detector in 13 TeV proton-proton collisions. Technical report. All figures including
     auxiliary figures are available at
     https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/CONFNOTES/ATLAS-CONF-2020-021.
     Geneva: CERN, 2020. url: https://cds.cern.ch/record/2723736 (cited on pages 59, 60).
[73] J. Varela. Boosted jets: Increasing transverse momentum. 2019. url: https://www.
     nevis.columbia.edu/reu/2019/talks/Varela_slides.pdf (cited on page 59).
[74] D. Krohn, J. Thaler, and L.-T. Wang. “Jet trimming”. In: Journal of High Energy
     Physics 2010.2 (Feb. 2010). doi: 10.1007/jhep02(2010)084. url: https://doi.org/10.
     1007%2Fjhep02%282010%29084 (cited on page 60).
[75] ATLAS Collaboration. “Performance of jet substructure techniques for large-R jets
     in proton-proton collisions at √s = 7 TeV using the ATLAS detector”. In: Journal
     of High Energy Physics 2013.9 (Sept. 2013). doi: 10.1007/jhep09(2013)076. url:
     https://doi.org/10.1007%2Fjhep09%282013%29076 (cited on page 61).
[76] ATLAS Collaboration. “Jet energy measurement with the ATLAS detector in proton-proton
     collisions at √s = 7 TeV”. In: The European Physical Journal C 73.3 (Mar.
     2013). doi: 10.1140/epjc/s10052-013-2304-2. url: https://doi.org/10.1140%2Fepjc%
     2Fs10052-013-2304-2 (cited on page 61).
[77] ATLAS Collaboration. “In situ calibration of large-radius jet energy and mass in 13
     TeV proton-proton collisions with the ATLAS detector”. In: The European Physical
     Journal C 79.2 (Feb. 2019). doi: 10.1140/epjc/s10052-019-6632-8.
     url: https://doi.org/10.1140%2Fepjc%2Fs10052-019-6632-8 (cited on pages 62, 63).
[78] ATLAS Collaboration. Jet mass reconstruction with the ATLAS Detector in early
     Run 2 data. Technical report. All figures including auxiliary figures are available at
     https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/CONFNOTES/ATLAS-CONF-2016-035.
     Geneva: CERN, 2016. url: https://cds.cern.ch/record/2200211 (cited on page 63).
[79] D. Krohn, J. Thaler, and L.-T. Wang. “Jets with variable R”. In: Journal of High
     Energy Physics 2009.06 (June 2009), pages 059–059. doi: 10.1088/1126-6708/2009/
     06/059. url: https://doi.org/10.1088%2F1126-6708%2F2009%2F06%2F059 (cited on
     page 64).
[80] ATLAS Collaboration. Variable Radius, Exclusive-kT, and Center-of-Mass Subjet
     Reconstruction for Higgs(→ bb̄) Tagging in ATLAS. Technical report. All figures
     including auxiliary figures are available at
     https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-PHYS-PUB-2017-010.
     Geneva: CERN, 2017. url: https://cds.cern.ch/record/2268678 (cited on page 65).
[81] ATLAS Collaboration. “Performance of jet substructure techniques for large-R jets
     in proton-proton collisions at √s = 7 TeV using the ATLAS detector”. In: Journal
     of High Energy Physics 2013.9 (Sept. 2013). doi: 10.1007/jhep09(2013)076. url:
     https://doi.org/10.1007%2Fjhep09%282013%29076 (cited on page 65).
[82] ATLAS Collaboration. “Performance of b-jet identification in the ATLAS experiment”.
     In: Journal of Instrumentation 11.04 (Apr. 2016), P04008–P04008.
     doi: 10.1088/1748-0221/11/04/p04008.
     url: https://doi.org/10.1088%2F1748-0221%2F11%2F04%2Fp04008 (cited on page 66).
[83] ATLAS Collaboration. “ATLAS b-jet identification performance and efficiency
     measurement with tt̄ events in pp collisions at √s = 13 TeV”. In: The European Physical
     Journal C 79.11 (Nov. 2019). doi: 10.1140/epjc/s10052-019-7450-8.
     url: https://doi.org/10.1140%2Fepjc%2Fs10052-019-7450-8 (cited on pages 66, 68, 69).
[84] J. Varela. Track jet schematic with a displaced vertex. 2019. url: https://www.nevis.
     columbia.edu/reu/2019/talks/Varela_slides.pdf (cited on page 67).
[85] ATLAS Collaboration. Secondary vertex finding for jet flavour identification with
     the ATLAS detector. Technical report. All figures including auxiliary figures are
     available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-
     PHYS-PUB-2017-011. Geneva: CERN, 2017. url: https://cds.cern.ch/record/2270366
     (cited on page 66).
[86] ATLAS Collaboration. Topological b-hadron decay reconstruction and iden-
     tification of b-jets with the JetFitter package in the ATLAS experiment
     at the LHC. Technical report. All figures including auxiliary figures are
     available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-
     PHYS-PUB-2018-025. Geneva: CERN, 2018. url: https://cds.cern.ch/record/2645405
     (cited on page 66).
[87] ATLAS Collaboration. “ATLAS b-jet identification performance and efficiency
     measurement with tt̄ events in pp collisions at √s = 13 TeV”. In: The European Physical
     Journal C 79.11 (Nov. 2019). doi: 10.1140/epjc/s10052-019-7450-8.
     url: https://doi.org/10.1140%2Fepjc%2Fs10052-019-7450-8 (cited on pages 67, 82, 166).
[88] A. Hoecker et al. TMVA - Toolkit for Multivariate Data Analysis. 2007. doi: 10.48550/
     ARXIV.PHYSICS/0703039. url: https://arxiv.org/abs/physics/0703039 (cited on
     page 67).
[89] F. Chollet et al. Keras. https://keras.io. 2015 (cited on page 67).
[90] The Theano Development Team et al. Theano: A Python framework for fast computation
     of mathematical expressions. 2016. doi: 10.48550/ARXIV.1605.02688. url:
     https://arxiv.org/abs/1605.02688 (cited on page 67).
[91] ATLAS Collaboration. “Identification of boosted Higgs bosons decaying into b-quark
     pairs with the ATLAS detector at 13 TeV”. In: Eur. Phys. J. C 79.10 (2019), page 836.
     54 pages in total, author list starting page 38, 20 figures, 3 tables. All figures
     including auxiliary figures are available at
     http://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PAPERS/PERF-2017-04.
     doi: 10.1140/epjc/s10052-019-7335-x. arXiv: 1906.11005.
     url: https://cds.cern.ch/record/2680245 (cited on pages 69, 70, 82).
[92] ATLAS Collaboration. “Muon reconstruction performance of the ATLAS detector in
     proton–proton collision data at √s = 13 TeV”. In: Eur. Phys. J. C 76 (2016), page 292.
     Comments: 27 pages including cover page plus author list (45 pages total), 12
     figures, 3 tables, submitted to Eur. Phys. J. C. All figures including auxiliary figures are
     available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PAPERS/PERF-2015-10/.
     doi: 10.1140/epjc/s10052-016-4120-y. arXiv: 1603.05598.
     url: https://cds.cern.ch/record/2139897 (cited on pages 70–72, 82).
[93] ATLAS Collaboration. “Measurements of Higgs Bosons Decaying to Bottom Quarks
     from Vector Boson Fusion Production with the ATLAS Experiment at √s = 13 TeV”.
     In: The European Physical Journal C 81 (Nov. 2020). doi: 10.1140/epjc/s10052-021-09192-8.
     url: https://doi.org/10.48550/arXiv.2011.08280 (cited on page 73).
[94] ATLAS Collaboration. “Search for resonances in the mass distribution of jet pairs with
     one or two jets identified as b-jets in proton-proton collisions at √s = 13 TeV with
     the ATLAS detector”. In: Phys. Rev. D 98 (3 Aug. 2018), page 032016. doi: 10.1103/
     PhysRevD.98.032016. url: https://link.aps.org/doi/10.1103/PhysRevD.98.032016
     (cited on page 73).
[95] G. Cowan et al. “Asymptotic formulae for likelihood-based tests of new physics”. In:
     The European Physical Journal C 71.2 (Feb. 2011). doi: 10.1140/epjc/s10052-011-1554-0.
     url: https://doi.org/10.1140%2Fepjc%2Fs10052-011-1554-0 (cited on page 76).
[96] Good Run Lists for Analysis - Run 2. Dec. 2022. url: https://twiki.cern.ch/twiki/
     bin/view/AtlasProtected/GoodRunListsForAnalysisRun2 (cited on page 77).
[97] ATLAS Collaboration. Luminosity determination in pp collisions at √s = 13 TeV
     using the ATLAS detector at the LHC. Technical report. All figures including
     auxiliary figures are available at
     https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/CONFNOTES/ATLAS-CONF-2019-021.
     Geneva: CERN, 2019. url: https://cds.cern.ch/record/2677054 (cited on pages 77, 78).
[98] ATLAS Collaboration. “Performance of pile-up mitigation techniques for jets in pp
     collisions at √s = 8 TeV using the ATLAS detector”. In: The European Physical
     Journal C 76.11 (Oct. 2016). doi: 10.1140/epjc/s10052-016-4395-z (cited on page 78).
                                              183


 [99] ATLAS Collaboration. “Performance of the ATLAS muon triggers in Run 2”. In: Jour-
      nal of Instrumentation 15.09 (Sept. 2020), P09015–P09015. doi: 10.1088/1748-0221/
      15/09/p09015. url: https://doi.org/10.1088%2F1748-0221%2F15%2F09%2Fp09015
      (cited on page 79).
[100] K. Hamilton, P. Nason, and G. Zanderighi. “MINLO: multi-scale improved NLO”. In:
      Journal of High Energy Physics 2012.10 (Oct. 2012). doi: 10.1007/jhep10(2012)155.
      url: https://doi.org/10.1007%2Fjhep10%282012%29155 (cited on page 79).
[101] S. Alioli et al. “A general framework for implementing NLO calculations in shower
      Monte Carlo programs: the POWHEG BOX”. In: Journal of High Energy Physics
      2010.6 (June 2010). doi: 10.1007/jhep06(2010)043. url: https://doi.org/10.1007%
      2Fjhep06%282010%29043 (cited on page 80).
[102] K. Hamilton, P. Nason, and G. Zanderighi. Finite quark-mass effects in the NNLOPS
      POWHEG+MiNLO Higgs generator. 2015. doi: 10.48550/ARXIV.1501.04637. url:
      https://arxiv.org/abs/1501.04637 (cited on page 80).
[103] P. Nason and C. Oleari. “NLO Higgs boson production via vector-boson fusion matched
      with shower in POWHEG”. In: Journal of High Energy Physics 2010.2 (Feb. 2010). doi:
      10.1007/jhep02(2010)037. url: https://doi.org/10.1007%2Fjhep02%282010%29037
      (cited on page 80).
[104] H. Hartanto et al. “Higgs boson production in association with top quarks in the
      POWHEG BOX”. In: Physical Review D 91.9 (May 2015). doi: 10.1103/physrevd.91.
      094003. url: https://doi.org/10.1103%2Fphysrevd.91.094003 (cited on page 80).
[105] L. Chen et al. “ZH production in gluon fusion at NLO in QCD”. In: Journal of High
      Energy Physics 2022.8 (Aug. 2022). doi: 10.1007/jhep08(2022)056. url: https://doi.
      org/10.1007%2Fjhep08%282022%29056 (cited on page 80).
[106] G. Luisoni et al. “HW±/HZ + 0 and 1 jet at NLO with the POWHEG BOX interfaced
      to GoSam and their merging within MiNLO”. In: Journal of High Energy Physics
      2013.10 (Oct. 2013). doi: 10.1007/jhep10(2013)083. url: https://doi.org/10.1007%
      2Fjhep10%282013%29083 (cited on page 80).
[107] G. Cullen et al. “Automated one-loop calculations with GoSam”. In: The European
      Physical Journal C 72.3 (Mar. 2012). doi: 10.1140/epjc/s10052-012-1889-1. url:
      https://doi.org/10.1140%2Fepjc%2Fs10052-012-1889-1 (cited on page 80).
[108] A. Denner et al. “HAWK 2.0: A Monte Carlo program for Higgs production in vector-
      boson fusion and Higgs strahlung at hadron colliders”. In: Computer Physics Com-
      munications 195 (2015), pages 161–171. issn: 0010-4655. doi: https://doi.org/10.
      1016/j.cpc.2015.04.021. url: https://www.sciencedirect.com/science/article/pii/
      S0010465515001630 (cited on page 80).
[109] A. Djouadi, J. Kalinowski, and M. Spira. “HDECAY: a program for Higgs boson de-
      cays in the Standard Model and its supersymmetric extension”. In: Computer Physics
      Communications 108.1 (Jan. 1998), pages 56–74. doi: 10.1016/s0010-4655(97)00123-9.
      url: https://doi.org/10.1016%2Fs0010-4655%2897%2900123-9 (cited on page 80).
[110] A. Bredenstein et al. “Precision calculations for the Higgs decays H → ZZ/WW → 4
      leptons”. In: Nuclear Physics B - Proceedings Supplements 160 (Oct. 2006), pages 131–
      135. doi: 10.1016/j.nuclphysbps.2006.09.104. url: https://doi.org/10.1016%2Fj.
      nuclphysbps.2006.09.104 (cited on page 80).
[111] E. Bothmann et al. “Event generation with Sherpa 2.2”. In: SciPost Physics 7.3 (Sept.
      2019). doi: 10.21468/scipostphys.7.3.034. url: https://doi.org/10.21468%
      2Fscipostphys.7.3.034 (cited on page 80).
[112] E. Re. “Single-top Wt-channel production matched with parton showers using the
      POWHEG method”. In: The European Physical Journal C 71.2 (Feb. 2011). doi: 10.
      1140/epjc/s10052-011-1547-z. url: https://doi.org/10.1140%2Fepjc%2Fs10052-011-
      1547-z (cited on page 80).
[113] R. Frederix, E. Re, and P. Torrielli. “Single-top t-channel hadroproduction in the four-
      flavour scheme with POWHEG and aMC@NLO”. In: Journal of High Energy Physics
      2012.9 (Sept. 2012). doi: 10.1007/jhep09(2012)130. url: https://doi.org/10.1007%
      2Fjhep09%282012%29130 (cited on page 80).
[114] T. Sjöstrand et al. “An introduction to PYTHIA 8.2”. In: Computer Physics Com-
      munications 191 (June 2015), pages 159–177. doi: 10.1016/j.cpc.2015.01.024. url:
      https://doi.org/10.1016%2Fj.cpc.2015.01.024 (cited on page 80).
[115] The Pythia 8 A3 tune description of ATLAS minimum bias and inelas-
      tic measurements incorporating the Donnachie-Landshoff diffractive model.
      Technical report. All figures including auxiliary figures are available at
      https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-PHYS-
      PUB-2016-017. Geneva: CERN, 2016. url: https://cds.cern.ch/record/2206965 (cited
      on page 80).
[116] M. Cacciari, G. P. Salam, and G. Soyez. “FastJet user manual”. In: The European
      Physical Journal C 72.3 (Mar. 2012). doi: 10.1140/epjc/s10052-012-1896-2. url:
      https://doi.org/10.1140%2Fepjc%2Fs10052-012-1896-2 (cited on page 81).
[117] Ahmed Tarek. EW Corrections using HAWK. Dec. 2019. url: https://indico.cern.
      ch/event/870440/contributions/3671095/subcontributions/295099/attachments/
      1961576/3260333/Hbb boosted EW 12dec19.pdf (cited on page 88).
[118] S. Frixione et al. “Single-top hadroproduction in association with a W boson”. In:
      Journal of High Energy Physics 2008.07 (July 2008), pages 029–029. doi: 10.1088/1126-
      6708/2008/07/029. url: https://doi.org/10.1088%2F1126-6708%2F2008%2F07%
      2F029 (cited on page 92).
[119] J. M. Lindert et al. “Precise predictions for V + jets dark matter backgrounds”. In: The
      European Physical Journal C 77.12 (Dec. 2017). doi: 10.1140/epjc/s10052-017-5389-1.
      url: https://doi.org/10.1140%2Fepjc%2Fs10052-017-5389-1 (cited on page 94).
[120] L. A. Harland-Lang et al. “Parton distributions in the LHC era: MMHT 2014 PDFs”.
      In: The European Physical Journal C 75.5 (May 2015). doi: 10.1140/epjc/s10052-
      015-3397-6. url: https://doi.org/10.1140%2Fepjc%2Fs10052-015-3397-6 (cited on
      page 95).
[121] T. Sjöstrand. “Jet fragmentation of multiparton configurations in a string framework”.
      In: Nuclear Physics B 248.2 (1984), pages 469–502. issn: 0550-3213. doi: https://doi.
      org/10.1016/0550-3213(84)90607-2. url: https://www.sciencedirect.com/science/
      article/pii/0550321384906072 (cited on page 95).
[122] M. Battaglia et al. Measurement of Inclusive Higgs Boson Production at High p_T^H
      in the H → bb̄ Decay Mode. Technical report. Geneva: CERN, 2019. url: https://
      cds.cern.ch/record/2703097 (cited on pages 99, 101, 102, 170).
[123] G. Cowan. Statistical data analysis. Oxford University Press, USA, 1998 (cited on
      page 101).
[124] L. Moneta et al. The RooStats Project. 2011. arXiv: 1009.1003 [physics.data-an]
      (cited on page 105).
[125] W. Verkerke and D. Kirkby. The RooFit toolkit for data modeling. 2003. arXiv: physics/
      0306116 [physics.data-an] (cited on page 105).
[126] V. Gligorov et al. “Avoiding biases in binned fits”. In: Journal of Instrumentation
      16.08 (Aug. 2021), T08004. doi: 10.1088/1748-0221/16/08/t08004. url: https://
      doi.org/10.1088%2F1748-0221%2F16%2F08%2Ft08004 (cited on page 105).
[127] R. Barlow and C. Beeston. “Fitting using finite Monte Carlo samples”. In: Computer
      Physics Communications 77.2 (1993), pages 219–228. issn: 0010-4655. doi: https://
      doi.org/10.1016/0010-4655(93)90005-W. url: https://www.sciencedirect.com/
      science/article/pii/001046559390005W (cited on page 105).
[128] M. Stankaityte. “Probing high energy Higgs bosons using H → bb̄ decays in √s = 13
      TeV proton-proton collisions at the ATLAS experiment”. Presented 03 Dec 2021. 2022.
      url: https://cds.cern.ch/record/2800522 (cited on page 106).
[129] CERN. CERN Yellow Reports: Monographs, Vol 2 (2017): Handbook of LHC Higgs
      cross sections: 4. Deciphering the nature of the Higgs sector. en. 2017. doi: 10.23731/
      CYRM-2017-002. url: https://e-publishing.cern.ch/index.php/CYRM/issue/view/32
      (cited on page 107).
[130] N. Berger et al. Simplified Template Cross Sections - Stage 1.1. 2019. arXiv: 1906.02754
      [hep-ph] (cited on page 107).
[131] ATLAS Collaboration. “Measurements of top-quark pair differential and double-
      differential cross-sections in the l+jets channel with pp collisions at √s = 13 TeV using
      the ATLAS detector”. In: The European Physical Journal C 79.12 (Dec. 2019). doi:
      10.1140/epjc/s10052-019-7525-6. url: https://doi.org/10.1140%2Fepjc%2Fs10052-
      019-7525-6 (cited on pages 108, 109).
[132] W, Z, and Higgs bosons as portals to exotic physics - CMS Experiment. url: https:
      //cms.cern/news/w-z-and-higgs-bosons-portals-exotic-physics.
[133] Hadronic Physics. url: https://www.anl.gov/phy/hadronic-physics.
[134] ATLAS Collaboration. “Jet reconstruction and performance using particle flow with
      the ATLAS Detector”. In: The European Physical Journal C 77.7 (July 2017). doi:
      10.1140/epjc/s10052-017-5031-2. url: https://doi.org/10.1140%2Fepjc%2Fs10052-
      017-5031-2 (cited on pages 121, 122).
[135] ATLAS Collaboration. Improving jet substructure performance in ATLAS using
      Track-CaloClusters. Technical report. All figures including auxiliary figures are
      available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-
      PHYS-PUB-2017-015. Geneva: CERN, 2017. url: https://cds.cern.ch/record/2275636
      (cited on pages 121, 123).
[136] P. Berta et al. “Particle-level pileup subtraction for jets and jet shapes”. In: Journal
      of High Energy Physics 2014.6 (June 2014). doi: 10.1007/jhep06(2014)092. url:
      https://doi.org/10.1007%2Fjhep06%282014%29092 (cited on pages 122, 124).
[137] M. Cacciari, G. P. Salam, and G. Soyez. “SoftKiller, a particle-level pileup removal
      method”. In: The European Physical Journal C 75.2 (Feb. 2015). doi: 10.1140/epjc/
      s10052-015-3267-2. url: https://doi.org/10.1140%2Fepjc%2Fs10052-015-3267-2
      (cited on pages 122, 125).
[138] A. J. Larkoski et al. “Soft drop”. In: Journal of High Energy Physics 2014.5 (May
      2014). doi: 10.1007/jhep05(2014)146. url: https://doi.org/10.1007%2Fjhep05%
      282014%29146 (cited on pages 122, 126).
[139] CMS Collaboration. “Pileup mitigation at CMS in 13 TeV data”. In: Journal of Instru-
      mentation 15.09 (Sept. 2020), P09018–P09018. doi: 10.1088/1748-0221/15/09/p09018.
      url: https://doi.org/10.1088%2F1748-0221%2F15%2F09%2Fp09018 (cited on
      page 123).
[140] S. Marzani, L. Schunk, and G. Soyez. “A study of jet mass distributions with grooming”.
      In: Journal of High Energy Physics 2017.7 (July 2017). doi: 10.1007/jhep07(2017)132.
      url: https://doi.org/10.1007%2Fjhep07%282017%29132 (cited on page 126).
[141] A. J. Larkoski, I. Moult, and B. Nachman. “Jet substructure at the Large Hadron
      Collider: A review of recent advances in theory and machine learning”. In: Physics
      Reports 841 (Jan. 2020), pages 1–63. doi: 10.1016/j.physrep.2019.11.001. url:
      https://doi.org/10.1016%2Fj.physrep.2019.11.001 (cited on page 128).
[142] R. K. Ellis, W. J. Stirling, and B. R. Webber. QCD and Collider Physics. Cambridge
      Monographs on Particle Physics, Nuclear Physics and Cosmology. Cambridge Univer-
      sity Press, 1996. doi: 10.1017/CBO9780511628788 (cited on page 130).
[143] S. Marzani, G. Soyez, and M. Spannowsky. Looking Inside Jets. Springer International
      Publishing, 2019. doi: 10.1007/978-3-030-15709-8. url: https://doi.org/10.1007%
      2F978-3-030-15709-8 (cited on page 130).
[144] C. F. Berger, T. Kucs, and G. Sterman. “Event shape–energy flow correlations”. In:
      Physical Review D 68.1 (July 2003). doi: 10.1103/physrevd.68.014012. url: https:
      //doi.org/10.1103%2Fphysrevd.68.014012 (cited on page 130).
[145] A. J. Larkoski, G. P. Salam, and J. Thaler. “Energy correlation functions for jet sub-
      structure”. In: Journal of High Energy Physics 2013.6 (June 2013). doi: 10.1007/
      jhep06(2013)108. url: https://doi.org/10.1007%2Fjhep06%282013%29108 (cited on
      page 131).
[146] I. Moult, L. Necib, and J. Thaler. “New angles on energy correlation functions”. In:
      Journal of High Energy Physics 2016.12 (Dec. 2016). doi: 10.1007/jhep12(2016)153.
      url: https://doi.org/10.1007%2Fjhep12%282016%29153 (cited on page 132).
[147] L. G. Almeida et al. “Substructure of high-pT jets at the LHC”. In: Physical Review
      D 79.7 (Apr. 2009). doi: 10.1103/physrevd.79.074017. url: https://doi.org/10.1103%
      2Fphysrevd.79.074017 (cited on page 133).
[148] T. M. Mitchell. Machine learning. Volume 1. 9. McGraw-hill New York, 1997 (cited on
      page 133).
[149] D. M. D’Addona. CIRP Encyclopedia of Production Engineering. Berlin, Heidelberg:
      Springer Berlin Heidelberg, 2014, pages 911–918. isbn: 978-3-642-20617-7. doi: 10.
      1007/978-3-642-20617-7_6563. url: https://doi.org/10.1007/978-3-642-20617-7_6563
      (cited on page 134).
[150] J. Schmidhuber. “Deep learning in neural networks: An overview”. In: Neural Networks
      61 (Jan. 2015), pages 85–117. doi: 10.1016/j.neunet.2014.09.003. url: https://doi.
      org/10.1016%2Fj.neunet.2014.09.003 (cited on pages 135, 136).
[151] C. Zatout. A brief introduction to neural networks: A classification problem. Feb. 2023.
      url: https://towardsdatascience.com/a-brief-introduction-to-neural-networks-a-
      classification-problem-43e68c770081 (cited on page 135).
[152] Identification of hadronically-decaying top quarks using UFO jets with AT-
      LAS in Run 2. Technical report. All figures including auxiliary figures are
      available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-
      PHYS-PUB-2021-028. Geneva: CERN, 2021. url: https://cds.cern.ch/record/2776782
      (cited on pages 136, 137).
[153] C. Chen. “New approach to identifying boosted hadronically decaying particles using jet
      substructure in its center-of-mass frame”. In: Physical Review D 85.3 (Feb. 2012). doi:
      10.1103/physrevd.85.034007. url: https://doi.org/10.1103%2Fphysrevd.85.034007
      (cited on page 136).
[154] ATLAS Collaboration. “Measurement of kT splitting scales in W → lν events at √s = 7
      TeV with the ATLAS detector”. In: The European Physical Journal C 73.5 (May 2013).
      doi: 10.1140/epjc/s10052-013-2432-8. url: https://doi.org/10.1140%2Fepjc%
      2Fs10052-013-2432-8 (cited on page 136).
[155] ATLAS Collaboration. Performance of W/Z taggers using UFO jets in AT-
      LAS. Technical report. All figures including auxiliary figures are available at
      https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-PHYS-
      PUB-2021-029. Geneva: CERN, 2021. url: https://cds.cern.ch/record/2777009 (cited
      on pages 137, 138).
[156] I. Goodfellow et al. “Generative Adversarial Nets”. In: Advances in Neural Informa-
      tion Processing Systems. Edited by Z. Ghahramani et al. Volume 27. Curran Asso-
      ciates, Inc., 2014. url: https://proceedings.neurips.cc/paper_files/paper/2014/file/
      5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf (cited on page 137).
[157] M. Das-Gupta and G. P. Salam. “Event shapes in e+e- annihilation and deep inelastic
      scattering”. In: J. Phys. G 30 (2004), page 143. url: https://cds.cern.ch/record/694105
      (cited on page 137).
[158] L. G. Almeida et al. “Top quark jets at the LHC”. In: Physical Review D 79.7 (Apr.
      2009). doi: 10.1103/physrevd.79.074012. url: https://doi.org/10.1103%2Fphysrevd.
      79.074012 (cited on page 137).
[159] ATLAS Collaboration. “ATLAS measurements of the properties of jets for boosted
      particle searches”. In: Physical Review D 86.7 (Oct. 2012). doi: 10.1103/physrevd.86.
      072006. url: https://doi.org/10.1103%2Fphysrevd.86.072006 (cited on page 137).
[160] C. Chen. “New approach to identifying boosted hadronically decaying particles using jet
      substructure in its center-of-mass frame”. In: Physical Review D 85.3 (Feb. 2012). doi:
      10.1103/physrevd.85.034007. url: https://doi.org/10.1103%2Fphysrevd.85.034007
      (cited on page 137).
[161] J. Thaler and L.-T. Wang. “Strategies to identify boosted tops”. In: Journal of High
      Energy Physics 2008.07 (July 2008), pages 092–092. doi: 10.1088/1126-6708/2008/
      07/092. url: https://doi.org/10.1088%2F1126-6708%2F2008%2F07%2F092 (cited on
      page 137).
[162] S. Catani et al. “Longitudinally-invariant kt -clustering algorithms for hadron-hadron
      collisions”. In: Nucl. Phys. B 406 (1993), pages 187–224. doi: 10.1016/0550-3213(93)
      90166-M. url: https://cds.cern.ch/record/246812 (cited on page 137).
[163] Simulation-based extrapolation of b-tagging calibrations towards high transverse
      momenta in the ATLAS experiment. Technical report. All figures including auxiliary
      figures are available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/
      ATL-PHYS-PUB-2021-003. Geneva: CERN, 2021. url: https://cds.cern.ch/record/2753444
      (cited on page 139).
[164] S. Mrenna and P. Skands. “Automated parton-shower variations in pythia 8”. In: Phys-
      ical Review D 94.7 (Oct. 2016). doi: 10.1103/physrevd.94.074005. url: https://doi.
      org/10.1103%2Fphysrevd.94.074005 (cited on page 143).
[165] url: https://geant4-userdoc.web.cern.ch/UsersGuides/PhysicsListGuide/html/index.
      html (cited on page 145).
[166] G. Folger and J. P. Wellisch. String Parton Models in Geant4. 2003. arXiv: nucl-
      th/0306007 [nucl-th] (cited on page 145).
[167] D. Wright and M. Kelsey. “The Geant4 Bertini Cascade”. In: Nuclear Instruments
      and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors
      and Associated Equipment 804 (2015), pages 175–188. issn: 0168-9002. doi: https://
      doi.org/10.1016/j.nima.2015.09.058. url: https://www.sciencedirect.com/science/
      article/pii/S0168900215011134 (cited on page 145).
[168] J. Yarba. “Recent Developments and Validation of Geant4 Hadronic Physics”. In: Jour-
      nal of Physics: Conference Series 396.2 (Dec. 2012), page 022060. doi: 10.1088/1742-
      6596/396/2/022060. url: https://dx.doi.org/10.1088/1742-6596/396/2/022060 (cited
      on page 145).
[169] M. V. Kossov. “Chiral Invariant Phase Space model”. In: Eur. Phys. J. A 36 (2008),
      pages 289–293. doi: 10.1140/epja/i2008-10597-2 (cited on page 145).
[170] CMS Collaboration. “Identification of heavy, energetic, hadronically decaying particles
      using machine-learning techniques”. In: Journal of Instrumentation 15.06 (June 2020),
      P06005–P06005. doi: 10.1088/1748-0221/15/06/p06005. url: https://doi.org/10.
      1088%2F1748-0221%2F15%2F06%2Fp06005 (cited on page 157).
[171] L. Randall and R. Sundrum. “Large Mass Hierarchy from a Small Extra Dimension”. In:
      Physical Review Letters 83.17 (Oct. 1999), pages 3370–3373. doi: 10.1103/physrevlett.
      83.3370. url: https://doi.org/10.1103%2Fphysrevlett.83.3370 (cited on page 157).
[172] L. Breiman. “Random Forests”. English. In: Machine Learning 45.1 (2001), pages 5–32.
      issn: 0885-6125. doi: 10.1023/A:1010933404324. url: http://dx.doi.org/10.1023/A%
      3A1010933404324 (cited on page 158).
[173] Ward’s Linkage. url: https://www.statistics.com/glossary/wards-linkage/ (cited on
      page 160).
[174] Martı́n Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Sys-
      tems. Software available from tensorflow.org. 2015. url: https://www.tensorflow.org/
      (cited on page 162).
[175] A. F. Agarap. Deep Learning using Rectified Linear Units (ReLU). 2018. url: http://
      arxiv.org/abs/1803.08375 (cited on page 162).
[176] A. E. Hoerl and R. W. Kennard. “Ridge Regression: Biased Estimation for Nonorthog-
      onal Problems”. In: Technometrics 42.1 (2000), pages 80–86. issn: 00401706. url:
      http://www.jstor.org/stable/1271436 (visited on 07/17/2023) (cited on page 162).
[177] D. P. Kingma and J. Ba. Adam: A Method for Stochastic Optimization. 2017. arXiv:
      1412.6980 [cs.LG] (cited on page 162).
[178] Garabed Halladjian. Estimation of top Scale Factors for Boosted Multi-Class Tagger
      using profile-likelihood fits. Apr. 2019. url: https://indico.cern.ch/event/910074/
      contributions/3828338/attachments/2021352/3379938/slides-2020-04-16.pdf (cited on
      page 164).
[179] Maria Mazza. Multi-class W/Z/h jet tagging. May 2023. url: https://indico.
      cern.ch/event/1267254/contributions/5363497/attachments/2656595/4600890/
      DBLWorkshop 230531 MariaMazza.pdf (cited on pages 164, 168).
[180] Identification of Boosted Higgs Bosons Decaying Into bb̄ With Neural Networks and
      Variable Radius Subjets in ATLAS. Technical report. All figures including auxiliary
      figures are available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/
      ATL-PHYS-PUB-2020-019. Geneva: CERN, 2020. url: https://cds.cern.ch/record/2724739
      (cited on page 166).
[181] M. Murin. Signal Scale Factors in semileptonic ttbar for the new UFO jets W/top
      taggers. Oct. 2021. url: https://indico.cern.ch/event/1089531/contributions/4580167/
      attachments/2332414/3975094/2021-10-21 JetEtmiss WTopTagSF presentation.pdf
      (cited on pages 244–250).

APPENDIX A
H → bb̄ Analysis
Large-R Jet Triggers
Figure A.1: Large-R jet trigger efficiency curves for the (a) 2015, (b) 2016 and (c,d) 2017
triggers as a function of the leading large-R jet pT.

Figure A.2: Large-R jet trigger efficiency curves for the 2018 triggers as a function of the
leading large-R jet pT.


Higgs Shower Systematics
Parton showers rely on the factorization of a process into a hard scattering, a perturbative
shower describing soft and collinear emissions, and a non-perturbative hadronization step
that produces the final-state hadrons. The two parton shower models compared here are the
transverse-momentum-ordered shower used by default in Pythia, and the angular-ordered
shower used by default in Herwig. The choice of parton shower model is considered a source
of systematic uncertainty for the signal.
    To estimate the uncertainty associated with the choice of parton shower model, MC
samples were produced with Powheg+Pythia8 (nominal) and Powheg+Herwig7 (alternative).
The samples include ggF, VBF, WH, qq → ZH and gg → ZH production with all possible
W and Z decays. The nominal samples at reconstruction level (full simulation) are those
described in Chapter 5. The alternative Herwig7 samples come from a private production
at truth level.
    After the selection cuts and the truth-matching of a Higgs boson to the large-R jets, the
samples are compared at truth level. The assumption is that differences between the samples
at truth level translate directly into differences at reconstruction level. A reweighting map
wr is defined as the bin-by-bin ratio of the Herwig7 and Pythia8 2D histograms of mJ and pT
at truth level:
$$ w_r = \frac{\mathrm{H7}_{\mathrm{truth}}}{\mathrm{Py8}_{\mathrm{truth}}} \tag{A.1} $$
After applying the map to the nominal sample, the reweighted Pythia mass distributions
should model a Herwig-generated sample at reconstruction level. These reweighted samples
are then compared with the unweighted versions.
$$ \mathrm{Py8}_{\mathrm{rw}} = w_r \, \mathrm{Py8}_{\mathrm{reco}} \approx \mathrm{H7}_{\mathrm{reco}} \tag{A.2} $$
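
The reweighting in Eqs. (A.1) and (A.2) can be sketched in Python with NumPy as follows.
This is a minimal illustration and not the analysis code: the function names, the toy bin
edges and bin contents, the choice of weight 1 for bins with no Pythia entries, and the
clipping of out-of-range jets to the edge bins are all assumptions made for the example.
Both truth-level histograms are assumed to share the same (mJ, pT) binning and the same
normalization convention.

```python
import numpy as np


def reweighting_map(h7_truth, py8_truth):
    """Bin-by-bin ratio H7_truth / Py8_truth of two 2D (m_J, p_T) histograms.

    Assumption: bins with no Pythia entries keep weight 1 (no reweighting).
    """
    h7 = np.asarray(h7_truth, dtype=float)
    py8 = np.asarray(py8_truth, dtype=float)
    w_r = np.ones_like(py8)
    np.divide(h7, py8, out=w_r, where=py8 > 0)
    return w_r


def reweight_events(w_r, mass_edges, pt_edges, mass, pt, weights=None):
    """Per-jet weights: look up w_r in the (m_J, p_T) bin of each reco-level jet.

    Assumption: jets outside the histogram range use the nearest edge bin.
    """
    mass = np.asarray(mass, dtype=float)
    pt = np.asarray(pt, dtype=float)
    if weights is None:
        weights = np.ones_like(mass)
    i = np.clip(np.digitize(mass, mass_edges) - 1, 0, len(mass_edges) - 2)
    j = np.clip(np.digitize(pt, pt_edges) - 1, 0, len(pt_edges) - 2)
    return np.asarray(weights, dtype=float) * w_r[i, j]


# Toy truth-level histograms (rows: m_J bins, columns: p_T bins); values are made up.
mass_edges = np.array([50.0, 100.0, 150.0, 200.0])   # GeV
pt_edges = np.array([450.0, 650.0, 1000.0])          # GeV
py8_truth = np.array([[10.0, 4.0], [40.0, 20.0], [8.0, 0.0]])
h7_truth = np.array([[11.0, 4.0], [38.0, 21.0], [8.0, 1.0]])

w_r = reweighting_map(h7_truth, py8_truth)

# Two toy reco-level Pythia jets with unit weight.
event_weights = reweight_events(w_r, mass_edges, pt_edges,
                                mass=[120.0, 60.0], pt=[500.0, 800.0])
```

Filling a mass histogram of the reco-level Pythia jets with `event_weights` gives the
reweighted distribution of Eq. (A.2), which can then be compared with the unweighted one.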

The reweighting maps and the comparisons of the samples at reconstruction level are pre-
sented below. A constant fit to the ratio Py8rw/Py8reco is shown in the legend of the
mass distributions. The average difference is around 5% when considering the entire mass
spectrum. Most distributions differ by approximately 1%, but the average is raised by the
samples with low statistics. After inspecting the differences close to the mass peak, where
statistics are not a limiting factor, it was decided that the parton shower model differences
are negligible, and they are therefore not included in the likelihood fit as a nuisance
parameter.
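
A constant fit to a binned ratio can be sketched as an inverse-variance weighted mean,
which is the least-squares solution for a single constant. The sketch below is
illustrative only: the function name, the toy ratio values and their uncertainties are
assumptions, and the fit used for the figure legends may have been configured differently.

```python
import numpy as np


def constant_fit(ratio, sigma):
    """Least-squares fit of a constant to binned ratio values with uncertainties.

    Bins with non-finite values or non-positive uncertainties are skipped.
    Returns the fitted constant and its uncertainty.
    """
    ratio = np.asarray(ratio, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    ok = np.isfinite(ratio) & np.isfinite(sigma) & (sigma > 0)
    w = 1.0 / sigma[ok] ** 2
    c = np.sum(w * ratio[ok]) / np.sum(w)
    c_err = 1.0 / np.sqrt(np.sum(w))
    return c, c_err


# Toy ratio Py8rw/Py8reco: two well-populated bins and one low-statistics bin.
ratio = np.array([1.01, 0.99, 1.20])
sigma = np.array([0.01, 0.01, 0.20])

c, c_err = constant_fit(ratio, sigma)
plain_mean = ratio.mean()
```

With these toy numbers the fitted constant stays close to 1.00, while the unweighted mean
is about 1.07, which illustrates how a poorly populated bin can raise a simple average.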
Figure A.3: Reweighting maps (Herwig/Pythia) used to estimate the Higgs jet shower
systematic for leading (left) and subleading (right) large-R jets from (a,b) ggF and (c,d) VBF.

Figure A.4: Reweighting maps (Herwig/Pythia) used to estimate the Higgs jet shower
systematic for leading (left) and subleading (right) large-R jets from gg → ZH production with
subchannels (a,b) Z(l+l−)H, (c,d) Z(qq)H and (e,f) Z(νν)H.

Figure A.5: Reweighting maps (Herwig/Pythia) used to estimate the Higgs jet shower
systematic for leading (left) and subleading (right) large-R jets from pp → ZH production with
subchannels (a,b) Z(l+l−)H, (c,d) Z(qq)H and (e,f) Z(νν)H.


Figure A.6: Reweighting maps (Herwig/Pythia) used to estimate the Higgs jet shower
systematic for leading (left) and subleading (right) large-R jets from W−H production
with subchannels (a,b) W−(l−νl)H and (c,d) W−(qq′)H.

Figure A.7: Reweighting maps (Herwig/Pythia) used to estimate the Higgs jet shower
systematic for leading (left) and subleading (right) large-R jets from W+H production
with subchannels (a,b) W+(l+νl)H and (c,d) W+(qq′)H.


Figure A.8: Comparison of the Higgs jet mass distribution after applying the reweighting
maps for leading (left) and subleading (right) large-R jets from (a,b) ggF and (c,d) VBF.

Figure A.9: Comparison of the Higgs jet mass distribution after applying the reweighting
maps for leading (left) and subleading (right) large-R jets from gg → ZH production with
subchannels (a,b) Z(l+l−)H, (c,d) Z(qq)H and (e,f) Z(νν)H.

Figure A.10: Comparison of the Higgs jet mass distribution after applying the reweighting
maps for leading (left) and subleading (right) large-R jets from pp → ZH production with
subchannels (a,b) Z(l+l−)H, (c,d) Z(qq)H and (e,f) Z(νν)H.


Figure A.11: Comparison of the Higgs jet mass distribution after applying the reweighting
maps for leading (left) and subleading (right) large-R jets from W−H production with
subchannels (a,b) W−(l−νl)H and (c,d) W−(qq′)H.

Figure A.12: Comparison of the Higgs jet mass distribution after applying the reweighting
maps for leading (left) and subleading (right) large-R jets from W+H production with
subchannels (a,b) W+(l+νl)H and (c,d) W+(qq′)H.


Higgs Electroweak Corrections
Figure A.13: Systematics of the NLO EW corrections calculated using HAWK as a function
of the Higgs pT for (a) VBF and (b) VH production, and the subchannels (c) Z(l+l−)H,
(d) Z(νν)H and (e) WH. The y-axis is the value of
$\delta_{\mathrm{EW}} = \sigma^{\mathrm{NLO\,EW}}/\sigma^{\mathrm{LO}} - 1$.


APPENDIX B
Unified Flow Objects
High pT Extrapolation Efficiencies
Shower Variations
Figure B.1: DNN W tagger 7(+2) point scale variation efficiencies as a function of pT for
W-jets at the (a) 50% and (b) 80% working points.

Figure B.2: DNN W tagger 7(+2) point scale variation efficiency envelope as a function of
pT for W-jets at the (a) 50% and (b) 80% working points.

Figure B.3: DNN top taggers 7(+2) point scale variation efficiencies as a function of pT
for contained tops (top) at the (a) 50% and (b) 80% working points and for inclusive tops
(bottom) at the (c) 50% and (d) 80% working points.

Figure B.4: DNN top taggers 7(+2) point scale variation efficiency envelope as a function
of pT for contained tops (top) at the (a) 50% and (b) 80% working points and for inclusive
tops (bottom) at the (c) 50% and (d) 80% working points.


Figure B.5: ANN W/Z tagger 7(+2) point scale variation efficiencies as a function of pT for
W-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.6: ANN W/Z tagger 7(+2) point scale variation efficiency envelope as a function
of pT for W-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.7: ANN W/Z tagger 7(+2) point scale variation efficiencies as a function of pT for
Z-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.8: ANN W/Z tagger 7(+2) point scale variation efficiency envelope as a function
of pT for Z-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.9: Three variable W/Z taggers 7(+2) point scale variation efficiencies as a function
of pT for W-jets (top) at the (a) 50% and (b) 80% working points and for Z-jets (bottom)
at the (c) 50% and (d) 80% working points.


Figure B.10: Three variable W/Z taggers 7(+2) point scale variation efficiency envelope as
a function of pT for W-jets (top) at the (a) 50% and (b) 80% working points and for Z-jets
(bottom) at the (c) 50% and (d) 80% working points.


PDF Variations
Figure B.11: DNN W tagger PDF variation efficiencies as a function of pT for W-jets at the
(a) 50% and (b) 80% working points.

Figure B.12: DNN W tagger PDF variation efficiency envelope as a function of pT for W-jets
at the (a) 50% and (b) 80% working points.


Figure B.13: DNN top taggers PDF variation efficiencies as a function of pT for contained
tops (top) at the (a) 50% and (b) 80% working points and for inclusive tops (bottom) at the
(c) 50% and (d) 80% working points.


Figure B.14: DNN top taggers PDF variation efficiency envelope as a function of pT for
contained tops (top) at the (a) 50% and (b) 80% working points and for inclusive tops
(bottom) at the (c) 50% and (d) 80% working points.


Figure B.15: ANN W/Z tagger PDF variation efficiencies as a function of pT for W-jets at
the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.16: ANN W/Z tagger PDF variation efficiency envelope as a function of pT for
W-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.17: ANN W/Z tagger PDF variation efficiencies as a function of pT for Z-jets at
the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.18: ANN W/Z tagger PDF variation efficiency envelope as a function of pT for
Z-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.19: Three variable W/Z taggers PDF variation efficiencies as a function of pT for
W-jets (top) at the (a) 50% and (b) 80% working points and for Z-jets (bottom) at the (c)
50% and (d) 80% working points.


Figure B.20: Three variable W/Z taggers PDF variation efficiency envelope as a function of
pT for W-jets (top) at the (a) 50% and (b) 80% working points and for Z-jets (bottom) at
the (c) 50% and (d) 80% working points.


Physics List Variations
Figure B.21: DNN W tagger Physics List nuclear model variation efficiencies as a function
of pT for W-jets at the (a) 50% and (b) 80% working points.

Figure B.22: DNN W tagger Physics List nuclear model variation efficiency envelope as a
function of pT for W-jets at the (a) 50% and (b) 80% working points.


Figure B.23: DNN W tagger Physics List detector geometry variation efficiencies as a
function of pT for W-jets at the (a) 50% and (b) 80% working points.

Figure B.24: DNN W tagger Physics List detector geometry variation efficiency envelope as
a function of pT for W-jets at the (a) 50% and (b) 80% working points.


Figure B.25: DNN top taggers Physics List nuclear model variation efficiencies as a function
of pT for contained tops (top) at the (a) 50% and (b) 80% working points and for inclusive
tops (bottom) at the (c) 50% and (d) 80% working points.


Figure B.26: DNN top taggers Physics List nuclear model variation efficiency envelope as a
function of pT for contained tops (top) at the (a) 50% and (b) 80% working points and for
inclusive tops (bottom) at the (c) 50% and (d) 80% working points.


Figure B.27: DNN top taggers Physics List detector geometry variation efficiencies as a
function of pT for contained tops (top) at the (a) 50% and (b) 80% working points and for
inclusive tops (bottom) at the (c) 50% and (d) 80% working points.


Figure B.28: DNN top taggers Physics List detector geometry variation efficiency envelope
as a function of pT for contained tops (top) at the (a) 50% and (b) 80% working points and
for inclusive tops (bottom) at the (c) 50% and (d) 80% working points.


Figure B.29: ANN W/Z tagger Physics List nuclear model variation efficiencies as a function
of pT for W-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.30: ANN W/Z tagger Physics List nuclear model variation efficiency envelope as
a function of pT for W-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working
points.


Figure B.31: ANN W/Z tagger Physics List detector geometry variation efficiencies as a
function of pT for W-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working
points.


Figure B.32: ANN W/Z tagger Physics List detector geometry variation efficiency envelope
as a function of pT for W-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working
points.


Figure B.33: ANN W/Z tagger Physics List nuclear model variation efficiencies as a function
of pT for Z-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.34: ANN W/Z tagger Physics List nuclear model variation efficiency envelope as a
function of pT for Z-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.35: ANN W/Z tagger Physics List detector geometry variation efficiencies as a
function of pT for Z-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working points.


Figure B.36: ANN W/Z tagger Physics List detector geometry variation efficiency envelope
as a function of pT for Z-jets at the (a) 50%, (b) 60%, (c) 70%, (d) 80% and (e) 90% working
points.


Figure B.37: Three variable W/Z taggers Physics List nuclear model variation efficiencies as
a function of pT for W-jets (top) at the (a) 50% and (b) 80% working points and for Z-jets
(bottom) at the (c) 50% and (d) 80% working points.


Figure B.38: Three variable W/Z taggers Physics List nuclear model variation efficiency
envelope as a function of pT for W-jets (top) at the (a) 50% and (b) 80% working points
and for Z-jets (bottom) at the (c) 50% and (d) 80% working points.


Figure B.39: Three variable W/Z taggers Physics List detector geometry variation efficiencies
as a function of pT for W-jets (top) at the (a) 50% and (b) 80% working points and for Z-jets
(bottom) at the (c) 50% and (d) 80% working points.


Figure B.40: Three variable W/Z taggers Physics List detector geometry variation efficiency
envelope as a function of pT for W-jets (top) at the (a) 50% and (b) 80% working points
and for Z-jets (bottom) at the (c) 50% and (d) 80% working points.


Jet Substructure Modeling
Figure B.41: Distributions for UFO SD CS-SK Large-R jets passing the W selection criteria
of (a) mass, (b) D2 and (c) ntrk [181].


Figure B.42: Distributions for UFO SD CS-SK Large-R jets passing the top quark selection
criteria of (a) mass, (b) D2, (c) ntrk and (d) C2 [181].


Figure B.43: Distributions for UFO SD CS-SK Large-R jets passing the top quark selection
criteria of (a) L1, (b) L2, (c) L3 and (d) L4 [181].


Figure B.44: Distributions for UFO SD CS-SK Large-R jets passing the top quark selection
criteria of (a) sphericity, (b) planar flow, (c) aplanarity and (d) angularity [181].


Figure B.45: Distributions for UFO SD CS-SK Large-R jets passing the top quark selection
criteria of (a) Fox-Wolfram moment, (b) jet total charge and (c) Qw [181].


Figure B.46: Distributions for UFO SD CS-SK Large-R jets passing the top quark selection
criteria of (a) $\sqrt{d_{12}}$, (b) $\sqrt{d_{23}}$, (c) thrust minor and (d) thrust major [181].


Figure B.47: Distributions for UFO SD CS-SK Large-R jets passing the top quark selection
criteria of (a) $\tau_{21}^{\mathrm{wta}}$, (b) $\tau_{32}$ and (c) $\tau_{42}^{\mathrm{wta}}$ [181].