SEARCH FOR VECTOR-LIKE QUARKS AT √s = 13 TeV USING THE ATLAS DETECTOR

By

Carlos Josué Buxó Vázquez

A DISSERTATION

Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of

Physics — Doctor of Philosophy

2023

ABSTRACT

This dissertation presents two research topics. The ﬁrst topic focuses on the tagging of jets

to hadronically decaying top quarks and W bosons in the ATLAS detector. Two jet tagging

algorithm optimization studies are described. The second topic focuses on the search for

vector-like top quarks (T ), which are predicted by beyond the Standard Model theories that

aim to solve the Hierarchy Problem, using a dataset of proton-proton collisions with a center

of mass energy of √s = 13 TeV collected with the ATLAS detector. Two search analyses

that probe diﬀerent production mechanisms of the T are performed towards this goal.

The ﬁrst jet tagging algorithm study focuses on the optimization of two deep neural

network (DNN) top taggers and a three-variable W tagger, all of which use information

from the substructure of jets. The tagging signal eﬃciency is extracted for each tagger both

in Standard Model (SM) Monte Carlo (MC) simulations and in the data that was collected

by the ATLAS detector in 2015-2017. The performance of the taggers in MC is calibrated to

that of the data with the derivation of tagging signal eﬃciency scale factors. Additionally,

uncertainties are derived for this measurement, which take into account eﬀects from the

MC modeling of the SM processes considered and the reconstruction and calibration of the

diﬀerent physics objects used in this measurement.

The second jet tagging algorithm study consists of a topological data analysis (TDA) of

jets that analyzes their simplicial homology. A framework that applies a persistent homology

analysis and the Mapper algorithm to jets is devised. The information obtained from this

framework is applied in the design of a DNN and a convolutional graph neural network (GNN)

top tagging algorithm. Optimization studies were performed in which these two taggers

achieved a comparable performance to the substructure-based DNN top taggers.

Two search analyses for a T are performed, one targeting the single production mechanism

of a T and the other targeting the pair production mechanism of T T̄. Both analyses focus

on the decay topologies T → Ht and T → Zt in ﬁnal states that include a single electron

or muon. A search strategy is devised for each analysis that takes advantage of several

experimental features that are unique to each production mechanism. The tagging of jets to

hadronically decaying top quarks and W , Z, and Higgs bosons is a cornerstone of the search

strategies. A statistical analysis is performed for both searches to test for the presence of

potential T production events in the data. No significant excess over the SM prediction

is observed in either search, and 95% CL upper limits are set on the T and T T̄ production

cross sections. These limits are interpreted as exclusion limits on the T mass and other

theory parameters that vary depending on the signal benchmark considered.

Dedicado a mi mamá y la cariñosa memoria de mi papá.
(Dedicated to my mom and the loving memory of my dad.)


ACKNOWLEDGMENTS

The journey to my Ph.D. has been a long and arduous one that wouldn’t have been possible

without the inﬂuence of many people. Thank you all for making this odyssey possible.

I would like to give a huge thank you to my advisor, Wade Fisher. I’m especially grateful to him for accepting me as his graduate student and for the wonderful experiences that I have had during my graduate studies under his guidance. I feel lucky to have an advisor who is a walking encyclopedia of particle physics and data science knowledge, who is passionate about teaching, and who has given me the freedom and encouragement to explore different ideas in my research. Thank you for the support, guidance, instilled knowledge, and friendship you have provided me during this journey.

I’m also thankful to the people I met and collaborated with while working on my jet

tagging research and VLQ search analyses for the guidance and imparted knowledge they

have provided me. A special thank you goes to Trisha Farooque, whom I’ve worked with

throughout most of my graduate career. Your mentoring, knowledge, and friendship have

become invaluable during my growth over these years. Thank you. I would also like to thank

Casey Bellgraph, who worked under my guidance. Thank you for teaching me how to be a

mentor and allowing me the opportunity to impart my knowledge to you.

I would like to give thanks to the group of ATLAS post-docs here at MSU, who have

also provided me with their guidance and knowledge. I appreciate how they have made our

research group feel more like a group of friends, especially during my visit to CERN.

To all the friends I have made during these years, thank you for your emotional support

and all the good moments we have shared that have made this journey an enjoyable one.

Finally, I am eternally grateful to my parents for being the pillars that lift me up and for raising me to be the person I am today. Dad, although our time together was short, I am grateful for the love and the lessons you gave me, which are more than enough for a lifetime. Mom, thank you for all the emotional support and unconditional love you have always given me, and for instilling in me the curiosity to know how the world works. Your wise words have helped me keep in mind that I am human and that I should be proud of my achievements despite my shortcomings. Thank you for all your sacrifices, which have forged me into who I am today. I hope you are both proud of what I have achieved.


TABLE OF CONTENTS

Chapter 1. Introduction

Chapter 2. Theory

Chapter 3. The LHC and the ATLAS Detector

Chapter 4. Processes of Interest and Data Selection

Chapter 5. Tagging Top Quarks

Chapter 6. Searches for Vector-Like Quarks

Chapter 7. Conclusion

BIBLIOGRAPHY

APPENDIX A. Monte Carlo Simulations

APPENDIX B. Mapper Algorithm Optimization

APPENDIX C. Mapper Algorithm Comparison Plots

APPENDIX D. Single VLQ Background Reweighting

Chapter 1

Introduction

Over the past few decades the field of elementary particle physics has been the stage of numerous

advances both in theoretical and experimental physics. Being the fundamental building

blocks at the smallest distance scales that form our perception of the universe through their

energetic interactions, elementary particles are inherently both quantum and relativistic objects.

In order to give a mathematical description of the nature of elementary particles,

one would need to reconcile the disparate theories of quantum physics and relativity. This

resulted in one of the most elegant theoretical frameworks to date, known as Quantum Field

Theory (QFT), which serves as the rigorous foundation of the Standard Model (SM) of

particle physics. The SM describes the diﬀerent symmetries and interactions that particles

obey in nature, which form the basis of our understanding of modern particle physics. Like

all physical theories that we use to describe diﬀerent aspects of the universe we live in, the

SM has provided us with several predictions that have been experimentally veriﬁed. These

predictions range from simple facts such as the existence of particle-antiparticle pairs, to

more insightful predictions such as the existence of gauge bosons, a special family of particles

that are responsible for mediating the interactions between particles that form the ordinary

matter we observe in nature.

Experimental physics saw rapid advancements to provide the empirical evidence that

bridges QFT and the SM. Many engineering feats were made in the creation of the machinery

necessary to recreate the energetic conditions needed to study the simplest of particle

interactions. At present, the machinery required for particle physics experiments has become

sophisticated. Circular colliders, such as the Large Hadron Collider (LHC), employ a wide

array of technologies that allow for the conﬁnement and acceleration to relativistic speeds of

particle beams in order to study the outcome of their energetic collisions. However, recreating

the conditions to study particle interactions and colliding the particles is only part of the

job, as one needs to be able to detect and identify what comes out from these interactions.

Modern-day particle detection and identiﬁcation has evolved from painstakingly analyzing

individual photographs of tracks traced by particles in cloud chambers to multi-tiered detector

systems, which are designed to detect large numbers of particles simultaneously. The

technology behind modern-day detectors is designed to elicit tailored interactions between

diﬀerent types of particles and detector components in order to detect the particles. This

makes modern-day detectors analogous to high-resolution cameras that allow us to capture

the ﬁne details of nature at the sub-atomic scale.

All this amalgamation of knowledge reached its current pinnacle with the discovery of

the Higgs boson in 2012, ﬁnalizing the current formulation of the SM. However, all is not

well in the SM because there are several open questions remaining, and the SM falls short

in providing explanations. Some examples of these open questions are the existence of dark

matter (DM), the abundance asymmetry between matter and antimatter in the universe,

and the Hierarchy Problem. The Hierarchy Problem, which is related to the research topics

presented in this thesis, can be stated as the relative lightness of the Higgs boson mass in

comparison to the energy scales, such as the Planck mass scale, at which new physics is

expected to emerge. Currently, the experiments at the LHC are providing precision

measurements to further test the validity of the SM and to search for new physics beyond the

Standard Model (BSM) that could help explain some of these unanswered questions.

This thesis describes two research topics. The first topic aims to improve particle identification

in the ATLAS detector using collimated sprays of particles produced in the detector,

known as jets; this identification is an important ingredient in nearly all precision measurements and BSM

searches. The second topic is a search for Vector-Like Quarks (VLQs), which are particles

predicted by some BSM theories seeking to resolve the Hierarchy Problem. The theoretical

background of particle physics needed to understand and motivate these analyses is described

in Chapter 2 by providing an introduction to the SM, its shortcomings, and a brief overview

of the BSM VLQ theory. The LHC and the ATLAS detector, which are the experimental

devices that provided the data utilized in the analyses described in the latter chapters, are

presented in Chapter 3. An overview of the diﬀerent detector components, the interactions

between particles and the detector, and the process of reconstructing physics objects from

detector data are discussed in this chapter. Chapter 4 describes the particle physics processes

of interest in the studies presented in this thesis, as well as the requirements made to select

events from the detector data and Monte Carlo (MC) simulations that provide a relatively

pure selection of these processes. Chapter 5 presents the ﬁrst research topic of this thesis

through the concept of tagging a jet to a particle. Some particles, such as the top quark

and the Higgs boson, decay into other particles before interacting with the detector due

to their short lifetime. If these decays are fully hadronic, then the source particle can be

reconstructed as a jet. Several particles that are predicted by BSM theories, such as VLQs,

can decay into these particles with short lifetimes. Jet tagging can aid in the reconstruction

of events where these yet undiscovered particles are produced by identifying jets with their

decays. Two jet tagging studies are presented in this chapter. The ﬁrst study consists of

utilizing information from the substructure of jets to improve top quark and W boson jet


tagging. The second study consists of utilizing topological data analysis techniques as an

alternative for top quark jet tagging. These techniques have not been used in the context of

jet tagging previously. The potential improvement that these techniques bring over the traditional

taggers utilized in ATLAS is assessed in this chapter. Chapter 6 presents the second

research topic of this thesis through two search analyses of a vector-like top quark (T ). The

ﬁrst search analysis focuses on the single production of a T , while the second analysis focuses

on the pair production T T̄. Both analyses target the decay channels T → Ht and T → Zt

in ﬁnal states associated with exactly one electron or muon. The search strategy, statistical

analysis, and results of both searches are discussed in this chapter. Finally, Chapter 7 gives

an overall summary and concluding remarks, as well as potential research outlooks, for both

research topics presented in this thesis.


Chapter 2

Theory

In this chapter, the theoretical framework that is necessary to understand and motivate

the subsequent chapters of this thesis, the Standard Model (SM) of particle physics, is

presented. The chapter starts with the introduction of the diﬀerent particles that form the

SM and a brief overview of their properties. This is followed by an introduction to the group

theory framework, which is used as the mathematical foundation of the SM, emphasizing the

importance of Lie groups and their role in describing the symmetries of the SM. The following

sections are dedicated to the construction of the SM Lagrangian using symmetry arguments

as the initial motivation. This will be presented in parts, starting with the electroweak

interactions of the SM, followed by the spontaneous symmetry breaking mechanism from

which the Higgs boson originates, and culminating with the strong force interactions.

The next section of this chapter focuses on the BSM aspect of the theoretical framework.

First, motivation for the necessity of extending the current SM is given through phenomenological

examples that the SM is unable to explain. More emphasis will be placed on the

Hierarchy Problem, which serves as the theoretical motivation for many BSM searches performed

at ATLAS, such as the vector-like quark (VLQ) analyses presented in Chapter 6.

Finally, a brief overview of the Composite Higgs model, which aims to solve the Hierarchy

Problem, is presented. Some realizations of the Composite Higgs model predict the existence

of VLQs; thus, a discovery of VLQs could serve as validation of this extension to the SM.


The description provided for the theoretical background of the SM of elementary particle

physics is based on [1] with some elements of group theory from [2]. The discussion of the

BSM theory of Composite Higgs models and VLQs is based on [3]. In the following discussion,

all mathematical expressions that have repeating indices imply a sum of the indexed terms

following Einstein’s summation convention. Furthermore, the system of natural units will

be used throughout the discussion and the remainder of this thesis, as it is the convention

used in particle physics. The natural units are defined by setting Planck’s reduced constant and the value of the speed of light in vacuum to:

$$ \hbar = \frac{h}{2\pi} = 1.055 \times 10^{-34}\ \mathrm{J \cdot s} \;\to\; \hbar = 1, \qquad c = 2.998 \times 10^{8}\ \mathrm{m/s} \;\to\; c = 1 \tag{2.1} $$

Under this convention, quantities such as energy, momentum, and mass are measured in electronvolts (eV), where 1 eV is the kinetic energy gained by a single electron accelerated through a potential difference of 1 volt, and quantities such as distance and time are measured in eV$^{-1}$. Since the electronvolt is small compared to the energies of interest in particle physics, it is common practice to use gigaelectronvolts (1 GeV = $10^{9}$ eV = $1.6 \times 10^{-10}$ J) instead. Additionally, all electric charges are expressed in terms of the fundamental charge e, which is the charge of the proton. As an example, the electron has a charge of −1, while the up quark has a charge of 2/3. Finally, the spin angular momentum of all particles is measured in units of ℏ.
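To make the unit convention concrete, the following minimal sketch converts a few natural-unit quantities back to SI by restoring the appropriate powers of ℏ and c. The constant values are standard reference values, and the function names are chosen for this illustration only; they are not part of any analysis software used in this thesis.

```python
# Conversions between natural units (hbar = c = 1) and SI units.
# Constants are standard reference values; all names are illustrative.
E_CHARGE = 1.602176634e-19   # joules per electronvolt
HBAR_SI = 1.054571817e-34    # reduced Planck constant in J*s
C_SI = 2.99792458e8          # speed of light in vacuum in m/s

GEV_IN_JOULES = 1e9 * E_CHARGE


def inverse_gev_to_metres(length_in_inverse_gev: float) -> float:
    """Convert a length given in GeV^-1 to metres by restoring hbar*c."""
    return length_in_inverse_gev * HBAR_SI * C_SI / GEV_IN_JOULES


def inverse_gev_to_seconds(time_in_inverse_gev: float) -> float:
    """Convert a time given in GeV^-1 to seconds by restoring hbar."""
    return time_in_inverse_gev * HBAR_SI / GEV_IN_JOULES


def gev_to_kilograms(mass_in_gev: float) -> float:
    """Convert a mass given in GeV to kilograms by restoring c^2."""
    return mass_in_gev * GEV_IN_JOULES / C_SI**2


if __name__ == "__main__":
    print(f"1 GeV    = {GEV_IN_JOULES:.3e} J")
    print(f"1 GeV^-1 = {inverse_gev_to_metres(1.0):.3e} m")
    print(f"1 GeV^-1 = {inverse_gev_to_seconds(1.0):.3e} s")
    print(f"1 GeV    = {gev_to_kilograms(1.0):.3e} kg")
```

The length conversion reproduces the familiar rule of thumb that 1 GeV$^{-1}$ corresponds to roughly 0.2 fm.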


2.1 The Standard Model of Particle Physics

2.1.1 Particles of the Standard Model

All elementary particles that have been predicted by the SM of particle physics and experimentally observed are summarized in Figure 2.1. These particles can be classified into two groups based on their intrinsic spin quantum number: fermions, defined by having half-integer spin, and bosons, defined by having integer spin. The atoms that form the ordinary matter in the universe are composed of fermions that are bound together through the fundamental forces that are mediated by the gauge bosons. The gravitational force is the only exception that has evaded prediction from the SM, thus lacking a mediator particle. All fermions have an associated antiparticle that shares the same mass value but differs in the quantum numbers that dictate how these particles interact in the SM. Fermions are arranged into three generations, which are characterized by their mass value and flavor quantum number. Each subsequent generation contains a heavier particle of a given flavor. The ordinary matter we observe in the universe is only composed of first generation fermions. Particles in the other generations are not stable enough to compose matter due to their large masses and consequently short lifetimes. Fermions are further subdivided into quarks and leptons based on the types of interactions in which these particles can participate.

Figure 2.1: Summary of all elementary particles in the SM and their properties. This figure is taken from [4].

The leptons consist of the electron (e), muon (µ), and tau (τ ), which carry electric charge

Q = −1 and can interact through the electromagnetic force, and their associated neutrinos,

νe, νµ, and ντ , which are electrically neutral and thus cannot interact electromagnetically.

Both charged and neutral leptons contain an additional intrinsic quantum number known as

the weak isospin, which allows them to interact through the weak force. It should be noted

that neutrinos are predicted to be massless in the minimal formulation of the SM, which

disagrees with the experimental observation of neutrino oscillations. The SM can be extended without significant

eﬀort to incorporate neutrino masses; however, the details of the mechanism required to do

so cannot be explained solely by the SM.

The quarks consist of the up (u), down (d), charm (c), strange (s), top (t), and bottom

(b). All quarks carry electric charge, Q = 2/3 in the case of u, c, and t, and Q = −1/3

in the case of d, s, and b, which allows them to interact electromagnetically. Like leptons,

quarks also carry weak isospin allowing them to interact through the weak force. What sets

quarks apart from leptons is that quarks carry an additional quantum number known as

color charge, which allows them to participate in strong force interactions. Unlike electric

charge, which is characterized by a single value that can be either positive or negative, color

is characterized by three values that are labeled as red, green, and blue, along with their

corresponding anti-values: anti-red, anti-green, and anti-blue. All quarks carry a single unit


of color. Due to a phenomenon known as color conﬁnement, no single quark can constitute

ordinary matter; only bound states of multiple quarks with certain combinations of color

can constitute ordinary matter. One combination, known as baryons, consists of having a

quark of each color charge, or anti-color in the case of an antiparticle, in equal parts. An

example of a baryon is the proton, which distributes a single unit of red, green, and blue

charge across its quark constituents. The other possible combination, known as mesons,

consists of arrangements that contain at least one color anti-color pair.
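The charge assignments above can be checked with a few lines of exact arithmetic. The sketch below sums the quark charges of a bound state; the trailing `~` used to mark an antiquark is a notation invented for this example, and the neutron and charged pion are included only as additional well-known test cases.

```python
from fractions import Fraction

# Electric charges of the quarks in units of e, as quoted in the text.
QUARK_CHARGE = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}


def constituent_charge(label: str) -> Fraction:
    """Charge of one constituent; a trailing '~' marks an antiquark.

    The '~' suffix is a hypothetical notation used only in this sketch.
    """
    if label.endswith("~"):
        return -QUARK_CHARGE[label[:-1]]
    return QUARK_CHARGE[label]


def hadron_charge(constituents) -> Fraction:
    """Total electric charge of a bound state from its quark content."""
    return sum((constituent_charge(q) for q in constituents), Fraction(0))


if __name__ == "__main__":
    print("proton  (u u d):", hadron_charge(["u", "u", "d"]))
    print("neutron (u d d):", hadron_charge(["u", "d", "d"]))
    print("pi+     (u d~) :", hadron_charge(["u", "d~"]))
```

Using `Fraction` keeps the thirds exact, so integer hadron charges come out as exact integers rather than floating-point approximations.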

The interactions between particles can be interpreted as the interacting particles exchanging

the corresponding mediator gauge boson of an interaction force. The photon (γ), which

is massless and electrically neutral, mediates the electromagnetic force between particles that

carry electric charge. The gluon (g), which is massless, electrically neutral, and carries a unit of

color and anti-color charge, mediates the strong interaction between particles that carry color

charge. The weak vector bosons W ± and Z mediate the weak interaction between particles

that carry weak isospin. The Z is electrically neutral, while the W + and W − carry electric

charges Q = 1 and Q = −1, respectively, allowing them to interact electromagnetically.

Finally, to complete the overview of particles in the SM, there is a single scalar boson,

which is the Higgs boson. The Higgs boson is both electrically and color neutral; however, it

has a weak isospin of −1/2. Unlike the gauge bosons in the SM, the Higgs boson does not

mediate a force from nature but plays an important role in the Higgs mechanism, which is

the process through which the weak vector bosons acquire their mass.

2.1.2 Symmetries and Lie Groups

The mathematical formulation of the SM is based on the Lagrangian formulation of QFT,

which is used to describe the diﬀerent interactions between particles as discussed in the


following sections. This formulation is attractive in particle physics for several reasons.

First, the Lagrangian is a scalar function, which implies that it must remain invariant under

transformations such as Lorentz transformations. This means that the description of physical

processes provided by the Lagrangian must be independent of the frame of reference of an

observer. This is a desired property when describing the interactions of elementary particles,

which are usually relativistic in nature. Another beneﬁt of the Lagrangian formulation is

its ability to encode conservation laws through symmetries in the Lagrangian. Conservation

of momentum and energy manifests when the Lagrangian is invariant under space-time

translations, which together with the Lorentz transformations form the Poincaré group. Similarly, the conservation of quantum numbers such as electric charge,

weak isospin, and color charge is manifested when the Lagrangian is invariant under gauge

transformations, which alter the internal properties of particles. This is formally stated

in Noether’s Theorem, which states that any continuous local transformation that leaves

invariant the action of a Lagrangian is associated with a conserved quantity.

These continuous transformations are described with Lie groups, of which the most relevant

in particle physics are the Poincaré group, the unitary group U (1), and the special

unitary groups SU (2) and SU (3). The Poincaré group represents the symmetries of the

Lorentz transformations. The unitary group U (1) represents the symmetries associated with

the choice of the electromagnetic potential. The special unitary groups SU (2) and SU (3)

represent the rotation symmetries on the internal weak isospin and color charge spaces,

respectively. In the following discussion, more attention will be given to the groups that

describe the conservation of quantum numbers of a particle since they are more essential

in the construction of the SM Lagrangian. It should be noted that there are multiple representations

of Lie groups; however, the properties and results discussed in this section are

independent of the representation used. The matrix representation of Lie groups will be used

in order to maintain consistency with the SM Lagrangian formulation. Formally, a matrix Lie group G is a group whose matrix elements M(θ) depend continuously on a set of real parameters $\theta \in \mathbb{R}^m$ [5]. This dependence on continuous parameters endows the Lie group G with the additional structure of a topological manifold. Matrix multiplication can be viewed as a continuous function f : G × G → G from the product manifold G × G to the manifold G such that $M(\theta) = M(\theta_1)M(\theta_2)$ if $\theta = f(\theta_1, \theta_2)$. The choice of parameters is made so that the identity matrix I coincides with θ = 0. For each matrix Lie group G, there is an associated Lie algebra $\mathfrak{g} = \mathrm{Span}\{v_1, \cdots, v_n\}$, which is a vector space spanned by the matrices in the tangent space to the identity matrix of G when viewed as a topological manifold. The Lie algebra is equipped with an additional operation known as the Lie bracket, which is analogous to a commutator of its elements in the case of matrix Lie groups:

$$ \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g} : [v_i, v_j] = v_i v_j - v_j v_i \tag{2.2} $$

The basis matrices $v_i$ of the tangent space are known as the generators of the Lie algebra. A Lie group is associated to its Lie algebra through the exponential map:

$$ \exp : \mathfrak{g} \to G : M(\theta) = e^{\sum_i \theta_i v_i} = \lim_{N \to \infty} \left( I + \frac{1}{N} \sum_i \theta_i v_i \right)^{N} \tag{2.3} $$
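The limit in Equation 2.3 can be illustrated numerically. The following minimal sketch, written with plain nested lists and helper names chosen for this example only, compares the finite-N product $(I + A/N)^N$ with the power-series definition of the matrix exponential for a single 2 × 2 generator. The generator and the value of θ are arbitrary choices made for the illustration.

```python
# Numerical illustration of the exponential map for one generator.
# Matrices are nested lists; all helper names are illustrative only.

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matpow(a, power):
    """Raise a square matrix to a non-negative integer power by squaring."""
    result = identity(len(a))
    base = a
    while power > 0:
        if power & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        power >>= 1
    return result


def exp_series(a, terms=30):
    """Matrix exponential from the truncated power series sum_k a^k / k!."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def exp_limit(a, big_n):
    """Approximate exp(a) by (I + a/N)^N for a finite N."""
    n = len(a)
    step = [[(1.0 if i == j else 0.0) + a[i][j] / big_n for j in range(n)]
            for i in range(n)]
    return matpow(step, big_n)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    theta = 0.7
    generator = [[0.0, -theta], [theta, 0.0]]  # theta times a 2x2 generator
    reference = exp_series(generator)
    for big_n in (10, 1_000, 100_000):
        print(big_n, max_abs_diff(exp_limit(generator, big_n), reference))
```

The printed differences shrink roughly in proportion to 1/N, which is the expected convergence rate of the product formula.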

The exponential map is only valid for matrices that are path-connected to the identity matrix

of G. For this reason, the special unitary groups cannot be generalized to the unitary groups

U (2) and U (3) since unitary matrices are characterized by having their determinant equal

to ±1. This essentially splits the manifold structure of the groups U (2) and U (3) into two

path-connected components based on the sign of the determinant. The matrices with a


negative determinant act as reﬂections in the internal spaces of the weak isospin and color

charge; thus, they would include terms in the Lagrangian that do not conserve these quantum

numbers. The structure of a Lie group G can be fully described near its identity matrix with

the generators of the Lie algebra and the structure constants that are obtained from the Lie

bracket.

A particular representation of the group U(1) is given by the set of 2 × 2 orthogonal matrices with determinant 1 over the real numbers that depend on a single real parameter θ:

$$ M(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \qquad \theta \in \mathbb{R} \tag{2.4} $$

From this representation, it is understood that the group structure satisfies:

$$ M(\theta) = M(\theta_1) M(\theta_2) = M(\theta_1 + \theta_2) \tag{2.5} $$

and is subject to the periodicity condition M(θ + 2π) = M(θ). This implies that the manifold structure of this group is the unit circle, which is a compact and connected space as shown in Figure 2.2. By differentiating Equation 2.5 with respect to $\theta_1$ and applying the chain rule, the following expression is obtained:

$$ \frac{dM(\theta)}{d\theta_1} = \frac{dM(\theta)}{d\theta} \frac{d(\theta_1 + \theta_2)}{d\theta_1} = \frac{dM(\theta)}{d\theta} = \frac{dM(\theta_1)}{d\theta_1} M(\theta_2) \tag{2.6} $$

Evaluating this expression at θ = 0, near the identity element, the matrices in the tangent space satisfy:

$$ \left. \frac{dM(\theta)}{d\theta} \right|_{\theta=0} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} M(\theta_2 = 0) = JI = J \tag{2.7} $$

Since Equation 2.6 evaluated at $\theta_1 = 0$ gives $dM(\theta)/d\theta = J M(\theta)$ for every θ, repeated differentiation yields the following relation:

$$ \left. \frac{d^n M(\theta)}{d\theta^n} \right|_{\theta=0} = J^n \tag{2.8} $$

By expanding an arbitrary element of U(1) in this representation into a power series, the following expression is obtained:

$$ M(\theta) = \sum_{n=0}^{\infty} \frac{1}{n!} \left. \frac{d^n M(\theta)}{d\theta^n} \right|_{\theta=0} \theta^n = e^{\theta J} \tag{2.9} $$

which is the exponential map that maps the Lie algebra u(1) to U(1). From this expression, u(1) is identified as the one-dimensional vector space generated by the matrix J, which is isomorphic to the line $i\mathbb{R}$ because $J^2 = -I$, so that J plays the role of the imaginary unit. Although the actual tangent line in the manifold structure of U(1) is $1 + i\mathbb{R}$, this line can be parametrized by the vector space $i\mathbb{R}$ by treating the point of tangency as the origin of u(1). Under this isomorphism, U(1) is now represented by complex numbers with modulus 1 under multiplication. The exponential map takes the familiar form of Euler’s formula, $e^{i\theta} = \cos\theta + i\sin\theta$, which maps the line $1 + i\mathbb{R}$ to the unit circle in the complex plane.
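The properties of this U(1) representation are easy to verify numerically. The sketch below, with helper names chosen only for this illustration, checks the group law of Equation 2.5, the 2π periodicity, the relation J² = −I, and the correspondence between a rotation matrix and a unit-modulus complex number.

```python
import cmath
import math

# The generator J of Equation 2.7.
J = [[0.0, -1.0], [1.0, 0.0]]


def rotation(theta):
    """2x2 rotation matrix M(theta) of Equation 2.4 as nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul2(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def as_phase(m):
    """Map the matrix a*I + b*J to the complex number a + i*b."""
    return complex(m[0][0], m[1][0])


def close(a, b, tol=1e-12):
    """Element-wise comparison of two matrices within a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    t1, t2 = 0.4, 1.1
    # Group law: composing two rotations adds their angles.
    print(close(matmul2(rotation(t1), rotation(t2)), rotation(t1 + t2)))
    # Periodicity: theta and theta + 2*pi label the same group element.
    print(close(rotation(t1), rotation(t1 + 2 * math.pi)))
    # J squares to minus the identity, like the imaginary unit.
    print(close(matmul2(J, J), [[-1.0, 0.0], [0.0, -1.0]]))
    # The same group element is exp(i*theta) on the unit circle.
    print(abs(as_phase(rotation(t1)) - cmath.exp(1j * t1)) < 1e-12)
```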

The group SU(2) can be represented as the set of all 2 × 2 unitary matrices with determinant 1 that have the form:

$$ M = \begin{pmatrix} a + ib & -c + id \\ c + id & a - ib \end{pmatrix}, \qquad a, b, c, d \in \mathbb{R} \tag{2.10} $$

Figure 2.2: Topological representation of the group U (1). Each element of this group repre-
sents a point on the unit circle in the complex plane. The red line is the tangent line to the
identity element of this group, which corresponds to the associated Lie algebra u(1). Any
point on this line can be mapped to the unit circle via the exponential map. Figure adapted
from [6].

The condition on the determinant, $a^2 + b^2 + c^2 + d^2 = 1$, implies that this group has the manifold structure of the unit 3-sphere, that is, the boundary of a 4-dimensional ball of radius 1. In this case, it is more instructive to show how the Lie algebra su(2) is constructed based on the properties of the matrices in SU(2). Using the fact that any complex matrix σ satisfies the identity

$$ \det e^{\sigma} = e^{\mathrm{Tr}(\sigma)} \tag{2.11} $$

it follows from the exponential map that if M ∈ SU(2) and σ ∈ su(2), then the condition det M = 1 requires Tr(σ) = 0. The Lie algebra su(2) therefore consists of 2 × 2 traceless complex matrices, which are additionally anti-Hermitian because M is unitary, and which can be spanned, up to a conventional factor of i, by the set of Pauli matrices:

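$$ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{2.12} $$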

Unlike U(1), the group SU(2) is a non-abelian group, which gives rise to the self-interaction terms of the electroweak bosons in the SM Lagrangian.
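As a quick check of the statements above, the following sketch builds a matrix of the form given in Equation 2.10 and verifies that it is unitary with unit determinant, and then shows that the Pauli matrices do not commute. The helper names and the sample values of a, b, c, and d are chosen for this illustration only.

```python
# Pauli matrices of Equation 2.12 as nested lists of complex numbers.
SIGMA1 = [[0j, 1 + 0j], [1 + 0j, 0j]]
SIGMA2 = [[0j, -1j], [1j, 0j]]
SIGMA3 = [[1 + 0j, 0j], [0j, -1 + 0j]]


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Hermitian conjugate (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def commutator(a, b):
    """Lie bracket [a, b] = ab - ba of Equation 2.2."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def su2_element(a, b, c, d):
    """Matrix of Equation 2.10; it lies in SU(2) when a^2+b^2+c^2+d^2 = 1."""
    return [[complex(a, b), complex(-c, d)], [complex(c, d), complex(a, -b)]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


if __name__ == "__main__":
    m = su2_element(0.5, 0.5, 0.5, 0.5)   # a point on the unit 3-sphere
    print("det M      =", det2(m))
    print("M^dagger M =", matmul(dagger(m), m))
    # Non-abelian structure: [sigma_1, sigma_2] = 2i sigma_3.
    print("[s1, s2]   =", commutator(SIGMA1, SIGMA2))
```

The non-vanishing commutator is the algebraic origin of the self-interaction terms mentioned in the text.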

Finally, the group SU(3) is similar to SU(2) but with 3 × 3 matrices instead. The non-abelian nature of SU(3) gives rise to the gluon self-interaction terms in the SM Lagrangian. The procedure to obtain the associated Lie algebra su(3) is similar to the one used for su(2), with the spanning set being the Gell-Mann λ matrices:

$$
\begin{aligned}
\lambda_1 &= \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, &
\lambda_2 &= \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, &
\lambda_3 &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \\
\lambda_4 &= \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, &
\lambda_5 &= \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, &
\lambda_6 &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \\
\lambda_7 &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, &
\lambda_8 &= \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}
\end{aligned}
\tag{2.13}
$$
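The defining properties of the Gell-Mann matrices can be confirmed with a short script. The sketch below, whose helper names are illustrative only, checks that each matrix is traceless and Hermitian, that the set satisfies the standard normalization Tr(λ_a λ_b) = 2δ_ab, and that the commutator of λ_1 and λ_2 is non-zero, as expected for a non-abelian algebra.

```python
import math

S = 1 / math.sqrt(3)  # normalization factor of lambda_8

GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],        # lambda_1
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],     # lambda_2
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],       # lambda_3
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],        # lambda_4
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],     # lambda_5
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],        # lambda_6
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],     # lambda_7
    [[S, 0, 0], [0, S, 0], [0, 0, -2 * S]],   # lambda_8
]


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal elements."""
    return sum(a[i][i] for i in range(len(a)))


def is_hermitian(a, tol=1e-12):
    """True when the matrix equals its conjugate transpose."""
    n = len(a)
    return all(abs(a[i][j] - complex(a[j][i]).conjugate()) < tol
               for i in range(n) for j in range(n))


def commutator(a, b):
    """Lie bracket [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    print("traceless:", all(abs(trace(m)) < 1e-12 for m in GELL_MANN))
    print("Hermitian:", all(is_hermitian(m) for m in GELL_MANN))
    norms = [round(abs(trace(matmul(m, m))), 12) for m in GELL_MANN]
    print("Tr(lambda_a^2):", norms)
    print("[lambda_1, lambda_2]:", commutator(GELL_MANN[0], GELL_MANN[1]))
```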




























2.1.3 Quantum Electrodynamics

To start the description of the SM Lagrangian, we begin with Quantum Electrodynamics (QED), which describes the electromagnetic interactions between electrically charged fermions. A free fermion, that is, one that is not subject to an external potential, can be described with Dirac’s equation

$$ i\gamma^{\mu} \partial_{\mu} \psi - m\psi = 0 \tag{2.14} $$

where ψ(x) is a Dirac spinor representing the wave function of the fermion field, m is the mass of the fermion, and the $\gamma^{\mu}$ are the 4 × 4 matrices

$$ \gamma^{0} = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \qquad \gamma^{i} = \begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix} \tag{2.15} $$

where $\sigma_i$ are the Pauli matrices in Equation 2.12 for i = 1, 2, 3. Dirac’s equation can be obtained as the equation of motion of the Lagrangian

$$ \mathcal{L} = \bar{\psi} (i\gamma^{\mu} \partial_{\mu} - m) \psi \tag{2.16} $$
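The matrices of Equation 2.15 can be checked against the anticommutation relation that Dirac matrices are required to satisfy, $\{\gamma^{\mu}, \gamma^{\nu}\} = 2 g^{\mu\nu} I$. The sketch below assembles them from 2 × 2 blocks and tests this relation, assuming the metric signature (+, −, −, −); that signature is an assumption of this example, chosen because it matches $(\gamma^0)^2 = +I$. All helper names are illustrative.

```python
# Building blocks: 2x2 identity, zero matrix, and the Pauli matrices.
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def scale(s, m):
    """Multiply every element of a matrix by the scalar s."""
    return [[s * x for x in row] for row in m]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    top = [ra + rb for ra, rb in zip(a, b)]
    bottom = [rc + rd for rc, rd in zip(c, d)]
    return top + bottom


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(a, b):
    """Return ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]


# gamma^0 followed by gamma^1, gamma^2, gamma^3 as in Equation 2.15.
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [
    block(Z2, s, scale(-1, s), Z2) for s in PAULI
]

# Diagonal of the assumed metric tensor g^{mu nu} = diag(+1, -1, -1, -1).
METRIC = [1, -1, -1, -1]


def clifford_holds(tol=1e-12):
    """Check {gamma^mu, gamma^nu} = 2 g^{mu nu} I for all index pairs."""
    for mu in range(4):
        for nu in range(4):
            anti = anticommutator(GAMMA[mu], GAMMA[nu])
            target = 2 * METRIC[mu] if mu == nu else 0
            for i in range(4):
                for j in range(4):
                    expected = target if i == j else 0
                    if abs(anti[i][j] - expected) > tol:
                        return False
    return True


if __name__ == "__main__":
    print("Clifford algebra satisfied:", clifford_holds())
```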

This Lagrangian is manifestly invariant under Lorentz transformations, which is one of the desired symmetries in the SM. Furthermore, upon closer inspection, this Lagrangian is also invariant under the U(1) global gauge transformation

$$ \psi \to \psi' = \psi e^{i\theta} \tag{2.17} $$

In its global form, it is not yet clear that this gauge transformation is associated

with the choice of electromagnetic potential and the conservation of electric charge. However, as

stated in subsection 2.1.2, in order to have a conserved quantity, there must be an associated

local gauge transformation. Promoting Equation 2.17 to a local gauge transformation by

introducing a space-time dependence on the phase θ(x) = qβ(x)

$$ \psi \to \psi' = \psi e^{iq\beta(x)} \tag{2.18} $$

we see that the Lagrangian is no longer invariant under this transformation. The transformed

Lagrangian L′ takes the form

$$ \mathcal{L}' = \mathcal{L} - q\bar\psi \gamma^\mu (\partial_\mu \beta(x))\psi \tag{2.19} $$
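The origin of the extra term can be made explicit with a short worked derivation. The sketch below uses only Equation 2.16 and Equation 2.18, and assumes that the conjugate spinor transforms with the opposite phase; the mass term is unaffected because the two phases cancel, so only the derivative acting on β(x) contributes.

```latex
\begin{align*}
\psi' &= e^{iq\beta(x)}\psi, \qquad \bar\psi' = \bar\psi\, e^{-iq\beta(x)} \\
\partial_\mu \psi' &= e^{iq\beta(x)}\left(\partial_\mu \psi + iq\,(\partial_\mu \beta(x))\,\psi\right) \\
\mathcal{L}' &= \bar\psi'\left(i\gamma^\mu \partial_\mu - m\right)\psi' \\
  &= \bar\psi\left(i\gamma^\mu \partial_\mu - m\right)\psi
     + \bar\psi\, i\gamma^\mu \left(iq\,\partial_\mu \beta(x)\right)\psi \\
  &= \mathcal{L} - q\,\bar\psi \gamma^\mu (\partial_\mu \beta(x))\,\psi
\end{align*}
```

The leftover term is proportional to the derivative of the phase, which is why it vanishes for a global transformation and why a field that shifts by ∂µβ(x) can absorb it.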

This is remedied using the minimal coupling rule, which introduces a minimal number of new

ﬁelds to the Lagrangian so that it remains locally invariant under the gauge transformation.

For electromagnetic interactions, it is only necessary to introduce a single vector ﬁeld Aµ(x)

such that it transforms under the local gauge transformation as:

$$ A_\mu(x) \to A'_\mu(x) = A_\mu(x) - \partial_\mu \beta(x) \tag{2.20} $$

and to replace the derivative with the covariant derivative:

$$ \partial_\mu \to \partial_\mu + iqA_\mu(x) \tag{2.21} $$

With these minimal coupling changes, the Lagrangian takes the form:

$$ \mathcal{L} = \bar\psi (i\gamma^\mu \partial_\mu - m)\psi - q\bar\psi \gamma^\mu \psi A_\mu \tag{2.22} $$

making it invariant under the local U (1) transformations described in Equation 2.18 and Equation 2.20.

It should be noted that the local U (1) invariance also allows for

a kinetic term for the field Aµ of the form:

$$ \mathcal{L}_{kin.} = -\frac{1}{4} A_{\mu\nu} A^{\mu\nu}, \qquad A_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{2.23} $$

However, Aµ must be massless since introducing a mass term that is proportional to AµAµ

breaks the local U (1) invariance. Based on this description, the field Aµ represents the

photon in the SM. Thus, the Lagrangian

$$ \mathcal{L}_{QED} = \bar\psi (i\gamma^\mu \partial_\mu - m)\psi - q\bar\psi \gamma^\mu \psi A_\mu - \frac{1}{4} A_{\mu\nu} A^{\mu\nu} \tag{2.24} $$

is the quantum description of electromagnetic interactions in the SM. The second term in

LQED indicates that the fermion ﬁelds couple to the photon ﬁeld with the coupling strength

being proportional to the electric charge q of the fermion. This is represented in the Feynman

diagram shown in Figure 2.3.

Figure 2.3: Vertex interaction from the QED Lagrangian, which can be combined to describe
processes like particle-antiparticle annihilation or scattering.

2.1.4 The Weak Force and Electroweak Uniﬁcation

As mentioned in subsection 2.1.1, the weak force is mediated by three gauge bosons: the

electrically neutral Z boson and the two electrically charged W + and W − bosons. Since


the Z boson is electrically neutral, all interactions mediated by it are similar to the electro-

magnetic interactions that are mediated by the photon. Since electric charge is a conserved

quantity, all SM weak interactions that are mediated by the W ± result in the interacting

fermion changing to the other fermion from the same generation, a process known as a flavor-changing

charged current. For example, an up quark interacting with a W − can turn into a down

quark, while an electron interacting with a W + can turn into an electron neutrino. If each

generation of fermions is thought of as a two-dimensional space with each axis corresponding

to a fermion ﬂavor, then the W boson acts as a rotation operator in this space.

Experimentally, it is known that the SM weak interactions violate the discrete symmetry

of parity [7]. If a given particle interaction is physically valid, then parity symmetry

dictates that its mirror image, obtained by inverting the spatial coordinates, is also a physically valid process. This violation is

manifested in the chirality of neutrinos. For massless particles, the chirality coincides with the helicity, which measures

the orientation of the spin of a particle relative to its momentum. Particles that have their

spin parallel to their momentum are referred to as right-handed, while those that are anti-parallel

are referred to as left-handed. Neutrinos are restricted to being left-handed, while

anti-neutrinos are restricted to being right-handed, as a consequence of the SM predicting the

neutrinos as massless particles. Consequently, the W boson only interacts with left-handed particles

and right-handed anti-particles. The Lagrangian that describes the weak interactions must take

into account the chirality distinction of particles.

Instead of only describing the weak interaction Lagrangian, it will be more instructive to

consider the combination of the electromagnetic force and the weak force into a single force,

known as the electroweak (EWK) force. This is in part motivated by the fact that the Z

boson behaves like a photon with mass. With all these considerations, we can split the SM


Lagrangian into two sectors: a left-chirality sector and a right-chirality sector.

$$ \mathcal{L} = \mathcal{L}_L + \mathcal{L}_R \tag{2.25} $$

LL will contain the interactions between left-handed fermions that are mediated by the

photon and all three weak gauge bosons, while LR will contain the interactions between

right-handed fermions that are mediated by the photon and the Z boson. Both LL and

LR are given by the free fermion Lagrangian in Equation 2.16, with the ﬁeld ψL in LL

representing the left-handed isospin doublets:

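$$ \begin{pmatrix} u \\ d \end{pmatrix}_L, \quad \begin{pmatrix} c \\ s \end{pmatrix}_L, \quad \begin{pmatrix} t \\ b \end{pmatrix}_L, \quad \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L, \quad \begin{pmatrix} \nu_\mu \\ \mu \end{pmatrix}_L, \quad \begin{pmatrix} \nu_\tau \\ \tau \end{pmatrix}_L \tag{2.26} $$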

while the ﬁeld ψR in LR represents the right-handed isospin singlets:

$$ u_R,\ d_R,\ c_R,\ s_R,\ t_R,\ b_R,\ e_R,\ \mu_R,\ \tau_R \tag{2.27} $$

Both sectors transform identically under the global U (1)Q transformations, while the left-chirality

sector contains the additional global symmetry of SU (2)I3 transformation invariance.

The labels Q and I3 represent the electric charge and weak isospin, which are to be conserved

when these transformations are promoted to local transformations. The combination of the

electromagnetic force and weak force symmetries can be represented with the direct product

of these two groups, SU (2) × U (1)Y , where the quantity Y = 2(Q − I3), known as the

weak hypercharge, is to be conserved under EWK interactions. Following a similar argument

as used in subsection 2.1.3, we let the left-handed isospin doublets transform as:

$$ \psi_L \to \psi'_L = e^{ig_1 Y \beta(x)}\, e^{ig_2 \alpha^a(x) \sigma_a}\, \psi_L \tag{2.28} $$

where σa are the Pauli matrices in Equation 2.12 that generate the SU (2) rotations, while

the right-handed isospin singlets transform as:

$$ \psi_R \to \psi'_R = e^{ig_1 Y \beta(x)}\, \psi_R \tag{2.29} $$
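The hypercharge Y entering these transformations follows from Y = 2(Q − I3). The sketch below tabulates it for the first generation, assuming the standard electric charges and isospin assignments (I3 = ±1/2 for the upper and lower doublet components, I3 = 0 for singlets); the dictionary and function names are illustrative only. It also checks that both members of a doublet share the same Y, which is what allows a single U (1)Y phase to act on the whole doublet.

```python
from fractions import Fraction as F


def weak_hypercharge(charge, isospin3):
    """Y = 2 (Q - I3), as defined in the text."""
    return 2 * (charge - isospin3)


# (label, electric charge Q, weak isospin I3); first generation only.
LEFT_DOUBLETS = {
    "quark": [("u_L", F(2, 3), F(1, 2)), ("d_L", F(-1, 3), F(-1, 2))],
    "lepton": [("nu_L", F(0), F(1, 2)), ("e_L", F(-1), F(-1, 2))],
}
RIGHT_SINGLETS = [
    ("u_R", F(2, 3), F(0)),
    ("d_R", F(-1, 3), F(0)),
    ("e_R", F(-1), F(0)),
]


def doublet_hypercharges():
    """Return the common Y of each doublet, asserting that it is unique."""
    result = {}
    for name, members in LEFT_DOUBLETS.items():
        values = {weak_hypercharge(q, i3) for _, q, i3 in members}
        assert len(values) == 1  # both components must share one Y
        result[name] = values.pop()
    return result


def singlet_hypercharges():
    return {label: weak_hypercharge(q, i3) for label, q, i3 in RIGHT_SINGLETS}
```

With these assumed inputs, the quark doublet carries Y = 1/3 and the lepton doublet Y = −1, while the singlets carry Y = 2Q.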

The Lagrangian is again no longer invariant under these transformations, taking the form:

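$$ \mathcal{L}' = \mathcal{L}_L - \bar\psi_L \left( g_1 Y \gamma^\mu \partial_\mu \beta(x) + g_2 \gamma^\mu \partial_\mu \alpha^a(x) \sigma_a \right) \psi_L + \mathcal{L}_R - g_1 Y \bar\psi_R \gamma^\mu \partial_\mu \beta(x) \psi_R \tag{2.30} $$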

Invoking the minimal coupling rule, four vector fields, Bµ and W^a_µ, are introduced such that

they transform under the local SU (2) × U (1)Y transformation as:

$$ B_\mu(x) \to B'_\mu(x) = B_\mu(x) - \partial_\mu \beta(x), \qquad W^a_\mu(x) \to W'^a_\mu(x) = W^a_\mu(x) - \partial_\mu \alpha^a(x) \tag{2.31} $$

and the left-chirality and right-chirality covariant derivative terms transform as:

$$ \partial_\mu \to \partial_\mu + ig_1 Y B_\mu + ig_2 W^a_\mu \sigma_a \tag{2.32} $$

$$ \partial_\mu \to \partial_\mu + ig_1 Y B_\mu \tag{2.33} $$

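With these minimal coupling changes, the Lagrangian describing EWK interactions takes the form:

$$
\begin{aligned}
\mathcal{L}_{EWK} &= \mathcal{L}_L + \mathcal{L}_R + \mathcal{L}_{kin.} \\
\mathcal{L}_L &= \bar\psi_L (i\gamma^\mu \partial_\mu - m)\psi_L - g_1 Y \bar\psi_L \gamma^\mu B_\mu \psi_L - g_2 \bar\psi_L \gamma^\mu W^a_\mu \sigma_a \psi_L \\
\mathcal{L}_R &= \bar\psi_R (i\gamma^\mu \partial_\mu - m)\psi_R - g_1 Y \bar\psi_R \gamma^\mu B_\mu \psi_R \\
\mathcal{L}_{kin.} &= -\frac{1}{4} B_{\mu\nu} B^{\mu\nu} - \frac{1}{4} W^i_{\mu\nu} W_i^{\mu\nu}
\end{aligned}
\tag{2.34}
$$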

where the fields B and W^a couple to the fermion fields with strengths g1 and g2, respectively.

The kinetic term Bµν is analogous to the photon kinetic term introduced in the previous

section, while the kinetic term W^i_µν is given by:

$$ W^i_{\mu\nu} = \partial_\mu W^i_\nu - \partial_\nu W^i_\mu + g_2 \epsilon^{ijk} W^j_\mu W^k_\nu \tag{2.35} $$

The last term in this expression arises from the non-abelian nature of SU (2) and is responsi-

ble for the self-interaction terms of the vector boson ﬁelds. Introducing mass terms for these

ﬁelds in Equation 2.34 breaks the local SU (2) × U (1)Y gauge invariance, which is a glaring

issue since the weak vector bosons are experimentally known to have mass. This issue can

be remedied with a spontaneous symmetry breaking process known as the Higgs mechanism,

which will be discussed in the next section. Additionally, based on the shapes of the Pauli

matrices, the ﬁelds W 1 and W 2 will mix together to form the W ± boson ﬁelds, while the B

and W 3 ﬁelds will mix together to form the Z boson and the photon ﬁelds. The Feynman

diagrams of the weak interactions introduced by LEWK are shown in Figure 2.4.


Figure 2.4: Vertex interaction from the weak force in the EWK Lagrangian.

2.1.5 Higgs Mechanism

As discussed in the previous section, four massless vector bosons arise when the Lagrangian

describing the EWK interactions of the SM particles is required to be invariant under local

SU (2)×U (1)Y transformations. From experimental evidence, it is known that three of these

vector bosons have non-zero mass. The Higgs mechanism enables the W ± and Z bosons

to acquire their mass in the SM through a spontaneous breaking of the EWK symmetry.

This is achieved by introducing a spin-zero ﬁeld, known as the Higgs ﬁeld, which has a

non-zero vacuum expectation value (VEV) with non-zero SU (2) × U (1)Y quantum numbers.

This process will leave the Lagrangian invariant under local SU (2) × U (1)Y transformations;

however, the ground state of the system will no longer be invariant due to it having a non-

zero Y quantum number. A way to achieve the spontaneous symmetry breaking is to allow

the Higgs ﬁeld to transform as an SU (2) doublet:

$$ \phi(x) = \frac{1}{\sqrt{2}} \begin{pmatrix} \phi_1(x) + i\phi_2(x) \\ \phi_3(x) + i\phi_4(x) \end{pmatrix} \tag{2.36} $$

where the ﬁelds φ1−4 are real-valued scalar ﬁelds. Furthermore, the Higgs ﬁeld is allowed to

interact with itself through the Lagrangian

$$ \mathcal{L}_\phi = (\partial_\mu \bar\phi)(\partial^\mu \phi) - \mu^2 \bar\phi\phi - \lambda (\bar\phi\phi)^2 \tag{2.37} $$

where µ and λ are the parameters that govern the self-interaction potential of the Higgs

ﬁeld. For now, it suﬃces to assume that λ > 0 in order to have the potential bounded from

below. To ﬁnd the VEV of the Higgs ﬁeld we need to minimize its potential:

$$ \frac{\partial V(\phi)}{\partial \phi} = 0 \implies \bar\phi \left( \mu^2 + 2\lambda (\bar\phi\phi) \right) = 0 \tag{2.38} $$

If µ² > 0, then the potential has its minimum at φ(x) = 0, meaning the Higgs field has a vanishing VEV

and thus the vector bosons remain massless. The interesting case is when µ² < 0, in

which we obtain a non-trivial solution

$$ \bar\phi\phi = \frac{\phi_1^2 + \phi_2^2 + \phi_3^2 + \phi_4^2}{2} = -\frac{\mu^2}{2\lambda} = \frac{v^2}{2} \tag{2.39} $$

where choosing a particular set of values for the ﬁelds φ1−4 will spontaneously break the

SU (2) symmetry of the vacuum, as shown in Figure 2.5. An appropriate choice so that the

vector bosons W ± and Z acquire mass is to set φ3(x) = v and the other ﬁelds to zero. Thus,

with this choice of ﬁeld values, the Higgs ﬁeld at the minimum of the potential becomes

$$ \phi_0(x) = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix} \tag{2.40} $$
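The location of the minimum in Equation 2.39 can be cross-checked numerically. The sketch below writes the potential read off from Equation 2.37 as a function of ρ = φ̄φ, V(ρ) = µ²ρ + λρ², and compares the analytic minimum with a brute-force scan. The parameter values are hypothetical and in arbitrary units, chosen only so that the result is easy to read; the function names are likewise illustrative.

```python
import math


def higgs_potential(rho, mu2, lam):
    """V as a function of rho = phibar * phi, read off from Equation 2.37."""
    return mu2 * rho + lam * rho ** 2


def analytic_minimum(mu2, lam):
    """Return (rho_min, v) from Equation 2.39; requires mu2 < 0 < lam."""
    if mu2 >= 0 or lam <= 0:
        raise ValueError("a non-trivial minimum needs mu2 < 0 and lam > 0")
    rho_min = -mu2 / (2 * lam)
    return rho_min, math.sqrt(2 * rho_min)


def scanned_minimum(mu2, lam, rho_max=5.0, steps=5000):
    """Brute-force scan of V over 0 <= rho <= rho_max."""
    grid = [rho_max * k / steps for k in range(steps + 1)]
    return min(grid, key=lambda rho: higgs_potential(rho, mu2, lam))


# Hypothetical parameters in arbitrary units (not fitted to data).
MU2, LAM = -1.0, 0.25
RHO_MIN, VEV = analytic_minimum(MU2, LAM)
```

For a positive µ² the same scan returns ρ = 0, which corresponds to the trivial solution discussed above.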

Figure 2.5: Graphical representation of the Higgs interaction potential with µ2 < 0 and
λ > 0. The EWK symmetry is spontaneously broken when a particular value of the ﬁeld
components φi is chosen to represent the VEV. This ﬁgure is taken from [8].


Since we are interested in breaking the SU (2) component of the SU (2) × U (1)Y symmetry,

this choice of the ﬁeld at the minimum corresponds to the Higgs ﬁeld being electrically

neutral, so that charge is conserved at the ground state. This choice also determines the

remaining quantum numbers of the ﬁeld. Since only the down component of the doublet is

non-zero, then the weak isospin must be I3 = −1/2, and through Y = 2(Q − I3), the weak

hypercharge is determined to be Yh = 1. To see the Higgs mechanism operate, we consider

perturbations around the minimum of the potential of the form

$$ \phi(x) = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + h(x) \end{pmatrix} \tag{2.41} $$

where h(x) is the physical Higgs field. To make the Lagrangian in Equation 2.37 locally

invariant under the SU (2) × U (1)Y transformation, we replace the derivative with the covariant derivative

given in Equation 2.32. Thus, the Lagrangian now takes the form

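$$ \mathcal{L}_\phi = -\frac{\mu^2 h^2}{2} + \frac{1}{2} \begin{pmatrix} 0 & v \end{pmatrix} (g_1 Y_h B_\mu + g_2 \sigma_a W^a_\mu)(g_1 Y_h B^\mu + g_2 \sigma_a W^{a\mu}) \begin{pmatrix} 0 \\ v \end{pmatrix} + O \tag{2.42} $$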






where only the relevant terms for the Higgs mechanism are shown. This can be further

simpliﬁed by setting Yh = 1 for the Higgs ﬁeld and writing down the Pauli matrices explicitly:

$$ \mathcal{L}_\phi = -\frac{\mu^2 h^2}{2} + \frac{1}{2} v^2 g_2^2 \left( (W^1_\mu)^2 + (W^2_\mu)^2 \right) + \frac{1}{2} v^2 (g_1 B_\mu - g_2 W^3_\mu)^2 + O \tag{2.43} $$

From here we can see that the Higgs field has a mass term m²_h = |µ|². To make the mass

terms of the weak vector bosons apparent, we need to change from the W^a_µ basis to the

electric charge basis as follows:

$$ W^\pm_\mu = \frac{1}{\sqrt{2}} (-W^1_\mu \pm iW^2_\mu), \qquad W^0_\mu = W^3_\mu \tag{2.44} $$

Under this basis, the Z boson and the photon are represented as

$$ Z_\mu = \frac{1}{\sqrt{g_1^2 + g_2^2}} (g_2 W^3_\mu - g_1 B_\mu), \qquad A_\mu = \frac{1}{\sqrt{g_1^2 + g_2^2}} (g_1 W^3_\mu + g_2 B_\mu) \tag{2.45} $$

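Thus, the Lagrangian in Equation 2.43 becomes

$$ \mathcal{L}_\phi = -\frac{\mu^2 h^2}{2} + \frac{1}{2} v^2 g_2^2 W^+_\mu W^{-\mu} + \frac{1}{2} v^2 (g_1^2 + g_2^2) Z_\mu Z^\mu + O \tag{2.46} $$

By identifying the mass terms

$$ m_{W^\pm} = \frac{v g_2}{2}, \qquad m_Z = \frac{m_{W^\pm}}{\cos\theta_W}, \qquad m_A = 0, \qquad \tan\theta_W = \frac{g_1}{g_2} \tag{2.47} $$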

where θW is known as the weak mixing angle, we see that the Higgs mechanism has reconciled

the SM theory with experimental observations by providing the mass terms to the weak

vector bosons while still requiring the photon to be massless.
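To get a feeling for the numbers in Equation 2.47, the sketch below evaluates the mass relations and the field rotation of Equation 2.45. The inputs g1 ≈ 0.36, g2 ≈ 0.65 and v ≈ 246 GeV are approximate, assumed values used for illustration only, and the function names are not taken from any existing software.

```python
import math


def electroweak_masses(g1, g2, v):
    """Evaluate Equation 2.47 for given couplings and VEV (in GeV)."""
    theta_w = math.atan2(g1, g2)      # tan(theta_W) = g1 / g2
    m_w = v * g2 / 2
    m_z = m_w / math.cos(theta_w)
    return {"theta_w": theta_w, "m_w": m_w, "m_z": m_z, "m_a": 0.0}


def to_mass_basis(w3, b, g1, g2):
    """Rotate (W3, B) into (Z, A) following Equation 2.45."""
    norm = math.hypot(g1, g2)
    z = (g2 * w3 - g1 * b) / norm
    a = (g1 * w3 + g2 * b) / norm
    return z, a


# Approximate, assumed inputs for illustration.
G1, G2, VEV_GEV = 0.36, 0.65, 246.0
MASSES = electroweak_masses(G1, G2, VEV_GEV)
```

With these inputs the sketch gives a W mass close to 80 GeV and a Z mass close to 91 GeV. Because Equation 2.45 is a rotation, the combination Z² + A² equals (W3)² + B², so no degrees of freedom are lost when changing basis.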

The main purpose of the Higgs mechanism was to incorporate the masses of the weak


vector bosons into the SM; however, for fermions a similar argument can be made to explain

why not all fermions are massless. To achieve this, we consider the following SU (2) invari-

ant interaction between fermions and the Higgs doublet, which is added to the Lagrangian

in Equation 2.43

$$ \mathcal{L}_{int} = g_f \left( \bar\psi_L \phi \psi_R + \bar\phi \bar\psi_R \psi_L \right) \tag{2.48} $$

where ψL and ψR are the left-handed doublet and right-handed singlet of a fermion that

couples to the Higgs field with strength gf . By expanding the Higgs doublet around its VEV as in Equation 2.41,

the interaction Lagrangian becomes

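$$
\begin{aligned}
\mathcal{L}_{int} &= \frac{g_f}{\sqrt{2}} \begin{pmatrix} \bar\psi^1_L & \bar\psi^2_L \end{pmatrix} \begin{pmatrix} 0 \\ v + h \end{pmatrix} \psi_R + \frac{g_f}{\sqrt{2}} \bar\psi_R \begin{pmatrix} 0 & v + h \end{pmatrix} \begin{pmatrix} \psi^1_L \\ \psi^2_L \end{pmatrix} \\
&= \frac{g_f v}{\sqrt{2}} \left( \bar\psi^2_L \psi_R + \bar\psi_R \psi^2_L \right) + \frac{g_f}{\sqrt{2}} \left( \bar\psi^2_L \psi_R + \bar\psi_R \psi^2_L \right) h
\end{aligned}
\tag{2.49}
$$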

where the fermion left-handed doublet has been decomposed into its up (ψ¹_L) and down (ψ²_L)

components. From this equation it can be observed that the fermions acquire a mass term

mf = gf v/√2, which is directly proportional to the Higgs coupling. The new interactions

included in the SM with the introduction of the Higgs mechanism are shown in Figure 2.6.
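The relation mf = gf v/√2 can be inverted to estimate the size of the Higgs coupling of each fermion. The sketch below does this for a few fermions, assuming v ≈ 246 GeV and approximate masses (top ≈ 173 GeV, bottom ≈ 4.2 GeV, electron ≈ 0.511 MeV); these inputs and the function names are illustrative assumptions rather than values quoted in this chapter.

```python
import math

VEV_GEV = 246.0  # assumed EWK symmetry breaking scale


def yukawa_coupling(mass_gev, vev_gev=VEV_GEV):
    """Invert m_f = g_f v / sqrt(2) for the coupling g_f."""
    return math.sqrt(2) * mass_gev / vev_gev


def fermion_mass(g_f, vev_gev=VEV_GEV):
    """m_f = g_f v / sqrt(2)."""
    return g_f * vev_gev / math.sqrt(2)


# Approximate fermion masses in GeV, assumed for illustration.
APPROX_MASSES = {"top": 173.0, "bottom": 4.2, "electron": 0.000511}
COUPLINGS = {name: yukawa_coupling(m) for name, m in APPROX_MASSES.items()}
```

Under these assumptions the top quark coupling comes out close to one, several orders of magnitude larger than that of the electron, which is why the top quark plays the leading role in loop corrections to the Higgs mass.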

2.1.6 Quantum Chromodynamics

The ﬁnal important piece of the SM Lagrangian to be introduced is Quantum Chromody-

namics (QCD), which describes the interactions between particles that carry color charge.

Since there are three color charges, the QCD Lagrangian will be required to have local SU (3)

symmetry, which will give rise to the gluons of the SM. Using an argument similar to the one

used for the EWK Lagrangian, we start with the free fermion Lagrangian in Equation 2.16,


Figure 2.6: Vertices corresponding to the interactions between the Higgs boson and the weak
vector bosons W ± and Z, fermions with non-zero mass fm, and the self-interactions of the
Higgs boson.

where the ﬁeld ψ now represents the quark SU (3) triplet in color space. We now require the

Lagrangian to be invariant under the transformation

$$ \psi \to \psi' = e^{ig_3 \alpha^a(x) \lambda_a}\, \psi \tag{2.50} $$

where λa are the eight Gell-Mann matrices introduced in Equation 2.13. To make the QCD

Lagrangian invariant under this local transformation, we require that the covariant derivative

transforms as

$$ \partial_\mu \to \partial_\mu + ig_3 G^a_\mu \lambda_a \tag{2.51} $$

where eight new vector fields G^a_µ have been introduced through the minimal coupling rule.

These fields correspond to the gluons in the SM and must transform as

$$ G^a_\mu(x) \to G^a_\mu(x) - \partial_\mu \alpha^a(x) - f_{abc} G^c_\mu(x) \alpha^b(x) \tag{2.52} $$

where the f_abc terms are the structure constants obtained from the Lie brackets of su(3)

that give rise to the gluon self-interaction terms. With these transformation rules, the part

of the QCD Lagrangian that describes the strong force interaction between quarks is given

by

$$ \mathcal{L}_{QCD} = -g_3 \bar\psi \gamma^\mu G^a_\mu \lambda_a \psi \tag{2.53} $$

The physical explanation as to why there are eight ﬁelds associated with the gluon instead

of one is that when gluons mediate the strong force between quarks, in order to conserve

color charge, each gluon must carry a unit of color and another unit of anti-color. Naively,

one would assume that there would be nine gluon ﬁelds since there are three diﬀerent color

charges. To understand why this is not the case, we can represent each single color state as

the three axes of the internal color charge space

$$ r = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad b = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad g = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \tag{2.54} $$

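Since each new vector boson corresponds to a generator of the su(3) Lie algebra, we can

represent each of the Gell-Mann matrix generators as the following outer products:

$$
\begin{aligned}
(r\bar b + b\bar r)/\sqrt{2} &\to \lambda_1/\sqrt{2}, & -i(r\bar b - b\bar r)/\sqrt{2} &\to \lambda_2/\sqrt{2}, \\
(r\bar r - b\bar b)/\sqrt{2} &\to \lambda_3/\sqrt{2}, & (r\bar g + g\bar r)/\sqrt{2} &\to \lambda_4/\sqrt{2}, \\
-i(r\bar g - g\bar r)/\sqrt{2} &\to \lambda_5/\sqrt{2}, & (b\bar g + g\bar b)/\sqrt{2} &\to \lambda_6/\sqrt{2}, \\
-i(b\bar g - g\bar b)/\sqrt{2} &\to \lambda_7/\sqrt{2}, & (r\bar r + b\bar b - 2g\bar g)/\sqrt{6} &\to \lambda_8/\sqrt{2}
\end{aligned}
\tag{2.55}
$$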

Any other combination of outer products will result in a linear combination of the existing

generators or in a matrix that is not in the Lie algebra su(3) since it will not be traceless,

like the case of the color singlet (r¯r + b¯b + g¯g)/√3. The new interactions included in the SM

through the QCD Lagrangian are shown in Figure 2.7.

Figure 2.7: Vertices corresponding to the strong interactions between quarks q and self-
interaction terms between gluons g.
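The color-anticolor construction can be checked directly by forming the outer products of the basis vectors in Equation 2.54. The sketch below builds a few of the combinations in Equation 2.55 and the color singlet, and compares their traces; the helper names `outer`, `combine` and `trace` are illustrative, and the anticolor states are assumed to be the transposed (row) versions of the real basis vectors.

```python
import math

# Color basis vectors from Equation 2.54.
R, B, G = (1, 0, 0), (0, 1, 0), (0, 0, 1)


def outer(color, anticolor):
    """Outer product of a color column vector with an anticolor row vector."""
    return [[color[i] * anticolor[j] for j in range(3)] for i in range(3)]


def combine(*terms):
    """Sum of (coefficient, matrix) pairs."""
    total = [[0] * 3 for _ in range(3)]
    for coeff, matrix in terms:
        for i in range(3):
            for j in range(3):
                total[i][j] += coeff * matrix[i][j]
    return total


def trace(m):
    return sum(m[i][i] for i in range(3))


S3 = 1 / math.sqrt(3)

# Combinations proportional to lambda_1, lambda_2 and lambda_8.
LAMBDA_1 = combine((1, outer(R, B)), (1, outer(B, R)))
LAMBDA_2 = combine((-1j, outer(R, B)), (1j, outer(B, R)))
LAMBDA_8 = combine((S3, outer(R, R)), (S3, outer(B, B)), (-2 * S3, outer(G, G)))

# The color singlet, which is not traceless and hence not in su(3).
SINGLET = combine((S3, outer(R, R)), (S3, outer(B, B)), (S3, outer(G, G)))
```

The singlet has trace √3, while the eight combinations of Equation 2.55 are traceless, which is the counting argument for eight gluons rather than nine.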

2.1.7 The Standard Model Lagrangian

After going through the individual components of the Lagrangian that represent the electroweak

and strong interactions and having a mechanism that spontaneously breaks the

EWK symmetry, which explains how the weak vector bosons acquire their mass, we are now

in a position to give a full description of the most important terms of the SM Lagrangian.


The SM Lagrangian can be summarized as:

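$$
\begin{aligned}
\mathcal{L} ={}& \sum_f i\bar\psi_f \gamma^\mu \partial_\mu \psi_f && \text{fermion kinetic terms} \\
&+ \sum_f \frac{v g_f}{\sqrt{2}} \left( \bar\psi^f_R \psi^f_L + \bar\psi^f_L \psi^f_R \right) && \text{fermion mass terms} \\
&- \sum_f \bar\psi_f \gamma^\mu \left( g_1 Y B_\mu + g_2 W^a_\mu \sigma_a \right) \psi_f && \text{fermion-EWK boson interaction terms} \\
&- \sum_q \bar q \gamma^\mu \left( g_3 G^a_\mu \lambda_a \right) q && \text{quark-gluon interaction terms} \\
&+ (\partial_\mu \bar\phi)(\partial^\mu \phi) && \text{Higgs kinetic term} \\
&- \left( \mu^2 \bar\phi\phi + \lambda (\bar\phi\phi)^2 \right) && \text{Higgs potential term} \\
&+ \bar\phi \left( g_1 Y B_\mu + g_2 W^a_\mu \sigma_a \right) \left( g_1 Y B^\mu + g_2 W^{a\mu} \sigma_a \right) \phi + O
\end{aligned}
\tag{2.56}
$$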

where O includes additional kinetic terms of the vector bosons, chirality terms and higher

order terms.

2.2 Shortcomings of the Standard Model

The SM has proven to be a successful theory that describes the interactions of elementary

particles and has gone through a battery of experimental tests to validate its predictions.

However, there are certain phenomena that we observe in the universe for which the SM

is unable to provide an explanation, suggesting that the SM is in fact an eﬀective theory

that may be valid up to a certain energy scale. This is not the ﬁrst time that a physical

theory that has provided several veriﬁable predictions falls short when accommodating new

observations. In fact one could argue that this is a desirable thing to happen, for it is a


sign that new physics is to be discovered.

In this section, a brief overview of some of

these phenomenological issues that may be hinting at new physics is presented. More

attention will be given to the Hierarchy Problem, which is related to the VLQ searches

presented in this thesis.

2.2.1 Gravity

As discussed in section 2.1, the SM predicts the existence of four gauge bosons, which

are the quantizations of the EWK and strong fundamental forces. Gravity is the remaining

fundamental force that has evaded a quantum description of its interactions. Fundamentally,

this issue can be explained as trying to reconcile QFT and general relativity (GR) into a single

theory. Although QFT incorporates special relativity, as can be seen in Dirac’s equation, the

full effects of curved space-time are not taken into account in QFT. Incorporating them would render the

theory nonrenormalizable due to the self-interaction terms that a mediator particle of gravity

would have. This is a puzzling phenomenon, as gravitational interactions are very weak at

short-distance scales that are characteristic of the SM interactions but become dominant at

astronomical scales. This suggests that the SM is an eﬀective theory that is not able to fully

resolve all degrees of freedom at smaller length scales, which correspond to more energetic

interactions.

2.2.2 Baryogenesis

The fact that the universe exists with matter predominantly occupying space and is not a

vacuum occupied by energy is a phenomenon that the SM cannot explain. Although the SM

does provide the interactions to produce matter and anti-matter pairs from energy, there


is no built-in mechanism that favors the production of one over the other, as would be needed to explain

the excess of ordinary matter after the big bang. As ordinary matter is composed of

baryons, this phenomenon, known as Baryogenesis, indicates that the number of baryons is not conserved

at a fundamental level, hinting at a possible breaking of an unknown

symmetry.

2.2.3 Dark Matter

The existence of dark matter has been used to explain astronomical phenomena such as the

discrepancy in rotational curves of galaxies [9], which measure the velocity distribution of

stars in galaxies as a function of their distance to the center of the galaxy. Without dark

matter, the rotational curves are expected to decrease at larger distances from the center

since there is less matter to provide gravitational pull to the stars.

Instead, an increase

in velocity is observed at larger distances, as shown in Figure 2.8. Another phenomenon

Figure 2.8: Rotational curve of the spiral galaxy Messier 33 (M33) broken down into indi-
vidual contributions. The contribution to the velocity distribution from gas in the galaxy is
shown by the long dashed line, from the stellar disk by the short dashed line, and from the
dark matter halo by the dashed-dotted line. The continuous line is the best-fit model, which
incorporates dark matter to explain the observed velocity data points. This figure is taken
from [10].

that can be explained by dark matter is gravitational lensing in regions where there is not

enough visible matter after the collision of galaxy clusters [11].

In both examples, dark


Figure 2.9: Loop corrections to the Higgs boson mass from interactions with the top quark,
EWK bosons, and self-interactions. Figure taken from [3].

matter would explain these observations by being an abundant source of dense matter that

does not interact electromagnetically with ordinary matter. The phenomenological implication of

dark matter for the SM is the existence of new particles that form dark matter, which could

have their own set of interactions that are not part of the SM.
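The expected fall-off of the rotational curve without dark matter follows from Newtonian gravity: for a star orbiting outside the bulk of the visible mass M, the circular velocity is v = √(GM/r). The sketch below evaluates this under the simplifying assumptions of a spherically symmetric mass that is fully enclosed by the orbit and a hypothetical visible mass of 10^10 solar masses; it is not a model of M33, and the function names are illustrative.

```python
import math

# Newton's constant in kpc (km/s)^2 per solar mass.
G_NEWTON = 4.30091e-6


def keplerian_velocity(enclosed_mass_msun, radius_kpc):
    """Circular velocity in km/s for a mass fully enclosed by the orbit."""
    return math.sqrt(G_NEWTON * enclosed_mass_msun / radius_kpc)


def mass_for_flat_curve(velocity_kms, radius_kpc):
    """Enclosed mass needed to sustain a given velocity at a given radius."""
    return velocity_kms ** 2 * radius_kpc / G_NEWTON


VISIBLE_MASS = 1.0e10  # hypothetical visible mass in solar masses
V_INNER = keplerian_velocity(VISIBLE_MASS, 5.0)
V_OUTER = keplerian_velocity(VISIBLE_MASS, 15.0)
```

In this picture, tripling the radius lowers the velocity by a factor of √3. A curve that stays flat or rises instead requires the enclosed mass to keep growing at least linearly with radius, as `mass_for_flat_curve` shows, which is the role played by the dark matter halo.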

2.2.4 Hierarchy Problem

The Hierarchy Problem has its origins in the large discrepancy between the EWK energy

scale v ≈ 246 GeV, which is related to the Higgs boson mass through the EWK spontaneous

symmetry breaking, and the Planck energy scale MP = √(ħ/(8πGN)) ≈ 2.4 × 10^18 GeV, at

which gravitational effects in particle interactions need to be taken into consideration. The

Planck energy scale can be taken as the cutoff energy scale ΛSM at which the SM loses its

predictive power as an effective theory. Given that the EWK and Planck scales are separated

by approximately 16 orders of magnitude and the Higgs boson mass sits at the lower end of

the spectrum, this indicates that there is a wide range of energy scales with no physics that

is described by the SM. To fully appreciate the phenomenological issue that is the Hierarchy

Problem, one needs to take into account the contributions that the particles of the SM that

interact with the Higgs have on its mass. These contributions arise from loop corrections,

such as the ones shown in Figure 2.9. If all fundamental parameters ptrue of the theory that

describe the Higgs mass are known, then the Higgs mass can be expressed as

$$ m_h^2 = \int_0^\infty dE\, \frac{dm_h^2}{dE}(E; p_{true}) \tag{2.57} $$

where the integrand contains all loop corrections to the Higgs mass that originate from the

SM particles. This motivates splitting the integral into two regions that are defined by the

cutoﬀ energy scale ΛSM as follows:

$$ m_h^2 = \int_0^{\Lambda_{SM}} dE\, \frac{dm_h^2}{dE}(E; p_{true}) + \int_{\Lambda_{SM}}^\infty dE\, \frac{dm_h^2}{dE}(E; p_{true}) = \delta_{SM} m_h^2 + \delta_{BSM} m_h^2 \tag{2.58} $$

where δSM m²_h are the contributions to the Higgs mass that are attributed to the SM interactions,

and δBSM m²_h are unknown contributions from BSM physics. The SM term can be

roughly estimated from the main loop correction contributions as

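$$ \delta_{SM} m_h^2 = \frac{3y_t^2}{4\pi^2} \Lambda_{SM}^2 - \frac{3g_2^2}{8\pi^2} \left( \frac{1}{4} + \frac{1}{8\cos^2\theta_W} \right) \Lambda_{SM}^2 - \frac{3\lambda}{8\pi^2} \Lambda_{SM}^2 \tag{2.59} $$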

where each individual term corresponds to the top quark loop, EWK boson loops, and Higgs

loop, respectively. The most important of these terms is the one from the top quark, due

to its large Yukawa coupling yt ≈ 1, which is proportional to the top quark mass. Since this

term has a large positive contribution if ΛSM is sufficiently large, in order to produce the

relatively small value of the Higgs mass with the true theory, the BSM term must provide a

cancellation of roughly equal magnitude and opposite sign as the SM term. This can only be

achieved if the fundamental parameters of the theory are fine-tuned to produce a cancellation

∆ that can be bounded below as:

$$ \Delta \geq \frac{\delta_{SM} m_h^2}{m_h^2} = \frac{3y_t^2}{4\pi^2} \left( \frac{\Lambda_{SM}}{m_h} \right)^2 \approx \left( \frac{\Lambda_{SM}}{450\ \text{GeV}} \right)^2 \tag{2.60} $$

As an example, if the energy scale to discover new physics turns out to be the scale of a

grand unified theory (GUT), ΛSM = MGUT ≈ 10^15 GeV, where the EWK and strong forces

are described by a single force, the cancellation would be of the order ∆ ≥ 10^24. This

is a glaring issue, since the true theory parameters that describe the SM and BSM terms,

which are completely unrelated, have to produce a 24-digit cancellation in order to explain

the Higgs mass.
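The numbers quoted in Equation 2.60 can be reproduced with a few lines of arithmetic. The sketch below keeps only the top quark term and assumes yt = 1 and a Higgs mass of 125 GeV; both inputs are assumptions of this sketch, and the function names are illustrative.

```python
import math


def top_loop_scale(m_h_gev=125.0, y_t=1.0):
    """Scale s such that the bound reads (Lambda / s)^2, top term only."""
    return m_h_gev / math.sqrt(3 * y_t ** 2 / (4 * math.pi ** 2))


def fine_tuning_bound(cutoff_gev, m_h_gev=125.0, y_t=1.0):
    """Lower bound on the cancellation from the top loop in Equation 2.60."""
    return 3 * y_t ** 2 / (4 * math.pi ** 2) * (cutoff_gev / m_h_gev) ** 2


SCALE_GEV = top_loop_scale()
DELTA_TEV = fine_tuning_bound(1.0e3)    # cutoff at 1 TeV
DELTA_GUT = fine_tuning_bound(1.0e15)   # cutoff at an assumed GUT scale
```

Under these assumptions the characteristic scale comes out near 450 GeV, a cutoff at the TeV scale requires only a cancellation of order a few, and a cutoff at 10^15 GeV requires one of order 10^24.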

2.3 Vector-Like Quark Theory Overview

Several BSM theories have been proposed to solve the Hierarchy Problem presented in the

previous section, such as Supersymmetry (SUSY) and Composite Higgs (CH). The latter

will be the focus of this section, as a key prediction of Higgs compositeness is the existence

of new fermionic resonances that are referred to as vector-like quarks (VLQs).

2.3.1 Composite Higgs Models

The idea behind the CH model is that the Higgs is not an elementary particle but instead a

bound state of some new particles that interact through a new force. This new force would

then give the Higgs boson a ﬁnite geometric size lh, similar to how the quark constituents

of the proton are bound within its radius by the strong force. The binding energy of the

Higgs is then given by m∗ = 1/lh, which can be taken as the cutoﬀ scale in Equation 2.58.


Figure 2.10: Representation of the Higgs mass integrand under the Composite Higgs model.
This ﬁgure is adapted from [3].

Thus, for energies below m∗, the Higgs boson behaves like an elementary point-like particle

since any interaction with the Higgs will not have enough energy to resolve its substructure,

just like a photon with a wavelength larger than the radius of the proton cannot resolve

the individual quark constituents. In this energy range, the integrand in Equation 2.58

behaves linearly in energy, with the largest contribution, quadratic in the cutoff after integration, coming from the top quark. As the

energy approaches the scale m∗, the ﬁnite size of the Higgs becomes evident, which results

in the integrand reaching a maximum value. The integrand then sharply decreases at higher

energies once the compositeness of the Higgs becomes apparent, as shown in Figure 2.10.

The decrease in the integrand for energies above m∗ results from the fact that there are

no particles in this energy regime that would provide radiative contributions to the Higgs

mass. Eﬀectively, the CH model solves the Hierarchy Problem by providing a mechanism

that stabilizes the Higgs mass as a result of the interactions of new particles under a new

force, which form a bound state corresponding to the Higgs boson.

In order to accommodate the CH model into the SM, two different structures will need

to be deﬁned: a composite sector (CS), which will contain the new particles that form the

Higgs boson bound state along with their interactions, and an elementary sector (ES), which


contains all the SM particles that are known to be elementary.

In addition to these two

sectors, a new set of Elementary-Composite interactions LEC has to be included in order

to generate the masses of the SM gauge bosons and fermions since the Higgs is no longer

present in the ES.

A potential issue that the CS might have is that if the Higgs boson is a bound state of

particles in the CS, then it is expected that mh should be close to the binding energy scale

m∗. This is motivated by observing that the masses of hadrons, which are bound states of

QCD interactions, are close to the color conﬁnement scale ΛQCD ≈ 300 MeV. Speciﬁcally,

the issue is that if m∗ ≈ mh, then bound states of the CS other than the Higgs

would have been observed by now. This absence of additional bound states close to the

EWK energy scale motivates placing m∗ at least at the TeV scale as a minimum.

The problematic lightness of the Higgs mass can be explained by the Higgs boson being a

pseudo Nambu-Goldstone Boson (pNGB)1 of a symmetry group G. The group G is required

to contain the SM symmetry groups as a subgroup in order to be compatible with the

description of the ES. Since the symmetry to be broken is unrelated to the ES, this corresponds

to reducing G to an unbroken subgroup H ≤ G. By Goldstone’s Theorem, for each broken

symmetry generator that is not an element of H a massless NGB arises. If the Higgs boson

is to have mass, then it cannot be fully generated in the CS. This requires that a mass

generating mechanism is present in LEC.

1 A pNGB arises when an approximate symmetry is spontaneously broken instead of an exact symmetry,
which gives it a mass.

In reality, the CH model is not just a single model but a family of models, which are

mostly defined by the nature of the group G and its unbroken subgroup H. The remainder

of the discussion will be kept as general as possible regarding the choice of CH model, since

the theory behind these models is outside the scope of this thesis. When additional model-specific

details are required, only the Minimal Composite Higgs Model (MCHM) will be

discussed. The MCHM is based on the choice of G = SO(5) rotational symmetry in the CS,

which is reduced to the H = SO(4) rotational symmetry in order to spawn the Higgs boson

as a pNGB. Other extended CH models are based on larger groups G and H, which contain

embeddings of their MCHM counterparts.
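The choice of SO(5) broken to SO(4) can be motivated by counting generators. The group SO(n) has n(n − 1)/2 generators, and by Goldstone’s Theorem each broken generator yields one NGB. The short sketch below, with illustrative function names, performs this count.

```python
def so_dimension(n):
    """Number of generators of SO(n)."""
    return n * (n - 1) // 2


def goldstone_count(n_full, n_unbroken):
    """Number of broken generators when SO(n_full) reduces to SO(n_unbroken)."""
    return so_dimension(n_full) - so_dimension(n_unbroken)


MCHM_GOLDSTONES = goldstone_count(5, 4)
```

The result is four NGBs, which matches the four real scalar fields φ1 to φ4 of the Higgs doublet in Equation 2.36.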

2.3.2 Gauge Boson Masses

In the MCHM, the Higgs is now represented as a 5-dimensional vector with real entries that

can be parametrized as

$$ \Phi = f \begin{pmatrix} \sin\dfrac{\Pi}{f}\, \dfrac{\vec{\Pi}}{\Pi} \\ \cos\dfrac{\Pi}{f} \end{pmatrix} \tag{2.61} $$



where Π⃗ is a 4-dimensional vector with norm Π, whose entries will be the Goldstone

bosons of the MCHM. The parameter f is the Higgs decay constant that represents the

energy scale of the spontaneous symmetry breaking of SO(5) into SO(4), which is analogous

to the pion decay constant fπ in QCD interactions where pions are pNGB. In order to be

consistent with the SM SU (2) representation of the Higgs as given in Equation 2.36, we

require that

$$ \vec{\Pi} = \begin{pmatrix} \Pi_1 \\ \Pi_2 \\ \Pi_3 \\ \Pi_4 \end{pmatrix} = \begin{pmatrix} \phi_2 \\ \phi_1 \\ \phi_4 \\ \phi_3 \end{pmatrix} \tag{2.62} $$

which can be interpreted as an isomorphism between the unbroken subgroup SO(4) and the

group SU (2)L × SU (2)R, where SU (2)L is the same as the one present in the SM, while

SU (2)R generalizes the SM hypercharge U (1)Y . To spontaneously break the CS symme-

try, we follow a similar procedure as done with the EWK spontaneous symmetry breaking,

starting with the Lagrangian

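\[
\mathcal{L} = \frac{1}{2} (\partial_\mu \bar{\Phi})(\partial^\mu \Phi) + V(\Phi) \tag{2.63}
\]

and promoting the partial derivative to the covariant derivative

\[
\partial_\mu \to \partial_\mu + i g_2 W^a_\mu T^a_L + i g_1 B_\mu T^3_R \tag{2.64}
\]

where $W^a_\mu$, a = 1, 2, 3, and $B_\mu$ are the SM gauge bosons and $T^a_L$, $T^3_R$ are the generators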

of SO(4). Next, we expand the composite Higgs around the minimum of an interaction

potential V (Φ), which will be left unspeciﬁed as its form depends on the CH model being

studied, with the following choice of values for the ﬁelds Πi that corresponds to the unitary

gauge

\[
\Phi = \frac{1}{\sqrt{2}} \begin{pmatrix} \vec{\Pi} = \vec{0} \\ V + h(x) \end{pmatrix} \tag{2.65}
\]

where V is the VEV of the composite Higgs under the interaction potential. Inserting this

expansion into the Lagrangian results in the following expression

\[
\mathcal{L} = \frac{1}{2} \partial_\mu h \, \partial^\mu h + \frac{g_2^2 f^2}{4} \sin^2\left(\frac{V + h}{f}\right) \left( W^+_\mu W^{-\mu} + \frac{Z_\mu Z^\mu}{2\cos^2\theta_W} \right) \tag{2.66}
\]

In order to maintain compatibility with the SM, the following relation must hold true for

the gauge boson masses

\[
m_W = m_Z \cos\theta_W = \frac{g_2 f}{2} \sin\left(\frac{V}{f}\right) \tag{2.67}
\]

which links the EWK symmetry breaking scale v with the Higgs decay constant f through
v = f sin(V/f), as follows from comparing with the SM relation m_W = g_2 v/2. The
angle θ = V/f measures how misaligned the VEV of the Higgs is relative to the direction in
5-dimensional space where the Higgs has vanishing VEV. Thus, the parameter ξ = v²/f² =
sin²(V/f) measures the relative size of the EWK symmetry breaking scale to the SO(5)
spontaneous symmetry breaking scale. If we expand Equation 2.66 in a Taylor series with
respect to the Higgs field h(x), the following infinite set of interactions with the gauge bosons
is obtained

\[
\mathcal{L} = \frac{g_2^2 v^2}{4} \left( W^+_\mu W^{-\mu} + \frac{Z_\mu Z^\mu}{2\cos^2\theta_W} \right) \left[ \sqrt{1 - \xi} \, \frac{2h}{v} + (1 - 2\xi) \frac{h^2}{v^2} - \xi \sqrt{1 - \xi} \, \frac{4h^3}{3v^3} + \cdots \right] \tag{2.68}
\]

From this expansion, we can see that single and double interactions between the gauge and

Higgs bosons arise similar to the SM scenario, but with modiﬁed coupling strengths

\[
k_V = \frac{g^{\mathrm{CH}}_{hVV}}{g^{\mathrm{SM}}_{hVV}} = \sqrt{1 - \xi}, \qquad k_{Vh} = \frac{g^{\mathrm{CH}}_{hhVV}}{g^{\mathrm{SM}}_{hhVV}} = 1 - 2\xi \tag{2.69}
\]

Thus, being able to measure these coupling strengths to high precision or any of the additional

Higgs-gauge boson interactions that are absent from the SM could experimentally validate

the CH model. It should be noted that in the limit ξ → 0, obtained by holding v fixed and letting f → ∞,

the modified couplings reduce to their SM values and any interaction beyond the double

interaction vanishes, since each such term is proportional to ξ. In this limit, the composite nature of

the Higgs reduces to its elementary SM behavior, and we recover the SM description of the

EWK symmetry breaking.
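The behavior of the modified couplings can be illustrated numerically. The following Python sketch, which is not part of the original analysis, evaluates the coupling modifiers of Equation 2.69 and compares the function f² sin²((V + h)/f)/v² appearing in Equation 2.66 with its expansion truncated at order h³, as in Equation 2.68. The function names and the sample values of ξ and h/v are assumptions chosen for illustration. The constant term 1 in the series corresponds to the gauge boson mass term, which Equation 2.68 does not display.

```python
import math


def coupling_modifiers(xi):
    """Return (k_V, k_Vh) from Equation 2.69 for xi = v^2 / f^2."""
    if not 0.0 <= xi < 1.0:
        raise ValueError("xi must satisfy 0 <= xi < 1")
    k_v = math.sqrt(1.0 - xi)
    k_vh = 1.0 - 2.0 * xi
    return k_v, k_vh


def expansion_check(xi, h_over_v):
    """Compare f^2 sin^2((V + h)/f) / v^2 with its series through order h^3."""
    if not 0.0 < xi < 1.0:
        raise ValueError("xi must satisfy 0 < xi < 1")
    theta = math.asin(math.sqrt(xi))  # theta = V / f, so sin^2(theta) = xi
    # h / f = (h / v) * (v / f) = (h / v) * sqrt(xi)
    exact = math.sin(theta + h_over_v * math.sqrt(xi)) ** 2 / xi
    k_v, k_vh = coupling_modifiers(xi)
    series = (1.0
              + 2.0 * k_v * h_over_v
              + k_vh * h_over_v ** 2
              - (4.0 / 3.0) * xi * k_v * h_over_v ** 3)
    return exact, series


# Hypothetical sample values, chosen only for illustration.
for xi in (0.0, 0.05, 0.1, 0.2):
    k_v, k_vh = coupling_modifiers(xi)
    print(f"xi = {xi:4.2f}  k_V = {k_v:.4f}  k_Vh = {k_vh:.4f}")

exact, series = expansion_check(xi=0.1, h_over_v=0.1)
print(f"exact = {exact:.7f}, series = {series:.7f}")
```

For ξ = 0 both modifiers equal 1, which reproduces the SM limit discussed above.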


2.3.3 Fermion Masses

To incorporate the SM fermion masses, the following set of terms must be included in the

interaction Lagrangian

\[
\mathcal{L}^{\mathrm{fermion}}_{\mathrm{int}} = \lambda_L \bar{\Psi}^i_L O_i + \lambda_R \bar{\Psi}^i_R O_i \tag{2.70}
\]

The CS operators O couple to the SO(5) embeddings of the SM SU (2) × U (1) left-handed

fermion doublet, ΨL, with coupling strength λL, and the right-handed fermion singlet, ΨR,

with coupling strength λR. The embeddings are represented as

\[
\Psi_L = \frac{1}{\sqrt{2}} \begin{pmatrix} -i\psi_L \\ -\psi_L \\ -i\psi_L \\ \psi_L \\ 0 \end{pmatrix}, \qquad \Psi_R = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ \psi_R \end{pmatrix} \tag{2.71}
\]

Additionally, the following interaction term between the SM fermions and the CS Higgs must

be included, which will determine their coupling strength

\[
\tilde{\mathcal{L}}^{\mathrm{fermion}}_{\mathrm{int}} = -\frac{\sqrt{2} \, m_\psi}{\sqrt{\xi (1 - \xi)}} \, \Phi_i \bar{\Psi}^i_L \Psi_R \tag{2.72}
\]

Only the terms pertaining to the third generation of quarks will be considered as they are

the most relevant when discussing the Hierarchy Problem. Following a similar procedure

as with the SM gauge bosons, the interaction Lagrangian in Equation 2.72 can be Taylor

expanded near the VEV of the Higgs ﬁeld Φ into the following expression that only involves


the top quark terms:

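\[
\tilde{\mathcal{L}}^{\mathrm{fermion}}_{\mathrm{int}} = -m_t \bar{t} t - \frac{1 - 2\xi}{\sqrt{1 - \xi}} \frac{m_t}{v} h \bar{t} t + 2\xi \frac{m_t}{v^2} h^2 \bar{t} t + \cdots \tag{2.73}
\]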

From this expression, it is observed that the top quark coupling to the Higgs is modiﬁed as:

\[
k_t = \frac{g^{\mathrm{CH}}_{h\bar{t}t}}{g^{\mathrm{SM}}_{h\bar{t}t}} = \frac{1 - 2\xi}{\sqrt{1 - \xi}} \tag{2.74}
\]

A similar expansion is obtained for the bottom quark, which yields modified couplings between
the Higgs and the bottom quark. However, since each term in the expansion is proportional
to m_b and m_b ≪ m_t, these terms are negligible in the radiative corrections to the Higgs
mass. To summarize, the basic structure of the MCHM is described by Equation 2.66, Equation 2.70,
and Equation 2.72. Three model-dependent parameters are introduced: λ_L, λ_R,
and ξ, the last of which modifies the couplings between the Higgs and the remaining
SM particles. It should be noted that a fine tuning requirement similar to the one
derived in Equation 2.60 can be constructed for the parameter ξ, which behaves as

\[
\Delta \geq \frac{1}{2\xi} \tag{2.75}
\]

Thus, as the measurements of the coupling strengths become more stringent, the MCHM

becomes more unnatural due to the degree of ﬁne tuning it requires.
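The interplay between the top coupling modifier and the fine tuning requirement can be made concrete with a short numerical example. The Python sketch below is an illustration only: the function names and the sample values of ξ are assumptions, and the code simply evaluates Equation 2.74 and the lower bound of Equation 2.75.

```python
import math


def top_coupling_modifier(xi):
    """Return k_t = (1 - 2*xi) / sqrt(1 - xi) from Equation 2.74."""
    if not 0.0 <= xi < 1.0:
        raise ValueError("xi must satisfy 0 <= xi < 1")
    return (1.0 - 2.0 * xi) / math.sqrt(1.0 - xi)


def min_fine_tuning(xi):
    """Return the lower bound 1 / (2*xi) on Delta from Equation 2.75."""
    if not 0.0 < xi < 1.0:
        raise ValueError("xi must satisfy 0 < xi < 1")
    return 1.0 / (2.0 * xi)


# Hypothetical sample values of xi, chosen only for illustration.
for xi in (0.2, 0.1, 0.05, 0.01):
    print(f"xi = {xi:5.2f}  k_t = {top_coupling_modifier(xi):.4f}  "
          f"Delta >= {min_fine_tuning(xi):.1f}")
```

As ξ decreases, k_t approaches its SM value of 1 while the lower bound on Δ grows, which is the tension described in the text.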

2.3.4 Vector-Like Quarks

The exact form of the CS operator O has remained unspecified until now. By inspecting
Equation 2.70, we can determine three properties of the particles associated with this operator. First, they must
have spin 1/2 in order to couple with the SM fermions and for the interaction Lagrangian
to remain Lorentz invariant. Second, their left-handed and right-handed components
must transform in the same way under the weak isospin SU(2) gauge group, meaning that they
are "vector-like" fermions, which differs from the SM chiral fermions. Finally, these new
operators must transform as an SU(3) triplet in order to be compatible with the SM Lagrangian;
thus, the particles associated with the operators must have color charge, just like
quarks. For the stated reasons, these new particles are known as vector-like quarks (VLQs).
An overview of the VLQs as well as their possible multiplets that are predicted by the MCHM
is given in Table 2.1.

Vector-Like Quark    Electric Charge
X                    +5/3
T                    +2/3
B                    -1/3
Y                    -4/3

Multiplet            Decays                       Hypercharge
SU(2) singlets
(T)                  T → Ht/Zt/W+b                +2/3
(B)                  B → Hb/Zb/W−t                -1/3
SU(2) doublets
(T,B)                T → Ht/Zt, B → W−t           +1/6
(X,T)                T → Ht/Zt, X → W+t           +7/6
(B,Y)                B → Hb/Zb, Y → W−b           -5/6

Table 2.1: Overview of VLQs and their multiplets that are predicted by the MCHM.

At the LHC, VLQs are expected to be produced in pairs through QCD interactions or

singly through weak interactions, as shown in Figure 2.11. The pair production mechanism

is analogous to the pair production of SM quarks, taking the form q ¯q → Q ¯Q or gg → Q ¯Q,

where q and g are SM quarks and gluons, respectively, and Q is a VLQ. The cross section


for the pair production process only depends on the mass of the VLQ and the center of mass

energy of the interaction, as shown in Figure 2.12. On the other hand, the single production

cross section is more model dependent since this mechanism couples the VLQs with the

third generation SM quarks, the weak gauge bosons, and the Higgs boson. As a consequence,

the single production cross section can overtake the pair production cross section at

higher VLQ masses, since producing a single heavy VLQ is less suppressed by phase space than

producing a pair, while its overall size depends on the coupling parameters. Specifically, the interaction

Lagrangian for the single production mechanism can be expressed phenomenologically as:

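\[
\mathcal{L}_{\mathrm{single}} = c^{Wq'}_{\zeta} \bar{Q}_\zeta W^+_\mu \gamma^\mu q'_\zeta + c^{Zq}_{\zeta} \bar{Q}_\zeta Z_\mu \gamma^\mu q_\zeta + c^{Hq}_{\zeta} \bar{Q}_\zeta H q_\zeta + \mathrm{h.c.} \tag{2.76}
\]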

where the parameters $c^{Aq}_\zeta$ represent the coupling strengths between the left- (ζ = L) and
right-handed (ζ = R) chiral components of the VLQ, $Q_\zeta$, and the SM third generation
quarks, $q_\zeta$ ($q'_\zeta$), via the exchange of a SM boson A = W, Z, H. The distinction between
q and q′ is made in order to differentiate between the allowed decays of Q, as shown in
Table 2.1. A universal coupling strength parameter κ is introduced, which depends on the
individual coupling parameters $c^{Aq}_\zeta$, and three parameters $\xi_A$ that sum to unity and control
the branching ratios of the VLQs to the SM bosons and third generation quarks [12]. The

single production cross section can be shown to depend on these parameters as:

\[
\sigma(Q \to Aq) \propto (c^{Aq}_\zeta)^2 \propto \kappa^2 \xi_A^2 \tag{2.77}
\]

Figure 2.11: Feynman diagrams of single production (a) and pair production (b) of a vector-like top.

The analysis of the single production of a vector-like top presented in subsection 6.1.8
probes different regions of the phase space assuming different values of the universal
coupling strength parameter κ. The branching ratios of the vector-like top and bottom
as a function of their mass are shown in Figure 2.13.

representation is expected to decay into W b half of the time, with the other half being

distributed almost equitably between the Ht and Zt decays as the mass of the VLQ increases.

A similar behavior is expected for the vector-like bottom in the singlet representation by

exchanging the top and bottom quarks. For the doublet representations (T ,B) and (X,T ),

the decays into W b are ruled out due to charge conservation. Similarly, the W t decay is

ruled out in the doublet representation (B,Y ).
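As a rough illustration of the proportionality in Equation 2.77, the sketch below rescales a reference single production cross section to a different choice of κ and ξ_A. The reference cross section of 100 fb, the function name, and the sample coupling values are hypothetical and are not taken from this dissertation. The snippet only encodes the proportionality and is not a cross section calculation. The dictionary of ξ_A values reflects the approximate high-mass singlet pattern described above (half W b, with the remainder split between Z t and H t).

```python
def rescale_single_production(sigma_ref, kappa_ref, kappa, xi_ref=1.0, xi_a=1.0):
    """Rescale a reference cross section using sigma ∝ kappa^2 * xi_A^2 (Equation 2.77)."""
    if kappa_ref <= 0.0 or xi_ref <= 0.0:
        raise ValueError("reference kappa and xi_A must be positive")
    return sigma_ref * (kappa / kappa_ref) ** 2 * (xi_a / xi_ref) ** 2


# Approximate high-mass pattern for the singlet T; the values must sum to unity.
XI_SINGLET_T = {"W": 0.5, "Z": 0.25, "H": 0.25}
assert abs(sum(XI_SINGLET_T.values()) - 1.0) < 1e-12

# Hypothetical reference point: 100 fb at kappa = 0.5 (illustrative numbers only).
SIGMA_REF_FB = 100.0
for kappa in (0.25, 0.5, 1.0):
    sigma = rescale_single_production(SIGMA_REF_FB, kappa_ref=0.5, kappa=kappa)
    print(f"kappa = {kappa:4.2f}  sigma = {sigma:7.2f} fb")
```

Doubling κ at fixed ξ_A increases the rescaled cross section by a factor of four under this proportionality.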

Mixing terms between the SM quarks and their VLQ partners are also generated, of

which the most important one is the top and vector-like top mixing term

\[
\mathcal{L}_{\mathrm{mix}} = \frac{\lambda_L}{g_*} m_* \bar{T} t_L + \frac{\lambda_R}{g_*} m_* \bar{\tilde{T}} t_R \tag{2.78}
\]

where $g_* = m_*/f$, t is the SM top quark, T is the vector-like top component of an SU(2)
doublet, and $\tilde{T}$ is the corresponding singlet.

Figure 2.12: Cross section of the different production mechanisms for VLQs as a function of the VLQ mass in GeV at a center-of-mass energy of 13 TeV. The dashed black line represents the cross section of pair production of VLQs while the colored lines represent the cross section of the single production of VLQs for different SU(2) doublet configurations. For single production the maximum cross section at each mass point is obtained by setting the mixing terms between the VLQ and the SM quarks to the maximum value allowed by theoretical constraints. The dotted portions of the colored lines indicate outdated exclusion limits on the cross section. This figure is taken from [13].

Figure 2.13: Theoretical predictions of the branching ratios of a vector-like top (a) and a vector-like bottom (b) as a function of their mass in GeV for different SU(2) multiplet configurations. This figure is taken from [14].

[Plot legends: (a) T → Wb, Zt, Ht for the SU(2) singlet and the (T,B) or (X,T) doublet; (b) B → Wt, Zb, Hb for the SU(2) singlet, the (B,Y) doublet, and the (T,B) doublet; predictions from PROTOS.]

Figure 2.14: Representation of the vector-like top contribution to the Higgs boson mass.

The mixing term can be used to generate loop diagrams, as shown in Figure 2.14, that involve the exchange of a virtual top quark and
vector-like top, resulting in the following expression for the Higgs mass

\[
m_h^2 \approx a_L \frac{\lambda_L^2}{16\pi^2} M_T^2 + a_R \frac{\lambda_R^2}{16\pi^2} M_T^2 \tag{2.79}
\]

where the coefficients $a_L$ and $a_R$ depend on the actual CH model being studied. If the masses

of the VLQs are large, then in order to obtain the observed value of the Higgs mass, the

coefficients $a_L$ and $a_R$ must provide a cancellation that has a fine tuning bounded below

by

\[
\Delta \geq \frac{3 y_t^2}{4\pi^2} \left( \frac{M_T}{m_H} \right)^2 \tag{2.80}
\]

under the assumption that $\lambda_L = \lambda_R = \sqrt{y_t g_*}$, with $y_t$ the top quark Yukawa coupling; this choice minimizes the lower bound. Thus,

this ﬁne tuning requirement provides an experimental handle to constrain the CH models

through limits on the mass of VLQs.
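To give a sense of scale for Equation 2.80, the following Python sketch evaluates the lower bound on Δ for a few vector-like top masses. This is an illustration under stated assumptions: the top Yukawa coupling is set to y_t = 1 and the Higgs mass to m_H = 125 GeV, and the function name and the list of masses are hypothetical choices that do not correspond to any result of this dissertation.

```python
import math

M_HIGGS_GEV = 125.0  # assumed Higgs boson mass in GeV
Y_TOP = 1.0          # assumed top Yukawa coupling (approximately unity)


def vlq_fine_tuning_bound(m_vlq_gev, y_t=Y_TOP, m_h_gev=M_HIGGS_GEV):
    """Return the lower bound on Delta from Equation 2.80."""
    if m_vlq_gev <= 0.0 or m_h_gev <= 0.0:
        raise ValueError("masses must be positive")
    return 3.0 * y_t ** 2 / (4.0 * math.pi ** 2) * (m_vlq_gev / m_h_gev) ** 2


# Hypothetical vector-like top masses in GeV, chosen only for illustration.
for mass in (500.0, 1000.0, 1500.0, 2000.0):
    print(f"M_T = {mass:6.0f} GeV  ->  Delta >= {vlq_fine_tuning_bound(mass):.2f}")
```

Because the bound grows quadratically with M_T, doubling the excluded mass quadruples the minimum fine tuning, which is why mass limits on VLQs constrain the naturalness of CH models.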


Chapter 3

The LHC and the ATLAS Detector

Ordinary matter in the universe is mostly composed of up and down quarks,

which are bound together into protons and neutrons that form the nuclei of atoms, and elec-

trons, which orbit around the nucleus and dictate the chemistry that allows the formation

of complex structures like molecules. The remainder of the SM particles and most of the

hypothetical particles that are predicted by BSM theories, such as VLQs, are short-lived

due to their large masses and subsequently decay into the lighter particles of the SM. As a

consequence of this, these exotic particles can only be produced in highly energetic interac-

tions, such as relativistic collisions between lighter particles. To properly study these exotic

particles, an experimental apparatus that can produce them in a controlled experimental

environment is required. This is achieved with man-made particle accelerators that acceler-

ate beams of particles to relativistic speeds and focus the beam into desired collision points

with the help of strong electric and magnetic ﬁelds.

At the time of writing this dissertation, the Large Hadron Collider (LHC) [15] is the most

powerful man-made accelerator used to produce the energetic collisions that allow us to make

precision measurements of parameters of interest in the SM and perform searches for BSM

physics. In order to probe the results of these energetic collisions, the LHC has four main col-

lider detector experiments that are designed for diﬀerent purposes. The ATLAS (A Toroidal

LHC ApparatuS) [16] and CMS (Compact Muon Solenoid) [17] experiments are both general


purpose detectors that are used for SM precision measurements and BSM searches. These

two experiments are designed to be independent from each other by having diﬀerent detector

designs and collaborations, which is essential when potential discoveries made at the LHC

need to be validated. The LHCb (Large Hadron Collider beauty) [18] experiment focuses

on precision measurements of processes that exhibit charge-parity (CP) violations and b-

hadron physics. Finally, the ALICE (A Large Ion Collider Experiment) [19] detector focuses

on heavy-ion collisions to study QCD interactions and the physics of quark-gluon plasma.

In addition to the four main detector experiments, there are also three smaller experiments.

The TOTEM (TOTal Elastic and diﬀractive cross section Measurement) [20] experiment is

dedicated to the measurement of the cross-section of proton-proton (pp) collisions and is lo-

cated at the CMS interaction point. The LHCf (LHC forward) [21] experiment is dedicated to

the study of particles that are emitted in the forward regions of LHC collisions and provides

calibrations for the hadron interaction models in extremely high-energy cosmic rays. This

experiment is located at the ATLAS interaction point. Finally, the MoEDAL (Monopole and

Exotics Detector at the LHC) [22] experiment, which is located near the LHCb interaction

point, is designed for the search of magnetic monopoles and massive pseudo-stable charged

particles that are predicted by BSM theories.

In this chapter, a description of the LHC is given, focusing on how particles are accelerated

to achieve the energies required to produce and study exotic particles. Next, a description

of the ATLAS detector will be presented, which is the experimental apparatus from which

the data and simulations used in the analyses presented in this dissertation were obtained.

Finally, the process of reconstructing and calibrating the diﬀerent physics objects that are

used in ATLAS analyses from the inputs of the detector is described.


3.1 The Large Hadron Collider

The LHC is a circular synchrotron accelerator with a circumference of 26.7 km that is located

underground at a depth that varies between 50 m and 170 m and crosses the Franco-Swiss

border near Geneva, as depicted in the schematic in Figure 3.1.

Figure 3.1: Schematic representation of the LHC and its four main experiments [23].

It started its operations in the year 2009, collecting data from collisions at a center of mass energy of 7 TeV until

2011, at which point the energy was increased to 8 TeV, providing additional collision data

until 2013. This data collection period is known as Run-1, after which the LHC was shut

down to perform upgrades to the accelerator complex and its detectors. These upgrades

allowed the LHC to resume its data collection operations in 2015 using collisions with a

center of mass energy of 13 TeV. This data collection period, known as Run-2, lasted until

2018, at which point the LHC entered into another scheduled shutdown for upgrades. At
the time of writing this dissertation, the LHC has resumed its operations, having started
the Run-3 data collection period in July 2022 with a center of mass energy of 13.6 TeV. Run-3 is expected to last for
approximately four years until the next scheduled shutdown. The data used in the analyses
presented in this dissertation are from the Run-2 period.

Figure 3.2: Cross section of the LHC beam pipe [24].

The main ring of the LHC consists of two separate beam pipes in which a beam of particles

to be collided is split to travel in opposite directions along the circumference of the ring,

guided by superconducting magnets that surround the pipes, as shown in Figure 3.2. Along

the LHC ring, there are four sections known as cryomodules that contain radio frequency

(RF) cavities that produce an oscillating electric ﬁeld tuned to a frequency of 400 MHz

designed to accelerate particles up to an energy of 6.5 TeV 1. The particles in the beam

that arrive at the RF cavities in phase with the electric field are accelerated, while those

that arrive out of phase are decelerated, which sorts the particle beam into particle

bunches. Once the bunches reach the desired beam energy, they are collided at the interaction

points where the diﬀerent detector experiments are located. In order for this to occur, the

1Since the start of Run-3 particles are now accelerated to 6.8 TeV.


particle beam must consist of stable and charged particles so that they do not decay before

reaching the interaction points and can be accelerated and guided through electromagnetic

interactions. This limits the choice of beam constituents to electrons, protons2, or ions.

Since the LHC is a circular accelerator, a beam of electrons loses more energy per revolution

across the LHC compared to heavier particles due to synchrotron radiation, which can be

quantiﬁed as:

\[
P = \frac{\Delta E}{2\pi R} = \frac{4\pi q^2 \beta^2 E^4}{3 R m^4} \tag{3.1}
\]

where q and m are the charge and mass of the particles in the beam, E is the beam energy,

R is the orbit radius and β = v/c is the ratio of the speed at which the particles in the beam

are traveling at to the speed of light, which is approximately equal to 1 for the purposes

of the LHC operations. Thus, an electron beam that is accelerated circularly to the same

energy as a proton beam will lose more energy by a factor of $(m_p/m_e)^4 \approx 10^{13}$, making the

use of an electron beam energetically inefficient to maintain at the LHC. For this reason, the

LHC was designed primarily to collide proton-proton (pp) beams to study the fundamental

particles they produce. Lead ions have also been used in lead-lead and lead-proton beam

collisions. These collisions are used to study a state of matter known as quark-gluon plasma.

From this point on, only pp collisions will be discussed, as they are the main interaction of

interest for this thesis.
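The size of the mass suppression factor quoted above can be checked with a few lines of Python. The sketch below uses rounded values of the proton and electron masses (938.272 MeV and 0.510999 MeV); the variable names are illustrative, and the result depends only on the 1/m⁴ scaling in Equation 3.1 at fixed beam energy and orbit radius.

```python
# Rounded particle masses in MeV (standard reference values).
M_PROTON_MEV = 938.272
M_ELECTRON_MEV = 0.510999

# At equal beam energy E and orbit radius R, Equation 3.1 scales as 1 / m^4,
# so the electron-to-proton ratio of radiated energy is (m_p / m_e)^4.
mass_ratio = M_PROTON_MEV / M_ELECTRON_MEV
loss_ratio = mass_ratio ** 4

print(f"m_p / m_e     = {mass_ratio:.1f}")
print(f"(m_p / m_e)^4 = {loss_ratio:.3e}")
```

The ratio evaluates to roughly 1.1 × 10¹³, consistent with the order of magnitude stated in the text.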

2Or their corresponding antiparticles.

Figure 3.3: Schematic representation of the process of injecting the proton beam into the LHC. The light gray arrows indicate the direction in which the beam of protons travels during the injection process [25].

The process of producing the proton beams is depicted in Figure 3.3. This process starts
by ionizing hydrogen gas with the use of electric fields so that the protons are separated from
the electrons. The protons are then transported through the 33 m long linear accelerator
(Linac2), which accelerates them to an energy of 50 MeV. The next steps consist of sequentially
accelerating the beam that is formed in Linac2 through the use of smaller circular

accelerators, which also help in sorting the beam into proton bunches. First, the incoming

beam from Linac2 is injected into the Proton Synchrotron Booster (PSB) that increases the

beam energy to 1.4 GeV. The beam is then transported to the Proton Synchrotron (PS),

which increases the beam energy to 25 GeV. The beam is then transported to the last of the

small circular accelerators, the Super Proton Synchrotron (SPS), where the beam is further

accelerated to an energy of 450 GeV. Finally, the proton beam is split into two beams and

injected into the LHC beam pipes, where each beam will travel in opposite directions while

being accelerated to its ﬁnal energy of 6.5 TeV. Once this energy is reached, the opposing

beams are ready to collide at the interaction points, resulting in a collision with a center

of mass energy of 13 TeV. Under nominal pp collision operations, there are 2808 proton


bunches circulating around the LHC ring, with consecutive bunches spaced by 25 ns

and each bunch containing approximately $1.1 \times 10^{11}$ protons.

In an accelerator experiment, the number of events of a particular process that are

produced can be expressed as

\[
N_{\mathrm{events}} = \sigma \int L \, dt \tag{3.2}
\]

where σ is the cross section of the process of interest and L is the luminosity of the accelerator.

In particle physics it is standard practice to measure the cross section of a process

in units of barns (b), where 1 b = $10^{-28}$ m$^2$. It should be noted that Equation 3.2 is only

valid if the detector completely encapsulates the collision target, which is the case for the

diﬀerent detectors at the LHC. The value of the cross section is determined by nature, so the

only experimental handle for tuning the event rate of a process comes from the accelerator

luminosity. The luminosity of the LHC can be expressed as

\[
L = \frac{n_b n_1 n_2 f_r}{2\pi \Sigma_x \Sigma_y} \tag{3.3}
\]

where nb is the number of proton bunches crossing the interaction points, n1 and n2 are the

number of protons in the colliding bunches, fr is the collider revolution frequency, and Σx and

Σy are the horizontal and vertical geometric widths of the proton beams, respectively [26].

The cross section of a process generally decreases as the energy scale associated with the

process increases. In the case of particles that are predicted by BSM theories with masses above the

TeV scale, it is necessary to increase the accelerator luminosity in order to have sufficient

statistics to make a claim of discovery of these potentially new particles. This can be achieved

by focusing the proton beams as they approach the interaction points, thereby reducing

the geometric widths in the denominator of Equation 3.3 and increasing the probability of


collision between opposing proton bunches.
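Equations 3.2 and 3.3 can be combined into a short numerical example. The Python sketch below is only an illustration: the beam widths of 20 μm are hypothetical values chosen to give a realistic order of magnitude, the revolution frequency is estimated as the speed of light divided by the 26.7 km circumference, and the 1 fb signal cross section is an arbitrary example. The integrated luminosity of 147 fb⁻¹ is the value recorded by ATLAS during Run-2 as quoted in Figure 3.4. All function and variable names are assumptions.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0
LHC_CIRCUMFERENCE_M = 26.7e3


def instantaneous_luminosity(n_b, n_1, n_2, f_r_hz, width_x_cm, width_y_cm):
    """Equation 3.3; returns cm^-2 s^-1 when the beam widths are given in cm."""
    return n_b * n_1 * n_2 * f_r_hz / (2.0 * math.pi * width_x_cm * width_y_cm)


def expected_events(cross_section_fb, integrated_luminosity_inv_fb):
    """Equation 3.2 for a cross section in fb and an integrated luminosity in fb^-1."""
    return cross_section_fb * integrated_luminosity_inv_fb


# Revolution frequency estimated from the ring circumference (about 11.2 kHz).
f_rev_hz = SPEED_OF_LIGHT_M_S / LHC_CIRCUMFERENCE_M

# Hypothetical beam widths of 20 micrometers, expressed in cm.
WIDTH_CM = 20.0e-4

lumi = instantaneous_luminosity(2808, 1.1e11, 1.1e11, f_rev_hz, WIDTH_CM, WIDTH_CM)
print(f"f_r  = {f_rev_hz:.0f} Hz")
print(f"L    = {lumi:.2e} cm^-2 s^-1")

# Hypothetical 1 fb signal process with the 147 fb^-1 recorded by ATLAS in Run-2.
print(f"N    = {expected_events(1.0, 147.0):.0f} events")
```

Halving both beam widths in this sketch increases the luminosity by a factor of four, which is the focusing effect described above.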

The total integrated luminosity delivered by the LHC and recorded by the ATLAS detector
during Run-2 is shown in Figure 3.4.

Figure 3.4: The integrated luminosity, in units of inverse femtobarns, over the time period of Run-2 delivered by the LHC (green) and recorded by ATLAS (yellow) during stable beams for pp collisions at a center of mass energy of 13 TeV. This figure is taken from [27].

Increasing the luminosity also increases the number of pp interactions that take place
in a single bunch crossing, an effect known as in-time pile-up. The number of interactions in

a single bunch crossing follows a Poisson distribution, with the mean number of interactions

given by:

\[
\mu = \frac{L \, \sigma^{pp}_{\mathrm{inel}}}{f} \tag{3.4}
\]

where L is the instantaneous luminosity in Equation 3.3, $\sigma^{pp}_{\mathrm{inel}}$ is the cross section of
inelastic pp collisions and f is the collision frequency of the LHC. At nominal operations
for pp collisions, the peak luminosity of the LHC is $L \approx 10^{34}$ cm$^{-2}$ s$^{-1}$ = 10 nb$^{-1}$ s$^{-1}$
with a collision frequency of 40 MHz. The cross section for inelastic pp collisions can be
approximated from data as $\sigma^{pp}_{\mathrm{inel}} \approx 80$ mb for a center of mass energy of 13 TeV, as shown
in Figure 3.5. Using these values, the average number of interactions in a single bunch
crossing can be estimated at approximately 20. The distribution of the average
number of interactions in a single bunch crossing, weighted by luminosity, for the Run-2
period is shown in Figure 3.6.

[Figure 3.4 plot: total integrated luminosity in fb^-1 versus month, January 2015 to July 2018, at √s = 13 TeV; LHC delivered: 156 fb^-1, ATLAS recorded: 147 fb^-1.]

Figure 3.5: The inelastic proton-proton collision cross section measurements as a function of the center of mass energy for different experiments, overlaid with theory predictions. This figure is taken from [28].

[Figure 3.5 plot: inelastic cross section in mb versus √s in GeV; data from ATLAS (MBTS), ATLAS (ALFA), TOTEM, ALICE, LHCb, Auger, and non-LHC experiments; predictions from Pythia 8, EPOS LHC, and QGSJET-II.]

Figure 3.6: The average number of pp interactions in a single bunch crossing for the different years during Run-2, weighted by luminosity. This figure is taken from [27].

[Figure 3.6 plot: recorded luminosity versus mean number of interactions per crossing; ∫L dt = 146.9 fb^-1; ⟨µ⟩ = 13.4 (2015), 25.1 (2016), 37.8 (2017), 36.1 (2018), 33.7 (total).]

Since consecutive proton bunches are spaced by only 25 ns,
emissions from previous bunch crossings may appear as part of the current bunch
crossing due to technical limitations on the detector readout time. This effect is

known as out-of-time pile-up. As the LHC increases its luminosity after each upgrade, the

eﬀects of pile-up will become a signiﬁcant source of systematic uncertainty that analyses will

need to consider.
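The pile-up estimate from Equation 3.4 can be reproduced with a short calculation. The Python sketch below uses the nominal values quoted in this section (a peak luminosity of 10³⁴ cm⁻² s⁻¹, an inelastic cross section of 80 mb, and a collision frequency of 40 MHz) and then evaluates the Poisson probability of observing a given number of interactions in one bunch crossing. The function names are illustrative assumptions, and the Poisson probability is computed in logarithmic form purely as a numerical convenience.

```python
import math

MB_TO_CM2 = 1.0e-27  # 1 mb = 1e-3 b, and 1 b = 1e-24 cm^2


def mean_interactions(lumi_cm2_s, sigma_inel_mb, collision_freq_hz):
    """Equation 3.4: mean number of pp interactions per bunch crossing."""
    return lumi_cm2_s * sigma_inel_mb * MB_TO_CM2 / collision_freq_hz


def poisson_probability(n, mu):
    """Poisson probability of n interactions for mean mu, evaluated in log space."""
    if mu <= 0.0 or n < 0:
        raise ValueError("mu must be positive and n must be non-negative")
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))


mu = mean_interactions(1.0e34, 80.0, 40.0e6)
print(f"mean interactions per crossing: {mu:.1f}")

for n in (10, 20, 30, 40):
    print(f"P(n = {n:2d}) = {poisson_probability(n, mu):.4f}")
```

With these inputs the mean evaluates to about 20 interactions per crossing, matching the estimate in the text.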

3.2 The ATLAS Detector

The LHC provides the energetic collisions that allow us to probe the physics of the SM and

potentially make new discoveries that will extend it. However, the collision of particles is

only part of the job, as one needs to detect what is produced from the collisions and have the

ability to recognize, select, and correctly reconstruct the events of interest. This is achieved

with the diﬀerent detectors that are placed along the LHC ring. The ATLAS detector, shown

in Figure 3.7, is a general purpose detector that is made up of many specialized components

that measure a wide range of signals from high energy particle interactions. These signals

are used in the reconstruction process of pp collisions.

Figure 3.7: Schematic representation of the ATLAS detector showing its dimensions and
diﬀerent components. This ﬁgure is taken from [29].


3.2.1 Particle Interactions with Matter

Converting the particles produced in pp collisions into physical signatures requires that the

diﬀerent specialized components of the ATLAS detector are made up of diﬀerent materials.

These materials must elicit specific types of interactions with the particles as they travel

throughout the detector. An important factor to consider in the choice of these materials

is their ability to contain the particles and their subsequent decays within a certain length

of the detector. This ensures that the particles deposit all their energy into the detector for

measurement while also shielding exterior detector components from interactions they are

not designed to withstand.

In a pp collision, particles that have electric or color charge can be produced, so the

ATLAS detector employs materials that interact with these particles through the elec-

tromagnetic or strong force. The electromagnetic interactions usually take the form of

bremsstrahlung radiation, which occurs when a charged particle is slowed down by the nuclei

of the atoms in the detector material, or through ionizing radiation of the detector mate-

rial. Electrically charged particles will thus lose energy due to the deceleration caused by

these electromagnetic interactions. Photons can either directly ionize the detector material

or convert into an electron-positron pair, which is subject to the deceleration process just

described. The strong interactions take the form of inelastic nuclear collisions, which start

a process known as hadronization, in which a hadron decays into more hadronic particles.

Due to color conﬁnement, it is energetically favorable for particles with color charge to stay

in color-neutral bound states. The strong interactions that initialize the hadronization of a

particle transform its kinetic energy into the creation of quark-antiquark pairs in order to

conform to color conﬁnement. The hadronization process is eventually halted as the total


kinetic energy is depleted. This process results in shower-like patterns of particle decay that

are eventually reconstructed as objects known as jets (see 3.3.4).

An overview of the interactions of different particles as they travel through a longitudinal
cross-section of the ATLAS detector is shown in Figure 3.8.

Figure 3.8: Schematic representation of the different detector components of ATLAS in a longitudinal cross-section. Example trajectories and interactions of particles produced in pp collisions with the different detector components are also shown. This figure is taken from [30].

The innermost layer of the
ATLAS detector contains the tracking chamber, which provides information on the position

of charged particles as they start to leave the interaction point. It is designed so that elec-

tromagnetic interactions are minimal in order for charged particles not to lose most of their

energy. Immediately above the tracking chamber is the central solenoid, which exerts an ax-

ial magnetic ﬁeld that curves the trajectory of charged particles. This allows us to identify

the charge of a particle and its momentum based on the curvature that the trajectory takes

(see 3.3.1). Above the central solenoid is the electromagnetic calorimeter, which maximizes

the electromagnetic interactions so that particles like electrons or photons transfer all their

energy into the detector. The hadronic calorimeter is positioned a layer above the electromagnetic

calorimeter and is designed to fully stop electrically neutral hadrons. Additionally,

it also stops charged hadrons that do not deposit all their energy in the electromagnetic

calorimeter. In both cases, the energy that the particles deposit in the detector

is converted into electric signals, which are then used to reconstruct the underlying event.

Finally, particles such as muons and neutrinos barely interact with the detector. Muons, on

average, are produced with suﬃcient energy that makes them minimum-ionizing particles

(MIPs), so they traverse the inner detector and calorimeter without depositing the majority

of their energy in these detector components. Energy measurements of muons are delegated

to the muon spectrometer, which forms the outermost layer of the ATLAS detector. Em-

bedded within the muon spectrometer is a toroidal magnet system that further curves the

trajectory of muons. This allows us to measure the energy of muons by measuring the cur-

vature of tracks formed as muons ionize the material of the spectrometer. Neutrinos, on the

other hand, are practically invisible to the detector, so they are inferred as missing energy

during the event reconstruction process.

3.2.2 Detector Coordinate System

A coordinate system must be established in order to properly describe the kinematics of the

particle interactions that are detected by ATLAS. This coordinate system must take into

account the fact that the ATLAS detector has a cylindrical symmetry and that the emissions

from particle interactions are spherically symmetric. The origin of the system is placed at

the nominal interaction point (IP) of the detector with three perpendicular cartesian axes

spanning from this point. The x-axis points towards the center of the LHC ring, the y-axis

points upwards towards the sky, and the z-axis points along the accelerator beam line in the

direction that makes the coordinate system a right-handed system, as shown in Figure 3.9.


Figure 3.9: Schematic representation of the coordinate system used for the ATLAS detector.
The relationships between angles and the Cartesian axes are shown. The red solid line
represents the three-momentum of a particle that is emitted from the IP.

The x-y plane is known as the transverse plane of the detector and is characterized by the

azimuthal angle φ. This is the angle between the x-axis and a two-dimensional vector that

starts from the IP and lies completely in the transverse plane. The azimuthal angle ranges

between [0, 2π), being 0 when the vector is parallel to the x-axis and π/2 when parallel to the

y-axis. The polar angle θ is deﬁned as the angle between the z-axis and any vector starting

from the IP. This angle ranges between [0, π], attaining the lower and upper bounds of the

interval when the vector is parallel and anti-parallel to the z-axis, respectively.

Since particle collisions are described in the center-of-mass reference

frame, which is boosted along the z-axis relative to the detector, a quantity known as the rapidity of a particle is

often used instead of the polar angle. This choice is motivated by the fact that differences in rapidity

are invariant under boost transformations along the z-axis. Additionally, the production of

particles is approximately uniform with respect to rapidity. The rapidity is defined as

$$y = \frac{1}{2} \ln\left(\frac{E + p_z}{E - p_z}\right) \tag{3.5}$$

where E and pz are the energy and the momentum component along the z-axis of a particle,

respectively.

In actual applications, an approximation of the rapidity known as pseudo-

rapidity is used instead. The pseudorapidity is obtained in the limit of massless particles

in Equation 3.5 and is applicable to the particles that are detected by ATLAS whose masses

are negligible compared to their kinetic energy. The pseudorapidity is deﬁned as

$$\eta = -\ln\left(\tan\left(\frac{\theta}{2}\right)\right) \tag{3.6}$$

where θ is the polar angle. The pseudorapidity has values that range from (−∞, ∞), ap-

proaching ±∞ along the ±z-axes and being 0 in the transverse plane. The momentum

components of a particle can be expressed in terms of η and φ as

$$p_x = p_T \cos(\phi), \quad p_y = p_T \sin(\phi), \quad p_z = p_T \sinh(\eta) \tag{3.7}$$

where pT is the transverse momentum of the particle

$$p_T = |\mathbf{p}| \sin(\theta) = \sqrt{p_x^2 + p_y^2} \tag{3.8}$$

and p is the three-momentum of the particle. Other kinematic variables can be derived

through standard relations involving the four-momentum of the particle. The angular dis-

tance between two particles in the detector can be expressed in terms of η and φ as

$$\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} \tag{3.9}$$

where ∆η and ∆φ are the separations in pseudorapidity and azimuthal angle, respectively.
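The relations in Equations 3.5 to 3.9 can be checked numerically with a short script. The following is a minimal sketch rather than ATLAS software: the function names and the example particle are illustrative assumptions, and the wrapping of ∆φ into [−π, π) is an added convention that accounts for the periodicity of the azimuthal angle.

```python
import math


def rapidity(energy, pz):
    """Equation 3.5: y = 0.5 * ln((E + pz) / (E - pz))."""
    return 0.5 * math.log((energy + pz) / (energy - pz))


def pseudorapidity(theta):
    """Equation 3.6: eta = -ln(tan(theta / 2)) for a polar angle in (0, pi)."""
    return -math.log(math.tan(theta / 2.0))


def momentum_components(pt, eta, phi):
    """Equation 3.7: Cartesian momentum components from (pT, eta, phi)."""
    return pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)


def transverse_momentum(px, py):
    """Equation 3.8: pT = sqrt(px^2 + py^2)."""
    return math.hypot(px, py)


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into [-pi, pi).

    The wrapping is an assumption added here: phi is periodic, so the
    separation is taken along the shorter arc.
    """
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Equation 3.9: angular distance in the eta-phi plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


# Illustrative (made-up) particle: pT = 50 GeV, theta = 60 degrees, phi = 0.3 rad.
theta = math.radians(60.0)
eta = pseudorapidity(theta)
px, py, pz = momentum_components(50.0, eta, 0.3)

# For a massless particle E = |p|, so the rapidity coincides with eta.
energy = math.sqrt(px**2 + py**2 + pz**2)
print(f"eta = {eta:.4f}, y (massless) = {rapidity(energy, pz):.4f}")
print(f"pT  = {transverse_momentum(px, py):.1f} GeV")
print(f"dR  = {delta_r(0.5, 0.1, -0.2, 2.0 * math.pi - 0.1):.4f}")
```

The massless example illustrates why the pseudorapidity is a good stand-in for the rapidity when the particle mass is negligible: both expressions return the same value once E = |p|.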


3.2.3 Inner Detector

The inner detector (ID) [31] is the innermost part of the ATLAS detector and encapsulates

the beam line. It is designed to primarily measure the momentum of charged particles with

the aid of a 2 T axial solenoidal ﬁeld, in which all its subcomponents are immersed. The ID

is also used to identify primary and secondary vertices (see 3.3.1) that are used to provide

preliminary information for particle identiﬁcation. The ID has three main subcomponents:

the pixel detector, the semiconductor tracker (SCT) system, and the transition radiation

tracker (TRT) system. Mechanically, the ID and its three subcomponents span a radius of

1.1 m from the beam line and a length of 6.2 m parallel to the beam line. The ID is split

into three regions: a cylindrical barrel region, which provides a pseudorapidity coverage of

|η| < 1.2, and two end-cap regions, which provide a coverage of 1.2 < |η| < 2.5. A schematic

representation of the ID and its subcomponents is shown in Figure 3.10.


Figure 3.10: Schematic representation of the ID including the barrel and end-cap regions (a).
A longitudinal cross-section of the diﬀerent layers along the barrel region with their radial
distances measured from the beam line is also shown (b). These ﬁgures are taken from [32].


3.2.3.1 Pixel Detector

The pixel detector is the innermost subsystem of the ID, consisting of silicon pixel detector

modules. Being the closest component to the IP, it is designed to endure the intense radiation

environment it is subject to for a lifetime of ten years. It is mainly used to provide tracking

information and pattern recognition of short-lived particles, such as b hadrons and τ leptons. The

pixel detector was originally designed with three barrel layers and three end-cap disk layers

on each side. During the upgrade period between Run-1 and Run-2, a fourth layer directly

encompassing the beam line, known as the insertable B-layer (IBL) [33], was introduced to

compensate for the radiation damage sustained by the other layers and to reduce readout

inefficiencies due to pile-up at higher luminosities. The IBL has the finest granularity in

the barrel region of the pixel detector, with approximately 6 × 10⁶ silicon pixels with an Rφ-z

resolution of 50 × 250 µm². The three layers above the IBL have a coarser granularity with

a pixel size of 50 × 300 µm² in the Rφ-z plane and a total of approximately 67.2 × 10⁶ pixels.

The three end-cap disk layers are positioned at a distance |z| of 495 mm, 580 mm, and

650 mm on each side from the IP, with each disk containing approximately 2.2 × 10⁶ pixels.

Overall, the pixel detector contains approximately 86.4 × 10⁶ silicon pixels. The charged particles that

are produced at the IP interact with the pixel detector by ionizing the silicon, producing

electron-hole pairs, which are then read as an electric current by a sensor.
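The overall pixel count quoted above can be cross-checked by adding the per-layer numbers given in this subsection. The short sketch below only repeats that arithmetic; the variable names are illustrative.

```python
# Pixel counts quoted in the text, in units of 10^6 pixels.
IBL_PIXELS = 6.0
OUTER_BARREL_PIXELS = 67.2  # the three barrel layers above the IBL, combined
PIXELS_PER_DISK = 2.2
DISKS_PER_SIDE = 3
SIDES = 2

end_cap_pixels = PIXELS_PER_DISK * DISKS_PER_SIDE * SIDES
total_pixels = IBL_PIXELS + OUTER_BARREL_PIXELS + end_cap_pixels

print(f"end-cap disks: {end_cap_pixels:.1f} x 10^6 pixels")
print(f"total:         {total_pixels:.1f} x 10^6 pixels")
```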

3.2.3.2 Semiconductor Tracker

The SCT system envelops the pixel detector and is designed to continue tracking charged

particles by measuring their momentum and vertex position and providing pattern recogni-

tion of particles. The barrel component of the SCT consists of four layers of silicon microstrip

detectors with a size of 6.36 × 6.40 cm² and a resolution of 16 × 580 µm² in the Rφ-z plane

per silicon detector. The layers are positioned at radii of 300 mm, 373 mm, 447 mm, and

520 mm. The end-cap component of the SCT consists of nine disks with silicon

modules similar to those of the barrel component. Charged particles interact with the SCT in the same way

they do with the pixel detector.

3.2.3.3 Transition Radiation Tracker

The TRT is the outermost subsystem of the ID and consists of straw tube detectors with

a diameter of 4 mm. Each tube encapsulates an anode wire and is filled with a gas mixture

composed of 70% Xe, 27% CO₂, and 3% O₂. Charged particles that pass through the

straw tubes ionize the gas, and the resulting electrons drift towards the anode, where they

are recorded as a signal. The TRT continues to provide tracking information of charged

particles, but its main function is to assist in the pattern recognition of particles. This is

achieved by including a radiator material between individual straw tubes, which acts as a

medium boundary that forces particles to emit transition-radiation photons. These photons

then ionize the gas inside the tubes, with the emission rate of transition-radiation photons

being characteristic of a particle at a given momentum. Mechanically, the TRT consists of

50000 straw tubes in the barrel region that are placed parallel to the beam line and 320000

straw tubes that are placed at the end-caps in a radial conﬁguration and distributed across

18 wheel structures on each side.

3.2.4 Calorimetry

The calorimetry system of the ATLAS detector encapsulates the ID and solenoid magnet

and is located a layer below the muon spectrometer. It is composed of two main subsystems:

the liquid argon (LAr) calorimeter and the tile hadronic calorimeter (TileCal). The primary


function of these two subsystems is to stop incoming charged particles and hadrons that are

exiting the ID in order to measure their total energy. This is achieved by having diﬀerent

subcomponents in each subsystem with varying material compositions and lengths that are

tailored to maximize the interactions between the detector and particles. Together, both

subsystems provide a coverage of |η| < 4.9. In addition to measuring the energy of particles,

the calorimeter system also ensures good measurement of the missing transverse energy

($E_{\mathrm{T}}^{\mathrm{miss}}$)³ as a consequence of its wide η coverage and subcomponent material thicknesses.

This is important in identifying many physics signatures, such as the production of neutrinos.

A schematic representation of the ATLAS calorimetry system with all its subcomponents to

be discussed in the following subsections is shown in Figure 3.11.

Figure 3.11: Schematic representation of the calorimetry system of the ATLAS detector with
its diﬀerent subcomponents. This ﬁgure is taken from [34].

³ This value is a scalar related to the momentum imbalance in the transverse plane (see 3.3.5).


3.2.4.1 Liquid Argon Calorimeter

The LAr calorimeter provides both electromagnetic and hadronic energy measurement ca-

pabilities through the use of four subcomponents, which are distributed in the barrel and

end-cap regions. The barrel component consists only of the LAr electromagnetic (EM)

calorimeter, which provides a pseudorapidity coverage of |η| < 1.475. The end-cap region

contains the three remaining components, which are distributed in two coaxial wheels and

two longitudinal wheels. The outermost coaxial wheels contain the LAr electromagnetic

end-cap (EMEC) calorimeter, which is the innermost longitudinal wheel and provides a

coverage of 1.375 < |η| < 2.5, and the LAr hadronic end-cap (HEC) calorimeter, which is

positioned longitudinally after the EMEC and provides a coverage of 1.5 < |η| < 3.2.

The innermost coaxial wheel contains the LAr forward calorimeter (FCal), which covers the

high pseudorapidity region of 3.1 < |η| < 4.9.

The EM and EMEC components consist of alternating layers of lead absorption plates

and LAr. Charged particles or photons interact with the lead plates either by bremsstrahlung

or ionizing radiation, which produces electromagnetic showers that ionize the LAr medium,

providing electrical signals for measurements. To fully contain the electromagnetic showers,

the EM and EMEC are designed to have thicknesses of at least 22 and 24

radiation lengths (X0), respectively. The radiation length is an inherent

property of the material that is defined as the mean distance over which the energy of an

electron is reduced to a fraction 1/e of its initial value while traversing the material.

The HEC component consists of two independent wheels per end-cap that have a similar

absorber-LAr structure as the LAr electromagnetic components but use copper plates instead

of lead. The choice of copper over lead as the absorption material takes into account the fact

that hadronic showers are longer than electromagnetic showers due to the nuclear interaction


length (λI ) of hadrons being usually larger than the radiation length X0 by an order of

magnitude.

The FCal provides both electromagnetic and hadronic calorimetry with an approximate

thickness of 10 λI . It is split into three modules: the innermost module uses copper as its

absorption material and is optimized to provide electromagnetic measurements, while the

two remaining modules use tungsten as their absorption material for its higher density and

are optimized to provide hadronic energy measurements.

3.2.4.2 Tile Hadronic Calorimeter

The TileCal is designed to simultaneously measure the energy of hadronic interactions and

prevent incoming hadrons from leaving the detector.

It consists of a barrel component that

encapsulates the LAr barrel calorimeter. The TileCal is sectioned into a central long barrel

of length 5.8 m that covers a range of |η| < 1.0, and two extended barrels of length 2.6 m that

cover the range 0.8 < |η| < 1.7. The TileCal extends radially from an inner radius of 2.28 m

up to an outer radius of 4.25 m and is segmented into three layers that are approximately

1.5, 4.1, and 1.8 λI thick in the central barrel region, and 1.5, 2.6, and 3.3 λI thick in the

extended barrel region. It uses steel as the absorption material and plastic scintillating tiles

as the active medium. The particles that are produced in hadronic showers interact with

the scintillators, producing photons that are read out using photomultiplier tubes.

3.2.5 Muon Spectrometer

The muon spectrometer (MS), shown in Figure 3.12, is the outermost layer of the ATLAS

detector. It is designed to provide tracking and momentum measurements of muons, which

hardly interact with the ATLAS calorimeters because muons are MIPs. It is composed of two


subsystems: the muon precision chambers, which contain the monitored drift tubes (MDT)

and the cathode strip chambers (CSC) subcomponents; and the muon trigger chambers,

which contain the resistive plate chambers (RPC) and the thin gap chambers (TGC) sub-

components. The subcomponents provide the aforementioned muon measurements with the

assistance of three superconducting air-core toroidal magnets: one positioned in the central

barrel region of |η| < 1.4, and one at each end-cap covering the regions 1.6 < |η| < 2.7. The

central barrel toroidal magnet generates a magnetic ﬁeld of 0.5 T, while the end-cap toroidal

magnets generate a 1 T magnetic ﬁeld. The magnetic ﬁelds are oriented along the azimuthal

direction, which curves the trajectories of muons towards the diﬀerent subcomponents of the

MS.

Figure 3.12: Schematic representation of the muon spectrometer of the ATLAS detector
with its diﬀerent subcomponents. This ﬁgure is taken from [35].

3.2.5.1 Muon Precision Chambers

The muon precision chambers are designed to provide precise tracking information and mo-

mentum measurements of muons at the cost of a higher processing time. The MDT are


aluminum tubes that contain a central tungsten-rhenium wire. Each tube is ﬁlled with a gas

mixture that is mostly composed of argon. The MDT are arranged into chambers, providing

a coverage of |η| < 2.0 in the central region and up to |η| < 2.7 in the end-cap regions. The

CSC is composed of multiwire drift tubes, in contrast to the single-wire design of the

MDT, in order to cope with the demanding particle flux in the high pseudorapidity region

2.0 < |η| < 2.7 that it was designed to cover. Muons interact with the MDT and CSC

by ionizing the gas inside, which produces electrons that are read out by the central anode

wires.

3.2.5.2 Muon Trigger Chambers

The muon trigger chambers are designed to primarily provide well-deﬁned pT thresholds for

muons that are used by the ATLAS trigger system. However, they also provide information on

bunch-crossing identification and complement the muon tracking measurements performed

by the muon precision chambers by measuring the coordinate orthogonal to theirs. The RPC component

of the trigger chamber consists of parallel electrode plates that are separated by a gas gap

to be ionized by muons passing through. The RPC is located in the barrel region of the

detector and provides a coverage of |η| < 1.05. The TGC is composed of multiwire drift

tubes and is positioned at the end-cap regions, providing a coverage of 1.05 < |η| < 2.7 that

is used for tracking measurements, while the triggering decision information is restricted to

1.05 < |η| < 2.4.

3.2.6 Magnet System

The ATLAS detector magnet system [36] is split into three subsystems: a central solenoid, a

barrel toroid, and the end-cap toroids. As previously discussed, the magnetic ﬁelds generated


by these magnets aid in measuring the momentum of charged particles. A schematic repre-

sentation of the magnetic ﬁeld lines produced by the magnet system is shown in Figure 3.13.

Figure 3.13: Schematic representation of the magnetic ﬁeld lines produced by the ATLAS
detector magnet system. This ﬁgure is taken from [37].

The central solenoid is located between the ID and the electromagnetic calorimeter. It has

a longitudinal length of 5.3 m, an inner diameter of 2.44 m, and an outer diameter of 2.63 m.

The solenoid consists of a single-layer coil of superconducting wire made of an aluminum-

stabilized niobium-titanium-copper alloy. The superconducting wire is wound 1173 times

around the coil in a supporting cylinder and is designed to operate with a 7.6 kA current.

This allows the central solenoid to provide a nominal 2 T axial magnetic ﬁeld, although it

can produce a ﬁeld with a peak strength of 2.6 T.

The toroidal magnets are embedded within the MS and are mechanically split into a


barrel toroid and two end-cap toroids. The barrel toroid has a longitudinal length of 25.3 m,

an inner diameter of 9.4 m, and an outer diameter of 20.1 m. A single end-cap toroid

has a longitudinal length of 5 m, an inner diameter of 1.65 m, and an outer diameter of

10.7 m. All three toroids consist of 8 individual air-cored coils that are positioned radially

around the detector. Each individual coil contains superconducting wires made from the

same material as the superconducting wire in the central solenoid, although the ratio of the

diﬀerent elements varies between the barrel and end-cap toroids. The wires are wound 120

times around each individual coil of the barrel toroid, and 116 times around each individual

coil of the end-cap toroids. The operating current is 20.5 kA in the case of the barrel toroid,

and 20 kA in the case of the end-cap toroids. This allows the toroids to produce peak

magnetic ﬁeld strengths of 3.9 T and 4.1 T that are directed azimuthally for the barrel

toroid and end-cap toroids, respectively.

3.2.7 Trigger and Data Acquisition System

At a nominal collision rate of 40 MHz, it is unfeasible to store the data of all events that

are detected by ATLAS. The trigger and data acquisition (TDAQ) system of the ATLAS

detector [38] is designed to select events that are deemed interesting for the different

analyses carried out by the ATLAS experiment, based on certain triggering requirements. The

TDAQ system is split into three subsystems. The ﬁrst subsystem is the Level-1 (L1) trigger,

which reduces the initial event rate of 40 MHz down to approximately 100 kHz using direct

information from the detector hardware. The second subsystem is the Level-2 (L2) trigger,

which is seeded by the information provided from the L1 trigger to further reduce the event

rate down to approximately 3.5 kHz using full granularity and precision from the detector

hardware. The third subsystem is the event ﬁlter, which takes the information from the L2


trigger and processes it using oﬄine reconstruction algorithms to further reduce the event

rate down to approximately 200 Hz. Events that pass the event ﬁlter are then written to

disk and stored for oﬄine reconstruction.
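To give a sense of scale, the event rates quoted above can be turned into per-stage rejection factors. The snippet below is only an arithmetic sketch based on the approximate rates in this paragraph; the dictionary keys are illustrative labels.

```python
# Approximate output rates of each stage, in Hz, as quoted in the text.
rates_hz = {
    "collisions": 40e6,
    "L1": 100e3,
    "L2": 3.5e3,
    "event filter": 200.0,
}

# Rejection factor of a stage = input rate / output rate.
stages = list(rates_hz.items())
rejection = {}
for (_, input_rate), (name, output_rate) in zip(stages, stages[1:]):
    rejection[name] = input_rate / output_rate

overall_rejection = rates_hz["collisions"] / rates_hz["event filter"]

for name, factor in rejection.items():
    print(f"{name:>12}: keeps about 1 event in {factor:,.1f}")
print(f"{'overall':>12}: keeps about 1 event in {overall_rejection:,.0f}")
```

With these numbers most of the reduction happens at L1, which is consistent with it being the only stage that must decide at the full collision rate.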

The L1 trigger searches for events that have patterns of potential candidate high-pT

objects, such as electrons, muons, hadronically decaying τ leptons, photons, and jets. The L1 trigger

also searches for events that have large $E_{\mathrm{T}}^{\mathrm{miss}}$ and total transverse energy. The L1 trigger

is subdivided into three components: the L1 muon trigger, which identiﬁes potential muon

candidates from the hardware information obtained from the MS; the L1 calorimeter trigger

(L1Calo), which identiﬁes the remaining aforementioned objects using information from the

ATLAS calorimetry system; and the central trigger processor (CTP), which combines the

information obtained from the L1 muon and L1Calo triggers to decide whether an event

should be selected for further processing by the L2 trigger or not. Additionally, the L1

trigger deﬁnes regions of interest (RoI), which include data such as the spatial information

in the η-φ plane of interesting patterns it has identiﬁed from a single detector component,

the type of pattern identiﬁed, and the passing criteria imposed by the trigger.

The L2 trigger further analyzes the information stored in the RoI using full granularity

and precision from the data of all the detector components in the RoI. A trigger menu system

that contains individual items from the L1 trigger is used by the L2 trigger to accept or reject

events. Events that are accepted by the L2 trigger are built into a single data structure that

is sent to the event ﬁlter.

The event ﬁlter uses the event data structure constructed by the L2 trigger and processes

it with reconstruction algorithms to make the ﬁnal decision on whether to keep the event

or not. Events that pass the event ﬁlter are then classiﬁed as events to be used for physics

analysis or performance measurements, which are then saved in separate data streams.


3.3 Object Reconstruction and Calibrations

In the preceding subsections of this chapter, the preparation of pp collisions at the LHC and

how the ATLAS detector is designed to detect processes that arise from these collisions were

discussed. Interactions between particles and the diﬀerent subcomponents of the ATLAS

detector are converted into electrical signals. These signals are then processed individually

on a subcomponent basis to determine the region in the detector where the interaction took

place and the amount of energy deposited by the particles. Events that pass the selection

criteria imposed by the ATLAS trigger system are then stored and used in the diﬀerent

analyses and calibrations performed by the ATLAS collaboration. The last step to get

events ready for these tasks is to reconstruct and calibrate the diﬀerent physics objects

corresponding to particle detections.

As can be observed in Figure 3.8, the identiﬁcation of a particle cannot be based solely

on a single component of the detector. As an example, both electrons and photons deposit

most of their energy in the EM calorimeter as they produce electromagnetic showers. Thus,

at the calorimeter level, these two particles are practically indistinguishable. However, when

the information of the tracking system is considered, one can use the fact that electrons

do interact with the ID while photons do not. Combining these two detector signatures

allows us to distinguish between electrons and photons. Thus, the identiﬁcation of particles

produced in events from reconstructed objects is an algorithmic, multi-step procedure that

combines information from all relevant detector components.

Finally, the reconstruction of objects must take into account detector eﬀects. For exam-

ple, a detector component could have mechanical gaps where there is no material in which

particles can deposit their energy. Additionally, some detector components can accumulate


damage from the radiation produced by particle interactions. These effects can result in

discrepancies between the energy that is deposited by a particle and what is measured by

the detector component. For this reason, reconstructed objects are calibrated in order to

address detector eﬀects.

3.3.1 Tracks

Charged particles that pass through the ID interact with its diﬀerent layers creating hits,

which are then reconstructed into tracks that are associated with the trajectories of particles.

Since the ID is immersed in a solenoidal magnetic field, the tracks will follow a helical

trajectory. The charge of the particles can be determined from the direction of the curva-

ture of the tracks. Additionally, the momentum of the particle can be measured using the

curvature since these two quantities are inversely proportional.
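The link between curvature and momentum can be made concrete with the standard relation for a unit-charge particle in a uniform magnetic field, pT [GeV] ≈ 0.3 · B [T] · R [m], where R is the radius of curvature in the transverse plane. The sketch below assumes an ideal, uniform 2 T axial field (the nominal ID field quoted earlier); the function names are illustrative and not part of any ATLAS software.

```python
# Conversion factor c / 10^9 so that pT comes out in GeV for B in tesla
# and R in metres, for a particle of unit charge.
GEV_PER_TESLA_METRE = 0.299792458


def pt_from_radius(radius_m, b_field_t=2.0, charge=1):
    """Transverse momentum in GeV from the radius of curvature in metres.

    Assumes a uniform axial field; the real ID field is not perfectly uniform.
    """
    return GEV_PER_TESLA_METRE * abs(charge) * b_field_t * radius_m


def radius_from_pt(pt_gev, b_field_t=2.0, charge=1):
    """Radius of curvature in metres for a given transverse momentum in GeV."""
    return pt_gev / (GEV_PER_TESLA_METRE * abs(charge) * b_field_t)


for pt in (1.0, 10.0, 100.0):
    print(f"pT = {pt:6.1f} GeV -> R = {radius_from_pt(pt):8.2f} m")
```

Since the radius grows linearly with pT, high-pT tracks are nearly straight within the 1.1 m radius of the ID, which is why the momentum resolution degrades as pT increases.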

The track reconstruction at the ID consists of three subprocesses. The ﬁrst subprocess

takes the data from the pixel and SCT detectors and clusters it into spatial coordinates. The

second subprocess consists of applying track-ﬁnding algorithms [39, 31] that form track seeds

using the spatial coordinates from the pixel detector and the first layer of the SCT detector as

inputs. These track seeds are then extended through the remainder of the SCT, forming track

candidates that are ﬁtted to the track clusters in the SCT. Track candidates that are found

to be outliers are removed, while those that are deemed valid are then extended through the

TRT to resolve track direction ambiguities. Finally, the extended track candidates are ﬁtted

using all the information from the ID to evaluate the ﬁt quality. A complementary track-

ﬁnding algorithm, known as back-tracking, is also applied as part of the second subprocess.

This algorithm starts from the outermost layer of the ID and works its way inward through the

ID. The purpose of this algorithm is to ﬁnd leftover tracks that can be extended down into


the SCT and pixel detector in order to identify particles that undergo conversion or decay

while traversing the ID. The third subprocess consists of applying vertex ﬁnding algorithms

using tracks as inputs. A vertex is defined as the common spatial origin of a set of tracks that

are close together. Vertices that have their spatial origin near the beam line are denoted as

primary vertices. Secondary vertices correspond to particle conversion and decay processes

that take place far from the beam line.

3.3.2 Electrons and Photons

The majority of electrons and photons are fully stopped by the electromagnetic calorimeters

in the detector. The representation of the energy of these particles is constructed using the

energy measurements obtained from this calorimetry subsystem. This process is performed

by applying a sliding-window algorithm [40] that scans the diﬀerent layers of the electro-

magnetic calorimeter in η-φ space using a ﬁxed-size window. The transverse energy that

is deposited at a given window interval is added to form candidate energy clusters. The

candidate clusters are rejected if their total transverse energy is below the noise threshold of

the calorimeter. Clusters that pass the noise threshold criteria are then compared with the

reconstructed tracks from the ID in order to determine if these signatures should be recon-

structed as an electron or photon. If the energy cluster spatially matches a reconstructed

track belonging to a primary vertex, then both objects are reconstructed as an electron. If

the energy cluster matches a reconstructed track belonging to a secondary vertex, then both

objects are reconstructed as a photon that converted into an electron-positron pair. Finally,

if the energy cluster does not match any tracks, then it is reconstructed as a photon. In the

cases where the energy cluster is matched to a track, the momentum of the reconstructed ob-

ject is recalculated using the cluster energy and the momentum measurement obtained from


the ID tracks. A more detailed description of the reconstruction of electrons and photons

can be found in references [41, 42].
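The matching logic described above can be summarized as a small decision function. This is a simplified sketch of the prose, not the reconstruction code of references [41, 42]: the function name, the `VertexType` enum, and the idea of passing the vertex type of the matched track as a single argument are all assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class VertexType(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


def classify_cluster(
    transverse_energy: float,
    noise_threshold: float,
    matched_track_vertex: Optional[VertexType],
) -> Optional[str]:
    """Classify an electromagnetic energy cluster following the text.

    matched_track_vertex is the vertex type of the spatially matched ID
    track, or None when no track matches the cluster.
    """
    # Clusters below the calorimeter noise threshold are rejected.
    if transverse_energy < noise_threshold:
        return None
    # Track from a primary vertex: electron.
    if matched_track_vertex is VertexType.PRIMARY:
        return "electron"
    # Track from a secondary vertex: photon that converted to an e+e- pair.
    if matched_track_vertex is VertexType.SECONDARY:
        return "converted photon"
    # No matching track: unconverted photon.
    return "photon"


# Illustrative inputs (energies in GeV, values made up).
print(classify_cluster(25.0, 2.5, VertexType.PRIMARY))
print(classify_cluster(25.0, 2.5, VertexType.SECONDARY))
print(classify_cluster(25.0, 2.5, None))
print(classify_cluster(1.0, 2.5, VertexType.PRIMARY))
```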

At the detector level, other particles can produce detector signatures that mimic the

signatures of the particle of interest one wants to reconstruct, which can result in the mis-

reconstruction of an object. Photons have a low mimic rate due to their unique detector

signature. Charged particles will often produce a complete set of tracks in the ID, which

makes their signature inconsistent with photons. Particles that carry no charge and interact

with the detector, such as neutral hadrons, deposit most of their energy in the hadronic

calorimeter, which also makes their signatures inconsistent with those of a photon. Elec-

trons, on the other hand, can be mimicked by several particles. For example, charged pions

that decay while traversing the electromagnetic calorimeter can mimic the signature of an

electron if the pion decays deposit all of their energy in this detector component, and the

tracks associated with the pion are consistent with a primary vertex. Although the mimicking

of electron detector signatures is a rare process, the mimic rates can start to become relevant

at higher luminosities. For this reason, the reconstructed electron candidates

are classiﬁed as loose, medium, and tight based on selection criteria that lower the mimic

rate but also lower the acceptance rate moving from loose to tight.

3.3.3 Muons

The process of reconstructing muons is straightforward compared to other particles due to

muons being MIPs and lacking color charge. At the detector level, muons are identified as

tracks in the ID and MS, with the possibility of small energy deposits in the electromagnetic

calorimeter. There are four muon reconstruction algorithms [43] that are used based on the

availability of the detector signatures. The combined (CB) muon algorithm is used when


both the ID and MS tracks are available and have a good spatial match. The segment-tagged

(ST) muon algorithm is used when the ID tracks are fully reconstructed but only

partial track segments are available from the MS. This occurs when a muon has low pT or

passes through a region of the MS where only a single layer is hit. The calorimeter-tagged (CT)

muon algorithm is used when there is no information available from the MS but an ID track

is matched to an energy cluster in the electromagnetic calorimeter that is consistent with a

MIP. Finally, the extrapolated (ME) muon algorithm is used when a full track from the MS

is available but there is no ID track. This algorithm is used in the region 2.5 < |η| < 2.7

where there is coverage from the MS but not from the ID.
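The choice between the four algorithms depends only on which detector signatures are available, so it can be sketched as a lookup. The function below is an illustration of the prose above rather than the implementation of reference [43]; its name, its arguments, and the order in which the cases are tested are assumptions.

```python
from typing import Optional


def select_muon_algorithm(
    has_id_track: bool,
    ms_info: Optional[str],
    id_ms_match: bool = False,
    calo_mip_match: bool = False,
) -> Optional[str]:
    """Return the muon type label suggested by the available signatures.

    ms_info is "track" for a full MS track, "segments" for partial MS
    track segments, or None when the MS provides no information.
    """
    # Combined: full ID and MS tracks with a good spatial match.
    if has_id_track and ms_info == "track" and id_ms_match:
        return "CB"
    # Segment-tagged: full ID track, only partial segments in the MS.
    if has_id_track and ms_info == "segments":
        return "ST"
    # Calorimeter-tagged: no MS information, ID track matched to a
    # calorimeter cluster consistent with a MIP.
    if has_id_track and ms_info is None and calo_mip_match:
        return "CT"
    # Extrapolated: full MS track without an ID track (2.5 < |eta| < 2.7).
    if not has_id_track and ms_info == "track":
        return "ME"
    return None


print(select_muon_algorithm(True, "track", id_ms_match=True))
print(select_muon_algorithm(True, "segments"))
print(select_muon_algorithm(True, None, calo_mip_match=True))
print(select_muon_algorithm(False, "track"))
```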

Although very few particles besides muons and neutrinos reach the MS, there are

certain particles that can mimic muon signatures. One example is the charged

pion, which leaves tracks in the ID, traverses the calorimeters if it is produced with

a lot of energy, and most likely decays into a muon and a neutrino once inside the MS.

3.3.4 Jets

As mentioned in subsection 3.2.1, the particles that have color charge initiate a process known

as hadronization due to color conﬁnement. This process starts as soon as quark-antiquark

pairs are created and start to separate due to their large momentum. The result of this is

the creation of additional quark-antiquark pairs as soon as it becomes energetically favorable

in order to conform to color conﬁnement. The inelastic nuclear collisions that occur in the

hadronic calorimeter further elicit this process from hadrons. The hadronization process

culminates when all the kinetic energy of the individual quarks is depleted, resulting in

the creation of shower-like structures of energy deposits in the calorimeter. The degree of

collimation of these showers along the direction of the original particle that initiated the


hadronization process is dependent on the pT of the original particle. In addition to quarks,

it is also possible to produce other particles during hadronization, such as gluons, photons,

electrons, and muons.

The energy deposits that result from the hadronization process are reconstructed as

objects known as jets. There are diﬀerent types of algorithms that are used to reconstruct

jets, each yielding jets that have varying properties that are suitable for diﬀerent applications.

An important application of jets in analyses is to identify short-lived particles, such as top

quarks, W/Z bosons, and Higgs bosons, that initiate the hadronization process through

their decays. This is done with the use of dedicated algorithms, known as jet taggers, that

employ diﬀerent identiﬁcation techniques based on the target particle to be identiﬁed. The

concept of jet taggers will be discussed in more detail in Chapter 5.

The jet reconstruction process starts by reconstructing the energy deposits from the

hadronization showers into objects known as topological energy clusters [40] (topoclusters).

This is done with an algorithm that clusters neighboring calorimeter cells based on whether

the total energy of the cluster exceeds a threshold deﬁned on the expected noise of the cells.

Unlike the reconstruction of electrons and photons, which uses only the energy deposits in the

electromagnetic calorimeter, the reconstruction of topoclusters uses both the electromagnetic

and hadronic calorimeters in order to take into account the production of other particles,

such as photons and electrons, during the hadronization process.

Once the topoclusters have been built, they are clustered into jets using a clustering algorithm
that combines the spatial and energy information of the topoclusters. The clustering

algorithm is summarized in the following steps:

1. Given two topoclusters labeled as i and j, their relative momentum-weighted distance
$d_{i,j} = \min\{p_{T,i}^{2n}, p_{T,j}^{2n}\}\,\Delta R_{i,j}^{2}/R^{2}$ is calculated. The parameter n determines the
momentum dependence of the clustering algorithm, while the parameter R defines the
catchment area of the jet.

2. Calculate di,j for all possible topocluster pairs.

3. Determine $d = \min\{d_{i,j}, d_k\}$, where $d_k = p_{T,k}^{2n}$ is the momentum weight of an individual
topocluster k. If $d = d_{i,j}$ for a pair of topoclusters (i,j), then both topoclusters
are combined. If $d = d_k$ for a single topocluster k, then the topocluster is considered
to be a jet and is removed from further consideration in the clustering algorithm.

4. Repeat the algorithm until all topoclusters are combined into jets.
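The four steps above can be illustrated with a minimal Python sketch. It is not the ATLAS implementation: the dictionary layout (`pt`, `y`, `phi`), the function names, and in particular the pT-weighted recombination in `combine` are assumptions made for illustration, whereas real jet algorithms add four-momenta.

```python
import math

def delta_r2(a, b):
    """Squared angular distance in the (y, phi) plane, with phi wrapped to (-pi, pi]."""
    dphi = math.atan2(math.sin(a["phi"] - b["phi"]), math.cos(a["phi"] - b["phi"]))
    return (a["y"] - b["y"]) ** 2 + dphi ** 2

def combine(a, b):
    """Hypothetical pT-weighted merge of two objects (assumption, not a four-momentum sum)."""
    pt = a["pt"] + b["pt"]
    dphi = math.atan2(math.sin(b["phi"] - a["phi"]), math.cos(b["phi"] - a["phi"]))
    phi = a["phi"] + (b["pt"] / pt) * dphi
    return {"pt": pt,
            "y": (a["pt"] * a["y"] + b["pt"] * b["y"]) / pt,
            "phi": math.atan2(math.sin(phi), math.cos(phi))}

def cluster(topoclusters, R=0.4, n=-1):
    """Sequential recombination: n = 0 (CA), n = 1 (kT), n = -1 (anti-kT)."""
    objs = [dict(c) for c in topoclusters]
    jets = []
    while objs:
        best = None  # (distance, i, j); j is None when the minimum is a single-object d_k
        for i, a in enumerate(objs):
            dk = a["pt"] ** (2 * n)
            if best is None or dk < best[0]:
                best = (dk, i, None)
            for j in range(i + 1, len(objs)):
                b = objs[j]
                dij = min(a["pt"] ** (2 * n), b["pt"] ** (2 * n)) * delta_r2(a, b) / R ** 2
                if dij < best[0]:
                    best = (dij, i, j)
        _, i, j = best
        if j is None:
            jets.append(objs.pop(i))      # d = d_k: promote the object to a jet
        else:
            b = objs.pop(j)               # j > i, so pop j first
            a = objs.pop(i)
            objs.append(combine(a, b))    # d = d_ij: merge the pair and iterate
    return jets
```

With anti-kT settings, a soft topocluster close to a hard one is absorbed into it first, while a distant topocluster ends up as its own jet. The double loop recomputes all distances at every iteration, which is adequate for a sketch but far slower than optimized implementations.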

The most common choices of the parameter n are 0, 1, and -1, which are known as the

Cambridge-Aachen (CA) [44], the kT [45], and the anti-kT [46] algorithms, respectively. The

CA algorithm ignores the topocluster momentum, making it a purely geometric clustering

algorithm. The kT algorithm has the eﬀect of clustering ﬁrst the low pT topoclusters, while

the anti-kT algorithm prioritizes high pT topoclusters. Figure 3.14 shows the outcome of

these three algorithms for an example set of energy deposition. As can be observed from

this ﬁgure, the anti-kT algorithm tends to produce jets with circular shapes that have a

radius approximately equal to the parameter R. For this reason, this parameter is mostly

referred to as the jet radius, a convention that will be adopted throughout the remainder of

this thesis. The circular shape of anti-kT jets is attributed to clustering the most energetic

topoclusters ﬁrst, which deﬁnes a stable centroid of the jet. The remaining topoclusters that

are added to the jet during the clustering process accumulate around the centroid. A good
approximation to determine if a jet captures all the subsequent decays of the particle that
initiated the hadronization process within its radius is to define the jet radius as

$$R = \frac{2\,m_{\text{particle}}}{p_{T,\text{particle}}} \tag{3.10}$$

Figure 3.14: Result of applying the Cambridge-Aachen (a), the kT (b), and the anti-kT (c)
clustering algorithms in the y-φ plane as a function of the topocluster pT with parameter
R = 1.0. Each uniquely colored cluster represents a single jet. This figure is taken from [46].
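As a worked illustration of Equation 3.10, the short sketch below evaluates the containment radius for a given particle and inverts the relation to estimate the pT above which the decay products fit inside a jet of fixed radius. The function names are hypothetical, and the masses used in the usage note (172.5 GeV for the top quark, 80.4 GeV for the W boson) are approximate values assumed here, not numbers quoted in this section.

```python
def containment_radius(mass_gev, pt_gev):
    """Approximate jet radius needed to contain the decay products: R = 2m / pT."""
    return 2.0 * mass_gev / pt_gev

def min_pt_for_radius(mass_gev, radius):
    """Invert R = 2m / pT: the pT above which the decay fits in a jet of the given radius."""
    return 2.0 * mass_gev / radius
```

Under these assumed masses, `min_pt_for_radius(172.5, 1.0)` gives 345 GeV for a top quark and `min_pt_for_radius(80.4, 1.0)` gives about 161 GeV for a W boson, which suggests why R = 1.0 jets are a natural choice for boosted top and W decays.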

Jets that are used in ATLAS analyses are usually reconstructed using a radius parameter

of R = 0.4, known as small-R jets, or R = 1.0, known as large-R jets. Small-R jets are

designed to capture the hadronization of a single non-massive quark and the radiation of

gluons. Small-R jets are often used as inputs to ﬂavor tagging algorithms [47, 48], which

identify jets that originate from the hadronization of b and c quarks. On the other hand,

large-R jets are designed to capture the hadronization of heavier particles such as the top

quark and hadronically decaying W/Z and Higgs bosons. For this reason, large-R jets are

used as inputs to tagging algorithms dedicated to identifying these heavier particles.

Another type of jet reconstruction used in ATLAS analyses consists of using small-R

jets as inputs to a reclustering algorithm that combines them into a larger jet known as a

reclustered (RC) jet. Unlike large-R jets, which require calibrations to their mass and energy

(see 3.3.8), RC jets do not require additional calibrations since they are built from calibrated

small-R jets. The RC jets are usually constructed using the anti-kT algorithm and can have

a ﬁxed or variable radius parameter. For ﬁxed-radius RC jets, the radius parameter is set

to R = 1.0, while for variable-radius RC jets, the radius parameter is set to R = ρ/pT,

where ρ is an input parameter that controls the evolution of the eﬀective size of the RC

jet [49]. This shape ﬂexibility that variable-radius RC jets oﬀer allows them to capture the

decays of boosted particles in a wide pT range. Particles that are produced with pT greater

than their mass are considered boosted particles. For this reason, the variable-radius RC

jets are used as the inputs to the tagging algorithms used in the VLQ searches presented

in Chapter 6 since the mass energy of the VLQs is converted mostly into the kinetic energy

of their decays.

3.3.5 Missing Transverse Energy

The sources of missing transverse energy ($E_T^{\text{miss}}$) [50] can be attributed to particles that

exit the detector without interacting, such as neutrinos, and object misreconstruction, such

as reconstructing a jet with mismeasured energy. Since protons that are traveling along

the beam line have net zero momentum in the transverse plane prior to colliding$^{4}$, the

total transverse momentum from all particles produced in the collision must be zero by

conservation of momentum. If the total transverse momentum after the collision is not zero,

then the corresponding event has a source of $E_T^{\text{miss}}$. The amount of $E_T^{\text{miss}}$ is quantified as

$$E_T^{\text{miss}} = \sqrt{\left(p_x^{\text{miss}}\right)^2 + \left(p_y^{\text{miss}}\right)^2} \tag{3.11}$$

where $p_x^{\text{miss}}$ and $p_y^{\text{miss}}$ are the components of a vector in the transverse plane that quantify

the momentum imbalance. This vector also has an associated azimuthal angle that indicates

the direction of the net momentum imbalance and is given by

$$\phi^{\text{miss}} = \arctan\left(\frac{p_y^{\text{miss}}}{p_x^{\text{miss}}}\right) \tag{3.12}$$
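Equations 3.11 and 3.12 can be evaluated with a few lines of Python. The sketch below assumes, as the momentum-conservation argument above suggests, that the missing momentum is the negative vector sum of the transverse momenta of the reconstructed objects; the function name and the input layout are hypothetical. It uses `math.atan2` rather than a plain arctangent so that the azimuthal angle lands in the correct quadrant.

```python
import math

def missing_et(visible):
    """Return (ETmiss, phi_miss) from a list of (px, py) pairs in GeV.

    Assumption: the missing momentum is minus the vector sum of the visible objects.
    """
    px_miss = -sum(px for px, _ in visible)
    py_miss = -sum(py for _, py in visible)
    met = math.hypot(px_miss, py_miss)       # Equation 3.11
    phi = math.atan2(py_miss, px_miss)       # Equation 3.12, quadrant-aware
    return met, phi
```

For example, two visible objects with transverse momenta (30, 0) GeV and (0, 40) GeV give a missing transverse energy of 50 GeV pointing into the third quadrant.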

$^{4}$ This is not quite true since the net momentum of the quarks inside protons can be non-zero. The

momentum components of quarks follow a distribution, albeit narrowly centered around zero.

3.3.6 Tau Leptons

Tau leptons present one of the hardest experimental challenges when trying to reconstruct

them from detector signatures due to their short lifetime and possible decay modes, which

are shown in Figure 3.15. In all decay scenarios the tau produces a neutrino, which
implies that the full energy of a tau cannot be reconstructed. In the case of a leptonically
decaying tau it is very difficult to distinguish this process from the detector signatures of
individual electrons and muons. For a hadronically decaying tau, it is possible to reconstruct
and identify the tau from a combination of narrow calorimeter energy clusters and 1-3 ID
tracks that originate from the charged hadrons [51]. Although the reconstruction of taus
is not used in the analyses presented in this thesis, the misreconstruction of taus as other
objects can be a potential source of $E_T^{\text{miss}}$.

Figure 3.15: Feynman diagram depicting the different decay modes of a tau.

3.3.7 Overlap Removal

As discussed in the preceding subsections, different particles can mimic the detector signatures
that other particles produce. In addition to object misreconstruction, this can lead

to the potential double counting of detector signatures in diﬀerent object reconstruction

algorithms if the particle that produced the signatures satisﬁes the reconstruction criteria

of multiple objects. Refinement procedures known as Overlap Removal (OR) are applied to
reconstructed objects in order to prevent the detector signatures of a given
particle from appearing in multiple object classes.

For the studies presented in this thesis, the most relevant OR process is between electrons,

muons, and jets. The OR process is done sequentially, starting with candidate electrons and

muons that are within ∆R = 0.01. If a candidate electron and muon satisfy this criterion,

then the electron is removed from the event in order to suppress contributions from muon

bremsstrahlung. The next OR process to be applied is between candidate electrons and jets.

During the jet reconstruction procedure, the energy depositions in the calorimeter that have

been identiﬁed with electrons are not excluded. To avoid the double counting of electrons

as jets, the closest jet that is within ∆R = 0.2 of an electron is removed since these jets are

produced primarily from the showering of electrons. If there is any electron that is within

∆R = 0.4 of a jet after the initial OR of jets, then the jet is retained and the electron

is removed since the electron is likely to have been produced from the decay of a hadron

associated with the jet. The ﬁnal OR process to be applied is between candidate muons and

jets. Jets and muons can appear in close proximity when the jet originates from high-pT

muon bremsstrahlung, and in such cases the muon should be kept in favor of the jet. Such

jets are characterized by having very few matching ID tracks. Candidate muons that satisfy
$\Delta R < 0.4 + 10\,\text{GeV}/p_T^{\text{muon}}$ with a jet are removed if the jet has at least three tracks
originating from the primary vertex; otherwise the jet is removed and the muon is kept.
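The sequential OR procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: objects are plain dictionaries with hypothetical keys (`pt` in GeV, `eta`, `phi`, and `ntrk` for the number of jet tracks from the primary vertex), and when a muon overlaps with several jets the sketch removes the muon if any of them has at least three tracks, a choice the text does not specify.

```python
import math

def delta_r(a, b):
    """Angular distance between two objects, with phi wrapped to (-pi, pi]."""
    dphi = math.atan2(math.sin(a["phi"] - b["phi"]), math.cos(a["phi"] - b["phi"]))
    return math.hypot(a["eta"] - b["eta"], dphi)

def overlap_removal(electrons, muons, jets):
    # Step 1: drop electrons within dR < 0.01 of a muon (muon bremsstrahlung).
    electrons = [e for e in electrons if all(delta_r(e, m) >= 0.01 for m in muons)]
    # Step 2: for each electron, drop the closest jet within dR < 0.2.
    jets = list(jets)
    for e in electrons:
        near = [j for j in jets if delta_r(e, j) < 0.2]
        if near:
            closest = min(near, key=lambda j: delta_r(e, j))
            jets = [j for j in jets if j is not closest]
    # Step 3: drop electrons within dR < 0.4 of any surviving jet.
    electrons = [e for e in electrons if all(delta_r(e, j) >= 0.4 for j in jets)]
    # Step 4: muon-jet overlap with the pT-dependent cone quoted in the text.
    kept_muons = []
    for m in muons:
        cone = 0.4 + 10.0 / m["pt"]
        overlapping = [j for j in jets if delta_r(m, j) < cone]
        if any(j["ntrk"] >= 3 for j in overlapping):
            continue  # jet has many tracks: keep the jet, remove the muon
        # Otherwise the overlapping low-track jets are removed and the muon is kept.
        jets = [j for j in jets if all(j is not o for o in overlapping)]
        kept_muons.append(m)
    return electrons, kept_muons, jets
```

The order of the steps matters: the electron-jet removal in step 2 must run before step 3, otherwise every electron would be rejected by the jet built from its own shower.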

3.3.8 Calibrations

As previously mentioned, the object reconstruction procedure must take into account detector
effects so that objects accurately represent the result of the interactions between the

detector and particles produced in pp collisions. Examples of sources of these eﬀects include,

but are not limited to: diﬀerences in the detector response due to material variability within

a detector subsystem; transitions between diﬀerent detector technologies, such as diﬀerent

component resolutions; and detector damage due to prolonged radiation exposure. Additionally,
the large instantaneous luminosity and pile-up conditions in which the ATLAS detector

operates can also have eﬀects on the object reconstruction process by introducing excess

energy from other events, which can result in the mismeasurement of an object. In order to

compensate for these eﬀects, calibrations are derived for both Monte-Carlo (MC) simulation

and data independently. These calibrations are designed to reduce the discrepancy between

the energy measured by the detector and the actual energy that was deposited by particles.

The calibrations are accurate by design but can have large variations between MC and data.

These variations originate from the diﬀerences between the actual detector response and

the imperfections in the simulation of the interactions between particles and the detector

components. Thus, the diﬀerences between the calibrations that are applied to MC and data

give rise to sources of systematic uncertainties that need to be considered in analyses.

The calibrations that are applied to jets are some of the most important compared to

other physics objects. This is in part due to the complexity of jet reconstruction and the

inherent randomness of the hadronization process. Since hadronization is random, jets can

have large variations in their particle composition, which results in large variations in detector

responses between jets. Additionally, jet calibrations rely on other well-calibrated objects

such as electrons and photons. For the stated reasons, the systematic uncertainties that are

associated with jet calibrations are a very important source of uncertainty in analyses. The

four main types of calibrations applied to jets are on the jet energy scale (JES) and resolution

(JER), and the jet mass scale (JMS) and resolution (JMR). A more detailed description of

these calibrations, how they are derived, and their associated uncertainties can be found

in [52, 53, 54].

These calibrations are ﬁrst applied to jets from MC. The events from MC simulation

contain the information of all stable particles that are produced in a given event, which

is known as the truth information. The truth information includes the four-momentum of

the particles and their decay chain, which can be used to reconstruct the full decay tree

of the event. Jets can be reconstructed in two ways using the truth information. The

ﬁrst type of jets, known as truth jets, are obtained by applying the clustering algorithm

described in 3.3.4 to the four-momentum of the stable particles in the MC record. The

second type of jets, simply known as reconstructed jets, are obtained by reconstructing

them from simulated detector responses from the truth information particles. Since these

simulations are imperfect, the reconstructed jets are calibrated so that their energy and mass

match those of truth jets.

The JES and JMS calibrations are designed to correct the energy and mass measurements

of jets from sources that can aﬀect these measurements. First, a calibration is applied to

correct the jet origin so that the direction of the jet matches that of its primary vertex.

Next, a calibration is applied in order to remove eﬀects from pile-up contamination, which

can result in the mismeasurement of the jet properties. The next calibration consists of

correcting the JES and JMS from the diﬀerences between the calorimeter response and the

truth jet scales. The final calibration applied to MC jets consists of reducing the dependence
of the detector response on the flavor composition of the jet constituents. An in-situ calibration is

applied to jets from data to correct any remaining inaccuracies between data and MC.

The JER and JMR calibrations are designed to correct the variance of the jet energy and

mass measurements in MC so that they match that of data. These calibrations take into

consideration eﬀects such as pile-up contamination, variability in the detector material, and

energy deposition in passive detector components.

Chapter 4

Processes of Interest and Data Selection

In this Chapter, the relevant physics processes that are studied in Chapter 5 and Chapter 6

are described. These processes are classified as signal processes, which contain physical signatures
of interest, and background processes, which can mimic signal processes in different

ways. Both studies use data recorded by the ATLAS detector and MC simulations, which

are obtained from the theoretical predictions of the signal and background processes. The

details of the MC samples used in these studies are described in Appendix A. Events from

data and MC go through the object reconstruction and calibration procedures described

in section 3.3. The event selection criteria that are used in each study are also described

in this Chapter. These are kinematic requirements that are imposed on the reconstructed

physics objects from data and MC events and are designed to simultaneously maximize the

acceptance of signal-like processes and the rejection of background-like processes.

4.1 Jet Tagging Studies

The studies presented in Chapter 5 are dedicated to the optimization and calibration of jet

tagging algorithms, which are designed to associate a jet with the particle that originated it.

The signal processes in these studies are events that contain a jet that was produced by a

particle of interest. These jets will be referred to as signal jets. Background processes, on

the other hand, are events that lack the particle of interest but contain jets that mimic signal

jets, which will be referred to as background jets. However, as previously discussed, since

hadronization is a stochastic process and jet reconstruction is not a fully eﬃcient process, it is

possible that signal jets get mistaken for background jets. This can happen if the kinematic

features of the jet are mismeasured or misreconstructed. Both signal and background jets are

used as inputs to the tagging algorithms in order to optimize and calibrate the jet taggers,

as will be discussed in Chapter 5. The data sample used in the jet tagging studies comes

from data recorded from pp collisions at a center of mass energy of 13 TeV in the period

2015-2017 and corresponds to an integrated luminosity of 80.5 fb−1.

4.1.1 Signal Processes

The jet taggers studied in Chapter 5 are designed to identify jets that originate from the

decays of boosted top quarks and W bosons. The taggers are optimized using MC samples

from the BSM Heavy Vector Triplet (HVT) model [55] as signal processes. This model

predicts the existence of two heavy gauge bosons: the $W'$ and the $Z'$. The $W'$ production
processes considered are the $W' \to WZ \to q\bar{q}q\bar{q}$ decays, which serve as a source of signal W
jets, and the $W' \to tb$ decays, which serve as a source of jets arising from top quark decays.
The $Z'$ production process considered is the $Z' \to t\bar{t}$ decay, which also provides a source of

jets arising from top quark decays. The samples used are required to have the heavy gauge

bosons produced with a resonant mass of at least 2 TeV. This ensures that the jets produced

in these events are highly boosted. Example Feynman diagrams of these HVT processes are

shown in Figure 4.1.

Figure 4.1: Feynman diagrams depicting the production of the heavy $W'$ and $Z'$ vector
bosons. The $W'$ decay into SM vector bosons is shown in (a). The $W' \to tb$ decay channel
is shown in (b). The $Z' \to t\bar{t}$ decay is shown in (c).

MC simulations of SM processes are used for calibrating the tagger performance to match
that of data. Simulations of t¯t and single-top production are used as signal processes for

the calibration of the jet taggers. For reasons that will be discussed in subsection 4.1.3, the

presence of a single electron or muon that is associated with the production processes is

required. Both of these processes serve as potential sources of boosted top and W jets. The

jet reconstruction process is simulated in these samples. If a top quark from these events

is highly boosted, then it can be reconstructed into a single large-R jet since the decays of

the top will be collimated. On the other hand, if the top quark is not suﬃciently boosted,

then its decays will be more spatially separated, which allows the possibility to individually

identify the jets produced from its decays. This scenario is referred to as a resolved top decay.

If the W boson that originates from a resolved top decay is suﬃciently boosted, then it can

be reconstructed into a single large-R jet. Example Feynman diagrams of t¯t and single-top

production are shown in Figures 4.2 and 4.3, respectively.

Figure 4.2: Feynman diagram depicting the pair production of top quarks through the strong
force where one of the top quarks decays hadronically and the other decays leptonically.

Figure 4.3: Feynman diagrams depicting the production of a single top quark through the
t-channel (a), the s-channel (b), and the Wt-channel (c).

4.1.2 Background Processes

The boosted object taggers are optimized to identify signal top and W jets against jets

originating from QCD multijet production. The jets produced from this process originate

from gluon radiation and the hadronization of non-top quarks. QCD multijet production is

one of the most common processes that is initiated from hadron collisions. This ensures that

the boosted object taggers are properly optimized against a large selection of background

jets that span a wide kinematic regime.

As will be discussed in subsection 4.1.3, the QCD multijet production process is eﬀectively

suppressed by the event selection used for the tagger calibration studies. Thus, other sources

of background processes that can mimic the production of signal top and W jets must be

considered for the tagger calibration. These processes are the production of vector bosons

associated with additional jets (V +jets) and the pair production of vector bosons (dibosons).

Other potential background processes are signiﬁcantly suppressed by the event selection

criteria that will be discussed in the next subsection. Example Feynman diagrams of V +jets

and diboson production processes are shown in Figures 4.4 and 4.5.

4.1.3 Event Selection

The event selection for the boosted object tagging studies requires:

• Exactly one electron or muon$^{1}$ with pT > 30 GeV.

• $E_T^{\text{miss}} > 20$ GeV and $E_T^{\text{miss}} + m_T^{W} > 60$ GeV.$^{2}$

• At least one small-R jet with pT > 25 GeV and ∆R(jet, lepton) < 1.5.

• At least one large-R jet with pT > 200 GeV.

• At least one b-tagged variable radius jet with Rmin = 0.02, Rmax = 0.4 and ρ =
30 GeV. The b-tagging algorithm used is the DL1 [47] algorithm, which uses a deep
neural network to determine the probability of a jet originating from a b, c, or other
light quarks.

$^{1}$ Electrons and muons are often jointly referred to as leptons in analyses, a convention that will be adopted
from this point on.

$^{2}$ $m_T^{W} = \sqrt{2\,p_T^{\ell}\,E_T^{\text{miss}}\,(1 - \cos\Delta\phi)}$ is the transverse mass of the lepton and the missing transverse energy
system, where $p_T^{\ell}$ is the transverse momentum of the lepton and $\Delta\phi$ is the azimuthal angle separation
between the lepton and the direction of the missing transverse energy.

Figure 4.4: Feynman diagram depicting the production of a vector boson V in association
with jets that originate from the subsequent hadronization of quarks.

Figure 4.5: Feynman diagram depicting the pair production of vector bosons.
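The event selection criteria listed above can be expressed as a simple filter. The sketch below is illustrative only: the event layout (keys such as `leptons`, `small_jets`, `large_jets`, `met`, `met_phi`, and `n_btagged_vr_jets`) is hypothetical, and "exactly one lepton" is interpreted here as exactly one lepton passing the pT threshold, which is an assumption.

```python
import math

def delta_r(a, b):
    """Angular distance between two objects with "eta" and "phi" keys."""
    dphi = math.atan2(math.sin(a["phi"] - b["phi"]), math.cos(a["phi"] - b["phi"]))
    return math.hypot(a["eta"] - b["eta"], dphi)

def transverse_mass_w(lep_pt, lep_phi, met, met_phi):
    """m_T^W = sqrt(2 pT(lepton) ETmiss (1 - cos(dphi))), energies in GeV."""
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(lep_phi - met_phi)))

def passes_tagging_selection(event):
    # Exactly one electron or muon with pT > 30 GeV (assumed interpretation).
    leptons = [lep for lep in event["leptons"] if lep["pt"] > 30.0]
    if len(leptons) != 1:
        return False
    lep = leptons[0]
    # ETmiss > 20 GeV and ETmiss + m_T^W > 60 GeV.
    met, met_phi = event["met"], event["met_phi"]
    if met <= 20.0:
        return False
    if met + transverse_mass_w(lep["pt"], lep["phi"], met, met_phi) <= 60.0:
        return False
    # At least one small-R jet with pT > 25 GeV close to the lepton.
    if not any(j["pt"] > 25.0 and delta_r(j, lep) < 1.5 for j in event["small_jets"]):
        return False
    # At least one large-R jet with pT > 200 GeV.
    if not any(j["pt"] > 200.0 for j in event["large_jets"]):
        return False
    # At least one b-tagged variable-radius jet (tagging decision assumed precomputed).
    return event["n_btagged_vr_jets"] >= 1
```

A lepton and missing transverse energy of 40 GeV each, back to back in azimuth, give a transverse mass of 80 GeV, close to the W boson mass, which is the configuration the combined requirement is designed to keep.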

The presence of a single lepton ensures that the signal t¯t and single-top production processes

provide a relatively pure sample of signal top and W jets by allowing these jets to be properly

reconstructed. This is especially important in t¯t processes, where one top quark must have

a leptonic decay while the other must have a hadronic decay in order to satisfy the event

selection criteria. This reduces the potential contamination and interference eﬀects that two

candidate signal jets might have in the jet reconstruction process. The presence of a single

lepton also has the eﬀect of signiﬁcantly reducing the Z+jets background. Events from this

process that pass the lepton selection criteria are likely to come from dileptonic decays of the

Z where one of the leptons is misreconstructed as another object. This artiﬁcially generates

additional missing transverse energy in the event. However, these events are suppressed with

the $E_T^{\text{miss}}$-related selection requirements. In QCD multijet processes, the presence of a single
lepton originates from a misreconstructed jet, resulting in the artificial generation of $E_T^{\text{miss}}$.
These events are also suppressed with the $E_T^{\text{miss}}$-related selection requirements. The large-R

jet pT requirement ensures that the jet captures the majority of the decay products of the

boosted particles within its radius. The presence of at least one b-tagged jet provides an

experimental handle for identifying signal processes in data where no truth information is

available. Additionally, as will be discussed in subsection 5.1.4, the b-tagged jet is required
in order to provide containment-based candidacy criteria for distinguishing between
signal top and W jets.

4.2 VLQ Searches

For the VLQ searches presented in Chapter 6, the signal processes are events in which these

yet undiscovered particles are produced. The VLQ production processes studied are the

single production of a vector-like top (T ) and the pair production of vector-like tops (T ¯T ).

Separate MC samples are used to model these two processes, including the reconstruction

of all the associated physics objects produced in the events of interest. The background

processes in these searches are SM processes with events that mimic the kinematic signatures

of events in which a T is produced. Processes that have a small fraction of events that pass the

event selection criteria are known as reducible backgrounds. On the other hand, background
processes that have a significant number of events passing the event selection criteria are known

as irreducible backgrounds.

As will be discussed in Chapter 6, analysis search regions are deﬁned that elicit kinematic

features of signal processes. A statistical analysis is performed in each T production search to

determine if there is a signiﬁcant excess of data events over the SM background prediction in

the analysis search regions. If such an excess is found, then a discovery claim can be made on

the production of T . The data sample used in these searches comes from data recorded from

pp collisions at a center of mass energy of 13 TeV in the period 2015-2018 and corresponds

to an integrated luminosity of 139 fb−1.

4.2.1 Signal Processes

The signal processes of interest that are studied in the vector-like T searches are the single

production of a T through the electroweak force and the pair production T ¯T through the

strong force. Only the single production process associated with a single electron or muon,

referred to as the 1-lepton channel, is studied. Pair production processes are studied in the

0-lepton and the 1-lepton channels. Example Feynman diagrams that highlight characteristic

features of these processes are shown in Figure 4.6.

Figure 4.6: Feynman diagrams depicting the single production of a vector-like T (a) and the
pair production $T\bar{T}$ (b).

The single production of a T is characterized by the simultaneous presence of several

unique objects. These objects can be used to experimentally identify single-T production

processes against the SM background. The initial quark q that recoils oﬀ the oﬀ-shell vector

boson can often emerge as a jet with high pseudorapidity. These jets, denoted as forward jets

(fj), are produced in the forward region of the detector. Additionally, the initial state gluon

g can split into b¯b or t¯t with one of the quarks coupling with the vector boson. Since the mass

of the top quark is larger than that of the bottom quark, the top-associated production mode

is kinematically disfavored. The T decay channels that are studied in the single production

process are T → Ht and T → Zt. The decay products of the T are expected to be boosted

as a result of its large mass. The associated lepton is likely to be produced from the leptonic

decay of the boosted top quark. The Higgs and Z bosons are expected to decay hadronically.

In the case of the T → Ht decay channel, the dominant b¯b decay of the Higgs characterizes

the signal process with a large presence of b-initiated jets.

The pair production process T ¯T lacks experimental handles such as the presence of

forward jets and an associated quark with the production. However, the number of boosted

objects increases due to the additional T that is produced. This characterizes the T ¯T

production process with interesting combinatorial decay topologies. In the 0-lepton channel,

the main decay topologies of interest are $T\bar{T} \to HtHt$, $HtZt$, and $ZtZt$.$^{3}$ An interesting
feature of the ZtZt decay topology is the presence of events in which at least one Z boson decays as
$Z \to \nu\bar{\nu}$. Since the T decays are boosted, this results in a large $E_T^{\text{miss}}$. In the 1-lepton
channel the main decay topologies of interest are $T\bar{T} \to HtHt$, $HtZt$ and $HtWb$. The

lepton is likely to be produced from a leptonically decaying top quark. Both the 0-lepton

and 1-lepton channels are characterized by a large number of b-initiated jets that originate

from the predominant H → b¯b decay and the top quark decays.

4.2.2 Background Processes

The main irreducible background process that can mimic the single and pair production

of vector-like T is t¯t production in association with additional jets (t¯t+jets). As can be

observed by comparing Figures 4.2 and 4.6, the decay topologies are very similar between

t¯t production and the different production mechanisms of the T. The presence of a boosted
Higgs boson in T production processes can aid in the discrimination between signal and
background events by correctly tagging a jet as a Higgs jet. However, top-initiated jets can
mimic Higgs jets if the full decay of the top is not contained in the jet. This can result in
the top jet having kinematic features similar to those of a Higgs jet, such as its mass, which
can potentially lead to mistagging the top jet as a Higgs jet.

$^{3}$ Throughout the remainder of this thesis, the notation HtHt is used to denote both $HtH\bar{t}$ and its charge
conjugate $H\bar{t}Ht$. A similar notation is used for all other $T\bar{T}$ decay topologies.

Single-top and V +jets production are subdominant background processes that can mimic

the signal in events that have few b-tagged jets. The reducible background processes are

diboson production, t¯t production in association with a vector or Higgs boson (t¯tV /H),

QCD multijet production, and the production of four top quarks (t¯tt¯t). These processes are

reduced signiﬁcantly with the event selection criteria and further kinematic requirements

from the analysis search regions, which will be discussed in Chapter 6.

4.2.3 Event Selection

The event selection for the search of a singly produced T is based on the following criteria:

• Exactly one lepton with pT > 30 GeV.

• At least 3 small-R jets.

• At least one DL1 77% working point b-tagged small-R jet.

• $E_T^{\text{miss}} > 20$ GeV and $E_T^{\text{miss}} + m_T^{W} > 60$ GeV.

The event selection criteria for the search of $T\bar{T}$ production in the 1-lepton channel are defined
similarly, but instead at least 5 small-R jets and 2 b-tagged jets are required. Additionally,
the b-tagging algorithm is switched from DL1 to DL1r [48], which is an optimized version
of the DL1 algorithm. The $E_T^{\text{miss}}$ requirements are used to suppress the QCD multijet
production background in the 1-lepton channel. The event selection criteria for the 0-lepton
channel of the pair production search are summarized as follows:

• Exactly zero leptons.

• At least 6 small-R jets.

• At least 2 DL1r 77% working point b-tagged small-R jets.

• $E_T^{\text{miss}} > 200$ GeV and $\Delta\phi_{\min}^{4j} > 0.4$.$^{4}$

In the 0-lepton channel, the source of missing transverse energy in QCD multijet events is

likely to be produced from jet energy mismeasurements. This implies that the most energetic

jets of these events are expected to be collinear with the missing transverse energy. Thus,

requiring a large azimuthal separation between the leading jets and the associated direction
of $E_T^{\text{miss}}$ can significantly reduce the QCD background.

$^{4}$ $\Delta\phi_{\min}^{4j}$ is the minimum azimuthal angle separation between the four leading-pT jets in the
event and the direction of the missing transverse energy.
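The azimuthal separation variable used in the 0-lepton selection can be computed as in the sketch below. The function names and the jet layout (dictionaries with `pt` and `phi` keys) are hypothetical; the logic follows the definition given above, taking the minimum separation between the missing transverse energy direction and each of the four leading-pT jets.

```python
import math

def delta_phi(phi_a, phi_b):
    """Absolute azimuthal separation, wrapped into [0, pi]."""
    return abs(math.atan2(math.sin(phi_a - phi_b), math.cos(phi_a - phi_b)))

def min_delta_phi_4j(jets, met_phi):
    """Minimum delta-phi between ETmiss and the four leading-pT jets.

    Assumes at least one jet is present; the 0-lepton selection requires six.
    """
    leading = sorted(jets, key=lambda j: j["pt"], reverse=True)[:4]
    return min(delta_phi(j["phi"], met_phi) for j in leading)
```

An event in which a mismeasured leading jet points along the missing transverse energy yields a value near zero and fails the 0.4 requirement, while jets beyond the four leading ones do not enter the calculation.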

Chapter 5

Tagging Top Quarks

As previously discussed in Chapter 2, one of the issues stemming from the Hierarchy Problem

is the low mass of the Higgs boson. Several BSM theories have been formulated with the

goal of solving the Hierarchy Problem by introducing new particles that would provide the

quantum loop corrections necessary to explain the value of the Higgs mass. For example, in

Composite Higgs models, the VLQs provide the mechanism for the Higgs boson to acquire its

mass. Since the masses of these potentially new particles are above the TeV scale, any direct

observation with the ATLAS detector is impossible due to their short lifetimes. However,

in most scenarios, these new particles are expected to decay into SM bosons and quarks,

which can be reconstructed as jets. If the jets are correctly matched to their source particle,

then one could use these jets as inputs to reconstruct the BSM particles that initiated the

decay chain. The process of associating a jet with its source particle, known as jet tagging,

plays an important role in many BSM analyses searching for new particles. This will become

evident in the discussion of Chapter 6, where jet tagging is embedded in several aspects of

the analysis strategy, such as the reconstruction of candidate VLQs and the deﬁnition of the

analysis search regions.

This Chapter is divided into two sections. The ﬁrst section serves as an introduction

to the concept of jet tagging. This is done through the discussion of the optimization and

calibration studies of boosted object taggers that are designed to tag jets as top jets and W

jets using information from jet substructure variables. The concept of jet substructure has

recently become very important in the design of jet tagging algorithms. A short overview of

the jet substructure variables that are relevant to the studies presented in this Chapter is also

included in the ﬁrst section. The second section discusses the optimization studies of two

tagging algorithms that are designed to identify top jets using information from topological

data analysis (TDA), which has not been used in the context of jet tagging before. An

overview of the TDA tools used is given in this section. This is followed by the optimization

studies of the two tagging algorithms, which are compared in performance with one of the

top taggers discussed in the ﬁrst section.

5.1 Jet Tagging with Jet Substructure

In this section, jet tagging is introduced through the studies performed on the optimization

and calibration of jet taggers designed to identify jets originating from boosted hadronically

decaying top quarks and W bosons. The physics processes of interest and the event selection

used in these studies were described in section 4.1. The author of this thesis contributed to

the calibration eﬀort as part of his ATLAS authorship qualiﬁcation task. The work culmi-

nated with the derivation of data to Monte Carlo (MC) scale factors that are used to calibrate

the signal eﬃciency of the taggers using the data collected in 2015-2017. The development

and results of the calibration eﬀort were published in an ATLAS internal note [56]. The

taggers studied in this section use information from jet substructure variables that quantify

how energy is distributed within the internal structure of jets. The jet tagging and jet sub-

structure concepts that are introduced in this section are also used in the following section

of this Chapter. Although the scope of the discussion is mostly limited to top quark and W

boson jet tagging, these concepts are applicable to the tagging of jets to other particles.

5.1.1 Jet Substructure Variables

As discussed in subsection 3.3.4, jets are complex structures that are constructed from the

detector response of hadrons that originate from the decay chains initiated by the hadroniza-

tion process. Diﬀerent types of particles can produce jets with certain characteristic radiation

patterns and structures inside the jet, which deﬁne the jet substructure. For example, a jet

originating from the hadronic decay of a top quark can be characterized by the presence of

three smaller subjets that originate from the b quark and the two quarks from the hadronic

decay of the W boson. Depending on the initial energy of the top quark, these substruc-

tures can have varying degrees of collimation that can have an eﬀect on the overall radiation

pattern and structure of the top-initiated jet. Jet substructure variables quantify how the

energy of the jet is distributed across its internal structure. This information allows us to

discriminate between diﬀerent types of jets based on the substructure that is present in them.

To achieve this, diﬀerent types of substructure variables analyze the energy distribution at

diﬀerent scales within the jet, as shown in Figure 5.1. For example, some variables quantify

the energy distribution using the topocluster constituents of jets as inputs. Other variables

take as input more complex substructures, such as subjets that are obtained by forming

smaller jets from constituents that are within a localized region of the jet. In the following,

a brief description of some of these substructure variables is given.

5.1.1.1 n-Subjettiness

The n-subjettiness variables τn [58, 59] are designed to measure how well the jet system

is represented with a substructure of n subjets. The subjets are obtained by applying the

Figure 5.1: Depiction of a jet with radius parameter ∆R and its internal structure. The
individual jet constituents are represented by the orange decagons. Diﬀerent distance scales
are shown between the jet constituents that are used to deﬁne jet substructure variables.
The distance between an individual constituent i and the jet axis, which is deﬁned as the
direction of the net momentum of the jet, is denoted as ∆Ra1,i. The “winner-takes-all” (wta)
axis is deﬁned as the direction of the constituent with the largest pT in the jet. The distance
between a jet constituent i and the wta axis is denoted as ∆Rwta1,i. Finally, the distance
between two jet constituents i and j is denoted as ∆Rij. This ﬁgure is taken from [57].

kT clustering algorithm on the jet constituents. This process is stopped once n subjets are

deﬁned. The n-subjettiness variables calculate the sum of the pT-weighted distances between

all jet constituents and the closest subjet:

\[
\tau_n = \frac{1}{p_{\mathrm{T}}^{\mathrm{ref}}} \sum_i p_{\mathrm{T},i} \, \min\{\Delta R_{i,1}, \cdots, \Delta R_{i,n}\} \tag{5.1}
\]

where the summation index i runs over the jet constituents. The distance in the η-φ plane

between the ith constituent and the jth subjet is denoted by ∆Ri j. Speciﬁcally, these are

the distances between the constituents and the axis of the subjet, which is deﬁned as the

direction of the net momentum of the subjet. These variables are weighted by a reference

pT scale that is deﬁned as:

\[
p_{\mathrm{T}}^{\mathrm{ref}} = \sum_i p_{\mathrm{T},i} \, R_0 \tag{5.2}
\]

where $R_0$ is the radius parameter of the jet. A special sub-family of these variables uses the
“winner-takes-all” ($\tau_n^{\mathrm{wta}}$) configuration [60], where the distances $\Delta R_{i,j}$ are taken between
the ith constituent of the jet and the constituent with the largest pT within the jth subjet.
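Equations 5.1 and 5.2 can be evaluated directly once the subjet axes are known. The following NumPy sketch assumes that the axes have already been obtained (for example from an exclusive kT clustering, which is not implemented here); the function names and the array layouts are hypothetical choices for this illustration.

```python
import numpy as np


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane, with the azimuthal difference wrapped."""
    dphi = (phi1 - phi2 + np.pi) % (2.0 * np.pi) - np.pi
    return np.hypot(eta1 - eta2, dphi)


def n_subjettiness(constituents, axes, r0=1.0):
    """Evaluate tau_n for a given set of n subjet axes.

    constituents: array-like of shape (N, 3) with columns (pt, eta, phi).
    axes:         array-like of shape (n, 2) with columns (eta, phi);
                  assumed to be provided by an external clustering step.
    r0:           jet radius parameter entering the reference pT scale.
    """
    c = np.asarray(constituents, dtype=float)
    a = np.asarray(axes, dtype=float)
    pt, eta, phi = c[:, 0], c[:, 1], c[:, 2]
    # Distance of every constituent to every axis, shape (N, n).
    dr = delta_r(eta[:, None], phi[:, None], a[None, :, 0], a[None, :, 1])
    p_ref = np.sum(pt) * r0
    return float(np.sum(pt * dr.min(axis=1)) / p_ref)
```

Passing the direction of the hardest constituent of each subjet as the axis would give the winner-takes-all variant under the same assumptions.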

As an example, boosted jets originating from QCD multijet processes can have radiation
patterns that consist of large-angle soft splittings. This results in jet constituents that are

spread apart with relatively low pT. Thus, for small values of n, the values of τn will be larger

in these jets due to the majority of constituents being farther away from the subjet axes. In

contrast, boosted jets that arise from the hadronic decays of W bosons and top quarks are

highly collimated. This results in well-deﬁned pronged-like structures that coincide with the

direction of the decays of these particles, as shown in Figure 5.2. In the case of W jets, 1-

and 2-pronged substructures are usually formed depending on the level of collimation. On

the other hand, for top jets, 2- and 3-pronged substructures are usually formed. Since the

jet constituents are close to the axes of these substructures, this is reﬂected in smaller values

of τn for the corresponding n-prong structure.

Figure 5.2: Depiction of pronged-like structures that are formed from a jet initiated by a
boson. The degree of collimation of these structures is dependent on the momentum of the
boson. At lower momentum, the individual jets that arise from the hadronic decays of the
boson are resolved. As the momentum of the boson increases, the jets from the decays of the
boson start to become more collimated to the point where they can be considered subjets of
a large-R jet. The pronged-like structures coincide with the direction of the subjets in the
large-R jet. This ﬁgure is taken from [61].

The ratios of consecutive n-subjettiness variables, τab = τa/τb, are used to discriminate

jets based on the relative likelihood of being represented by an n-pronged structure. The

comparisons of the distribution of τ32 between top and QCD jets and the distribution of τ21

between W and QCD jets are shown in Figure 5.3. As observed in the τ32 distribution, top

jets are better represented by a 3-pronged structure relative to a 2-pronged structure when

compared to QCD jets. Similarly, as observed in the τ21 distribution, W jets are better

represented by a 2-pronged structure instead of a single prong structure when compared to

QCD jets.

(a)

(b)

Figure 5.3: The τ32 (a) distribution for top and QCD jets with R0 = 0.8 and the τ21 (b)
distribution for W and QCD jets with R0 = 0.6. The selection criteria for these jets require
pT > 300 GeV and |η| < 1.3. Additionally, a window cut on the jet mass that selects jets
that have a mass close to the top quark and W boson mass is applied. These ﬁgures are
taken from [58].

5.1.1.2 kT Splitting Scales

The next set of substructure variables are the kT clustering algorithm splitting scales [62].

These scales, which are denoted as $\sqrt{d_{n,n+1}}$, are defined as the smallest kT distance between
two subjets before they get merged during the clustering step from n + 1 to n subjets. The
process of obtaining these variables can be thought of as a declustering of the jet that probes
the original constituent structure of the jet. As an example, for W jets, the scale $\sqrt{d_{12}}$
should be approximately equal to half the mass of the W boson. This is because the jet
energy is distributed almost equally between the two subjets that correspond to the two
quarks from the decay of the W boson.
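A simplified picture of how the splitting scales arise is given by the Python sketch below, which reclusters the jet constituents by repeatedly merging the pair with the smallest kT distance and records the scale at each merging step. It is a toy under stated assumptions: the beam distance of the full kT algorithm is ignored, four-momenta are simply added, and all function names are hypothetical.

```python
import math
from itertools import combinations


def to_cartesian(pt, eta, phi, mass=0.0):
    """Convert (pt, eta, phi, mass) to a (px, py, pz, E) tuple."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (px, py, pz, math.sqrt(px * px + py * py + pz * pz + mass * mass))


def pt_y_phi(p):
    """Transverse momentum, rapidity and azimuth of a four-momentum."""
    px, py, pz, e = p
    return math.hypot(px, py), 0.5 * math.log((e + pz) / (e - pz)), math.atan2(py, px)


def kt_distance(p1, p2):
    """kT distance d_ij = min(pT_i, pT_j)^2 * DeltaR_ij^2."""
    pt1, y1, phi1 = pt_y_phi(p1)
    pt2, y2, phi2 = pt_y_phi(p2)
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return min(pt1, pt2) ** 2 * ((y1 - y2) ** 2 + dphi ** 2)


def kt_splitting_scales(constituents):
    """Return {n: sqrt(d_{n,n+1})} for every merging step from n+1 to n subjets.

    constituents: list of (pt, eta, phi) tuples, treated as massless.
    """
    pseudojets = [to_cartesian(*c) for c in constituents]
    scales = {}
    while len(pseudojets) > 1:
        # Find the closest pair in the kT metric (beam distance ignored).
        i, j = min(combinations(range(len(pseudojets)), 2),
                   key=lambda ij: kt_distance(pseudojets[ij[0]], pseudojets[ij[1]]))
        scales[len(pseudojets) - 1] = math.sqrt(kt_distance(pseudojets[i], pseudojets[j]))
        merged = tuple(a + b for a, b in zip(pseudojets[i], pseudojets[j]))
        pseudojets = [p for k, p in enumerate(pseudojets) if k not in (i, j)] + [merged]
    return scales
```

In the returned dictionary, the entry with key 1 corresponds to $\sqrt{d_{12}}$ and the entry with key 2 to $\sqrt{d_{23}}$.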

5.1.1.3 n-Point Energy Correlation Functions

Similar to the n-subjettiness variables, the set of n-point energy correlation functions en [63,

64] are designed to probe for n-pronged structures in jets. However, this contrasts with

n-subjettiness in that en quantiﬁes the jet energy distribution relative to the jet constituents

instead of subjets. Mathematically, the n-point energy correlation functions for n = 2 and

n = 3 are deﬁned as:

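\[
\begin{aligned}
e_2 &= \frac{1}{(p_{\mathrm{T}}^{\mathrm{ref}})^2} \sum_{1 \le i < j \le N_{\mathrm{const}}} p_{\mathrm{T},i} \, p_{\mathrm{T},j} \, \Delta R_{i,j} \\
e_3 &= \frac{1}{(p_{\mathrm{T}}^{\mathrm{ref}})^3} \sum_{1 \le i < j < k \le N_{\mathrm{const}}} p_{\mathrm{T},i} \, p_{\mathrm{T},j} \, p_{\mathrm{T},k} \, \Delta R_{i,j} \, \Delta R_{i,k} \, \Delta R_{j,k}
\end{aligned}
\tag{5.3}
\]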

where the summation runs over the constituents in the jet, $N_{\mathrm{const}}$ is the number of
constituents in the jet, and $p_{\mathrm{T}}^{\mathrm{ref}}$ is defined similarly as in Equation 5.2. The only relevant energy

correlation functions for the discussion in this Chapter are e2 and e3. The deﬁnition of

higher order energy correlation functions follows a natural generalization from Equation 5.3.

The phase space of e2 and e3 separates jets with single-pronged substructures and two-

pronged substructures into diﬀerent regions. As an example, we consider the case of QCD

jets with a single-pronged substructure and Z jets with a two-pronged substructure, as shown

in Figure 5.4. A single-pronged QCD jet is usually characterized by collinear radiation that

(a)

(b)

Figure 5.4: Depiction of the radiation patterns of a single-pronged jet (a) and a two-pronged
jet (b). The radiation pattern of the single-pronged jet is characterized by a collinear emission
core that carries most of the pT of the jet and is localized within a small angular region Rcc
of the jet. Additionally, soft radiation may be present from gluon and quark emissions,
which carries a small fraction zs of the jet pT. The radiation pattern of the two-pronged
jet is characterized by two collinear emission cores that are localized within a small angular
region Rcc and separated by an angular distance R12. In addition to global soft radiation,
there also may be collinear soft-radiation present that is localized within an angular region
of size R12. These ﬁgures are taken from [64].

is localized within a small angular region of the jet ($R_{cc} \ll 1$). Additionally, the soft
radiation in QCD jets from gluon and quark emissions carries a low fraction of the jet pT
($z_s = p_{\mathrm{T}}^{\mathrm{soft}} / p_{\mathrm{T}}^{\mathrm{jet}} \ll 1$). With these limits in consideration for
QCD jets, the n-point energy correlation functions in Equation 5.3 can be shown to behave
approximately as:

\[
\begin{aligned}
e_2 &\approx R_{cc} + z_s \\
e_3 &\approx R_{cc}^3 + z_s^2 + R_{cc} z_s
\end{aligned}
\tag{5.4}
\]

If the soft radiation in QCD jets has the largest contribution in Equation 5.4, then $e_2 \approx z_s$
and $e_3 \approx e_2^2$. On the other hand, if the collinear radiation from the single prong has the
largest contribution, then $e_2 \approx R_{cc}$ and $e_3 \approx e_2^3$. Thus, under these limits, QCD jets with
a single-prong substructure populate the region of phase space where $e_2^3 \le e_3 \le e_2^2$. Jets
with a two-pronged substructure, such as a Z-initiated jet, are usually characterized by the
angular size of their collinear emissions satisfying $R_{cc} \ll R_{12} \ll 1$, where $R_{12}$ is the angular
separation between the two prong structures. In addition to soft radiation, there may also
be collinear soft radiation that is localized in an angular region of size $R_{12}$ with pT fraction
$z_{cs}$. With these limits in consideration for a two-pronged jet, the n-point energy correlation
functions in Equation 5.3 can be shown to behave approximately as:

\[
\begin{aligned}
e_2 &\approx R_{12} \\
e_3 &\approx R_{12} z_s + R_{12}^2 R_{cc} + R_{12}^3 z_{cs}
\end{aligned}
\tag{5.5}
\]

If the jet has negligible non-collinear soft radiation, then most of its energy is carried by
the two prong substructures and $z_{cs} \ll 1$. Under these limits, the jet populates the region
of phase space where $e_2^3 \approx R_{12}^3 \gg R_{12}^3 z_{cs} \approx e_3$. The ratios of n-point energy correlation
functions $C_2 = e_3 / e_2^2$ and $D_2 = e_3 / e_2^3$ provide a relative measure of how well a jet is

characterized with two prong substructures compared to a single prong substructure. The

distribution of these two ratios compared between Z jets and QCD jets is shown in Figure 5.5.

As can be observed, Z jets populate the regions C2 < 1 and D2 < 1, which is characteristic

of a two-pronged jet, while QCD jets populate the regions C2 < 1 and D2 > 1, which is

characteristic of a single-pronged jet.
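The definitions of $e_2$, $e_3$, $C_2$, and $D_2$ above translate into a short brute-force computation over pairs and triplets of constituents. The sketch below is illustrative only: the function names and the `(pt, eta, phi)` tuple layout are assumptions, the reference scale is taken as the pT sum multiplied by `r0` following Equation 5.2, and the jet is assumed to have at least two separated constituents so that $e_2$ is non-zero.

```python
import math
from itertools import combinations


def delta_r(a, b):
    """Distance in the eta-phi plane between two (pt, eta, phi) constituents."""
    dphi = (a[2] - b[2] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[1] - b[1], dphi)


def energy_correlation_ratios(constituents, r0=1.0):
    """Return (e2, e3, C2, D2) for a list of (pt, eta, phi) constituents."""
    p_ref = r0 * sum(c[0] for c in constituents)
    # Two-point function: sum over all unordered pairs.
    e2 = sum(a[0] * b[0] * delta_r(a, b)
             for a, b in combinations(constituents, 2)) / p_ref ** 2
    # Three-point function: sum over all unordered triplets.
    e3 = sum(a[0] * b[0] * c[0] * delta_r(a, b) * delta_r(a, c) * delta_r(b, c)
             for a, b, c in combinations(constituents, 3)) / p_ref ** 3
    return e2, e3, e3 / e2 ** 2, e3 / e2 ** 3
```

The triplet sum scales with the cube of the number of constituents, which is acceptable for a sketch but would likely need a vectorized implementation for large samples.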

(a)

(b)

Figure 5.5: Comparison of the energy correlation function ratios C2 (a) and D2 (b) between
Z large-R jets and QCD large-R jets. The selection criteria for the large-R jets require
the jet mass to be less than 100 GeV and the jet pT > 400 GeV. These ﬁgures are taken
from [64].

5.1.1.4 Reconstructed Substructure Mass

The mass of boosted jets originating from particles such as W bosons and top quarks holds

discriminatory power against background jets that originate from other particles. This is

due to the mass of the jet being close in value to the mass of the particle that originated it.

However, if jets are suﬃciently boosted or misreconstructed, they can have mass values that

diﬀer from the mass of their source particle, which can lead to the misidentiﬁcation of both

signal and background jets. The last set of substructure variables to be considered are the

invariant masses of reconstructed substructures within the jet [65]. Candidate substructures

can be reconstructed by combining subjets that satisfy a hypothesis based on the radiation

pattern of the desired substructure and achieve an invariant mass that is close to the real

mass value of the substructure. For example, in jets that arise from the decay products of

the top quark, the W boson can be reconstructed by pairing two subjets that are relatively

close and result in an invariant mass close to the mass of the W boson. If this procedure

is applied to jets originating from particles that do not contain a W boson in their decay

products, then the resulting invariant mass can diﬀer signiﬁcantly from the W mass as a

result of reconstructing a W boson from inconsistent radiation patterns.
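The idea of reconstructing a W boson candidate inside a jet can be illustrated with a toy pairing of subjets. The sketch below simply selects the pair of subjets whose invariant mass is closest to an approximate W boson mass; it omits the requirement that the paired subjets be close to each other, and it is not meant to reproduce the procedure of [65]. The function names and the `(pt, eta, phi, mass)` layout are hypothetical.

```python
import math
from itertools import combinations

W_MASS = 80.4  # GeV, approximate value used only for this illustration


def four_vector(pt, eta, phi, mass=0.0):
    """Convert (pt, eta, phi, mass) to a (px, py, pz, E) tuple."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (px, py, pz, math.sqrt(px * px + py * py + pz * pz + mass * mass))


def invariant_mass(p1, p2):
    """Invariant mass of the sum of two four-momenta."""
    px, py, pz, e = (a + b for a, b in zip(p1, p2))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def best_w_candidate(subjets):
    """Return (mass, (i, j)) for the subjet pair closest in mass to W_MASS.

    subjets: list of (pt, eta, phi, mass) tuples with at least two entries.
    """
    vectors = [four_vector(*s) for s in subjets]
    candidates = ((invariant_mass(vectors[i], vectors[j]), (i, j))
                  for i, j in combinations(range(len(vectors)), 2))
    return min(candidates, key=lambda cand: abs(cand[0] - W_MASS))
```

For a jet without a genuine W boson decay, the best pair returned by such a procedure would typically still have a mass far from the W boson mass, which is the source of the discrimination described above.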

5.1.2 Jet Substructure Taggers

From the previous discussion, it is clear that the diﬀerent types of jet substructure variables

hold signiﬁcant discriminatory power to diﬀerentiate between signal and background jets.

However, in certain kinematic regimes, these variables by themselves may not provide suﬃ-

cient information in order to classify jets. For example, one could consider a highly boosted

top quark jet in which the decays of the W boson are extremely collimated. In this scenario,

the top jet is better represented with a 2-prong structure, which can result in similar values

of τ32 to those of a W jet with a resolved structure. In order to fully exploit the information

from these substructure variables and their correlations, three taggers were optimized for

tagging jets to top quarks and W bosons [66]. The optimization of these taggers was per-

formed using the MC samples from the HVT model and QCD multijet production processes

described in section 4.1 as sources of signal and background jets, respectively.

Two substructure-based deep neural network (DNN) taggers were optimized for tagging

jets to top quarks. One tagger, known as the contained DNN tagger, is designed to tag

jets that contain the full decay products of the top quark. The other tagger, known as the

inclusive DNN tagger, is designed to tag jets regardless of the full containment of the top

decays in the jet. Both DNN taggers use the same set of input variables, which are: the
jet mass, the jet pT, the n-subjettiness variables $\tau_1^{\mathrm{wta}}$, $\tau_2^{\mathrm{wta}}$, $\tau_3^{\mathrm{wta}}$, $\tau_{21}^{\mathrm{wta}}$, and $\tau_{32}^{\mathrm{wta}}$; the 3-point
energy correlation function $e_3$ and the ratios $C_2$ and $D_2$; the kT splitting scales $\sqrt{d_{12}}$
and $\sqrt{d_{23}}$; and the invariant mass of the reconstructed W boson that originates from the

decay of the top quark. These variables are then combined into a single output variable

that quantiﬁes the probability of the jet originating from a top quark. A selection cut can

be deﬁned on the output variable that divides it into two regions: a top jet region and a

background jet region, which allows for a binary classiﬁcation of jets.

For W jet tagging, a three-variable tagger was optimized. This tagger

takes as input the jet mass, the energy correlation function ratio D2, and the number of

tracks, ntrack, that are associated with the jet. The tagger deﬁnes selection cuts, which are

parametrized by the jet pT, on the input variables. These cuts split the phase space into a

W jet region and a background jet region. Jets that pass the selection criteria of the W jet

region are then tagged as a W jet.
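The structure of such a cut-based tagger can be sketched as follows. The functional forms and every numerical value in this example are hypothetical placeholders chosen for illustration; they are not the parametrizations derived in the calibration work. The sketch assumes a mass window and upper bounds on $D_2$ and ntrack.

```python
def mass_window(pt):
    """Hypothetical pT-dependent jet mass window in GeV (placeholder numbers)."""
    half_width = 15.0 + 0.005 * max(pt - 200.0, 0.0)
    return 80.4 - half_width, 80.4 + half_width


def d2_upper_bound(pt):
    """Hypothetical pT-dependent upper bound on D2 (placeholder numbers)."""
    return 1.0 + 0.0005 * pt


def is_w_tagged(pt, mass, d2, n_track, n_track_max=26):
    """Return True if the jet falls in the W jet region of the three cuts.

    n_track_max is an illustrative placeholder for a constant upper bound.
    """
    low, high = mass_window(pt)
    return low < mass < high and d2 < d2_upper_bound(pt) and n_track <= n_track_max
```

A jet is tagged only if all three requirements are satisfied simultaneously, so the background jet region is simply the complement of the W jet region.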

5.1.3 Jet Truth Labeling

Labeling criteria for jets in events from MC signal processes must be provided in order to
properly define and optimize the signal efficiency of the taggers. These criteria define
which jets are signal-like and which are background-like for each tagger. As discussed in
subsection 3.3.8, two types of jets can be obtained from MC events: truth jets
(Jtruth) and reconstructed jets (Jreco). The reconstructed jets that are used in the tagging
studies presented in this Chapter are obtained by applying the anti-kT clustering algorithm
with a radius parameter R = 1.0 to the topoclusters obtained from the simulated detector
response to the truth particles. The taggers are calibrated using Jreco, so
the labeling criteria are applied to these jets. The labeling procedure starts by spatially
matching Jtruth to a hadronically decaying truth top quark or W boson by requiring that
∆R(Jtruth, truth particle) < 0.75. Next, a Jreco is spatially matched to a Jtruth that has
been matched to a truth particle using the same distance criterion. In the case of the DNN

top taggers, Jreco is labeled as an inclusive top jet if it is matched to a Jtruth that is matched
to a truth top. If, in addition to being matched to a truth top, the mass of Jtruth satisfies
$m_{J_{\mathrm{truth}}} > 140$ GeV and Jtruth has at least one associated b-hadron [67], then Jreco is labeled
as a contained top jet. In the case of the W tagger, Jreco is labeled as a W jet if it is matched
to a Jtruth that is matched to a truth W with a mass that satisfies $50 < m_{J_{\mathrm{truth}}} < 100$ GeV
and Jtruth has zero associated b-hadrons. If Jreco fails the signal label criteria for a given
tagger, then it is labeled as a background jet for that tagger.
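The labeling logic described in this subsection can be summarized in a compact Python sketch. The data layout, the class and function names, and the way truth particles are passed in are assumptions made for this illustration; only the thresholds (∆R < 0.75, the mass requirements, and the b-hadron counts) are taken from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class TruthJet:
    eta: float
    phi: float
    mass: float        # GeV
    n_b_hadrons: int   # number of associated b-hadrons


def delta_r(eta1, phi1, eta2, phi2):
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def label_reco_jet(reco, truth_jets, truth_particles, dr_max=0.75):
    """Return the signal labels satisfied by a reconstructed jet.

    reco:            (eta, phi) of the reconstructed jet.
    truth_particles: list of (kind, eta, phi) with kind "top" or "W",
                     assumed to be hadronically decaying.
    """
    labels = {"inclusive_top": False, "contained_top": False, "w": False}
    for tj in truth_jets:
        if delta_r(reco[0], reco[1], tj.eta, tj.phi) >= dr_max:
            continue  # reconstructed jet not matched to this truth jet
        for kind, eta, phi in truth_particles:
            if delta_r(tj.eta, tj.phi, eta, phi) >= dr_max:
                continue  # truth jet not matched to this truth particle
            if kind == "top":
                labels["inclusive_top"] = True
                if tj.mass > 140.0 and tj.n_b_hadrons >= 1:
                    labels["contained_top"] = True
            elif kind == "W" and 50.0 < tj.mass < 100.0 and tj.n_b_hadrons == 0:
                labels["w"] = True
    return labels
```

A jet for which the label relevant to a given tagger is False would be treated as a background jet for that tagger.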

5.1.4 Tagger Signal Eﬃciency Optimization

The signal eﬃciency of a boosted object tagger in MC events is deﬁned as the fraction of

signal jets that are correctly tagged:

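\[
\epsilon_{\mathrm{MC}}(p_{\mathrm{T}}) = \frac{N^{\mathrm{tagged}}_{\mathrm{signal}}(p_{\mathrm{T}})}{N^{\mathrm{tagged}}_{\mathrm{signal}}(p_{\mathrm{T}}) + N^{\mathrm{not\ tagged}}_{\mathrm{signal}}(p_{\mathrm{T}})} \tag{5.6}
\]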
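As a concrete illustration of Equation 5.6, the efficiency can be computed per pT bin from two histograms of the signal jets. The function name, the array inputs, and the bin edges used in the example are assumptions for this sketch; per-jet event weights are omitted for simplicity.

```python
import numpy as np


def signal_efficiency(pt, is_tagged, bin_edges):
    """Fraction of signal jets that are tagged, evaluated in bins of jet pT.

    pt:        pT values of the signal jets (GeV).
    is_tagged: boolean flags, True if the tagger accepted the jet.
    Bins without signal jets are returned as NaN.
    """
    pt = np.asarray(pt, dtype=float)
    tagged = np.asarray(is_tagged, dtype=bool)
    n_tagged, _ = np.histogram(pt[tagged], bins=bin_edges)
    n_total, _ = np.histogram(pt, bins=bin_edges)  # tagged + not tagged
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(n_total > 0, n_tagged / n_total, np.nan)
```

For example, with hypothetical bin edges `[350, 500, 1000]` GeV, two jets in the first bin of which one is tagged give an efficiency of 0.5 in that bin.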

The eﬃciency is evaluated in bins of the jet pT that are designed to contain jets that are

kinematically similar. The binning used for each tagger will be listed in subsection 5.1.5.

Two ﬁxed signal eﬃciency working points were optimized for the taggers studied. The loose

working point requires that the signal eﬃciency be 80% in all pT bins considered by the

tagger. This working point is designed to maximize the acceptance of signal jets, but at

the cost of a higher mistag rate for background jets. The tight working point requires that

the signal eﬃciency be 50% in all pT bins. This working point is designed to reject a larger

fraction of background jets, but at the cost of lower signal jet acceptance. For the DNN

top taggers the working points are obtained by deﬁning cuts on the DNN output variable

that achieve the desired eﬃciency in each pT bin. This procedure is extended for the W

tagger by deﬁning simultaneous cuts on the jet mass, D2, and ntrack that achieve the desired

eﬃciency. These cuts are then parametrized by performing a functional ﬁt as a function of

the jet pT [66]. The results of these ﬁts for the mass and D2 are shown in Figure 5.6. The

jet mass ﬁt deﬁnes a window cut, while the D2 ﬁt deﬁnes an upper bound cut. In the case

of ntrack, the ﬁts were found to be consistent with a constant upper bound cut of 26 for the

50% working point and 34 for the 80% working point.
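For a tagger with a single output score, a fixed signal efficiency working point can be sketched as a quantile computation in each pT bin. The example below is a minimal illustration that assumes unweighted signal jets and a score for which larger values are more signal-like; the function name and inputs are hypothetical, and the subsequent functional fit of the cut values versus pT is not included.

```python
import numpy as np


def fixed_efficiency_cuts(pt, score, bin_edges, target_eff=0.80):
    """Score threshold per pT bin that keeps a fraction target_eff of signal jets.

    Jets with score >= threshold are considered tagged. Bins without
    signal jets are returned as NaN.
    """
    pt = np.asarray(pt, dtype=float)
    score = np.asarray(score, dtype=float)
    cuts = np.full(len(bin_edges) - 1, np.nan)
    for k in range(len(bin_edges) - 1):
        in_bin = (pt >= bin_edges[k]) & (pt < bin_edges[k + 1])
        if np.any(in_bin):
            # The (1 - eff) quantile leaves a fraction eff of jets above the cut.
            cuts[k] = np.quantile(score[in_bin], 1.0 - target_eff)
    return cuts
```

Calling the function with `target_eff=0.50` instead would give the thresholds of a tighter working point under the same assumptions.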

(a)

(c)

(b)

(d)

Figure 5.6: The jet mass window cuts and the D2 upper bound cuts that deﬁne the tagging
criteria as a function of the jet pT for the 50% ﬁxed signal eﬃciency working point W
tagger (a) - (b) and the 80% ﬁxed signal eﬃciency working point W tagger (c) - (d). The
solid and dotted lines in each plot indicate the parametrized tagging criteria as a function
of the jet pT. These ﬁgures are taken from [56].

The performance of the taggers is assessed using the background rejection metric. For a
given fixed signal efficiency working point, the background rejection is defined as the inverse
of the background efficiency, $1/\epsilon_{\mathrm{bkg}}$, where $\epsilon_{\mathrm{bkg}}$ is the fraction
of background jets that are tagged as the particle of interest. Figure 5.7 shows the QCD

multijet background rejection as a function of the jet pT for the DNN top taggers and the

W tagger. For the W tagger, it can be observed that the rejection is maximal in the 800-

1000 GeV pT interval, which coincides with a narrow jet mass window requirement, as shown

in Figure 5.6. Additionally, the D2 requirement in this pT interval is low, which is indicative

of a jet with a 2-pronged structure. In the case of top tagging, the additional requirements

imposed on candidate top jets for the contained tagger result in a higher background rejection

when compared to the inclusive tagger.
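Assuming that the rejection is quantified as the inverse of the background efficiency, $1/\epsilon_{\mathrm{bkg}}$, as indicated on the axes of Figure 5.7, it can be computed from simple jet counts. The function name and the handling of the zero-count case are choices made for this sketch.

```python
def background_rejection(n_bkg_tagged, n_bkg_total):
    """Return 1 / eps_bkg, with eps_bkg = n_bkg_tagged / n_bkg_total."""
    if n_bkg_total <= 0:
        raise ValueError("at least one background jet is required")
    if n_bkg_tagged == 0:
        return float("inf")  # no background jet was mistagged
    return n_bkg_total / n_bkg_tagged
```

With these conventions, a tagger that mistags 5 out of 1000 background jets has a rejection of 200.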

(a)

(b)

Figure 5.7: The QCD multijet background rejections for the W tagger (a) and the contained
and inclusive top DNN taggers (b) overlaid for the 50% and 80% fixed signal efficiency
working points as a function of the jet pT. The jets are required to have a pT > 200 GeV
in (a) and pT > 350 GeV in (b) to be in the validity range of the W and top taggers,
respectively. These ﬁgures are taken from [56].

5.1.5 Tagger Signal Eﬃciency Calibration

In order to calibrate the signal efficiency of the boosted object taggers in MC events to that

of data, the signal eﬃciency needs to be extracted from MC simulations of SM processes.

As described in subsection 4.1.1, the t¯t and single-top production processes are used as a

source of candidate signal jets for the boosted object taggers. The events used for the tagger

calibrations are required to have at least one Jreco with pT > 200 GeV and at least one

b-tagged small-R jet (jb). The Jreco with the largest pT is selected as a candidate top jet if it

satisﬁes the containment criteria ∆R(Jreco, jb) < 1.0, which corresponds to the topology of

a jet initiated by a top quark. On the other hand, if the jet does not satisfy the containment

criteria, then it is selected as a candidate W jet. Candidate MC jets that satisfy the signal

labeling criteria described in subsection 5.1.3 are then considered signal jets that are used to extract

the signal eﬃciency of the corresponding tagger. The same procedure outlined for selecting

candidate top and W jets in MC is also applied to jets in data.
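The candidate selection described above can be sketched as follows. The function name and tuple layouts are hypothetical, and the treatment of events with several b-tagged jets (the containment criterion is considered satisfied if any b-tagged jet is within ∆R < 1.0 of the leading jet) is an assumption of this example.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def select_candidate(large_r_jets, b_jets, pt_min=200.0, dr_contain=1.0):
    """Classify the leading large-R jet as a top or W candidate.

    large_r_jets: list of (pt, eta, phi) tuples for reconstructed large-R jets.
    b_jets:       list of (eta, phi) tuples for b-tagged small-R jets.
    Returns ("top" or "W", jet), or None if the event is not selected.
    """
    passing = [jet for jet in large_r_jets if jet[0] > pt_min]
    if not passing or not b_jets:
        return None
    leading = max(passing, key=lambda jet: jet[0])
    contained = any(delta_r(leading[1], leading[2], eta, phi) < dr_contain
                    for eta, phi in b_jets)
    return ("top" if contained else "W", leading)
```

The same function could be applied to data and MC events alike, since it uses only reconstructed quantities.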

The MC modeling of data is ﬁrst assessed in the input variables of the taggers prior to

evaluating the tagger eﬃciencies with data. Several sources of systematic uncertainties are

included in the evaluation of the MC modeling. These uncertainties originate from the theory

assumptions made in the MC predictions of the samples used and from the reconstruction

and calibrations of relevant physics objects. The uncertainties are grouped as follows:

• t¯t modeling uncertainties: These are the uncertainties in the modeling of the signal t¯t

process. These uncertainties are associated with the choice of the MC generator used

to model the t¯t process, the modeling of the hadronization process, and the modeling of

the initial state radiation (ISR) and ﬁnal state radiation (FSR). These uncertainties are

estimated by comparing the nominal MC sample that is used to model the t¯t process

with alternative MC samples obtained by varying the MC generator algorithms and

modeling parameters, as described in Appendix A.

• Theory uncertainties: These are the uncertainties on the cross-section of the t¯t, single-

top, and W +jets production processes.

• Large-R jet uncertainties: These are the uncertainties in the calibration of the jet

energy scale and resolution and the jet mass scale and resolution.

• Flavor tagging uncertainties: These are the uncertainties in the eﬃciency of tagging

jets to b-, c-, and light-quarks. Additionally, uncertainties on the data to MC scale

factors that are used to calibrate the eﬃciency of ﬂavor tagging are also taken into

account.

• Other experimental uncertainties: These are the uncertainties in the luminosity mea-

surement for the 2015-2017 dataset and the detector response to leptons and missing

transverse energy.

The t¯t modeling uncertainties are expected to be the largest source of uncertainty in the

calibration of the taggers. Since these uncertainties vary the hadronization, the ISR, and

the FSR models of the signal t¯t process, the simulated detector response will also vary,

thereby varying the jet reconstruction process. Variations in the reconstructed jets can then

impact the outcome of the jet labeling procedure and the diﬀerent tagger input variables,

resulting in variations in the signal eﬃciency measurement.

Figure 5.8 shows the distributions of the DNN scores and the jet mass for candidate top

jets compared between data and the total MC prediction. As can be observed, the MC sim-

ulation models the data well. All diﬀerences between data and MC in the regions dominated

by signal jets are within the t¯t modeling uncertainties. Figure 5.9 shows the distributions of

D2, ntrack, and jet mass for candidate W jets. The D2 and ntrack distributions are shown

in W -enhanced regions by requiring that the jet mass be in the [65, 95] GeV interval. In

addition to the jet mass cut, a requirement of D2 < 1.2 is included in the ntrack distribu-

tion. As can be observed, the mass distribution shows good agreement between data and

MC. The D2 and ntrack distributions show large differences between data and MC,
although these differences are within the total uncertainty considered.

(a)

(b)

(c)

Figure 5.8: Comparisons between data and MC of the contained DNN score (a), the inclusive
DNN score (b), and the jet mass (c) distributions for candidate top jets. The candidate top
jets from MC signal processes that pass the signal top jet labeling criteria are indicated
as t¯t (top) and Single Top (top). The contained top labeling criteria is used in (a), while
the inclusive top labeling criteria is used in (b) - (c), as described in subsection 5.1.3. The
candidate top jets from MC signal processes that fail the corresponding top labeling criteria
are indicated as t¯t (other) and Single Top (other). The bottom panel in each plot shows the
ratio of data to the total MC prediction for each bin of the distributions. The dark green
band represents the statistical uncertainty, the red line the total t¯t modeling systematic
uncertainty, and the light green band the total uncertainty for each bin.

(a)

(b)

(c)

Figure 5.9: Comparisons between data and MC of the D_2 (a), the n_track (b), and the
mass (c) distributions for candidate W jets. The candidate W jets from MC tt̄ (W) and
Single Top (W) signal processes are required to pass the W labeling criteria, as described
in subsection 5.1.3. The candidate W jets from MC signal processes that fail the W labeling
criteria are indicated as tt̄ (other) and Single Top (other). A mass window selection of
[65, 95] GeV is included in both the D_2 and n_track distributions, with an additional selection
cut of D_2 < 1.2 applied to the n_track distribution. These selections are included in order
to highlight the observed differences between data and MC in a region that is close to the
W tagger acceptance region. The bottom panel in each plot shows the ratio of data to the
total MC prediction for each bin of the distributions. The dark green band represents the
statistical uncertainty, the red line the total tt̄ modeling systematic uncertainty, and the
light green band the total uncertainty for each bin.

[Figure 5.9 plots, leading large-R jet: (a) D_2, (b) n_track, and (c) combined jet mass
m_comb [GeV], each with a Data/Pred. ratio panel.]

Unlike in MC simulation, there is no truth information in data that can be used to select
signal-like jets. Instead, the number of signal jets in data is determined by performing a
χ2 fit of the number of candidate signal jets in MC to the number of candidate jets in data.
The fit is performed separately for jets that are tagged and for those that are not tagged
in order to determine the normalization factors $N^{\mathrm{tagged}}_{\mathrm{fitted\ signal}}$
and $N^{\mathrm{not\ tagged}}_{\mathrm{fitted\ signal}}$. The jet mass distribution is used as
the template on which the fits are performed. Additionally, the fits are done in different jet
pT bins using an independent χ2 fit in each bin. For the top taggers, the pT bins in units of
GeV are [350, 400], [400, 450], [450, 500], [500, 600], and [600, 1000]. For the W tagger, the
bins are [200, 250], [250, 300], [300, 350], and [350, 600]. Figure 5.10 shows examples of the
jet mass distributions for the W and contained top taggers after performing the fit.
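As an illustration of the kind of fit described above, a minimal sketch in Python is given below. It is not the fit used in the analysis. It assumes a single free normalization for the signal template, a fixed background template, and Gaussian per-bin uncertainties, none of which is specified in the text. Under these assumptions the χ2 minimum has a closed form. The function name and all numbers are hypothetical.

```python
import numpy as np

def fit_signal_normalization(data, signal, background, sigma):
    """Minimize chi2 = sum(((data - N*signal - background) / sigma)**2) over N.

    Assumptions for illustration only: one free normalization N for the
    signal template, a fixed background template, Gaussian uncertainties.
    """
    data, signal, background, sigma = map(np.asarray, (data, signal, background, sigma))
    w = 1.0 / sigma**2
    # Setting d(chi2)/dN = 0 gives a closed-form solution for N.
    n_hat = np.sum(w * signal * (data - background)) / np.sum(w * signal**2)
    chi2 = np.sum(w * (data - n_hat * signal - background) ** 2)
    return n_hat, chi2

# Toy jet mass histogram (hypothetical numbers, not taken from the analysis).
signal = np.array([5.0, 40.0, 120.0, 60.0, 10.0])
background = np.array([30.0, 25.0, 20.0, 15.0, 10.0])
data = 0.9 * signal + background   # pseudo-data generated with N = 0.9
sigma = np.sqrt(data)              # Poisson-like per-bin uncertainties

n_hat, chi2 = fit_signal_normalization(data, signal, background, sigma)
print(n_hat, chi2)
```

In this sketch the fitted signal yield would be `n_hat * signal.sum()`. Repeating the fit on the tagged and not-tagged mass distributions in each pT bin would give the two yields that enter the efficiency definition.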

The tagger signal eﬃciency in data is deﬁned as:

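$$
\epsilon_{\mathrm{Data}}(p_T) = \frac{N^{\mathrm{tagged}}_{\mathrm{fitted\ signal}}(p_T)}{N^{\mathrm{tagged}}_{\mathrm{fitted\ signal}}(p_T) + N^{\mathrm{not\ tagged}}_{\mathrm{fitted\ signal}}(p_T)}
\tag{5.7}
$$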

Scale factors are calculated to calibrate the MC signal eﬃciency to that of data. These are

deﬁned as:

$$
\mathrm{SF}(p_T) = \frac{\epsilon_{\mathrm{Data}}(p_T)}{\epsilon_{\mathrm{MC}}(p_T)}
\tag{5.8}
$$
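The two definitions above can be written as a short Python sketch. The function names and the input yields are hypothetical and serve only to show how Equations 5.7 and 5.8 combine in a single pT bin.

```python
def signal_efficiency(n_tagged, n_not_tagged):
    """Equation 5.7: fraction of fitted signal jets that pass the tagger."""
    return n_tagged / (n_tagged + n_not_tagged)

def scale_factor(eff_data, eff_mc):
    """Equation 5.8: ratio of the data efficiency to the MC efficiency."""
    return eff_data / eff_mc

# Hypothetical fitted signal yields in one pT bin (illustration only).
eff_data = signal_efficiency(n_tagged=450.0, n_not_tagged=550.0)
eff_mc = 0.50  # assumed MC efficiency at a 50% working point
sf = scale_factor(eff_data, eff_mc)
print(eff_data, sf)
```

A scale factor below one, as in this toy case, corresponds to the MC overestimating the efficiency observed in data.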

The propagation of the systematic uncertainties to the signal eﬃciency measurement in MC

is obtained by evaluating the tagger signal eﬃciency with the systematically varied jets. To

obtain the systematically varied signal eﬃciency in data, the χ2 ﬁts to data are repeated using

the systematically varied jet mass distributions. The systematically varied signal eﬃciencies

in MC and data are then used to obtain the systematically varied scale factor. The total

uncertainty on the scale factors is obtained by adding in quadrature the individual scale
factor variations from each source of systematic uncertainty.

[Figure 5.10 plots: jet mass m_comb [GeV] distributions with Data/MC ratio panels. In order
of appearance: W tagger (ε_sig = 50%), tag pass and tag fail, for 250 GeV < pT < 300 GeV;
contained top tagger (ε_sig = 80%), tag pass and tag fail, for 500 GeV < pT < 600 GeV.]

Figure 5.10: Comparison between data and the post-fit MC jet mass distributions for jets
that pass and fail the tagging criteria of the 50% fixed signal efficiency W tagger and the
80% fixed signal efficiency contained top DNN tagger. The distributions are shown in the jet
pT bin [250, 300] GeV for the W tagger and in the bin [500, 600] GeV for the contained top
DNN tagger. The tt̄ signal MC template contains candidate jets from signal processes that
are labeled as signal jets for the corresponding tagger. The background MC template in the
W tagger plots contains candidate jets from signal processes that fail the W labeling criteria
and candidate jets from background processes. This template is split in the top tagger plots
for visualization purposes. The tt̄ background template contains candidate jets from signal
processes that fail the contained top labeling criteria. The non-tt̄ background template
contains candidate jets from background processes. In all plots, the tt̄ signal template has
been scaled with the normalization parameters $N^{\mathrm{tagged}}_{\mathrm{fitted\ signal}}(p_T)$
and $N^{\mathrm{not\ tagged}}_{\mathrm{fitted\ signal}}(p_T)$ for jets that pass and fail the
tagger, respectively. The bottom panel in each plot shows the ratio of data to MC for each bin.

Figures 5.11 and 5.12 show the tagger signal eﬃciencies in MC and data for the 50%

and 80% ﬁxed signal eﬃciency working points, respectively. The bottom panels in these

plots show the corresponding scale factor in the jet pT bin. The total uncertainty on the

scale factors is also shown. The eﬃciency in MC slightly overestimates the eﬃciency in

data, as can be observed from the scale factors ranging between 0.8 and 1. This is more

apparent in the W tagger, where the majority of the pT bins have a scale factor below

0.9, which could be a result of the diﬀerences observed in the input variables of the W

tagger between MC and data. Tables 5.1 - 5.3 show the breakdown of the contribution to

the total scale factor uncertainty from the diﬀerent uncertainty groups considered in the

50% working point taggers. The same information is shown in Tables 5.4 - 5.6 for the

80% working point taggers. Overall, the uncertainty is dominated by the tt̄

modeling uncertainties. The last pT bins also show a significant contribution from the statistical

uncertainty due to the limited number of events in this kinematic region.


Figure 5.11: Comparison between data and MC of the tagger signal eﬃciencies for the
contained top DNN tagger (a), the inclusive top DNN tagger (b), and the W tagger (c) that
were optimized to a 50% ﬁxed signal eﬃciency working point. The bottom panel in each
plot shows the ratio of the data signal eﬃciency to the MC signal eﬃciency in each jet pT
bin, which is equivalent to the tagger scale factor. The green uncertainty band represents
the total uncertainty that is propagated to the scale factors.

[Figure 5.11 plots: signal efficiency (ε_sig) versus leading large-R jet pT [GeV] in data
(2015-2017) and Powheg+Pythia8 MC, with Data/MC ratio panels, for the contained top,
inclusive top, and W taggers at the ε_sig = 50% working point.]

Figure 5.12: Comparison between data and MC of the tagger signal eﬃciencies for the
contained top DNN tagger (a), the inclusive top DNN tagger (b), and the W tagger (c) that
were optimized to an 80% fixed signal efficiency working point. The bottom panel in each
plot shows the ratio of the data signal eﬃciency to the MC signal eﬃciency in each jet pT
bin, which is equivalent to the tagger scale factor. The green uncertainty band represents
the total uncertainty that is propagated to the scale factors.

[Figure 5.12 plots: signal efficiency (ε_sig) versus leading large-R jet pT [GeV] in data
(2015-2017) and Powheg+Pythia8 MC, with Data/MC ratio panels, for the contained top,
inclusive top, and W taggers at the ε_sig = 80% working point.]

                      Contained top tagger pT bins [GeV]
Systematic Group      [350,400]  [400,450]  [450,500]  [500,600]  [600,1000]
Statistical              0.02       0.02       0.03       0.03       0.04
Theory                 < 0.01     < 0.01     < 0.01     < 0.01     < 0.01
tt̄ modeling              0.16       0.15       0.12       0.13       0.11
Large-R jet              0.02       0.01       0.01       0.03       0.03
Other experimental     < 0.01     < 0.01     < 0.01     < 0.01     < 0.01
b-tagging              < 0.01     < 0.01     < 0.01     < 0.01       0.02
Total Uncertainty        0.16       0.15       0.13       0.14       0.12

Table 5.1: The uncertainty on the scale factor measurement of the 50% ﬁxed signal eﬃciency
contained top tagger from each individual systematic uncertainty group. Each row shows the
uncertainty obtained by adding in quadrature the impact of all uncertainties in the group.
The total uncertainty is obtained by adding in quadrature the impact of all uncertainties.
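The quadrature sum described in the caption can be checked with a short Python sketch, using the [600, 1000] GeV column of Table 5.1 as input. Entries quoted as "< 0.01" are set to zero here, which is an assumption that slightly underestimates the sum. The function name is hypothetical.

```python
import math

def quadrature_sum(values):
    """Combine independent uncertainty contributions in quadrature."""
    return math.sqrt(sum(v * v for v in values))

# Table 5.1, [600, 1000] GeV bin. Entries quoted as "< 0.01" are set to 0,
# an assumption made only for this illustration.
groups = {
    "Statistical": 0.04,
    "Theory": 0.0,
    "ttbar modeling": 0.11,
    "Large-R jet": 0.03,
    "Other experimental": 0.0,
    "b-tagging": 0.02,
}
total = quadrature_sum(groups.values())
print(round(total, 2))
```

With these inputs the sum rounds to 0.12, in agreement with the total quoted in the table for this bin. Small differences in other bins are expected because the table entries are themselves rounded.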

                      Inclusive top tagger pT bins [GeV]
Systematic Group      [350,400]  [400,450]  [450,500]  [500,600]  [600,1000]
Statistical              0.01       0.02       0.02       0.02       0.03
Theory                 < 0.01     < 0.01     < 0.01     < 0.01     < 0.01
tt̄ modeling              0.09       0.09       0.07       0.09       0.08
Large-R jet              0.01       0.01       0.01       0.02       0.02
Other experimental     < 0.01     < 0.01     < 0.01     < 0.01     < 0.01
b-tagging              < 0.01     < 0.01     < 0.01       0.01       0.02
Total Uncertainty        0.09       0.10       0.08       0.09       0.09

Table 5.2: The uncertainty on the scale factor measurement of the 50% ﬁxed signal eﬃciency
inclusive top tagger from each individual systematic uncertainty group. Each row shows the
uncertainty obtained by adding in quadrature the impact of all uncertainties in the group.
The total uncertainty is obtained by adding in quadrature the impact of all uncertainties.

                      W tagger pT bins [GeV]
Systematic Group      [200,250]  [250,300]  [300,350]  [350,600]
Statistical              0.01       0.02       0.03       0.04
Theory                 < 0.01     < 0.01     < 0.01     < 0.01
tt̄ modeling              0.21       0.20       0.15       0.12
Large-R jet              0.01       0.01     < 0.01     < 0.01
Other experimental     < 0.01     < 0.01     < 0.01     < 0.01
b-tagging              < 0.01     < 0.01     < 0.01     < 0.01
Total Uncertainty        0.21       0.20       0.15       0.12

Table 5.3: The uncertainty on the scale factor measurement of the 50% ﬁxed signal eﬃciency
W tagger from each individual systematic uncertainty group. Each row shows the uncertainty
obtained by adding in quadrature the impact of all uncertainties in the group. The total
uncertainty is obtained by adding in quadrature the impact of all uncertainties.

                      Contained top tagger pT bins [GeV]
Systematic Group      [350,400]  [400,450]  [450,500]  [500,600]  [600,1000]
Statistical              0.01       0.01       0.02       0.02       0.03
Theory                 < 0.01     < 0.01     < 0.01     < 0.01     < 0.01
tt̄ modeling              0.06       0.03       0.03       0.03       0.02
Large-R jet              0.01       0.01       0.02       0.01       0.02
Other experimental     < 0.01     < 0.01     < 0.01     < 0.01     < 0.01
b-tagging              < 0.01     < 0.01     < 0.01     < 0.01       0.02
Total Uncertainty        0.06       0.04       0.04       0.04       0.04

Table 5.4: The uncertainty on the scale factor measurement of the 80% ﬁxed signal eﬃciency
contained top tagger from each individual systematic uncertainty group. Each row shows the
uncertainty obtained by adding in quadrature the impact of all uncertainties in the group.
The total uncertainty is obtained by adding in quadrature the impact of all uncertainties.

                      Inclusive top tagger pT bins [GeV]
Systematic Group      [350,400]  [400,450]  [450,500]  [500,600]  [600,1000]
Statistical            < 0.01     < 0.01       0.01       0.02       0.02
Theory                 < 0.01     < 0.01     < 0.01     < 0.01     < 0.01
tt̄ modeling              0.04       0.02       0.01       0.02       0.02
Large-R jet            < 0.01       0.01     < 0.01       0.01     < 0.01
Other experimental     < 0.01     < 0.01     < 0.01     < 0.01     < 0.01
b-tagging              < 0.01     < 0.01     < 0.01       0.01       0.02
Total Uncertainty        0.04       0.04       0.02       0.03       0.04

Table 5.5: The uncertainty on the scale factor measurement of the 80% ﬁxed signal eﬃciency
inclusive top tagger from each individual systematic uncertainty group. Each row shows the
uncertainty obtained by adding in quadrature the impact of all uncertainties in the group.
The total uncertainty is obtained by adding in quadrature the impact of all uncertainties.

                      W tagger pT bins [GeV]
Systematic Group      [200,250]  [250,300]  [300,350]  [350,600]
Statistical            < 0.01       0.01       0.02       0.03
Theory                 < 0.01     < 0.01     < 0.01     < 0.01
tt̄ modeling              0.12       0.10       0.06       0.05
Large-R jet            < 0.01     < 0.01       0.01     < 0.01
Other experimental     < 0.01     < 0.01     < 0.01     < 0.01
b-tagging              < 0.01     < 0.01     < 0.01     < 0.01
Total Uncertainty        0.12       0.10       0.06       0.05

Table 5.6: The uncertainty on the scale factor measurement of the 80% ﬁxed signal eﬃciency
W tagger from each individual systematic uncertainty group. Each row shows the uncertainty
obtained by adding in quadrature the impact of all uncertainties in the group. The total
uncertainty is obtained by adding in quadrature the impact of all uncertainties.


5.2 Jet Tagging with Topological Data Analysis

This section presents an alternative approach to top jet tagging by using information from

topological data analysis (TDA) that has not been used in the context of jet tagging before.

TDA is a recent ﬁeld of statistical analysis that utilizes concepts from algebraic topology to

analyze data that has a notion of distance. The main driving hypothesis of TDA is that the

data to be analyzed is sampled from an unknown topological manifold. The manifold can be

fully characterized by its topological features, or homology, such as connected components

and n-dimensional voids. The number of independent features of each homology class, known

as Betti numbers, can be used to classify the manifold. The goal of TDA is to infer the Betti

numbers from data. This is achieved by reconstructing an approximation of the underlying

manifold, known as a simplicial complex, with the data. The simplicial complex consists

of a collection of points, edges, triangles, and higher-dimensional polytopes that are formed

with the notion of distance between datapoints. Once the simplicial complex is constructed,

its simplicial homology is calculated as an approximation of the homology of the underlying

manifold. The application of TDA to jet tagging is motivated by the geometric nature of

jet topoclusters. The topoclusters that are associated with a jet can be used as the

input dataset into the TDA methodology to be analyzed on a jet-by-jet basis. The Betti

numbers and other topological information of the jet can then be used as inputs for a jet

tagger.

This section starts with the introduction of the necessary concepts to understand the TDA

methodology. Two TDA tools that are used in the workﬂow to tag jets are presented. The

ﬁrst tool is a persistent homology (PH) analysis [68]. The construction of a simplicial complex

is sensitive to the distance scale in the data. Some topological features can appear within a


speciﬁc distance scale as new objects that are introduced into the simplicial complex alter

its homology. This raises the question of which of these topological features are statistically

relevant to classify the underlying manifold. PH is used to determine an optimal distance

scale to build the simplicial complex from the input topoclusters. This scale can be optimized

to maximize the number of topological features that persist the most while simultaneously

minimizing those that emerge within narrow distance scale windows. The second tool used

in the workﬂow is the Mapper algorithm [69]. The Mapper algorithm analyzes the interplay

between the simplicial homology of data and functions, also referred to as maps or ﬁlters,

that are deﬁned on the data. These functions can be used to highlight topological features

relative to kinematic features of topoclusters. Finally, the result of applying this workﬂow

to tag top jets from signal W′ → tb processes against jets from QCD multijet background

processes in MC is presented. The events used in this study are required to satisfy the

event selection described in section 4.1. Additionally, signal jets from W′ → tb processes

are required to pass the contained top labeling criteria and signal top jet candidacy criteria

that were described in subsection 5.1.3 and subsection 5.1.5, respectively. Two top tagging

algorithms were developed that use the information from TDA: a deep neural network (DNN)

tagger and a convolutional graph neural network (GNN) tagger. The performances of these

two taggers are compared with the contained top DNN tagger that was discussed in the

previous section of this Chapter.


5.2.1 Simplicial Complexes and Simplicial Homology

5.2.1.1 Deﬁnition of a Simplicial Complex

As previously discussed, the application of TDA techniques involves the construction of a

simplicial complex from the input data. Given a ﬁnite dataset X, a simplicial complex K of

X is a collection of subsets of X that satisﬁes the following two properties:

1. If v ∈ X, then {v} ∈ K (inclusion of points).

2. If σ ∈ K and τ ⊂ σ, then τ ∈ K (closure under taking subsets).

The elements of K are known as simplices. Simplices are classiﬁed by their dimension, with

a p-dimensional simplex being a subset of X that has p + 1 elements. If σ is a p-dimensional

simplex and τ ⊂ σ is a (p − 1)-dimensional simplex, then τ is said to be a face of σ.

Additionally, the set of all p-dimensional simplices is denoted as Kp. The standard nomenclature

for simplices of dimensions 0 through 3 is to denote them as vertices, edges, triangles, and

tetrahedra, respectively. The dimension of the simplicial complex K is defined as the

maximum dimension of its simplices. An example of a 2-dimensional simplicial complex of a set

with four elements is shown in Figure 5.13.
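The two defining properties can be checked directly for the example of Figure 5.13. The following Python sketch is only an illustration: the function names are hypothetical, and simplices are represented as sets of labels.

```python
from itertools import combinations

def is_simplicial_complex(X, K):
    """Check the two defining properties for a collection K of subsets of X."""
    K = {frozenset(s) for s in K}
    # Property 1: every point of X appears as a 0-dimensional simplex.
    has_points = all(frozenset([v]) in K for v in X)
    # Property 2: every non-empty proper subset of a simplex is also in K.
    closed = all(
        frozenset(face) in K
        for s in K
        for r in range(1, len(s))
        for face in combinations(s, r)
    )
    return has_points and closed

def dimension(K):
    """Dimension of the complex: largest simplex size minus one."""
    return max(len(s) for s in K) - 1

# The example of Figure 5.13: all two-element subsets except {a, d},
# plus the triangle {a, b, c}.
X = ["a", "b", "c", "d"]
K = [{"a"}, {"b"}, {"c"}, {"d"},
     {"a", "b"}, {"a", "c"}, {"b", "c"}, {"b", "d"}, {"c", "d"},
     {"a", "b", "c"}]

print(is_simplicial_complex(X, K), dimension(K))
```

Adding the triangle {a, b, d} to this collection would violate the second property, since its face {a, d} is not in K.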

5.2.1.2 Constructing a Simplicial Complex

The process of constructing a simplicial complex is not unique. As a result, diﬀerent families

of complexes exist, each with unique properties and varying degrees of approximation of

the underlying manifold. The Čech (Č) and Vietoris-Rips (VR) complexes will be discussed

since they are used in the Mapper algorithm and PH studies presented in this Chapter,

respectively.


Figure 5.13: Example of a set X with four elements and a simplicial complex K of dimension
2 that is constructed from X. The simplicial complex contains four vertices corresponding
to the individual elements of X, ﬁve edges corresponding to all possible subsets of X with
two elements except for {a, d}, and one triangle, depicted by the blue shaded area, which
corresponds to the subset with three elements {a, b, c}.

In order to define the Č complex, the notions of a covering set of a topological space and
of the nerve of the covering set must be defined first. A covering set U = {U_i}_{i∈I} of a
topological space X is defined as an indexed family of subsets U_i of X with indexing set I,
such that for every element x of X there exists at least one cover element U_i that contains
x. The nerve of a cover is the collection of finite subsets of indices in I corresponding to
elements of U with non-empty intersection. The Č complex is defined as the nerve of a
covering set U of a topological space. Thus, σ = {i_0, ..., i_p} ∈ Č is a p-dimensional
simplex if $\bigcap_{j=0}^{p} U_{i_j} \neq \emptyset$. In TDA applications, the cover U is
usually taken as the collection of ε-spheres that are centered at each datapoint. An ε-sphere
centered at a point x is defined as the set of all points y within distance ε from x in
Euclidean n-space. The parameter ε is the distance scale that parametrizes the construction
of the Č complex. An example of a topological manifold with a Č complex that is identical
to the example depicted in Figure 5.13 is shown in Figure 5.14.

Figure 5.14: Example of a topological manifold with a cover given by U = {U1, U2, U3, U4}.
The colored dashed-dotted lines represent the boundaries of each cover element. Note that
U1 ∩ U2 ∩ U3 ≠ ∅, U2 ∩ U4 ≠ ∅, and U3 ∩ U4 ≠ ∅. By relabeling the cover elements as
U1 → a, U2 → b, U3 → c, and U4 → d, the simplicial complex shown in Figure 5.13 is
obtained. In this example the manifold consists of a single connected component which
encompasses a circular void.

The VR complex parametrized with a distance scale ε is defined as the clique complex of an
ε-neighborhood graph [70]. The graph can be built from the data with the vertices
representing individual datapoints. Edges connect two vertices if the distance between the
datapoints is less than ε. Higher-dimensional simplices are included in the VR complex if all
the edges associated with the simplex are in the graph. For example, if a VR complex is
constructed from the nerve of the manifold shown in Figure 5.14, then it would also contain
the 2-dimensional simplex {b, c, d}.
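The construction of a VR complex can be sketched in a few lines of Python. This is a brute-force illustration rather than the implementation used in this work: the function name, the point coordinates, and the value of ε are hypothetical, and the search over all vertex subsets is only practical for small inputs.

```python
from itertools import combinations
import math

def vietoris_rips(points, eps, max_dim=2):
    """Clique complex of the eps-neighborhood graph, up to dimension max_dim."""
    n = len(points)
    # An edge connects two vertices if their Euclidean distance is less than eps.
    adjacent = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if math.dist(points[i], points[j]) < eps
    }
    simplices = [(i,) for i in range(n)]
    for size in range(2, max_dim + 2):
        for cand in combinations(range(n), size):
            # A simplex is included if all of its edges are in the graph.
            if all(pair in adjacent for pair in combinations(cand, 2)):
                simplices.append(cand)
    return simplices

# Hypothetical 2D points, chosen so that vertices 0, 1, 2, 3 play the roles
# of a, b, c, d in the example above.
points = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8), (1.6, 0.9)]
for simplex in vietoris_rips(points, eps=1.2):
    print(simplex)
```

With these points every pair except (0, 3) is connected, so the complex contains both triangles (0, 1, 2) and (1, 2, 3). This mirrors the statement above that the VR complex also contains {b, c, d}.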

From this discussion, it is clear that the Č and VR complexes satisfy the properties of a
simplicial complex that were previously discussed. It should be noted that the approximation
of the underlying manifold, and consequently its homology, obtained from the Č complex
improves as the cover becomes finer. On the other hand, the VR complex provides an
approximation of the simplicial homology of Č. Specifically, it can be shown that
$\check{C}_{\epsilon} \subseteq \mathrm{VR}_{\epsilon} \subseteq \check{C}_{\sqrt{2}\epsilon}$.
The VR complex is used in the PH studies since it is computationally more efficient to
construct than the Č complex.

5.2.1.3 Computing Simplicial Homology

To compute the simplicial homology of a simplicial complex, a relationship between the

topological features of the underlying manifold and those of the simplicial complex must

be established. This will be achieved with the introduction of Homology groups, which are

vector spaces that represent the topological features in each dimension of a manifold as

vectors. This will allow us to determine the number of unique, up to path deformation,

topological features from the dimensions of these vector spaces.

We start by introducing the notion of the boundary of a simplex through the boundary
linear transformations ∂p : V(Kp) → V(Kp−1), where V(Kp) = Span(Kp, F) is the vector space
spanned by the set of p-dimensional simplices over a field F, which will be left unspecified
for the moment. The purpose of the transformation ∂p is to establish a linear relationship
between a p-dimensional simplex σ and its faces τ, such that the result of applying the
transformation corresponds to the boundary of the region that σ encapsulates on the
manifold, with its faces forming the boundary. These transformations must preserve the
topological features that are bounded by linear combinations of simplices. Additionally,
these transformations must satisfy the constraint on their functional composition
∂p−1 ◦ ∂p = 0, which indicates that the boundary of a boundary is empty. The exact form of
these transformations depends on the field F, as different fields can take into account
effects such as whether the manifold has a well-defined orientation or not. Throughout the
remainder of this Chapter, the field F is taken as the field with two elements, Z2 = {0, 1},
due to its simplicity in implementation. A full discussion of computing simplicial homology
in other fields is outside the scope of this thesis. Under the field Z2, the boundary
transformation takes the form:

$$
\partial_p(\sigma) = \sum_{\tau \subset \sigma,\ \tau \in K_{p-1}} \tau
\tag{5.9}
$$

In the case where p = 0 or p is greater than the dimension of the simplicial complex K,
∂p is defined as the zero map.
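The boundary transformation over Z2 is simple to express in code. In the following Python sketch, a chain is represented as a set of simplices, an assumption made for illustration: over Z2 every coefficient is 0 or 1, so addition of chains becomes the symmetric difference of sets. The function name is hypothetical.

```python
from itertools import combinations

def boundary(chain):
    """Z2 boundary of a chain, given as a set of simplices (frozensets).

    Over Z2 a chain is just a set of simplices, and addition is the symmetric
    difference, since any simplex that appears twice cancels.
    """
    result = set()
    for simplex in chain:
        if len(simplex) == 1:
            continue  # the boundary of a vertex is zero
        for face in combinations(sorted(simplex), len(simplex) - 1):
            result ^= {frozenset(face)}
    return result

triangle = {frozenset("abc")}
edges = boundary(triangle)
print(sorted("".join(sorted(e)) for e in edges))
print(boundary(edges))
```

The boundary of the triangle {a, b, c} is the sum of its three edges, and applying the boundary a second time gives the empty chain, which is the constraint ∂p−1 ◦ ∂p = 0 in this representation.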

After introducing the concept of the boundary of a simplex, we are now in a position to

deﬁne the concepts of p-boundaries and p-cycles. Both p-boundaries and p-cycles correspond

to path components in the manifold that form closed p-dimensional loops. The elements of

the null subspace ker(∂p) are known as p-cycles since all closed paths map to zero in a lower

dimension. The elements of the image subspace Im(∂p+1) are known as p-boundaries since

they bound higher-dimensional simplices. From the constraint ∂p ◦ ∂p+1 = 0, it can be seen

that Im(∂p+1) is fully contained within ker(∂p). All p-cycles that are not p-boundaries

correspond to p-dimensional voids in the manifold since there are no higher-dimensional simplices

that are encompassed by the p-cycle. These two subspaces are the essential ingredients in

the deﬁnition of Homology groups. As an example, the subspaces ker(∂1) and Im(∂2) from

the simplicial complex in Figure 5.13 are shown in Table 5.7.

ker(∂1)                               Im(∂2)
{a, b} + {a, c} + {b, c}              {a, b} + {a, c} + {b, c}
{b, c} + {b, d} + {c, d}
{a, b} + {a, c} + {b, d} + {c, d}

Table 5.7: The elements of the subspaces ker(∂1) and Im(∂2) of the simplicial complex
shown in Figure 5.13. Note that ker(∂1) is a two-dimensional subspace since the ﬁrst two
rows add to the third row with the Z2 algebra, and Im(∂2) is a one-dimensional subspace.
Furthermore, {a, b} + {a, c} + {b, c} is a 1-boundary while {b, c} + {b, d} + {c, d} is a 1-cycle
that is not a boundary.

Since the topological features of the simplicial complex are unique up to path deformation,


the computation of simplicial homology counts the instances of independent features. This

is achieved by deﬁning the quotient vector space Hp = ker(∂p)/Im(∂p+1), known as the pth

Homology group. The elements of Hp are the equivalence classes of p-cycles that represent

unique p-dimensional topological features up to path deformation. Consequently, the pth

Betti number βp is deﬁned as the dimension of Hp:

$$
\beta_p = \dim(H_p) = \dim(\ker(\partial_p)) - \dim(\mathrm{Im}(\partial_{p+1}))
\tag{5.10}
$$

The standard procedure to calculate the Betti numbers is to obtain the matrix representa-

tions of the linear transformations ∂p in Z2, and perform Gaussian elimination to determine

the rank and nullity of the matrices. To ﬁnalize the example of the simplicial complex in

Figure 5.13, the matrix representation of the boundary transformation ∂1 is given by:

$$
\partial_1 =
\begin{array}{c|ccccc}
      & \{a,b\} & \{a,c\} & \{b,c\} & \{b,d\} & \{c,d\} \\ \hline
\{a\} & 1 & 1 & 0 & 0 & 0 \\
\{b\} & 1 & 0 & 1 & 1 & 0 \\
\{c\} & 0 & 1 & 1 & 0 & 1 \\
\{d\} & 0 & 0 & 0 & 1 & 1
\end{array}
\tag{5.11}
$$

This representation is deﬁned by the ordered basis of V (K0) and V (K1), which are shown

to the left of the vertical line and above the horizontal line in Equation 5.11, respectively.

After performing Gaussian elimination on its columns, the matrix reduces to:

$$
\partial_1 =
\begin{array}{c|ccccc}
      & \{a,b\} & \{a,c\} & \{a,b\}+\{a,c\}+\{b,c\} & \{b,d\} & \{b,c\}+\{b,d\}+\{c,d\} \\ \hline
\{a\} & 1 & 1 & 0 & 0 & 0 \\
\{b\} & 1 & 0 & 0 & 1 & 0 \\
\{c\} & 0 & 1 & 0 & 0 & 0 \\
\{d\} & 0 & 0 & 0 & 1 & 0
\end{array}
\tag{5.12}
$$

Similarly, the matrix representation of the boundary transformation ∂2 is given by:

$$
\partial_2 =
\begin{array}{c|c}
        & \{a,b,c\} \\ \hline
\{a,b\} & 1 \\
\{a,c\} & 1 \\
\{b,c\} & 1 \\
\{b,d\} & 0 \\
\{c,d\} & 0
\end{array}
\tag{5.13}
$$

As can be observed in Equation 5.12, the rank of the matrix is 3. To determine the value

of β0 we must know dim(ker(∂0)), but since ∂0 is the zero map, its null space is V (K0).

Thus, β0 = dim(V (K0)) − dim(Im(∂1)) = 4 − 3 = 1, which implies that there is a single

connected component in the simplicial complex. Similarly, from Equation 5.13, the rank

of the matrix is trivially equal to 1. To determine β1 we use the Rank-Nullity theorem

to obtain dim(ker(∂1)) = dim(V (K1)) − dim(Im(∂1)) = 5 − 3 = 2. Thus, we get that

β1 = dim(ker(∂1)) − dim(Im(∂2)) = 2 − 1 = 1, which implies that there is a single circular

void in the simplicial complex. Both calculations give the correct number of topological

features of the underlying manifold and the simplicial complex.
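The worked example can be reproduced with a short Python sketch that builds the boundary matrices over Z2 and obtains their ranks by Gaussian elimination. The function names and the data layout (a list of simplices per dimension) are hypothetical choices made for illustration. The elimination here acts on rows rather than columns, which gives the same rank.

```python
def rank_z2(rows):
    """Rank over Z2 of a matrix given as a list of 0/1 rows (Gaussian elimination)."""
    rows = [list(r) for r in rows]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [(x + y) % 2 for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def boundary_matrix(faces, simplices):
    """Rows are indexed by (p-1)-simplices and columns by p-simplices."""
    return [[1 if f < s else 0 for s in simplices] for f in faces]

def betti_numbers(complex_by_dim):
    """Betti numbers from beta_p = dim ker(d_p) - dim Im(d_{p+1}) (Equation 5.10)."""
    top = len(complex_by_dim) - 1
    ranks = [0] * (top + 2)  # ranks[p] = rank of d_p; d_0 and d_{top+1} are zero maps
    for p in range(1, top + 1):
        ranks[p] = rank_z2(boundary_matrix(complex_by_dim[p - 1], complex_by_dim[p]))
    # Rank-nullity: dim ker(d_p) = (number of p-simplices) - rank(d_p).
    return [len(complex_by_dim[p]) - ranks[p] - ranks[p + 1] for p in range(top + 1)]

fs = frozenset
K = [
    [fs("a"), fs("b"), fs("c"), fs("d")],
    [fs("ab"), fs("ac"), fs("bc"), fs("bd"), fs("cd")],
    [fs("abc")],
]
print(betti_numbers(K))
```

For the complex of Figure 5.13 this gives β0 = 1 and β1 = 1, as derived above, together with β2 = 0.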

5.2.1.4 Filtered Simplicial Complex and Persistent Homology

Up to this point in the discussion, it has been assumed that the simplicial complex is fixed
in structure. This limits the simplicial homology analysis of the data to a fixed value of
the distance scale parameter ε. To analyze the simplicial homology as a function of the
distance scale, it is necessary to introduce a final construction known as a filtered
simplicial complex. This is the central object that drives the PH analysis. A filtered
simplicial complex is defined as a finite sequence of nested simplicial complexes,
{K_i}_{i ≤ N_ε}, where N_ε is the number of filtration steps, which will depend on the
distance scale parameter, and K_i ⊂ K_j if i < j. The index i is used to denote the
filtration step of the sequence, with larger indices corresponding to larger values of the
distance scale parameter ε. This allows the definition of inclusion-induced maps,
$f_{i \le j} : H_p^i \to H_p^j$, that give information on how the Betti numbers change between
filtration steps. Each new filtration step brings new simplices into consideration. The pth
Betti number increases if new path-independent p-cycles that are not p-boundaries are
formed, and it decreases if existing features are filled in (or, for p = 0, merged) by new
simplices. An example of a filtered simplicial complex with four filtration steps is shown in
Figure 5.15. In this example, the number of connected components changes along the
filtration as β0 = 4 → 1 → 1 → 1. The number of circular voids changes as
β1 = 0 → 1 → 2 → 1. These results are usually summarized in a persistence diagram, which
is shown in Figure 5.16 for the filtered simplicial complex shown in Figure 5.15.
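The evolution of the Betti numbers along this filtration can be reproduced with a small Python sketch. The edge content of each step is inferred from the description of Figure 5.15 (a single loop through the four points, then the edge {b, c}, then the triangle {a, b, c}), and the function names are hypothetical. Boundary matrix columns are stored as integer bitmasks so that addition over Z2 becomes a bitwise XOR.

```python
def rank_z2(columns):
    """Rank over Z2; each column is an integer bitmask of its nonzero rows."""
    basis = []  # reduced columns, each with a distinct highest set bit
    for col in columns:
        for b in basis:
            col = min(col, col ^ b)  # clear the leading bit of b if it is set
        if col:
            basis.append(col)
    return len(basis)

def betti_0_1(vertices, edges, triangles):
    """beta_0 and beta_1 over Z2 for a complex of dimension at most 2."""
    v_index = {v: i for i, v in enumerate(vertices)}
    e_index = {frozenset(e): i for i, e in enumerate(edges)}
    # Columns of d_1 (edge -> its two vertices) and d_2 (triangle -> its three edges).
    d1 = [sum(1 << v_index[v] for v in e) for e in edges]
    d2 = [sum(1 << e_index[frozenset((u, v))]
              for u, v in ((t[0], t[1]), (t[0], t[2]), (t[1], t[2])))
          for t in triangles]
    r1, r2 = rank_z2(d1), rank_z2(d2)
    return len(vertices) - r1, len(edges) - r1 - r2

V = "abcd"
cycle = ["ab", "ac", "bd", "cd"]
filtration = [
    (V, [], []),                    # K1: four isolated points
    (V, cycle, []),                 # K2: one loop through all four points
    (V, cycle + ["bc"], []),        # K3: b and c connected, second void
    (V, cycle + ["bc"], ["abc"]),   # K4: triangle {a, b, c} filled in
]
for step in filtration:
    print(betti_0_1(*step))
```

The printed pairs (β0, β1) follow the sequences quoted above: β0 = 4 → 1 → 1 → 1 and β1 = 0 → 1 → 2 → 1.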


Figure 5.15: Example of a ﬁltered simplicial complex with four ﬁltration steps. The ﬁrst
step consists of four individual points. In the second step, points are pairwise connected so
that a single connected component that encompasses a void is formed. In the third step,
points b and c are connected, resulting in the creation of a new void. The ﬁnal ﬁltration
step is the same simplicial complex shown in Figure 5.13, which is obtained after ﬁlling one
of the voids.

Figure 5.16: Persistence diagram that summarizes the simplicial homology of the filtered
simplicial complex shown in Figure 5.15. The horizontal axis, denoted as birth, indicates the
filtration step at which a topological feature enters the simplicial complex. The vertical axis,
denoted as death, indicates the ﬁltration step at which a topological feature ceases to exist
in the simplicial complex. Topological features are represented as (birth,death) points. The
blue points correspond to connected components, while the red points correspond to circular
voids. The size of each point is proportional to the Betti number of the topological feature
at the corresponding ﬁltration step. Closed points correspond to features that died before
the ﬁnal ﬁltration step. Open points correspond to features that persisted until the ﬁnal
ﬁltration step.

[Figure 5.16: axes Birth (horizontal) and Death (vertical), in filtration steps]

5.2.2 Persistent Homology Studies

In this section, the large-R jets from W′ → tb and QCD multijet events are analyzed with

the PH algorithm on a jet-by-jet basis. The jets used in this study are required to have

pT > 350 GeV and at least 10 topoclusters. The selection requirement on the number

of topoclusters is made since the topoclusters will be used as the inputs to the PH al-

gorithm, thereby removing jets that will not have an interesting topology associated with

their topoclusters. The signal jets in this study are large-R jets from the W′ → tb process

that pass the contained top jet label requirement, as described in subsection 5.1.3, while

all large-R jets from the QCD multijet process are background jets. All topoclusters of a

jet are boosted to the center of momentum (CoM) frame of the jet prior to being analyzed

with the PH algorithm. This is done so that the PH algorithm processes jets with diﬀerent

levels of collimation on an equal basis. After this preprocessing step, the pseudorapidity

and azimuthal angle pairs, (η, φ), of each topocluster in the jet are used to build the VR

complex by treating each coordinate pair as a vertex of the VR complex. The VR complex

is then extended to a ﬁltered VR complex in order to analyze its simplicial homology with

the PH algorithm. The processing of the topoclusters of a jet through the PH algorithm is

summarized in the following steps:

1. Build the ε-neighborhood graph of the jet using the topocluster (η, φ) coordinate pairs
as the vertices of the graph.

2. Define edges e_{i,j} between all possible topocluster pairs (t_i, t_j) and assign a weight
ω_{i,j} = ∆R(t_i, t_j) to each edge.

3. Build the VR complex from the ε-neighborhood graph by including all simplices up
to dimension 2. For each simplex of dimension 2, define its weight as the maximum
weight of all its edges.

4. Construct the filtered VR complex by sorting the weights ω_{i,j} in ascending order. A
filtration step is introduced for each distinct weight value.

5. Calculate the boundary linear transformation matrices in all filtration steps in order
to obtain the Betti numbers, as discussed in subsection 5.2.1.

6. Build the persistence diagram of the jet for β0 and β1.
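The connected-component part of these steps can be sketched in a few lines of Python. The sketch is illustrative and is not the implementation used in this analysis. The function names are hypothetical, the example coordinates are made up, and only β0 is treated, since β1 additionally requires the reduction of the boundary matrices. Sorting the pairwise ∆R weights and merging components with a union-find structure yields the scales at which connected components die.

```python
import math
from itertools import combinations


def delta_r(a, b):
    """Angular distance between two (eta, phi) pairs, with the phi difference wrapped."""
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[0] - b[0], dphi)


def beta0_death_scales(points):
    """Return, in ascending order, the Delta R scales at which two components merge."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Steps 1-4: weighted edges between all pairs, sorted by weight.
    edges = sorted((delta_r(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    deaths = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # this edge joins two distinct components
            parent[root_i] = root_j
            deaths.append(weight)
    return deaths


# Made-up (eta, phi) coordinates of four topoclusters.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.0, 0.2)]
deaths = beta0_death_scales(points)
# Scale at which all topoclusters form a single connected component.
beta0_max_persistence = deaths[-1]
```

All components are born at scale zero, so the β0 persistence diagram consists of the points (0, death) for each entry of `deaths`, plus one component that never dies.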

Since the topoclusters are represented as two-dimensional coordinate pairs, the only

meaningful Betti numbers that can be extracted from the PH analysis are the number of

connected components, β0, and the number of circular voids, β1. The persistence of the

simplicial homology of jets is summarized in the plots shown in Figure 5.17. The β0 maximum
persistence length is the ∆R scale at which all topoclusters in the jet form a single

connected component. This scale is analogous to reconstructing the jet from its topoclusters.

The β1 maximum persistence length is deﬁned as the maximal ∆R interval length that a

circular void achieves in the filtered VR complex of the jet. Specifically, it is the largest
difference, over all circular voids of the jet, between the ∆R scale at which a void disappears
from the filtered VR complex (death scale) and the scale at which it is introduced (birth
scale). As observed in the plots, on average, signal jets become a single

connected component at lower distance scales compared to background jets. Both signal

and background jets populate the same two regions of the persistence diagram of the most

persistent circular void. The majority of jets populate the upper region of the diagram,

which corresponds to jets that have their most persistent circular void appearing late in the

filtration and disappearing after the topoclusters form a single connected component. The
lower region of the diagram contains jets that lie close to the death = birth diagonal
at low values of the birth scale, which corresponds to jets with circular voids that appear
early in the filtration and are short-lived. The fraction of background jets that populate

the lower region of the diagram is larger compared to signal jets. Additionally, on average,

the most persistent circular void in signal jets persists longer when compared to background

jets.
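As a small illustration of how the most persistent circular void of a jet is selected from its β1 persistence diagram, the following Python sketch picks the (birth, death) pair with the largest persistence length. The function name and the diagram values are hypothetical and chosen only for illustration.

```python
def most_persistent(diagram):
    """Return the (birth, death) pair with the largest death - birth, or None if empty."""
    if not diagram:
        return None  # the jet has no circular void
    return max(diagram, key=lambda pair: pair[1] - pair[0])


# Made-up beta1 persistence diagram of one jet, in units of Delta R.
beta1_diagram = [(0.2, 0.3), (0.6, 1.5), (0.9, 1.1)]
birth, death = most_persistent(beta1_diagram)
beta1_max_persistence = death - birth
```

In this example the selected void is born at ∆R = 0.6 and dies at ∆R = 1.5, so it would enter the cumulative diagram at the point (0.6, 1.5).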

Based on these observations, the topoclusters in signal jets appear to be clustered along

ﬁlament-like structures that match in direction with the prong structures that are formed

by the top decays in the CoM frame of the jet. Since the topoclusters are clustered along

well-deﬁned structures, the jet can be reconstructed as a single connected component at

smaller ∆R scales. Additionally, any circular void that is formed in signal jets would be

in between the prong structures of the jet. On the other hand, the observations made for

background jets could be indicative of the topoclusters being distributed amorphously in the

CoM frame of the jet. Since there is no well-deﬁned structure, the jet is reconstructed at

larger ∆R scales. This could explain why signal jets have, on average, a smaller β0 maximum

persistence length compared to background jets. Additionally, all short-lived circular voids

in background jets could be explained as noise from dispersed topoclusters.

To test these claims, the three- and two-pronged substructures of signal top jets are
taken as hypotheses. These two cases correspond to a resolved top decay and to a collimated
decay of the W boson, respectively. For this purpose, the kinematic features of the
connected components (CCs) are analyzed separately when there are exactly three and two
CCs in the filtered VR complex of the jet. To obtain the kinematic description of a CC,
the four-momenta of the topoclusters associated with the CC are added. Thus, the CCs can
be interpreted as subjets that give rise to the prong structures in the large-R jet for each
hypothesis. The distributions of ∆R scales at which the filtered VR complexes of jets contain

(a)

(b)

(c)

(d)

Figure 5.17: The plots in (a) and (b) show the ∆R interval length of the most persistent con-
nected component and circular void of all jets analyzed with the PH algorithm, respectively.
The plots in (c) and (d) show the cumulative persistence diagram of the most persistent
circular void for signal top jets and background QCD jets, respectively. This corresponds
to taking the point of the β1 persistence diagram of each jet that maximizes the persistence
length. The horizontal axis of the persistence diagrams represent the ∆R scale at which
the circular void appears in the ﬁltered VR complex of the jet (“birth”), while the vertical
axis represents the ∆R scale at which the void disappears from the ﬁltered VR complex
(“death”).

[Figure 5.17 plots: (a) β0 maximum persistence ∆R length and (b) β1 maximum persistence ∆R length versus fraction of events, for QCD jets and top quark jets; (c) top quark jets and (d) QCD jets, β1 maximum persistence birth ∆R (horizontal) versus death ∆R (vertical).]

exactly three and two CCs are shown in Figure 5.18.
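The kinematic description of a CC, obtained by adding the four-momenta of its topoclusters, can be illustrated with the following Python sketch. It is a minimal example under stated assumptions: the topoclusters are treated as massless unless a mass is given, the helper names are hypothetical, and the input values are made up.

```python
import math


def four_momentum(pt, eta, phi, mass=0.0):
    """Four-momentum (E, px, py, pz) from (pT, eta, phi, mass)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def add_four_momenta(momenta):
    """Component-wise sum of a list of four-momenta."""
    return tuple(sum(components) for components in zip(*momenta))


def invariant_mass(p):
    """Invariant mass of a four-momentum, clipped at zero against rounding."""
    energy, px, py, pz = p
    return math.sqrt(max(energy * energy - px * px - py * py - pz * pz, 0.0))


# Two made-up topoclusters with pT = 40 GeV each, back to back in phi.
cc_topoclusters = [four_momentum(40.0, 0.0, 0.0), four_momentum(40.0, 0.0, math.pi)]
cc_momentum = add_four_momenta(cc_topoclusters)
cc_mass = invariant_mass(cc_momentum)  # approximately 80 GeV for this configuration
```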

(a)

(b)

Figure 5.18: The distributions of the ∆R length scales at which the ﬁltered VR complexes
of jets have exactly three connected components (a) and two connected components (b)
compared between signal and background jets.

The mass distributions of the CCs after being sorted in descending order by their pT are

shown in Figures 5.19 and 5.20 when there are three CCs and two CCs, respectively. The

three CCs have reconstructed some of the relevant substructure in signal jets when assuming

the three-pronged top jet hypothesis. As can be observed in Figure 5.19, the leading CC

shows bumps in the mass distribution near the W and top mass. The subleading CC shows

a small bump close to the W mass, while the mass distribution of the third leading CC

could correspond to reconstructing the b quark or one of the quarks from the W . These

observations indicate that a β0 = 3 jet topology has partially resolved some of the relevant

substructure in signal top jets. The mass bumps in the leading CC become more prominent

when assuming the two-pronged top jet hypothesis. Additionally, the subleading CC mass

bump near the W mass becomes slightly more prominent. For background jets, the mass

distributions of the CCs peak at lower values and exhibit a long tail at higher values that

[Figure 5.18 plots: (a) ∆R scale to three connected components (β0 = 4 → β0 = 3) and (b) ∆R scale to two connected components (β0 = 3 → β0 = 2) versus fraction of events, for QCD jets and top quark jets.]

lacks prominent structures like those present in the CCs of signal jets. This implies that

the CCs are reconstructing random patterns from the topoclusters of background jets. From

these observations, the scale ∆R = 1.2, which is approximately equal to the mean of the ∆R

length scale distribution when there are two CCs in signal jets, provides good qualitative
discrimination between signal and background jets. The CCs have reconstructed most of the

interesting substructure of signal jets at this scale. The value of this distance scale is used

as an input parameter to the Mapper algorithm, as will be discussed in the next section.


(a)

(b)

(c)

Figure 5.19: The mass distributions of the leading in pT connected component (CC0) (a), the
second leading in pT connected component (CC1) (b), and the third leading in pT connected
component (CC2) (c) in the ﬁltration step of the ﬁltered VR complex when there are exactly
three CCs. The distributions are compared between signal and background jets.

[Figure 5.19 plots: (a) CC0 mass, (b) CC1 mass and (c) CC2 mass in GeV versus fraction of events for β0 = 3, for QCD jets and top quark jets.]

(a)

(b)

Figure 5.20: The mass distributions of the leading in pT connected component (CC0) (a) and
the second leading in pT connected component (CC1) (b) in the ﬁltration step of the ﬁltered
VR complex when there are exactly two CCs. The distributions are compared between signal
and background jets.

[Figure 5.20 plots: (a) CC0 mass and (b) CC1 mass in GeV versus fraction of events for β0 = 2, for QCD jets and top quark jets.]

5.2.3 Mapper Algorithm Studies

The next step in the TDA workﬂow is to analyze jets with the Mapper algorithm. The

Mapper algorithm will allow us to analyze the interplay between the simplicial homology of a

jet and the kinematic features of the topoclusters in the jet. This will be achieved through the

use of continuous ﬁltering functions that map the topoclusters from their underlying manifold

to a known image topological space where its simplicial homology can be analyzed. Unlike

the PH analysis study presented in the previous section, the Mapper algorithm analyzes the

jet at a ﬁxed distance scale, which is known as the resolution scale (∆Rres) of the algorithm.

Another parameter that needs to be provided to the Mapper algorithm is a ﬁnite covering

set for the topological space that the topoclusters are mapped onto. The elements of the

covering set will be allowed to overlap so that a topocluster has the possibility of being

mapped onto multiple cover elements. As discussed in subsubsection 5.2.1.2, this will allow

us to define a non-trivial nerve of the cover from which the Č complex of the jet can be
constructed. The topoclusters will be spatially clustered in each cover element using ∆Rres
as the clustering distance threshold. The clusters of topoclusters will form the vertices of
the Č complex. The vertices correspond to collections of topoclusters that are spatially close
(within ∆Rres) and have a similar response to the filter function since they are mapped to
the same cover element. The n-dimensional simplices are obtained from n+1 vertices that
share at least one topocluster in common. The higher-dimensional simplices allow us to
study how the topocluster response to the filter function transitions along path-connected
components in the image topological space. Once the Č complex of the jet is built, its
simplicial homology is calculated. Since the filter functions are assumed to be continuous,
the simplicial homology obtained from the Č complex of the jet in the image topological
space is the same as the one from the underlying manifold of the jet.

For the studies presented in this section, the ﬁlter function used is the φ-projection of

the topocluster in the η-φ plane. Thus, the topoclusters are mapped to a topological space

that corresponds to an arc of a ring. The covering set chosen for this space is the set of

overlapping intervals given by:

\[ U = \{[-3.2, -1.2],\ [-2.0, 0.4],\ [-0.4, 2.0],\ [1.2, 3.2]\} \tag{5.14} \]

The topoclusters that are mapped onto each cover element are spatially clustered using a

single-linkage clustering algorithm, which deﬁnes the distance between two clusters vn and

vm of topoclusters as:

\[ \Delta R(v_n, v_m) = \min_{t_i \in v_n,\; t_j \in v_m} \Delta R(t_i, t_j) \tag{5.15} \]

The motivation behind this choice of clustering algorithm is that two clusters of topoclusters

are merged at a given clustering step if they achieve the minimal distance between topoclus-

ters that are not in the same cluster, which is similar in behavior to how the topoclusters

are aggregated to form CCs by the PH algorithm. The clustering process in a given cover
element is stopped once every pair of remaining clusters is separated by a single-linkage
distance greater than the resolution scale, which is set to ∆Rres = 1.2 as motivated at the
end of the preceding section.

A detailed study of the optimization of the Mapper algorithm by using other ﬁlter functions

and parameter options can be found in Appendix B. The processing of the topoclusters of a

jet through the Mapper algorithm is summarized in the following steps:

1. For each topocluster in the jet, evaluate its φ-projection and map it to the corresponding
cover elements from Equation 5.14.

2. For each cover element, apply the single-linkage clustering algorithm to the topoclusters
that are mapped onto the cover element. The clustering process is stopped once all
clusters of topoclusters have a single-linkage distance greater than ∆Rres = 1.2. The
resulting clusters of topoclusters will form the vertices of the Č complex of the jet.

3. Construct the Č complex of the jet from the nerve of the cover by checking which
vertices in consecutive cover elements share at least one topocluster. Only simplices
up to dimension 2 will be included in the Č complex.

4. Evaluate the simplicial homology of the Č complex of the jet.
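The four steps above can be sketched in Python as follows. This is a minimal illustration and not the implementation used in this analysis. The function names and the example coordinates are hypothetical. Single-linkage clustering stopped at ∆Rres is implemented as the connected components of the graph that links topoclusters with ∆R ≤ ∆Rres, which gives the same clusters. Since no three intervals of the cover in Equation 5.14 overlap, no 2-simplices arise in this sketch, and β1 is obtained from the Euler characteristic of the resulting graph.

```python
import math
from itertools import combinations

COVER = [(-3.2, -1.2), (-2.0, 0.4), (-0.4, 2.0), (1.2, 3.2)]  # Equation 5.14
DELTA_R_RES = 1.2


def delta_r(a, b):
    """Angular distance between two (eta, phi) pairs, with the phi difference wrapped."""
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[0] - b[0], dphi)


def components(nodes, linked):
    """Group nodes into connected components; linked(a, b) tells whether a and b are joined."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(nodes, 2):
        if linked(a, b):
            parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return [frozenset(g) for g in groups.values()]


def mapper(points):
    """Return (vertices, edges, beta0, beta1) for a list of (eta, phi) pairs."""
    vertices = []  # (cover index, frozenset of topocluster indices)
    for k, (low, high) in enumerate(COVER):
        # Step 1: topoclusters whose phi-projection falls in this cover element.
        members = [i for i, (_, phi) in enumerate(points) if low <= phi <= high]
        # Step 2: single-linkage clusters at the resolution scale.
        clusters = components(
            members, lambda i, j: delta_r(points[i], points[j]) <= DELTA_R_RES)
        vertices += [(k, c) for c in clusters]
    # Step 3: edges between vertices of different cover elements sharing a topocluster.
    edges = [(u, v) for u, v in combinations(range(len(vertices)), 2)
             if vertices[u][0] != vertices[v][0] and vertices[u][1] & vertices[v][1]]
    # Step 4: Betti numbers of the resulting one-dimensional complex.
    beta0 = len(components(range(len(vertices)), lambda u, v: (u, v) in edges))
    beta1 = len(edges) - len(vertices) + beta0
    return vertices, edges, beta0, beta1


# Made-up jet: a chain of topoclusters along phi plus one isolated topocluster.
points = [(0.0, -2.5), (0.0, -1.6), (0.0, -0.8), (0.0, 0.0),
          (0.0, 0.8), (0.0, 1.6), (0.0, 2.5), (2.5, 0.0)]
vertices, edges, beta0, beta1 = mapper(points)
```

For this made-up jet, the chain of topoclusters forms one CC that spans all four cover elements, while the isolated topocluster forms a second CC, giving β0 = 2 and β1 = 0.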

Similar to the PH analysis study, since the topoclusters are represented as two-dimensional

objects, the only meaningful Betti numbers that can be extracted in this study are β0 and

β1. The CCs that are obtained from the Mapper algorithm correspond to vertices that form

path components from sharing topoclusters. These CCs are interpreted as subjets by adding

the four-momenta of all distinct topoclusters that are associated with a given CC, similar

to how it was done in the PH analysis study. The circular voids correspond to regions in

the η-φ plane where the path components branch oﬀ due to a deﬁcit of topoclusters for a

given range of values of η and then merge back to a single branch. An example event display

demonstrating a signal top jet being processed through the ﬁrst three steps of the Mapper

algorithm is shown in Figure 5.21.

The distributions of the Betti numbers β0 and β1 of jets are shown in Figure 5.22. As can

be observed from these plots, the topology of both signal and background jets is characterized

by the presence of multiple CCs, with the majority of jets populating the β0 = 1 − 4 region,

and a lack of circular voids. The absence of circular voids could be a side eﬀect of using the

φ-projection as the ﬁltering function with a resolution scale of ∆Rres = 1.2 and a covering


(a)

(b)

(c)

Figure 5.21: The topoclusters of a signal top jet in the CoM frame of the jet are represented
as (η,φ) coordinate pairs in the η-φ plane, as shown in (a). The coordinate pairs are color
coded based on the pT of the topoclusters. The topoclusters are mapped onto the intervals
of the covering set in Equation 5.14. The light green shaded regions represent the overlap
regions of the cover elements. The single-linkage clustering of the topoclusters that are
mapped onto the intervals [−2.0, 0.4] and [−0.4, 2.0] is shown in (b). A circle of radius
R = ∆Rres/2 = 0.6 is drawn centered around each topocluster. The red and blue circles
correspond to the topoclusters that are mapped exclusively onto the intervals [−2.0, 0.4] and
[−0.4, 2.0], respectively, while the purple circles correspond to topoclusters that are mapped
onto both intervals. All topoclusters that have overlapping circles in a given interval form
a vertex of the Č complex. In this event, a single vertex is formed in each of these two
intervals.

[Figure 5.21, panels (a)-(c): topoclusters in the η-φ plane with a color scale for pT [GeV].]

Figure 5.21 (cont’d): The “x” marks represent the coordinates of the vertices after adding the four-
momenta of the topoclusters associated with the vertex, with their color corresponding to
the pT scale. The Č simplicial complex is constructed after forming edges, represented by the
black lines, between vertices from diﬀerent cover intervals that have at least one topocluster
in common, as shown in (c). The vertices obtained from the remaining intervals are shown
in this step. The end result is a jet that has a single connected component and no circular
voids.

set that is very granular, thereby reducing the ability of the Mapper algorithm to resolve

circular features in the jets.

(a)

(b)

Figure 5.22: Comparison of the number of connected components (a) and the number of
circular voids (b) between signal top jets and background QCD jets after being processed
through the Mapper algorithm.

Two metrics are calculated in order to quantify how the topoclusters are distributed in

each CC. The ﬁrst metric is the average ∆R in the CoM frame of the jet between the CC and

the topoclusters associated with it (∆Ravg(CC, t)). This metric quantiﬁes the eﬀective size of

the CC by measuring how displaced the topoclusters are from the axis of the CC. Large values

of ∆Ravg(CC, t) indicate that the CC has a large fraction of topoclusters far from the CC axis,

while small values indicate that the topoclusters are distributed close to the CC axis. The

[Figure 5.22 plots: (a) β0 and (b) β1 versus fraction of events, for QCD jets and top quark jets.]

second metric is the average ∆R in the CoM frame of the jet between all possible topocluster

pairs that are associated with a given CC (∆Ravg(ti, tj ∈ CC)). This metric quantiﬁes the

eccentricity of the topocluster distribution in the CC. Large values of ∆Ravg(ti, tj ∈ CC)

indicate that the topoclusters in the CC are distributed along large ﬁlament-like structures,

while small values indicate that the topoclusters are densely distributed in the CC. The

distributions of these two metrics evaluated on the leading (CC0) and subleading (CC1)

connected components are shown in Figures 5.23 and 5.24, respectively. The distributions

are shown separately for jets that have a low number of CCs (β0 = 2 − 3) and a high number

of CCs (β0 ≥ 4) in order to highlight any eﬀects that the value of β0 may have on these

metrics. As can be observed from the ∆Ravg(CC0, t) distribution, the CC0 in signal top jets

tends to be larger in eﬀective size when compared to background QCD jets. Additionally,

from the ∆Ravg(ti, tj ∈ CC0) distribution, it is observed that the topoclusters in CC0 from

signal top jets are more eccentrically distributed when compared to background QCD jets.

These observations are independent of the value of β0 of the jet. However, the eﬀective size

and eccentricity of CC0 from jets with β0 ≥ 4 are slightly smaller when compared to jets

with β0 = 2 − 3, which may be due to CC0 containing a lower fraction of topoclusters in the

former case. These observations indicate that the topoclusters in CC0 from signal top

jets are spatially spread out, forming long path-connected structures. On the other hand,

the topoclusters in CC0 from background QCD jets are densely distributed, forming smaller

structures. No signiﬁcant diﬀerences are observed in the distributions for CC1 between signal

and background jets.
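The two metrics defined above can be written compactly in Python. The sketch below is illustrative only. It assumes massless topoclusters given as (pT, η, φ) triplets in the CoM frame of the jet, unweighted averages, and a CC axis taken along the summed momentum of its topoclusters, which must have a nonzero transverse component. The function names and input values are hypothetical.

```python
import math
from itertools import combinations


def delta_r(a, b):
    """Angular distance between two (eta, phi) pairs, with the phi difference wrapped."""
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[0] - b[0], dphi)


def cc_axis(topoclusters):
    """(eta, phi) of the summed momentum of massless (pT, eta, phi) topoclusters."""
    px = sum(pt * math.cos(phi) for pt, _, phi in topoclusters)
    py = sum(pt * math.sin(phi) for pt, _, phi in topoclusters)
    pz = sum(pt * math.sinh(eta) for pt, eta, _ in topoclusters)
    return math.asinh(pz / math.hypot(px, py)), math.atan2(py, px)


def avg_dr_to_axis(topoclusters):
    """Average Delta R between the CC axis and its topoclusters (effective size)."""
    axis = cc_axis(topoclusters)
    return sum(delta_r(axis, (eta, phi))
               for _, eta, phi in topoclusters) / len(topoclusters)


def avg_pairwise_dr(topoclusters):
    """Average Delta R over all topocluster pairs of the CC (eccentricity)."""
    pairs = list(combinations(topoclusters, 2))
    if not pairs:
        return 0.0  # a CC with a single topocluster has no pairs
    return sum(delta_r((a[1], a[2]), (b[1], b[2])) for a, b in pairs) / len(pairs)


# Made-up CC with two topoclusters placed symmetrically around phi = 0.
cc = [(10.0, 0.0, -0.5), (10.0, 0.0, 0.5)]
size = avg_dr_to_axis(cc)           # each topocluster is 0.5 away from the axis
eccentricity = avg_pairwise_dr(cc)  # the single pair is separated by 1.0
```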

The inclusive distributions of mass and pT of CCs from jets with β0 = 1 − 3 are shown

in Figure 5.25. The mass distribution exhibits bumps close to the W and top mass, which

gives conﬁdence in the top jet substructure reconstruction with the CCs. On the other


(a)

(b)

(c)

(d)

Figure 5.23: Comparison between signal top jets and background QCD jets of the metrics
∆Ravg(CC0, t) and ∆Ravg(ti, tj ∈ CC0). The distributions of ∆Ravg(CC0, t) and
∆Ravg(ti, tj ∈ CC0) are shown for jets with β0 = 2 − 3 in (a) and (c) and for jets with
β0 ≥ 4 in (b) and (d).

[Figure 5.23 plots: ∆Ravg(CC0, t) and ∆Ravg(ti, tj ∈ CC0) versus fraction of events for β0 = 2-3 and β0 ≥ 4, for QCD jets and top quark jets.]

(a)

(b)

(c)

(d)

Figure 5.24: Comparison between signal top jets and background QCD jets of the metrics
∆Ravg(CC1, t) and ∆Ravg(ti, tj ∈ CC1). The distributions of ∆Ravg(CC1, t) and
∆Ravg(ti, tj ∈ CC1) are shown for jets with β0 = 2 − 3 in (a) and (c) and for jets with
β0 ≥ 4 in (b) and (d).

[Figure 5.24 plots: ∆Ravg(CC1, t) and ∆Ravg(ti, tj ∈ CC1) versus fraction of events for β0 = 2-3 and β0 ≥ 4, for QCD jets and top quark jets.]

hand, the mass distribution in background QCD jets peaks at lower values and exhibits a

long tail, which is consistent with the CCs reconstructing objects from random patterns of

topoclusters. In order to extend the kinematic description of the CCs, variables that are

(a)

(b)

Figure 5.25: The inclusive distributions of mass (a) and pT (b) of connected components
from jets with β0 = 1 − 3.

inspired by the jet substructure observables discussed in subsection 5.1.1 are defined using

the CCs as subjets. The n-subjettiness variables are obtained by calculating the ∆R distance

between topoclusters and the closest CC. If the jet has β0 > n, the CCs are reclustered using

the Cambridge-Aachen algorithm until there are exactly n CCs in the jet. The Cambridge-

Aachen algorithm is used in order to maintain consistency with the spatial clustering that is

used by the Mapper algorithm when creating vertices. Furthermore, by the same reasoning,

splitting scales that are analogous to the kT splitting scales are deﬁned as the minimum

distance between two CCs before they are merged using the Cambridge-Aachen algorithm.

Two variations of the n-point energy correlation functions and their ratios are deﬁned. The

ﬁrst set calculates the energy correlation of the jet by using the CCs as the jet constituents.

[Figure 5.25 plots: (a) CC mass and (b) CC pT in GeV versus fraction of events for β0 = 1-3, for QCD jets and top quark jets.]

The second set calculates the energy correlation of a CC by using the topoclusters associated

with the CC as its constituents. Figure 5.26 shows example distributions of these variables

that are inspired by the jet substructure observables, with additional plots presented in

Appendix C. As can be observed from the plots, the interplay between the topological

structure of jets and the kinematics of the CCs contains discriminatory power between signal

and background jets. In the next section, the information obtained from the topology of jets

and the kinematics of the CCs will be used to train two taggers that are designed to tag

signal top jets against background QCD jets.
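The CC-based n-subjettiness variables and splitting scales described in this section can be illustrated with the following Python sketch. It is a sketch under stated assumptions and not the implementation used in this analysis. The reclustering merges the closest pair of CCs in ∆R by adding their four-momenta, the splitting scale is recorded as the ∆R of each merge, and τn is normalized as a pT-weighted average of the distance to the closest CC. This normalization is an assumption borrowed from the usual n-subjettiness definition, and the function names and inputs are hypothetical.

```python
import math
from itertools import combinations


def four_momentum(pt, eta, phi):
    """Massless four-momentum (E, px, py, pz) from (pT, eta, phi)."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px * px + py * py + pz * pz), px, py, pz)


def eta_phi(p):
    """(eta, phi) of a four-momentum with nonzero transverse momentum."""
    _, px, py, pz = p
    return math.asinh(pz / math.hypot(px, py)), math.atan2(py, px)


def delta_r(a, b):
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[0] - b[0], dphi)


def recluster(ccs, n):
    """Merge the closest pair of CCs until n remain; return (CCs, merge scales)."""
    ccs, scales = list(ccs), []
    while len(ccs) > n:
        i, j = min(combinations(range(len(ccs)), 2),
                   key=lambda ij: delta_r(eta_phi(ccs[ij[0]]), eta_phi(ccs[ij[1]])))
        scales.append(delta_r(eta_phi(ccs[i]), eta_phi(ccs[j])))
        merged = tuple(a + b for a, b in zip(ccs[i], ccs[j]))
        ccs = [c for k, c in enumerate(ccs) if k not in (i, j)] + [merged]
    return ccs, scales


def tau(topoclusters, ccs, n):
    """pT-weighted average Delta R between topoclusters and the closest of n CCs."""
    axes = [eta_phi(c) for c in recluster(ccs, n)[0]]
    weighted = sum(pt * min(delta_r((eta, phi), axis) for axis in axes)
                   for pt, eta, phi in topoclusters)
    return weighted / sum(pt for pt, _, _ in topoclusters)


# Made-up jet with four topoclusters grouped into three CCs.
topoclusters = [(10.0, 0.0, 0.0), (10.0, 0.0, 0.2), (10.0, 0.0, 2.0), (10.0, 0.0, 2.2)]
p = [four_momentum(*t) for t in topoclusters]
ccs = [tuple(a + b for a, b in zip(p[0], p[1])), p[2], p[3]]
tau1, tau2 = tau(topoclusters, ccs, 1), tau(topoclusters, ccs, 2)
tau21 = tau2 / tau1
last_split = recluster(ccs, 1)[1][-1]  # scale of the final merge into a single CC
```

In this made-up configuration the topoclusters form two well-separated groups, so τ2 is much smaller than τ1 and the ratio τ21 is small, as expected for a two-pronged topology.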


(a)

(b)

(c)

(d)

Figure 5.26: Comparisons of jet substructure variable distributions that are evaluated using
the CCs of the jet and the topoclusters associated with a given CC between signal and back-
ground jets. The Cambridge-Aachen splitting scale of a jet when the last two remaining CCs
are merged into a single CC is shown in (a) for jets with β0 ≥ 4. The n-subjettiness ratio τ21
is shown in (b) for jets with β0 ≥ 4. The observables τn are obtained by reclustering the CCs
in the jet until there are n CCs using the Cambridge-Aachen algorithm and evaluating the
minimum distance between the topoclusters in the jet and the reclustered CCs. The inclu-
sive distributions of the e2 energy correlation function and the ratio D2 for CCs that contain
at least three topoclusters are shown in (c) and in (d), respectively. These observables are
evaluated using the topoclusters that are associated with the CC as the input constituents.

[Figure 5.26 plots: (a) jet √d12 and (b) jet τ21 for β0 ≥ 4; (c) CC e2 and (d) CC D2 for CCs with at least three topoclusters; all versus fraction of events, for QCD jets and top quark jets.]

5.2.4 Machine Learning Studies

In this section, the design and optimization of two tagging algorithms for tagging jets from

signal W′ → tb and background QCD multijet production processes as either signal top

or QCD background jets are presented. Both tagging algorithms are designed to use the

topological and kinematic information of jets, vertices, and CCs that was obtained from the

Mapper algorithm. The ﬁrst tagging algorithm is a deep neural network (DNN) tagger that

uses variables introduced in the previous section that are inspired by the jet substructure

observables as input. The DNN tagger is designed to determine
whether there is residual information in the jet substructure observables obtained from the TDA

that is not utilized by the contained and inclusive top taggers discussed in section 5.1. The

second tagging algorithm is a convolutional graph neural network (GNN) that uses graph

representations of jets as input. As will be detailed shortly, the graph representation of a jet is

built from the Č complex of the jet that is obtained from the Mapper algorithm. The design of

the GNN is motivated in part by the ability of a graph to encode the topological information

of jets in a single structure. Additionally, this allows the deﬁnition of a simpler tagging

algorithm that does not utilize high-level information from the jet substructure variables.

Both tagging algorithms were trained using signal top jets that satisfy the contained top

jet labeling criteria, as discussed in subsection 5.1.3. The taggers are optimized for fixed
signal efficiency working points of 50% and 80%. Finally, the performance of both taggers is

compared to the performance of the contained top DNN tagger, which is referred to as the

jet substructure (JSS) tagger in this section.

The optimization process of the DNN tagger started with the selection of the input

variables from a baseline set of 74 variables. These variables consist of the CCs substructure


information of jets, the kinematic information of vertices, the kinematic information of CCs,

and the topocluster substructure information of CCs. As discussed in the previous section,

the vertices and CCs obtained from the Č complex are interpreted as subjets by adding the

four-momenta of the topoclusters that are associated with these objects. The baseline set

of variables was reduced by clustering variables into groups based on their correlation. The

correlation distance metric between two variables is deﬁned as:

\[ d(x, y) = \sqrt{1 - \rho^2(x, y)} \tag{5.16} \]

where ρ(x, y) is the correlation coeﬃcient between two variables x and y. Variables that

are highly correlated or anti-correlated are mapped close to zero with this metric. The

distance between two clusters of variables Ai and Aj is determined using the complete-

linkage distance:

\[ D(A_i, A_j) = \max_{x \in A_i,\; y \in A_j} d(x, y) \tag{5.17} \]

Two clusters of variables are merged if they achieve the minimal complete-linkage distance

between all possible pairs of clusters of variables. This corresponds to grouping together

all variables that contain approximately the same amount of information. The clustering

process was carried out up to a threshold distance of 0.92, which corresponds to a minimum

absolute correlation within the variable cluster of 0.39. The end result of the clustering

process yielded 26 clusters of variables, which are summarized in the dendrogram shown

in Figure 5.27. A single variable was retained from each individual cluster based on the

separation power that the variable has between signal and background jets. This intermediate

set of variables was reduced to 21 variables by removing those that did not contain suﬃcient

discriminatory power. The ﬁnal set of variables chosen as the inputs for the DNN tagger


is summarized in Table 5.8. The Keras [71] software package was used in the design of

the DNN tagging algorithm. The architecture and optimized hyperparameters of the DNN

tagger are summarized in Table 5.9. The tagger was trained using 200000 contained top jets

from the W′ → tb process as signal jets and 4167611 jets from the QCD multijet production

process as background jets. Both signal and background jets were split evenly into training

and validation datasets.
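The variable clustering described above, based on the correlation distance of Equation 5.16 and the complete-linkage distance of Equation 5.17, can be sketched in Python as follows. The sketch is illustrative and is not the code used to produce Figure 5.27. The function names and the toy variables a-d are hypothetical. The last lines check that the threshold distance of 0.92 corresponds to a minimum absolute correlation of about 0.39.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def distance(x, y):
    """Correlation distance of Equation 5.16."""
    return math.sqrt(max(0.0, 1.0 - pearson(x, y) ** 2))


def cluster_variables(columns, threshold):
    """Complete-linkage clustering (Equation 5.17) of named columns up to a threshold."""
    clusters = [[name] for name in columns]

    def linkage(a, b):
        return max(distance(columns[x], columns[y]) for x in a for y in b)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        if linkage(a, b) > threshold:
            break  # all remaining clusters are farther apart than the threshold
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return clusters


# Toy variables: b and c are fully (anti-)correlated with a, d is nearly uncorrelated.
columns = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [3, 5, 7, 9, 11, 13],
    "c": [-1, -2, -3, -4, -5, -6],
    "d": [1, -1, -1, 1, 1, -1],
}
clusters = cluster_variables(columns, threshold=0.92)

# Minimum absolute correlation within a cluster implied by the threshold distance.
min_abs_rho = math.sqrt(1.0 - 0.92 ** 2)
```

With these toy inputs the variables a, b and c end up in one cluster and d in another, since both fully correlated and fully anti-correlated variables are mapped to a distance of zero.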

Figure 5.27: The clustering dendrogram of the initial baseline set of variables that shows how
the variable clusters are formed based on the correlation metric in Equation 5.16. The black
vertical line is the clustering distance threshold at which the intermediate set of variables
was chosen.

Variable type                         Variables
Fraction of contained topoclusters    v1, CC1, CC3
Mass                                  v2, CC0, CC2
pT                                    v2, CC0, CC2
e2                                    CC0, CC1
D2                                    CC0, CC1, CC2
∆RCoM(CCi, CCj)                       (i,j) = (0,1), (0,2), (1,2)
∆R(CCi, CCj)                          (i,j) = (0,2)
Cambridge-Aachen splitting scales     √d12, √d23, √d34

Table 5.8: List of variables used as input to the DNN grouped by variable type. The variables
that are defined on vertices and connected components obtained from the Mapper algorithm
are denoted by vi and CCi respectively, where the index i is used to denote the ordering
of the objects based on their pT. The fraction of contained topoclusters is the ratio of
the number of topoclusters associated with the object to the total number of topoclusters
in the jet. The mass and pT are obtained by adding the four-momentum vectors of the
topoclusters associated with the object. The energy correlation functions are calculated by
using the associated topoclusters of the connected components as the constituents.

Hyperparameter                        Option used
Layer                                 Dense
Number of hidden layers               3
Number of nodes per hidden layer      20, 15, 10
Activation function                   Scaled exponential linear unit (selu) [72]
L1 regularizer                        None
L2 regularizer                        None
Weight initializer                    lecun normal
Optimizer                             Adam with Nesterov momentum (Nadam) [73]
Learning rate                         0.00001
Batch size                            500
Batch normalization                   Yes
Number of epochs                      1000
Loss function                         Binary crossentropy

Table 5.9: List of hyperparameters optimized for the DNN tagger. The DNN consists of 3
hidden layers with the number of nodes decreasing in each subsequent layer.

The optimization process of the GNN tagger started with the design of the graph representation

of jets. Each individual jet is assigned a graph whose vertices correspond to

the CCs of the Č complex of the jet. Each vertex of the graph is assigned a set of input

features that consist of the CC four-momentum, mass, and pT, which are evaluated in the

CoM frame of the jet. The graph is made fully connected by including edges ei,j between

all possible pairs of CCs. In order to encode the degree of disconnectedness between two

CCs, each edge is assigned a set of input features that consist of the angular distances be-

tween CCs evaluated in the CoM frame of the jet: ∆RCoM(CCi, CCj), ∆φCoM(CCi, CCj)

and ∆ηCoM(CCi, CCj). This graph structure with the corresponding set of input features

is used as the input to the GNN tagger in order to classify jets. The Spektral [74] soft-

ware package was used in the design of the GNN tagging algorithm. The architecture and

optimized hyperparameters of the GNN tagger are summarized in Table 5.10. The tagger

was trained using 200000 contained top jets from the W′ → tb process as signal jets and

500000 jets from the QCD multijet production process as background jets, which were split

evenly into training and validation datasets. The number of background jets used for the

training of the GNN had to be reduced compared to the DNN training due to limits on the

available memory resources. This is because the GNN requires all graphs from the dataset

to be available during the training process, and the amount of memory that each graph takes

is large, which can exceed the available resources if a large number of jets are included in

the training.
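
The graph representation described above can be sketched as follows. This is an illustration only, not the Spektral-based implementation of the analysis. It assumes that the kinematics of each CC are already expressed in the CoM frame of the jet (the boost is not shown), and the dictionary keys, function names, and example values are hypothetical.

```python
import math
from itertools import combinations


def delta_phi(phi_a, phi_b):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi


def node_features(cc):
    """Node features of one CC: four-momentum (E, px, py, pz), mass and pT."""
    px = cc["pt"] * math.cos(cc["phi"])
    py = cc["pt"] * math.sin(cc["phi"])
    pz = cc["pt"] * math.sinh(cc["eta"])
    energy = math.sqrt(px**2 + py**2 + pz**2 + cc["mass"] ** 2)
    return [energy, px, py, pz, cc["mass"], cc["pt"]]


def build_jet_graph(ccs):
    """Return (nodes, edges) of the fully connected graph of a jet.

    `edges` maps each unordered pair (i, j), i < j, to the edge features
    [Delta R, Delta phi, Delta eta] between CC i and CC j.
    """
    nodes = [node_features(cc) for cc in ccs]
    edges = {}
    for (i, a), (j, b) in combinations(enumerate(ccs), 2):
        d_eta = a["eta"] - b["eta"]
        d_phi = delta_phi(a["phi"], b["phi"])
        edges[(i, j)] = [math.hypot(d_eta, d_phi), d_phi, d_eta]
    return nodes, edges


# Hypothetical jet with three connected components.
example_ccs = [
    {"pt": 100.0, "eta": 0.0, "phi": 0.0, "mass": 10.0},
    {"pt": 50.0, "eta": 0.3, "phi": 0.4, "mass": 5.0},
    {"pt": 20.0, "eta": -0.2, "phi": 3.0, "mass": 1.0},
]
example_nodes, example_edges = build_jet_graph(example_ccs)
```

A jet with N connected components yields N(N - 1)/2 edges, so the memory needed per graph grows quadratically with the number of CCs.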

The performance during the training process of the DNN and GNN tagging algorithms

is summarized in Figure 5.28. The accuracy, which quantifies the frequency of a given

tagger correctly classifying jets as either signal or background jets, and the loss function

of both models were evaluated as a function of the training epoch, both with the training

and validation datasets. Both the DNN and GNN taggers show no sign of overtraining

since the performance between the training and validation datasets agrees well. However,

the GNN tagger shows signs of undertraining since the accuracy of the validation dataset

Hyperparameter                            Option used

Layer 1                                   Graph convolution with skip connection (GCS) [75]
  Number of output channels               6
  Activation function                     Exponential linear unit (elu) [76]
  Weight initializer                      Glorot uniform [77]
  L1 regularizer                          None
  L2 regularizer                          None

Layer 2                                   Edge-conditioned convolutional layer (ECC) [78]
  Number of output channels               6
  Activation function                     None
  Weight initializer                      Glorot uniform
  MLP number of hidden layers             2
  MLP number of nodes per hidden layer    9, 7
  L1 regularizer                          None
  L2 regularizer                          0.0001

Layer 3                                   Global sum pool

Layers 4-6                                Dense
  Number of hidden layers                 3
  Number of nodes per hidden layer        10, 9, 8
  Activation function                     elu
  L1 regularizer                          None
  L2 regularizer                          0.0001
  Weight initializer                      Glorot uniform

Optimizer                                 Nadam
Learning rate                             0.001
Batch size                                350
Batch normalization                       Yes
Number of epochs                          200
Loss function                             Binary crossentropy

Table 5.10: List of hyperparameters optimized for the GNN tagger. The input graphs are
first processed through the GCS layer. The output of this convolution layer is used as an
input to the ECC layer. The output of the ECC layer is pooled by summing the individual
output channel features per node. The pooled features are then used as the input into a
DNN with three hidden layers, which performs the jet classification.

exceeds that of the training dataset at later epochs. The tagger score distributions shown in

Figure 5.29 indicate that the GNN tagger is not robust enough when classifying signal top

jets, as the GNN tagger score peaks at lower values when compared to the DNN tagger score

for signal top jets. However, both models show good separation power between signal and

background jets. Both 50% and 80% ﬁxed signal eﬃciency working points were deﬁned for

both taggers using the training dataset. The performance of each tagger at these working

points is compared to the corresponding working point of the JSS tagger. The number

of signal and background jets that pass or fail a given working point is summarized in

Figure 5.30. As can be observed in the plots, both the DNN and GNN taggers are slightly

outperformed by the JSS tagger. However, a similar level of background jet rejection is

obtained for all taggers considered.
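
The working-point definition and the comparison of the taggers can be sketched as follows. This is a minimal illustration under stated assumptions: a jet is taken to pass a working point if its score is at or above the threshold, and the background rejection is defined here as the inverse of the background efficiency. The function names are hypothetical. The counts are those displayed in the 50% working point confusion matrices of Figure 5.30.

```python
def working_point_threshold(signal_scores, efficiency):
    """Score cut that keeps (approximately) the requested fraction of signal."""
    ordered = sorted(signal_scores, reverse=True)
    n_keep = max(1, round(efficiency * len(ordered)))
    return ordered[n_keep - 1]


def tagger_metrics(sig_pass, sig_fail, bkg_pass, bkg_fail):
    """Summary metrics from the four entries of a confusion matrix."""
    total = sig_pass + sig_fail + bkg_pass + bkg_fail
    return {
        "signal_efficiency": sig_pass / (sig_pass + sig_fail),
        # Assumed convention: rejection = 1 / background efficiency.
        "background_rejection": (bkg_pass + bkg_fail) / bkg_pass,
        "accuracy": (sig_pass + bkg_fail) / total,
    }


# (top tagged, top not tagged, dijet tagged, dijet not tagged), Figure 5.30,
# 50% fixed signal efficiency working point.
COUNTS_50 = {
    "JSS": (332734, 317785, 1146984, 50948161),
    "DNN": (325243, 325276, 1530045, 50565100),
    "GNN": (324495, 326024, 2071363, 50023782),
}
metrics_50 = {name: tagger_metrics(*counts) for name, counts in COUNTS_50.items()}

# Hypothetical signal scores used only to illustrate the threshold definition.
toy_scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
toy_cut_50 = working_point_threshold(toy_scores, 0.5)
toy_cut_80 = working_point_threshold(toy_scores, 0.8)
```

The 80% working point always corresponds to a looser (lower) score threshold than the 50% working point.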

In order to determine if there is residual information from the topology of jets that is

not being used by the taggers, the distributions of variables obtained from the TDA are

compared between signal and background jets in diﬀerent tagging selection regions. The

distributions are compared for jets that pass either the 80% working point tagging criteria

of the DNN or GNN taggers independently while simultaneously failing the 50% working

point tagging criteria of the JSS tagger. Conversely, the distributions are also compared for

jets that fail either the 50% working point of the DNN or GNN taggers independently while

simultaneously passing the 80% working point of the JSS tagger. The jets that satisfy these tagging selections

populate a phase space where the classiﬁcation of jets by the taggers is ambiguous. For

example, signal top jets in these regions contain features that are deemed background-like

by the tight signal requirements of a tagger while being loosely considered as signal by

another tagger. Thus, if the distributions of variables show diﬀerences between signal and

background jets in these regions, then this implies that there is residual information from the

Figure 5.28: The DNN loss function (a) and accuracy (b), and the GNN loss function (c) and
accuracy (d). Both metrics are shown for the training and validation datasets as a function
of the training epochs of the networks.

Figure 5.29: The score distributions of the DNN (a) and the GNN (b) models, overlaid
for signal top jets and background QCD jets.

Figure 5.30: The 50% (top) and 80% (bottom) fixed signal efficiency working point confusion
matrices for the jet substructure contained top DNN tagger (JSS), the DNN trained using
the information from the Mapper algorithm, and the GNN using the graph representation
of the jets. The rows of the matrices correspond to the number of events for the signal top
jet and background QCD multijet classes while the columns correspond to the number of
predicted events in each class for a given tagger. The total number of events for each class
is shown on the right vertical axis.

Counts shown in Figure 5.30 (rows: true class; columns: predicted class for each tagger):

50% fixed signal efficiency working point
True class  JSS DNN top  JSS DNN dijet  DNN top  DNN dijet  GNN top  GNN dijet  Total
top         332734       317785         325243   325276     324495   326024     650519
dijet       1146984      50948161       1530045  50565100   2071363  50023782   52095145

80% fixed signal efficiency working point
True class  JSS DNN top  JSS DNN dijet  DNN top  DNN dijet  GNN top  GNN dijet  Total
top         525021       125498         519004   131515     520063   130456     650519
dijet       4396715      47698430       4937915  47157230   6737655  45357490   52095145

topology of jets that is not fully used by the taggers and can improve the jet classification.
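
The two kinds of tagging selection regions described above can be expressed compactly. The sketch below is illustrative only: the threshold values are hypothetical placeholders, and a jet is assumed to pass a working point when its score is at or above the corresponding threshold.

```python
def ambiguous_regions(scores, cuts, tagger):
    """Return the ambiguous tagging regions that a jet populates.

    `scores` maps a tagger name to the score of the jet, and `cuts` maps
    (tagger name, working point in percent) to a score threshold. `tagger`
    is the tagger compared against the JSS tagger ("DNN" or "GNN").
    """
    regions = []
    pass_tagger_80 = scores[tagger] >= cuts[(tagger, 80)]
    pass_tagger_50 = scores[tagger] >= cuts[(tagger, 50)]
    pass_jss_80 = scores["JSS"] >= cuts[("JSS", 80)]
    pass_jss_50 = scores["JSS"] >= cuts[("JSS", 50)]
    if pass_tagger_80 and not pass_jss_50:
        regions.append(f"pass {tagger} 80% and fail JSS 50%")
    if pass_jss_80 and not pass_tagger_50:
        regions.append(f"fail {tagger} 50% and pass JSS 80%")
    return regions


# Hypothetical thresholds: the 50% working point is the tighter one.
toy_cuts = {("JSS", 50): 0.9, ("JSS", 80): 0.6, ("DNN", 50): 0.8, ("DNN", 80): 0.5}
```

With these definitions the two regions are not mutually exclusive: a jet whose scores lie between the 80% and 50% thresholds of both taggers populates both.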

First, the number of connected components in jets was compared in order to determine if

the topology of the jets contained residual information in the tagging phase spaces considered.

As can be observed in Figure 5.31, the distribution of the number of connected components

does not differ between signal top jets and background QCD jets. Thus, at a surface level, the

information from the homology of jets has been fully used by the taggers. As a next step,

the kinematic and substructure observables of the objects that deﬁne the homology of jets

were analyzed in order to determine if there is residual information in the interplay between

the topology and kinematics of jets.

The kinematic distributions of the vertices of the Č complex were compared in order to

determine if the mapping onto the ﬁlter function feature space and spatial clustering of the

topoclusters contains residual information. Figures 5.32 and 5.33 show the pT distributions

of the second leading and third leading vertices in the Č complex of the jets, respectively.

As can be observed from these distributions, signal top jets that fail the 50% working points

of the DNN or GNN and pass the 80% working point of the JSS tagger have narrower

pT distributions for their vertices when compared to background jets. Thus, the JSS tagger

is able to identify signal top jets as having vertices with well-deﬁned pT values. On the

other hand, jets that pass either the 80% working point of the DNN or GNN and fail the

50% working point of the JSS tagger do not show signiﬁcant diﬀerences in the kinematics of

vertices. This indicates that the DNN and GNN taggers have used most of the information

from vertices.

The Cambridge-Aachen splitting scales were compared in order to assess if there is resid-

ual information from how vertices form CCs and how the CCs are distributed within the

jet. Figures 5.34 and 5.35 show the Cambridge-Aachen splitting scales from the three-to-two

and two-to-one CC mergings, respectively. As can be observed in these plots, signal top jets

tend to have larger merging scales compared to QCD jets in all tagging selection regions

considered except for jets that pass the 80% working point of the DNN tagger and fail the

50% working point of the JSS tagger. Thus, all taggers except the DNN tagger have not

used to their full extent the information that signal top jets tend to have CCs that are more

spread out within the jet when compared to background jets.
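
The idea of a merging scale can be illustrated with a Cambridge-Aachen-style clustering of the CCs. The sketch below is an assumption-laden illustration, not the definition used in this analysis: it merges the geometrically closest pair of objects at each step, sums their four-momenta, and records the ∆R at which each merge happens, without any further normalization. The function names and the example kinematics are hypothetical.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Convert (pT, eta, phi, m) to (E, px, py, pz)."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px * px + py * py + pz * pz + mass * mass), px, py, pz)


def delta_r(a, b):
    """Rapidity-azimuth distance between two four-vectors."""

    def y_phi(v):
        e, px, py, pz = v
        return 0.5 * math.log((e + pz) / (e - pz)), math.atan2(py, px)

    ya, pa = y_phi(a)
    yb, pb = y_phi(b)
    dphi = (pa - pb + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(ya - yb, dphi)


def ca_merge_distances(vectors):
    """Merge the closest pair repeatedly; return Delta R of each merge in order."""
    objs = list(vectors)
    distances = []
    while len(objs) > 1:
        dist, a, b = min(
            (delta_r(objs[i], objs[j]), i, j)
            for i in range(len(objs))
            for j in range(i + 1, len(objs))
        )
        merged = tuple(x + y for x, y in zip(objs[a], objs[b]))  # four-momentum sum
        objs = [o for k, o in enumerate(objs) if k not in (a, b)] + [merged]
        distances.append(dist)
    return distances


# Three hypothetical massless CCs at the same azimuth.
toy_ccs = [
    four_vector(100.0, 0.0, 0.0),
    four_vector(100.0, 0.2, 0.0),
    four_vector(100.0, 1.0, 0.0),
]
toy_merges = ca_merge_distances(toy_ccs)
three_to_two = toy_merges[-2]  # scale of the merge from three objects to two
two_to_one = toy_merges[-1]    # scale of the final merge from two objects to one
```

In this picture, CCs that are more spread out within the jet produce larger values for the last merges.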

The n-subjettiness and n-point energy correlation observables in jets were compared in

order to determine if there is additional discriminatory information from the substructure and

radiation pattern of the CCs in the jets. The distributions of the 2-point energy correlation

function e2 and the n-subjettiness ratios τ21 and τ32 are shown in Figures 5.36 - 5.38. The

2-point energy correlation function e2 contains some residual information in all phase spaces

considered except for jets that pass the 80% DNN and fail the 50% JSS taggers. Thus, all

taggers except the DNN have not used all the information available from how the energy

of the jet is distributed across its CCs. The τ21 distribution is bimodal for both signal and

background jets while the τ32 distribution peaks sharply at 1 for both signal and background

jets. These observations indicate that the jets that populate these phase spaces are better

modeled with either two CCs or a single CC, with the degree of the preferred substructure

varying across the diﬀerent tagging criteria.
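
For reference, the energy correlation functions evaluated with the CCs as constituents can be sketched as follows. The sketch uses the commonly quoted definitions of the normalized 2-point and 3-point energy correlation functions and of the ratio D2 = e3 / e2³. The angular exponent β = 1 and the use of the scalar pT sum of the constituents as the normalization are assumptions, and the function names are hypothetical. The n-subjettiness ratios are not covered here.

```python
import math
from itertools import combinations


def _delta_r(a, b):
    """Angular distance between two constituents in the (eta, phi) plane."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)


def energy_correlation(constituents, beta=1.0):
    """Return (e2, e3, D2) for a list of constituents with pt, eta and phi."""
    pt_sum = sum(c["pt"] for c in constituents)
    e2 = sum(
        a["pt"] * b["pt"] * _delta_r(a, b) ** beta
        for a, b in combinations(constituents, 2)
    ) / pt_sum**2
    e3 = sum(
        a["pt"] * b["pt"] * c["pt"]
        * (_delta_r(a, b) * _delta_r(a, c) * _delta_r(b, c)) ** beta
        for a, b, c in combinations(constituents, 3)
    ) / pt_sum**3
    d2 = e3 / e2**3 if e2 > 0.0 else float("nan")
    return e2, e3, d2


# Three hypothetical CCs of equal pT on an equilateral triangle of side 1.
toy_constituents = [
    {"pt": 1.0, "eta": 0.0, "phi": 0.0},
    {"pt": 1.0, "eta": 1.0, "phi": 0.0},
    {"pt": 1.0, "eta": 0.5, "phi": math.sqrt(3.0) / 2.0},
]
toy_e2, toy_e3, toy_d2 = energy_correlation(toy_constituents)
```

A jet with only two constituents has e3 = 0, and therefore D2 = 0, which reflects the absence of a third resolved prong.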

Finally, the kinematic distributions of CCs and the substructure observables evaluated

using the topoclusters associated with a given CC are compared. This is done in order

to determine if there is residual information from the energy and radiation patterns that

the CCs reconstruct from the topoclusters. The distributions of the energy correlation

function ratio D2 of the second leading CC and the mass of the leading CC are shown

in Figures 5.39 and 5.40, respectively. The CC1 D2 distribution in background QCD jets is

narrower compared to signal top jets, which indicates that there is some residual information

in the energy distribution of topoclusters within the CCs that is not fully used by the taggers.

The mass bumps near the W boson mass and top quark mass that are observed in the CC0

mass distribution for signal top jets indicate that CC0 has partially reconstructed some of

the substructure of the jet in these phase spaces. Additionally, the CC0 mass distribution

for background QCD jets that pass the 80% GNN tagger and fail the 50% JSS tagger shows

a peak below the top quark mass with a long tail that extends to higher values, which

is characteristic of attempts to reconstruct the substructure of top jets from inconsistent

radiation patterns. This could indicate that the convolutional layers of the GNN have

learned to reconstruct the top mass from the graph structure of jets but have not fully used

this information for jet classiﬁcation.

To summarize these observations, the variables obtained from the TDA of jets contain

residual information that is not used to its full extent by the taggers studied. This infor-

mation could be used to improve the separation between signal top jets and background

QCD jets in phase spaces where their classiﬁcation by the taggers is ambiguous. As dis-

cussed, both signal and background jets that populate these phase spaces are characterized

by a topology that best models the jets with a single CC or two CCs. In the case of signal

top jets, the CCs are more spatially spread out compared to the CCs in background QCD

jets. Additionally, the leading CC in signal top jets has partially reconstructed some of the

relevant substructure of the jet, while in background QCD jets the reconstruction is more

consistent with reconstructing substructures from random patterns of topoclusters.

Figure 5.31: The distribution of the number of connected components in jets that fail the
50% working point of the DNN tagger (a) or of the GNN tagger (b) and pass the 80% working
point of the JSS tagger. The same distributions are shown for jets that pass the 80% working
point of the DNN tagger (c) or of the GNN tagger (d) and fail the 50% working point of the
JSS tagger.

Figure 5.32: The pT distribution of the second leading vertex of the Č complex of jets
that fail the 50% working point of the DNN tagger (a) or of the GNN tagger (b) and pass the
80% working point of the JSS tagger. The same distributions are shown for jets that pass the
80% working point of the DNN tagger (c) or of the GNN tagger (d) and fail the 50% working
point of the JSS tagger.

Figure 5.33: The pT distribution of the third leading vertex of the Č complex of jets
that fail the 50% working point of the DNN tagger (a) or of the GNN tagger (b) and pass the
80% working point of the JSS tagger. The same distributions are shown for jets that pass the
80% working point of the DNN tagger (c) or of the GNN tagger (d) and fail the 50% working
point of the JSS tagger.

Figure 5.34: The three-to-two connected component Cambridge-Aachen splitting scale of jets
that fail the 50% working point of the DNN tagger (a) or of the GNN tagger (b) and pass the
80% working point of the JSS tagger. The same distributions are shown for jets that pass the
80% working point of the DNN tagger (c) or of the GNN tagger (d) and fail the 50% working
point of the JSS tagger.

Figure 5.35: The two-to-one connected component Cambridge-Aachen splitting scale of jets
that fail the 50% working point of the DNN tagger (a) or of the GNN tagger (b) and pass the
80% working point of the JSS tagger. The same distributions are shown for jets that pass the
80% working point of the DNN tagger (c) or of the GNN tagger (d) and fail the 50% working
point of the JSS tagger.

Figure 5.36: The energy correlation function e2 evaluated with the CCs of jets that fail the
50% working point of the DNN tagger (a) or of the GNN tagger (b) and pass the 80% working
point of the JSS tagger. The same distributions are shown for jets that pass the 80% working
point of the DNN tagger (c) or of the GNN tagger (d) and fail the 50% working point of the
JSS tagger.

Figure 5.37: The τ21 ratio evaluated with the CCs of jets that fail the 50% working point of
the DNN tagger (a) or of the GNN tagger (b) and pass the 80% working point of the JSS
tagger. The same distributions are shown for jets that pass the 80% working point of the DNN
tagger (c) or of the GNN tagger (d) and fail the 50% working point of the JSS tagger.

Figure 5.38: The τ32 ratio evaluated with the CCs of jets that fail the 50% working point of
the DNN tagger (a) or of the GNN tagger (b) and pass the 80% working point of the JSS
tagger. The same distributions are shown for jets that pass the 80% working point of the DNN
tagger (c) or of the GNN tagger (d) and fail the 50% working point of the JSS tagger.

Figure 5.39: The D2 of the second leading CC of jets that fail the 50% working point of
the DNN tagger (a) or of the GNN tagger (b) and pass the 80% working point of the JSS
tagger. The same distributions are shown for jets that pass the 80% working point of the DNN
tagger (c) or of the GNN tagger (d) and fail the 50% working point of the JSS tagger.

Figure 5.40: The mass of the leading CC of jets that fail the 50% working point of the DNN
tagger (a) or of the GNN tagger (b) and pass the 80% working point of the JSS tagger. The
same distributions are shown for jets that pass the 80% working point of the DNN tagger (c)
or of the GNN tagger (d) and fail the 50% working point of the JSS tagger.

Chapter 6

Searches for Vector-Like Quarks

In this Chapter, the search for Vector-Like Quarks (VLQs) is presented. Two analyses were

performed for the search of Vector-Like top quarks (T ) where the T decays into Ht or Zt.

The ﬁrst analysis is dedicated to the search for a singly produced T in association with an

electron or muon, referred to as the 1-lepton channel¹. The second analysis is dedicated to

the search for pair-produced T s both in the 0-lepton and 1-lepton channels. Both analyses

use 139 fb⁻¹ of data recorded during the 2015-2018 period. Additionally,

both analyses follow similar background modeling and event selection criteria as discussed

in Chapter 4, as well as a similar search strategy that will be discussed in this Chapter.

The ﬁrst part of this Chapter is devoted to the single production analysis and will dis-

cuss its strategy and results. The results are interpreted using two signal benchmarks: the

SU (2) singlet (T 2/3) gauge representation and the SU (2) doublet (T 2/3 B−1/3) gauge rep-

resentation. The involvement of the author of this thesis in this analysis was mostly limited

to the derivation of correction factors that were designed to improve the modeling of the

background processes. This has an important role in the overall analysis since having a well-

modeled background is essential in the design of the analysis strategy and interpretation of

results. More emphasis will be given to the overall search strategy of the analysis in this

ﬁrst part of the Chapter, which will serve as an introduction and motivation for the pair

¹ Throughout the remainder of this Chapter, the word lepton will refer specifically to either an electron

or muon, unless otherwise stated.

production analysis. This analysis is now concluded, and its results have been published [79].

The second part of this Chapter is devoted to the currently ongoing pair production

analysis. At the time of writing this dissertation, the 1-lepton channel of the analysis is far

more developed than the 0-lepton channel. Only the 1-lepton channel will be discussed in

the second part of this Chapter. However, the 0-lepton channel will follow a similar analysis

strategy as the one that will be discussed for the 1-lepton channel. Since several aspects

of the search strategy of this analysis are shared with the single production analysis, only

the strategy components that are diﬀerent will be discussed. The results of this analysis

are interpreted using four signal benchmarks: the SU (2) singlet gauge representation, the

SU (2) doublet gauge representation, assuming the branching ratio BR(T → Ht) = 1, and

assuming the branching ratio BR(T → Zt) = 1.

6.1 Single Production of Vector-Like Quarks

6.1.1 Analysis Strategy

The single production analysis is optimized to search for the production of a T that decays

to a top quark and either a Higgs or Z boson in the 1-lepton channel. The lepton is mainly

expected to be produced from the leptonic decay of a top quark. Less frequent sources of the

lepton include, for example, the dileptonic decay of the Z boson in which one of the leptons is

misreconstructed. As discussed in subsection 4.2.1, the single production of

a T is initiated through an electroweak interaction that results in the production of an

associated top or bottom quark, referred to as the t-associated and b-associated production

modes respectively. Thus, the signal processes in this analysis can be categorized based on

the T decays and the associated production modes as follows:

1. T (→ Ht)qb for the b-associated production of a T decaying into Ht

2. T (→ Zt)qb for the b-associated production of a T decaying into Zt

3. T (→ Ht)qt for the t-associated production of a T decaying into Ht

4. T (→ Zt)qt for the t-associated production of a T decaying into Zt

Although the analysis strategy is optimized for these production modes and decay channels

of the T , the search is aimed at the SU (2) singlet (T 2/3) and doublet (T 2/3 B−1/3) gauge

representations of VLQs, which are the signal benchmarks of the analysis. It should be noted

that the coupling of the T to the W or Z boson is dependent on the SU (2) gauge represen-

tation. In the case of the doublet representation, the coupling to the W boson vanishes due

to charge conservation considerations, thereby making the t-associated production mode the

only allowed mode for this representation. As previously discussed in subsection 4.2.1, even

though the t-associated production mode is kinematically suppressed due to the mass of the

top quark, studying both production modes is well motivated by the theory and extends the

interpretability of the parameter space in the analysis results.

The distributions of the number of jets and b-tagged jets overlaid between the different

signal processes just described and the total SM background are shown in Figure 6.1. The

distributions are shown at the event preselection level that was discussed in subsection 4.2.3.

For the b-associated production mode, the number of jets that originate directly from the

main decay topology, which includes the associated b quark and the decay products of the

T, is on average expected to be 4. However, if the Higgs or Z boson that originates from the

T is highly boosted, then the two-pronged decay of this particle might not be identified

Figure 6.1: The distributions of the multiplicities of jets (a) and b-tagged jets (b) at
preselection level, overlaid for the different signal processes with a T mass of 1.6 TeV and
the SM background. These figures are taken from [79].

and instead be reconstructed as a single jet, thereby reducing the number of jets in the

event. Additionally, the associated b quark that originates from the initial gluon split can

potentially decay in the high-pseudorapidity region of the detector due to its low mass,

in which case it will not be reconstructed as a central jet. As observed in Figure 6.1a, the bulk of the

b-associated production mode populates the 3 − 5 jet region, which is denoted as the low-

jet (LJ) multiplicity region. On the other hand, the t-associated production mode mostly

populates the ≥ 6 jets region, which is denoted as the high-jet (HJ) multiplicity region.

Since the analysis is performed in the 1-lepton channel, at least one of the top quarks in the

t-associated production mode is expected to decay hadronically. Thus, the total number of

jets that are expected from the top quark decays alone can range from 2 − 5, depending on

the degree of collimation of the top decay products. In addition to the jets that originate

directly from the main decay topology, the ﬁnal state radiation of the signal processes can

lead to the production of additional jets. However, these additional jets will not be as

Baseline selections on jet and b-tag multiplicities

Jet multiplicity   b-tag multiplicity   Channel name   Targeted signal
3–5                1–2                  LJ, 1–2b       T (→ Zt)qb
3–5                ≥3                   LJ, ≥3b        T (→ Ht)qb
≥6                 1–2                  HJ, 1–2b       T (→ Zt)qt
≥6                 ≥3                   HJ, ≥3b        T (→ Ht)qt

Table 6.1: Deﬁnition of the four baseline analysis search regions based on jet and b-tagged
jet multiplicity and the signal process which they are designed to target.

energetic as the ones that arise directly from the main decay topology of signal processes.

Another feature that distinguishes signal events from background events is the number

of b-tagged jets in the event. Of all the T production modes considered, the ones with

the T → Ht decay channel are expected to have the largest number of b-tagged jets. This

is due to the H → bb̄ decay channel, which has the largest branching ratio for the Higgs

boson. Thus, for the T → Ht decay channel the number of jets that originate from the

main decay topology and can potentially be b-tagged is 4. However, as previously discussed,

the b-associated production mode could have fewer b-tagged jets due to the possibility of

the associated b quark decaying in the high-pseudorapidity region, which lies outside of the

validity range of the b-tagger used. On the other hand, for the T → Zt decay channel the

expected number of b-tagged jets from the main decay topology ranges from 1 to 2.

Taking these observations into account, four baseline analysis search regions are deﬁned

solely based on the multiplicity of jets and b-tagged jets that individually target each signal

process. These baseline regions are summarized in Table 6.1.
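
The mapping in Table 6.1 can be sketched in a few lines of Python. This is only an illustration of the region logic described above: the function name `baseline_region`, the dictionary `TARGETED_SIGNAL`, and the ASCII channel labels are hypothetical and are not taken from the analysis code.

```python
from typing import Optional

# Channel name -> targeted signal process, following Table 6.1.
# The ASCII labels ("1-2b", ">=3b") are illustrative stand-ins for the table entries.
TARGETED_SIGNAL = {
    "LJ, 1-2b": "T(->Zt)qb",
    "LJ, >=3b": "T(->Ht)qb",
    "HJ, 1-2b": "T(->Zt)qt",
    "HJ, >=3b": "T(->Ht)qt",
}


def baseline_region(n_jets: int, n_bjets: int) -> Optional[str]:
    """Return the baseline channel name, or None if the event is outside Table 6.1."""
    if n_jets < 3 or n_bjets < 1:
        return None
    jet_label = "LJ" if n_jets <= 5 else "HJ"      # 3-5 jets vs. >= 6 jets
    b_label = "1-2b" if n_bjets <= 2 else ">=3b"   # 1-2 vs. >= 3 b-tagged jets
    return f"{jet_label}, {b_label}"
```

For instance, an event with 4 jets and 3 b-tagged jets is assigned to the region that targets the b-associated production mode with the T → Ht decay.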


6.1.2 Signal Discrimination

From the previous discussion it is clear that signal events can be isolated from background

events by placing selection cuts on the multiplicity of jets and b-tagged jets. However,

these requirements are dependent on the signal decay channel and production modes by

definition. Instead, a key observation is that, due to the large mass of the T, its

decay products are expected to be highly boosted regardless of the signal process considered.

Thus, the production of a large number of jets, of which a signiﬁcant fraction is expected to

be boosted, a potentially boosted lepton, and a significant amount of E_T^miss from the leptonic

decay of a boosted top quark motivates the definition of the effective mass (meff) variable:

$$ m_{\text{eff}} = \sum_{\text{central jets}} p_T^{j} + \sum_{\text{leptons}} p_T^{\ell} + E_T^{\text{miss}} $$    (6.1)

which is the scalar sum of the pT of the jets, the lepton, and the E_T^miss that are produced in

the event. This variable allows us to discriminate between signal and background processes in

a way that is agnostic to the signal decay channels and production modes. The distribution

of meff overlaid for the different signal processes and the total SM background is

shown in Figure 6.2 at the event preselection level. All signal processes that are shown are

for a T with a mass mT = 1.6 TeV. As can be observed in the plot, the distribution peaks

close to mT for signal processes, while for the SM background processes the distribution

decays rapidly at higher values of meﬀ due to these processes lacking the suﬃcient energy to

produce highly boosted ﬁnal states. As the mass of the T gets larger, the separation power

between signal and background processes improves since the signal processes will populate

the high meﬀ region. Based on these observations, the meﬀ variable is chosen as the variable

on which the ﬁt is performed in the statistical analysis (see subsection 6.1.7).
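
As a minimal sketch of Equation 6.1, the effective mass of a single event can be computed as below. The function name `effective_mass` and the example kinematics are hypothetical and chosen only to illustrate the scalar sum; they are not taken from the analysis.

```python
def effective_mass(central_jet_pts, lepton_pts, met):
    """Scalar sum of central-jet pT, lepton pT, and E_T^miss (Equation 6.1), in GeV."""
    return sum(central_jet_pts) + sum(lepton_pts) + met


# Hypothetical event: four central jets, one lepton, and E_T^miss (all in GeV).
example_meff = effective_mass([620.0, 410.0, 95.0, 40.0], [180.0], 255.0)
```

Because every object enters through its transverse momentum only, the variable does not depend on which decay channel or production mode produced the objects.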


Figure 6.2: Distribution of meff at preselection level, overlaid for the different signal
processes for a T mass of 1.6 TeV and the SM background. This figure is taken from [79].

6.1.3 Boosted Object Tagging and Reconstruction

As discussed in the previous section, due to the large mass of the T , a large number of boosted

jets can be produced from the hadronic decays of the top quark, the Higgs boson, and the Z

boson that are produced in the main decay topology of signal processes. Depending on the

degree of collimation of the decay products of these particles, the jets that are produced can

be reclustered into a single large-R jet. These reclustered jets can be used to identify the

particle that originated them with the use of a tagging algorithm. Thus, this allows us to

potentially reconstruct the direct decay products of the T by correctly tagging the reclustered

large-R jets to their source particle. As discussed in subsection 3.3.4, variable radius RC jets

are used as the inputs to the tagging algorithm due to their ﬂexibility in capturing the decay

products of boosted objects over a wide pT regime. The distributions of the number of RC

jets in an event and their masses are shown in Figure 6.3. As can be observed, the number of

RC jets is, on average, larger in signal events compared to events from the SM background.

Additionally, the RC jets in signal processes exhibit prominent mass peaks that correspond

to the direct decay products of the T . For the b-associated production modes, the mass peak

near the top quark mass is less prominent when compared to the t-associated production

modes. This is because the top quark in the b-associated production mode is likely to decay

leptonically; thus, jets cannot be used to identify the leptonically decaying top.

Figure 6.3: The distributions of the multiplicities of reclustered large-R jets (a) and their
mass (b) at preselection level, overlaid for the different signal processes for a T mass
of 1.6 TeV and the SM background.

The tagging algorithm that is implemented to associate the RC jets with their source particles

is a simple kinematic variable cut-based tagger. The tagger takes as input the pT, mass,

and the number of subjet constituents (Nconst) of the RC jets. The tagger is designed to


identify jets that are produced from hadronically decaying top quarks, Higgs bosons, and

vector bosons inclusively. The tagger does not distinguish between W and Z bosons due

to the similarity of the jets that are produced by these particles in the input variables of

the tagger. However, this ambiguity does not impact the analysis signiﬁcantly since the

production of W bosons is not a central focus of the analysis. The kinematic requirements

to tag a jet to a given particle are summarized in Table 6.2.

Kinematic Observable   t-tagged            H-tagged            V-tagged
pT [GeV]               > 400               > 350               > 350
Mass [GeV]             > 140               [105, 140]          [70, 105]
Nconst                 ≥ 2 if pT < 700     = 2 if pT < 600     = 2 if pT < 450
                       ≥ 1 if pT > 700     ≤ 2 if pT > 600     ≤ 2 if pT > 450

Table 6.2: Kinematic requirements on RC jets to be tagged to a top quark, a Higgs boson,
or a vector boson.

The requirements on the RC

jet pT ensure that the majority of the decay products of the particles are captured within

the jet. The mass requirements for each type of particle are designed to be orthogonal in

order to have well-deﬁned particle classes. Additionally, they allow the tagger to be ﬂexible

on jets that are highly boosted or do not capture all the subsequent decays of the source

particle. Finally, the requirement on Nconst is designed to capture the jet substructure by

introducing a pT dependence that adjusts to the desired jet topology. Jets that are highly

boosted tend to have collimated decay products; thus, the requirements on Nconst at high pT

allow for the merging of subjets. On the other hand, jets that have a lower pT tend to have

a resolved decay topology; thus, the requirements on Nconst are higher or more exclusive

compared to their corresponding high pT requirement. The distributions of the number of

jets that are tagged to a top quark, Higgs boson, and vector boson are shown in Figure 6.4

at the event preselection level. As can be observed from the plots, the signal processes tend

to have a larger fraction of events with at least one jet tagged to a hadronically decaying


boosted object when compared to the SM background. The signal t-associated production

modes have a larger fraction of events with at least one top-tagged jet when compared to the

b-associated production modes, which is expected due to the presence of an additional top

quark in the t-associated production mode. Similarly, the signal processes with the T → Ht

decay channel have a larger fraction of events with a Higgs-tagged jet, while the processes

with the T → Zt decay channel have a larger fraction of V -tagged jets.
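
The cut-based tagger of Table 6.2 can be sketched as follows. This is an illustrative reading of the table, not the analysis implementation: the function name `tag_rc_jet` is hypothetical, and the treatment of values that fall exactly on a boundary (a mass of exactly 105 GeV is assigned to the Higgs window, and a pT exactly at a threshold uses the high-pT requirement) is an assumption, since the table does not specify it.

```python
from typing import Optional


def tag_rc_jet(pt: float, mass: float, n_const: int) -> Optional[str]:
    """Return "t", "H", "V", or None for an RC jet (pT and mass in GeV)."""
    if n_const < 1:
        return None

    # Top window: mass > 140 GeV, pT > 400 GeV.
    if mass > 140.0:
        min_const = 2 if pt < 700.0 else 1
        return "t" if (pt > 400.0 and n_const >= min_const) else None

    # Higgs and vector-boson windows share the same pT threshold of 350 GeV.
    if 105.0 <= mass <= 140.0:      # assumption: 105 GeV belongs to the Higgs window
        label, pt_switch = "H", 600.0
    elif 70.0 <= mass < 105.0:
        label, pt_switch = "V", 450.0
    else:
        return None

    if pt <= 350.0:
        return None
    # Below the switch point exactly two subjets are required; above it,
    # the subjets are allowed to merge into one.
    passes = (n_const == 2) if pt < pt_switch else (n_const <= 2)
    return label if passes else None
```

With these requirements, a jet with pT = 500 GeV, a mass of 90 GeV, and a single constituent is V-tagged, while the same jet at pT = 400 GeV is not, because two resolved subjets are required below 450 GeV.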

Jets alone cannot be used to identify leptonically decaying top quarks that are produced in events,

due to the presence of the lepton and the E_T^miss that originate from this

decay. Instead, a dedicated algorithm is implemented to reconstruct a candidate leptonic top

system from simple kinematic considerations. A schematic representation of this algorithm

is shown in Figure 6.5. First, a candidate leptonically decaying W boson is reconstructed

under the assumption that the magnitude and azimuthal angle of the E_T^miss in the event are the

same as those of the pT of the neutrino that is produced from the leptonically decaying W

boson. The longitudinal momentum component of the neutrino is determined by performing

algebraic manipulations on the four-momenta of the neutrino and lepton in the event under

the constraint that the invariant mass of the lepton-neutrino system is consistent with the

mass of the W boson. The candidate leptonically decaying W boson is then reconstructed

by adding the four-momenta of the lepton and reconstructed neutrino. Next, the candidate

leptonically decaying W boson is spatially matched with the closest b-tagged jet within a

distance of ∆R < 1.5. Additionally, the b-tagged jet must not be a constituent of any tagged

RC jet in the event. This is done in order to avoid potential double counting and to ensure

that the leptonically decaying W is matched with the appropriate b-tagged jet that originates

from the same top quark decay. If no such b-tagged jet exists, then the leptonic top is not

reconstructed in the event. On the other hand, if the candidate leptonically decaying W

Figure 6.4: The distributions of the multiplicities of reclustered large-R jets that are tagged to
a hadronically decaying top quark (a), Higgs boson (b), and vector boson (c) at preselection
level, overlaid for the different signal processes for a T mass of 1.6 TeV and the SM
background. These figures are taken from [79].

Figure 6.5: Schematic representation of the leptonic top reconstruction algorithm.

boson is matched with a free b-tagged jet and the pT of the reconstructed W boson and

b-tagged jet system satisﬁes pT > 300 GeV, then the resulting system is considered to be

a reconstructed leptonic top. The distribution of the number of reconstructed candidate

leptonic tops at the event preselection level is shown in Figure 6.6.
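
The two kinematic ingredients of this algorithm, the W mass constraint on the neutrino longitudinal momentum and the ∆R matching to a b-tagged jet, can be sketched as below. The text does not state how the quadratic equation is solved, so several choices here are assumptions made for illustration: the lepton and neutrino are treated as massless, a W mass of 80.4 GeV is used, both real solutions are returned, and only the real part is kept when the discriminant is negative. All function names are hypothetical.

```python
import math

M_W = 80.4  # GeV; assumed W boson mass for the constraint


def neutrino_pz_solutions(lep_px, lep_py, lep_pz, lep_e, met_x, met_y, m_w=M_W):
    """Solve m(lepton, neutrino) = m_w for the neutrino pz (all inputs in GeV).

    The lepton and neutrino are treated as massless, and the neutrino pT is
    set equal to the missing transverse momentum (met_x, met_y).
    """
    pt_lep_sq = lep_px**2 + lep_py**2
    pt_nu_sq = met_x**2 + met_y**2
    mu = 0.5 * m_w**2 + lep_px * met_x + lep_py * met_y
    central = mu * lep_pz / pt_lep_sq
    discriminant = mu**2 - pt_lep_sq * pt_nu_sq
    if discriminant < 0.0:
        # Assumption: keep only the real part when no real solution exists.
        return [central]
    shift = lep_e * math.sqrt(discriminant) / pt_lep_sq
    return [central - shift, central + shift]


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def closest_free_b_jet(w_eta, w_phi, free_b_jets, max_dr=1.5):
    """Index of the closest b-tagged jet with dR < max_dr, or None.

    free_b_jets holds (eta, phi) pairs of b-tagged jets that are not
    constituents of any tagged RC jet.
    """
    best_index, best_dr = None, max_dr
    for index, (eta, phi) in enumerate(free_b_jets):
        dr = delta_r(w_eta, w_phi, eta, phi)
        if dr < best_dr:
            best_index, best_dr = index, dr
    return best_index
```

The final requirement described above, pT > 300 GeV for the combined W boson and b-tagged jet system, would be applied to the sum of the two four-momenta after the match is found.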

Figure 6.6: The number of reconstructed candidate leptonic tops at preselection level, overlaid
for the different signal processes for a T mass of 1.6 TeV and the SM background.
This figure is taken from [79].

6.1.4 Analysis Search Regions

In order to carry out the statistical analysis, dedicated search regions that are pure enough

in the diﬀerent signal processes considered in this search must be deﬁned. Starting from

the baseline search regions that were deﬁned in subsection 6.1.1, the signal purity for the

diﬀerent signal processes can be enhanced by making additional requirements on the number

of boosted objects in the event, which are tailored to a particular signal process. These

requirements are also motivated by the fact that a larger number of boosted objects in events

drives the meﬀ distribution towards higher values in signal processes, thereby increasing the

overall separation power of meﬀ between signal and background.

For example, the baseline regions that require ≥ 3b are designed to be sensitive to the

T → Ht decay channel; therefore, requiring the presence of at least one Higgs-tagged jet

would increase the purity of this decay channel. Similarly, in the 1–2 b-tagged jets regions the

purity of the T → Zt decay channel can be increased by requiring at least one V -tagged jet.

Additionally, the presence of a V -tagged jet can also improve the sensitivity of signal events

where a semi-boosted hadronically decaying top quark is produced. This could happen in

events where no jets are tagged to the top quark but instead to the W boson originating

from the top decay. The presence of top-tagged jets can be used to increase the sensitivity of

t-associated production modes where one of the top quarks must decay hadronically. Regions

with at least one top-tagged jet can also improve the sensitivity of rare processes such as

T → Ht decays where H → WW/ττ and the lepton is produced from a leptonically decaying

W or τ. Similarly, signal events with T → Zt decays where Z → ℓℓ and one of the leptons

is misreconstructed can also gain sensitivity if a jet is tagged to a top quark. Finally, as

discussed in subsection 4.2.1, at least one forward jet in signal processes is expected to be


produced from the initial quark that recoils oﬀ from the oﬀ-shell W/Z boson. Background

events, on the other hand, are usually not energetic enough to produce jets in the forward

region of the detector, as shown in Figure 6.7. Thus, the overall signal purity can be further

improved by requiring the presence of at least one forward jet in the analysis search regions.

Figure 6.7: The distribution of the number of forward jets at preselection level, overlaid
for the different signal processes for a T mass of 1.6 TeV and the SM background. This
figure is taken from [79].

To summarize this discussion, the analysis search regions, which will be referred to as

analysis ﬁt regions, are obtained from the baseline search regions by making additional

requirements on the number of forward jets (fj), Higgs-tagged jets (H), top-tagged jets (th),

vector boson-tagged jets (V ), and reconstructed leptonic tops (tl). The combined use of

all these regions in a likelihood ﬁt allows for the analysis search to retain sensitivity to

all of the signal processes that can occur in a given signal benchmark model. In addition

to the fit regions that target the different signal processes, two background control regions

are included in the set of ﬁt regions. The purpose of these regions is to calibrate and

constrain the normalization of the tt̄ production in association with at least one b-tagged

jet (tt̄+≥1b) when performing the likelihood fit. These control regions are defined by

requiring the presence of a leptonic top, at least 4 b-tagged jets, and a veto of forward jets

and hadronically decaying boosted objects. In total there are 24 analysis fit regions, which

are summarized in Table 6.3.

In order to ensure that the background is well-modeled in the analysis ﬁt regions, a set of

20 validation regions that are kinematically similar to the ﬁt regions and are signal-depleted

are deﬁned. This is achieved by either requiring a veto on forward jets or inverting the most

relevant boosted object multiplicity requirement of a given ﬁt region. The validation regions

are summarized in Table 6.4.


Fit regions with 3–5 jets

Object mult.                  Region name                             Targeted signal / bkg
1b, 0(th+tl), 0H, ≥1V         LJ, 1b, ≥1fj, 0(th+tl), 0H, ≥1V         T(→ Zt)qb
1b, 0th, ≥1tl, 0H, ≥1V        LJ, 1b, ≥1fj, 0th, ≥1tl, 0H, ≥1V        T(→ Zt)qb
2b, 0(th+tl), 0H, ≥1V         LJ, 2b, ≥1fj, 0(th+tl), 0H, ≥1V         T(→ Zt)qb
2b, 0th, ≥1tl, 0H, ≥1V        LJ, 2b, ≥1fj, 0th, ≥1tl, 0H, ≥1V        T(→ Zt)qb
3b, 0(th+tl), ≥1H, 0V         LJ, 3b, ≥1fj, 0(th+tl), ≥1H, 0V         T(→ Ht)qb
3b, 0th, ≥1tl, ≥1H, 0V        LJ, 3b, ≥1fj, 0th, ≥1tl, ≥1H, 0V        T(→ Ht)qb
3b, ≥1th, 0tl, ≥1H, 0V        LJ, 3b, ≥1fj, ≥1th, 0tl, ≥1H, 0V        T(→ Ht)qb
≥4b, 0(th+tl), ≥1H, 0V        LJ, ≥4b, ≥1fj, 0(th+tl), ≥1H, 0V        T(→ Ht)qb
≥4b, 0th, ≥1tl, ≥1H, 0V       LJ, ≥4b, ≥1fj, 0th, ≥1tl, ≥1H, 0V       T(→ Ht)qb
≥4b, ≥1th, 0tl, ≥1H, 0V       LJ, ≥4b, ≥1fj, ≥1th, 0tl, ≥1H, 0V       T(→ Ht)qb
≥4b, ≥1tl, 0H, 0(V+th)        LJ, ≥4b, 0fj, ≥1tl, 0H, 0(V+th)         tt̄+≥1b

Fit regions with ≥6 jets

Object mult.                  Region name                             Targeted signal / bkg
1b, 0th, 1tl, 0H, ≥1V         HJ, 1b, ≥1fj, 0th, 1tl, 0H, ≥1V         T(→ Zt)qt
1b, 1th, 0tl, 0H, ≥1V         HJ, 1b, ≥1fj, 1th, 0tl, 0H, ≥1V         T(→ Zt)qt
1b, ≥2(th+tl), 0H, ≥1V        HJ, 1b, ≥1fj, ≥2(th+tl), 0H, ≥1V        T(→ Zt)qt
2b, 0th, 1tl, 0H, ≥1V         HJ, 2b, ≥1fj, 0th, 1tl, 0H, ≥1V         T(→ Zt)qt
2b, 1th, 0tl, 0H, ≥1V         HJ, 2b, ≥1fj, 1th, 0tl, 0H, ≥1V         T(→ Zt)qt
2b, ≥2(th+tl), 0H, ≥1V        HJ, 2b, ≥1fj, ≥2(th+tl), 0H, ≥1V        T(→ Zt)qt
3b, 1tl, ≥1H, 0(V+th)         HJ, 3b, ≥1fj, 1tl, ≥1H, 0(V+th)         T(→ Ht)qt
3b, 0tl, ≥1H, 1(V+th)         HJ, 3b, ≥1fj, 0tl, ≥1H, 1(V+th)         T(→ Ht)qt
3b, ≥1H, ≥2(V+tl+th)          HJ, 3b, ≥1fj, ≥1H, ≥2(V+tl+th)          T(→ Ht)qt
≥4b, 1tl, ≥1H, 0(V+th)        HJ, ≥4b, ≥1fj, 1tl, ≥1H, 0(V+th)        T(→ Ht)qt
≥4b, 0tl, ≥1H, 1(V+th)        HJ, ≥4b, ≥1fj, 0tl, ≥1H, 1(V+th)        T(→ Ht)qt
≥4b, ≥1H, ≥2(V+tl+th)         HJ, ≥4b, ≥1fj, ≥1H, ≥2(V+tl+th)         T(→ Ht)qt
≥4b, ≥1tl, 0H, 0(V+th)        HJ, ≥4b, 0fj, ≥1tl, 0H, 0(V+th)         tt̄+≥1b

Table 6.3: Definition of the 24 analysis search regions (referred to as “fit regions”). The
events are categorized based on the multiplicity of central jets (j), b-tagged jets (b), forward
jets (fj), V-tagged jets (V), Higgs-tagged jets (H), hadronic top tagged jets (th), and
reconstructed leptonic tops (tl).


Validation regions with 3–5 jets

b-tag mult.   Fwd-jet mult.   Boosted-object mult.     Region name
1             0               0th, 0tl, 0H, ≥1V        LJ, 1b, 0fj, 0th, 0tl, 0H, ≥1V
1             0               0th, ≥1tl, 0H, ≥1V       LJ, 1b, 0fj, 0th, ≥1tl, 0H, ≥1V
1             ≥1              ≥1(th+tl), 0H, 0V        LJ, 1b, ≥1fj, ≥1(th+tl), 0H, 0V
1             ≥1              ≥1th, 0tl, 0H, ≥1V       LJ, 1b, ≥1fj, ≥1th, 0tl, 0H, ≥1V
2             0               0th, 0tl, 0H, ≥1V        LJ, 2b, 0fj, 0th, 0tl, 0H, ≥1V
2             0               0th, ≥1tl, 0H, ≥1V       LJ, 2b, 0fj, 0th, ≥1tl, 0H, ≥1V
2             ≥1              ≥1(th+tl), 0H, 0V        LJ, 2b, ≥1fj, ≥1(th+tl), 0H, 0V
2             ≥1              ≥1th, 0tl, 0H, ≥1V       LJ, 2b, ≥1fj, ≥1th, 0tl, 0H, ≥1V
≥3            0               0(th+tl), ≥1H, 0V        LJ, ≥3b, 0fj, 0(th+tl), ≥1H, 0V
≥3            ≥1              0H, ≥1(V+tl+th)          LJ, ≥3b, ≥1fj, 0H, ≥1(V+tl+th)

Validation regions with ≥6 jets

b-tag mult.   Fwd-jet mult.   Boosted-object mult.     Region name
1             0               1(th+tl), 0H, ≥1V        HJ, 1b, 0fj, 1(th+tl), 0H, ≥1V
1             0               ≥2(th+tl), 0H, ≥1V       HJ, 1b, 0fj, ≥2(th+tl), 0H, ≥1V
1             ≥1              0th, 0tl, ≥1H, ≥1V       HJ, 1b, ≥1fj, 0th, 0tl, ≥1H, ≥1V
1             ≥1              ≥2(th+tl), ≥1H, 0V       HJ, 1b, ≥1fj, ≥2(th+tl), ≥1H, 0V
2             0               1(th+tl), 0H, ≥1V        HJ, 2b, 0fj, 1(th+tl), 0H, ≥1V
2             0               ≥2(th+tl), 0H, ≥1V       HJ, 2b, 0fj, ≥2(th+tl), 0H, ≥1V
2             ≥1              0th, 0tl, ≥1H, ≥1V       HJ, 2b, ≥1fj, 0th, 0tl, ≥1H, ≥1V
2             ≥1              ≥2(th+tl), ≥1H, 0V       HJ, 2b, ≥1fj, ≥2(th+tl), ≥1H, 0V
≥3            0               ≥1H, ≥1(V+tl+th)         HJ, ≥3b, 0fj, ≥1H, ≥1(V+tl+th)
≥3            ≥1              0H, ≥1(V+tl+th)          HJ, ≥3b, ≥1fj, 0H, ≥1(V+tl+th)

Table 6.4: Definition of the 20 analysis validation regions that are designed to validate the
kinematic modeling of the background processes in the analysis fit regions. The validation
regions are obtained by either vetoing forward jets in the events or inverting the most relevant
boosted object multiplicity requirements in the fit regions that are defined in Table 6.3.


6.1.5 Kinematic Reweighting of Background

As discussed in subsection 6.1.2, the meﬀ variable has a good separation power between

signal and background, which stems from its dependence on the pT and multiplicities of

ﬁnal state objects in an event. However, recent measurements have demonstrated that the

MC simulations of the dominant tt̄ background process and subdominant V+jets background

processes mismodel the pT and multiplicity of jets that are produced in events from these

processes. In the case of tt̄ processes, it is observed that the MC simulation overestimates the

cross section of this process at large values of the jet pT spectrum [80] and underestimates

it at high jet multiplicities [81]. A similar issue is also present in the modeling of V+jets

processes in the high jet multiplicity region and H_T^had (see footnote 2) [82]. These mismodelings in the MC

simulation propagate to meff as an additional source of mismodeling due to how it is defined. This

strongly impacts the high-tail of the meﬀ distribution where most of the signal is expected

to reside, as shown in Figure 6.8.

In order to ﬁx the mismodeling introduced by these background processes, data-driven

correction factors are derived in kinematic regions that are enriched in the background process

that is to be corrected and are depleted in signal events. These regions are referred to

as reweighting source regions (RSRs) and are summarized in Table 6.5. The potential contamination

from signal processes in the RSRs was quantified and found to be negligible. The

reweighting of tt̄+jets is done jointly with the single-top Wt-channel background, denoted

as tt̄ + Wt, due to both processes sharing the same final state and therefore being subject

to interference. Additionally, both processes are generated using the same MC generator,

thus sharing similar mismodeling. The tt̄ + Wt RSR is defined in a way that is close to the

preselection level of the analysis, which is dominated by the tt̄+jets process. For the V+jets

Footnote 2: H_T^had is defined as the scalar sum of the pT of all central jets in the event.

Figure 6.8: Comparison of the meff distribution between the data, the mismodeled background
prediction (blue dashed line), and the reweighted background prediction at preselection
level before performing the likelihood fit. The “Others” background includes the
tt̄V/H, VH, tZ, tt̄tt̄, diboson, and multijet production background processes. The bottom
panel shows the ratios of data to the total mismodeled background prediction and the total
reweighted background prediction. This figure is taken from [79].

background, there is no region in the analysis that is pure enough in the W +jets process

so that it can be isolated to derive a correction factor for it. Instead, the correction factor

is derived for Z+jets in an RSR that requires exactly two same-flavored leptons in order to

isolate this process from other backgrounds. The Z+jets RSR requires at least 3 jets with

exactly one b-tagged jet in order to maintain kinematic consistency with the preselection

region. Additionally, in order to increase the purity of Z+jets in its RSR, the invariant

mass of the dilepton system (mℓℓ) is required to be consistent with the mass of the Z boson.

Finally, to further reduce the contamination from tt̄+jets, an E_T^miss < 100 GeV cut is applied.

The correction factor that is derived for Z+jets is assumed to be valid also for W +jets.

Reweighting source regions

Lepton mult.   Jet mult.   b-tag mult.   Additional cuts                              Targeted background
1              ≥3          2             –                                            tt̄ + Wt
2              ≥3          1             |mℓℓ − MZ| ≤ 10 GeV, E_T^miss < 100 GeV      Z+jets

Table 6.5: Reweighting source regions from which the reweighting functions for tt̄ and Wt
production and W/Z+jets production are derived.

The correction factors are defined in the same way for all background processes that are

to be reweighted. For a background process a and kinematic variable x, the correction factor

is calculated as:

$$ R_a(x) = \frac{\text{Data}(x) - \text{MC}_{\text{non-}a}(x)}{\text{MC}_a(x)} $$    (6.2)

where Data(x), MCnon−a(x), and MCa(x) denote the distributions of the kinematic variable

x in data, the total background MC simulation excluding the process to be reweighted, and

the MC simulation of the process to be reweighted respectively. Since Equation 6.2 is deﬁned

through binned distributions of a kinematic variable x, the correction factor Ra(x) is also a

binned distribution that is a function of x. The reweighting procedure for the background

processes to be corrected can be summarized in the following steps:

1. Derive a bin-by-bin jet multiplicity correction factor (Ra(Njets)) in the RSR for process a.

2. For each event of the process a, apply the binned correction factor that corresponds to the value of Njets in the event as an event weight.

3. Derive a bin-by-bin correction factor for the reduced effective mass variable (m_eff^red; see footnote 3) in the RSR for process a after applying Ra(Njets).

4. Perform a functional fit on Ra(m_eff^red) in order to mitigate statistical effects and then apply it as an event weight in conjunction with Ra(Njets) to the unreweighted process a.

Footnote 3: The reduced effective mass variable is defined as m_eff^red = meff − (Njets − 3) × 50 GeV.
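
The arithmetic behind these steps can be sketched as follows, using plain Python lists as stand-ins for binned histograms. All function names and the container types are hypothetical. Returning a factor of 1 for a bin with no simulated events is an assumption made here to keep the sketch well defined; the text does not say how such bins are treated.

```python
def correction_factor(data, mc_other, mc_a):
    """Bin-by-bin R_a(x) = (Data - MC_non-a) / MC_a, following Equation 6.2."""
    factors = []
    for d, other, a in zip(data, mc_other, mc_a):
        # Assumption: leave bins without simulated events of process a uncorrected.
        factors.append((d - other) / a if a > 0 else 1.0)
    return factors


def reduced_meff(meff, n_jets):
    """Reduced effective mass: meff - (N_jets - 3) x 50 GeV."""
    return meff - (n_jets - 3) * 50.0


def event_weight(n_jets, meff, r_njets, r_mred):
    """Combined weight for one event of process a.

    r_njets: mapping from N_jets to the jet-multiplicity correction factor.
    r_mred:  callable returning the fitted correction at a given reduced meff.
    """
    return r_njets[n_jets] * r_mred(reduced_meff(meff, n_jets))
```

In this picture, steps 1 and 3 are two calls to `correction_factor` on different distributions, and steps 2 and 4 correspond to multiplying the two resulting factors into a single event weight.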

This procedure is first applied to the V+jets background and then subsequently applied

to tt̄ + Wt using the corrected V+jets background. The motivation for using m_eff^red as the

kinematic variable to derive the correction factor that addresses the meff mismodeling is

twofold. First, the constant factor of 50 GeV in its definition approximately corresponds

to the average pT of each additional jet that is produced at preselection level in tt̄ events.

Second, the reweighting of m_eff^red for tt̄ + Wt is performed in exclusive Njets bin regions for

Njets = 3, 4, 5, and 6, and inclusively for Njets ≥ 7. This is implemented in this way because

the additional jets in tt̄ processes can arise directly from the main decay topology of this

process in the exclusive Njets bin regions and therefore can strongly influence the shape of

the correction factor in an Njets-dependent way. Thus, including the Njets − 3 shift in the

definition of m_eff^red reduces the Njets dependence of the correction factors. In the case of

Njets ≥ 7, the additional jets start to arise from outside the main decay topology of the tt̄

system and thus lack sufficient energy to influence the shape of the correction factor. Thus,

the use of m_eff^red allows for an inclusive correction factor in high jet multiplicities instead of

requiring an individual factor for each Njets bin. Unlike the derivation of the m_eff^red correction

factor for tt̄ + Wt, the V+jets correction factor is derived inclusively in jet multiplicity due

to the low event statistics available in the Z+jets RSR. Furthermore, the additional jets in

V+jets processes come mostly from final state radiation and thus lack the energy to strongly

influence the shape of the correction factor.

The final step in the reweighting procedure is to perform a functional fit to Ra(m_eff^red) in

order to reduce the effects from statistical limitations in the extreme regions of m_eff^red. All the

different Ra(m_eff^red) use the same functional form template, which is given by the following

sigmoid function:

$$ f(x) = p_1 - \frac{p_2}{1 + \exp\left(p_3 (x - q)\right)} $$    (6.3)

The values of the pi parameters are determined from the ﬁt, while the parameter q is a ﬁxed

parameter that varies for each diﬀerent ﬁt. An uncertainty is assigned to the reweighting

procedure by applying a ±2σ variation on the ﬁt function to take into account the statistical

uncertainty on the bin-by-bin reweighting and the potential shape diﬀerences on the ﬁt

template. The uncertainties for each ﬁt are propagated as nuisance parameters on the binned

likelihood ﬁt which will be discussed in the following sections. The correction factors and

their fits with the corresponding 1, 2, and 3σ bands are shown in Figures 6.9 and 6.10. As

shown in Figure 6.8, the background reweighting significantly improves the MC modeling at

preselection level. Additional plots are included in Appendix D that demonstrate how well

the background reweighting procedure extends to other kinematic observables and selection

regions in the analysis.
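
To make the role of the parameters in the sigmoid template of Equation 6.3 concrete, the short sketch below evaluates it at a few points. The function name `sigmoid_template` and the parameter values are hypothetical and are not fit results from this analysis.

```python
import math


def sigmoid_template(x, p1, p2, p3, q):
    """f(x) = p1 - p2 / (1 + exp(p3 * (x - q))), as in Equation 6.3."""
    return p1 - p2 / (1.0 + math.exp(p3 * (x - q)))


# Hypothetical parameter values, chosen only to illustrate the shape (x in GeV).
P1, P2, P3, Q = 1.0, 0.4, 0.005, 1000.0

at_midpoint = sigmoid_template(Q, P1, P2, P3, Q)        # equals p1 - p2 / 2
far_above = sigmoid_template(Q + 4000.0, P1, P2, P3, Q)  # approaches p1
far_below = sigmoid_template(Q - 4000.0, P1, P2, P3, Q)  # approaches p1 - p2
```

For a positive p3, the template therefore interpolates between the plateau p1 − p2 at low x and the plateau p1 at high x, with the fixed parameter q setting the location of the transition and p3 its steepness.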

Figure 6.9: The V+jets reweighting correction factor in the Z+jets RSR (a) and the tt̄ + Wt
reweighting correction factors in the 3j (b) and 4j (c) RSRs. The black markers correspond
to the bin-by-bin correction factors with their associated statistical uncertainty. The solid
blue line corresponds to the best fit obtained using the sigmoid template. The green, yellow,
and orange bands correspond to the ±1, 2, and 3σ confidence intervals of the fit, respectively.
The bottom panel shows the ratio of the bin-by-bin correction factor to the fit.

Figure 6.10: The tt̄ + Wt reweighting correction factors in the 5j (a), 6j (b), and ≥7j (c) RSRs.
The black markers correspond to the bin-by-bin correction factors with their associated
statistical uncertainty. The solid blue line corresponds to the best fit obtained using the
sigmoid template. The green, yellow, and orange bands correspond to the ±1, 2, and 3σ
confidence intervals of the fit, respectively. The bottom panel shows the ratio of the
bin-by-bin correction factor to the fit.

6.1.6 Systematic Uncertainties

Several sources of systematic uncertainty are considered that can affect either the normalization

or both the normalization and the shape of the meff distribution. Each systematic

uncertainty is considered to be correlated across processes, analysis regions, and bins of

meff, unless explicitly stated otherwise in the following description. Uncertainties from different

sources are considered to be uncorrelated from each other. The sources of systematic

uncertainty can be classiﬁed as either experimental uncertainties or modeling uncertainties.

6.1.6.1 Experimental Uncertainties

The experimental uncertainties are associated with the data taking and object reconstruction

procedures by the ATLAS detector. The leading sources of experimental uncertainties in

this search arise from the jet ﬂavor tagging eﬃciencies and the jet mass resolution.

Luminosity The uncertainty in the combined 2015-2018 integrated luminosity is 1.7%,

which aﬀects the overall normalization of all simulated processes. It is obtained using the

LUCID-2 detector [83] for the primary luminosity measurements.

Lepton Uncertainties These uncertainties are associated with the lepton triggering, selection,

reconstruction, and identification processes. Additionally, data to MC scale factors

are derived to calibrate the efficiencies of these processes in MC to data. The uncertainties

associated with these scale factors are also considered. The overall effect of these uncertainties

results in a normalization uncertainty in signal and background of approximately

1%.

Jet and E_T^miss Uncertainties The uncertainties associated with jets arise from the jet

energy scale (JES) and resolution (JER), the jet mass scale (JMS) and resolution (JMR),

and the eﬃciency of the jet vertex tagger (JVT) [84] requirements that are imposed to reject

jets from pile-up. The JES and JER uncertainties are estimated from a combination of

collision data, test-beam data and simulation. The JES and JER uncertainties are split into

30 and 8 uncorrelated components, respectively, that correspond to diﬀerent physical sources.

The JMS uncertainty is estimated by comparing each nominal sample to two corresponding

alternative event samples in which the mass of each jet is shifted up and down by 10%,

respectively. A similar procedure is applied to estimate the JMR uncertainty by comparing

each nominal sample to an alternative event sample in which the mass of each jet is smeared

by a Gaussian function whose width is shifted by 20% relative to the nominal JMR.

The E_T^miss reconstruction is affected by uncertainties associated with the energy scales and

resolutions of leptons and jets that are propagated to E_T^miss. Additional small uncertainties

associated with the impact on the pT scale and resolution of the unclustered energy from

the underlying event are also taken into account as part of the E_T^miss uncertainties.

Flavor Tagging Uncertainties These uncertainties are associated with the efficiency of tagging

jets originating from b-, c-, and light-quarks and with the data-to-MC scale factors used to calibrate the

b-tagging algorithm efficiency. These uncertainties are broken down into a set of 9 independent

uncertainty sources for b-jets, 5 independent uncertainty sources for c-jets, and 6 independent

uncertainty sources for light-jets. Additionally, an extrapolation uncertainty component is

considered for high pT jets that are outside the kinematic regime of the data sample that

is used to calibrate the b-tagger. This component is taken to be correlated amongst the

diﬀerent jet ﬂavors.

Experimental uncertainty                  Type    Components

Luminosity                                N       1
Electron trigger+reco+ID+isolation        SN      5
Electron energy scale+resolution          SN      2
Muon trigger+reco+ID+isolation            SN      12
Muon momentum scale+resolution            SN      5

Jet vertex tagger                         SN      1
Jet energy scale                          SN      30
Jet energy resolution                     SN      8
Jet mass scale                            SN      1
Jet mass resolution                       SN      1
$E_T^{miss}$ scale and resolution         SN      3
$E_T^{miss}$ trigger efficiency           N       1

b-tagging efficiency                      SN      9
c-tagging efficiency                      SN      5
Light-jet tagging efficiency              SN      6
b-tagging extrapolation                   SN      2

Table 6.6: List of experimental systematic uncertainties considered. An “N”(“S”) means that
the uncertainty is taken as normalization-only (shape-only) for all processes and channels
aﬀected, whereas “SN” means that the uncertainty is taken on both shape and normalization.
Some of the systematic uncertainties are split into several components for a more accurate
treatment.

Table 6.6 summarizes the sources of experimental uncertainties, including whether they

aﬀect only the normalization (N), or both the normalization and the shape (SN) of the meﬀ

distribution, as well as the number of uncorrelated components.

6.1.6.2 Modeling Uncertainties

The modeling uncertainties are associated with the modeling of the MC simulations of SM

background processes. The sources of uncertainties considered are normalization uncertain-

ties related to the cross section of the diﬀerent processes and the uncertainties related to

the modeling parameters used to obtain the simulation samples. For small background pro-

cesses in the analysis, only the cross section uncertainties are considered. For the dominant

background source in the analysis, modeling uncertainties are considered in addition to the

cross section uncertainties. The modeling parameter uncertainties are estimated by compar-

ing the nominal simulation sample with specialized alternative samples that are obtained by

systematically varying these modeling parameters. These alternative samples are described

in Appendix A. Finally, the uncertainties associated with the background reweighting proce-

dure described in subsection 6.1.5 are also included as modeling uncertainties. The leading

sources of modeling uncertainties in this search arise from the modeling of t¯t and single-top

W t-channel backgrounds.

Cross Section Uncertainties An uncertainty of +5.5/−6.1% is assigned to the inclusive

t¯t production cross section [85], which includes contributions from varying the factorization

and renormalization scales, the PDF, the αS parameter, and the value of the top quark mass.

Additionally, a normalization uncertainty of 50% is assigned to the t¯t+≥ 1b and t¯t+≥ 1c

processes individually. These uncertainties are motivated by the level of agreement between

data and MC simulation in dedicated measurements of the cross section of the t¯t+≥ 1b

process [86].

For single-top processes a ±5% uncertainty on the total cross section is estimated as

a weighted average of the theoretical uncertainties on the t-, s-, and Wt-channel productions [87, 88, 89].

For the V +jets backgrounds a ±30% normalization uncertainty is applied and kept cor-

related between W +jets and Z+jets processes but uncorrelated between diﬀerent b-tag mul-

tiplicities (1,2,3,≥4). This is based on variations of the factorization and renormalization

scales and the Sherpa [90] MC generator matching parameters [91] that are used to generate

samples of these processes.

For the t¯tW/Z and t¯tH processes, since their contribution in the analysis ﬁt regions

is small, only cross section uncertainties of these processes are considered. The assigned

uncertainty is ±15% for t¯tW/Z, which is decorrelated between the LJ and HJ regions, and

+9/−12% for t¯tH, which is kept correlated.

For diboson processes a 5% uncertainty on the inclusive cross section calculated at

NLO [92] is included. An additional uncorrelated 24% uncertainty on the production cross

section is considered for each additional jet in the event that is based on a comparison

amongst diﬀerent algorithms for merging LO matrix elements and parton showers [93]. This

uncertainty is computed based on the average jet multiplicity in each ﬁt region which is ap-

proximately 3 in the LJ regions and 6 in the HJ regions. A ±30% normalization uncertainty

on the production of additional heavy ﬂavor jets is also considered and only applied in the ﬁt

regions with ≥ 3b-jets. All of these uncertainties are added in quadrature and decorrelated

between the LJ and HJ regions, as well as between low b-tag and high b-tag multiplicity

regions. The total normalization uncertainty on diboson processes in the

LJ regions is 24% and 38% in the low b-tag and high b-tag multiplicity regions, respectively,

whereas in the HJ regions it is 48% and 56% in the low b-tag and high b-tag multiplicity regions, respectively.
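
The quadrature sum behind these totals can be sketched as below. The number of additional jets per region is not stated explicitly in the text; the sketch assumes it is the average jet multiplicity minus two, giving 1 in the LJ regions and 4 in the HJ regions. The function name and this counting are assumptions for illustration only. Under them the totals come out within about one percentage point of the quoted values; the exact inputs and rounding used in the analysis may differ.

```python
import math


def diboson_norm_uncertainty(n_extra_jets, high_btag,
                             xsec=0.05, per_jet=0.24, heavy_flavor=0.30):
    """Quadrature sum of the diboson normalization components for one fit region."""
    terms = [xsec] + [per_jet] * n_extra_jets  # one uncorrelated 24% term per additional jet
    if high_btag:
        terms.append(heavy_flavor)  # heavy-flavor term, only in the >=3 b-jet regions
    return math.sqrt(sum(t * t for t in terms))


# Assumed numbers of additional jets: 1 in the LJ regions, 4 in the HJ regions.
for label, n_extra in (("LJ", 1), ("HJ", 4)):
    for high_btag in (False, True):
        total = diboson_norm_uncertainty(n_extra, high_btag)
        print(f"{label} {'high' if high_btag else 'low'} b-tag: {100 * total:.1f}%")
```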

Sample Modeling Uncertainties A number of sources of systematic uncertainties that

affect the modeling of t¯t+jets and single-top production processes are considered. The uncertainties

associated with the choice of the NLO generator, the modeling of the parton showering

and hadronization processes, and the modeling of the initial-state radiation (ISR)

and final-state radiation (FSR) are estimated by comparing the nominal simulation samples

to alternative samples as outlined in Appendix A. These uncertainties are all treated as

uncorrelated between the t¯t+light-jets, t¯t+≥1c, t¯t+≥1b, and single-top samples but correlated

across the single-top s-, t-, and Wt-channels. Furthermore, these uncertainties

are treated as uncorrelated amongst the LJ and HJ analysis regions and regions with 0,

1, or ≥2 tagged boosted objects. An additional systematic uncertainty on the W t-channel

production concerning the separation between t¯t and W t at NLO is assessed by comparing

the nominal sample that utilizes the diagram-subtraction scheme to an alternative sample

using the diagram-removal scheme.

An additional set of modeling uncertainties on the V +jets background is also consid-

ered. These uncertainties are estimated from variations in the internal renormalization and

factorization scale parameters in the Sherpa MC generator.

Background Reweighting Uncertainties As discussed in subsection 6.1.5, an uncertainty

is assigned to the kinematic reweighting of the t¯t+Wt and V+jets processes by

varying the functional fit of the bin-by-bin reweighting by ±2σ in order to account for the

statistical limitations in the derivation procedure and the choice of the functional template.

Additionally, the reweighting procedure is applied to the alternative t¯t and single-top Wt-channel

samples. This is required because the alternative samples share mismodelings similar

to those affecting the nominal samples, and in order to maintain kinematic

consistency between the nominal and alternative samples when estimating the modeling uncertainties.

The reweighting correction factors that are applied to the alternative samples

are derived from the nominal correction factor $R_{a}^{\mathrm{nom}}(x)$ as:

$$R_{a}^{\mathrm{alt}}(x) = \frac{\mathrm{MC}_{a}^{\mathrm{nom}}(x)}{\mathrm{MC}_{a}^{\mathrm{alt}}(x)}\, R_{a}^{\mathrm{nom}}(x) \tag{6.4}$$

where $\mathrm{MC}_{a}^{\mathrm{nom}}(x)$ and $\mathrm{MC}_{a}^{\mathrm{alt}}(x)$ are the distributions of the kinematic variable x in the

nominal MC simulation and alternative MC simulation of the background process to be

reweighted, respectively.
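
A bin-by-bin reading of Equation 6.4 can be sketched as follows. The function name, the example yields, and the fallback used for empty bins of the alternative sample are assumptions introduced for illustration; they are not taken from the analysis.

```python
def alternative_reweighting_factors(mc_nom, mc_alt, r_nom):
    """Per-bin R_alt(x) = MC_nom(x) / MC_alt(x) * R_nom(x), following Eq. 6.4.

    Assumption: bins where the alternative sample is empty fall back to R_nom.
    """
    factors = []
    for nom, alt, r in zip(mc_nom, mc_alt, r_nom):
        factors.append(r * nom / alt if alt > 0 else r)
    return factors


# Hypothetical binned yields in the reweighting variable x.
mc_nom = [100.0, 50.0, 10.0]
mc_alt = [80.0, 50.0, 12.5]
r_nom = [0.90, 1.10, 1.20]

r_alt = alternative_reweighting_factors(mc_nom, mc_alt, r_nom)
reweighted_nom = [n * r for n, r in zip(mc_nom, r_nom)]
reweighted_alt = [a * r for a, r in zip(mc_alt, r_alt)]
```

By construction, the reweighted alternative distribution in x coincides with the reweighted nominal one, which is one way to read the kinematic consistency requirement stated above.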

Table 6.7 summarizes the sources of modeling uncertainties, including whether they af-

fect only the normalization (N), or both the normalization and the shape (SN) of the meﬀ

distribution, as well as the number of uncorrelated components.

Modeling uncertainty                      Type    Components

t¯t cross section                         N       1
t¯t+≥1b, t¯t+≥1c normalizations           N       2
t¯t+light parton shower+hadronization     SN      5
t¯t+light NLO generator                   SN      5
t¯t+light radiation                       SN      20
t¯t+≥1c parton shower+hadronization       SN      5
t¯t+≥1c NLO generator                     SN      5
t¯t+≥1c radiation                         SN      20
t¯t+≥1b parton shower+hadronization       SN      5
t¯t+≥1b NLO generator                     SN      5
t¯t+≥1b radiation                         SN      20

Single-top cross section                  N       1
Single-top parton shower+hadronization    SN      5
Single-top NLO generator                  SN      5
Single-top radiation                      SN      20
Single-top DR/DS                          SN      1

V+jets normalization                      N       4
W+jets modeling                           S       1
Z+jets modeling                           S       1
Diboson normalization                     N       8
t¯tV normalization                        N       2
t¯tH cross section                        N       1

V+jets reweighting                        SN      1
t¯t+Wt reweighting                        SN      5

Table 6.7: List of modeling systematic uncertainties considered. An “N”(“S”) means that
the uncertainty is taken as normalization-only (shape-only) for all processes and channels
aﬀected, whereas “SN” means that the uncertainty is taken on both shape and normalization.
Some of the systematic uncertainties are split into several components for a more accurate
treatment.

6.1.7 Statistical Analysis

6.1.7.1 Maximum Likelihood Function

For each signal benchmark scenario and mass hypothesis considered in this analysis, the meﬀ

distributions across all the analysis search regions are jointly analyzed to test for the presence

of the predicted signal. To perform the statistical analysis a likelihood function L(µ, θ) is

constructed as the product of Poisson probability terms over all the meﬀ bins considered in

the analysis. This likelihood function depends on the signal-strength parameter µ, which

enters as a multiplicative factor of the predicted production cross section for signal, and θ,

which is a set of nuisance parameters that encode the eﬀect of systematic uncertainties in

the signal and background expectations. Therefore, the expected total number of events in

a given bin depends on µ and θ. The likelihood function can be expressed mathematically

as:

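$$L(\mu, \theta) = \prod_{i}^{\mathrm{bins}} \prod_{j}^{\mathrm{regions}} \frac{\left(\mu S_{ij}(\theta) + B_{ij}(\theta)\right)^{n_{ij}}}{n_{ij}!}\, e^{-\left(\mu S_{ij}(\theta) + B_{ij}(\theta)\right)} \times \prod_{\theta_k \in \theta} \frac{1}{\sqrt{2\pi}\,\sigma_k}\, e^{-\frac{1}{2}\left(\frac{\theta_k - \theta_k^{0}}{\sigma_k}\right)^{2}} \tag{6.5}$$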

The terms in the ﬁrst two products are the Poisson terms that quantify the probability

of observing nij data events in the meﬀ bin i of a given analysis ﬁt region j subject to

µSij(θ) + Bij(θ) expected events, where Sij(θ) and Bij(θ) are the predicted number of

signal and background events in that bin, respectively. The signal is normalized to the

signal production cross section times the decay branching ratio of a given signal benchmark

times the integrated luminosity. The terms in the third product are Gaussian prior terms

that parametrize the eﬀect of a given systematic uncertainty with a corresponding nuisance

parameter θk, where k enumerates the systematic uncertainties that are considered in the

analysis. The values that θk takes represent variations from the nominal value of the nuisance

parameter $\theta_k^{0}$, which is usually taken as 0, measured in units of one standard deviation σk.

The value of θk will be determined by obtaining the maximum likelihood estimator (MLE)

of the nuisance parameter. As a result of this, the Gaussian prior acts as a penalty function

by reducing the likelihood if the fitted value of θk is moved away, or pulled, from its nominal

value $\theta_k^{0}$.

For a given value of µ, the variations in the nuisance parameters θ allow the expectations

for signal and background to change according to the corresponding systematic uncertain-

ties. The ﬁtted values of θ that are obtained correspond to deviations from the nominal

expectations that globally provide the best ﬁt to the data. This procedure reduces the im-

pact of the systematic uncertainties on the search sensitivity and improves the background

prediction by taking advantage of the highly populated background-dominated regions that

are included in the likelihood ﬁt.
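
The structure of Equation 6.5 can be sketched as a negative log-likelihood, which is the quantity that a fit would minimize. This is an illustrative sketch under stated assumptions: nuisance parameters are expressed in units of σk with nominal value 0, each one scales the background linearly through hypothetical per-bin fractional impacts, the signal is left unaffected by systematics, and all yields are invented for the example.

```python
import math


def negative_log_likelihood(mu, thetas, data, signal, background, impacts):
    """-ln L(mu, theta) for Eq. 6.5 with theta_k^0 = 0 and sigma_k = 1.

    data, signal, background: per-bin lists (all regions flattened together).
    impacts[k][i]: assumed fractional change of the background in bin i
    for a +1 sigma shift of nuisance parameter k.
    """
    nll = 0.0
    for i, n in enumerate(data):
        scale = math.prod(1.0 + imp[i] * th for imp, th in zip(impacts, thetas))
        expected = mu * signal[i] + background[i] * scale
        # Poisson term: ln P(n | expected) = n ln(expected) - expected - ln(n!)
        nll -= n * math.log(expected) - expected - math.lgamma(n + 1)
    for th in thetas:
        # Unit-width Gaussian constraint term for each nuisance parameter
        nll += 0.5 * th * th + 0.5 * math.log(2.0 * math.pi)
    return nll


data = [12, 3]             # hypothetical observed counts
signal = [1.0, 2.0]        # hypothetical signal yields for mu = 1
background = [11.0, 2.5]   # hypothetical nominal background yields
impacts = [[0.10, 0.05]]   # one nuisance parameter: +10% and +5% per sigma

nominal = negative_log_likelihood(0.0, [0.0], data, signal, background, impacts)
pulled = negative_log_likelihood(0.0, [1.0], data, signal, background, impacts)
```

Comparing `nominal` and `pulled` shows the trade-off described above: a pull can improve the Poisson terms, but it always costs θ²/2 in the constraint term.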

6.1.7.2 Hypothesis Testing

The statistical analysis is designed to test the agreement of the observed data with two

hypotheses: the background-only hypothesis, µ = 0, where only the physics of the Standard

Model is assumed, and the signal-plus-background hypothesis, µ > 0, where the new physics

from beyond the Standard Model is also assumed. This is achieved by deﬁning a test statistic,

which will depend on the likelihood function, that quantiﬁes the agreement of the observed

data with a given hypothesis. In order to deﬁne the test statistic, the likelihood function

will be ﬁt to the data. The ﬁts are done in two ways: a conditional ﬁt where the value of µ

is speciﬁed, and an unconditional ﬁt where µ is not speciﬁed but obtained from its MLE. A

desirable property of the test statistic is that it is a function that maps observed data to a

numerical value that monotonically orders the observations based on how extreme they are

under the assumption of a given hypothesis. This will allow us to calculate the probability of

making an observation that is at least as incompatible with the hypothesis as the observed data and set limits

on the signal-strength parameter based on whether the signal-plus-background hypothesis is

rejected for a given value of µ.

6.1.7.3 Proﬁle Likelihood Ratio Test Statistic

The test statistic that is employed for testing the signal-plus-background hypothesis for a

given value of µ > 0 is the proﬁle likelihood ratio qµ:

$$q_{\mu} = -2 \ln\left( \frac{L(\mu, \hat{\hat{\theta}}_{\mu})}{L(\hat{\mu}, \hat{\theta})} \right) \tag{6.6}$$

where $\hat{\mu}$ and $\hat{\theta}$ are the values of the parameters µ and θ that simultaneously maximize the

likelihood function L(µ, θ) subject to the constraint $0 \le \hat{\mu} \le \mu$, which are obtained through

the unconditional fit. The values $\hat{\hat{\theta}}_{\mu}$ are the values of the nuisance parameters that maximize

the likelihood function for a given value of µ, which are obtained through the conditional fit.

Effectively, this statistic chooses the value of the signal-strength parameter that best matches

the observed data, $\hat{\mu}$, and compares it with the value set by the analyzer. If the data agrees

well with the specified value of µ then $L(\mu, \hat{\hat{\theta}}_{\mu})/L(\hat{\mu}, \hat{\theta})$ tends to 1, and consequently, qµ

tends to 0. On the other hand, if the data strongly disagrees with the specified value of µ,

then $L(\mu, \hat{\hat{\theta}}_{\mu})/L(\hat{\mu}, \hat{\theta})$ tends to 0, and consequently, qµ tends to larger positive values. To

test for a discovery, which amounts to testing the compatibility of the observed data with

the background-only hypothesis, a similar test statistic is used as in Equation 6.6 by setting

µ = 0 and leaving ˆµ unconstrained when performing the unconditional ﬁt:

$$q_{0} = -2 \ln\left( \frac{L(0, \hat{\hat{\theta}}_{0})}{L(\hat{\mu}, \hat{\theta})} \right) \tag{6.7}$$

Both test statistics behave similarly in that they attain values near zero when the observed

data agrees well with the hypothesized value of µ and tend to large positive values when

in disagreement. The p-value of the hypothesis test, which is the probability of making an

observation that is at least as incompatible with the hypothesis as the observed data, is given by:

$$p_{H} = \int_{q_{\mu,\mathrm{obs}}}^{\infty} f(q_{\mu} \mid H)\, dq_{\mu} \tag{6.8}$$

where qµ,obs is the value of the test statistic that is obtained from the observed data and

f (qµ | H) is the probability density function of the test statistic qµ under the assumption of

a hypothesis H. Speciﬁcally, the p-value of the background-only hypothesis, pb, is obtained

by integrating the probability density function of q0 that one would obtain by assuming the

background-only hypothesis. On the other hand, the p-value of the signal-plus-background

hypothesis for a speciﬁc value of µ > 0, ps+b, is obtained by integrating the probability

density function of qµ that one would obtain assuming the signal-plus-background hypothesis,

where the number of signal events is scaled by µ.

The probability distributions of the test statistics q0 and qµ are often unknown and

thus require estimation. This can be done through MC pseudo-experiments in which toy

datasets are sampled from the Poisson distributions in Equation 6.5. The sampling is done

individually for the background-only hypothesis and the signal-plus-background hypothesis

under a speciﬁed value of µ. These toy datasets are then used to evaluate the test statistic of

the assumed hypothesis from which they were sampled. A distribution of values of the test

statistic is obtained from multiple toy datasets, which is then used to calculate the p-values.
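
The toy-based procedure can be sketched for the simplest possible case: a single-bin counting experiment with a known background and no nuisance parameters, so that the profiling step in Equation 6.6 is trivial. The function names, the yields, and the basic Poisson sampler are assumptions chosen to keep the example self-contained; the analysis itself uses the full likelihood of Equation 6.5.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson count (Knuth's method, adequate for small means)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def q_mu(n, mu, s, b):
    """-2 ln [L(mu) / L(mu_hat)] for one bin, with 0 <= mu_hat <= mu."""
    mu_hat = min(max((n - b) / s, 0.0), mu)

    def log_l(m):
        nu = m * s + b
        return n * math.log(nu) - nu  # the ln(n!) term cancels in the ratio

    return max(0.0, -2.0 * (log_l(mu) - log_l(mu_hat)))


def toy_p_value(n_obs, mu, s, b, n_toys=20000, seed=1):
    """Fraction of signal-plus-background toys with q_mu at least as large as observed."""
    rng = random.Random(seed)
    q_obs = q_mu(n_obs, mu, s, b)
    n_extreme = 0
    for _ in range(n_toys):
        n_toy = poisson_sample(rng, mu * s + b)
        if q_mu(n_toy, mu, s, b) >= q_obs:
            n_extreme += 1
    return n_extreme / n_toys


# Hypothetical yields: 10 events observed, s = 5 and b = 10 expected, tested at mu = 1.
p_sb = toy_p_value(n_obs=10, mu=1.0, s=5.0, b=10.0)
```

In this simplified setting the toys are drawn under the same hypothesis that is being tested, mirroring the description above; sampling under the background-only hypothesis would follow the same pattern with the mean set to b.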

However, an asymptotic approximation of the expected distributions of the test statistic

probability density functions [94, 95] can also be used to do these computations since the

sampling of a large number of toy datasets is often needed in order to obtain reliable test

statistic distributions for each value of µ, which can be a time-consuming process. Under this

approximation, the values of the test statistics follow Gaussian distributions with diﬀerent

probability density functions, where qµ tends to lower values when in agreement with the

signal-plus-background hypothesis and q0 tends to higher values when in agreement with the

background-only hypothesis. As a consequence of this, the logic of the p-value pb is ﬂipped,

so that the probability of making an observation that is at least as incompatible as q0,obs is

obtained by integrating the probability distribution below this value.

6.1.7.4 The CLs Method

In traditional statistics, when performing a hypothesis test, a hypothesis is rejected if its

p-value is below a certain threshold. In the ﬁeld of particle physics, it is standard practice to

set the rejection threshold to 0.05 for the signal-plus-background hypothesis in an analysis

that searches for potential new particles. However, instead of applying this threshold to

ps+b, it is applied to the quantity CLs [96, 97], which is deﬁned as

$$\mathrm{CL}_{s} = \frac{p_{s+b}}{1 - p_{b}} \tag{6.9}$$

where pb is the p-value of the background-only hypothesis. In the majority of searches for new

particles, the predicted number of signal events is very low compared to the expected number

of background events (Sij ≪ Bij), to the point where the physics effects of the signal-plus-

background hypothesis can be approximated by the background-only hypothesis. For some

values of µ, the expected number of events from the signal-plus-background hypothesis Pois-

son distributions in Equation 6.5 can eﬀectively behave as the background-only hypothesis

distributions since µSij +Bij ≈ Bij. This can lead to probability density functions of the test

statistics that can signiﬁcantly overlap when estimated from toy datasets, as shown in Fig-

ure 6.11. A problematic situation that can arise when excluding the signal-plus-background

hypothesis based on ps+b alone is when there is an observed deﬁcit of data compared to the

expected number of events predicted from the background-only hypothesis. The test statistic

tends towards negative values when in agreement with the signal-plus-background hypoth-

esis and towards positive values when in agreement with the background-only hypothesis.

Thus, an observed deﬁcit will result in a positive test statistic that is distributed towards

the right-hand tail of f (q | s + b), which will result in a very small value of ps+b. However,

even if qobs agrees more with the background-only hypothesis, a deﬁcit of observed events

also has poor compatibility with the background-only hypothesis. If the rejection thresh-

old were to be applied to ps+b, then this could lead to the rejection of the hypothesis that

favors new physics based on a statistical test that has negligible sensitivity to the signal-plus-

background model. The main motivation for using CLs to reject the signal-plus-background

hypothesis is that dividing by 1 − pb inflates CLs relative to ps+b, most strongly when there is an

observed deficit of data compared to the background-only model, thereby providing a more

conservative rejection criterion for the signal-plus-background hypothesis.
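
The effect of the 1 − pb denominator can be illustrated numerically. The sketch below follows the convention of Equation 6.9 and Figure 6.11, in which pb is obtained by integrating below the observed value, so a data deficit pushes pb towards 1. The function names and the p-values are hypothetical.

```python
def cls(p_sb, p_b):
    """CLs = p_{s+b} / (1 - p_b), as in Eq. 6.9."""
    return p_sb / (1.0 - p_b)


def is_excluded(p_sb, p_b, threshold=0.05):
    """Apply the rejection threshold to CLs rather than to p_{s+b}."""
    return cls(p_sb, p_b) < threshold


# Hypothetical p-values: the same p_{s+b}, without and with a large data deficit.
print(is_excluded(0.03, 0.10))  # modest p_b: CLs is about 0.033
print(is_excluded(0.03, 0.90))  # deficit, p_b close to 1: CLs is about 0.30
```

In both cases ps+b alone would fall below 0.05, but only the first case is excluded once the threshold is applied to CLs.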

Figure 6.11: Example of the probability distribution of a hypothetical test statistic q that is
used for testing a signal-plus-background hypothesis against a background-only hypothesis.
The p-value for rejecting the background-only hypothesis, pb, is obtained by integrating the
probability distribution of the test statistic under the background-only hypothesis, f (q | b),
below the observed value of the statistic, qobs, which is depicted as the yellow region. The
p-value for rejecting the signal-plus-background hypothesis is obtained by integrating the
probability distribution under the signal-plus-background hypothesis, f (q | s + b), above the
observed value, which is depicted as the green region. This ﬁgure is taken from [98].

6.1.7.5 Limit Calculation

Upper limits on the signal production cross section are computed using qµ in the CLs method,

where the production cross section is parametrized by µ. Two sets of upper limits are computed:

the expected limit and the observed limit. Since f (q0 | b) is approximated by a Gaussian

probability density function, the expected limit is defined at pb = 0.5, which in turn defines

a value for qµ, denoted as qµ,exp. Next, ps+b is determined by setting the lower bound

of the integral to qµ,exp, which will allow us to determine the value of CLs. If CLs is different

from 0.05, then µ is varied until CLs is equal to the threshold value of 0.05. The value of µ

that achieves CLs = 0.05 is the expected upper limit. The observed limit is obtained in the

same way but using the observed value of the test statistic obtained from the data instead of

the expected value. This process is repeated for all mass points for a given signal benchmark

scenario, with the end result being an upper limit band that depends on the mass of the

vector-like top quark. The ±1σ and ±2σ contour bands of the expected limit are obtained

by repeating these steps using the values of qµ,exp that correspond to the given standard

deviation of f (q0 | b).
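
The scan over µ can be sketched for a single-bin counting experiment. This is a simplified stand-in for the procedure above, not the analysis implementation: the observed event count is used directly as the test statistic instead of qµ, there are no systematic uncertainties, and the function names and yields are hypothetical. In this setting ps+b is P(n ≤ n_obs | µs + b) and 1 − pb is P(n ≤ n_obs | b).

```python
import math


def poisson_cdf(n, lam):
    """P(N <= n) for a Poisson distribution with mean lam > 0."""
    return sum(math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
               for k in range(n + 1))


def cls_counting(mu, n_obs, s, b):
    """CLs for a single-bin counting experiment with signal s and background b."""
    return poisson_cdf(n_obs, mu * s + b) / poisson_cdf(n_obs, b)


def upper_limit(n_obs, s, b, threshold=0.05, tol=1e-6):
    """Vary mu by bisection until CLs(mu) equals the threshold."""
    lo, hi = 0.0, 1.0
    while cls_counting(hi, n_obs, s, b) > threshold:  # expand until the limit is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls_counting(mid, n_obs, s, b) > threshold:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs: 3 events observed, s = 2 signal events at mu = 1, b = 3 background events.
mu_up = upper_limit(n_obs=3, s=2.0, b=3.0)
```

In this simplified picture, an expected limit would be obtained by replacing `n_obs` with the median outcome under the background-only hypothesis, which plays the role of the pb = 0.5 condition described above.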

6.1.8 Results

6.1.8.1 Maximum Likelihood Fits to Data

A binned likelihood ﬁt as described in subsection 6.1.7 is performed under the background-

only hypothesis on the meﬀ distributions across all analysis ﬁt regions. A comparison between

the overall observed and expected yields in each ﬁt region before and after the ﬁt to data is

shown in Figure 6.12. As can be observed in the bottom panels of the plots, the combined

impact of the systematic uncertainties considered in the analysis has been constrained as a

result of the ﬁt, using information from the large number of events in the signal-depleted

regions with diﬀerent background contributions. Consequently, an improved background

prediction is obtained with reduced uncertainty across all regions, including those with a

signiﬁcant fraction of expected signal events. This is summarized in Tables 6.8 and 6.9, which

contain the number of observed data events, and the pre-ﬁt and post-ﬁt background yields in

the four most sensitive analysis ﬁt regions respectively. Furthermore, the pre-ﬁt and post-ﬁt

distributions of meﬀ in these four regions are shown in Figures 6.13 and 6.14 to highlight

the overall good post-ﬁt agreement between data and the MC background prediction.

The improved background prediction is veriﬁed by checking the agreement between data

and the post-ﬁt background in the analysis validation regions, which are not included in

the fit and are designed to be orthogonal to the analysis search regions that are used

in the fit. The pre-fit and post-fit comparison of the observed and expected yields in all

validation regions is shown in Figure 6.15. Overall, the fit results in a reduced impact

of the systematic uncertainties and an improved background prediction that agrees with

data within uncertainties. Furthermore, the pre-ﬁt and post-ﬁt meﬀ distributions in the

corresponding validation regions of the four most sensitive analysis ﬁt regions are shown in

Figure 6.16 and Figure 6.17. The general post-ﬁt improvement in the estimated background

in the analysis validation regions gives conﬁdence in the background estimation procedure.

                                    LJ, 2b, ≥1fj,       LJ, ≥4b, ≥1fj,      HJ, 2b, ≥1fj,       HJ, ≥4b, ≥1fj,
                                    0th, ≥1tl, 0H,      0th, ≥1tl, ≥1H,     ≥2(th+tl), 0H,      ≥1H,
                                    ≥1V                 0V                  ≥1V                 ≥2(V+tl+th)

T singlet (mT = 1.6 TeV, κ = 0.5)   31.8 ± 4.9          7.2 ± 3.5           1.3 ± 0.4           1.0 ± 0.5
T doublet (mT = 1.6 TeV, κ = 0.5)   21.8 ± 2.4          8.5 ± 5.6           7.3 ± 2.1           7.1 ± 4.5

t¯t+light-jets                      1170 ± 210          1.6 ± 2             39.1 ± 9.5          0.49 ± 0.29
t¯t+≥1c                             143 ± 80            1.5 ± 1.3           15.3 ± 9.9          0.86 ± 0.58
t¯t+≥1b                             57 ± 32             4.8 ± 3.9           6.1 ± 4.3           2.6 ± 2
Single-top                          250 ± 50            0.66 ± 0.87         7.3 ± 7.5           <0.001
t¯tW/Z                              13.2 ± 3.1          0.33 ± 0.19         2.5 ± 1.1           0.22 ± 0.82
t¯tH                                1.5 ± 0.2           0.51 ± 0.15         0.34 ± 0.14         0.42 ± 0.12
W+jets                              25.7 ± 9.4          0.70 ± 1.3          1.2 ± 1.1           0.24 ± 0.15
Z+jets                              4.4 ± 1.7           <0.001              0.25 ± 0.10         0.007 ± 0.007
Dibosons                            3.8 ± 1.4           0.02 ± 0.03         0.21 ± 0.15         <0.001
Multijet                            12.9 ± 7.3          0.025 ± 0.017       0.61 ± 0.46         0.16 ± 0.14
Rare backgrounds                    2.0 ± 0.3           0.03 ± 0.04         0.25 ± 0.14         0.33 ± 0.06

Total background                    1690 ± 280          10.2 ± 4.8          73 ± 20             5.4 ± 2.5

Data                                1519                10                  64                  7

Table 6.8: Predicted and observed yields in four of the most sensitive search regions (de-
pending on the signal scenario) considered. The “rare backgrounds” category includes the
V H, tZ and t¯tt¯t backgrounds. The background prediction is shown before the ﬁt to data.
Also shown are the signal predictions for diﬀerent benchmark scenarios considered. The
individual systematic uncertainties for the diﬀerent background processes can be correlated,
and do not necessarily add in quadrature to equal the systematic uncertainty in the total
background yield. The quoted uncertainties are the sum in quadrature of statistical and
systematic uncertainties in the yields.

                                    LJ, 2b, ≥1fj,       LJ, ≥4b, ≥1fj,      HJ, 2b, ≥1fj,       HJ, ≥4b, ≥1fj,
                                    0th, ≥1tl, 0H,      0th, ≥1tl, ≥1H,     ≥2(th+tl), 0H,      ≥1H,
                                    ≥1V                 0V                  ≥1V                 ≥2(V+tl+th)

t¯t+light-jets                      1033 ± 72           0.6 ± 0.8           33.6 ± 4.5          0.57 ± 0.24
t¯t+≥1c                             144 ± 54            1.5 ± 1.0           15.6 ± 5.5          0.82 ± 0.32
t¯t+≥1b                             75 ± 22             8 ± 3               8.2 ± 2.3           3.8 ± 1.1
Single-top                          223 ± 55            0.09 ± 0.55         2.3 ± 4.5           <0.001
t¯tW/Z                              12.1 ± 2.3          0.36 ± 0.18         2.3 ± 0.8           0.62 ± 0.76
t¯tH                                1.46 ± 0.21         0.51 ± 0.11         0.29 ± 0.08         0.40 ± 0.09
W+jets                              26.6 ± 7.1          0.6 ± 1.0           0.8 ± 0.5           0.22 ± 0.13
Z+jets                              4.5 ± 1.2           <0.001              0.27 ± 0.08         0.005 ± 0.006
Dibosons                            3.4 ± 1.2           0.017 ± 0.029       0.17 ± 0.13         <0.001
Multijet                            9.5 ± 5.7           0.018 ± 0.015       0.45 ± 0.41         0.12 ± 0.12
Rare backgrounds                    2.0 ± 0.2           0.03 ± 0.03         0.22 ± 0.08         0.31 ± 0.05

Total background                    1534 ± 56           12.1 ± 3.5          64 ± 8              6.8 ± 1.5

Data                                1519                10                  64                  7

Table 6.9: Predicted and observed yields in four of the most sensitive search regions (de-
pending on the signal scenario) considered. The “rare backgrounds” category includes the
V H, tZ and t¯tt¯t backgrounds. The background prediction is shown after the ﬁt to data
under the background-only hypothesis. The individual systematic uncertainties for the dif-
ferent background processes can be correlated, and do not necessarily add in quadrature to
equal the systematic uncertainty in the total background yield. The quoted uncertainties
are computed after taking into account correlations among nuisance parameters and among
processes. The statistical uncertainty is added in quadrature to the systematic uncertainties.

Figure 6.12: Comparison between the data and background prediction for the yields in
each of the ﬁt regions considered (top) pre-ﬁt and (bottom) post-ﬁt, performed under the
background-only hypothesis. The “others” background includes the t¯t V /H, V H, tZ, t¯tt¯t,
diboson, and multijet backgrounds. The expected T singlet signal (solid red) for mT =
1.6 TeV and κ = 0.5 is included in the pre-ﬁt ﬁgure. The bottom panels display the ratios
of data to the total background prediction. These ﬁgures are taken from [79].

[Figure 6.12 (plots): events per fit region on a logarithmic scale, pre-fit (top) and post-fit (bottom), each with a Data / Bkg. ratio panel. ATLAS, √s = 13 TeV, 139 fb−1.]

Figure 6.13: Comparison between the data and prediction for the meﬀ distribution under the
background-only hypothesis, in the (LJ, 2b, ≥1fj, 0th, ≥1tl, 0H, ≥1V) region (a) pre-ﬁt and
(b) post-ﬁt, and the (LJ, ≥4b, ≥1fj, 0th, ≥1tl, ≥1H, 0V) region (c) pre-ﬁt and (d) post-ﬁt.
The expected T singlet signal (solid red) for mT = 1.6 TeV and κ = 0.5 is included in the
pre-ﬁt ﬁgures. The “others” background includes the t¯t V /H, V H, tZ, t¯tt¯t, diboson, and
multijet backgrounds. The bottom panels display the ratios of data to the total background
prediction. The hashed area represents the total uncertainty on the background. The last
bin in each distribution contains the overﬂow. These ﬁgures are taken from [79].

[Figure 6.13 (plots), panels (a)-(d): events versus meff [GeV], each with a Data / Bkg. ratio panel, for the (LJ, 2b, ≥1fj, 0th, ≥1tl, 0H, ≥1V) and (LJ, ≥4b, ≥1fj, 0th, ≥1tl, ≥1H, 0V) regions, pre-fit and post-fit. ATLAS, √s = 13 TeV, 139 fb−1.]

Figure 6.14: Comparison between the data and prediction for the meﬀ distribution under the
background-only hypothesis, in the (HJ, 2b, ≥1fj, ≥2(th+tl), 0H, ≥1V) region (a) pre-ﬁt
and (b) post-ﬁt, and the (HJ, ≥4b, ≥1fj, ≥1H, ≥2(V+th+tl)) region (c) pre-ﬁt and (d)
post-ﬁt. The expected T doublet signal (solid purple) for mT = 1.6 TeV and κ = 0.5 is
included in the pre-ﬁt ﬁgures. The “others” background includes the t¯t V /H, V H, tZ, t¯tt¯t,
diboson, and multijet backgrounds. The bottom panels display the ratios of data to the total
background prediction. The hashed area represents the total uncertainty on the background.
The last bin in each distribution contains the overﬂow. These ﬁgures are taken from [79].

Figure 6.15: Comparison between the data and background prediction for the yields in each
of the VRs considered (top) pre-ﬁt and (bottom) post-ﬁt, performed under the background-
only hypothesis. The “others” background includes the t¯t V /H, V H, tZ, t¯tt¯t, diboson, and
multijet backgrounds. The expected T singlet signal (solid red) for mT = 1.6 TeV and
κ = 0.5 is included in the pre-ﬁt ﬁgure. The bottom panels display the ratios of data to the
total background prediction. These ﬁgures are taken from [79].

(a)

(b)

(c)

(d)

Figure 6.16: Comparison between the data and prediction for the meﬀ distribution under
the background-only hypothesis, in the (LJ, ≥3b, ≥1fj, 0H, ≥1(V+tl+th)) validation region
(a) pre-ﬁt and (b) post-ﬁt, and the (LJ, ≥3b, 0fj, 0(th+tl), ≥1H, 0V) validation region (c)
pre-ﬁt and (d) post-ﬁt. The expected T singlet signal (solid red) for mT = 1.6 TeV and
κ = 0.5 is included in the pre-ﬁt ﬁgures. The “others” background includes the t¯t V /H, V H,
tZ, t¯tt¯t, diboson, and multijet backgrounds. The bottom panels display the ratios of data
to the total background prediction. The last bin in each distribution contains the overﬂow.
These ﬁgures are taken from [79].

(a)

(b)

(c)

(d)

Figure 6.17: Comparison between the data and prediction for the meﬀ distribution under
the background-only hypothesis, in the (HJ, ≥3b, ≥1fj, 0H, ≥1(V+tl+th)) validation region
(a) pre-ﬁt and (b) post-ﬁt, and the (HJ, ≥3b, 0fj, ≥1H, ≥1(V+tl+th)) validation region (c)
pre-ﬁt and (d) post-ﬁt. The expected T doublet signal (solid purple) for mT = 1.6 TeV and
κ = 0.5 is included in the pre-ﬁt ﬁgures. The “others” background includes the t¯t V /H, V H,
tZ, t¯tt¯t, diboson, and multijet backgrounds. The bottom panels display the ratios of data
to the total background prediction. The last bin in each distribution contains the overﬂow.
These ﬁgures are taken from [79].

6.1.8.2 Limits on Single Vector-Like Quark Production

As discussed in the previous subsection, no significant excess above the SM prediction is found
in any of the considered regions in the background-only fit. Furthermore, the unconditional
fits with a floating signal-strength parameter µ are also consistent with the background-only
hypothesis. Upper limits at the 95% CL on the T production cross section are therefore derived
in both the singlet (T ) and doublet (T B) scenarios. The observed cross-section limits are
compared to the NLO theoretical prediction to set exclusion limits on the model parameters.
The theory cross-section calculations, which include finite-width effects and non-resonant
contributions, are reliable up to a relative T decay width (Γ/M ) of approximately 50% [99],
so results are shown only in this restricted regime.

The limits obtained for the singlet and doublet scenarios are shown in Figure 6.18 and
Figure 6.19, respectively, for three values of the common coupling parameter κ chosen to
approximately span the sensitivity range of the search in each scenario. The upper limits are
also interpreted as a function of the T mass and coupling strength, as shown in Figure 6.20
and Figure 6.21 for the singlet and doublet scenarios, respectively. For the singlet scenario,
all T masses below 2.1 TeV (expected 1.9 TeV) are excluded at couplings κ ≥ 0.6. At a mass
of 1.6 TeV, all coupling strength values above 0.3 (expected 0.41) are excluded. For the
doublet scenario, the limits in the considered mass range extend down to a coupling value of
κ = 0.55, corresponding to a T quark mass limit of 1.0 TeV. At a coupling strength of
κ = 0.75, masses up to 1.68 TeV are excluded at the threshold of 50% relative T decay width.
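As an illustration of how a mass exclusion can be read off from limit curves such as these, the following minimal sketch finds the mass at which a cross-section limit crosses a theory prediction, interpolating linearly in the logarithm of their ratio between adjacent mass points. The function name `crossing_mass` and all numerical values are hypothetical placeholders; they are not taken from the analysis.

```python
import math

def crossing_mass(masses, limit_fb, theory_fb):
    """Return the first mass at which the limit rises above the theory curve.

    Interpolates log(limit / theory) linearly between adjacent mass points.
    Returns None if no such crossing exists in the scanned range.
    """
    diffs = [math.log(l / t) for l, t in zip(limit_fb, theory_fb)]
    for i in range(len(masses) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 <= 0.0 < d1:  # excluded at point i, not excluded at point i + 1
            frac = -d0 / (d1 - d0)
            return masses[i] + frac * (masses[i + 1] - masses[i])
    return None

# Hypothetical illustrative numbers only (not from the analysis).
masses = [1000.0, 1200.0, 1400.0, 1600.0, 1800.0, 2000.0]  # GeV
theory = [400.0, 200.0, 100.0, 50.0, 25.0, 12.5]           # fb
limit = [100.0, 60.0, 40.0, 30.0, 20.0, 18.0]              # fb

m_excl = crossing_mass(masses, limit, theory)
print(f"Masses below about {m_excl:.0f} GeV are excluded in this toy example")
```

Interpolating in the logarithm is a design choice of the sketch, motivated by the roughly exponential fall of the cross section with mass; other interpolation schemes are equally possible.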

The expected limits on the production cross section become progressively stronger at larger
T masses in both scenarios, as the decay products of the T become more boosted and the
fraction of signal in the highest meff bins increases. The limits deteriorate at larger values of
κ, since this regime corresponds to a large resonance width, and a larger fraction of the signal
resides in the low-mass regime away from the peak of the resonance. The observed limits
are stronger than the expected limits in both signal benchmarks in a few cases, with deviations
reaching almost 2σ at high masses for the singlet scenario. This can be ascribed to downward
statistical fluctuations in a few of the most signal-sensitive bins, such as the last meff bin of
the (LJ, ≥4b, ≥1fj, 0th, ≥1tl, ≥1H, 0V) region (Figure 6.13d), which has no data events. The
origin of these discrepancies has been investigated and no evidence of any systematic bias
was found. Furthermore, as previously discussed, the pre-fit and post-fit meff distributions in
the corresponding validation regions exhibit good agreement between data and expectations.
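The effect of a downward fluctuation on an upper limit can be illustrated with a single-bin counting experiment. The sketch below assumes a CLs-type construction with no systematic uncertainties; it is a toy model and not the binned fit used to obtain the results in this section. The background yield and observed counts are hypothetical.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def cls(s, b, n_obs):
    """CLs = CL_{s+b} / CL_b for a single-bin counting experiment."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)

def upper_limit(b, n_obs, cl=0.95, s_max=100.0, tol=1e-6):
    """Bisect for the signal yield s at which CLs drops to 1 - cl."""
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(mid, b, n_obs) > 1.0 - cl:
            lo = mid  # signal yield not yet excluded, move up
        else:
            hi = mid
    return 0.5 * (lo + hi)

b = 2.0  # hypothetical expected background in the bin
limit_no_data = upper_limit(b, n_obs=0)  # bin with no data events
limit_typical = upper_limit(b, n_obs=2)  # observation equal to the expectation
print(limit_no_data, limit_typical)
```

In this toy model an empty bin yields a limit of ln(20) signal events independently of the background, which is lower (stronger) than the limit obtained when the observation matches the expectation.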


(a)

(b)

(c)

Figure 6.18: Observed (solid line) and expected (dashed line) 95% CL upper limits on the
single T production cross-section as a function of the T quark mass in the singlet scenario with
the common coupling parameter (a) κ = 0.2, (b) κ = 0.4, and (c) κ = 0.6. The surrounding
shaded bands correspond to ±1 and ±2 standard deviations around the expected limit. The
red line shows the NLO theoretical cross-section prediction, with the surrounding shaded
band representing the corresponding uncertainty. Limits are only presented in the regime
Γ/M ≤ 50%, where the theory calculations are known to be valid, as indicated by the vertical
gray dashed line. These ﬁgures are taken from [79].

(a)

(b)

(c)

Figure 6.19: Observed (solid line) and expected (dashed line) 95% CL upper limits on the
single T production cross-section as a function of the T quark mass in the doublet scenario
with the common coupling parameter (a) κ = 0.2, (b) κ = 0.4, and (c) κ = 0.6. The
surrounding shaded bands correspond to ±1 and ±2 standard deviations around the expected
limit. The red line shows the NLO theoretical cross-section prediction, with the surrounding
shaded band representing the corresponding uncertainty. Limits are only presented in the
regime Γ/M ≤ 50%, where the theory calculations are known to be valid, as indicated by
the vertical gray dashed line. These ﬁgures are taken from [79].

(a)

(b)

Figure 6.20: (a) Observed and (b) expected 95% CL exclusion limits on the cross section
times branching ratio of single T quark production as a function of the universal coupling
constant κ and the T quark mass in the SU(2) singlet scenario. The red hashed area
around the observed limit corresponds to the theoretical uncertainty on the NLO theoretical
cross-section prediction. All values of κ above the black contour line are excluded at each
mass point. The purple contour lines denote exclusion limits of equal cross section times
branching ratio in units of fb. Limits are only presented in the regime Γ/M ≤ 50%, where
the theory calculations are known to be valid. These ﬁgures are taken from [79].

(a)

(b)

Figure 6.21: (a) Observed and (b) expected 95% CL exclusion limits on the cross section
times branching ratio of single T quark production as a function of the universal coupling
constant κ and the T quark mass in the SU(2) doublet scenario. All values of κ above
the black contour line are excluded at each mass point. The purple contour lines denote
exclusion limits of equal cross section times branching ratio in units of fb. Limits are only
presented in the regime Γ/M ≤ 50%, where the theory calculations are known to be valid.
These ﬁgures are taken from [79].

6.2 Pair Production of Vector-Like Quarks

6.2.1 Analysis Strategy

The pair production analysis is optimized to search for pair-produced T quarks where one of
the T quarks decays into a top quark and either a Higgs or a Z boson. The analysis is divided
into a 0-lepton and a 1-lepton channel. However, at the time of writing this dissertation, the
1-lepton channel is far more developed than the 0-lepton channel, so only the 1-lepton channel
is discussed. Since this analysis closely follows the background model and shares the same
signal decay channels as the single production analysis, some elements of the single production
analysis strategy carry over to the pair production analysis. In particular, the boosted object
tagging and reconstruction and the choice of meff as the analysis discriminant variable remain
the same. The systematic uncertainty model used in the pair production analysis also closely
follows that of the single production analysis, because the background model and event
kinematics are almost identical. Furthermore, since the simulation of the most relevant
background processes is identical between the analyses, the background reweighting procedure
designed for the single production analysis is also implemented in this analysis. Some minor
modifications are made to this procedure in order to accommodate the event preselection of
the pair production analysis; these are described in the following sections. Finally, the results
presented for this analysis were obtained using the same statistical analysis methodology as
the one used in the single production analysis.

Signal processes in this analysis are categorized based on their combinatorial decay

topologies as follows:

1. HtHt for both vector-like tops decaying into Ht

2. HtW b for one vector-like top decaying into Ht and the other into W b

3. HtZt for one vector-like top decaying into Ht and the other into Zt

4. ZtZt for both vector-like tops decaying into Zt
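The combinatorial origin of these categories can be sketched by enumerating all unordered pairs of T decay modes and weighting each pair by the product of the branching ratios. The branching ratio values below (50% W b, 25% Ht, 25% Zt) are an assumption made purely for illustration, and the helper `pair_topologies` is hypothetical. The enumeration also produces topologies with no Ht or Zt pair (W bW b and W bZt) that are not among the four categories listed above.

```python
from itertools import combinations_with_replacement

# Assumed branching ratios, for illustration only.
branching = {"Ht": 0.25, "Zt": 0.25, "Wb": 0.50}

def pair_topologies(br):
    """Return {topology label: probability} for unordered pairs of T decays."""
    out = {}
    for a, b in combinations_with_replacement(sorted(br), 2):
        # Mixed pairs can occur in two orderings (T, Tbar), hence the factor 2.
        multiplicity = 1 if a == b else 2
        out[a + b] = multiplicity * br[a] * br[b]
    return out

topologies = pair_topologies(branching)
considered = ["HtHt", "HtWb", "HtZt", "ZtZt"]  # categories listed in the text

for label in considered:
    print(f"{label}: {topologies[label]:.4f}")
print("fraction covered:", sum(topologies[label] for label in considered))
```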

The 1-lepton channel strategy is optimized for the pair production of T quarks with one
T → Ht decay. However, the 1-lepton channel has some sensitivity to ZtZt processes in which
a large amount of $E_{\mathrm{T}}^{\mathrm{miss}}$ is produced, as will be discussed shortly. The
combinatorial nature of the decays of the T provides kinematic features that are exploited in
the design of the analysis strategy. The signal processes are characterized by the production
of a large number of jets and b-tagged jets, as can be observed in Figure 6.22 at the event
preselection level discussed in subsection 4.2.3. The increase in jet multiplicity in signal events
is attributed to the dominant decay modes of the decay products of the T quarks, such as
$H \to b\bar{b}$ decays and hadronically decaying top quarks. Background events that have a
large multiplicity of jets are expected to come from the dominant $t\bar{t}$+jets processes.
However, these jets mostly originate outside the main $t\bar{t}$ decay topology as final state
radiation and are thus not very energetic. The number of b-tagged jets in signal is larger
overall than in the total SM background, with the distribution shifted towards higher
multiplicities for signal processes with a T → Ht decay. Another kinematic feature that
characterizes signal processes is the production of a large amount of
$E_{\mathrm{T}}^{\mathrm{miss}}$, as can be observed in Figure 6.23. The
$E_{\mathrm{T}}^{\mathrm{miss}}$ in signal processes is mostly expected to come from Z → νν
decays or from leptonically decaying top quarks, which are boosted as a consequence of the
large mass of the T.

Because both T quarks are produced with a large mass and low pT, their decay products often
emerge as boosted objects that are back-to-back. As a result, the number of jets that are
tagged to hadronically decaying boosted objects increases due to the additional T that is
produced in signal processes, as can be observed in Figure 6.24. The distributions of the
number of top-tagged jets, Higgs-tagged jets, W/Z-tagged jets, and reconstructed leptonic
tops are shown in Figure 6.25. The tagging of RC jets and the reconstruction of the leptonic
top are performed as discussed in subsection 6.1.3. The fractions of signal events in the
different bins of these distributions are as expected for the signal decay topologies that are
considered. As in the single production analysis, the lepton in the 1-lepton channel is expected
to originate from a leptonically decaying top quark in signal processes. The combined presence
of these boosted objects allows for the reconstruction of candidate T quarks, which is described
further in subsection 6.2.4.

(a)

(b)

Figure 6.22: The distributions of the multiplicities of jets (a) and b-tagged jets (b) at the
1-lepton channel preselection level, overlaid for the different signal processes for a T
mass of 1.6 TeV and the SM background.

Figure 6.23: The distribution of $E_{\mathrm{T}}^{\mathrm{miss}}$ at the 1-lepton channel
preselection level, overlaid for the different signal processes for a T mass of 1.6 TeV and the
SM background.

Figure 6.24: The distribution of the number of reclustered large-R jets that are tagged to
either a top quark, a Higgs boson, or a vector boson at the 1-lepton channel preselection
level, overlaid for the different signal processes for a T mass of 1.6 TeV and the SM
background.

(a)

(b)

(c)

(d)

Figure 6.25: The distributions of the number of reclustered large-R jets that are tagged to a
hadronically decaying top quark (a), Higgs boson (b), and vector boson (c), and the number
of reconstructed leptonically decaying top quarks (d). The distributions are shown at the
1-lepton channel preselection level and overlaid for the different signal processes for a
T mass of 1.6 TeV and the SM background.

6.2.2 Signal Discrimination

As previously discussed, the meff variable is used as the analysis discriminant, as is done
in the single production analysis. The use of meff in the pair production analysis is further
motivated by the additional T that is produced in signal processes. First, the number of
final state objects in signal processes increases due to the additional T. Second, both T quarks
are produced with low pT, but because of their large mass their decay products emerge as
boosted objects, which results in energetic final states. Together, these effects increase the
separation power between signal and background processes compared to the single production
analysis. As can be observed in Figure 6.26, the meff distribution peaks close to twice the
mass of the T.
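For concreteness, the discriminant can be sketched as follows, assuming that meff is the scalar sum of the transverse momenta of the selected jets and leptons and of the missing transverse momentum. The definition actually used in the analysis is the one introduced for the single production analysis; the function `effective_mass` and the event values here are hypothetical.

```python
def effective_mass(jet_pts, lepton_pts, met):
    """Scalar sum of jet pT, lepton pT, and missing transverse momentum (GeV).

    This definition is an assumption of the sketch.
    """
    return sum(jet_pts) + sum(lepton_pts) + met

# Hypothetical signal-like event with six jets, one lepton, and sizeable
# missing transverse momentum; all values in GeV.
jets = [650.0, 540.0, 410.0, 300.0, 120.0, 60.0]
leptons = [150.0]
met = 270.0

meff = effective_mass(jets, leptons, met)
print(meff)
```

Each additional energetic final state object adds directly to the sum, which is why a second heavy T in the event shifts the signal distribution to higher values.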

(a)

(b)

Figure 6.26: The distributions of meff at the 1-lepton channel preselection level, overlaid
for the different signal processes for a T mass of 1.6 TeV (a) and of 2.0 TeV (b), and the
SM background. An additional cut of meff > 1.0 TeV is included to highlight the separation
between signal and background in a region that is close to the analysis search regions.

6.2.3 Kinematic Reweighting of Background

Both the single production and pair production analyses follow a similar background modeling.
The $t\bar{t}$+jets background is the main irreducible background in both analyses, followed
by subdominant contributions from single-top and V +jets production processes. The MC
simulation samples that are used to model these background processes in the pair production
analysis were generated using the same MC generators as the samples used in the single
production analysis; therefore, the same MC mismodeling that was discussed in
subsection 6.1.5 is present in the pair production analysis. It should be noted that the version
of the Sherpa MC generator that is used to model the V +jets background processes was
updated from v2.2.1 to v2.2.11. This results in an overall improvement in the modeling of
V +jets. However, these backgrounds are still mismodeled in the kinematic regime where the
signal is expected to reside, albeit to a lesser degree than in the single production analysis.
The distributions of meff and Njets for data and for the total SM background simulation prior
to applying any correction factors are shown in Figure 6.27. Both distributions are shown in
the 1l, ≥ 5j, 2b region, which is background-dominated. As can be observed, the MC
prediction overestimates the data at high values of meff and underestimates the data at high
jet multiplicities, which is where the signal is expected to reside.

(a)

(b)

Figure 6.27: The distributions of meff (a) and Njets (b) for data and for the total SM
background simulation prior to applying any correction factors. The “Others” background
includes the $t\bar{t}V/H$, $t\bar{t}t\bar{t}$, diboson, and multijet backgrounds. The
distributions are shown in the 1l, ≥5j, 2b region. All bins are weighted by the bin width.

A background reweighting procedure is applied in the 1-lepton channel of this analysis
in order to improve the MC simulation modeling. The implementation of the reweighting
procedure closely follows the strategy outlined in subsection 6.1.5, with a few modifications.
In order to accommodate the event preselection of the pair production analysis, the RSRs
for both $t\bar{t}$ + W t and V +jets are defined starting at ≥ 5j instead of ≥ 3j. Additionally,
for the $t\bar{t}$ + W t reweighting, the 1l, ≥ 7j, 2b RSR that was used in the single production
analysis is now split into the RSRs 1l, 7j, 2b and 1l, ≥ 8j, 2b. This choice is motivated
by the presence of sufficient background statistics in the resulting RSRs, which allows the
derivation of more reliable correction factors in the ≥ 7j region. As a result of the improved
V +jets modeling from Sherpa v2.2.11, it was determined that Njets correction factors alone
are sufficient to address the remaining mismodeling of these processes. The Njets correction
factors for V +jets are applied as a bin-by-bin jet multiplicity correction prior to the derivation
of the $t\bar{t}$ + W t correction factors. For the $t\bar{t}$ + W t reweighting,
$m_{\mathrm{eff}}^{\mathrm{red}}$ is redefined in order to better capture the behavior of the
additional jets from $t\bar{t}$ processes. Thus, instead of an Njets − 3 shift, the new definition
of $m_{\mathrm{eff}}^{\mathrm{red}}$ uses an Njets − 5 shift. In order to determine the constant
pT scale that multiplies the Njets shift, the average jet pT was calculated as a function of the
number of additional jets at the event preselection, as shown in Figure 6.28. Three linear fits,
which differ in the range of Njets that is considered, were performed to determine the pT scale
from the slope of the line. As can be observed, the average jet pT is slightly higher in events
with at least 10 jets. However, as previously discussed, these jets originate mostly from outside
the main $t\bar{t}$ decay topology. The pT scale is therefore taken from the linear trend
observed for Njets ≤ 9. Based on this discussion, the new definition of
$m_{\mathrm{eff}}^{\mathrm{red}}$ is given by:

\[ m_{\mathrm{eff}}^{\mathrm{red}} = m_{\mathrm{eff}} - (N_{\mathrm{jets}} - 5) \times 40~\mathrm{GeV} \qquad (6.10) \]
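The two steps described above, extracting a pT scale from the slope of a linear fit and applying Equation 6.10, can be sketched as follows. The fit inputs are hypothetical values chosen so that the slope comes out at 40 GeV; they are not the measured points of Figure 6.28, and the helper names are illustrative only.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def reduced_meff(meff, n_jets, pt_scale=40.0, n_jets_offset=5):
    """Equation 6.10: meff minus a constant pT per jet beyond the offset (GeV)."""
    return meff - (n_jets - n_jets_offset) * pt_scale

# Hypothetical (number of additional jets, average hadronic pT sum in GeV).
n_additional = [0, 1, 2, 3, 4]
avg_ht_had = [5.0, 45.0, 85.0, 125.0, 165.0]

slope, intercept = linear_fit(n_additional, avg_ht_had)
print(f"pT scale from slope: {slope:.1f} GeV per additional jet")

# An event with meff = 2000 GeV and 7 jets is shifted down by 2 x 40 GeV.
print(reduced_meff(2000.0, 7))
```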

The $m_{\mathrm{eff}}^{\mathrm{red}}$ background correction factors are fitted using the same
sigmoid functional template that was used in the single production analysis. The distributions
of meff and Njets for data and for the total SM background simulation after applying the
background correction factors are shown in Figure 6.29. As can be observed, the MC modeling
is significantly improved. The reweighted background MC simulation is used throughout the
remaining discussion of this analysis.
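As a rough illustration of a bin-by-bin jet multiplicity correction of the kind applied to V +jets, the sketch below derives one factor per Njets bin and looks it up per event. It assumes that each factor is the ratio of data, after subtracting all other simulated backgrounds, to the simulated yield of the process being corrected; the actual derivation follows subsection 6.1.5 and may differ. All yields and function names are hypothetical.

```python
def njets_correction_factors(data, target_mc, other_mc):
    """Per-bin factor (data - other backgrounds) / target MC, keyed by N_jets."""
    factors = {}
    for n_jets, n_data in data.items():
        target = target_mc[n_jets]
        if target <= 0.0:
            factors[n_jets] = 1.0  # leave empty bins uncorrected
            continue
        factors[n_jets] = (n_data - other_mc.get(n_jets, 0.0)) / target
    return factors

def event_weight(n_jets, factors, last_bin):
    """Look up the factor; the last bin is treated as inclusive (>= last_bin)."""
    return factors[min(n_jets, last_bin)]

# Hypothetical yields per jet multiplicity bin; the 7-jet bin stands for >= 7 jets.
data = {5: 1000.0, 6: 520.0, 7: 260.0}
target_mc = {5: 180.0, 6: 90.0, 7: 40.0}    # process being corrected
other_mc = {5: 800.0, 6: 420.0, 7: 212.0}   # all other backgrounds

factors = njets_correction_factors(data, target_mc, other_mc)
print(factors)
print(event_weight(9, factors, last_bin=7))
```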

6.2.4 VLQ Reconstruction

The reconstruction of candidate T s in events from signal processes is possible due to the

combined presence of boosted objects that are identiﬁed as the direct decay products of the

T s. As can be observed in Figure 6.24, a large fraction of signal events have at least two

RC jets that are tagged to a hadronically decaying boosted object, which allows the possible

reconstruction of at least one T . A dedicated algorithm is implemented to reconstruct

candidate T s using the identiﬁed boosted objects in the event. The algorithm works under the

assumption that the decay topology of each T is resolved due to their low pT, and therefore

249

Figure 6.28: Plot of the average pT of additional jets at preselection level (average Hhad
)
as a function of the number of additional jets at preselection level. The colored lines show
the linear ﬁts that were performed to determine constant pT scale in the mred
eﬀ deﬁnition.
The red line shows the ﬁt obtained by including all values of the average Hhad
, the blue line
shows the ﬁt obtained by including the values of the average Hhad
up to Njets = 9, and the
green line shows the ﬁt obtained by including the values of the average Hhad
for Njets ≤ 9.

T

T

T

T

(a)

(b)

Figure 6.29: The distributions of meﬀ (a) and Njets (b) overlayed between data and the total
SM background simulation after applying all background correction factors. The “Others”
background includes the t¯tV /H, t¯tt¯t, diboson, and multijet backgrounds. The distributions
are shown in the 1l, ≥5j, 2b region. All bins are weighted by the bin width.

250

100015002000250030003500400045005000Events5-104-103-102-101-101102103104105106105j, 2b‡1l, -1=13 TeV, 139 fbsData+light-jetstt1c‡+tt1b‡+ttSingle-topOthersV+jetsSM Total         [GeV]effm100015002000250030003500400045005000Data/MC00.20.40.60.811.21.468101214Events50100150200250300310·5j, 2b‡1l, -1=13 TeV, 139 fbsData+light-jetstt1c‡+tt1b‡+ttSingle-topOthersV+jetsSM Total        Number of jets68101214Data/MC00.20.40.60.811.21.4the two boosted objects that emerge from each T are approximately back-to-back. First,

all the identiﬁed boosted objects in the event are grouped into pairs based on the possible

decays that each T might have had in the event. Next, all pairs of boosted objects are sorted

by descending ∆R distance, in accordance with the resolved decay topology assumption that

was made for each T . Finally, the leading and subleading sorted pairs of boosted objects

are used to reconstruct the candidate T s in the event by adding the four-momenta of the

two boosted objects in each pair. The distributions of the invariant mass of the leading

and subleading pairs are shown in Figure 6.30. As can be observed, the invariant mass in

signal processes has a narrow peak close to the corresponding mass value of the T s, which

gives conﬁdence in the reconstruction algorithm. For background processes, the invariant

mass distributions peak at lower values and have a large tail at higher mass values, which is

expected from trying to reconstruct T s from random background processes.
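
The pairing, sorting, and four-momentum addition described above can be illustrated with a minimal Python sketch. It is not the analysis code: the object representation (dictionaries with `tag`, `pt`, `eta`, `phi`, and `mass`, in GeV), the `ALLOWED_PAIRS` set, and all function names are assumptions made for illustration, and the sketch does not address whether the leading and subleading pairs may share a boosted object.

```python
import itertools
import math

# Hypothetical tag pairings compatible with a T decay (assumption, for illustration only).
ALLOWED_PAIRS = {frozenset(("H", "top")), frozenset(("V", "top"))}


def delta_r(a, b):
    """Angular distance between two objects given their eta and phi."""
    dphi = math.remainder(a["phi"] - b["phi"], 2.0 * math.pi)  # wrapped to [-pi, pi]
    return math.hypot(a["eta"] - b["eta"], dphi)


def four_momentum(obj):
    """Return (E, px, py, pz) in GeV from pt, eta, phi and mass."""
    px = obj["pt"] * math.cos(obj["phi"])
    py = obj["pt"] * math.sin(obj["phi"])
    pz = obj["pt"] * math.sinh(obj["eta"])
    energy = math.sqrt(px**2 + py**2 + pz**2 + obj["mass"] ** 2)
    return energy, px, py, pz


def invariant_mass(a, b):
    """Invariant mass of the summed four-momenta of two objects."""
    e, px, py, pz = (x + y for x, y in zip(four_momentum(a), four_momentum(b)))
    return math.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))


def reconstruct_candidates(boosted_objects):
    """Masses of the leading and subleading pairs, ordered by descending delta R."""
    pairs = [
        (a, b)
        for a, b in itertools.combinations(boosted_objects, 2)
        if frozenset((a["tag"], b["tag"])) in ALLOWED_PAIRS
    ]
    pairs.sort(key=lambda pair: delta_r(*pair), reverse=True)
    return [invariant_mass(a, b) for a, b in pairs[:2]]
```

Sorting by descending ∆R places the most back-to-back pair first, which mirrors the resolved-topology assumption stated in the text.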


Figure 6.30: The invariant mass distributions of the reconstructed candidate T s at the 1-
lepton channel preselection level overlayed between the diﬀerent signal processes and the SM
background. The distributions are shown for candidate T s with a mass of mT = 1.6 TeV
and mT = 2.0 TeV that are reconstructed from the leading (a)-(b) and subleading (c)-(d)
pairs of boosted objects in ∆R distance.

6.2.5 Multivariate Analysis

As previously discussed, kinematic variables such as meﬀ, the invariant masses of the re-

constructed candidate T s, and the multiplicity of boosted objects in the event contain dis-

criminatory power between signal and background processes. Additionally, other kinematic

features that take advantage of the combinatorial nature of the decays of the T s have been

deﬁned. Examples of such variables include the pT of boosted objects in the event, the

angular separations of boosted objects in the event, $E_{\mathrm{T}}^{\mathrm{miss}}$-related variables, and the number

of subjet constituents within tagged jets. The distributions of some of these variable types

are shown in Figure 6.31 in a 1-lepton channel region that requires at least 6 jets, at least 3

b-tagged jets, and at least 3 RC jets, of which at least 2 must be tagged to a boosted hadron-

ically decaying object (1l, ≥ 6j, ≥ 3b, ≥ 2M, ≥ 3J). These requirements are made in order

to select events from background processes that contain multiplicities of jets and boosted

objects that resemble those from events in signal processes at the event preselection level.

Additionally, a meﬀ ≥ 1 TeV cut is also applied to this region in order to select background

events that are in a kinematic regime where most of the signal is expected to reside.
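
As an illustration, the region requirements listed above can be written as a single event-level predicate. The event field names used here (`n_leptons`, `n_jets`, `n_bjets`, `n_rc_jets`, `n_tagged_rc_jets`, `meff`) are hypothetical and do not correspond to any actual analysis ntuple; `meff` is assumed to be in GeV.

```python
def in_training_region(event):
    """True if the event passes 1l, >=6j, >=3b, >=2M, >=3J and meff >= 1 TeV."""
    return (
        event["n_leptons"] == 1
        and event["n_jets"] >= 6
        and event["n_bjets"] >= 3
        and event["n_tagged_rc_jets"] >= 2  # tagged RC jets (M)
        and event["n_rc_jets"] >= 3  # all RC jets (J)
        and event["meff"] >= 1000.0  # GeV
    )
```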

A multivariate analysis (MVA) was performed in the 1-lepton channel in order to fully

exploit the information present in all these variables to classify events as either signal pair

production events or SM background events. The MVA consisted of training three

separate DNN models to perform the event classiﬁcation task. The DNNs were trained in

the region 1l, ≥ 6j, ≥ 3b, ≥ 2M, ≥ 3J with the meﬀ ≥ 1 TeV cut applied in order for the

DNNs to learn to separate background processes in events that are kinematically similar to

signal processes. Furthermore, the DNNs were trained to be agnostic to the decay modes

of the T s and independent of their mass by including all relevant signal decay channels with


Figure 6.31: The distributions of meﬀ (a), the reconstructed candidate T invariant mass
from the leading ∆R pair of boosted objects (b), the pT of the reconstructed candidate
leptonic top (c), the number of b-tagged subjets in the pT leading Higgs-tagged jet (d), the
minimum absolute value of ∆φ between two tagged jets in the event (e), and the minimum
∆R between two tagged jets in the event (f). The distributions are overlayed between the
diﬀerent signal processes for a T mass of 1.6 TeV and the SM background.

mT = 1.4 TeV, 1.6 TeV, and 1.8 TeV as part of the training events. Two DNN models were

trained with 30 input variables: one with all background processes as part of the training

(30 vars., all bkgd.), and the other with t¯t events as the only background process in the

training (30 vars., t¯t only). The remaining DNN was trained with 20 input variables and all

background processes as part of the training (20 vars., all bkgd.). The list of input variables

used by each DNN is summarized in Table 6.10.


Variable

30 vars., all bkgd.

30 vars., t¯t

20 vars., all bkgd.

meff
$m_{\mathrm{T,min}}^{b}$
$m_{\mathrm{T}}^{W}$
$E_{\mathrm{T}}^{\mathrm{miss}}$
Residual $E_{\mathrm{T}}^{\mathrm{miss}}$
pT(leptonic top)
pT(RCjet2)
pT(RCMHiggs0)
pT(RCMHiggs1)
pT(RCMV0)
pT(RCMTop0)
Nconst(RCMHiggs0)
Nconst(RCMV0)
Nbconst(RCMHiggs0)
Nbconst(RCMV0)
∆φmin(RCTTM)
∆φmin(RCjets)
∆φavg(RCjets)
∆ηmin(RCTTM)
∆ηmin(RCjets)
Leading ∆η(RCjets)
∆Rmin(RCMTT)
∆Rmin(RCjets)
∆Ravg(RCjets)
Leading ∆R(RCMTT)
Leading ∆R(RCjets)
$m_{\mathrm{vlq}}^{0}$(RCTTM, ∆Rmax)
$m_{\mathrm{vlq}}^{1}$(RCTTM, ∆Rmax)
$m_{\mathrm{vlq}}^{1}$(RCjets, ∆Rmax)
Nbjets
NRCjets

•
•
•
•
•
•
•
•
•
•
•
•
•
•

•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•

•
•
•
•
•
•
•
•
•

•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•

•
•

•

•
•
•

•

•
•
•

•
•
•

•
•
•

•
•
•
•

Table 6.10: List of input variables of the three DNN models that were trained in the 1-lepton
channel MVA. A tick indicates that the variable is used as an input to the given
DNN model.


The performance of each DNN is assessed with its receiver operating characteristic

(ROC) curve. The ROC curve is a parametric curve that shows the fraction of background

events that are correctly identified as background ($1 - \epsilon_{\mathrm{background}}$) as a function of the

fraction of signal events that are correctly identified as signal ($\epsilon_{\mathrm{signal}}$) at a given DNN score

value selection cut. The ROC curves for the three DNN models are shown in Figure 6.32.

As can be observed in the plot, the performance of the three DNN models is similar.

The DNN that was trained with 30 input variables and with all background processes was

chosen as the event classiﬁer due to the robustness in the background model used in its

training. Figure 6.33 shows the distribution of the DNN score of this model, which is

also referred to as the MVA score. As can be observed in the plot, the DNN achieves

good separation power between signal and background processes. Two MVA score working

points were optimized based on the background rejection they achieve, which are shown in

Figure 6.32. The low working point was set to 0.16, which achieves a background rejection

of approximately 75%. The high working point was set to 0.81, which achieves a background

rejection of approximately 95%. The signal eﬃciencies that are achieved were found to be

consistent across the diﬀerent signal processes considered and T mass values. The overall

signal eﬃciency is approximately 95% at the low working point and 77% at the high working

point. These working points are used to deﬁne the analysis baseline search regions, which

will be discussed in the next section.
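
To make the relationship between a score cut, the signal efficiency, and the background rejection concrete, the following Python sketch computes ROC points and finds the score cut that reaches a target background rejection. It assumes unweighted events and plain lists of scores, whereas the analysis uses weighted simulated events; the function names are illustrative and not taken from the analysis software.

```python
def efficiency(scores, cut):
    """Fraction of events whose score is at or above the cut."""
    return sum(score >= cut for score in scores) / len(scores)


def roc_points(signal_scores, background_scores, cuts):
    """(signal efficiency, background rejection) for each score cut."""
    return [
        (efficiency(signal_scores, cut), 1.0 - efficiency(background_scores, cut))
        for cut in cuts
    ]


def working_point(background_scores, target_rejection):
    """Smallest observed score cut whose background rejection reaches the target."""
    for cut in sorted(set(background_scores)):
        if 1.0 - efficiency(background_scores, cut) >= target_rejection:
            return cut
    return None  # target not reachable with the available events
```

Scanning the cuts in ascending order returns the loosest cut that satisfies the rejection target, which keeps the signal efficiency as high as possible for that target.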


Figure 6.32: The ROC curve of the three DNN models that were trained in the 1-lepton MVA.
The horizontal axis shows the signal eﬃciency evaluated for the signal processes HtHt (a),
HtZt (b), and ZtZt (c) for a T mass of 1.6 TeV. The vertical axis shows the background
rejection that is achieved at a given value of the signal eﬃciency. The upwards (downwards)
triangle marks indicate points in the ROC curve that achieve a 95% (75%) background
rejection. The corresponding signal eﬃciencies and the MVA score that achieves these values
are also displayed. The bottom panel in each plot shows the ratio of the ROC curve value
of each model to the 30 vars., all bkgd. model.

Figure 6.33: The distribution of the MVA score obtained from the DNN that was trained
with 30 input variables and all background processes. The distribution is shown for events
in the DNN training region 1l, ≥ 6j, ≥ 3b, ≥ 2M, ≥ 3J with meﬀ ≥ 1 TeV. The plot
is overlayed between the diﬀerent signal processes for a T mass of 1.6 TeV and the SM
background.

1-lepton channel MVA cut                    Region name

MVA score ≥ 0.81 (High MVA score)           1l, ≥ 5j, ≥ 3b, ≥ 2M, ≥ 3J, HMVA
0.16 ≤ MVA score < 0.81 (Mid MVA score)     1l, ≥ 5j, ≥ 3b, ≥ 2M, ≥ 3J, MMVA
MVA score < 0.16 (Low MVA score)            1l, ≥ 5j, ≥ 3b, ≥ 2M, ≥ 3J, LMVA

Table 6.11: The cuts on the DNN score that are used to deﬁne the baseline analysis search
regions.

6.2.6 Analysis Search Regions

As discussed in the previous section, the MVA performed in the 1-lepton channel resulted

in the development of a DNN that is designed to classify events as either signal T pair

production events or SM background events. Two working points on the MVA score were

deﬁned based on the background rejection metric. The low working point is deﬁned to achieve

a 75% background rejection, which corresponds to an MVA score of 0.16 and an overall signal

eﬃciency of 95%. The high working point is deﬁned to achieve a 95% background rejection,

which corresponds to an MVA score of 0.81 and an overall signal eﬃciency of 77%. These

two working points allow for the deﬁnition of simpler analysis search regions when compared

to the ones defined in the single production analysis. This is because the DNN training is

agnostic to the signal processes that are considered in the analysis. Therefore,

the purity of signal events is expected to be higher in regions that are deﬁned by requiring

events to have an MVA score higher than the high working point. Conversely, regions that

are deﬁned by requiring events to have an MVA score lower than the low working point are

expected to be background-dominated and can serve as background control regions. These

observations motivate the definitions of the baseline analysis search regions that

are listed in Table 6.11. The baseline regions are more inclusive than the DNN training

region, requiring the presence of at least 5 jets instead of 6. This is done in order to

increase the signal sensitivity of the analysis. The potential impact that the loosening of this

requirement might have had on the performance of the DNN was assessed and found to be

negligible. An intermediate region (1l, ≥ 5j, ≥ 3b, ≥ 2M, ≥ 3J, MMVA) is also included as

a baseline region in order to retain some sensitivity to signal processes that are not targeted

by the 1-lepton channel, such as the ZtZt signal process.

Since the 1-lepton channel of the analysis targets the pair production of T s with one

T → Ht decay, the purity of the diﬀerent signal processes in the baseline regions can be

further improved by making additional requirements on the multiplicities of b-tagged jets

and Higgs-tagged jets. For example, the purity of the HtHt signal process can be increased

by requiring at least 4 b-tagged jets and at least one Higgs-tagged jet in the baseline regions.

On the other hand, the purity of the HtZt and HtWb signal processes can be increased by requiring exactly

3 b-tagged jets or no Higgs-tagged jets. The analysis search regions, also referred

to as fit regions, are summarized in Table 6.12. The number of expected events in the SU(2)

doublet and singlet signal benchmarks for a T mass of 1.6 TeV and the total background in

each ﬁt region are also listed. In addition to the regions that are deﬁned with the MVA score,

three regions that are deﬁned only through boosted object multiplicities are also included.

These regions serve the same purpose as the t¯t control regions that were included as part

of the single production analysis ﬁt regions. The background composition in each of the ﬁt

regions is summarized in the pie charts shown in Figure 6.34. The dominant background

in each ﬁt region comes from t¯t+jets processes. The regions that have a low b-tagged jet

multiplicity requirement are dominated by t¯t+light-jets, while regions that have a higher b-

tagged jet multiplicity are dominated by t¯t+ ≥ 1b. The single-top, t¯tV /H, and t¯tt¯t processes

have subdominant contributions in the HMVA regions; however, the total number of expected

background events in these regions is small.

Figure 6.34: The breakdown of the background composition in the 1-lepton channel fit regions.
The “Others” background includes the t¯tV /H, t¯tt¯t, diboson, and multijet backgrounds.

Region                                        SU(2) doublet   SU(2) singlet   Total bkg

1l, ≥ 5j, 3b, ≥ 2M, ≥ 3J, 0H, HMVA            1.46            1.05            16.3
1l, ≥ 5j, 3b, ≥ 2M, ≥ 3J, 0H, MMVA            0.914           0.633           127
1l, ≥ 5j, 3b, ≥ 2M, ≥ 3J, 0H, LMVA            0.246           0.15            620

1l, ≥ 5j, 3b, ≥ 2M, ≥ 3J, ≥ 1H, HMVA          2.67            1.55            20.3
1l, ≥ 5j, 3b, ≥ 2M, ≥ 3J, ≥ 1H, MMVA          0.506           0.284           46.3
1l, ≥ 5j, 3b, ≥ 2M, ≥ 3J, ≥ 1H, LMVA          0.129           0.0631          142

1l, ≥ 5j, ≥ 4b, ≥ 2M, ≥ 3J, 0H, HMVA          1.7             0.846           13.7
1l, ≥ 5j, ≥ 4b, ≥ 2M, ≥ 3J, 0H, MMVA          0.402           0.195           36.1
1l, ≥ 5j, ≥ 4b, ≥ 2M, ≥ 3J, 0H, LMVA          0.0656          0.031           102

1l, ≥ 5j, ≥ 4b, ≥ 2M, ≥ 3J, ≥ 1H, HMVA        3.24            1.34            11.1
1l, ≥ 5j, ≥ 4b, ≥ 2M, ≥ 3J, ≥ 1H, MMVA        0.197           0.0765          12.1
1l, ≥ 5j, ≥ 4b, ≥ 2M, ≥ 3J, ≥ 1H, LMVA        0.0272          0.0103          18.5

1l, ≥ 5j, 2b, 0H, 1(V+th), 1tl                0.554           0.481           2.37e+04
1l, ≥ 5j, 3b, 0H, 1(V+th), 0–1tl              0.879           0.828           6.29e+03
1l, ≥ 5j, ≥ 4b, 0H, 1(V+th), 0–1tl            0.561           0.381           816

Table 6.12: Deﬁnition of the 15 analysis search regions (referred to as “ﬁt regions”). The
events are categorized based on the multiplicity of central jets (j), b-tagged jets (b), tagged
RC jets (M), RC jets (J), Higgs-tagged jets (H), W/Z-tagged jets (V), hadronic top-tagged
jets (th), reconstructed leptonic tops (tl), and MVA score. The expected yields of the SU (2)
doublet and singlet benchmarks for mT = 1.6 TeV and total background are shown in each
ﬁt region.
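
The categorization of the twelve MVA-based fit regions can be sketched in Python as follows. The sketch assumes the event has already passed the baseline selection (1l, ≥5j, ≥2M, ≥3J) and uses the working points quoted in the text; the function names and the handling of the three multiplicity-only control regions (omitted here) are simplifications made for illustration.

```python
LOW_WP = 0.16   # low MVA working point (about 75% background rejection)
HIGH_WP = 0.81  # high MVA working point (about 95% background rejection)


def mva_category(score):
    """Map an MVA score onto the HMVA / MMVA / LMVA categories of Table 6.11."""
    if score >= HIGH_WP:
        return "HMVA"
    if score >= LOW_WP:
        return "MMVA"
    return "LMVA"


def fit_region(n_bjets, n_higgs, score):
    """Label of the MVA-based fit region, or None if fewer than 3 b-tagged jets."""
    if n_bjets < 3:
        return None
    b_label = "3b" if n_bjets == 3 else "≥4b"
    h_label = "0H" if n_higgs == 0 else "≥1H"
    return f"1l, ≥5j, {b_label}, ≥2M, ≥3J, {h_label}, {mva_category(score)}"
```

For instance, an event with four b-tagged jets, one Higgs-tagged jet, and a score of 0.9 falls in the region that targets the HtHt signal process.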


6.2.7 Systematic Uncertainties

As previously discussed, the pair production analysis follows closely the background model

and shares the same signal decay channels with the single production analysis. As a result of

this, the sources of systematic uncertainties that are relevant to the pair production analysis

are the same as those of the single production analysis, which were described in subsection 6.1.6.

Thus, the systematic uncertainty model that is implemented in the pair production analysis

follows closely the model from the single production analysis, with a few modiﬁcations which

will be detailed in this section.

The only experimental systematic uncertainty that has been modiﬁed is the uncertainty

in the combined 2015-2018 integrated luminosity. The uncertainty associated with the mea-

surement of luminosity for the ATLAS detector for Run-2 has decreased from 1.7% to 0.83%,

which is based on the ﬁnal measurement made during the Run-2 data taking period [100].

The only modeling uncertainties that have been modiﬁed in the pair production analysis

are the uncertainties associated with the modeling of the t¯t background process and the

uncertainties associated with the background reweighting procedure. The t¯t parton shower

and hadronization modeling uncertainty, NLO generator modeling uncertainty, and radiation

modeling uncertainty are kept uncorrelated between the t¯t+light-jets, t¯t+ ≥ 1c, and t¯t+ ≥ 1b

samples, but treated as correlated amongst all analysis ﬁt regions for each sample. As

discussed in subsection 6.2.3, only a bin-by-bin Njets correction factor is applied to the

V +jets background processes. The uncertainty assigned to this correction factor is obtained

by applying the nominal correction factor shifted by the statistical error of the corresponding

Njets bin, both as a positive and negative variation. The t¯t + W t background reweighting

uncertainties are obtained from the ±2σ variations of the $m_{\mathrm{eff}}^{\mathrm{red}}$ fits that are performed in each

RSR, similar to how it was done in the single production analysis. An additional nuisance

parameter is included for the t¯t + W t background reweighting procedure that corresponds

to the 1l, ≥8j, 2b RSR resulting from the 1l, ≥7j, 2b RSR split.
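
The V+jets correction-factor variation described earlier in this section can be sketched as below. The correction factors and statistical errors in the usage example are made-up placeholder numbers, not values from the analysis, and the function name and data layout are hypothetical.

```python
def vjets_weights(n_jets, corrections):
    """Nominal, up and down V+jets weights for the Njets bin of an event.

    `corrections` maps an Njets bin to (correction factor, statistical error).
    """
    factor, stat_err = corrections[n_jets]
    return {
        "nominal": factor,
        "up": factor + stat_err,    # positive variation
        "down": factor - stat_err,  # negative variation
    }


# Placeholder values for illustration only.
example_corrections = {5: (1.10, 0.02), 6: (1.20, 0.05)}
example_weights = vjets_weights(6, example_corrections)
```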

6.2.8 Results

The results of the search for pair-produced T s in the 1-lepton channel following the statistical

analysis methodology described in subsection 6.1.7 will be presented in this section. As

previously mentioned, the analysis is in the early stages of going through the rigorous internal

review that all ATLAS analyses must go through in order to establish full conﬁdence in the

results obtained. The 0-lepton channel of the analysis is currently at the stage of ﬁnalizing

validation studies to assess how well the MC prediction models the data in order to initiate

the 0-lepton channel ﬁt studies. The 1-lepton channel of the analysis has been reviewed up

to partially blinded data results in order to assess the ﬁt model behavior in the background-

enriched and signal-depleted search regions. This review process has deemed the ﬁt model

behavior to be reliable up to this stage, establishing full conﬁdence in the background and

uncertainty models of the 1-lepton channel. While the fully unblinded data results have

not been internally reviewed, they are not expected to change signiﬁcantly once they reach

that stage. Furthermore, the interpretation of the results given here will be limited to the

1-lepton channel only. These interpretations may change once the results of the 0-lepton

channel become available and are combined with the 1-lepton channel results in order to

give a broader description in the full phase space of the analysis.


6.2.8.1 Maximum Likelihood Fits to Data

A likelihood ﬁt under the background-only hypothesis is performed on the meﬀ distributions

across the search regions of the 1-lepton channel. A comparison between the overall observed

and expected yields in each search region before and after the ﬁt to data is shown in Fig-

ure 6.35. As can be observed in the bottom panel of the pre-ﬁt plot, the agreement between

data and the predicted background is reasonable. The combined impact of the systematic un-

certainties is signiﬁcantly constrained as a result of the ﬁt by using the information from the

background-enriched and signal-depleted regions. This results in an overall improved back-

ground prediction with reduced uncertainties in the majority of the search regions. However,

the search regions that require ≥4b, 0H, and a low or mid MVA score show excesses in the

observed data that are not covered by the post-ﬁt uncertainty in these regions. The post-ﬁt

event yield breakdown from these two regions is summarized in Table 6.13. As can be ob-

served, there is a 17% and 29% excess of data over the post-ﬁt background prediction in the

≥4b, 0H, LMVA and ≥4b, 0H, MMVA regions, respectively. For comparison, the post-ﬁt

event yield breakdown in the signal-enriched HMVA regions shows no signiﬁcant excesses, as

can be observed in Table 6.14. Furthermore, as can be observed between the pre-ﬁt and post-

ﬁt yield comparison in Figure 6.35, the t¯t control regions and the 3b, LMVA regions drive

the fit. These observations indicate that the fit might be missing additional degrees of

freedom that are needed to correct the background in the ≥4b, 0H, LMVA and ≥4b, 0H,

MMVA regions. The pre-ﬁt and post-ﬁt meﬀ distributions in these two regions are shown in

Figure 6.36. As can be observed in the plots, the overall post-ﬁt agreement between data and

the background prediction is sensible and within the post-ﬁt uncertainty in all bins except

the second bin of the MMVA region. Since the t¯t+ ≥1b background dominates in these

regions, an alternative test ﬁt was performed with the t¯t+ ≥1b normalization uncertainty


being decorrelated across the high-statistics 2-3b regions and the low-statistics ≥4b regions

in order to test if the initial ﬁt model conﬁguration has missing degrees of freedom related

to this background. The results of this decorrelation test did not deviate signiﬁcantly from

those of the initial ﬁt conﬁguration. These observations will require further investigation

into the ﬁt model; however, as it was argued, these excesses are observed in signal-depleted

regions and the overall agreement between data and the post-ﬁt background on the meﬀ

distributions in these regions is good. Furthermore, the signal-enriched HMVA regions also

show overall good agreement between data and the post-ﬁt background prediction, with

only the last bin of the ≥4b, 0H, HMVA region showing a downward ﬂuctuation, as can

be observed in Figures 6.37 and 6.38. Thus, these observed excesses can be deemed not

significant.


Figure 6.35: Comparison between the data and background prediction yields in each of the
ﬁt regions considered (top) pre-ﬁt and (bottom) post-ﬁt, performed under the background-
only hypothesis. The “Others” background includes the t¯tV /H, t¯tt¯t, diboson, and multijet
backgrounds. The expected T T → HtHt signal (solid red) for mT = 1.4 TeV is included
in the pre-ﬁt ﬁgure. The bottom panels display the ratios of data to the total background
prediction. The hashed area represents the total uncertainty on the background.

Process           ≥4b, 0H, MMVA     ≥4b, 0H, LMVA

t¯t+light-jets    1.58 ± 1.2        4.35 ± 2.76
t¯t+≥1c           3.97 ± 1.36       10.97 ± 3.076
t¯t+≥1b           28.88 ± 3.03      83.71 ± 8.07
Single-top        2.42 ± 2.94       3.12 ± 2.52
W+jets            0.7 ± 0.26        1.7 ± 0.58
Z+jets            0.12 ± 0.045      0.27 ± 0.092
t¯tV              1.75 ± 0.81       3.87 ± 1.08
t¯tH              2.12 ± 0.3        5.91 ± 0.76
t¯tt¯t            2.52 ± 0.75       4.16 ± 1.25
Dibosons          0.14 ± 0.1        0.38 ± 0.23
QCD               0.13 ± 0.13       0.22 ± 0.19

Total             44.34 ± 5.15      118.66 ± 8.14

Data              57                139

Table 6.13: Predicted and observed yields in the two search regions that show an observed
excess of data after performing the background-only ﬁt. The individual systematic uncer-
tainties for the diﬀerent background processes can be correlated, and do not necessarily add
in quadrature to equal the systematic uncertainty in the total background yield. The quoted
uncertainties are computed after taking into account correlations among nuisance parameters
and among processes. The statistical uncertainty is added in quadrature to the systematic
uncertainties.
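
The excess percentages quoted in the text can be cross-checked directly from the totals in Table 6.13. The Python sketch below also computes a rough Gaussian estimate of the size of each excess, (data − background) / sqrt(background + σ²), where σ is the quoted post-fit uncertainty. This estimate is an assumption made here for illustration only; it is not the statistical treatment used in the analysis, which relies on the full likelihood fit.

```python
import math

# Post-fit totals and observed data yields taken from Table 6.13.
regions = {
    "≥4b, 0H, MMVA": {"data": 57, "bkg": 44.34, "bkg_unc": 5.15},
    "≥4b, 0H, LMVA": {"data": 139, "bkg": 118.66, "bkg_unc": 8.14},
}


def relative_excess(data, bkg):
    """Fractional excess of data over the background prediction."""
    return data / bkg - 1.0


def naive_z(data, bkg, bkg_unc):
    """Rough Gaussian size of the excess (illustrative approximation only)."""
    return (data - bkg) / math.sqrt(bkg + bkg_unc**2)


for name, r in regions.items():
    excess = 100.0 * relative_excess(r["data"], r["bkg"])
    z = naive_z(r["data"], r["bkg"], r["bkg_unc"])
    print(f"{name}: excess = {excess:.0f}%, naive z = {z:.1f}")
```

Under this rough estimate both excesses come out at about 1.5 standard deviations.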

Process           3b, 0H, HMVA      ≥4b, 0H, HMVA     3b, ≥1H, HMVA     ≥4b, ≥1H, HMVA

t¯t+light-jets    2.33 ± 0.46       0.54 ± 0.38       3.25 ± 0.57       0.3 ± 0.21
t¯t+≥1c           2.62 ± 0.89       1.34 ± 0.48       3.54 ± 0.92       1.05 ± 0.52
t¯t+≥1b           7.47 ± 1.17       11.49 ± 1.6       8.22 ± 1.25       7.31 ± 1.019
Single-top        1.1 ± 1.22        0.28 ± 0.43       2.93 ± 1.06       0.59 ± 0.7
W+jets            0.96 ± 0.37       0.4 ± 0.15        1.09 ± 0.38       0.31 ± 0.14
Z+jets            0.11 ± 0.051      0.042 ± 0.016     0.11 ± 0.045      0.035 ± 0.0188
t¯tV              1.26 ± 0.35       0.57 ± 0.33       1.04 ± 0.33       0.63 ± 0.41
t¯tH              0.46 ± 0.092      0.84 ± 0.18       0.71 ± 0.1        0.91 ± 0.14
t¯tt¯t            0.75 ± 0.22       1.53 ± 0.47       0.51 ± 0.16       0.62 ± 0.19
Dibosons          0.16 ± 0.22       0.083 ± 0.055     0.21 ± 0.13       0.1 ± 0.07
QCD               0.15 ± 0.073      0.37 ± 0.46       0.19 ± 0.096      0.16 ± 0.11

Total             17.36 ± 3.4       17.48 ± 2.04      21.8 ± 4.25       12 ± 1.21

Data              16                15                21                15

Table 6.14: Predicted and observed yields in four of the most sensitive search regions
considered after performing the background-only ﬁt. The individual systematic uncertainties
for the diﬀerent background processes can be correlated, and do not necessarily add in
quadrature to equal the systematic uncertainty in the total background yield. The quoted
uncertainties are computed after taking into account correlations among nuisance parameters
and among processes. The statistical uncertainty is added in quadrature to the systematic
uncertainties.


Figure 6.36: Comparison between the data and prediction for the meﬀ distribution under
the background-only hypothesis, in the (≥5j, ≥4b, ≥2M, ≥3J, 0H, LMVA) region (a) pre-ﬁt
and (b) post-ﬁt, and the (≥5j, ≥4b, ≥2M, ≥3J, 0H, MMVA) region (c) pre-ﬁt and (d) post-
ﬁt. The “Others” background includes the t¯tV /H, t¯tt¯t, diboson, and multijet backgrounds.
The expected T T → HtHt signal (solid red) for mT = 1.4 TeV is included in the pre-ﬁt
ﬁgures. The bottom panels display the ratios of data to the total background predictions.
The hashed area represents the total uncertainty on the background.


Figure 6.37: Comparison between the data and prediction for the meﬀ distribution under
the background-only hypothesis, in the (≥5j, 3b, ≥2M, ≥3J, 0H, HMVA) region (a) pre-ﬁt
and (b) post-ﬁt, and the (≥5j, 3b, ≥2M, ≥3J, ≥1H, HMVA) region (c) pre-ﬁt and (d) post-
ﬁt. The “Others” background includes the t¯tV /H, t¯tt¯t, diboson, and multijet backgrounds.
The expected T T → HtHt signal (solid red) for mT = 1.4 TeV is included in the pre-ﬁt
ﬁgures. The bottom panels display the ratios of data to the total background predictions.
The hashed area represents the total uncertainty on the background.


Figure 6.38: Comparison between the data and prediction for the meﬀ distribution under the
background-only hypothesis, in the (≥5j, ≥4b, ≥2M, ≥3J, 0H, HMVA) region (a) pre-ﬁt and
(b) post-ﬁt, and the (≥5j, ≥4b, ≥2M, ≥3J, ≥1H, HMVA) region (c) pre-ﬁt and (d) post-
ﬁt. The “Others” background includes the t¯tV /H, t¯tt¯t, diboson, and multijet backgrounds.
The expected T T → HtHt signal (solid red) for mT = 1.4 TeV is included in the pre-ﬁt
ﬁgures. The bottom panels display the ratios of data to the total background predictions.
The hashed area represents the total uncertainty on the background.


To further investigate the robustness of the ﬁt model, a likelihood ﬁt under the signal-plus-

background hypothesis was performed assuming the diﬀerent signal benchmark scenarios

and mass points that are considered in this analysis. The observed data is ﬁtted with the

signal-strength parameter µ treated as a ﬂoating parameter of the ﬁt. In all scenarios, the

post-ﬁt signal strength is negative, which indicates that the ﬁt model disfavors the signal-

plus-background hypothesis. Furthermore, the robustness of the uncertainty model was

assessed by performing the signal-plus-background ﬁt four times for each individual nuisance

parameter, with the value of the nuisance parameter θk being ﬁxed to one of the following

values per fit: $\theta_k^{\text{pre-fit}} \pm \Delta\theta_k^{\text{pre-fit}}$ and $\theta_k^{\text{post-fit}} \pm \Delta\theta_k^{\text{post-fit}}$. Here $\Delta\theta_k$ denotes the uncertainty

of the nuisance parameter θk. These ﬁts allow us to determine the impact that each nuisance

parameter has on the signal-strength parameter by calculating the diﬀerence between the

µ obtained in these ﬁts and the one obtained from the nominal signal-plus-background ﬁt.

This information is summarized in Figures 6.39 and 6.40, which show the 20 leading nuisance

parameters ranked based on their post-ﬁt impact on µ assuming the two signal benchmarks

that the 1-lepton channel is most sensitive to, which are the HtHt and doublet signals, with

mT = 1.6 TeV. In addition to the nuisance parameter ranking, these plots also show the

deviation, or pull, of each nuisance parameter post-ﬁt value from its nominal value, as well

as the constraint on the nuisance parameter uncertainty that results from the ﬁt.
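
The impact calculation can be illustrated with a toy model that has a closed-form solution. The Python sketch below assumes a single-bin counting experiment with expected yield µ·s + b·(1 + σ·θ), where one nuisance parameter θ scales the background by a relative uncertainty σ. For a fixed θ, the best-fit µ is the value that makes the expectation equal to the observed count. This is a deliberately simplified stand-in: the analysis refits the binned meff distributions in all fit regions with all other nuisance parameters floating, and every name and number here is hypothetical.

```python
def mu_hat(n_obs, s, b, sigma, theta):
    """Best-fit signal strength of the single-bin toy model with theta held fixed."""
    return (n_obs - b * (1.0 + sigma * theta)) / s


def impacts(n_obs, s, b, sigma, theta_hat=0.0, delta_theta=1.0):
    """Nominal mu and the shifts in mu when theta is fixed to theta_hat +/- delta_theta."""
    mu_nominal = mu_hat(n_obs, s, b, sigma, theta_hat)
    up = mu_hat(n_obs, s, b, sigma, theta_hat + delta_theta) - mu_nominal
    down = mu_hat(n_obs, s, b, sigma, theta_hat - delta_theta) - mu_nominal
    return mu_nominal, up, down


# Toy numbers: 15 observed events, 3 expected signal, 12 expected background, 10% uncertainty.
mu_nominal, delta_mu_up, delta_mu_down = impacts(n_obs=15, s=3.0, b=12.0, sigma=0.1)
```

In this toy model the impact is ∓b·σ·∆θ/s, so a nuisance parameter matters more when it acts on a large background relative to the signal, which is consistent with the t¯t+ ≥1b uncertainties ranking highly in regions dominated by that background.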

As can be observed from these plots, a large fraction of the top-ranked nuisance parame-

ters are associated with the modeling uncertainties of the t¯t background processes and the jet

experimental uncertainties. Overall, the top-ranked nuisance parameters are well-behaved,

with only a few exhibiting mild pulls and constraints, the most noticeable of which comes

from the uncertainty associated with the normalization of the t¯t+ ≥1b background. The

pull and constraint of this uncertainty can be ascribed to the high-statistics t¯t control regions

with 3b and ≥4b, which the fit uses to correct the normalization of this background

and consequently reduce its uncertainty. The nuisance parameters that exhibit the strongest

post-fit impact on µ are associated with the tt̄ + Wt background reweighting uncertainty in

Njets ≥ 8, the modeling and normalization uncertainties of the tt̄+≥1b background, the

uncertainty on the jet mass resolution (JMR), and the uncertainty on the extrapolation of

the b-jet tagging scale factors to jets with a pT above the range covered by the

data sample used for the calibration of the tagger. The impacts from the b-jet tagging

extrapolation and the modeling and normalization of the tt̄+≥1b background are expected since

the HtHt and doublet signals mostly populate the 3b and ≥4b search regions, which by

construction contain a large number of b-tagged jets and are dominated by the

tt̄+≥1b background. The impact from the JMR uncertainty can be ascribed to the effect

that the jet mass smearing associated with this uncertainty has when it is propagated to

the RC jets. This can potentially cause event migration between search regions, which can

happen as the JMR-varied RC jet is tagged to a diﬀerent particle compared to the nominal

RC jet or as the MVA input variables that depend on RC jets change signiﬁcantly, thereby

resulting in a diﬀerent MVA score than the nominal event. Finally, the impact from the

uncertainty on the tt̄ + Wt background reweighting in Njets ≥ 8 is also expected, as a large

number of jets drives meff towards higher values, where the signal is expected to reside.

Figure 6.39: The pre-fit and post-fit impacts of the 20 leading nuisance parameters on the
signal strength parameter µ under the signal-plus-background hypothesis, assuming a 1.6
TeV HtHt signal. The nuisance parameters are ranked by their post-fit impact on
µ, which is indicated by the filled colored rectangles. The vertical axis lists the top-ranked
nuisance parameters in descending order. The pre-fit impact on µ is indicated by the unfilled
rectangles. The impact on µ, denoted by ∆µ, is read from the top horizontal axis. The black
markers represent the deviation, or pull, of each post-fit nuisance parameter
from its nominal value, measured in units of the pre-fit standard deviation ∆θ. The black
error bars represent the post-fit uncertainty of the corresponding nuisance parameter. This
information is read from the bottom horizontal axis.

Figure 6.40: The pre-fit and post-fit impacts of the 20 leading nuisance parameters on the
signal strength parameter µ under the signal-plus-background hypothesis, assuming a 1.6
TeV doublet signal. The nuisance parameters are ranked by their post-fit impact on
µ, which is indicated by the filled colored rectangles. The vertical axis lists the top-ranked
nuisance parameters in descending order. The pre-fit impact on µ is indicated by the unfilled
rectangles. The impact on µ, denoted by ∆µ, is read from the top horizontal axis. The black
markers represent the deviation, or pull, of each post-fit nuisance parameter
from its nominal value, measured in units of the pre-fit standard deviation ∆θ. The black
error bars represent the post-fit uncertainty of the corresponding nuisance parameter. This
information is read from the bottom horizontal axis.

6.2.8.2 Limits on Pair Vector-Like Quark Production

As argued in the preceding section, no signiﬁcant excess above the SM prediction is found

in the 1-lepton channel regions. Furthermore, the ﬁts performed under the signal-plus-

background hypothesis with the signal-strength parameter µ free-ﬂoating were consistent

with the background-only hypothesis. Upper limits at the 95% CL on the T ¯T produc-

tion cross section are derived in the four signal benchmarks considered in this analysis and

compared to the leading-order theory prediction. The obtained limits are shown in

Figure 6.41. For mT < 1 TeV, the observed limit lies above the expected limit by 1-2σ

for the doublet and HtHt signal benchmarks. For mT > 1.2 TeV, the observed limit lies

below the expected limit, by close to 1σ for the singlet and ZtZt signal benchmarks. This is

expected as the 1-lepton channel has little sensitivity to the singlet and ZtZt signals. Since

no significant excess is observed in any of the signal benchmarks, evidence for TT̄ production

cannot be claimed. TT̄ production is therefore excluded at the 95% CL for all T masses at

which the observed limit is below the theory prediction. The lower limits set on the T mass

for each signal benchmark considered are

shown in Table 6.15. The limits set by the previous iteration of this analysis in the 1-lepton

channel [101], which was based on a dataset of recorded collisions in 2015-2016 corresponding

to an integrated luminosity of 36 fb⁻¹, are also listed for comparison. The observed lower

limits on the T mass improve by 0.17-0.25 TeV, depending on the benchmark, with respect

to those obtained in the 1-lepton channel of the previous iteration of the analysis.
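The step from a cross-section limit curve to a lower mass limit can be sketched as follows. This is a minimal illustration, assuming the mass limit is taken as the point where the observed limit curve crosses the theory prediction, with a log-linear interpolation between simulated mass points. The function name and all cross-section values are hypothetical and do not correspond to the curves in Figure 6.41.

```python
import math


def mass_limit(masses, observed_xsec, theory_xsec):
    """Return the mass at which the observed limit first rises above theory.

    Masses below the returned value are excluded. The crossing is found by
    interpolating log(observed / theory) linearly between adjacent mass points.
    Returns None if no crossing is found in the scanned range.
    """
    for i in range(len(masses) - 1):
        r0 = math.log(observed_xsec[i] / theory_xsec[i])
        r1 = math.log(observed_xsec[i + 1] / theory_xsec[i + 1])
        if r0 < 0.0 <= r1:  # excluded at point i, not excluded at point i + 1
            frac = -r0 / (r1 - r0)
            return masses[i] + frac * (masses[i + 1] - masses[i])
    return None


# Hypothetical inputs: masses in TeV, cross sections in pb.
toy_masses = [1.0, 1.2, 1.4, 1.6, 1.8]
toy_theory = [4e-2, 1e-2, 3e-3, 1e-3, 3e-4]
toy_observed = [5e-3, 3e-3, 2e-3, 1.5e-3, 1.2e-3]
print(mass_limit(toy_masses, toy_observed, toy_theory))
```

Interpolating in the logarithm of the ratio is a design choice suited to steeply falling cross sections; other interpolation schemes would shift the crossing slightly.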

Figure 6.41: Observed (solid line) and expected (dashed line) 95% CL upper limits on the
TT̄ production cross section as a function of the T quark mass in the doublet (a), singlet
(b), HtHt (c), and ZtZt (d) signal scenarios. The surrounding shaded bands correspond
to ±1 and ±2 standard deviations around the expected limit. The red line shows the LO
theoretical cross section prediction.

1-lepton channel 95% CL lower limits on T quark mass [TeV]

Analysis iteration   BR(T → Ht) = 1   BR(T → Zt) = 1   Doublet       Singlet
Current              1.64 (1.62)      1.37 (1.31)      1.54 (1.50)   1.40 (1.35)
Previous             1.47 (1.30)      1.12 (0.91)      1.36 (1.16)   1.23 (1.02)

Table 6.15: Summary of the observed (expected) 95% CL lower limits on the T quark mass
for the diﬀerent signal benchmarks considered that were obtained in the 1-lepton channel of
the current and previous iteration of the analysis.

Chapter 7

Conclusion

This dissertation describes two research topics. The first topic concerns the tagging of

jets, the collimated sprays of particles produced by the process of hadronization,

to the particle that initiated them. This topic covers two jet tagging studies.

The ﬁrst one consists of the development and calibration of three jet substructure-based

taggers to tag jets to top quarks and W bosons. The second study consists of performing

a topological data analysis (TDA) of jets, which has not been used in the context of jet

tagging previously, and applying the information obtained in the design of two top tagging

algorithms. The second topic describes two search analyses for a hypothetical vector-like top

quark (T ) that decays to a Higgs boson and a top quark (Ht), or a Z boson and a top quark

(Zt), associated with the presence of a single electron or muon. The ﬁrst analysis focuses on

the single production of a T, which is mediated by the electroweak force. This analysis

makes it possible to probe the universal coupling strength κ, which controls the coupling of the T to

the W, Z and Higgs bosons and the production cross section. The results of this analysis

are interpreted in the SU(2) singlet (T^{2/3}) and doublet (T^{2/3} B^{-1/3}) signal scenarios. The

second analysis focuses on TT̄ pair production, which is mediated by the strong force.

The results of this analysis are interpreted in the SU(2) singlet and doublet scenarios, as

well as assuming the branching ratios BR(T → Ht) = 1 and BR(T → Zt) = 1. A summary

of both topics, as well as potential outlooks, is given in this chapter.

7.1 Tagging Top Quark Studies

For the ﬁrst jet tagging study, a three-variable W tagger and two deep neural network

(DNN) top taggers, one designed to tag jets that contain the full decay products of the top

quark and the other designed to tag jets regardless of the full containment, were optimized

to perform their corresponding tagging tasks. The performance of the taggers in Monte

Carlo (MC) simulation was calibrated to their performance in data. The MC modeling of

the data in the input variables of the taggers was assessed, including the eﬀects of various

sources of systematic uncertainty that are associated with the modeling of physics processes

and the reconstruction and calibration of relevant physics objects. The overall agreement

between MC and the data is good in the input variables. Some moderate MC modeling

discrepancies are observed in the input variables of the W tagger in a region close to the W -

tagged region; however, these diﬀerences are within the total uncertainty considered. Data

to MC scale factors were derived for the signal jet tagging eﬃciency measurement. The

scale factors were found to range between 0.8 and 1, with the lowest values being attained

by the W tagger. The MC overestimate of the W tagging signal eﬃciency is attributed to

the moderate discrepancies observed in the modeling of the tagger input variables. Finally,

the eﬀects of the systematic uncertainties were propagated to the scale factor calculation in

order to assign an uncertainty to these measurements. The uncertainties associated with the

modeling of the tt̄ production process are observed to be the primary source of uncertainty.

This is expected as these uncertainties can significantly vary the hadronization of the signal

tt̄ process, which in turn varies the simulated detector response and the jet reconstruction

process, thereby affecting the signal efficiency measurement.
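The data-to-MC scale factor described above can be sketched as a ratio of tagging efficiencies. This is a simplified illustration only: it assumes the efficiency is a plain ratio of tagged to total signal jets with a binomial statistical uncertainty, and it ignores the background subtraction and systematic variations that the actual measurement involves. The function names and event counts are hypothetical.

```python
import math


def efficiency(n_tagged, n_total):
    """Tagging efficiency with its binomial statistical uncertainty."""
    eff = n_tagged / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err


def scale_factor(eff_data, err_data, eff_mc, err_mc):
    """Data-to-MC scale factor with uncorrelated error propagation."""
    sf = eff_data / eff_mc
    rel_err = math.hypot(err_data / eff_data, err_mc / eff_mc)
    return sf, sf * rel_err


# Hypothetical counts: the MC overestimates the efficiency, giving SF < 1.
data_eff, data_err = efficiency(400, 1000)
mc_eff, mc_err = efficiency(5000, 10000)
sf, sf_err = scale_factor(data_eff, data_err, mc_eff, mc_err)
print(f"SF = {sf:.3f} +/- {sf_err:.3f}")
```

A scale factor below 1, as in this toy example, corresponds to an MC overestimate of the tagging signal efficiency.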

For the second jet tagging study, two TDA techniques were applied to analyze the ho-

mology of top jets and QCD jets using their associated topoclusters to build simplicial

complexes. The ﬁrst technique is a Persistent Homology (PH) analysis, which was used to

study how the homology of jets varied as a function of a distance scale parameter. A hy-

pothesis was formulated that the number of connected components that are formed by the

topoclusters corresponds to the decay topology of a signal top jet. A distance scale param-

eter of ∆R = 1.2 was determined from the average distance scale at which signal top jets

have two connected components, which corresponds to a top quark decay topology where

the decays of the W boson are collimated under the assumed hypothesis. The homology

of both signal top and background QCD jets at this distance scale was characterized with

the presence of a circular void that disappeared after the topoclusters formed a single con-

nected component. This circular void was found to persist longer in signal jets. A kinematic

description of the connected components was achieved by adding the four-momenta of the

topoclusters associated with a connected component and interpreting it as a subjet. The

mass distributions of the connected components in signal top jets indicate that these objects

are reconstructing relevant substructures of top jets, as evidenced by the mass bumps near

the W and top mass. The corresponding mass distributions from background QCD jets are

indicative of reconstructing inconsistent substructures from random patterns of topoclusters.

These observations give conﬁdence in the hypothesis formulated for this study.
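The counting of connected components as a function of the distance scale, and the interpretation of each component as a subjet, can be sketched as below. This is a minimal illustration under stated assumptions: two topoclusters are linked when their pairwise ∆R is at or below the scale (a simple linking rule that may differ from the convention used in the study), topoclusters are treated as massless, and the toy jet is entirely hypothetical.

```python
import math


def delta_r(a, b):
    """Angular distance between two topoclusters given as (pt, eta, phi)."""
    d_eta = a[1] - b[1]
    d_phi = math.remainder(a[2] - b[2], 2.0 * math.pi)  # wrap into [-pi, pi]
    return math.hypot(d_eta, d_phi)


def connected_components(clusters, scale):
    """Group topocluster indices into components linked at the given scale."""
    parent = list(range(len(clusters)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            if delta_r(clusters[i], clusters[j]) <= scale:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(clusters)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def component_mass(clusters, indices):
    """Invariant mass of the summed (massless) topocluster four-momenta."""
    e = px = py = pz = 0.0
    for i in indices:
        pt, eta, phi = clusters[i]
        px += pt * math.cos(phi)
        py += pt * math.sin(phi)
        pz += pt * math.sinh(eta)
        e += pt * math.cosh(eta)
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))


# Hypothetical jet with four topoclusters (pt in GeV, eta, phi).
toy_jet = [(100.0, 0.0, 0.0), (80.0, 0.1, 0.1), (60.0, 0.8, 0.9), (40.0, 0.9, 0.8)]
for scale in (0.1, 0.2, 1.2):
    comps = connected_components(toy_jet, scale)
    print(scale, len(comps), [round(component_mass(toy_jet, c), 1) for c in comps])
```

Scanning the scale from small to large values shows the components merging, which is the information a persistent homology analysis tracks for the zeroth homology group.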

The second technique used in the TDA of jets is the Mapper algorithm, which analyzes the

homology of jets at a ﬁxed distance scale. The distance scale used in the Mapper algorithm

studies, which governs the formation of vertices in the Čech (Č) simplicial complex of jets, was

set to ∆R = 1.2, motivated by the results obtained from the PH analysis. A dedicated study

was performed to optimally select the other parameters needed to use the Mapper algorithm.

The homology of jets obtained by the Mapper algorithm is characterized by the presence of

multiple connected components and a lack of circular voids, with no signiﬁcant diﬀerences

observed between signal and background jets. The absence of voids in the Č complex of

jets is attributed to the combined use of a granular covering set with the distance scale

parameter ∆R = 1.2, which hinders the ability of the algorithm to resolve circular features

in jets. Similar to the PH studies, a kinematic description of the vertices and connected

components in the Č complex of jets was given. Additionally, jet substructure-inspired

observables were deﬁned in order to quantify how the energy of the jet is distributed across

its connected components. The connected components were observed to achieve a similar

degree of jet substructure reconstruction as the one achieved in the PH study. Furthermore,

the jet substructure-inspired observables showed diﬀerences in how the energy of the jet is

distributed in vertices and connected components between signal and background jets.
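The general Mapper recipe referred to above can be sketched as follows. This is the generic textbook form of the algorithm and not the implementation used in this work: it assumes a φ-projection filter, a cover of four overlapping intervals, and single-linkage clustering at a fixed ∆R scale to form the vertices. The overlap fraction, function names, and toy points are hypothetical.

```python
import math
from itertools import combinations


def delta_r(a, b):
    """Angular distance between two points given as (eta, phi)."""
    return math.hypot(a[0] - b[0], math.remainder(a[1] - b[1], 2.0 * math.pi))


def single_linkage(points, ids, scale):
    """Cluster the given point ids, linking pairs with delta_r <= scale."""
    clusters, unvisited = [], set(ids)
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unvisited if delta_r(points[i], points[j]) <= scale}
            unvisited -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper(points, scale, n_intervals=4, overlap=0.25):
    """Return (vertices, edges) of a Mapper graph using a phi-projection filter.

    Each vertex is a set of point ids clustered inside one cover interval;
    two vertices are joined by an edge when they share at least one point.
    """
    phis = [p[1] for p in points]
    lo, hi = min(phis), max(phis)
    width = (hi - lo) / n_intervals or 1.0  # guard against a zero-width range
    vertices = []
    for k in range(n_intervals):
        a = lo + k * width - overlap * width
        b = lo + (k + 1) * width + overlap * width
        ids = [i for i, phi in enumerate(phis) if a <= phi <= b]
        vertices.extend(single_linkage(points, ids, scale))
    edges = [(u, v) for u, v in combinations(range(len(vertices)), 2)
             if vertices[u] & vertices[v]]
    return vertices, edges


# Hypothetical topocluster positions (eta, phi).
toy_points = [(0.0, -0.4), (0.05, -0.3), (0.0, 0.0), (0.0, 0.3), (0.05, 0.4)]
toy_vertices, toy_edges = mapper(toy_points, scale=0.2)
print(len(toy_vertices), toy_edges)
```

With a coarse cover and a large distance scale, every point inside a cover element tends to fall into the same vertex, which is the loss of resolution discussed in the text.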

Two tagging algorithms were designed to use the information obtained from the TDA

of jets to classify jets as either signal top jets or background QCD jets. The ﬁrst algorithm

consists of a DNN tagger that uses the kinematic and substructure information of the vertices

and connected components obtained from the Mapper algorithm in order to classify jets.

The second algorithm consists of a convolutional graph neural network (GNN) that uses a

graph representation of jets that is built from the connected components of the Č complex

of a jet. Both taggers achieved a good separation power between signal and background

jets. However, the GNN tagger presented signs of undertraining, evidenced by its moderate

ability to conﬁdently tag signal jets when compared to the DNN tagger. The undertraining

is attributed to the limited computational memory resources that were available when this

tagger was trained. The GNN training required the graphs of all jets in the

training dataset to be readily available in memory, which exceeded the memory resources when a large

number of jets was included.

The performance of the DNN and GNN taggers was compared to the contained top

tagger from the ﬁrst jet tagging study. Both the DNN and GNN taggers were found to be

slightly outperformed by the contained top tagger. The variables obtained from the TDA

of jets were compared between signal and background jets in tagging selection regions that

corresponded to ambiguous classiﬁcations between the DNN or GNN and the contained top

tagger. This was done in order to determine if there was any residual information from the

TDA of jets that the taggers were not using to their full extent and could further improve

the separation between signal and background jets. The connected components from signal

top jets in these tagging selection regions were observed to partially retain their ability to

reconstruct relevant substructures of top jets. Furthermore, the jet substructure-inspired

observables of connected components showed diﬀerences in how the energy is distributed in

these structures between signal and background jets. These observations indicate that there

is residual information from the TDA of jets that the taggers are not fully utilizing.

The TDA of jets has untapped potential that can be harnessed in future endeavors. First,

the hypothesis assumed in the PH analysis, that the number of connected

components corresponds to the decay topology of a top quark, may not be optimal.

Instead, a more descriptive distance scale of the homology of jets could be obtained from

the merging of two connected components that results in a mass close to the W boson mass,

which could happen well before the jet has two or three connected components. This opens

the possibility of analyzing jets with the Mapper algorithm using a distance scale parameter

that varies on a jet-by-jet basis instead of a ﬁxed-value distance parameter that may not

properly characterize the homology of all jets. Another aspect that can be improved in the

TDA methodology is to apply the Mapper algorithm with the use of a covering set that is

composed of ﬁner elements. For the studies presented in this thesis, a set with four granular

elements was used to cover the topocluster φ-projection image space. When combined with

the distance scale parameter ∆R = 1.2, the vertices that are formed in each cover element

will tend to have large fractions of topoclusters, which trivializes the Č complex of the jet.

Thus, a covering set with ﬁner elements can improve the resolution of the Mapper algorithm

by increasing the number of vertices that better capture the small-scale structure of jets.

This can be further improved by combining the use of the φ-projection filter function with an

η-projection ﬁlter function, which increases the spatial resolution of the Mapper algorithm.

Finally, the TDA of jets described in this thesis was limited to a geometric point of view. A

prospect of this analysis is to study how the homology of jets is aﬀected with the use of a

distance metric that takes into account the energy of the topoclusters.

7.2 Searches for Vector-Like Quarks

The search analyses for a vector-like T quark presented in this dissertation covered the single

production mechanism, which is mediated by the electroweak force, and the pair production

mechanism, which is mediated by the strong force. Both analyses target the decay topology

T → Ht in ﬁnal states that include the presence of a single electron or muon, referred to as the

1-lepton channel. The pair production analysis will also cover the 0-lepton channel; however, the

0-lepton channel analysis is at the stage of ﬁnalizing validation studies that are needed prior

to performing the statistical analysis. Thus, the 0-lepton channel results are not covered in

this dissertation. Both single and pair production analyses were performed using 139 fb⁻¹

of data and shared the same background and systematic uncertainty models. The main

irreducible background in these searches is tt̄ production in association with additional jets.

Subdominant background contributions come from the single-top and W/Z+jets production

processes.

The design of the analysis strategy for the single production search took advantage of the

simultaneous presence of several unique objects in signal processes, such as forward jets, an

associated top or bottom quark with the T production, and a hadronically decaying boosted

Higgs boson produced from the decay of the T . The presence of these objects allowed the

deﬁnition of search regions that were relatively pure in the diﬀerent T decay topologies and

associated production modes considered. The design of the pair production analysis strat-

egy took advantage of the interesting decay topology combinatorics that became available

with the production of an additional T . This allowed the deﬁnition of many discriminating

variables between signal and background processes. An example of one of these variables is

the invariant mass of reconstructed candidate T s. The distribution of this variable in signal

processes peaked sharply at the mass of the T , while for background processes it peaked at

lower values and exhibited a long tail, which is characteristic of reconstructing a candidate T

from inconsistent kinematics. A multivariate analysis was performed using all the discrim-

inating variables, which resulted in the deﬁnition of a DNN that classiﬁed events as either

signal T ¯T production events or SM background events. The DNN allowed for the deﬁnition

of simpler search regions that were agnostic to the decay topologies of signal processes, which

contrasts with the single production analysis search regions that are tailored to the diﬀerent

T decay topologies and associated production modes.

Both analyses use the effective mass (meff) variable, which is defined as the scalar sum

of the pT of the final state jets, leptons, and E_T^miss in an event, as the final discriminant

between signal and background processes. The deﬁnition of this variable is motivated by the

presence of a large number of energetic ﬁnal state objects in signal processes that arise from

the decays of the massive T s. As a result of this, the meﬀ allows for a discrimination between

signal and background processes that is agnostic to the decay topologies and associated

production modes of the Ts. The MC simulations of the tt̄ and W/Z+jets background

processes are known to mismodel the upper tail of the jet pT spectrum and the distribution

of the number of jets at high multiplicities. This enters as a source of mismodeling in meﬀ

in the region where the signal is expected to reside due to how it is deﬁned. To address

this issue, data-driven correction factors were derived to improve the MC modeling of these

backgrounds in this kinematic regime. The correction factors were derived in regions that are

enriched in the background to be reweighted and signal-depleted in order to ensure that the

presence of potential signal events is not removed by the correction factors. The modeling

of the background MC to the data in the regions used to derive these correction factors, as

well as orthogonal validation regions that are background-enriched and signal-depleted, was

compared before and after applying the correction factors. A signiﬁcant improvement in the

modeling of meﬀ, as well as other variables that are not related to meﬀ but showed signs of

being mismodeled, was observed after applying the correction factors.
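The definition of the effective mass given above translates directly into a scalar sum. The sketch below is only an illustration of that definition; the function name and the event kinematics are hypothetical.

```python
def effective_mass(jet_pts, lepton_pts, met):
    """Scalar sum of the jet pT, lepton pT, and missing transverse energy.

    All inputs are expected in the same units (e.g. GeV).
    """
    return sum(jet_pts) + sum(lepton_pts) + met


# Hypothetical 1-lepton event (values in GeV).
toy_jets = [450.0, 300.0, 120.0, 80.0]
toy_leptons = [60.0]
toy_met = 90.0
print(effective_mass(toy_jets, toy_leptons, toy_met))
```

Because every additional jet adds its pT to the sum, events with a high jet multiplicity are pushed towards large meff, which is why the modeling of the jet multiplicity and of the jet pT tail matters for this discriminant.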

In both search analyses, a statistical analysis in the form of a maximum likelihood ﬁt

was performed, where the meﬀ distributions in all search regions of a given analysis were

jointly analyzed to test for the presence of potential signal T production events in the data.

In the single production analysis, no significant excess above the SM prediction was found

in any of the search regions considered. Upper limits at the 95% CL on the cross section of the

single production of a T were derived in both the singlet and doublet signal scenarios. The

limits are interpreted as exclusion limits on the T mass and the universal coupling strength

κ. For the singlet scenario, masses below 2.1 TeV are excluded for κ ≥ 0.6, while values of

κ ≥ 0.3 are excluded for a T mass of 1.6 TeV. For the doublet scenario, values of κ ≥ 0.55

are excluded for a T mass of 1 TeV.
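The idea of fitting a signal strength µ to binned meff distributions can be illustrated with a heavily simplified sketch. It assumes a single search region, a Poisson likelihood per bin with expectation µ·s + b, and no nuisance parameters, none of which reflects the full fit model of the analyses. The function names and bin contents are hypothetical.

```python
import math


def nll(mu, data, signal, background):
    """Negative log-likelihood (up to a constant) of a binned Poisson model."""
    total = 0.0
    for n, s, b in zip(data, signal, background):
        expected = mu * s + b
        total += expected - n * math.log(expected)
    return total


def fit_mu(data, signal, background, lo=0.0, hi=5.0, tol=1e-8):
    """Minimize the NLL in mu by ternary search (the NLL is convex in mu)."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if nll(m1, data, signal, background) < nll(m2, data, signal, background):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Hypothetical meff bins, from low to high meff.
toy_signal = [0.5, 2.0, 6.0]
toy_background = [120.0, 40.0, 8.0]
toy_data_bkg_only = [120.0, 40.0, 8.0]   # data equal to the background
toy_data_with_sig = [120.5, 42.0, 14.0]  # data equal to signal + background
print(fit_mu(toy_data_bkg_only, toy_signal, toy_background))
print(fit_mu(toy_data_with_sig, toy_signal, toy_background))
```

A fitted µ compatible with zero corresponds to data that are consistent with the background-only hypothesis, in which case upper limits on the signal cross section are set.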

Finally, for the pair production analysis, an excess of data that is not covered by the post-

ﬁt uncertainty was observed in two search regions that are not signal-enriched. The post-ﬁt

agreement in the remaining search regions, which includes the signal-enriched search regions,

is overall good. The agreement between the data and the post-ﬁt MC background on the meﬀ

distribution in these two search regions is sensible and within the post-ﬁt uncertainty in the

majority of the meff bins. Furthermore, the likelihood fits performed under the signal-plus-

background hypothesis consistently favored the background-only hypothesis over the

signal-plus-background hypothesis. These observations indicate that the fit model is

missing degrees of freedom that are required to improve the correction of the background MC

prediction, which will need to be further investigated. However, based on the observations

made, the observed data excesses can be deemed non-significant. Upper limits at the

95% CL on the cross section of TT̄ pair production were derived for the BR(T → Ht) = 1,

BR(T → Zt) = 1, doublet, and singlet signal scenarios in the 1-lepton channel. These limits

were interpreted as exclusion lower limits of the T mass for each signal scenario. The 1-

lepton channel search excludes T masses below 1.64 TeV, 1.37 TeV, 1.54 TeV and 1.40 TeV

for the BR(T → Ht) = 1, BR(T → Zt) = 1, doublet, and singlet scenarios, respectively.

These limits show a signiﬁcant improvement from the ones that were obtained in the previous

iteration of this analysis in the 1-lepton channel. The interpretations of the results obtained

may change once the results of the 0-lepton channel become available and are combined with

the 1-lepton channel results.

BIBLIOGRAPHY

[1] G. Kane. Modern Elementary Particle Physics: Explaining and Extending the Standard

Model. Cambridge University Press, 2017.

[2] A. Zee. Group Theory in a Nutshell for Physicists. In a Nutshell. Princeton University

Press, 2016.

[3] Andrea Wulzer. Behind the Standard Model. In 2015 European School of High-Energy

Physics, 1 2019.

[4] Standard Model of Elementary Particles.
https://upload.wikimedia.org/wikipedia/commons/0/00/Standard_Model_of_Elementary_Particles.svg, 2019.

[5] Michele Maggiore. A Modern Introduction to Quantum Field Theory. Number Vol. 12

in Oxford Master Series in Physics. OUP Oxford, 2005.

[6] Alistair Savage. Introduction to Lie Groups.
https://alistairsavage.ca/mat4144/notes/MAT4144-5158-LieGroups.pdf, 2015.

[7] C. S. Wu, E. Ambler, R. W. Hayward, D. D. Hoppes, and R. P. Hudson. Experimental

test of parity conservation in beta decay. Phys. Rev., 105:1413–1415, Feb 1957.

[8] John Ellis. An illustration of the Higgs potential.
https://cds.cern.ch/record/1638469/plots.

[9] K. G. Begeman, A. H. Broeils, and R. H. Sanders. Extended rotation curves of spiral
galaxies: dark haloes and modiﬁed dynamics. Monthly Notices of the Royal Astronom-
ical Society, 249(3):523–537, 04 1991.

[10] Edvige Corbelli and Paolo Salucci. The extended rotation curve and the dark matter
halo of M33. Monthly Notices of the Royal Astronomical Society, 311(2):441–447, 01
2000.

[11] M. Markevitch, A. H. Gonzalez, D. Clowe, A. Vikhlinin, W. Forman, C. Jones, S. Mur-
ray, and W. Tucker. Direct Constraints on the Dark Matter Self-Interaction Cross
Section from the Merging Galaxy Cluster 1E 0657-56. The Astrophysical Journal,
606(2):819–824, May 2004.

[12] Mathieu Buchkremer, Giacomo Cacciapaglia, Aldo Deandrea, and Luca Panizzi.
Model-independent framework for searches of top partners. Nuclear Physics B,
876:376–417, 2013.

[13] J. A. Aguilar-Saavedra, R. Benbrik, S. Heinemeyer, and M. P´erez-Victoria. Handbook
of vectorlike quarks: Mixing and single production. Physical Review D, 88(9), nov
2013.

[14] The ATLAS Collaboration. Search for production of vector-like quark pairs and of
four top quarks in the lepton-plus-jets final state in pp collisions at √s = 8 TeV with
the ATLAS detector. Journal of High Energy Physics, 2015(8), aug 2015.

[15] Lyndon Evans and Philip Bryant. LHC machine. Journal of Instrumentation,

3(08):S08001–S08001, aug 2008.

[16] The ATLAS Collaboration. The ATLAS experiment at the CERN large hadron col-

lider. Journal of Instrumentation, 3(08):S08003–S08003, aug 2008.

[17] The CMS Collaboration. The CMS experiment at the CERN LHC. Journal of Instru-

mentation, 3(08):S08004–S08004, aug 2008.

[18] The LHCb Collaboration. The LHCb detector at the LHC. Journal of Instrumentation,

3(08):S08005–S08005, aug 2008.

[19] The ALICE Collaboration. The ALICE experiment at the CERN LHC. Journal of

Instrumentation, 3(08):S08002–S08002, aug 2008.

[20] The TOTEM Collaboration. The TOTEM Experiment at the CERN Large Hadron

Collider. Journal of Instrumentation, 3(08):S08007, aug 2008.

[21] The LHCf Collaboration. The LHCf detector at the CERN Large Hadron Collider.

Journal of Instrumentation, 3(08):S08006, aug 2008.

[22] James Pinfold. The MoEDAL experiment at the LHC. EPJ Web Conf., 145:12002,

2017.

[23] AC Team. The four main LHC experiments. http://cds.cern.ch/record/40525, 1999.

[24] AC Team. Diagram of an LHC dipole magnet. Schema d’un aimant dipole du LHC.

https://cds.cern.ch/record/40524, 1999.

[25] Christiane Lefevre. The CERN accelerator complex. Complexe des accelerateurs du

CERN. http://cds.cern.ch/record/1260465, 2008.

[26] The ATLAS Collaboration. Luminosity determination in pp collisions at √s = 7 TeV
using the ATLAS detector at the LHC. The European Physical Journal C, 71(4), apr
2011.

[27] The ATLAS Collaboration. Public ATLAS Luminosity Results for Run-2 of the LHC.
https://twiki.cern.ch/twiki/bin/view/AtlasPublic/LuminosityPublicResultsRun2.

[28] The ATLAS Collaboration. Measurement of the inelastic proton-proton cross section
at √s = 13 TeV with the ATLAS detector at the LHC. Phys. Rev. Lett., 117:182002, Oct
2016.

[29] Joao Pequenao. Computer generated image of the whole ATLAS detector.
https://cds.cern.ch/record/1095924, 2008.

[30] Joao Pequenao and Paul Schaﬀner. How ATLAS detects particles: diagram of particle

paths in the detector. https://cds.cern.ch/record/1505342, 2013.

[31] The ATLAS Collaboration. ATLAS inner detector: Technical Design Report, 1. Tech-

nical design report. ATLAS. CERN, Geneva, 1997.

[32] Joao Pequenao. Computer generated image of the ATLAS inner detector.
https://cds.cern.ch/record/1095926, 2008.

[33] M Capeans, G Darbo, K Einsweiller, M Elsing, T Flick, M Garcia-Sciveres, C Gemme,
H Pernegger, O Rohne, and R Vuillermet. ATLAS Insertable B-Layer Technical Design
Report. Technical report, The ATLAS Collaboration, 2010.

[34] Joao Pequenao. Computer generated image of the ATLAS calorimeter.
https://cds.cern.ch/record/1095927, 2008.

[35] Joao Pequenao. Computer generated image of the ATLAS Muons subsystem.
https://cds.cern.ch/record/1095929, 2008.

[36] The ATLAS Collaboration. ATLAS magnet system: Technical Design Report, 1. Tech-

nical design report. ATLAS. CERN, Geneva, 1997.

[37] Ana Maria Rodriguez Vera and Joao Antunes Pequenao. ATLAS Detector Magnet

System. https://cds.cern.ch/record/2770604, 2021.

[38] The ATLAS Collaboration. Performance of the ATLAS trigger system in 2015. The

European Physical Journal C, 77(5):317, 2017.

[39] T Cornelissen, M Elsing, S Fleischmann, W Liebig, E Moyse, and A Salzburger. Con-
cepts, Design and Implementation of the ATLAS New Tracking (NEWT). Technical
report, CERN, Geneva, 2007.

[40] W Lampl, S Laplace, D Lelas, P Loch, H Ma, S Menke, S Rajagopalan, D Rousseau,
S Snyder, and G Unal. Calorimeter Clustering Algorithms: Description and Perfor-
mance. Technical report, CERN, Geneva, 2008.

[41] The ATLAS Collaboration. Electron and photon performance measurements with the
ATLAS detector using the 2015–2017 LHC proton-proton collision data. Journal of
Instrumentation, 14(12):P12006, dec 2019.

[42] The ATLAS Collaboration. Electron reconstruction and identification in the ATLAS
experiment using the 2015 and 2016 LHC proton–proton collision data at √s =
13 TeV. The European Physical Journal C, 79(8):639, 2019.

[43] The ATLAS Collaboration. Muon reconstruction performance of the ATLAS detector
in proton–proton collision data at √s = 13 TeV. The European Physical Journal C,
76, may 2016.

[44] Yu.L. Dokshitzer, G.D. Leder, S. Moretti, and B.R. Webber. Better jet clustering

algorithms. Journal of High Energy Physics, 1997(08):001, sep 1997.

[45] Stephen D. Ellis and Davison E. Soper. Successive combination jet algorithm for

hadron collisions. Phys. Rev. D, 48:3160–3166, Oct 1993.

[46] Matteo Cacciari, Gavin P. Salam, and Gregory Soyez. The anti-kt jet clustering algo-

rithm. Journal of High Energy Physics, 2008(04):063, apr 2008.

[47] The ATLAS Collaboration. ATLAS b-jet identification performance and efficiency
measurement with tt̄ events in pp collisions at √s = 13 TeV. The European Physical
Journal C, 79(11), nov 2019.

[48] The ATLAS Collaboration. ATLAS ﬂavour-tagging algorithms for the LHC Run 2 pp

collision dataset. 11 2022.

[49] David Krohn, Jesse Thaler, and Lian-Tao Wang. Jets with variable R. Journal of High

Energy Physics, 2009(06):059–059, jun 2009.

[50] The ATLAS Collaboration. Expected performance of missing transverse momentum
reconstruction for the ATLAS detector at √s = 13 TeV. Technical report, CERN,
Geneva, 2015.

[51] The ATLAS Collaboration. Reconstruction, Energy Calibration, and Identiﬁcation of
Hadronically Decaying Tau Leptons in the ATLAS Experiment for Run-2 of the LHC.
Technical report, CERN, Geneva, 2015.

[52] Georges Aad et al. Jet energy scale and resolution measured in proton-proton collisions
at √s = 13 TeV with the ATLAS detector. Eur. Phys. J. C, 81(8):689, 2021.

[53] The ATLAS Collaboration. Jet energy scale measurements and their systematic
uncertainties in proton-proton collisions at √s = 13 TeV with the ATLAS detector. Phys.
Rev. D, 96:072002, Oct 2017.

[54] The ATLAS Collaboration. Jet reconstruction and performance using particle ﬂow
with the ATLAS Detector. The European Physical Journal C, 77(7):466, 2017.

[55] Duccio Pappadopulo, Andrea Thamm, Riccardo Torre, and Andrea Wulzer. Heavy
vector triplets: bridging theory and data. Journal of High Energy Physics, 2014(9),
sep 2014.

[56] The ATLAS Collaboration. Boosted hadronic vector boson and top quark tagging
with ATLAS using Run 2 data. Technical report, CERN, Geneva, 2020. All figures
including auxiliary figures are available at
https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-PHYS-PUB-2020-017.

[57] Georges Aad et al. Identification of boosted, hadronically decaying W bosons and
comparisons with ATLAS data taken at √s = 8 TeV. Eur. Phys. J. C, 76(3):154, 2016.

[58] Jesse Thaler and Ken Van Tilburg. Identifying boosted objects with n-subjettiness.

Journal of High Energy Physics, 2011(3), mar 2011.

[59] Jesse Thaler and Ken Van Tilburg. Maximizing boosted top identiﬁcation by mini-

mizing n-subjettiness. Journal of High Energy Physics, 2012(2), feb 2012.

[60] Andrew J. Larkoski, Duﬀ Neill, and Jesse Thaler. Jet shapes with the broadening axis.

Journal of High Energy Physics, 2014(4), apr 2014.

[61] The CMS Collaboration. Displays of candidate events in the search for new heavy
resonances decaying to dibosons in the all-jets ﬁnal state in the CMS detector, 2022.
CMS Collection.

[62] The ATLAS Collaboration. Measurement of kt splitting scales in W → ℓν events at
√s = 7 TeV with the ATLAS detector. Eur. Phys. J. C, 73:2432, 2013.

[63] Andrew J. Larkoski, Gavin P. Salam, and Jesse Thaler. Energy correlation functions

for jet substructure. Journal of High Energy Physics, 2013(6), jun 2013.

[64] Andrew J. Larkoski, Ian Moult, and Duﬀ Neill. Power counting to better jet observ-

ables. Journal of High Energy Physics, 2014(12), dec 2014.

[65] Jesse Thaler and Lian-Tao Wang. Strategies to identify boosted tops. Journal of High

Energy Physics, 2008(07):092–092, jul 2008.

[66] The ATLAS Collaboration. Performance of top-quark and W-boson tagging with ATLAS
in Run 2 of the LHC. Eur. Phys. J. C, 79:375, 2019.

[67] Matteo Cacciari, Gavin P Salam, and Gregory Soyez. The catchment area of jets.

Journal of High Energy Physics, 2008(04):005–005, apr 2008.

[68] Nina Otter, Mason A Porter, Ulrike Tillmann, Peter Grindrod, and Heather A Har-
rington. A roadmap for the computation of persistent homology. EPJ Data Science,
6(1), aug 2017.

[69] Gurjeet Singh, Facundo Memoli, and Gunnar Carlsson. Topological Methods for the
Analysis of High Dimensional Data Sets and 3D Object Recognition. In M. Botsch,
R. Pajarola, B. Chen, and M. Zwicker, editors, Eurographics Symposium on Point-
Based Graphics. The Eurographics Association, 2007.

[70] Afra Zomorodian. Fast construction of the Vietoris-Rips complex. Computers and

Graphics, 34(3):263–271, 2010.

[71] Fran¸cois Chollet et al. Keras. https://keras.io, 2015.

[72] G¨unter Klambauer, Thomas Unterthiner, Andreas Mayr, and Sepp Hochreiter. Self-
normalizing neural networks. Advances in neural information processing systems, 30,
2017.

[73] Timothy Dozat. Incorporating Nesterov Momentum into Adam. In Proceedings of the

4th International Conference on Learning Representations, pages 1–4, 2016.

[74] Daniele Grattarola and Cesare Alippi. Graph neural networks in tensorﬂow and keras

with spektral, 2020.

[75] Thomas N. Kipf and Max Welling. Semi-supervised classiﬁcation with graph convolu-

tional networks, 2016.

[76] Djork-Arn´e Clevert, Thomas Unterthiner, and Sepp Hochreiter. Fast and accurate

deep network learning by exponential linear units (elus), 2015.

[77] Xavier Glorot and Yoshua Bengio. Understanding the diﬃculty of training deep feed-
forward neural networks. In Yee Whye Teh and Mike Titterington, editors, Proceed-
ings of the Thirteenth International Conference on Artiﬁcial Intelligence and Statistics,
volume 9 of Proceedings of Machine Learning Research, pages 249–256, Chia Laguna
Resort, Sardinia, Italy, May 2010. PMLR.

[78] Martin Simonovsky and Nikos Komodakis. Dynamic edge-conditioned ﬁlters in con-

volutional neural networks on graphs, 2017.

[79] The ATLAS Collaboration. Search for single production of vector-like T quarks decaying
into Ht or Zt in pp collisions at √s = 13 TeV with the ATLAS detector. Journal
of High Energy Physics, 2023(8):153, 2023.

[80] The ATLAS Collaboration. Measurements of top-quark pair differential and
double-differential cross-sections in the ℓ+jets channel with pp collisions at √s = 13 TeV
using the ATLAS detector. The European Physical Journal C, 79(12), dec 2019.

[81] The ATLAS Collaboration. Measurement of the tt̄ production cross-section and lepton
differential distributions in eµ dilepton events from pp collisions at √s = 13 TeV with
the ATLAS detector. The European Physical Journal C, 80(6), jun 2020.

[82] The ATLAS Collaboration. ATLAS simulation of boson plus jets processes in Run 2.

Technical report, CERN, Geneva, 2017.

[83] The ATLAS Collaboration. Luminosity determination in pp collisions at √s = 8 TeV
using the ATLAS detector at the LHC. The European Physical Journal C, 76(12), nov
2016.

[84] The ATLAS Collaboration. Tagging and suppression of pileup jets with the ATLAS

detector. Technical report, CERN, Geneva, 2014.

[85] Michał Czakon and Alexander Mitov. Top++: A program for the calculation of
the top-pair cross-section at hadron colliders. Computer Physics Communications,
185(11):2930–2938, nov 2014.

[86] The ATLAS Collaboration. Measurements of inclusive and differential fiducial
cross-sections of tt̄ production with additional heavy-flavour jets in proton-proton collisions
at √s = 13 TeV with the ATLAS detector. Journal of High Energy Physics, 2019(4),
apr 2019.

[87] Nikolaos Kidonakis. Next-to-next-to-leading-order collinear and soft gluon corrections
for t-channel single top quark production. Physical Review D, 83(9), may 2011.

[88] Nikolaos Kidonakis. Two-loop soft anomalous dimensions for single top quark associated
production with a W− or H−. Physical Review D, 82(5), sep 2010.

[89] Nikolaos Kidonakis. Next-to-next-to-leading logarithm resummation for s-channel sin-

gle top quark production. Physical Review D, 81(5), mar 2010.

[90] Enrico Bothmann et al. Event Generation with Sherpa 2.2. SciPost Phys., 7(3):034,

2019.

[91] The ATLAS Collaboration. Measurement of Higgs boson decay into b-quarks in
associated production with a top-quark pair in pp collisions at √s = 13 TeV with the
ATLAS detector. Journal of High Energy Physics, 2022(6), jun 2022.

[92] J. M. Campbell and R. K. Ellis. Update on vector boson pair production at hadron

colliders. Physical Review D, 60(11), nov 1999.

[93] J. Alwall, S. Hoche, F. Krauss, N. Lavesson, L. Lonnblad, F. Maltoni, M.L. Mangano,
M. Moretti, C.G. Papadopoulos, F. Piccinini, S. Schumann, M. Treccani, J. Winter,
and M. Worek. Comparative study of various algorithms for the merging of parton

showers and matrix elements in hadronic collisions. The European Physical Journal C,
53(3):473–500, dec 2007.

[94] Glen Cowan, Kyle Cranmer, Eilam Gross, and Ofer Vitells. Asymptotic formulae for
likelihood-based tests of new physics. The European Physical Journal C, 71(2), feb
2011.

[95] Glen Cowan, Kyle Cranmer, Eilam Gross, and Ofer Vitells. Erratum to: Asymptotic
formulae for likelihood-based tests of new physics. The European Physical Journal C,
73(7), 2013.

[96] A L Read. Presentation of search results: the CLs technique. Journal of Physics G:

Nuclear and Particle Physics, 28(10):2693, sep 2002.

[97] Thomas Junk. Conﬁdence level computation for combining searches with small statis-

tics. Nucl. Instrum. Meth. A, 434:435–443, 1999.

[98] ATLAS Statistics Forum. The CLs method: information for conference speakers.
https://www.pp.rhul.ac.uk/~cowan/stat/cls/CLsInfo.pdf.

[99] Aldo Deandrea, Thomas Flacke, Benjamin Fuks, Luca Panizzi, and Hua-Sheng Shao.
Single production of vector-like quarks: the eﬀects of large width, interference and
NLO corrections. JHEP, 08:107, 2021.

[100] The ATLAS Collaboration. Luminosity determination in pp collisions at √s = 13 TeV
using the ATLAS detector at the LHC, 2022.

[101] The ATLAS Collaboration. Search for pair production of up-type vector-like quarks
and for four-top-quark events in ﬁnal states with multiple b-jets with the ATLAS
detector. Journal of High Energy Physics, 2018(7):89, 2018.

[102] Torbj¨orn Sj¨ostrand, Stefan Ask, Jesper R. Christiansen, Richard Corke, Nishita De-
sai, Philip Ilten, Stephen Mrenna, Stefan Prestel, Christine O. Rasmussen, and Pe-
ter Z. Skands. An introduction to PYTHIA 8.2. Computer Physics Communications,
191:159–177, jun 2015.

[103] Richard D. Ball, Valerio Bertone, Stefano Carrazza, Christopher S. Deans, Luigi Del
Debbio, Stefano Forte, Alberto Guﬀanti, Nathan P. Hartland, Jos´e I. Latorre, Juan
Rojo, and Maria Ubiali. Parton distributions with LHC data. Nuclear Physics B,
867(2):244–289, feb 2013.

[104] The ATLAS Collaboration. Studies on top-quark Monte Carlo modelling with Sherpa
and MG5_aMC@NLO. Technical report, CERN, Geneva, 2017. All figures including
auxiliary figures are available at
https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-PHYS-PUB-2017-007.

[105] Stefano Frixione, Paolo Nason, and Giovanni Ridolﬁ. A Positive-weight next-to-

leading-order Monte Carlo for heavy ﬂavour hadroproduction. JHEP, 09:126, 2007.

[106] Paolo Nason. A new method for combining NLO QCD with shower Monte Carlo

algorithms. JHEP, 11:040, 2004.

[107] Stefano Frixione, Paolo Nason, and Carlo Oleari. Matching NLO QCD computations
with parton shower simulations: the POWHEG method. JHEP, 11:070, 2007.

[108] J. Alwall, R. Frederix, S. Frixione, V. Hirschi, F. Maltoni, O. Mattelaer, H. S. Shao,
T. Stelzer, P. Torrielli, and M. Zaro. The automated computation of tree-level and
next-to-leading order diﬀerential cross sections, and their matching to parton shower
simulations. JHEP, 07:079, 2014.

[109] M. Bahr et al. Herwig++ physics and manual. Eur. Phys. J. C, 58:639, 2008.

[110] Johannes Bellm et al. Herwig 7.0/Herwig++ 3.0 release note. Eur. Phys. J. C,

76(4):196, 2016.

[111] J. A. Aguilar-Saavedra. Protos - program for top simulations.
http://jaguilar.web.cern.ch/jaguilar/protos/.

[112] Stefano Frixione, Eric Laenen, Patrick Motylinski, Chris White, and Bryan R Webber.
Single-top hadroproduction in association with a W boson. Journal of High Energy
Physics, 2008(07):029–029, jul 2008.

[113] The ATLAS Collaboration. Studies on top-quark Monte Carlo modelling for Top2016.
Technical report, CERN, Geneva, 2016. All figures including auxiliary figures are
available at
https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-PHYS-PUB-2016-020.

APPENDIX A. Monte Carlo Simulations

This appendix describes the MC simulation samples that were used to simulate the diﬀerent

signal and background processes of interest in the studies presented in this thesis. The sam-

ples are generated with computational tools known as MC generators that apply the MC

sampling method to simulate the events of a given process of interest in order to produce

distributions of kinematic variables of the process. The MC generators simulate multiple

steps in a given process. First, the collision of two protons is simulated down to the level of

the quarks and gluons inside protons, also known as partons. This is done using parton dis-

tribution functions (PDFs) that represent the probabilities of two given partons interacting

and carrying a given fraction of the total energy of the proton. The second step consists of

simulating the ﬁnal state particles that are produced from the interacting partons for a given

process. The third step consists of simulating the hadronization of quarks that are produced

from the ﬁnal state of the process of interest. Additionally, the emission of quarks and gluons

from partons prior to the collision, known as initial state radiation, and after the collision,

known as ﬁnal state radiation, are modeled using parton shower MC generators. Finally, the

detector response is simulated using the ﬁnal state leptons and hadronized quarks from the

previous step. In the following, the list of MC samples used in the diﬀerent studies, as well

as the MC generators, PDFs, and modeling parameters, is given.
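The first step described above, drawing the momentum fractions of the two interacting partons, can be illustrated with a toy sketch. It is not how any of the generators cited in this appendix are implemented: the density `toy_pdf`, the cutoff `x_min`, and every function name are assumptions made purely for illustration, and no real PDF set is used.

```python
import math
import random

SQRT_S = 13000.0  # GeV, proton-proton centre-of-mass energy


def toy_pdf(x):
    """Toy parton density (an assumption, NOT a real PDF set): falls steeply with x."""
    return (1.0 - x) ** 3 / x


def sample_x(rng, x_min=1e-2):
    """Accept-reject sampling of a momentum fraction x from toy_pdf on [x_min, 1)."""
    f_max = toy_pdf(x_min)  # toy_pdf decreases monotonically, so this bounds it
    while True:
        x = rng.uniform(x_min, 1.0)
        if rng.uniform(0.0, f_max) < toy_pdf(x):
            return x


def sample_partonic_energy(rng):
    """Draw two momentum fractions and return the partonic centre-of-mass energy."""
    x1, x2 = sample_x(rng), sample_x(rng)
    return math.sqrt(x1 * x2) * SQRT_S


rng = random.Random(42)
energies = [sample_partonic_energy(rng) for _ in range(1000)]
```

Histogramming `energies` gives a kinematic distribution of the kind the generators produce, here for the toy density only.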

Jet Tagging Study Samples

The samples used in the design and optimization of the jet taggers studied in Chapter 5

are divided into two categories: signal and background. The signal samples are generated

with BSM processes that are described in the Heavy Vector Triplets framework [55], which

is an extended gauge symmetry model that predicts the existence of heavy W′ and Z′ gauge

bosons. These samples were simulated using the Pythia 8.235 [102] generator with the

NNPDF2.3LO [103] PDF set and the A14 set of tuned parameters [104]. The background

events used for the tagger optimization are QCD multijet events. These are generated using

Pythia 8.230 with the NNPDF2.3LO PDF set and the A14 set of tuned parameters.

The samples used for the signal eﬃciency calibration of the jet substructure taggers

are also divided into signal and background. The t¯t and single top signal samples are

used to model events with jets originating from top quarks and W bosons. These samples

were simulated with Powheg [105, 106, 107] interfaced with the Pythia 8.230 generator.

Alternative t¯t samples were also used for the evaluation of systematic uncertainties of the

signal eﬃciency calibration. The samples used to assess the uncertainty on the matching

of the next-to-leading-order (NLO) matrix elements and parton shower for tt̄ samples were

generated with MadGraph5_aMC@NLO v2.6.0 [108] interfaced with Pythia 8.230. To

assess the uncertainty on the choice of the parton shower and hadronization algorithm,

samples were simulated using Powheg interfaced with Herwig 7.04 [109, 110] to model

the parton shower and hadronization.

The background samples used for the signal efficiency calibration consist of simulations of

W/Z+jets (V +jets) and diboson production processes. The V +jets samples were generated

with Sherpa v2.2.1 [90], while the diboson samples were generated with Sherpa v2.1.

Single and Pair Production of Vector-Like Quarks Samples

The single production of T vector-like quarks was simulated with samples produced with the

MadGraph5_aMC@NLO v2.3.3 generator interfaced with Pythia 8.212 for the modeling

of the parton showering and hadronization. The NNPDF3.0LO PDF set and the A14 set

of tuned parameters are used. The VLQs are assumed to couple exclusively to the third

generation SM quarks. Separate samples were generated for the T(→ Ht)qb, T(→ Zt)qb,

T(→ Ht)qt, and T(→ Zt)qt processes in the 1.1-2.3 TeV mass range at fixed values of the mass

and the coupling strength parameter κ.

The pair production of T vector-like quarks was simulated with samples produced with

the Protos [111] generator using the NNPDF2.3LO PDF set and the A14 set of tuned

parameters. These events were interfaced with Pythia 8.212 to model the parton showering

and hadronization. The samples were generated assuming singlet couplings and forced to

decay with equal branching ratios to Ht, Zt, and W b. Additionally, the samples were

generated in the T mass range 600-2000 GeV in steps of 100 GeV.

Both the single and pair production analyses have a similar background model; thus,

the majority of the MC samples that are used to simulate the background processes were

generated with the same configurations for both analyses. The following description of the

background samples applies to both analyses, unless otherwise stated.

The t¯t and single top production background processes were modeled using the Powheg

generator at NLO with the NNPDF3.0NLO PDF set. The events were interfaced to Pythia

8.230 to model the parton shower and hadronization. The t¯t samples were generated inclu-

sively, but events are categorized based on the ﬂavor content of additional particle jets that

do not originate from the decay of the tt̄ system. These events are labeled as tt̄+≥1b,

tt̄+≥1c, and tt̄+light-jets.

The associated production of a single top quark with a W boson contributes significantly

in regimes of high transverse momentum. Samples to model the single top Wt channel

were generated using the diagram removal scheme [112] in order to remove interference

and overlap with t¯t production. The uncertainty associated with this procedure is esti-

mated by comparing with an alternative W t sample generated using the diagram subtraction

scheme [113] and the same generator setup as the nominal sample. Separate samples were

generated to model the s-channel and t-channel of single top production.

Additional alternative tt̄ and single top production samples were used to evaluate systematic

uncertainties on the modeling of these processes. The impact of the choice of the

parton shower and hadronization model is evaluated with samples that were generated with

the Powheg generator using the NNPDF3.0NLO PDF set, but interfaced with Herwig

7.04. The uncertainty on the matching of NLO matrix-element and parton shower for the t¯t

samples is evaluated by comparing the nominal sample that was generated using Powheg

with an alternative sample generated with MadGraph5_aMC@NLO v2.6.0. For single top

production, the nominal sample was compared with an alternative sample generated with

MadGraph5_aMC@NLO v2.6.2.

The V+jets production background process in the single production analysis was simulated
with Sherpa v2.2.1 using the NNPDF3.0NNLO PDF set. In the pair production analysis
this process was simulated with Sherpa v2.2.11, which improves the modeling of this
background. Diboson production in the single production analysis was simulated with the
Sherpa v2.2.1 or Sherpa v2.2.2 generators depending on the process. In the pair
production analysis this process was simulated with Sherpa v2.2.11. The production of tt̄W
and tt̄Z (tt̄V) was simulated using the MadGraph5_aMC@NLO v2.3.3 generator with the
NNPDF3.0NLO PDF set interfaced to Pythia 8.210. The production of tt̄H was simulated
using the Powheg generator with the NNPDF3.0NLO PDF set interfaced to Pythia 8.230.
The production of four top quarks was simulated with the MG5_aMC v2.2.2 generator
with the NNPDF2.3LO PDF set interfaced to Pythia 8.186. Finally, the QCD multijet
samples were simulated using Pythia 8.230.

APPENDIX B. Mapper Algorithm Optimization

This appendix summarizes the optimization studies that were performed to determine the
optimal set of parameters to be used with the Mapper algorithm. These parameters are:
the filter function that maps the topoclusters to an image topological space; the covering
set of the image topological space, from which the Č complex of the jet is obtained; the
clustering algorithm that is applied in each cover element, which provides the vertices of
the Č complex; and the distance resolution scale ∆Rres, which determines the clustering
threshold distance that governs the formation of vertices. The final choice of the parameters
consists of projecting the topoclusters onto the φ-axis of the η-φ plane, which is covered by
the set of overlapping intervals U = {[−3.2, −1.2], [−2.0, 0.4], [−0.4, 2.0], [1.2, 3.2]}. The
topoclusters in each interval are clustered using a single-linkage clustering algorithm with
∆Rres = 1.2. These parameters were chosen because they make the behavior of the Mapper
algorithm easy to interpret and because they are effective in reconstructing the relevant
substructures in signal top jets.
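The final parameter choice can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the framework used in this thesis: topoclusters are represented as hypothetical (η, φ) tuples, the single-linkage step is implemented as a plain distance cut on ∆R in the η-φ plane, and two vertices are joined by an edge when they share at least one topocluster, as in the standard Mapper construction.

```python
import math
from itertools import combinations

COVER = [(-3.2, -1.2), (-2.0, 0.4), (-0.4, 2.0), (1.2, 3.2)]  # intervals on the phi axis
DR_RES = 1.2  # clustering threshold distance


def delta_r(a, b):
    """Angular distance between two (eta, phi) points, with the phi difference wrapped."""
    d_phi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[0] - b[0], d_phi)


def connected_groups(items, linked):
    """Group hashable items into connected components of the relation linked(a, b)."""
    groups = []
    for item in items:
        touching = [g for g in groups if any(linked(item, other) for other in g)]
        merged = {item}.union(*touching)
        groups = [g for g in groups if g not in touching] + [merged]
    return groups


def mapper(points):
    """Return (vertices, edges); each vertex is a set of indices into points."""
    vertices = []
    for lo, hi in COVER:
        # Filter function: projection onto phi, i.e. the second coordinate.
        members = [i for i, p in enumerate(points) if lo <= p[1] <= hi]
        # Single linkage with a distance cut: chains of close points merge.
        vertices += connected_groups(
            members, lambda i, j: delta_r(points[i], points[j]) < DR_RES)
    edges = [(a, b) for a, b in combinations(range(len(vertices)), 2)
             if vertices[a] & vertices[b]]
    return vertices, edges


def count_components(vertices, edges):
    """Number of connected components of the resulting graph."""
    linked = lambda a, b: (a, b) in edges or (b, a) in edges
    return len(connected_groups(range(len(vertices)), linked))


points = [(0.0, -2.5), (0.1, -1.5), (0.0, 1.5), (0.2, 2.5)]
vertices, edges = mapper(points)
n_components = count_components(vertices, edges)
```

For this toy input the four topoclusters form four vertices joined by two edges, giving two connected components.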

The optimization was performed with a grid search that varied a single parameter option

at a time. The topological and kinematic distributions of the jets, vertices and connected

components were analyzed in order to make the ﬁnal choice of parameter options. The vari-

ations in the ﬁlter function and clustering algorithm exhibited some of the largest diﬀerences

in the output of the Mapper algorithm. In the following, the distributions of topological and

kinematic variables of diﬀerent objects are compared between diﬀerent options that were

considered for the ﬁlter function and clustering algorithm during the optimization process,

both for signal top jets and background QCD jets.

Filter Function Optimization

A comparison between a subset of the ﬁlter functions considered for the Mapper algorithm

is presented. The other parameters of the Mapper algorithm are set to their ﬁnal choice in

the results shown here, with the exception of the covering set, which varies depending on the

deﬁnition of the ﬁlter function. Although the ﬁnal choice of the ﬁlter function consists of a

single function, the Mapper algorithm can be extended to use an arbitrary number of ﬁlter

functions. The use of two ﬁlter functions was considered during the optimization process. In

this case, the covering set consists of square grids that are built from the covering intervals

of the individual filter functions. The overlap region, from which edges between the vertices

of the Č complex are defined, can consist of at most four overlapping square grids instead

of two overlapping intervals as in the case of a single filter function. A description of the

subset of ﬁlter functions shown in this comparison study is given below:

η-Projection

This function projects the coordinate pair of topoclusters onto the η-axis of the η-φ plane.

The covering set that is used with this ﬁlter function is the same as the one used with the

φ-projection ﬁlter function.

Log Sigmoid ∆R

This function is deﬁned as the absolute value of the natural logarithm of the product of three

sigmoid functions. Each individual sigmoid function is designed to measure the distance

response of a topocluster t to a reference topocluster t′ in the jet. The reference topoclusters

chosen are the three leading-pT topoclusters in the jet: t0, t1, and t2. This filter function

can be expressed mathematically as:

$$f(t) = \left| \ln\big( s(t, t_0)\, s(t, t_1)\, s(t, t_2) \big) \right| \tag{B.1}$$

where the sigmoid function s(t, ti) is defined as

$$s(t, t_i) = \frac{1}{1 + e^{-\Delta R(t, t_i)/\Delta R_{\mathrm{res}}}} \tag{B.2}$$

The distance between topoclusters t and ti is scaled by the threshold distance that is used

in the clustering process of the Mapper algorithm. The motivation of this ﬁlter func-

tion is to map topoclusters that are spatially close to the three leading topoclusters of

the jet onto the same region of the image space of the ﬁlter function. This is done in

order to construct substructures that are centered around the most energetic topoclus-

ters of the jet. The optimal covering set that is used with this ﬁlter function is U =

{[0, 0.7], [0.66, 1.1], [0.88, 1.4], [1.31, 2.1]}.
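A minimal sketch of the log sigmoid ∆R filter function follows. It assumes topoclusters given as hypothetical (η, φ) tuples and uses ∆Rres = 1.2, the final threshold quoted in this appendix, as the default scale; the function names are illustrative only.

```python
import math

DR_RES = 1.2  # threshold distance used to scale Delta R


def delta_r(a, b):
    """Angular distance between two (eta, phi) points, with the phi difference wrapped."""
    d_phi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[0] - b[0], d_phi)


def sigmoid(t, ref, dr_res=DR_RES):
    """Distance response of topocluster t to a reference topocluster."""
    return 1.0 / (1.0 + math.exp(-delta_r(t, ref) / dr_res))


def log_sigmoid_dr(t, leading, dr_res=DR_RES):
    """|ln| of the product of the sigmoid responses to the leading topoclusters."""
    product = 1.0
    for ref in leading:
        product *= sigmoid(t, ref, dr_res)
    return abs(math.log(product))


leading = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
on_top = log_sigmoid_dr((0.0, 0.0), leading)   # coincides with all references
far_away = log_sigmoid_dr((12.0, 0.0), leading)
```

A topocluster that coincides with all three references gives the largest possible value, 3 ln 2 ≈ 2.08, while distant topoclusters are mapped towards 0, which is consistent with the range spanned by the covering set quoted above.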

Log Sigmoid ET

This function is defined as the absolute value of the natural logarithm of a sigmoid function
that measures the energy response of a topocluster t relative to the average transverse
energy of the topoclusters in the jet. This filter function can be expressed mathematically
as:

$$f(t) = \left| \ln\left( \frac{1}{1 + e^{-\sqrt{E_{T,t}/E_T^{\mathrm{avg}}}}} \right) \right| \tag{B.3}$$

where ET,t is the transverse energy of the topocluster t and E_T^avg is the average transverse
energy of the topoclusters in the jet. The motivation of this filter function is to
map topoclusters that are energetically similar onto the same region of the image space
of the filter function. The optimal covering set that is used with this filter function is
U = {[0, 0.1], [0.04, 0.4], [0.21, 0.54], [0.49, 0.7]}.
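A minimal sketch of the log sigmoid ET filter function, reading Equation B.3 as f(t) = |ln(1 / (1 + exp(−√(ET,t / E_T^avg))))|. The input format (a plain list of transverse energies) and the function name are assumptions made for illustration.

```python
import math


def log_sigmoid_et(et_values):
    """Apply the log sigmoid E_T filter to every topocluster transverse energy in a jet."""
    et_avg = sum(et_values) / len(et_values)
    return [abs(math.log(1.0 / (1.0 + math.exp(-math.sqrt(et / et_avg)))))
            for et in et_values]


values = log_sigmoid_et([40.0, 10.0, 10.0])
equal_share = log_sigmoid_et([5.0, 5.0])[0]
```

The values are bounded between 0 and ln 2 ≈ 0.69, in line with the covering set above, and more energetic topoclusters are mapped to smaller values.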

Momentum Fractions

These two filter functions are defined as the ratio of the x and y momentum components
of a topocluster t to the scalar sum of the pT of the topoclusters in the jet. These filter
functions can be expressed mathematically as:

$$f(t) = \frac{p_{i,t}}{\sum_{t' \in J} p_{T,t'}} \tag{B.4}$$

where i is the x or y momentum component and the sum in the denominator runs over
all the topoclusters t′ in the jet J. These two functions are used together in the Mapper
algorithm. The optimal covering sets for both functions are the same and are given by U =
{[−1.1, −0.25], [−0.45, 0.45], [0.25, 1.1]}. These intervals are combined in order to form a
covering grid.
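The two momentum fraction filter functions and the covering grid built from their intervals can be sketched as follows. This is an illustrative sketch only: topoclusters are assumed to be given as hypothetical (pT, φ) tuples, so that px = pT cos φ and py = pT sin φ, and the grid elements are indexed in the order produced by `itertools.product`.

```python
import math
from itertools import product

INTERVALS = [(-1.1, -0.25), (-0.45, 0.45), (0.25, 1.1)]
# Covering grid: every pairing of an x interval with a y interval (9 elements).
GRID = list(product(INTERVALS, INTERVALS))


def momentum_fractions(topoclusters):
    """Return (f_x, f_y) for each (pt, phi) topocluster in the jet."""
    pt_sum = sum(pt for pt, _ in topoclusters)
    return [(pt * math.cos(phi) / pt_sum, pt * math.sin(phi) / pt_sum)
            for pt, phi in topoclusters]


def cover_elements(fx, fy):
    """Indices of the grid elements that contain the point (fx, fy)."""
    return [k for k, ((x_lo, x_hi), (y_lo, y_hi)) in enumerate(GRID)
            if x_lo <= fx <= x_hi and y_lo <= fy <= y_hi]


fractions = momentum_fractions([(30.0, 0.0), (10.0, math.pi / 2)])
overlap = cover_elements(0.3, 0.3)
```

A point such as (0.3, 0.3) lies in the overlap of two intervals along each axis and therefore falls into four grid elements, the maximal overlap for two filter functions.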

As can be observed in Figure B.1, both for signal and background jets, the η-projection

and momentum fraction ﬁlter functions produce fewer connected components on average,

the log sigmoid filter functions tend to produce more connected components on average,

and the φ-projection filter function results in an intermediate number of connected components.

From the Cambridge-Aachen splitting scales shown in Figures B.2-B.4, it is observed that

the η-projection and momentum fraction ﬁlter functions tend to produce CCs that are, on

average, spatially closer at each reclustering step. The n-subjettiness ratio distributions

behave similarly for all ﬁlter functions as can be observed in Figures B.5 and B.6. For both

signal and background jets, the τ32 distribution peaks sharply at values close to 1, which

indicates that the jets are better modeled with two CCs as subjets instead of three. The

τ21 distribution for signal jets is bimodal, with low values corresponding to jets that are

better modeled with two CCs as subjets and higher values corresponding to jets that are

better modeled with a single CC. The mass distributions of the leading vertex and CC are

shown in Figures B.7 and B.8, respectively. As can be observed in the distributions for

signal jets, both the leading vertex and CC tend to reconstruct larger substructures in the

jet when the η-projection and momentum fraction ﬁlter functions are used. Additionally, it

is observed that the momentum fraction ﬁlter functions reconstruct most of the top jet at the

leading vertex level, while for the η-projection a similar degree of reconstruction is achieved

at the leading CC level. On the other hand, the log sigmoid ∆R tends to reconstruct smaller

substructures with these objects in signal jets, while the φ-projection and log sigmoid ET act

as a compromise between small-scale and large-scale substructure reconstruction. Finally,

as can be observed in Figures B.9 and B.10, the diﬀerent ﬁlter functions result in varying

behaviors of the energy correlation of the topoclusters that are associated with the CCs.

(a)

(b)

Figure B.1: The number of connected components in the jet for signal top jets from W′ → tb
processes (a) and background jets from QCD processes (b) overlaid for the different
options of filter functions.

(a)

(b)

Figure B.2: The Cambridge-Aachen splitting scale to three connected components for signal
top jets from W′ → tb processes (a) and background jets from QCD processes (b) overlaid
for the different options of filter functions.

[Plots for Figures B.1 and B.2. Horizontal axes: b0 and d34; vertical axis: fraction of events; panels: top quark jets and QCD jets; curves: φ-projection, η-projection, log sigmoid ∆R, log sigmoid ET, momentum fractions.]

(a)

(b)

Figure B.3: The Cambridge-Aachen splitting scale to two connected components for signal
top jets from W′ → tb processes (a) and background jets from QCD processes (b) overlaid
for the different options of filter functions.

(a)

(b)

Figure B.4: The Cambridge-Aachen splitting scale to one connected component for signal
top jets from W′ → tb processes (a) and background jets from QCD processes (b) overlaid
for the different options of filter functions.

[Plots for Figures B.3 and B.4. Horizontal axes: d23 and d12; vertical axis: fraction of events; panels: top quark jets and QCD jets; curves: φ-projection, η-projection, log sigmoid ∆R, log sigmoid ET, momentum fractions.]

(a)

(b)

Figure B.5: The n-subjettiness ratio τ21 for signal top jets from W′ → tb processes (a) and
background jets from QCD processes (b) overlaid for the different options of filter
functions. The individual n-subjettiness variables are calculated by interpreting the CCs of
the jet as subjets.

(a)

(b)

Figure B.6: The n-subjettiness ratio τ32 for signal top jets from W′ → tb processes (a) and
background jets from QCD processes (b) overlaid for the different options of filter
functions. The individual n-subjettiness variables are calculated by interpreting the CCs of
the jet as subjets.

[Plots for Figures B.5 and B.6. Horizontal axes: jet τ21 and jet τ32; vertical axis: fraction of events; panels: top quark jets and QCD jets; curves: φ-projection, η-projection, log sigmoid ∆R, log sigmoid ET, momentum fractions.]

(a)

(b)

Figure B.7: The mass of the leading-pT vertex for signal top jets from W′ → tb
processes (a) and background jets from QCD processes (b) overlaid for the different
options of filter functions.

(a)

(b)

Figure B.8: The mass of the leading-pT connected component for signal top jets from
W′ → tb processes (a) and background jets from QCD processes (b) overlaid for the
different options of filter functions.

[Plots for Figures B.7 and B.8. Horizontal axes: v0 mass [GeV] and CC0 mass [GeV]; vertical axis: fraction of events; panels: top quark jets and QCD jets; curves: φ-projection, η-projection, log sigmoid ∆R, log sigmoid ET, momentum fractions.]

(a)

(b)

Figure B.9: The energy correlation function ratio C2 of the leading-pT connected
component for signal top jets from W′ → tb processes (a) and background jets from QCD
processes (b) overlaid for the different options of filter functions. The energy
correlation function ratios are evaluated using the topoclusters associated with the connected
component.

(a)

(b)

Figure B.10: The energy correlation function ratio D2 of the leading-pT connected
component for signal top jets from W′ → tb processes (a) and background jets from QCD
processes (b) overlaid for the different options of filter functions. The energy
correlation function ratios are evaluated using the topoclusters associated with the connected
component.

[Plots for Figures B.9 and B.10. Horizontal axes: CC0 C2 and CC0 D2; vertical axis: fraction of events; panels: top quark jets and QCD jets; curves: φ-projection, η-projection, log sigmoid ∆R, log sigmoid ET, momentum fractions.]

Clustering Algorithm Optimization

A comparison between the different clustering algorithms that the Mapper algorithm uses to form the vertices of the Č complex is presented. The other parameters of the Mapper algorithm are set to their final choice in the results shown here. In addition to the single-linkage clustering algorithm, the centroid-linkage and anti-kT clustering algorithms were considered during the optimization of the Mapper algorithm. These two options are described below:

Centroid-Linkage Clustering

The centroid-linkage clustering algorithm defines the distance between two clusters of topoclusters $v_i$ and $v_j$ as:

$$D(v_i, v_j) = \sqrt{(\eta_i^{\mathrm{avg}} - \eta_j^{\mathrm{avg}})^2 + (\phi_i^{\mathrm{avg}} - \phi_j^{\mathrm{avg}})^2} \tag{B.5}$$

where $\eta_k^{\mathrm{avg}}$ and $\phi_k^{\mathrm{avg}}$ are the average pseudorapidity and azimuthal angle of the topoclusters in a given cluster $v_k$. The two clusters that achieve the minimum centroid-linkage distance are merged together.
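A single centroid-linkage merging step can be sketched as follows. This is only an illustration of Equation B.5, not the code used in the analysis: the representation of a cluster as a list of (eta, phi) tuples, the function names, and the plain arithmetic mean that ignores the wrap-around of the azimuthal angle are all assumptions, and no stopping criterion is included.

```python
import math
from itertools import combinations

def centroid(cluster):
    """Average (eta, phi) of the topoclusters in a cluster.

    Assumption: each topocluster is an (eta, phi) tuple and a plain
    arithmetic mean is used (phi wrap-around is ignored).
    """
    n = len(cluster)
    eta_avg = sum(eta for eta, _ in cluster) / n
    phi_avg = sum(phi for _, phi in cluster) / n
    return eta_avg, phi_avg

def centroid_distance(v_i, v_j):
    """Centroid-linkage distance between two clusters (Equation B.5)."""
    eta_i, phi_i = centroid(v_i)
    eta_j, phi_j = centroid(v_j)
    return math.hypot(eta_i - eta_j, phi_i - phi_j)

def merge_closest(clusters):
    """One agglomerative step: merge the pair with the minimum distance."""
    i, j = min(
        combinations(range(len(clusters)), 2),
        key=lambda pair: centroid_distance(clusters[pair[0]], clusters[pair[1]]),
    )
    merged = clusters[i] + clusters[j]
    rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
    return rest + [merged]

# Hypothetical toy input: three single-topocluster clusters.
clusters = [[(0.0, 0.0)], [(0.1, 0.0)], [(1.0, 1.0)]]
clusters = merge_closest(clusters)  # the two nearby clusters are merged
```

In this toy input the first two clusters are the closest pair, so they are merged and the distant third cluster is left untouched.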

Anti-kT Clustering

This is the standard anti-kT clustering algorithm applied to clusters of topoclusters. The distance between clusters is defined as

$$D(v_i, v_j) = \min\{p_{T,i}^{-2},\, p_{T,j}^{-2}\}\, \Delta R(v_i, v_j) \tag{B.6}$$

The two clusters that achieve the minimum anti-kT distance are merged together.
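The distance of Equation B.6 can be illustrated with a short sketch. The dictionary layout with keys `pt`, `eta` and `phi`, the function names, and the wrapping of the azimuthal difference into [-pi, pi] are assumptions made for this example; the toy values are hypothetical.

```python
import math

def delta_r(v_i, v_j):
    """Angular separation in the (eta, phi) plane.

    Assumption: the azimuthal difference is wrapped into [-pi, pi].
    """
    d_eta = v_i["eta"] - v_j["eta"]
    d_phi = (v_i["phi"] - v_j["phi"] + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(d_eta, d_phi)

def anti_kt_distance(v_i, v_j):
    """Equation B.6: min(pT_i^-2, pT_j^-2) * DeltaR(v_i, v_j)."""
    return min(v_i["pt"] ** -2, v_j["pt"] ** -2) * delta_r(v_i, v_j)

# Hypothetical toy clusters: one hard cluster and two soft ones.
hard = {"pt": 100.0, "eta": 0.0, "phi": 0.0}
soft_1 = {"pt": 1.0, "eta": 0.3, "phi": 0.0}
soft_2 = {"pt": 1.0, "eta": 0.3, "phi": 0.1}

d_hard_soft = anti_kt_distance(hard, soft_1)
d_soft_soft = anti_kt_distance(soft_1, soft_2)
```

Because the prefactor is set by the larger of the two transverse momenta, the distance between the hard cluster and a soft cluster is much smaller than the distance between the two soft clusters, even though the soft clusters are closer in angle. This is the sense in which the most energetic objects are clustered first.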


As can be observed in Figure B.11, the anti-kT clustering tends to produce fewer CCs in jets, while the centroid-linkage clustering results in more CCs. The distributions of the n-subjettiness ratios τ21 and τ32 are shown in Figures B.12 and B.13, respectively. The τ32 distribution indicates that both signal and background jets that have at least three CCs are better modeled with two CCs than with three. On the other hand, when using the single-linkage and centroid-linkage clustering algorithms, the τ21 distribution is bimodal for signal jets, while it favors a single-CC substructure for background jets. The anti-kT clustering tends to produce CCs that better model signal jets with two CCs when compared to the other clustering options. This observation also holds for background jets; however, the relative modeling between two CCs and a single CC is more ambiguous compared to the other n-subjettiness ratio distributions. The mass distributions of the leading vertex and the leading CC are shown in Figures B.14 and B.15, respectively. These two objects tend to reconstruct smaller substructures in signal jets when the centroid-linkage clustering is used. On the other hand, the anti-kT clustering reconstructs most of the substructure in signal jets with these two objects. This can be observed from the bump in the leading vertex mass distribution near the W boson mass and the prominent peak in the leading CC mass distribution near the top quark mass. No significant differences are observed in the mass distributions of these objects for background jets between the single-linkage and centroid-linkage clustering algorithms. In contrast, the anti-kT clustering tends to produce leading vertices and CCs in background jets that have more mass compared to the other clustering options. This could be explained by the behavior of the anti-kT algorithm, which clusters the most energetic objects first. Finally, the different clustering algorithms tend to produce CCs with varying behaviors in their topocluster substructure, as can be observed from the energy correlation function ratios in Figures B.16 and B.17.


(a)

(b)

Figure B.11: The number of connected components in the jet for signal top jets from W′ → tb processes (a) and background jets from QCD processes (b), overlaid for the different options of clustering algorithms.

[Plots for Figure B.11: number of connected components (axis label b0) versus fraction of events, for top quark jets and QCD jets, compared across the single-linkage, centroid-linkage, and anti-kT clustering options.]

(a)

(b)

Figure B.12: The n-subjettiness ratio τ21 for signal top jets from W′ → tb processes (a) and background jets from QCD processes (b), overlaid for the different options of clustering algorithms. The individual n-subjettiness variables are calculated by interpreting the CCs of the jet as subjets.

(a)

(b)

Figure B.13: The n-subjettiness ratio τ32 for signal top jets from W′ → tb processes (a) and background jets from QCD processes (b), overlaid for the different options of clustering algorithms. The individual n-subjettiness variables are calculated by interpreting the CCs of the jet as subjets.

[Plots for Figures B.12 and B.13: jet τ21 and jet τ32 versus fraction of events, for top quark jets and QCD jets, compared across the single-linkage, centroid-linkage, and anti-kT clustering options.]

(a)

(b)

Figure B.14: The mass of the leading in pT vertex for signal top jets from W′ → tb processes (a) and background jets from QCD processes (b), overlaid for the different options of clustering algorithms.

(a)

(b)

Figure B.15: The mass of the leading in pT connected component for signal top jets from W′ → tb processes (a) and background jets from QCD processes (b), overlaid for the different options of clustering algorithms.

[Plots for Figures B.14 and B.15: mass of the leading vertex (v0) and of the leading CC (CC0) in GeV versus fraction of events, for top quark jets and QCD jets, compared across the single-linkage, centroid-linkage, and anti-kT clustering options.]

(a)

(b)

Figure B.16: The energy correlation function ratio C2 of the leading in pT connected component for signal top jets from W′ → tb processes (a) and background jets from QCD processes (b), overlaid for the different options of clustering algorithms. The energy correlation function ratios are evaluated using the topoclusters associated with the connected component.

(a)

(b)

Figure B.17: The energy correlation function ratio D2 of the leading in pT connected component for signal top jets from W′ → tb processes (a) and background jets from QCD processes (b), overlaid for the different options of clustering algorithms. The energy correlation function ratios are evaluated using the topoclusters associated with the connected component.

[Plots for Figures B.16 and B.17: C2 and D2 of the leading CC (CC0) versus fraction of events, for top quark jets and QCD jets, compared across the single-linkage, centroid-linkage, and anti-kT clustering options.]

APPENDIX C. Mapper Algorithm Comparison Plots

As discussed in subsection 5.2.3, variables that are inspired by the jet substructure observables were defined using the information obtained from the topological data analysis of jets. The vertices and connected components (CCs) of the Č complex of jets obtained from the Mapper algorithm are interpreted as subjets. This is achieved by adding the four-momenta of the topoclusters that are associated with these objects. This allows us to use vertices and CCs as inputs to the jet substructure observables to quantify how the energy of a jet is distributed across structures formed by these objects. Additionally, some of these substructure observables were also defined for the CCs in jets by using the topoclusters associated with a given CC as inputs when evaluating these variables. This allows us to quantify how the energy is distributed in the substructures that are reconstructed by CCs. As discussed in subsection 5.2.4, some of these variables are used as inputs to the DNN and GNN taggers that were trained to classify jets as either signal top jets or background QCD jets. This appendix contains plots comparing the distributions of these variables between signal and background jets.
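The interpretation of a vertex or CC as a subjet, by adding the four-momenta of its topoclusters, can be sketched as below. The (pT, eta, phi) tuple layout, the function names, and the treatment of topoclusters as massless by default are assumptions made for this illustration.

```python
import math

def four_momentum(pt, eta, phi, m=0.0):
    """Convert (pT, eta, phi, m) to Cartesian components (E, px, py, pz).

    Assumption: topoclusters are treated as massless unless m is given.
    """
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px**2 + py**2 + pz**2 + m**2)
    return energy, px, py, pz

def subjet_from_topoclusters(topoclusters):
    """Sum the topocluster four-momenta and return (mass, pT) of the subjet."""
    components = zip(*(four_momentum(*t) for t in topoclusters))
    e, px, py, pz = (sum(c) for c in components)
    mass_squared = e**2 - px**2 - py**2 - pz**2
    # Guard against small negative values from floating-point rounding.
    return math.sqrt(max(mass_squared, 0.0)), math.hypot(px, py)

# Hypothetical toy CC: two back-to-back massless topoclusters at eta = 0.
mass, pt = subjet_from_topoclusters([(50.0, 0.0, 0.0), (50.0, 0.0, math.pi)])
```

For the toy input, the transverse momenta cancel while the energies add, so the combined object has a mass of about 100 (in the units of the inputs) and a transverse momentum close to zero.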


(a)

(b)

Figure C.1: Cambridge-Aachen splitting scales of jets to three connected components (a),
two connected components (b), and one connected component (c).

(c)

[Plots for Figure C.1: splitting scales d34, d23, and d12 versus fraction of events, for QCD jets and top quark jets.]

(a)

(b)

Figure C.2: n-subjettiness distributions τ1 (a), τ2 (b), and τ3 (c) using the connected components as the subjets of the jet.

(c)

[Plots for Figure C.2: jet τ1, τ2, and τ3 versus fraction of events, for QCD jets and top quark jets.]

(a)

(b)

Figure C.3: n-subjettiness ratios τ21 = τ2/τ1 (a) and τ32 = τ3/τ2 (b) using the connected components as the subjets of the jet.
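As a rough illustration of how such ratios can be evaluated, the sketch below uses the standard N-subjettiness form, in which each constituent contributes its pT times the distance to the nearest axis, normalized by the scalar pT sum times a radius parameter. The choice of the leading in pT subjets as axes, the value R0 = 1.0, the (pT, eta, phi) tuple layout, and the function names are all assumptions; the definition actually used for the plots is the one given in subsection 5.2.3.

```python
import math

def delta_r(a, b):
    """Angular separation between two (pT, eta, phi) objects."""
    d_eta = a[1] - b[1]
    d_phi = (a[2] - b[2] + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(d_eta, d_phi)

def tau_n(constituents, axes, r0=1.0):
    """Standard N-subjettiness for N = len(axes).

    Assumption: r0 = 1.0 is used as the radius parameter.
    """
    d0 = sum(pt for pt, _, _ in constituents) * r0
    total = sum(c[0] * min(delta_r(c, a) for a in axes) for c in constituents)
    return total / d0

def tau_ratio(constituents, subjets, n):
    """tau_n / tau_(n-1), with the leading-pT subjets taken as axes (assumption)."""
    ordered = sorted(subjets, key=lambda s: s[0], reverse=True)
    return tau_n(constituents, ordered[:n]) / tau_n(constituents, ordered[: n - 1])

# Hypothetical toy jet with two well-separated objects.
objects = [(10.0, 0.0, 0.0), (10.0, 0.5, 0.0)]
tau_1 = tau_n(objects, objects[:1])
tau_21 = tau_ratio(objects, objects, 2)
```

A small ratio indicates that the jet is described better by n axes than by n - 1. The ratio is undefined when the denominator vanishes, which this sketch does not guard against.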

[Plots for Figure C.3: jet τ21 and jet τ32 versus fraction of events, for QCD jets and top quark jets.]

(a)

(b)

(c)

Figure C.4: n-point energy correlation function $e_2$ (a) and the ratios $C_2 = e_3/e_2^2$ (b) and $D_2 = e_3/e_2^3$ (c) using the connected components as the constituents of the jet.
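The ratios in the caption above can be illustrated with the standard normalized two- and three-point energy correlation functions. In the sketch below, the angular exponent beta = 1, the use of the scalar pT sum of the constituents as the normalization, the (pT, eta, phi) tuple layout, and the function names are assumptions.

```python
import math
from itertools import combinations

def delta_r(a, b):
    """Angular separation between two (pT, eta, phi) objects."""
    d_eta = a[1] - b[1]
    d_phi = (a[2] - b[2] + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(d_eta, d_phi)

def ecf_ratios(constituents, beta=1.0):
    """Return (e2, C2, D2) with C2 = e3 / e2^2 and D2 = e3 / e2^3."""
    # Assumption: normalize by the scalar sum of constituent pT.
    pt_sum = sum(c[0] for c in constituents)
    e2 = sum(
        a[0] * b[0] * delta_r(a, b) ** beta
        for a, b in combinations(constituents, 2)
    ) / pt_sum**2
    e3 = sum(
        a[0] * b[0] * c[0]
        * (delta_r(a, b) * delta_r(a, c) * delta_r(b, c)) ** beta
        for a, b, c in combinations(constituents, 3)
    ) / pt_sum**3
    return e2, e3 / e2**2, e3 / e2**3

# Hypothetical toy jet with three equal-pT constituents.
toy = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 0.0, 1.0)]
e2, c2, d2 = ecf_ratios(toy)
```

With fewer than three constituents e3 vanishes, and with a single constituent e2 vanishes as well, so the ratios are only meaningful for objects with enough constituents; the sketch does not handle those cases.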

[Plots for Figure C.4: jet e2, C2, and D2 versus fraction of events, for QCD jets and top quark jets.]

(a)

(b)

Figure C.5: Mass distributions of the leading (a), second leading (b), and third leading (c) in pT vertices obtained from the Mapper algorithm by clustering topoclusters in the cover elements of the filter function image space.

(c)

[Plots for Figure C.5: mass of the vertices v0, v1, and v2 in GeV versus fraction of events, for QCD jets and top quark jets.]

(a)

(b)

Figure C.6: pT distributions of the leading (a), second leading (b), and third leading (c) in pT vertices obtained from the Mapper algorithm by clustering topoclusters in the cover elements of the filter function image space.

(c)

[Plots for Figure C.6: pT of the vertices v0, v1, and v2 in GeV versus fraction of events, for QCD jets and top quark jets.]

(a)

(b)

Figure C.7: Mass distributions of the leading (a), second leading (b), and third leading (c) in
pT connected components obtained from the Mapper algorithm by adding the four-momenta
of the topoclusters associated with the connected component.

(c)

[Plots for Figure C.7: mass of the connected components CC0, CC1, and CC2 in GeV versus fraction of events, for QCD jets and top quark jets.]

(a)

(b)

Figure C.8: pT distributions of the leading (a), second leading (b), and third leading (c) in
pT connected components obtained from the Mapper algorithm by adding the four-momenta
of the topoclusters associated with the connected component.

(c)

[Plots for Figure C.8: pT of the connected components CC0, CC1, and CC2 in GeV versus fraction of events, for QCD jets and top quark jets.]

(a)

(b)

Figure C.9: Energy correlation function ratio $C_2 = e_3/e_2^2$ distributions of the leading (a), second leading (b), and third leading (c) in pT connected components obtained from the Mapper algorithm.

(c)

[Plots for Figure C.9: C2 of the connected components CC0, CC1, and CC2 versus fraction of events, for QCD jets and top quark jets.]

(a)

(b)

Figure C.10: Energy correlation function ratio $D_2 = e_3/e_2^3$ distributions of the leading (a), second leading (b), and third leading (c) in pT connected components obtained from the Mapper algorithm.

(c)

[Plots for Figure C.10: D2 of the connected components CC0, CC1, and CC2 versus fraction of events, for QCD jets and top quark jets.]

APPENDIX D. Single VLQ Background Reweighting

This appendix contains plots that compare kinematic distributions before and after applying the background correction factors that were derived in the analysis of the single production of a vector-like T, as discussed in subsection 6.1.5. The kinematic distributions shown cover a varied selection and highlight the applicability of the reweighting procedure to kinematic variables that are not related to the variables used in the derivation of the correction factors. The distributions are shown in the $t\bar{t} + Wt$ reweighting source region before applying the correction factors in Figure D.1 and after applying the correction factors in Figure D.2. Additionally, these kinematic distributions are also shown in a region that requires exactly one b-tagged jet. This region is orthogonal to the $t\bar{t} + Wt$ region by definition and is used to validate the full reweighting procedure. Figures D.3 and D.4 show the distributions before and after applying the correction factors, respectively. As can be observed, the modeling of the MC simulation improves significantly in this validation region after applying all correction factors, which gives confidence in the overall background reweighting procedure.
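The general idea of a bin-wise background reweighting can be sketched as below. This is a generic illustration and not the derivation of subsection 6.1.5: the per-bin form of the correction factor, the function names, and all yields are hypothetical.

```python
def correction_factors(data, target_mc, other_mc):
    """Per-bin factors that scale the target MC so that total MC matches data.

    Assumption: factor = (data - other backgrounds) / target background,
    with a factor of 1.0 for bins where the target MC is empty.
    """
    return [
        (d - o) / t if t > 0 else 1.0
        for d, t, o in zip(data, target_mc, other_mc)
    ]

def apply_factors(target_mc, factors):
    """Scale each bin of the target MC by its correction factor."""
    return [t * f for t, f in zip(target_mc, factors)]

# Hypothetical yields in two bins of a source region.
data = [120.0, 90.0]
target_mc = [100.0, 100.0]   # background being corrected
other_mc = [10.0, 10.0]      # remaining backgrounds, left unchanged

factors = correction_factors(data, target_mc, other_mc)
corrected_total = [
    t + o for t, o in zip(apply_factors(target_mc, factors), other_mc)
]
```

By construction the corrected prediction matches the data in the region where the factors are derived, so a meaningful check requires applying the same factors in an orthogonal region, which is the role of the validation region described above.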


(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

Figure D.1: Comparison between data and unreweighted MC prediction in the preselection region with 1 lepton, at least 3 jets, and 2 b-tagged jets. From top to bottom and left to right, the variables displayed are: number of jets, number of forward jets, leading lepton pT, $E_T^{\mathrm{miss}}$, $m_T^W$, $H_T^{\mathrm{had}}$, number of V-tagged jets, number of H-tagged jets, number of top-tagged jets.

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

Figure D.2: Comparison between data and fully reweighted MC prediction in the preselection region with 1 lepton, at least 3 jets, and 2 b-tagged jets. From top to bottom and left to right, the variables displayed are: number of jets, number of forward jets, leading lepton pT, $E_T^{\mathrm{miss}}$, $m_T^W$, $H_T^{\mathrm{had}}$, number of V-tagged jets, number of H-tagged jets, number of top-tagged jets.

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

Figure D.3: Comparison between data and unreweighted MC prediction in the preselection region with 1 lepton, at least 3 jets, and 1 b-tagged jet. From top to bottom and left to right, the variables displayed are: number of jets, number of forward jets, leading lepton pT, $E_T^{\mathrm{miss}}$, $m_T^W$, $H_T^{\mathrm{had}}$, number of V-tagged jets, number of H-tagged jets, number of top-tagged jets.

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

Figure D.4: Comparison between data and fully reweighted MC prediction in the preselection region with 1 lepton, at least 3 jets, and 1 b-tagged jet. From top to bottom and left to right, the variables displayed are: number of jets, number of forward jets, leading lepton pT, $E_T^{\mathrm{miss}}$, $m_T^W$, $H_T^{\mathrm{had}}$, number of V-tagged jets, number of H-tagged jets, number of top-tagged jets.