CONSTRAINING NUCLEAR WEAK INTERACTIONS IN
ASTROPHYSICS AND NEW MANY-CORE ALGORITHMS FOR
NEUROEVOLUTION
By
Christopher James Sullivan
A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
Physics — Doctor of Philosophy
Computational Mathematics, Science and Engineering — Dual Major
2018
ABSTRACT
CONSTRAINING NUCLEAR WEAK INTERACTIONS IN ASTROPHYSICS AND
NEW MANY-CORE ALGORITHMS FOR NEUROEVOLUTION
By
Christopher James Sullivan
Weak interactions involving atomic nuclei are critical components of a broad range of astrophysical phenomena. As allowed Gamow-Teller transitions are the primary path through
which weak interactions in nuclei operate in astrophysical contexts, the constraint of these
nuclear transitions is an important goal of nuclear astrophysics.
In this work, the charged-current nuclear weak interaction known as electron capture is
studied in the context of stellar core-collapse supernovae (CCSNe). Specifically, the sensitivity of the core-collapse and early post-bounce phases of CCSNe to nuclear electron-capture
rates is examined. Electron-capture rates are adjusted by factors consistent with uncertainties indicated by comparing theoretical rates to those deduced from charge-exchange and
β-decay measurements. With the aid of such sensitivity studies, the diverse role of electron
capture on thousands of nuclear species is constrained to a few tens of nuclei near N ∼ 50
and A ∼ 80 which dictate the primary response of CCSNe to nuclear electron capture. As
electron capture is shown to be a leading-order uncertainty during the core-collapse phase
of CCSNe, future experimental and theoretical efforts should seek to constrain the rates of
nuclei in this region.
Furthermore, neutral-current neutrino-nuclear interactions in the tens-of-MeV energy
range are important in a variety of astrophysical environments, including core-collapse supernovae, as well as in the synthesis of some of the solar system's rarest elements. Estimates for
inelastic neutrino scattering on nuclei are also important for neutrino detector construction
aimed at the detection of astrophysical neutrinos. Due to the small cross sections involved,
direct measurements are rare and have only been performed on a few nuclei. For this reason, indirect measurements provide a unique opportunity to constrain the nuclear transition
strength needed to infer inelastic neutrino-nucleus cross sections. Herein the (⁶Li, ⁶Li′) inelastic scattering reaction at 100 MeV/u is shown to indirectly select the relevant transitions for
inelastic neutrino-nucleus scattering. Specifically, the probe's unique selectivity for isovector spin-transfer excitations (ΔS = 1, ΔT = 1, ΔTz = 0) is demonstrated, thereby allowing the
extraction of Gamow-Teller transition strength in the inelastic channel.
Finally, the development and performance of a newly established technique for the subfield of artificial intelligence known as neuroevolution is described. While separate from the
physics that is discussed, these algorithmic advancements seek to improve the adoption of
machine learning in the scientific domain by enabling neuroevolution to take advantage of
modern heterogeneous compute architectures. Because the evolution of neural network populations offloads the choice of specific details about the neural networks to an evolutionary
search algorithm, neuroevolution can increase the accessibility of machine learning. However,
the evolution of neural networks through parameter and structural space presents a novel divergence problem when mapping the evaluation of these networks to many-core architectures.
The principal focus of the algorithm optimizations described herein is on improving the
feed-forward evaluation time when tens-to-hundreds of thousands of heterogeneous neural
networks are evaluated concurrently.
Copyright by
CHRISTOPHER JAMES SULLIVAN
2018
ACKNOWLEDGMENTS
There are many students, staff, faculty, friends, family, and others whom I regret having not
listed in these acknowledgments by name. To each and every one of you, thank you for the
role you have played in this season of my life.
I must begin by thanking my advisor, Remco Zegers. Remco’s leadership has been
unparalleled, knowing not only when to direct and when to withdraw, but also recognizing
potential and encouraging growth in ways that were genuinely for my benefit alone. His
scientific expertise and vision for the field of nuclear astrophysics are as accessible as they are
energizing. Under his guidance I have been able to cultivate my identity as a physicist and
as a computational scientist and have been supported uniformly along both paths. Remco,
your passionate commitment to science, service, and stewardship is remarkable and I am
truly honored to have been exposed to your level of professional investment on a daily basis.
I also owe the other members of the charge-exchange group a debt of gratitude. Every
member of that group has aided in the efforts described in this work. Thank you all for your
insight over the years, your willingness to travel with me to Japan for the ⁶Li measurement,
and for the intentionality each of you brought to making the group a friendly and welcoming
place to grow.
I must also thank each of the members of my thesis committee, Drs. Dirk Colbry, Sean
Couch, Morten Hjorth-Jensen, Sean Liddick, Hendrik Schatz (and Ed Brown in my first few
years). They, as much as I, have seen my interests grow and evolve over the past several
years. Thank you for your continued guidance and for providing me with the space to mature
in the areas in which I am most energized. Thank you, Morten, for recognizing my passion
for computational science and suggesting I pursue it as part of my degree. Your relentless
optimism and encouragement are contagious.
Many thanks are also in order for Evan O'Connor, whose willingness to partner with an
experimental group to achieve the sensitivity study outlined in this work was exceptional.
His mentorship during our time at CITA and thereafter provided some of my first experiences
in collaborative scientific software development. Thank you for your helpful explanations of
the physics and numerics and for being an excellent collaborator.
I am also particularly thankful for the sense of community among the MSU and NSCL
graduate students, and for a number of close relationships I’ve developed over the years. I
had the great fortune of joining MSU in 2012 with Charles Loelius, Eric Lunderberg, and
Sam Lipschutz, who have been constant companions throughout the past six years. It has
been a pleasure to learn, laugh, and complain with each of you. To my officemates, it
was an honor to see you grow, to work beside you, and to be consistently inspired by your
stewardship of our communities. In response to the thesis acknowledgments of Chris Morse:
in fact, suggesting that you pursue a Physics degree at Westmont was likely one of the most
lucrative actions of my professional career. You have been an always willing and capable
source of advice and mentorship over the past decade, and I am thankful for your close
friendship.
The unwavering love and support of my mother, Theresa Sullivan, and my grandparents,
Ronald and Martha Sullivan, cannot be overstated. You have invested in me and my education in ways that only very few people experience. While your involvement in my Ph.D.
has looked different than when I was young, it was equally foundational. I also must thank
my mother- and father-in-law, Chris and Jim Hansen, who have been extraordinary in their
selfless support and encouragement. To both the Hansens and the Sullivans, your radical
generosity and love have brought genuine vitality to Rachel and me during many long months
of my degree, and for that I am eternally grateful.
Finally, to Rachel Sullivan, my spouse and closest friend, I owe nearly all of my accomplishments over these past years in Michigan to you. In a period of life that is statistically
plagued with doubt, anxiety, and depression, you have filled my life with continuous joy and
purpose. I am beyond privileged to have had your unceasing love and support at every stage
of my academic career, from secondary school until now. Thank you for your love, your
willingness to uproot with me, and for your encouragement to reach past what was easy and
well beyond what I would have ever been capable of otherwise.
PREFACE
The fence we walked between the years
Did balance us serene
It was a place half in the sky where
In the green of leaf and promising of peach
We’d reach our hands to touch and almost touch the sky
If we could reach and touch, we said,
’Twould teach us, not to, never to, be dead
We ached and almost touched that stuff;
Our reach was never quite enough.
If only we had taller been
And touched God's cuff, His hem,
We would not have to go with them
Who’ve gone before,
Who, short as us, stood as they could stand
And hoped by stretching tall that they might keep their land
Their home, their hearth, their flesh and soul.
But they, like us, were standing in a hole
O, Thomas, will a Race one day stand really tall
Across the Void, across the Universe and all?
And, measured out with rocket fire,
At last put Adam’s finger forth
As on the Sistine Ceiling,
And God’s hand come down the other way
To measure man and find him Good
And Gift him with Forever’s Day?
I work for that
Short man, Large dream
I send my rockets forth between my ears
Hoping an inch of Good is worth a pound of years
Aching to hear a voice cry back along the universal mall:
We’ve reached Alpha Centauri!
We’re tall, O God, we’re tall!
Ray Bradbury
NASA Mariner 9 Symposium
Nov. 12, 1971, Caltech
TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

Chapter 1  Introduction
  1.1 Nuclear weak interactions
  1.2 Accessible machine learning via neuroevolution

Chapter 2  The sensitivity of core-collapse supernovae to nuclear electron capture
  2.1 Astrophysical weak interaction rates
  2.2 Core collapse and the role of electron capture
  2.3 Codes & Methods
    2.3.1 NuLib
    2.3.2 GR1D
    2.3.3 Neutrino emission via electron capture
  2.4 Sensitivity study
    2.4.1 Reference simulations
    2.4.2 Species dependent sensitivity
    2.4.3 Systematic variations
      2.4.3.1 Progenitor model sensitivity
    2.4.4 Monte-Carlo variations
  2.5 Conclusion
    2.5.1 Most important nuclei
    2.5.2 Impact of uncertainties
    2.5.3 Goals for future studies
  2.6 Updates
    2.6.1 Implementation procedure for future updates

Chapter 3  The (⁶Li, ⁶Li′ + γ) reaction
  3.1 Experiment
    3.1.1 The CAGRA+Grand Raiden Coincidence Setup
      3.1.1.1 Grand Raiden Spectrometer
      3.1.1.2 CAGRA
  3.2 Methods & Analysis
    3.2.1 Data acquisition and analysis framework
    3.2.2 Grand Raiden calibrations
      3.2.2.1 Sieve-slit scattering angle reconstruction
      3.2.2.2 Energy calibration
    3.2.3 CAGRA calibrations
      3.2.3.1 Energy calibration
    3.2.4 Invariant missing mass reconstruction
    3.2.5 Doppler reconstruction
    3.2.6 Excitation energy resolution
    3.2.7 Cross-section calculations
      3.2.7.1 Normalization of the number of incident particles
      3.2.7.2 Multipole Decomposition Analysis
  3.3 Results
    3.3.1 The ¹²C(⁶Li, ⁶Li′ + γ) measurement
    3.3.2 The (⁶Li, ⁶Li′) unit cross-section
    3.3.3 Competition with isoscalar resonances
  3.4 Conclusion

Chapter 4  Many-core algorithms for topologically divergent neural networks
  4.1 Applications in physics
  4.2 Network Structures & Evaluation
  4.3 Directed acyclic graph concurrency
    4.3.1 Topological connection sort
    4.3.2 Neural network operations
      4.3.2.1 Neural network node lifecycle
    4.3.3 Generalized concurrent neural network evaluation
  4.4 Composite population evaluation
  4.5 CUDA implementation
    4.5.1 Performance results
  4.6 Conclusion

Chapter 5  Summary & Outlook

BIBLIOGRAPHY
LIST OF TABLES

Table 2.1: Density, temperature and mass ranges for the compiled weak rate set

Table 4.1: Neural network connection dependency comparator
LIST OF FIGURES
Figure 1.1: Diagrams of the weak interactions studied in this work, and the corresponding strong-force reactions that were utilized to indirectly extract the relevant nuclear transitions. As can be seen in the Charged-Current (CC) column of the diagram, even though nuclear charge-exchange is mediated by a π⁺ meson instead of the weak W⁺ boson for electron capture, the operator responsible for the nuclear transition, Ô(GT⁺), is the same in both reactions. This implies that charge-exchange can be used as a surrogate or indirect measurement of the nuclear matrix element relevant for electron capture. Similarly, in the Neutral-Current (NC) channel, nuclear probes can be used to estimate the response of nuclei to inelastic excitation via neutrinos, if the GT₀ component of the M1 operator can be extracted. This is the case for certain proton inelastic scattering measurements, and in Chapter 3, ⁶Li inelastic scattering is shown to be directly sensitive to these weak NC transitions.
Figure 2.1: Chart of the nuclear species depicting the nuclei included in each rate tabulation as well as the full reach of the weak rate library presented in this work. The table to which a species belongs is given by the color and legend in the figure. The Oda set contains rates for lower mass sd-shell nuclei (light blue), the LMP set contains rates for the intermediate mass pf-shell nuclei (green), and the LMSH set contains rates for the heavier mass pfg/sdg-shell nuclei near stability (red). The FFN tabulation provides rates across the sd and pf shells (dark blue). Squares individually bordered in black are stable nuclei. The tables are mutually exclusive except for FFN, which spans many nuclear shells. To distinguish between nuclei with rates from FFN and another table, the border of the FFN set has been outlined with a black and white line.
Figure 2.2: Panel (a): Q-value dependence of electron-capture rates at two points along a core-collapse trajectory. The scattered points are tabulated (shell-model and SMMC) rates for each electron-capture reaction, while the black points are the approximate rates given by Eq. 2.1. Panel (b): The residual differences between log10 of the shell-model rates and the approximate rates for each nucleus in the weak-rate library. An example residual is indicated on panel (a). When the density and temperature of a simulation evolve outside the range of the rate tables (see Table 2.1), rates are calculated via the approximate routines in order to avoid an artificial cutoff imposed by the table boundaries. Rates are estimated between density and temperature grid points via monotonic cubic-spline interpolation as described by Steffen [63].
Figure 2.3: The average nuclear mass (divided by two), charge, and electron-capture rate versus central density for the three EOS utilized in this study. The colors indicate different EOS, while the line styles indicate which quantity is plotted. All three EOS have nearly identical abundance distributions up to densities of 2·10¹² g cm⁻³. Beyond this point the TMA EOS has a heavier and slightly more neutron-rich mass distribution compared to both SFHo and DD2, but maintains a comparable average electron-capture rate overall. These simulations each utilize the s15WW95 progenitor.

Figure 2.4: The contribution of nuclear electron capture to the change of the matter electron-fraction with time. The contours are the binned sums of |Ẏe| for each species in several mass regions. For reference, the central density at t − tb = −20, −10, −5, −2, and −1 ms is 1.41·10¹¹, 4.06·10¹¹, 1.42·10¹², 8.49·10¹², and 3.30·10¹³ g cm⁻³, respectively.
Figure 2.5: Top 500 electron capturing nuclei with the largest absolute change to the electron fraction up to neutrino trapping. The color scale indicates |Ẏe| integrated up to the trapping time, occurring when ρc ∼ 2·10¹² g cm⁻³, such that the total electron-fraction at this point is equal to its initial value less the sum of ΔYe, the plotted quantity, over all nuclides. Calculations are based on the s15WW95+SFHo reference simulation. The black contour that runs parallel to the valley of stability on the neutron-rich side is the boundary between measured [77] and theoretical masses used in the approximate rate estimates of Eqs. 2.1 and 2.2. The rectangular outline indicates the size of the sampling region used in the statistical resampling study, and also the set of nuclei which exhibited the largest changes to the simulations when excluded from the electron-capture calculations.
Figure 2.6: Projection of the (trapped) lepton fraction at ρc = 4·10¹² g cm⁻³ as a function of the electron-capture rate scaling factor for each progenitor+EOS reference simulation. In all the cases the lepton fraction begins to increase if the capture rate becomes too high because of a dramatic increase in the electron-neutrino absorption cross section. The asymmetry seen here indicates that those quantities which depend on Yl/e are likely to be more sensitive to a reduction of the electron-capture rates due to a systematic overestimate in the base rates than they are to an increase due to an underestimate.
Figure 2.7: Comparison of the central electron, electron-neutrino (left) and lepton (right) fractions in which the nuclear electron-capture rate for every species has been scaled by the factors shown in the legend. Warmer colors indicate a higher overall electron-capture rate, and cooler colors indicate a lower rate. The dashed black line indicates the reference s15WW95+SFHo simulation.
Figure 2.8: The electron fraction, entropy, density, and velocity as a function of enclosed mass at four times during a core-collapse simulation, spanning 6 ms around bounce, including the collapse phase just after the onset of neutrino trapping (a), core bounce (b), and 1 ms (c) and 5 ms (d) after bounce, during which the shock has begun its outward trajectory. The reference simulation (s15WW95+SFHo) is shown in dashed black. Simulations shown in color have electron-capture rates scaled systematically for all species by factors of 10, 4, 2, 0.5, 0.25, and 0.1. Core bounce (t − tb = 0) shown in panel (b) is defined as when the entropy at the shock front exceeds 3.0 kB/baryon.
Figure 2.9: The neutrino luminosity as measured at a radius of 500 km as a function of time after bounce in the s15WW95+SFHo simulation set. Electron-capture rate scaling factors are shown in the legend, where contours with warmer colors have higher rates, and cooler colors have lower rates. While the peak electron-neutrino luminosity is considered particularly stable across core-collapse simulations, it varies significantly with variations of the electron-capture rates on medium-heavy nuclei. When the rates are at their lowest (×0.1 case), the shock reaches the neutrinosphere more quickly than in the other simulations. This results in a larger luminosity in the peak electron-neutrino burst because more νe's are able to stream out of the core at early times. The opposite is true when the rates are higher: the neutrinosphere and shock converge much more slowly, and so the neutrinos spend more time diffusing out of the inner core, reducing the peak luminosity but distributing it out to later times.
Figure 2.10: The full range of sensitivity of the PNS inner-core mass, central entropy, and central temperature at bounce, as well as the peak νe-luminosity, the peak average νe energy, and the average νe energy prior to neutrino trapping, owing to variations of the progenitor model and electron-capture rates. Thirty-two progenitors were utilized from the WH07 model set of Woosley and Heger [79] for producing the progenitor bars (red) in the figure. Each bar of the electron-capture rate variations derives from simulations where the rates have been systematically scaled by factors of 10, 4, 2, 0.5, 0.25, and 0.1. The horizontal tick represents the value of the reference simulation for the tested Progenitor + EOS combination. The window ranges are chosen so that the progenitor sensitivity bars are of equal size across each of the plotted parameters.
Figure 2.11: Lepton fraction band as a function of central density for the Monte Carlo (MC) study described in the text. The band represents data points from seven MC simulations based on the s15WW95+SFHo reference, where the band width represents the min-to-max range of Yl. The rates are varied by drawing from the Q-value dependent residual function shown in Figure 2.2 after binning it into 2.5 MeV energy bins. This binning was chosen as it best tracks the Q-value dependence of the residual distribution.
Figure 2.12: Updated version of Figure 2.1 showing the rate tabulations that have been added since the library's first implementation in [2].
Figure 3.1: A simplified level diagram of ⁶Li based on Ref. [103]. The α decay of the Jπ = 0⁺, T = 1 state at Ex = 3.56 MeV is parity-forbidden, and thus this state mainly decays to the ground state via γ emission.
Figure 3.2: The primary facility beam line layout for the Research Center for Nuclear Physics (RCNP) located at Osaka University, Japan. In this experiment, the ⁶Li beam was transported from the ring cyclotron (shown in green) into the WS hall. It was then transported achromatically to the Grand Raiden target position. See text for details.
Figure 3.3: Top panel: Layout of the Grand Raiden Spectrometer, which was used to momentum-analyze the inelastically scattered ⁶Li nuclei. The spectrometer was configured with seven magnets in the QSQDMDD configuration (where Q, D, S, and M are Quadrupole, Dipole, Sextupole, and Multipole magnets, respectively). Bottom panel: configuration of the spectrometer end station and focal-plane detectors. Figure adapted from [106, 95]. See text for details.
Figure 3.4: The CAGRA array of HPGe clover detectors. CAGRA consisted of two rings of clover detectors, with 8 detector slots at 90 degrees and 4 at 135 degrees. There were also 4 forward slots for LaBr3 detectors at 45 degrees, but they were not used in this work due to their large angular coverage, which made Doppler reconstruction challenging.
Figure 3.5: The range of energies of the 3.56 MeV γ-ray from ⁶Li in the laboratory frame. The γ-ray emitted from the de-excitation of the ⁶Li[0⁺; T = 1; 3.56 MeV] state is 3.56 MeV in the rest frame of ⁶Li. In the laboratory frame this gamma is Doppler boosted due to the momentum of the ⁶Li (β ∼ 0.43), according to its emission angle in the lab frame. Shown in red are the clover crystal positions in the CAGRA array. Each clover has crystals at two distinct azimuthal angles. Thus, the two rings of clover detectors span four separate angular ranges. For the purpose of Doppler reconstruction, the centroid of the clover crystal is used as the gamma interaction point.
Figure 3.6: The analysis software stack that was used in the RCNP CAGRA+Grand Raiden campaign. Dashed blue lines show application and library boundaries. Solid arrows that cross these boundaries represent communication by local memory, shared memory, or disk I/O. See text for details.
Figure 3.7: The image of the sieve slit in the focal plane for 3 separate magnetic rigidity settings (left) and the reconstructed scattering angles at the target using a least-squares fit to Eqs. 3.6 & 3.7 (right).
Figure 3.8: The time-dependence of the Grand Raiden energy calibration for the centroid of the ¹²C[2⁻; T=1; 19.4 MeV] state in the ¹²C(⁶Li, ⁶Li′) excitation energy spectrum. Due to the coincident excitation of the ⁶Li ejectile to its 3.56 MeV state, this isovector state in ¹²C appears at ∼23 MeV (= 19.4 MeV + 3.56 MeV).
Figure 3.9: Rate dependence of the CAGRA clover crystals. Panel (a) shows an example count rate spectrum as a function of time for the CAGRA array. Panel (b) details the rate dependence of the energy calibration offset for a specific HPGe crystal in the array. This trend was established for each crystal and then used to correct the energy calibration on a second-by-second basis. Panels (c) and (d) show the rate-dependent gamma-ray energy and the (corrected) rate-independent gamma-ray energy for the 511 keV line, respectively.
Figure 3.10: The kinematics of the ⁶Li reaction.
Figure 3.11: (a) A comparison of the contribution of the intrinsic energy resolution, opening angle uncertainty, and velocity spread of the ejectile to the overall uncertainty in the Doppler-reconstructed gamma-ray energy for the detection of the 3.56 MeV de-excitation gamma from ⁶Li in CAGRA. The uncertainty in the emission angle of the γ-ray is the dominant component of the overall resolution. (b) The resulting estimated resolution of the reconstructed invariant ⁶Li ejectile given the resolution of the measured de-excitation gamma.
Figure 3.12: Comparison of the simulated reconstruction of the ¹²C[1⁺; T=1; 15.1 MeV] excited state with the measured result. In the left panel, the purple line represents the (simulated) best-case scenario for the missing mass reconstruction of the 15.1 MeV peak, if the invariant ⁶Li[0⁺; T=1; 3.56 MeV] is ignored. By comparing this with the data in the right panel, it is apparent that the data follow a much broader distribution. In order to reproduce the data, the simulated missing mass reconstruction of this state in ¹²C (purple) had to be smeared with Gaussians of a variety of widths. The shape of the smeared state was interpolated and fit to the data using a normalization factor. The best fit was found with a Gaussian smearing of FWHM = 1.75 MeV (shown by the black scatter crosses and red line in the left and right panels, respectively). The smearing that was required suggests that the intrinsic beam energy resolution was 1.75 MeV.
Figure 3.13: Calculated angular distributions for isovector spin-transfer excitations of different orbital angular momentum transfer for the ¹²C(⁶Li, ⁶Li′) inelastic scattering measurement. Each panel is the angular distribution for a particular angular momentum multipole, and the different colored lines show the excitation energy (Q-value) dependence of the angular distribution.
Figure 3.14: The gamma-ray energy and target excitation energy matrix for the ¹²C(⁶Li, ⁶Li′ + γ) reaction. In the top panel, the gamma energies shown are those measured in the detectors, without any transformation applied, and in the bottom panel, the gamma energies have been Doppler reconstructed according to the clover crystal angular positions. Both panels are drawn as a function of the excitation energy in ¹²C, which is calculated via the missing mass formalism. The red arrow in the top panel indicates the width of the ¹²C(15.1 MeV) state owing to the momentum kick that is imparted to the ⁶Li and which is not accounted for in the missing-mass–only reconstruction. This width is analogous to the FWHM of the excitation energy spectrum shown in Figure 3.12. By comparing the two panels, the effect of the Doppler reconstruction can be seen, where the photopeak and Compton edge owing to the 3.56 MeV gamma from ⁶Li are more prominent in the bottom panel after reconstruction. In addition, gamma decay from the strong isoscalar resonance (occurring above 20 MeV in excitation energy) is clearly evident.
Figure 3.15: The Doppler-corrected γ-ray energy gated on the ¹²C[15.1 MeV; T=1] state for emphasis. The 3.56 MeV γ-line from the ⁶Li[0⁺; T=1] excited state is clearly evident. The resolvability of this γ-ray is crucial for tagging isovector spin-transfer reactions in the target nucleus. The inset plot compares the signal-to-noise achieved with the clover detectors to what would be expected were a gamma-ray tracking detector utilized. See text for details.
Figure 3.16: The reconstructed excitation energy spectrum of ¹²C, for different angular bins, when applying the two gates shown in the Doppler-reconstructed gamma energy spectrum (Fig. 3.15). The red spectrum corresponds to the 3.56 MeV photopeak gate, and the blue spectrum corresponds to the sideband, representative of the gamma background in the photopeak gate. The ¹²C[1⁺; T=1; 15.1 MeV] state, which is populated via an isovector spin-transfer reaction, is clearly seen in the photopeak-gated spectra, but not in the sideband data.
Figure 3.17: Comparison of the 12C inelastic scattering singles and coincidence excitation-energy-dependent double differential cross sections for the 12C(6Li,6Li′) and the 12C(6Li,6Li′+γ) reactions, respectively. The coincidence spectrum shown is the result of the subtraction of the differential cross section resulting from a gate on the photopeak of the 3.56 MeV γ-ray and a sideband in the region directly above this gamma (shown as the red and blue cross-hatched regions in Fig. 3.15, respectively). The coincidence spectrum corresponds to the isovector spin-transfer excitations in 12C, whereas the singles spectrum contains all types of transitions but is primarily dominated by isoscalar excitations. See text for details.
Figure 3.18: Double differential cross section for the 12C(6Li,6Li′+γ) reaction as a function of excitation energy and center-of-mass scattering angle, in which the 3.56 MeV state was measured in coincidence. Due to conservation of spin and isospin, this coincidence measurement extracts the isovector spin-transfer response of 12C. In addition, a multipole decomposition analysis (MDA) has been performed utilizing DWBA angular distributions to extract the components of orbital angular momentum transfer. Shown in the right panel are the angular distributions for states in 12C dominated by Gamow-Teller transitions (top panel) and spin-dipole transitions (bottom panel). The absolute scale has a systematic uncertainty of ±20% due to uncertainty in the beam normalization, which is not depicted in the statistical error bars.
Figure 3.19: The mass dependence of the Gamow-Teller unit cross section for 6Li calculated from theoretical estimates of the inelastic GT strength and DWBA estimates for the cross section of several nuclei (black). Filled circles indicate that the underlying GT strengths were estimated via shell-model or normal-modes calculations. A postulated unit cross section mass dependence for heavy-ion reactions (Eq. 3.36) is fit to these theoretical unit cross sections and is shown in the plot (dashed black line). In addition, the experimentally derived Gamow-Teller UCS for the inelastic (6Li,6Li′) reaction established in this work is shown alongside an estimated unit cross section from data available for the (6Li,6He) charge-exchange reaction [135]. The error bar shown for the inelastic UCS includes systematic and statistical uncertainties, while no error on the charge-exchange data was available.
Figure 3.20: Comparison of the achievable signal-to-noise for the 3.56 MeV γ-ray emitted from inelastic scattering of 6Li on 12C (top panel), 24Mg (middle panel), and 93Nb (bottom panel). The Doppler-reconstructed gamma is resolvable in the 12C measurement but not in the 24Mg and 93Nb measurements. The purple and red line-shapes are the simulated response of the clover and tracking detectors, respectively. In the 12C case, the purple line shape is fit to the data and used to estimate the number of counts in the 3.56 MeV peak. The excitation energy was gated on the 15.1 MeV state in 12C, which has a B(GT)~1. Assuming a comparable B(GT) in 24Mg and 93Nb, over an equal-width range in excitation energy, and accounting for the change in the unit cross section with mass, the purple and red line-shapes have been scaled to reproduce the expected number of isovector counts in the other two measurements. This clearly illustrates that resolving the 6Li 3.56 MeV gamma becomes increasingly difficult in heavier systems where the isoscalar giant resonances remain strong. Decay from these resonances contributes a significant background in the region of the 3.56 MeV γ-ray from 6Li, and this background increases in magnitude with increasing nuclear mass.
Figure 3.21: The differential cross sections for the 24Mg(6Li,6Li′) and the 93Nb(6Li,6Li′) singles measurements. The cross sections in both cases are dominated by isoscalar giant resonances, which become larger in heavier mass systems.
Figure 3.22: Mass dependence of the calculated zero-degree differential cross section for the isoscalar giant monopole and dipole resonances (ISGMR and ISGDR, respectively) corresponding to 100% exhaustion of the energy-weighted sum rule. These calculations illustrate the increase in the isoscalar cross section as a function of nuclear mass number and were provided courtesy of Umesh Garg [139].
Figure 4.1: The basic components of an artificial neural network. Each node (shown as a circle in the figure) accumulates inputs modulated by the weights of its inbound connections. After all of the inbound connections have been accumulated into the node, a non-linear activation function, such as a logistic sigmoid, transforms the node's value. The transformed node value is then used as an output along outbound connections to other nodes in the network.
Figure 4.2: (a) A common feedforward neural-network structure with a single hidden layer. x^0_i, x^1_i, and x^2_i correspond to the input, hidden, and output layers, respectively, while w^0_ij and w^1_ij refer to the connection weight matrices between two layers. (b) An unstructured neural network in which the connections are not constrained to specific layers. Such network structure is common in neuroevolution populations. Each connection in (b) is labeled with an integer indicating the lock-free set in which it can be evaluated. See section 4.3.1 for details.
Figure 4.3: An illustration of the structural divergence that can exist in populations of neural networks evolved via neuroevolution. In this case, each network has a fixed number of input and output nodes (3 and 1, respectively).
Figure 4.4: An example neural network and two directed acyclic dependency graphs for the evaluation of the neural network's connections. The left dependency graph includes only logical dependencies, whereas the graph on the right includes both logical and memory dependencies. Each circle in the dependency graphs represents a connection in the top neural network figure. The arrows in the two dependency graphs indicate the dependency relationship between the connections. For example, the arrow between A1 and B1 indicates that connection A1 must be evaluated prior to B1. By inspecting the neural network, this relationship is evident as connection B1 relies on node 1, which in turn relies on connection A1 (as well as other connections).
Figure 4.5: Prototype algorithm for sorting connections into concurrent lock-free sets. For n connections, it scales as O(n^2).
Figure 4.6: The full directed acyclic dependency graphs for two structurally different neural networks. The different symbols in the graph (squares, circles, and triangles) represent the three neural network operations: zero-node, apply-connection, and activate-node, respectively. Even though the two networks have very different structure, the algorithm proposed in this work organizes the network evaluation graphs into evaluation steps that are component-wise parallel. On the right of the figure, each evaluation step is shown with the number of SIMD operations that can be performed simultaneously.
Figure 4.7: Top panel: A representative example population of neural networks consisting of three members. In classic neuroevolution simulations, each of these neural networks is separately evaluated, and the correctness of its outputs is used to judge its fitness. Bottom panel: The same population as in the top panel, but constructed as nodes and connections of a single composite neural network. On the left, the resulting composite network in which each of the constituent sub-networks requires heterogeneous inputs for evaluation. On the right, a composite network in which each of the sub-networks will process the same inputs.
Figure 4.8: NVIDIA CUDA kernels for the proposed concurrent neural network evaluation.
Figure 4.9: GPU speed-up factor when evaluating a population of N evolved neural networks concurrently. Shown is the ratio of the time taken for the sequential evaluation of each network on a single CPU thread to the time for transferring inputs to the device, evaluation on the GPU, and transferring the outputs back to the host. With a population of approximately 65k networks, a performance boost of 50x relative to the serial implementation is observed.
Chapter 1
Introduction
Advances in computational capabilities, astrophysical observations, nuclear theory, and experimental methods over the past two decades have enabled a new era of multidisciplinary research in the fields of nuclear science and computational astrophysics. For example, we now have the capability of probing the impact of specific nuclear processes on large-scale astrophysical events via sensitivity studies, which can direct new experimental and theoretical efforts [1, 2, 3].
Ultimately, the interdisciplinary nature of these efforts is fundamentally self-sustaining. Experimental measurements for all nuclei relevant to a specific astrophysical context are often unfeasible due to the large number of nuclei that are involved. Therefore, nuclear theory must be relied upon for the majority of the relevant nuclear information, where key measurements can help to constrain theoretical calculations. Then, astrophysical sensitivity studies, which probe the impact of specific nuclear reactions on astrophysical simulations, can reveal the key nuclei that should be studied experimentally and theoretically. These new measurements and calculations then feed back into the astrophysical simulations, constraining the uncertainty of the nuclear physics inputs to the astrophysical phenomena. Finally, with high-fidelity astrophysical simulations, predictions about multi-messenger astronomical observations from such events can be made and verified at Earth-based electromagnetic, neutrino, and gravitational-wave observatories.
In addition, the dramatic increase in computational capabilities and the increasingly large amounts of data that have enabled the above cycle are also fueling growth in the fields of artificial intelligence and machine learning. Specifically, the development of novel algorithmic techniques in machine learning, which take advantage of these computational advancements, is beginning to provide new and less expensive solutions to challenging scientific problems [4, 5].
The research described in this work embodies these principal advancements. In the following sections, the motivation for the projects that compose this body of work is outlined. Each subsequent chapter will focus on the specific implementation of what is presented here, as well as the scientific results and conclusions that can be drawn. The three research directions that will be discussed are:
Chapter 2. An investigation into the sensitivity of core-collapse supernovae (CCSNe) to
the nuclear reaction known as electron capture,
Chapter 3. The establishment of a new experimental method, the (6Li,6Li′) reaction, which can provide access to astrophysically relevant neutrino-nucleus reactions,
Chapter 4. The development of many-core algorithms for topologically divergent neural networks, which better enable neuroevolution to compete in the automated machine-learning ecosystem.
While these are distinct focuses, each of these projects principally explores new ways to accelerate the process of scientific discovery: sensitivity analyses can direct experimental and theoretical efforts; new experimental techniques can replace or supplement previously difficult-to-study avenues of research; and novel high-performance automated machine-learning algorithms can reduce the implementation barrier of artificial intelligence in the scientific domain.
1.1 Nuclear weak interactions
The physics motivation of the research presented in this work is largely rooted in the topic of weak interactions with atomic nuclei. The weak interaction is one of the four fundamental physical forces in nature. Weak interactions that involve nuclei include β±-decay (positron/electron emission), forward and inverse neutrino interactions (nuclear capture of neutrinos and heavy leptons, respectively), and neutrino-nucleus scattering.
In a terrestrial context, weak processes play an important role in modern technology. For example, beta-emitting isotopes are commonly used in medical diagnostics, nuclear medicine, imaging (positron emission tomography; PET scans), portable energy generation, natural gas production, and radioactive dating [6].
In astrophysics, however, the influence of the weak interaction is pervasive. This is historically evident in that nearly every new insight that has arisen about the weak interaction over the course of the past century has been coupled with significant changes in our understanding of astrophysical processes [7]. For example, after the existence of the neutrino was proposed by Pauli and the first weak-interaction theory was developed by Fermi [8], it was not long before it was suggested and established that the production of weakly interacting neutrinos is the primary source of energy loss in stars [9]. Following the theory of neutral-current neutrino interactions with nucleons [10], it was recognized that neutrino scattering in core-collapse supernovae would create the phenomenon known as neutrino trapping, which prevents neutrinos from escaping the collapsing core of a massive star [11, 12] and has a dramatic impact on the evolution of the supernova.
The class of weak interactions focused on in this work (those involving interactions with atomic nuclei) is known as semi-leptonic weak interactions, and they fall into two categories: charged- and neutral-current interactions. Charged-current (CC) and neutral-current (NC) weak interactions between subatomic particles are mediated by the W± and Z0 bosons of the standard model, respectively. Both types of interactions are studied in this work, where electron capture (a CC reaction) is discussed in chapter 2, and inelastic neutrino-nucleus scattering (a NC reaction) is discussed in chapter 3.
Whereas the unified model of electroweak theory can accurately predict weak-interaction cross sections involving elementary particles, for semi-leptonic weak interactions involving nuclei the problem is more complex. By treating the weak interaction with nuclei perturbatively, the calculation of weak-interaction cross sections can be reduced to the many-body nuclear-structure problem, well within the domain of nuclear physics. Unfortunately, the difficulties that arise when solving the many-body nuclear-structure problem often introduce large uncertainties into the weak reaction rates that are important for astrophysics [7, 13]. This suggests that experimental input is necessary. While measurements cannot be made for all of the nuclear weak interactions that are important, they can provide invaluable constraints on theoretical models. This will be discussed in detail in chapter 2.
The complication in the nuclear weak interaction (the semi-leptonic case) arises principally from the need for detailed information about the nuclear transitions and states that are involved. However, this relationship also implies that if one is able to attain information about the nuclear many-body configurations and transitions involved in the weak reaction, the weak-interaction cross section and reaction rate can be indirectly constrained. Experimentally, this means that if the direct weak-interaction measurement is difficult or impossible, as is the case for many astrophysically relevant weak reactions, an indirect measurement which probes the same nuclear transitions can be utilized. The ability to indirectly extract the necessary information to constrain weak reaction rates and cross sections forms the basis for the experimental programs discussed in the following two chapters.
Nuclear charge-exchange and inelastic scattering, as shown in Figure 1.1, are two independent accelerator-based experimental techniques which are capable of inducing the same nuclear transitions as CC and NC weak interactions, but via reactions mediated by the strong force. Also shown in the figure are the two weak reactions studied in this work, electron capture and inelastic neutrino-nucleus scattering.
The similarity between reactions mediated by the weak and the strong force is evident when considering the interaction diagrams, also shown in Figure 1.1. In the case of the electron-capture process, the mediating particle is the W+ boson, and in charge-exchange reactions the mediating particles are mesons (such as the π+ meson, among others). In both cases, the interacting nucleus undergoes the same transition between initial and final states, and so both processes probe the same nuclear matrix elements, albeit with separate couplings (weak vs. strong). An analogous argument can be made for the inelastic neutrino scattering and the hadronic inelastic scattering shown in the figure; that is, instead of a mediating Z0 boson, a π0 meson (for example) is exchanged and the same nuclear transition occurs. However, a few subtleties arise in the NC channel that are reserved for discussion in chapter 3.
These indirect techniques are of immense value primarily because the direct measurements are often extremely challenging. For example, very few neutrino inelastic scattering measurements on nuclei have been performed to date because of the very small cross sections involved (~10^-42 mb/sr). This specific example is the primary motivation for the development of the (6Li,6Li′) reaction probe, and is discussed at length in chapter 3. Even so,
[Figure 1.1: four-panel diagram. Columns: Charged-Current (CC) and Neutral-Current (NC); rows: weak interactions (electron capture; neutrino inelastic scattering) and their strong-interaction analogs (charge-exchange; proton inelastic scattering).]
Figure 1.1: Diagrams of the weak interactions studied in this work, and the corresponding strong-force reactions that were utilized to indirectly extract the relevant nuclear transitions. As can be seen in the Charged-Current (CC) column of the diagram, even though nuclear charge-exchange is mediated by a π+ meson instead of the weak W+ boson for electron capture, the operator responsible for the nuclear transition, Ô(GT+), is the same in both reactions. This implies that charge-exchange can be used as a surrogate or indirect measurement of the nuclear matrix element relevant for electron capture. Similarly, in the Neutral-Current (NC) channel, nuclear probes can be used to estimate the response of nuclei to inelastic excitation via neutrinos, if the GT0 component of the M1 operator can be extracted. This is the case for certain proton inelastic scattering measurements, and in chapter 3, 6Li inelastic scattering is shown to be directly sensitive to these weak NC transitions.
measurements of all the astrophysically relevant nuclei for which weak reactions are important are an impossible task. Thus, those nuclei which are the most important in astrophysical simulations should be measured, providing the best possible constraints for the theoretical methods which are needed to estimate the majority of the weak reactions. It is for this reason that in chapter 2 a sensitivity analysis of core-collapse supernovae to nuclear electron-capture rates is described, and the important weakly interacting nuclei are identified.
While chapters 2 and 3 have fundamentally different focuses, together they represent a coordinated effort to constrain the nuclear weak response of nuclei through both the charged- and neutral-current channels. As weak interactions with nuclei are critical components of many astrophysical processes, these constraints strongly impact the field of nuclear astrophysics.
The final chapter of this work departs from the discussion of weak interactions in nuclear astrophysics, and instead focuses on algorithmic advancements in artificial intelligence research. This work aims to lower the accessibility barrier of machine learning, and in chapter 4 evidence will be provided that suggests the application of these methods will improve the applicability of modern neural-network classification techniques in experimental nuclear astrophysics.
1.2 Accessible machine learning via neuroevolution
Many efforts in recent years have been directed toward increasing the accessibility of machine learning, through new university programs, internet-based educational opportunities, as well as the development of high-level programming interfaces that abstract away the mathematical details of building machine-learning models [14, 15, 16, 17]. These efforts have been driven by an increased demand for machine-learning tools and models that can be applied by users with limited machine-learning knowledge, as a result of its numerous successes in a variety of industrial, scientific, and engineering domains.
Unfortunately, while machine learning has the capability to provide novel insights for an extremely diverse set of problem domains, most applications require in-depth knowledge of machine-learning techniques in order to arrive at suitable solutions. The reason for this is that the application of machine learning to any problem domain requires at least three principal choices:
1. The choice of the optimal machine learning model for the application;
2. How to pre-process the data for consumption by the chosen model;
3. How the model training parameters (known as the hyperparameters) should be set for
a particular dataset.
For the general domain scientist or engineer who wishes to focus on their field of expertise, (1-3) can present a significant obstacle [18]. Even if the scope of (1) is limited specifically to artificial neural networks (ANNs; hereafter simply referred to as neural networks), picking the correct network hyperparameters, such as the neural-network topology, learning rate, and others, is a task for which there is no scientifically rigorous method [19]. Instead, hyperparameter searches are often performed manually via rules of thumb and by testing large sets of models with a predefined grid of parameter values [20]. While grid-based searches are effective, they are very computationally expensive, and manual searches often require considerable expertise and can lead to poor reproducibility [21]. As methodological
reproducibility is a core tenet of the scientific method, such implementations of machine learning are not viable in scientific applications.
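The grid-based search described above can be sketched in a few lines. This is an illustrative sketch only: the `train_and_score` callback and the toy scoring function stand in for an actual model-training routine, which in practice dominates the cost.

```python
import itertools

def grid_search(train_and_score, grid):
    """Exhaustively score every hyperparameter combination in `grid`
    (a dict mapping names to candidate values) and return the best."""
    names = list(grid)
    best_score, best_params = float("-inf"), None
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = train_and_score(params)  # in practice: train a model with these settings
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Hypothetical scoring function that peaks at learning_rate=0.1, hidden_nodes=20.
score = lambda p: -abs(p["learning_rate"] - 0.1) - abs(p["hidden_nodes"] - 20) / 100
best, _ = grid_search(score, {"learning_rate": [0.01, 0.1, 1.0],
                              "hidden_nodes": [5, 20, 80]})
```

The multiplicative growth is visible directly: each added hyperparameter multiplies the number of training runs, which is why grid searches become so computationally expensive.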
The essence of the hyperparameter problem can be captured in a rudimentary description of supervised learning with neural networks. Supervised learning is a class of machine-learning algorithms which make use of a set of training data (x, y), where the vector of inputs x is transformed into output estimates ŷ that approximate the true outputs y. The process is supervised by the provision of data where the expected output response is known, and thus the model itself can be adjusted so that the output response is closely representative of the true desired output response. Mathematically, the problem can be simply stated as
F(x, w; C) = ŷ,   (1.1)
where F is the neural network (or other machine learning model), w are learned parameters
of the neural network which are adjusted so that ŷ approaches the true outputs y, and C
are the model hyperparameters (e.g. the fixed structure of the network, the learning rate,
et cetera).
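Equation 1.1 can be made concrete with a minimal feedforward network; the two-hidden-node architecture, flat weight layout, and logistic activation here are illustrative choices, not the networks used later in this work.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def network(x, w, hidden_nodes):
    """F(x, w; C): the hyperparameters C fix the structure (here, the number
    of hidden nodes); w holds the learned weights, laid out flat as
    [input->hidden weights..., hidden->output weights...]."""
    n_in = len(x)
    hidden = [sigmoid(sum(w[h * n_in + i] * x[i] for i in range(n_in)))
              for h in range(hidden_nodes)]
    off = hidden_nodes * n_in
    return sigmoid(sum(w[off + h] * hidden[h] for h in range(hidden_nodes)))

y_hat = network([0.5, -1.0], w=[0.1] * 6, hidden_nodes=2)  # ŷ, a scalar in (0, 1)
```

Training would adjust w to pull ŷ toward y; changing `hidden_nodes` changes C and hence the length and meaning of w, which is exactly why the hyperparameters cannot be learned by the same gradient-based procedure.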
Typically, supervised learning with neural networks follows three steps: a learning phase with training data, a validation phase with testing data, and a prediction phase. In the learning phase, a set of training data is used to solve an optimization problem, where the set of learned parameters is adjusted so that a loss or error function, which represents the difference between the true output response (y) and the inferred output response (ŷ), is minimized. After this process is complete, two conclusions and consequent actions can be reached: (1) if the neural network performs well on the training data (the loss function is sufficiently minimized), it is then validated with the testing data to see if the model has been overfit; or (2) the neural network was unable to sufficiently reproduce the desired output response of the training data (could not minimize the loss function sufficiently). Ascertaining the reason for training failure, as in (2), is a particularly difficult problem to assess, as it may have occurred because of bias in the training data, because the machine-learning model of choice is not well suited to the application, or because the model hyperparameters were ill-chosen, such as the choice of a non-optimal neural-network structure.
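The learning and validation phases above can be sketched with a deliberately tiny model; the one-parameter linear model, learning rate, and loss threshold are hypothetical stand-ins chosen so the example runs in milliseconds, not the networks discussed in this work.

```python
import random

def train(model_w, data, lr=0.1, epochs=200):
    """Learning phase: adjust the learned parameter w to minimize squared error."""
    for _ in range(epochs):
        for x, y in data:
            y_hat = model_w[0] * x                  # a one-parameter linear model
            model_w[0] -= lr * 2 * (y_hat - y) * x  # gradient of (y_hat - y)^2
    return model_w

def mean_loss(w, data):
    return sum((w[0] * x - y) ** 2 for x, y in data) / len(data)

random.seed(0)
truth = lambda x: 3.0 * x
train_data = [(x, truth(x) + random.gauss(0, 0.01)) for x in [0.1 * i for i in range(1, 9)]]
test_data = [(x, truth(x)) for x in [0.85, 0.95]]

w = train([0.0], train_data)
# Validation phase: a small held-out loss suggests the model generalized
# rather than overfitting the noise in the training set.
generalizes = mean_loss(w, test_data) < 0.05
```

If the held-out loss were large despite a small training loss, case (1) above would indicate overfitting; if the training loss itself never shrank, case (2) would apply.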
In this context, the role of a data scientist [22] is clear: identifying the machine-learning models which will generalize well to a given dataset, and knowing how to increase the model complexity in cases of training failure. However, for the domain scientist who wishes to employ machine learning as a tool, these choices are not immediately obvious. Furthermore, the process of trial and error is arduous because of the long feedback cycles: in most applications, the training phase takes a non-trivial amount of time. The potentially long delay in feedback regarding model adjustments makes manually searching the model-complexity space excessively burdensome for new users.
The concept of automated machine learning seeks to remove these complexities by automating the process of model choice, data processing, and hyperparameter tuning [23]. By doing so, automated methods seek to lower the machine-learning implementation barrier for novice users.
One method that is beginning to see more use in the automated machine-learning space is neuroevolution [24, 25]. Neuroevolution is a form of artificial intelligence that utilizes evolutionary genetic algorithms to evolve neural networks for specific applications [26]. Instead of requiring human input to refine a single neural-network model, neuroevolution automatically performs the search process by allowing a population of many neural networks to evolve, where only those networks which improve in performance survive. In this way, manually tuned hyperparameters of a neural network are transformed into learned parameters.
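The evolutionary loop can be sketched as follows. This is a bare-bones illustration, assuming flat weight vectors as genomes and a toy fitness function; real neuroevolution (e.g., the methods discussed in chapter 4) also mutates the network topology itself, which is what produces the structural divergence at issue there.

```python
import random

def evolve(fitness, pop_size=40, genome_len=3, generations=60, seed=1):
    """Mutate a population of candidate genomes; only the fitter half survives
    each generation, and survivors spawn perturbed children."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = [[g + rng.gauss(0, 0.1) for g in rng.choice(survivors)]
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return max(pop, key=fitness)

# Hypothetical task: evolve weights toward a fixed target vector.
target = [0.5, -0.3, 0.8]
fit = lambda g: -sum((a - b) ** 2 for a, b in zip(g, target))
best = evolve(fit)
```

Every generation requires evaluating the fitness of the entire population, which is the cost that motivates the many-core evaluation algorithms of chapter 4.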
The application of neuroevolution as a method for automated machine learning has a few important consequences. Because the neural-network model-complexity search is done across a population of many networks, the speed of convergence to an optimal solution scales with the number of neural networks in the population. The more networks that are employed, the faster the space can be searched. This, however, increases the computational cost significantly. If population sizes are in the tens to hundreds of thousands, the evaluation complexity is large enough to motivate the transition to modern multi- and many-core compute architectures. As will be described in chapter 4, one of the hallmarks of neuroevolution, its ability to evolve neural networks of diverse structure, has so far prevented the development of generalized many-core algorithms which scale well with the network population size. This is primarily because the diversity in the evolved neural-network structure leads directly to divergent evaluation graphs for these networks. On the one hand, many-core architectures, which employ single-instruction-multiple-data (SIMD) processors, perform optimally when execution branching is minimal. On the other hand, evolving neural networks with differing topologies has the opposite effect: execution branching is enhanced.
Presented in chapter 4 is a novel many-core algorithm for the concurrent evaluation of
entire populations of topologically-divergent neural networks. This work represents the first
general-purpose mapping of large numbers of heterogeneous neural networks to many-core
architectures. The achieved evaluation speed-up is a step toward enabling neuroevolution to
play a more competitive role in the automated machine-learning space. Given the abundant
computational resources available in most scientific laboratories, this work has the potential
to lower the barrier of entry for domain scientists interested in achieving meaningful results
from machine learning with neural networks.
Chapter 2
The sensitivity of core-collapse supernovae to nuclear electron capture
The study of interactions mediated via the weak nuclear force is of importance to a large
number of fields in physics. However, it is of particular importance to the field of astrophysics
because of the longer timescale on which weak interactions operate as compared to the
strong and electromagnetic interactions. This is evidenced by the impact that new insights
into weak reaction physics have on astrophysical models [7]. Specifically, electron-capture
reactions play a prominent role in high-density environments such as those found in the late
stages of massive star evolution [27, 28], thermonuclear [29, 30] and core-collapse supernovae
(CCSNe) [31, 32], neutron stars [33, 34], and compact object merger events [35]. Realistic simulations of these environments rely on accurate nuclear-physics inputs, including electron-capture rates.
Electron-capture rates depend sensitively on allowed Gamow-Teller (GT) transition-strength distributions in the β+ direction. These transition strengths characterize nuclear excitations in which a single unit of spin and isospin is transferred (ΔS = ΔT = 1), with no transfer of orbital angular momentum (ΔL = 0).¹ While the main component of electron
capture occurs on the ground state configuration of a nucleus, in high temperature stellar
environments electron captures on thermally-populated excited states of the parent nucleus
can also contribute significantly to the overall rate [36]. Unfortunately, it is difficult to obtain information about transitions from excited states in the laboratory. Compounding the
problem is the fact that in order to accurately include electron capture in simulations, one
must include electron captures on a wide range of nuclei. Hence, in general one must rely on
theoretical models for a complete description of stellar electron-capture rates. On the other
hand, measurements of Gamow-Teller strength distributions in a representative set of nuclei
are important for the development and benchmarking of robust theories. At the same time, it is critical that theoretical and computational efforts provide guidance to experimenters on which measurements to perform.
Presently, configuration-interaction (shell-model) calculations are the primary method
for producing reliable GT strength distributions near stability in the sd- and pf-shells (8 < [N, Z] < 20 and 20 < [N, Z] < 40, respectively) for electron capture on both ground and excited states [37, 38]. Quasi-particle random-phase approximation (QRPA) calculations have also been utilized to estimate GT strengths for large sets of nuclei, but only
where transitions from the ground state are considered [39, 40, 41, 42, 43]. Furthermore,
comprehensive sets of electron-capture rates (as a function of density and temperature) for
a large number of nuclei based on QRPA calculations have not been published.
Direct and indirect experiments, such as β-decay and charge-exchange (CE) measurements respectively, provide robust benchmarks for theoretical GT strengths and therefore
1 Transitions which follow these selection rules are referred to as “allowed” transitions, and those which do not (for example, transitions with ΔL > 0) are known as “forbidden” transitions, which typically proceed at a much reduced rate as compared to allowed transitions.
are crucial to understanding astrophysical electron-capture rates. Unfortunately, electron-capture and β-decay experiments can only access states in a limited Q-value window. Furthermore, for neutron-rich nuclei β-decay only provides information in the β− direction, which is of limited use for electron-capture studies. Intermediate-energy (≳ 100 MeV/u) CE reactions in the β+ direction, however, connect the same initial and final states as electron capture, providing information about transitions up to high excitation energies, and are thus well suited to study the full Gamow-Teller strength distribution of interest. At these energies, CE measurements have been empirically established to be accurate at the ~10% level and are therefore able to provide rigorous tests of theoretical Gamow-Teller strengths and derived electron-capture rates [13].
Recently, the results from (n,p), (d,²He), and (t,³He) CE reactions on nuclei in the pf-shell were systematically compared [13, 44, 45] with shell-model calculations using the KB3G [46] and GXPF1a [47] effective interactions in the pf model space, and with calculations based on the QRPA formalism of Möller and Randrup [40]. The authors compared shell-model and QRPA derived electron-capture rates against those derived from CE measurements. It was found that the QRPA calculations systematically overestimate the electron-capture rates (~100-3000%, depending on density and temperature), whereas the shell-model estimates produce rates similar to those measured experimentally (~1-50%) [13]. Unfortunately, shell-model calculations are computationally challenging for nuclei beyond the pf-shell, and therefore weak rates used in high-density astrophysical calculations most commonly rely on less accurate methods. In each of these cases, systematic and random errors exist, and it is therefore important to understand the sensitivity of astrophysical simulations to uncertainties in these rates.
Sensitivity studies are useful tools for guiding theoretical and experimental efforts because
they highlight nuclei that should be given particular focus, and they indicate the accuracy
with which the parameters of interest need to be known. They also illustrate how strongly
the current parameter uncertainties affect the outcome of the astrophysical simulations. In this chapter, results from a recent sensitivity study are examined in which ~150 collapse simulations were performed with systematic and statistical variations of the electron-capture rates [2]. The impact these rate variations have on the collapse, bounce, and pre-explosion phases of core-collapse supernovae simulations for a range of presupernova progenitors and equations of state (EOS) will be described.
As part of this work, a modular and open-source weak reaction rate library2 was developed for use in astrophysical simulations. For the simulations described in this work it was implemented in the stellar core-collapse code GR1D [48], but has also since been used in the multi-dimensional CCSNe codes FLASH [49, 50] and COCONUT [51, 52].
In the following sections, the inner core of the protoneutron star (PNS) and the observable peak neutrino luminosity from core bounce are shown to depend sensitively on the electron-capture rates of neutron-rich nuclei (+16/−4% and ±20%, respectively). As variations on this level are not easily reproduced from uncertainties in other inputs to the simulations, they motivate the development of new theoretical models for electron-capture rates as well as relevant measurements, which together will constrain these and other key parameters discussed in this work.
2 http://www.jinaweb.org/weakrates
2.1 Astrophysical weak interaction rates
The weak nuclear force is important for a number of processes that influence the evolution
of massive stars [7]. For example, interactions mediated by the weak force are important
ingredients for nucleosynthesis and also for the internal structure of evolving stars, as they
sensitively determine the electron-to-baryon ratio Ye and the iron-core mass just prior to core-collapse [27]. Unlike the conditions present during quasi-static stellar evolution, however,
in the core of a collapsing star the density and temperature are high enough that nuclear
and electromagnetic reactions equilibrate [53]. This is not the case for weak reactions which
operate much more slowly and thus continue to affect the nuclear composition, the neutrino
emission, and ultimately the dynamics of the entire event.
As compared to other semi-leptonic weak interactions (weak interactions with nuclei),
electron capture has a particularly remarkable impact on the core-collapse environment [54].
In the final stages of a star’s life, the nuclear-energy generation rate of the core that normally
sustains a star against gravitational collapse is absent because the core is composed of highly
stable iron-group nuclei. Instead, at these late times the electron-degeneracy pressure provides the primary stability against collapse. It is therefore apparent that electron captures
that remove electrons from the system will have dramatic consequences for this environment.
Furthermore, the electron chemical potential µe is sufficiently large to overcome Q-value3
restrictions, and so the electron-capture rates are significant.
Just prior to and during the early moments of collapse, other weak interactions can also play a role. Martínez-Pinedo et al. [55] have shown that β decay can temporarily compete with electron capture when Ye = 0.42–0.46, which can occur during Si shell burning and the early stages of collapse. However, as collapse ensues, µe quickly becomes large enough that β-decay electrons are energetically blocked due to degeneracy. Similarly, β+ decay can also compete with electron capture for nuclei with Qe+ > 2me c², but in the core-collapse environment neutron-rich conditions are favored, and Qe+ is below this threshold.
3 The Q-value refers to the rest-mass energy difference between the initial and final nucleus, Q = Mi − Mf, and describes either the energy required for the reaction to occur (Q < 0) or the energy deposited by the reaction (Q > 0). For stellar electron capture on neutron-rich nuclei, Q < 0.
The importance of reactions mediated by the weak nuclear force, and specifically electron
capture, as it pertains to core-collapse was first demonstrated by Bethe et al. [56]. Not long
after, the theory of stellar electron capture was formalized by Fuller, Fowler, and Newman
(FFN) [57]. In their pioneering work they published the first tabulation of weak interaction
rates (β±-decay and e±-capture) considering presupernova conditions where allowed Fermi
and Gamow-Teller (GT) transitions dominate. Since then, advancements in computational
resources have allowed for detailed nuclear shell-model calculations that have increased the
accuracy of the weak-interaction theory first outlined by FFN. Major weak-interaction rate
tabulations that derive from a combination of experimental data and shell-model effective interactions are the Oda et al. [37] and Langanke and Martínez-Pinedo [36] tabulations for sd- (A = 17–39) and pf-shell (A = 45–65) nuclei, respectively. For heavier nuclei, where full shell-model calculations are computationally unfeasible, the Shell Model Monte Carlo (SMMC) approach has been employed, which preserves nuclear properties in very large model spaces. Langanke et al. [38] have combined this method with an RPA technique to estimate electron-capture rates at densities and temperatures relevant during core-collapse for nuclei in the
pfg/sdg-shell (A = 65–112), which have come to be known as the LMSH rates. Juodagalvis
et al. [58] have also produced a set of more than 2200 additional rates based on the same RPA
technique but utilizing a Fermi–Dirac parameterization instead of the more computationally
expensive SMMC calculations. The individual rates were not released, but instead these
rates were averaged over NSE abundances and reported along a characteristic core-collapse
(ρ, T, Ye) trajectory. However, such a prescription is not suitable for sensitivity studies in
which the detailed response of simulations to individual nuclei is desired.
The rate tabulations listed in Table 2.1 have been implemented into the weak rate library
used in this work. Together these tabulations contain 445 rates for 304 unique nuclei over a
large density and temperature grid. This library has been built as a standalone module and
has also been implemented into the neutrino-interaction library NuLib [59] for use in neutrino-transport routines employed by the spherically-symmetric, general-relativistic stellar-collapse code GR1D [59]; see Section 2.3 for more information. Details on the density and temperature
range for each of the included rate tabulations are shown in Table 2.1. The mass coverage
of each rate table is shown in Figure 2.1.
Table 2.1: Density, temperature and mass ranges for the compiled weak rate set

Table   | Model space | T (GK)      | Log10(ρYe [g cm⁻³]) | Ref.
FFN     | sd, pf      | 0.01 - 100  | 1.0 - 11            | Fuller et al. [57]
ODA     | sd          | 0.01 - 30   | 1.0 - 11            | Oda et al. [37]
LMP     | pf          | 0.01 - 100  | 1.0 - 11            | Langanke et al. [38]
LMSH    | pfg/sdg     | 8.12 - 39.1 | 9.22 - 12.4         | Hix et al. [31], Langanke et al. [60]
Approx. | any         | -           | -                   | Langanke et al. [38]
The LMP+LMSH rates were first implemented into a spherically-symmetric core-collapse
simulation by Hix et al. [31]. They compared simulations with this set of shell-model based
electron-capture rates against simulations that utilized the Bruenn [61] prescription for electron capture. The evolution of the core-collapse phase and the structural differences in the core at bounce seen in that work were significant. In light of the differences that exist between theoretical estimates for electron-capture rates and those inferred from CE experiments, these results motivate the need for a detailed sensitivity study.
Figure 2.1: Chart of the nuclear species depicting the nuclei included in each rate tabulation as well as the full reach of the weak rate library presented in this work. The table to which a species belongs is given by the color and legend in the figure. The Oda set contains rates for lower-mass sd-shell nuclei (light blue), the LMP set contains rates for the intermediate-mass pf-shell nuclei (green), and the LMSH set contains rates for the heavier-mass pfg/sdg-shell nuclei near stability (red). The FFN tabulation provides rates across the sd- and pf-shells (dark blue). Squares individually bordered in black are stable nuclei. The tables are mutually exclusive except for FFN, which spans many nuclear shells. To distinguish between nuclei with rates from FFN and another table, the border of the FFN set has been outlined with a black and white line.
To handle the large number of nuclei not included in the tables, Hix et al. [31] utilized an
average electron-capture neutrino emissivity for all nuclei which lacked a shell-model based
rate. In this work, instead of performing averaging, the approximate routine of Langanke
et al. [38] is employed, which is based on the parameterization of the electron-capture rate
as a function of the ground state to ground state Q-value. This approximation was first
described by Fuller et al. [62] and was later parameterized and fit to shell-model calculations
in the pf-shell by Langanke et al. [38]. In this approximation, the electron-capture rate is

    λEC = (ln 2 · B/K) (T/me c²)⁵ [F4(η) − 2χ F3(η) + χ² F2(η)],    (2.1)

and the neutrino-energy loss rate is

    λνe = (ln 2 · B/K) (T/me c²)⁶ [F5(η) − 2χ F4(η) + χ² F3(η)],    (2.2)

where me is the electron mass, K = 6146 s, Fk are Fermi integrals of rank k and degeneracy η, χ = (Q − ΔE)/T, η = χ + µe/T, and T and µe are the temperature and electron chemical potential. B (= 4.6) and ΔE (= 2.5 MeV) are fit parameters taken from Langanke et al. [38] and respectively represent effective values for the transition strength and energy difference between final and initial excited states.
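To make the parameterization concrete, the sketch below evaluates Eq. 2.1 with numerically computed Fermi integrals. This is an illustrative stand-alone implementation, not the routine used in the weak-rate library; the function names and sample conditions are invented for this example, while B, ΔE, and K take the values quoted above.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import expit  # expit(x) = 1/(1 + e^-x); avoids exp overflow


def fermi_integral(k, eta):
    """Fermi integral F_k(eta) = integral_0^inf x^k / (1 + exp(x - eta)) dx."""
    val, _ = quad(lambda x: x**k * expit(eta - x), 0.0, max(eta, 0.0) + 60.0)
    return val


def ec_rate_approx(T, mu_e, Q, B=4.6, dE=2.5, K=6146.0):
    """Approximate electron-capture rate of Eq. 2.1, in s^-1.

    T, mu_e, Q, and dE are in MeV; B and dE are the fit parameters
    of Langanke et al. [38].
    """
    me_c2 = 0.511  # electron rest-mass energy in MeV
    chi = (Q - dE) / T
    eta = chi + mu_e / T
    phase = (fermi_integral(4, eta)
             - 2.0 * chi * fermi_integral(3, eta)
             + chi**2 * fermi_integral(2, eta))
    return np.log(2.0) * B / K * (T / me_c2)**5 * phase
```

Since the bracketed phase-space factor equals the integral of x²(x − χ)² over the electron distribution, it is non-negative, and the rate grows steeply with µe as additional capture channels open.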
Figure 2.2 compares the rate estimates from this approximation and from the shell-model rate tabulations in Table 2.1. As is easily seen from the figure, the variance of the shell-model rates depends sensitively on the density of the environment. At lower densities, where the electron chemical potential and electron-capture Q-value are comparable (µe ≈ QEC), the location of excited states in the daughter nucleus, and the associated Gamow-Teller transition
Figure 2.2: Panel (a): Q-value dependence of electron-capture rates at two points along a core-collapse trajectory. The scattered points are tabulated (shell-model and SMMC) rates for each electron-capture reaction, while the black points are the approximate rates given by Eq. 2.1. Panel (b): The residual differences between log10 of the shell-model rates and the approximate rates for each nucleus in the weak-rate library. An example residual is indicated on panel (a). When the density and temperature of a simulation evolve outside the range of the rate tables (see Table 2.1), rates are calculated via the approximate routines in order to avoid an artificial cutoff imposed by the table boundaries. Rates are estimated between density and temperature grid points via monotonic cubic-spline interpolation as described by Steffen [63].
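The monotone interpolation between tabulated grid points can be sketched with SciPy's PCHIP interpolant, which, like Steffen's method, is a monotonicity-preserving cubic; the two schemes differ in their slope formulas, so this is a stand-in rather than the library's exact routine, and the grid values below are invented for illustration.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Hypothetical table slice: log10(rate) vs. log10(rho*Ye) at fixed temperature.
log_rhoYe = np.array([7.0, 8.0, 9.0, 10.0, 11.0])
log_rate = np.array([-8.0, -5.5, -2.0, 1.0, 3.0])

# A monotone cubic avoids the spurious oscillations a plain cubic spline
# can introduce between steeply varying table entries.
interp = PchipInterpolator(log_rhoYe, log_rate)


def rate(log10_rhoYe):
    """Interpolated electron-capture rate in s^-1."""
    return 10.0 ** interp(log10_rhoYe)
```

Interpolating in the logarithms of both the rate and ρYe keeps the scheme well behaved over the many orders of magnitude the rates span.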
strength, sensitively determine the total electron-capture rate for a nucleus. New final states
in the daughter nucleus become accessible as the electron Fermi-energy, which scales with
µe , increases beyond the energy required to populate them via allowed electron capture. The
large scatter of the electron-capture rates at lower densities (Fig. 2.2) is because the Fermi
energy is comparable to the excited state energies of the daughter nuclei and the internal
structure of each nucleus varies significantly. During collapse, as the density increases and the
material becomes more neutron rich due to successive electron captures, both the magnitude
of the average electron capture Q-value and µe increase. However, µe increases more quickly
with density than the reaction Q-values do, and eventually µe ≫ |QEC|, implying that the
majority of the electron-capture channels are open. In this regime the rate is less sensitive to
the excitation energy spectrum of the daughter nucleus, and instead depends more strongly
on the total GT strength across all possible final states. The decrease in the variance of the
shell-model electron-capture rates in the higher density case of Figure 2.2 is a result of this.
While the parameters of Eqs. 2.1 and 2.2 were originally fit from the LMP nuclei, there is reasonable agreement of the approximation with the other tabulated rates. Outside of these tables, significant deviations from the estimates of this approximation may exist, specifically for heavier neutron-rich nuclei [64]. But for the purpose of a sensitivity study, this approximation, from which the majority of the rates are calculated, is used as a base estimate off which the electron-capture rates may be varied. Given this set of rates, it is shown
in section 2.4.2 that the simulations are most sensitive to intermediate mass neutron-rich
nuclei. However, electron-capture rates developed from sophisticated theoretical models do
not exist for individual nuclei in this region, and thus cannot be benchmarked against experimental measurements. Therefore, the estimates provided by the approximation of Eqs. 2.1
and 2.2 may be systematically off by a considerable amount. As will be shown, changes in the
predicted rates for these nuclei have significant consequences for the simulations, motivating
the need for experimental and theoretical efforts to constrain the rates of these species.
2.2 Core collapse and the role of electron capture
Just prior to collapse, the temperature of the stellar core becomes high enough (T ≳ 0.5 MeV) that the photon gas has sufficient energy to photodissociate nuclei into alpha particles and free nucleons. However, the density is also high (ρ ≳ 10⁹ g cm⁻³), resulting in large nuclear
reaction rates that rapidly form nuclei from these light particles. The balance reached
between these competing processes is known as Nuclear Statistical Equilibrium (NSE). If
the entropy is sufficiently low and the mass fraction of free nucleons is small compared to
that of nuclei, the most abundant nucleus in NSE is the species with the highest binding
energy for a given electron-fraction, Ye (= Z/A) [55]. A broad distribution of abundant nuclei
forms due to finite temperatures which distribute the abundances around these peak nuclei.
As collapse ensues and the central density increases through the first few decades, electron
captures are the primary engine of deleptonization. Electrons are removed from the system
and the produced electron neutrinos (νe) are able to freely stream out of the core, decreasing both Ye and the total lepton fraction Yl. As Ye decreases, peak abundances move toward neutron-rich nuclei, and the core begins to cool as νe's carry away energy and entropy. Electron
captures continue to dominate the neutrino transport during collapse until the last few
milliseconds before core bounce. In these final moments, the central density reaches a few times 10¹² g cm⁻³, which is large enough that the neutrino mean free path begins to shorten due to coherent scattering on nucleons and heavy nuclei. This increase in the νe-scattering cross section results in a neutrino diffusion time that exceeds the collapse time, thereby
trapping the electron neutrinos in the inward flow of matter. After this occurs, the conversion
of electrons into electron-neutrinos via electron captures no longer removes leptons from the
core. Instead, further electron captures increase the electron-neutrino fraction Yνe in order
to conserve the now constant lepton fraction and bring the system of electrons and electron
neutrinos into equilibrium.
Prior to the work of Langanke et al. [60] it was believed that electron captures on free
protons were of greater importance than captures on nuclei during collapse. The main
considerations involved were that electron capture on free protons has a higher rate owing
to a smaller-in-magnitude Q-value. Additionally, nuclei with neutron number N ≥ 40 have
full pf -shell single particle states and so the addition of another neutron via an allowed
electron-capture transition would be Fermi-blocked. Langanke et al. [60] recognized that the
many-body nuclear states have mixed configurations and do not follow a simple Hartree-Fock
filling of single particle orbitals. They also suggested that thermal excitation of nucleons to
the g9/2 orbital creates vacancies in the pf -shell, and together with configuration mixing,
electron capture on bound protons is unblocked. Furthermore, because of the low entropy
in the core, and the neutron-rich conditions, the abundance of heavy nuclei is several orders
of magnitude higher than that of free protons, resulting in a higher overall electron-capture
rate. Thus, because electron captures on nuclei dominate, it is of great value to future
experimental and theoretical efforts to investigate the contribution each species has, and to identify which nuclei are most important, as these efforts would provide direct constraints
to CCSNe simulations.
2.3 Codes & Methods

2.3.1 NuLib
In addition to electron-capture rates, other rates are needed to perform core-collapse simulations. The collection of rates used is contained in NuLib [59], an open-source neutrino-interaction library.4 NuLib contains routines for calculating electron-type neutrino/antineutrino charged-current absorption opacities on nucleons with corrections for weak magnetism and nucleon recoil based on the formalism of Burrows et al. [65] and Horowitz [66]. Neutrino emissivities for these processes are determined via Kirchhoff's law, which equates the absorption rate of an equilibrium neutrino distribution to the emission rate of the underlying
matter. Elastic scattering of neutrinos on nucleons, and coherent scattering of neutrinos on alpha particles and heavy nuclei, are also included in NuLib. For the former, corrections for
weak magnetism and nucleon recoil are included, and for the latter, corrections from ion-ion
correlations [67], electron polarization, and the nuclear form factor are employed. Inelastic
scattering of neutrinos on electrons is included based on the expressions of Bruenn [61].
Emissivities of heavy-lepton neutrino/antineutrino pairs via electron-positron annihilation
and nucleon-nucleon Bremsstrahlung are computed ignoring final state neutrino blocking.
For neutrino-antineutrino annihilation, instead of computing the non-linear absorption opacity during the simulation, an effective absorption opacity is used which has been shown to be an excellent approximation for core-collapse supernovae [59].
4 http://www.NuLib.org
2.3.2 GR1D
The electron-capture rate implementation described in the previous sections is used to study
the sensitivities of the core-collapse phase to these rates using the code GR1D [48, 59]. GR1D
is an open-source, spherically-symmetric, general-relativistic neutrino-transport and hydrodynamics code used for studying stellar collapse and the early stages of a core-collapse supernova. For details of the hydrodynamics module of GR1D see reference [48]. The neutrino transport is handled through a general-relativistic, energy-dependent two-moment formalism
for which extensive details can be found in O’Connor [59]. The employed scheme numerically
solves for the time evolution of the first two moments of the neutrino distribution function:
the neutrino energy density and the neutrino momentum density. The simulations utilize 18
energy groups logarithmically spaced between 0 and 250 MeV. Only electron type neutrinos
are evolved until the central density reaches 10¹² g cm⁻³, after which electron anti-neutrinos
and a characteristic heavy lepton neutrino are included. However, these latter two neutrinos
do not become important until core bounce has occurred. Spatial fluxes of the neutrino
moments are treated explicitly. Inelastic neutrino-electron scattering is handled explicitly
until the central density reaches 10¹² g cm⁻³, at which point an implicit treatment is used.
Simulations in spherical symmetry (one dimension), such as those facilitated by GR1D, afford the computational resources needed for the inclusion of a highly detailed and complete set of microphysics. The same is not true for multidimensional simulations, as the
computational requirements are more extensive, forcing the inclusion of only a small subset
of the physical interactions. Fortunately, the core-collapse and early post-bounce phases of
core-collapse supernovae are well represented in one dimension due to the lack of turbulent
convection in the core. Thus, together NuLib and GR1D are able to provide a robust and
extendable code base for CCSNe simulations with state of the art microphysics.
2.3.3 Neutrino emission via electron capture
Electron capture is associated with the emission of electron neutrinos, and so the electron-capture rate is proportional to the integrated spectrum of νe emitted per second. The rate for a particular nuclide, as tabulated in the implemented rate tables, is defined as the sum of the rates for each of the individual nuclear transitions,

    λ = Σij λij,    (2.3)

where indices i and j correspond to levels in the parent and daughter nucleus respectively.
The spectra of emitted neutrinos from the electrons capturing on nuclei, described by the matter temperature T and electron chemical potential µe, will vary based on the initial and final states involved owing to a different reaction Q-value,

    Q^EC_ij = Qg.s. + Ei − Ej,    (2.4)

where Qg.s. is the atomic mass difference of the initial and final nuclei, and Ei and Ej are the excitation energies of the populated states in the parent and daughter nucleus respectively.
The most comprehensive solution to constructing neutrino spectra would be to coherently sum the spectra of neutrinos emitted from each nuclear transition. However, this would rely upon rate tabulations for individual transitions, which are not presently available. Thus, an effective neutrino spectrum is implemented in terms of a single reaction Q-value, q, that is chosen to constrain the average energy of the spectrum to match that from the tabulated rates [68],

    n(Eν, q) = (1/N) Eν² (Eν − q)² / (1 + exp[(Eν − q − µe)/kT]),    (2.5)

    ⟨Eν⟩ = ∫₀^∞ Eν n(Eν, q) dEν / ∫₀^∞ n(Eν, q) dEν = λν / (λEC + λβ+),    (2.6)

where n(Eν, q) is the neutrino distribution function and is normalized (via N) to the total electron-capture rate for a particular nuclear species. λν, λβ+, and λEC are the neutrino energy loss, positron emission, and electron-capture rates respectively. Eq. 2.6 is solved numerically for the effective Q-value, q = qeff, which then defines the effective neutrino spectrum for the electron-capture reaction of interest at a given ρ, T, and Ye. The approximate neutrino spectra generated in this way are unable to reproduce complex structure such as double peaking in the true neutrino distribution, which may occur when there is a resonant allowed transition (Q^EC_ij ≈ 0) between an excited parent state and the daughter-nucleus ground state. However, it approximates singly-peaked neutrino distributions quite well [68]. The spectrum is normalized to the total electron-capture rate via Gauss–Legendre quadrature with an adaptive algorithm, developed in this work, that adjusts the range of integration to the full width of the spectrum.
Utilizing these spectra, the electron-capture neutrino emissivity for a given nuclear species is calculated as

    ηi(Ek) = (1/4π) Ek ni n(Ek, qeff),    (2.7)

where the νe's are assumed to be emitted isotropically, ni is the number density for the i-th nucleus, n(Ek, qeff) is the neutrino spectrum evaluated at the effective Q-value that solves Eq. 2.6, and Ek indicates the energy of energy group k. Evaluation of the emissivity is done pointwise at the centroid of each energy bin, and has units erg/(cm³·sr·s·MeV).
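Given qeff, evaluating Eq. 2.7 pointwise on an energy-group grid is straightforward. In the sketch below the group centroids mimic the logarithmic GR1D grid of Section 2.3.2, while the number density, normalization, and thermodynamic values are placeholders rather than tabulated physics.

```python
import numpy as np
from scipy.special import expit


def spectrum(E, q, mu_e, kT):
    """Unnormalized effective nu_e spectrum of Eq. 2.5."""
    return E**2 * (E - q)**2 * expit((q + mu_e - E) / kT)


def emissivity(E_k, n_i, q_eff, mu_e, kT, norm=1.0):
    """Pointwise emissivity eta_i(E_k) of Eq. 2.7 (per steradian).

    `norm` stands in for the normalization of n(E, q_eff) to the total
    electron-capture rate, done in the library by adaptive Gauss-Legendre
    quadrature.
    """
    return E_k * n_i * norm * spectrum(E_k, q_eff, mu_e, kT) / (4.0 * np.pi)


# 18 logarithmically spaced group centroids between ~1 and 250 MeV.
E_k = np.geomspace(1.0, 250.0, 18)
eta = emissivity(E_k, n_i=1.0e30, q_eff=-12.0, mu_e=15.0, kT=1.2)
```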
For estimates of the NSE number densities used above, several EOS from Hempel and Schaffner-Bielich [69] were used. In particular, the SFHo EOS and internally consistent NSE distribution developed by Steiner et al. [70] was the primary EOS employed. Results were also compared against the DD2 [71] and TMA [72] EOS, each with self-consistent, but different, NSEs. The SFHo and DD2 EOS were chosen because they currently best satisfy both
NSEs. The SFHo and DD2 EOS were chosen because they currently best satisfy both
nuclear and astrophysical constraints [73]. Instead of meson self-interactions, the DD2 EOS
implements density-dependent meson-nucleon couplings which have been used successfully
to describe nuclear structure in a wide region of the nuclear chart and have also been tested
in heavy-ion collisions [71]. For nuclear masses, the SFHo and DD2 NSE distributions rely on the Finite-Range Droplet Model (FRDM) from [74] and [75], whereas the TMA EOS utilizes
a mass table calculated by Geng et al. [76]. Both mass tables incorporate experimentally
determined masses from Audi et al. [77] and only resort to theoretical estimates where no
experimental measurements are available. For consistency, in addition to NSE abundances,
these mass distributions are also utilized in the calculation of reaction Q-values for use in
Eq. 2.1.
2.4 Sensitivity study

2.4.1 Reference simulations
In order to establish reference simulations off which variations are performed, the widely studied 15 solar-mass, solar-metallicity progenitor star s15WW95 [78] was used, as well as s12, s20, and s40 from WH07 [79], which span the range of stellar compactness ξ2.5 [80] in this model set. The core compactness, ξM, is the ratio of the mass enclosed M to its enclosing radius, and for this model set, s12 (s40) has the least (most) compact core. This variety in compactness is an important factor for assessing the global sensitivity of core-collapse supernovae to detailed microphysics, as progenitors with similar compactness will likely have similar sensitivities to microphysical variations. More details on the progenitor model set utilized in this work can be found in the progenitor sensitivity subsection 2.4.3.1. For each simulation the SFHo EOS is utilized, and in addition, for simulations with the s15WW95 progenitor star the DD2 and TMA EOS, and NSE distributions, are employed.
A full complement of neutrino-interaction microphysics is incorporated via NuLib in each
reference simulation, which includes the newly implemented weak rates library described
here. The weak-rate tables were included using the following priority hierarchy: LMP > LMSH > Oda > Approx., ensuring that rates from sources with higher priority are utilized where rate estimates from multiple sources exist. Approx. indicates the parameterized rate approximation of Eqs. 2.1 and 2.2, which is used for nuclei not included in the tables and for regions of density and temperature which are beyond the limits found in Table 2.1. For consistency, only tables that derive from shell-model calculations are utilized.
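The priority hierarchy can be sketched as a simple ordered lookup. The data structures and nuclide keys below are hypothetical stand-ins for the library's internals, not its actual interface.

```python
# Ordered from highest to lowest priority, as described in the text.
PRIORITY = ("LMP", "LMSH", "ODA")


def select_rate_source(nuclide, tables):
    """Return the highest-priority table containing `nuclide` (a (Z, A) pair),
    falling back to the parameterized approximation of Eqs. 2.1 and 2.2."""
    for name in PRIORITY:
        if nuclide in tables.get(name, ()):
            return name
    return "Approx."


# Hypothetical coverage: 56Co in LMP, 25Mg in ODA.
tables = {"LMP": {(27, 56)}, "ODA": {(12, 25)}}
```

With this coverage, select_rate_source((27, 56), tables) returns "LMP", while any nuclide absent from every table falls through to "Approx.".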
For each progenitor and EOS, collapse simulations are performed in GR1D and follow the evolution until at least ~100 ms after bounce. The collapse proceeds as described in Section 2.2. Differences in the collapse evolution for different progenitors stem from the hydrostatic
conditions in the cores of these massive stars at the onset of collapse. For stars with large ξ2.5, larger central5 temperatures are needed to balance gravity. This gives lower central densities, and therefore less electron capture during the final stages of stellar evolution. The range of initial central Ye goes from ~0.422 for the s15WW95 model to ~0.447 for the s40WH07 model, or a range of ~6%. After neutrino trapping sets in, a range of trapped lepton fraction of ~0.288–0.297 is observed, where s40WH07 and s12WH07 have the minimum and maximum trapped Yl, respectively. The overall higher deleptonization rate for the more compact progenitors is due to both longer collapse times and larger matter temperatures, which enhance the electron-capture rates.
Simulations utilizing different EOS, while holding all else constant, demonstrate only
small variations in the density, temperature, and Ye central-zone trajectories up to bounce.
Figure 2.3 details the abundance distributions for each EOS, as well as the resulting average electron-capture rate along a collapse trajectory. The NSE distributions of all three EOS are largely similar early on, but differences in the mass table of the TMA EOS cause it to diverge from the others starting around 1–2 × 10¹² g cm⁻³. However, differences are seen in the electron-capture rate only after central densities of 2 × 10¹² g cm⁻³, where any effect on
the evolution is suppressed because of neutrino trapping. Near nuclear saturation density,
however, the di↵erences in EOS begin to play a more important role. The density-dependent
couplings of the DD2 EOS, for instance, result in higher central temperatures at bounce.
However, since the average rate in simulations utilizing each of the EOS are nearly identical,
they result in a di↵erence of trapped lepton-fraction of only a fraction of a percent. For
more information on the sensitivity of the electron and lepton fractions to the EOS during
⁵Central used in this context refers to the innermost zone of the core-collapse simulation.
Figure 2.3: The average nuclear mass (divided by two), charge, and electron-capture rate versus central density for the three EOS utilized in this study. The colors indicate different EOS, while the line styles indicate which quantity is plotted. All three EOS have nearly identical abundance distributions up to densities of 2 · 10¹² g cm⁻³. Beyond this point the TMA EOS has a heavier and slightly more neutron-rich mass distribution compared to both SFHo and DD2, but maintains a comparable average electron-capture rate overall. These simulations each utilize the s15WW95 progenitor.
collapse, see Fischer et al. [73].
Together, these reference calculations span a wide range of progenitor and EOS dependences that ensure a configuration-independent assessment of the core-collapse sensitivity to electron capture on nuclei, and furthermore demonstrate the universality of collapse. In what follows, results are discussed for variations on the s15WW95+SFHo reference simulation, but any significant differences relative to variations on the other progenitor+EOS reference simulations are pointed out.
2.4.2 Species-dependent sensitivity
To understand the sensitivity of core collapse to different regions of electron-capturing nuclei, the central-zone collapse profile from the reference simulation is used to decompose the change of the electron fraction with time, Ẏe, into the electron captures of each nuclear species. While using only the central zone is an approximation, it is justified by the observation of Liebendörfer [81] that the electron-fraction profiles typically correlate quite well with density during the collapse phase; matter at a given density will therefore generally have the same electron-capture history. The estimated rate of change of the electron fraction with time, Ẏe, accounts for νe re-absorption in an energy-dependent way,
\dot{Y}_e^i = -\frac{4\pi\alpha}{\rho N_A}\,\sum_k \Delta\epsilon_k\,\epsilon_k\,\eta_i(\epsilon_k)\left(1 - \frac{E_k}{B_k}\right) \qquad (2.8)

where Ẏe^i is the time derivative of the electron fraction due to electron captures on the ith nuclear species, α accounts for the general relativistic time dilation, N_A is Avogadro's constant, ρ is the density, ε_k is the energy of the kth energy bin, Δε_k is the kth energy bin width, η_i is the emissivity of species i, and (1 − E_k/B_k) is the neutrino blocking factor that
accounts for re-absorption as collapse approaches weak equilibrium. Along with a hydrodynamical correction due to advection of electrons into the central zone, the time integral of Eq. 2.8, summed over all nuclei, reproduces the full time-dependent Ye profile of the central zone during collapse, indicating that electron captures on heavy nuclei singularly drive the deleptonization of the central zone.
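As a rough numerical sketch of the discretized sum in Eq. 2.8 (the function, argument names, and units here are illustrative, not taken from the GR1D source):

```python
import numpy as np

def ydot_species(eps, deps, eta_i, blocking, rho, alpha):
    """Sketch of Eq. 2.8: rate of change of Y_e due to electron
    captures on the i-th nuclear species.

    eps      -- neutrino energy-bin centers epsilon_k
    deps     -- bin widths Delta-epsilon_k
    eta_i    -- emissivity eta_i(epsilon_k) of species i, per bin
    blocking -- neutrino blocking factors (1 - E_k/B_k), per bin
    rho      -- matter density (g cm^-3)
    alpha    -- general-relativistic lapse (time-dilation) factor
    """
    N_A = 6.02214076e23  # Avogadro's constant (mol^-1)
    # Minus sign: electron capture removes electrons, so Y_e decreases.
    return -4.0 * np.pi * alpha / (rho * N_A) * np.sum(
        deps * eps * eta_i * blocking)
```

Because Eq. 2.8 is linear in the emissivity, summing `ydot_species` over all species with their individual η_i reproduces the total Ẏe, which is what permits the species-by-species decomposition described here.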
With this method, the deleptonization history due to each nucleus can be individually
investigated. At these densities and temperatures, NSE diversifies the abundant nuclei,
ensuring that no single nucleus dominates the deleptonization. There are, however, subsets
of nuclei that contribute more than others to the reduction of Ye . A nuclear-mass dependence
can be studied by binning the contribution to |Ẏe| from each nuclide into nuclear-mass bins and tracking the evolution of each region up to neutrino trapping. Figure 2.4 plots the deleptonization rate in the core for different nuclear-mass bins, as the central Ye progresses from its progenitor value to its value when weak equilibrium is achieved, just prior to bounce. Early on, before the collapse becomes strongly dynamical, nuclei in both the mass ranges 25 < A ≤ 65 and 65 < A ≤ 105 dominate the evolution, as seen by the red and light blue curves in Figure 2.4. Unfortunately, the most precise electron-capture rate estimates fall below this region and instead, the rates are set primarily by the approximation of Eq. 2.1.
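The mass binning just described amounts to a simple aggregation; a minimal sketch, with bin edges taken from the legend of Figure 2.4 and everything else illustrative:

```python
import numpy as np

def bin_by_mass(A, ydot, edges=(5, 25, 65, 105, 145, 185, 225)):
    """Sum |Ydot_e^i| into nuclear-mass bins.

    A     -- mass number of each species
    ydot  -- per-species contribution Ydot_e^i (sign is discarded)
    edges -- bin boundaries: bin j covers edges[j] < A <= edges[j+1]
    """
    A = np.asarray(A)
    ydot = np.abs(np.asarray(ydot))
    sums = np.zeros(len(edges) - 1)
    for j, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        sums[j] = ydot[(A > lo) & (A <= hi)].sum()
    return sums
```

Tracking these binned sums along the collapse trajectory yields the contours of Figure 2.4.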
It is also useful to identify the specific nuclei that have the largest integrated contribution to core deleptonization up to neutrino trapping. Shown in Figure 2.5 are the 500 nuclei with the largest integrated |Ẏe| from t = 0 to the trapping time, when densities are in excess of 2 · 10¹² g cm⁻³. This reveals the channel through which the bulk of electron captures
Figure 2.4: The contribution of nuclear electron capture to the change of the matter electron fraction with time. The contours are the binned sums of |Ẏe| for each species in several mass regions. For reference, the central density at t − tb = −20, −10, −5, −2, and −1 ms is 1.41 · 10¹¹, 4.06 · 10¹¹, 1.42 · 10¹², 8.49 · 10¹², and 3.30 · 10¹³ g cm⁻³, respectively.
Figure 2.5: Top 500 electron-capturing nuclei with the largest absolute change to the electron fraction up to neutrino trapping. The color scale indicates |Ẏe| integrated up to the trapping time, occurring when ρc ~ 2 · 10¹² g cm⁻³, such that the total electron fraction at this point is equal to its initial value less the sum of ΔYe^i, the plotted quantity, over all nuclides. Calculations are based on the s15WW95+SFHo reference simulation. The black contour that runs parallel to the valley of stability on the neutron-rich side is the boundary between measured [77] and theoretical masses used in the approximate rate estimates of Eqs. 2.1 and 2.2. The rectangular outline indicates the size of the sampling region used in the statistical resampling study, and also the set of nuclei which exhibited the largest changes to the simulations when excluded from the electron-capture calculations.
operate. The central electron fraction at the trapping density is reproduced by subtracting the sum of this quantity over all nuclear species from the initial electron fraction,
Ye (t = ttrapping ) ' Ye (t = 0)
where
X
Yei ,
(2.9)
i
Yei is shown in Figure 2.5, and the component of Ye due to advection of electrons
into the central zone (otherwise making this relation exact) is left out for simplicity. Within
the pf g and sdg-shells the primary contributors to the deleptonization phase of collapse are
neutron rich nuclei near the N=50 and N=82 closed neutron shells (see the dashed vertical
lines in the figure).
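The bookkeeping of Eq. 2.9 is a single subtraction; a trivial sketch with made-up numbers (the function name is illustrative):

```python
def ye_at_trapping(ye_initial, delta_ye_per_species):
    """Sketch of Eq. 2.9: the central electron fraction at trapping is
    approximately the initial value minus each species' integrated
    contribution (the advection correction noted in the text is omitted,
    which is why the relation is approximate)."""
    return ye_initial - sum(delta_ye_per_species)
```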
To confirm these results, they are gauged against the sensitivity of the collapse phase to localized groups of nuclei by employing a statistical resampling technique in which sets of nuclei are removed from the simulation. This method is based on well-known statistical resampling methods such as bootstrap and jackknife resampling [82]. Specifically, a rectangular region centered on a nucleus and spanning all nuclei within ±3 isobars and ±5 isobaric chains is removed from the calculation of the electron-capture neutrino emissivity. An example of such a removed region is drawn on Figure 2.5. This technique is employed in 48 simulations with resampling performed uniformly across the nuclear chart.⁶ Using this technique, the simulations are found to be most sensitive to nuclei in the mass range 74–84 with Z/A (= Ye) between 0.36 and 0.44, corresponding to nuclei near ⁷⁸Ni, ⁷⁹Cu, and ⁷⁹Zn. These results agree with the Ẏe calculations performed above, and indicate that species near the N=50 magic number have the largest contribution in magnitude to the change in the electron fraction overall. The impact of removing these species from the simulation corresponded to a change
⁶The resampling regions were allowed to overlap, but were chosen such that uniform coverage across the nuclear chart was achieved in the resampling study.
of inner-core mass at bounce of ≳10%, whereas resampling in other regions resulted in
variations of only a few percent.
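The exclusion step of the resampling study can be sketched as a rectangular mask on the (N, Z) chart. The mapping of the ±3/±5 window onto neutron and proton number below is an assumption for illustration, as are all names and the toy rate values:

```python
def excluded(N, Z, N0, Z0, dN=5, dZ=3):
    """True if nuclide (N, Z) falls inside the rectangular resampling
    region centered on (N0, Z0).  The half-widths dN, dZ are an
    illustrative reading of the +/-3 isobar, +/-5 isobaric-chain window."""
    return abs(N - N0) <= dN and abs(Z - Z0) <= dZ

# Zero the electron-capture emissivity of every nuclide in the region,
# leaving all others untouched (toy rates keyed by (N, Z)):
rates = {(50, 28): 1.0e3, (51, 29): 2.0e3, (70, 40): 5.0e2}
masked = {nz: (0.0 if excluded(*nz, N0=50, Z0=28) else r)
          for nz, r in rates.items()}
```

Repeating this for region centers tiled uniformly over the chart, and rerunning the collapse for each masked rate set, reproduces the 48-simulation scheme described above.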
The electron-capture rates for these nuclides rely entirely on the approximation of Eqs. 2.1 and 2.2, which were fit originally to rates of lower-mass mid-shell nuclei near stability. Therefore, in the region indicated by the above two studies, the approximation is largely uncertain and may be systematically off by a significant amount. For instance, these estimates do not account for nuclear structure effects that may occur near the N=50 closed neutron shell. Depending on the nuclear configurations, thermal excitations, and increasing dependence on forbidden transitions, Pauli blocking⁷ may considerably reduce the electron-capture rates in this area. Given that the change of inner-core mass at bounce was largest when the rates of these nuclides were decreased to zero, as compared to any other set, and that without any evaluative measurements the uncertainties in these rates remain large, experimental and theoretical work should focus here. Any substantial changes to the electron-capture rate estimates for these nuclei will likely have a relatively large impact on simulation predictions for the PNS formation, and will therefore help to constrain important collapse and pre-explosion phase quantities.
2.4.3 Systematic variations
To study the strongest impact of variations in the electron-capture rates, simulations are performed in which the rate for each A > 4 nuclide is systematically scaled by factors of 10, 4, 2, 0.5, 0.25, and 0.1. In this way, the structure of the rates as seen in Figure 2.2 is preserved (the
⁷Pauli blocking arises as a result of the Pauli exclusion principle, in which identical fermions are unable to occupy the same quantum mechanical state. In the case of neutron-rich nuclei, N=50 represents a g9/2 orbital fully occupied by neutrons. Unless the electron-capturing nuclei are highly excited such that there are holes in this or other lower-energy orbitals, allowed Gamow-Teller transitions cannot occur.
Figure 2.6: Projection of the (trapped) lepton fraction at ρc = 4 · 10¹² g cm⁻³ as a function of the electron-capture rate scaling factor for each progenitor+EOS reference simulation. In all cases the lepton fraction begins to increase if the capture rate becomes too high, because of a dramatic increase in the electron-neutrino absorption cross section. The asymmetry seen here indicates that quantities which depend on Yl/e are likely to be more sensitive to a reduction of the electron-capture rates due to a systematic overestimate in the base rates than they are to an increase due to an underestimate.
lower panel of residuals is unaffected), but the distribution of rates is shifted to larger or smaller values depending on the scaling factor. Systematic shifts of the rates emphasize the role of electron capture as a regulator for entropy and temperature in the simulations. By increasing the rates, more neutrinos are emitted and escape during the initial stages of collapse, thereby increasing the evaporative neutrino cooling. Furthermore, because the dominant source of matter pressure is electron degeneracy, increased electron-capture rates accelerate the collapse. This impacts the matter profiles outside the shock in the early post-bounce phase. Decreasing the rates has the opposite effect: the entropy, temperature, and electron fraction of the core are significantly higher because less cooling takes place.
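A uniform multiplicative scaling is equivalent to a constant shift of every log10 rate, which is why the residual structure of Figure 2.2 is unchanged. A minimal sketch (names illustrative):

```python
import numpy as np

# The scaling factors applied in the systematic-variation simulations.
FACTORS = [10.0, 4.0, 2.0, 0.5, 0.25, 0.1]

def scale_rates(log10_rates, factor):
    """Systematic scaling: multiplying every rate lambda_EC by a constant
    factor shifts each log10(lambda_EC) by log10(factor), preserving the
    shape of the rate distribution and the residuals of Fig. 2.2."""
    return np.asarray(log10_rates) + np.log10(factor)
```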
The evolution prior to and right at ρc = 2 · 10¹² g cm⁻³ (the density that defines neutrino trapping) is what sets the final value of the trapped lepton and electron fractions, which are important due to their direct impact on the formation of the PNS. For all the reference simulations, a minimum in the trapped lepton fraction was found to occur with a systematic scaling factor of approximately four. The minimum that forms can be seen in Figure 2.6. Scaling by ten slightly reverses the downward trend, and increases the trapped lepton fraction from its minimum value. This behavior is the result of electron-neutrino capture on heavy nuclei becoming the primary source of opacity, exceeding what is typical as a result of coherent νe-scattering. When the rates have been enhanced by a factor of ten, the ratio of the absorption and scattering opacities, κa/κs, surpasses unity already by central densities of 3 · 10¹¹ g cm⁻³, and κa ~ 4κs by the time ρc = 10¹² g cm⁻³. Absorption cross sections are then large enough to trigger an early onset of neutrino trapping at densities lower than what is found for the reference rates. The consequence is that electron capture has a smaller window of deleptonization, leading ultimately to a higher Yl overall.
The range of electron fractions near core bounce is commensurate with the range of
Figure 2.7: Comparison of the central electron and electron-neutrino (left) and lepton (right) fractions for simulations in which the nuclear electron-capture rate for every species has been scaled by the factors shown in the legend. Warmer colors indicate a higher overall electron-capture rate, and cooler colors indicate a lower rate. The dashed black line indicates the reference s15WW95+SFHo simulation.
trapped lepton fractions so far described; see Figure 2.7. As mentioned above, variations of Ye (and Yl) on this level are of importance due to their direct impact on the formation of the PNS and the supernova shock. Electron fraction, entropy, density, and velocity profiles are shown in Figure 2.8 for s15WW95+SFHo at −1, 0, 1, and 5 milliseconds relative to bounce. Of particular interest, the mass of the forming PNS inner core at bounce, seen as the mass behind the steep velocity gradient in panel (b), was found to vary on the order of ~0.1 M⊙, and up to ~0.2 M⊙ five milliseconds after bounce. The asymmetry observed in the trapped lepton fraction, where scaling the rates by 0.1 had a more dramatic effect than scaling by 10, translates directly to the variation of the inner-core mass at bounce (+16/−4% from the reference). The result is that the forming PNS has a lower bound on the inner-core mass at bounce over the range of electron-capture rates explored. Because the rates are already high, and therefore the absorption opacity is already almost comparable to the scattering opacity,
Figure 2.8: The electron fraction, entropy, density, and velocity as a function of enclosed mass at four times during a core-collapse simulation, spanning 6 ms around bounce, including the collapse phase just after the onset of neutrino trapping (a), core bounce (b), and 1 ms (c) and 5 ms (d) after bounce, during which the shock has begun its outward trajectory. The reference simulation (s15WW95+SFHo) is shown in dashed black. Simulations shown in color have electron-capture rates scaled systematically for all species by factors of 10, 4, 2, 0.5, 0.25, and 0.1. Core bounce (t − tb = 0), shown in panel (b), is defined as when the entropy at the shock front exceeds 3.0 kb/baryon.
the range of inner-core mass at bounce comes mainly from simulations with decreased rates
relative to the base simulation.
Such variations in the homologous inner-core mass will translate directly to the kinetic
energy of the emergent shock that eventually detonates the star. Furthermore, the one to
two orders of magnitude range of density outside the shock (see Fig. 2.8) will likely play
an important role during the shock propagation and explosion phases of multidimensional
CCSNe simulations. Therefore, further investigations into the impact of these variations in
the late stages of two and three dimensional CCSNe simulations are warranted.
In addition to the direct impact on core dynamics and structure, the neutrino emission at bounce is found to be very sensitive to these variations. Figure 2.9 shows the neutrino luminosity 500 km from the center for the different neutrino species as a function of time. Prior to bounce the νe luminosity begins to rise from electron captures on bound protons in nuclei, but is quickly regulated by neutrino trapping, causing a downturn in the luminosity. During this time the core is very sensitive to the nuclear electron-capture rates, as the entropy is low enough that heavy nuclei dominate the available mass. Scaling the rates for each nucleus using the same systematic factors results in a 40% variation of the νe luminosity before bounce. During bounce, the electron-neutrino burst, seen as the peak luminosity in the left panel of Figure 2.9, is powered primarily by electron capture on free protons. The core bounce and shock liberate nucleons from their bound states, and the entropy rises, causing a significant increase in the nucleon and light-particle abundances. Notably, while the electron-capture rate on free protons, λ_EC^p, is not adjusted in these simulations, a range of ±20% relative to the reference peak νe luminosity is observed.
These dramatic variations of the peak electron-neutrino luminosity are a result of alterations to the neutrinosphere and shock convergence timescale. Specifically, when electron
Figure 2.9: The neutrino luminosity as measured at a radius of 500 km as a function of time after bounce in the s15WW95+SFHo simulation set. Electron-capture rate scaling factors are shown in the legend, where contours with warmer colors have higher rates, and cooler colors have lower rates. While the peak electron-neutrino luminosity is considered particularly stable across core-collapse simulations, it varies significantly with variations of the electron-capture rates on medium-heavy nuclei. When the rates are at their lowest (×0.1 case), the shock reaches the neutrinosphere more quickly than in the other simulations. This results in a larger luminosity in the peak electron-neutrino burst because more νe's are able to stream out of the core at early times. The opposite is true when the rates are higher: the neutrinosphere and shock converge much more slowly, and so the neutrinos spend more time diffusing out of the inner core, reducing the peak luminosity but distributing it out to later times.
captures on nuclei are weaker (scaling by 0.1), the inner-core mass that forms at bounce is significantly larger. This results in more kinetic energy transferred to the shock, allowing it to sweep up mass more quickly. In Figure 2.8 this can be seen by the broadening of the distribution of shock locations in mass between the different simulations in the velocity plot 5 ms after bounce (bottom right) as compared to t − tb = 0. Also, with a weaker overall rate the opacity will be lower, allowing the neutrinosphere to move in to smaller radii more quickly. The combination of these effects results in the shock and neutrinosphere radii converging earlier for the simulations with lower electron-capture rates, and later for simulations with higher rates, up to a difference on the order of 3.5 ms. Thus, electron captures on protons liberated by the shock produce neutrinos that are able to reach the neutrinosphere earlier and freely stream away, contributing to a larger νe peak luminosity when the nuclear electron-capture rate is systematically lower. On the other hand, when the nuclear electron-capture rate is high, the emitted neutrinos diffuse more slowly through the core and reach the neutrinosphere at later times, strongly quenching the peak luminosity but spreading out the emission to later times. Due to the high luminosity of the electron-neutrino burst near the time of bounce, it is a candidate for detection from a galactic core-collapse supernova in Earth-based detectors sensitive to electron neutrinos, e.g., those with a detector volume composed of liquid argon. While such measurements are not presently of high enough precision to resolve each variation seen here, they may indicate the total amount of electron capture occurring at core bounce.
2.4.3.1 Progenitor model sensitivity
In order to evaluate the significance of the electron-capture systematic sensitivity studies,
they were tested against a study of the progenitor dependence of the core-collapse phase.
Figure 2.10: The full range of sensitivity of the PNS inner-core mass, central entropy, and central temperature at bounce, as well as the peak νe luminosity, the peak average νe energy, and the average νe energy prior to neutrino trapping, owing to variations of the progenitor model and electron-capture rates. Thirty-two progenitors were utilized from the WH07 model set of Woosley and Heger [79] for producing the progenitor bars (red) in the figure. Each bar of the electron-capture rate variations derives from simulations where the rates have been systematically scaled by factors of 10, 4, 2, 0.5, 0.25, and 0.1. The horizontal tick represents the value of the reference simulation for the tested progenitor+EOS combination. The window ranges are chosen so that the progenitor sensitivity bars are of equal size across each of the plotted parameters.
Drawing from the larger set of progenitors to which the reference progenitors of the electron-capture study belong, the 2007 non-rotating solar-metallicity single-star model set from the stellar evolution code KEPLER [79] was utilized. This model set contains the pre-supernova configuration of 32 stars ranging in zero-age main-sequence (ZAMS) mass from 12 M⊙ to 120 M⊙ (s12WH07 and s120WH07, respectively). Simulations of these progenitors exhibit a ~3.5% range of trapped lepton fraction (0.288–0.298), a ~4% range of inner-core mass at bounce (0.473–0.491 M⊙), and a ~9% range of electron-neutrino peak luminosity (5.19–5.65 · 10⁵³ erg s⁻¹) during the neutrino flash occurring just after core bounce.
Figure 2.10 compares the stellar progenitor model and electron-capture rate dependence of several structural and neutrino quantities during collapse. The ranges of inner-core mass and peak νe luminosity seen from employing the WH07 progenitor model set are each approximately a factor of five smaller than the ranges seen from varying the electron-capture rates across all progenitor+EOS references. On the other hand, the ranges of central entropies and temperatures at bounce are comparable between the two sensitivity studies. The νe average energies just prior to neutrino trapping and during the deleptonization burst are also compared in Fig. 2.10. The neutrinos emitted during the luminous burst just following core bounce are of higher energy than those emitted earlier because they arise primarily from electron capture on free protons. They also decouple from the core at a much hotter and denser neutrinosphere than prior to bounce, yielding higher-energy neutrinos. In both of the sensitivity studies, variations of the electron-capture rates and of the initial stellar models, the range of average neutrino energy during peak emission is comparable (≈ ±0.5 MeV).

While captures on free protons contribute only marginally to deleptonization in the central zone, further out in the iron core, where the densities are lower (and Ye's are higher), electron captures on protons contribute to the deleptonization, especially in cases where
electron captures on nuclei are suppressed. The capture of electrons on these free protons produces neutrinos of a higher average energy, commensurate with the large spread seen in the bottom panel of Fig. 2.10 (which is taken when the central density is 3 × 10¹¹ g cm⁻³, but present from the onset of collapse). Another contribution to the energy spread is the systematic shift of electron captures to more neutron-rich nuclei as the electron-capture rates are increased and the matter becomes more neutron rich. These neutron-rich nuclei have more negative Q-values, yielding lower-energy neutrino emission. Both of these effects result in a dispersion of average neutrino energies early on that is several factors larger than what is seen in the progenitor simulations.
Finally, note that while the peak luminosity is only weakly dependent on the progenitor model, the post-bounce pre-explosion luminosities of all six neutrino species have strong progenitor dependences [83]. On the other hand, the pre-explosion luminosities investigated here are comparatively much less sensitive to the nuclear electron-capture rates; see panel (b) of Figure 2.9. The divergence of the luminosities seen at t − tb = 120 ms is due only to the difference in collapse times between the simulations, which carries over to the evolution of the mass accretion rate after bounce.
2.4.4 Monte-Carlo variations
In addition to the possibility of systematic errors in the electron-capture rates, the effect of statistically distributed variations is also explored. Such an investigation is of great importance if the effect of an approximation such as Eq. 2.1 is to be understood. The main flaw in a continuous function for rate estimation across many nuclear species is the loss of structure, which would otherwise serve to statistically distribute the rates on a reaction-by-reaction basis (see Fig. 2.2). To study this effect, Monte-Carlo (MC) variations of the electron-capture rates were performed. Using an analytic description of the electron-capture rate distributions, such as a Gaussian or Poisson distribution, is likely to be inaccurate. Instead, the approximate rate is adjusted via MC for each species by adding to its log₁₀(λEC) a value randomly chosen from a distribution created from the residuals of the tabulated rates and the approximate rates, i.e.
\log_{10}\!\left(\lambda_{EC}^{i,{\rm table}}\right) - \log_{10}\!\left(\lambda_{EC}^{i,{\rm Eq.\,2.1}}\right) \qquad (2.10)

where i is an index running over all the tabulated reactions.
In constructing this distribution, it is important also to preserve the Q-value dependence of the residuals that can be seen in Figure 2.2b. This is done by separating the residual distribution into subsets so that the reaction-rate residuals in each subset have similar Q-values. To do so, a Q-value binning of 2.5 MeV was chosen, but this method was also tested with binnings of 5.0 MeV and 10.0 MeV, which resolve the Q-dependence less but have more counts per bin from which to sample. From these samples, pseudo electron-capture rates were MC generated such that they retain the Q-value dependence of Eq. 2.1, but statistically distribute the approximate rate according to the variance of the rates calculated in the shell model. Seven simulations for each binning were performed.
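The Q-binned residual sampling can be sketched as follows; the data structures and names here are illustrative, not the actual implementation:

```python
import numpy as np

rng = np.random.default_rng(42)

def mc_rate(log10_approx, q_value, residuals_by_qbin, bin_width=2.5):
    """Adjust one approximate log10 rate by a residual (Eq. 2.10) drawn
    from the subset of tabulated-minus-approximate residuals whose
    Q-values fall in the same `bin_width`-MeV bin as this reaction.

    residuals_by_qbin maps an integer bin index to an array of residuals.
    """
    qbin = int(np.floor(q_value / bin_width))
    pool = residuals_by_qbin[qbin]   # residuals with similar Q-values
    return log10_approx + rng.choice(pool)
```

Applying `mc_rate` to every species produces one pseudo rate set; repeating this (seven times per binning in the text) yields the MC ensemble whose spread is shown in Figure 2.11.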
As mentioned before, at low densities the electron-capture rate depends strongly on the energy levels of the initial and final nuclides because the electron chemical potential is comparable to the excitation energies of the allowed Gamow-Teller transitions. As the electron chemical potential increases, it encompasses a larger range of excitation energies, which results in the electron-capture rate becoming sensitive primarily to the total strength. In the low-density case of Figure 2.2a, the approximation of Eq. 2.1, while appearing to
Figure 2.11: Lepton-fraction band as a function of central density for the Monte-Carlo (MC) procedure described in the text. The band represents data points from seven MC simulations based on the s15WW95+SFHo reference, where the band width represents the min-to-max range of Yl. The rates are varied by drawing from the Q-value dependent residual function shown in Figure 2.2 after binning it into 2.5 MeV energy bins. This binning was chosen as it best tracks the Q-value dependence of the residual distribution.
decently reproduce the mean of the shell-model rates, actually has a mean approximately a factor of two lower than the tabulated rates. As the density increases, this difference between the mean electron-capture rate estimated by the approximation and the shell-model rates decreases. Thus, the approximation better reproduces the mean rate in the high-density case of Figure 2.2a. Because the MC simulations are based on residual distributions of the tabulated rates and the approximation, the average rate produced in each MC trial also has this bias.
Plotted in Figure 2.11 is the min-to-max band representing the range of lepton fraction observed from all of the MC simulations. The band drawn corresponds to a 2.5 MeV binning of the residual distributions from which the MC sampling was performed. For the reasons just described, the band has lower electron and lepton fractions than the reference at low densities, but becomes more statistically distributed around the reference at higher densities, near 5 · 10¹¹ g cm⁻³. The lepton-fraction band width varies from about a half percent initially, to its largest value of ~2.5% just before neutrino trapping, and then decreases back to ~1.5% before bounce. Altogether, no significant impact on the core dynamics or the neutrino transport was observed, and therefore it can be concluded that any statistically distributed scatter in the estimations of the electron-capture rates, such as that seen in Figure 2.2, will likely not impact the models.
2.5 Conclusion
Nuclear electron capture has long been understood to play an important role in the dynamics of core-collapse supernovae, and large efforts have been undertaken to produce reliable estimates of electron-capture rates for astrophysical contexts. Although significant progress has
been made in benchmarking theoretical electron-capture rates by comparison with charge-exchange experiments (especially using shell-model calculations) [13], large uncertainties remain for neutron-rich nuclei and nuclei beyond A=65. Furthermore, sophisticated shell-model estimates for electron-capture rates exist only for a small subset of the large number of nuclei that contribute strongly. The implications of uncertainties in the electron-capture rate estimates for the core-collapse and early post-bounce phases of fully self-consistent, general relativistic, core-collapse supernova simulations with comprehensive neutrino transport are explored in this work.
2.5.1 Most important nuclei
For the reference simulation, the contribution of each nucleus to core deleptonization is calculated, and a statistical resampling study is also performed. Both of these studies identify the nuclear species whose rates should be known most precisely due to their significance in the simulations. With the given set of electron-capture rates, from shell-model estimates to the approximate estimates of Eqs. 2.1 and 2.2, the simulations are found to be most sensitive to neutron-rich nuclei in the upper pf- and pfg/sdg-shells.
Specifically, in these simulations nuclei near the A~80, N~50 closed neutron shell contribute
the bulk of core deleptonization, and removing them from the simulations results in
noticeable changes to protoneutron star formation, with a significantly larger impact
than when any other group of nuclei is removed. However, because sophisticated estimates
from nuclear theory are not available for individual nuclei in this region, the electron-capture
rates for these species have been accounted for in the past via simple averaging techniques,
and in this work via an approximation that has been fit to the LMP rate set. While this
approximation reasonably reproduces the average electron-capture rate for sd- and pf-shell
nuclei near stability, rates for heavier neutron-rich nuclides will likely diverge from what is
predicted by this parameterization.
2.5.2 Impact of uncertainties
The impact such uncertainties may have is evaluated by varying the electron-capture rates
for more than 6000 nuclei both statistically, about the approximate prediction, and
systematically. On one hand, it is found that statistical variations of electron-capture rates affect
the overall dynamics and neutrino emission only weakly, producing marginal changes to the
simulations. These findings indicate that the lack of structural variation that distributes the
rate estimates from one species to the next is not crucial to the simulations.
On the other hand, the average electron-capture rate across a region of nuclei strongly
determines the overall impact of those constituent nuclei. By systematically varying the
electron-capture rates by factors between 10 and 0.1, dramatic variations in the inner-core
mass (+16/-4%) and the electron-neutrino luminosity (±20%) at and near bounce,
respectively, are observed. Comparing with 32 simulations utilizing different progenitor models,
this range of inner-core mass and peak neutrino luminosity is found to be five times as large
as that seen when varying the progenitor models.
In addition, the nuclear electron-capture rates are found to be already large enough in
the reference simulations that increasing them beyond their base values has a considerably
smaller effect than decreasing them. This has compelling implications. Rates for A~80
nuclei near the N=50 shell gap, which have been shown to be the primary contributors
to the overall impact of electron captures during core collapse, may be overestimated by
Eq. 2.1 due to Pauli blocking at the closed neutron shell. Combined with the greater overall
sensitivity to a systematic decrease in electron-capture rates, changes to the collapse and
early post-bounce phases of the simulations may be as significant as those seen in this study
if the current rates of these nuclei are found to be overestimated.
2.5.3 Goals for future studies
For these reasons, it is important that experimental and theoretical efforts be aimed at
nuclei which span the region of the chart of isotopes between stability and the neutron
drip line in both the pf and pfg/sdg model spaces, and further expand on the work that
has been carried out for (near-)stable nuclei in the pf-shell. Since data from (n,p)-type
charge-exchange experiments for nuclei in the pfg/sdg-shell and for neutron-rich nuclei in
the pf and pfg/sdg-shells are scarce, new experiments are required to obtain a sufficient
set of data to benchmark current and future theoretical estimates. To this end, presently
feasible experiments on neutron-rich nuclei at and near stability with 60 …
    for ( auto & a : connections ) {
        for ( auto & b : connections ) {
            switch ( compare ( a , b ) ) {
                case a > b :
                    a.num_dependencies++;
                    break;

                case a < b :
                    b.num_dependencies++;
                    break;

                // ( a , b ) not adjacent
                case Unknown :
                    break;
            }
        }
    }

    topological_sort ( connections );

Figure 4.5: Prototype of sorting connections into concurrent lock-free sets. For n connections,
scales as O(n²).
of the neural network is then straightforward from these dependency graphs, and concurrency
is immediately evident. Consider the left graph in Fig. 4.4. Any connection (circle in the
DAGs) which has no inbound dependencies (shown as the grey arrows) can be evaluated
immediately. Thus, connections A1 and A2 can be evaluated in parallel at the first step.
The next step is to remove A1 and A2 from the dependency graph, and to again search for
connections which now have no inbound dependencies. After A1 and A2 are evaluated, we
see from the DAG that connections B1 and B2 have no dependencies, and so can be evaluated
next, also in parallel. Finally, connection C1 can be evaluated. This sorting technique is
known as a topological sort of a directed acyclic graph and is often used in determining the
order of tasks [156].
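The grouping just described is Kahn's algorithm, drained one whole frontier at a time so that each frontier becomes a concurrently evaluatable set. The sketch below illustrates the idea under assumed names (`topological_levels` and the dependency-list representation are illustrative, not the implementation described in this work):

```cpp
#include <vector>

// Group items of a dependency DAG into "levels": every item in level k has
// all of its dependencies satisfied by levels 0..k-1, so items within a
// level can be evaluated in parallel. deps[i] lists the items that item i
// depends on. Returns one vector of item indices per level.
std::vector<std::vector<int>> topological_levels(
    const std::vector<std::vector<int>>& deps)
{
    const int n = static_cast<int>(deps.size());
    std::vector<int> num_dependencies(n, 0);
    std::vector<std::vector<int>> dependents(n);  // reverse edges
    for (int i = 0; i < n; ++i)
        for (int d : deps[i]) {
            ++num_dependencies[i];
            dependents[d].push_back(i);
        }

    // Kahn's algorithm: repeatedly emit the set of items with no remaining
    // dependencies, then remove them from the graph.
    std::vector<std::vector<int>> levels;
    std::vector<int> frontier;
    for (int i = 0; i < n; ++i)
        if (num_dependencies[i] == 0) frontier.push_back(i);

    while (!frontier.empty()) {
        std::vector<int> next;
        for (int i : frontier)
            for (int j : dependents[i])
                if (--num_dependencies[j] == 0) next.push_back(j);
        levels.push_back(frontier);
        frontier = std::move(next);
    }
    return levels;
}
```

For a graph shaped like the left DAG of Fig. 4.4 (A1, A2 independent, B1 and B2 each depending on one of them, C1 depending on both B's), this yields the three parallel sets {A1, A2}, {B1, B2}, {C1}.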
Algorithmically, the dependencies each connection has are determined using the comparator
scheme shown in Table 4.1. With this comparison, if two connections are adjacent and
Table 4.1: Neural network connection dependency comparator

  Type                  Condition                    Dependency
  -- Connection pair (a,b) with differing output nodes --
  a recurrent           a.origin = b.destination     a < b
  b recurrent           b.origin = a.destination     a > b
  a normal              a.destination = b.origin     a < b
  b normal              b.destination = a.origin     a > b
  -- Connection pair (a,b) with the same output node --
  a self-recurrent      a.origin = a.destination     a < b
  b self-recurrent      b.origin = b.destination     a > b
  arbitrary choice      a.origin < b.origin          a < b
  arbitrary choice      b.origin < a.origin          a > b
  -- Connections not adjacent --
  a & b not adjacent    -                            unknown

The neural network connection dependency comparator. There are three subsections in
the above, for when connections a and b (1) are adjacent and have different output nodes,
(2) are adjacent and have the same output node, and (3) are not adjacent. If the algorithm
need not be lock-free (i.e., need not respect memory dependencies), then the second subsection
(connections with the same output node) can be dropped and the rules of the first subsection
are used exclusively for adjacent connections. Each connection has an origin and a destination
node, which are what is referred to in the Condition column. The less-than and greater-than
symbols in the Dependency column imply the ordering of the two connections; if b > a, then
the evaluation of connection a must come prior to b. In the case of an unknown relationship
between non-adjacent connections, transitive relationships will enforce an ordering if one
exists. This comparator handles both normal and recurrent connections.
one connection depends on the other, the dependent connection's number of dependencies is
incremented. This comparator can then be applied to any neural network regardless of its
structure, and a topological sort, as described previously, can determine the sets of
concurrently evaluatable connections. An example implementation of this is shown in Figure 4.5.
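Table 4.1 can be transcribed almost directly into code. The sketch below is an illustrative rendering of the comparator; the `Connection` struct and `Order` enum are assumed names, not the actual implementation:

```cpp
#include <cstdint>

// Hypothetical connection record: origin/destination are node indices and
// `recurrent` marks a memory (recurrent) connection.
struct Connection {
    int32_t origin;
    int32_t destination;
    bool recurrent;
};

enum class Order { Before, After, Unknown };  // a < b, a > b, no constraint

// Dependency comparator following the three subsections of Table 4.1.
// Order::Before means connection a must be applied before connection b.
Order compare(const Connection& a, const Connection& b)
{
    if (a.destination != b.destination) {
        // (1) connections with differing output nodes
        if (a.recurrent && a.origin == b.destination) return Order::Before;
        if (b.recurrent && b.origin == a.destination) return Order::After;
        if (!a.recurrent && a.destination == b.origin) return Order::Before;
        if (!b.recurrent && b.destination == a.origin) return Order::After;
        return Order::Unknown;  // (3) not adjacent
    }
    // (2) connections sharing an output node: self-recurrent connections
    // first, then an arbitrary but consistent tie-break
    if (a.origin == a.destination) return Order::Before;
    if (b.origin == b.destination) return Order::After;
    return a.origin < b.origin ? Order::Before : Order::After;
}
```

Note that a recurrent connection reading a node is ordered before any connection writing into that node, so the recurrent connection sees the previous evaluation's value, as required.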
4.3.2 Neural network operations
The ordering of connections into concurrently evaluatable sets, as described in the previous
section, provides the scaffolding off which the rest of the neural network operations are built.
Fundamentally, three operations are required in the feedforward evaluation of an arbitrary
neural network:
1. Apply connection: The application of a connection A1 from node 1 to node 2:
x_2 = x_2 + w_A1 · x_1, where x_i is the floating-point value of node i, and w_A1 is the weight
of the connection A1.

2. Zero/Reset node: Setting the memory address of a node's floating-point value to
zero, x_i = 0.

3. Activate node: Applying the activation function σ to a node's value, x_i = σ(x_i).
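As a sketch, each of the three operations amounts to a single line of arithmetic on a flat array of node values. The function names are illustrative, and tanh stands in for whatever activation function the network uses:

```cpp
#include <cmath>
#include <vector>

// Operation 1: accumulate a weighted origin value into the destination.
void apply_connection(std::vector<float>& x, int origin, int destination,
                      float weight)
{
    x[destination] += weight * x[origin];
}

// Operation 2: reset a node's accumulator before it is fed into.
void zero_node(std::vector<float>& x, int i) { x[i] = 0.0f; }

// Operation 3: apply the activation function in place (tanh assumed).
void activate_node(std::vector<float>& x, int i) { x[i] = std::tanh(x[i]); }
```

Because each operation touches only its own node or connection, many instances of the same operation can run side by side without interfering, which is what the ordered sets exploit.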
In the previous section, connections within a neural network were grouped into ordered sets,
such that the application of the connections within each set could be performed
simultaneously. In this way, parallelism over the directed acyclic dependency graph of a neural
network boils down to the bulk-synchronous evaluation of operations 1–3 above on different
connections and nodes within the network. Using the ordered sets of connections, the
zeroing and activation of nodes can be deduced from the following observation: given the
above neural network operations, a strict order-of-operations can be defined such that the
neural network's full directed acyclic dependency graph can be established. Specifically, this
operation order is,
Activate origin nodes < Zero destination nodes < Apply connections.
Therefore, given a set of connections which are to be applied, any origin node that has not
been previously fed forward must be activated, any destination node which has not yet been
fed into must be zeroed/reset, and then all connections in the set can be applied.
By building the dependency graph from the connections DAG previously discussed, and
by employing the above order of operations, this algorithm guarantees that at every step,
the performed operations will be homogeneous. That is, many nodes will be activated
concurrently, then many nodes will be zeroed concurrently, and finally many connections
will be applied concurrently. In this way the evaluation of a neural network is factored into
steps that can be easily mapped to SIMD processors. This will be discussed in detail in
section 4.3.3.
4.3.2.1 Neural network node lifecycle
While the proposed order of operations captures the general requirements of the feedforward
evaluation of arbitrary neural networks, it misses a few subtleties that become apparent
when considering the lifecycle of a node in a neural network.
Thus, let us consider the lifecycle of a single node in a network. At the start of neural
network evaluation, a node holds its activated value as determined in the previous evaluation
of that network. The first step that must be taken is that all recurrent connections which
use this node's value as an origin must be applied. This is because a recurrent connection
is meant to act as a type of memory, feeding information from the previous network evaluation
into the current evaluation.
Next, the node must be zeroed out. Following this, there is a phase when the node is
used as a destination, with its value being the weighted cumulative sum of all incoming
connections that have been applied. Once all connections inbound to this node have been
applied, the activation function transforms the node’s value. Then, the node can be used as
an origin for normal connections, and finally, the output value is left untouched, so that it
can be used for any recurrent connections in the next evaluation.
These steps are fundamentally what impose the ordering conditions described in the
previous section,
• The zero/reset must occur after the last use of the node as an origin of a recurrent
connection, and before the first use of the node as a destination.
• The activation function must be applied after the last use of the node as a destination,
and before the first use of the node as an origin of a normal connection.
Thus, these actions are performed between the evaluation of each set of connections.
Together with the connection ordering described in section 4.3.1, we arrive at a general
distribution of tasks which follow the basic life cycle of a node and which are constrained by
the parallelism of the evaluatable connection sets.
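Putting the order of operations and the node lifecycle together, a serial host-side sketch of the per-set evaluation loop might look as follows. Recurrent connections and the final activation of output nodes are omitted for brevity, and all names are illustrative rather than taken from the implementation:

```cpp
#include <cmath>
#include <set>
#include <vector>

struct Conn { int origin, destination; float weight; };

// Evaluate ordered connection sets, inserting node activations and resets
// between sets per the order:
//   activate origins < zero destinations < apply connections.
// Input-node values are assumed already loaded in x; input nodes need no
// activation, so they are listed in `activated` up front.
void evaluate(std::vector<float>& x,
              const std::vector<std::vector<Conn>>& sets,
              std::set<int> activated)
{
    std::set<int> zeroed;
    for (const auto& set : sets) {
        for (const auto& c : set)       // activate not-yet-activated origins
            if (activated.insert(c.origin).second)
                x[c.origin] = std::tanh(x[c.origin]);
        for (const auto& c : set)       // zero not-yet-zeroed destinations
            if (zeroed.insert(c.destination).second)
                x[c.destination] = 0.0f;
        for (const auto& c : set)       // apply the whole set
            x[c.destination] += c.weight * x[c.origin];
    }
}
```

In the many-core version each of the three inner loops becomes one homogeneous bulk step, with the membership bookkeeping precomputed on the host.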
4.3.3 Generalized concurrent neural network evaluation
By applying the topological sort of network connections and enforcing the order of operations
described above, the full neural network evaluation graph can be built. Figure 4.6 shows
the directed acyclic dependency graphs for two neural networks with different structures.
[Figure 4.6 graphic: two example neural networks (network #1 and network #2), their full directed acyclic dependency graphs, the SIMD concurrency available at each of the eight evaluation steps, and a symbol key (square: zero node; triangle: activate node; circle: apply connection).]
Figure 4.6: The full directed acyclic dependency graphs for two structurally different neural
networks. The different symbols in the graph (squares, circles, and triangles) represent
the three neural network operations: zero-node, apply-connection, and activate-node,
respectively. Even though the two networks have very different structures, the algorithm proposed
in this work organizes the network evaluation graphs into evaluation steps that are
component-wise parallel. On the right of the figure, each evaluation step is shown with the number of
SIMD operations that can be performed simultaneously.
The neural network labeled network #2 in the figure is the same network shown in Fig. 4.4.
However, instead of showing only the DAG for the application of connections, it includes
the other operations as well: squares represent the zero-node operation, circles represent the
apply-connection operation, and triangles represent the activate-node operation; see the key
in Fig. 4.6 for more details.
Factoring the neural network evaluation into these component operations, the individual
structure of different neural networks is no longer a consideration. Each network has some
number of nodes which must be zeroed and activated, as well as some number of connections
that must be applied. Starting at the top of the DAGs shown in Figure 4.6, the first step in
evaluating both networks 1 and 2 is to zero out four nodes (see the axis to the right of both
networks). Therefore, in this first step, an operation concurrency of four is achieved. In the
second step of evaluation, five connections must be applied across the two separate networks.
This process continues until all evaluation steps in the shown DAGs have been performed,
at which point the inputs have been fully fed through both networks, and evaluation is
complete.
Thus, the algorithms presented here are completely agnostic to the macroscopic structure
of the constituent neural networks. At the same time, these algorithms order the tasks to
be performed into homogeneous sets of operations which map very well to SIMD processors.
This effectively resolves the instruction branching problem described in section 4.3.
Furthermore, as can be seen in Fig. 4.6, this algorithm will scale with the neural network population
size. Since instruction branching is removed, the more networks that are utilized, the greater
the concurrency at each evaluation step. This indicates that the performance gain should
scale at least linearly with the number of neural networks.7
4.4 Composite population evaluation
While the DAG-based algorithms described in the previous section solve the problem of
evaluating structurally diverse neural networks on SIMD-based processors, they do not
alleviate the lack of optimization in memory access and host-device communication described in
section 4.3. In this section, both of these latter issues are resolved via a simple reordering
of the memory layout for the neural network population.
Instead of storing each network individually, a population of N networks is refactored
into a single composite network, consisting internally of all the nodes and connections of the
sub-networks. In this way, the resulting neural network will have at minimum N
connections per evaluation step that can be processed concurrently using the evaluation algorithm
described in section 4.3.3. Two such refactorizations are demonstrated in Figure 4.7.
The construction of the composite net is straightforward. First, all network input nodes
are added, then all output nodes, and finally all hidden nodes. Following this, each connection of each sub-network is reindexed according to the position of its nodes in the larger
set of nodes of all networks in the population. After all connections are added, the connection sort described in section 4.3.1 and detailed in Figure 4.5 is applied to each subset of
connections in the composite neural network. As these populations can grow quite large,
it is computationally most efficient to sort based on these unconnected sub-structures (the
7 The performance scaling with the neural-network population size can be less than linear if the population
contains a subset of outlier networks which have much larger network topologies than the majority of the
population. In this case, the majority of networks will finish their final step of evaluation while the evaluation
of the subset will only be partially completed. Thus, in the remaining steps of evaluation for the outlier
networks, the concurrency will be dramatically reduced.
[Figure 4.7 graphic: a population of three individual neural networks (top) and the corresponding single composite neural network, in heterogeneous-input and homogeneous-input variants (bottom).]
Figure 4.7: Top panel: A representative example population of neural networks consisting
of three members. In classic neuroevolution simulations, each of these neural networks is
separately evaluated, and the correctness of its outputs is used to judge its fitness.
Bottom panel: The same population as in the top panel, but constructed as nodes and
connections of a single composite neural network. On the left, the resulting composite
network in which each of the constituent sub-networks requires heterogeneous inputs for
evaluation. On the right, a composite network in which each of the sub-networks will process
the same inputs.
individual networks) instead of the composite network, as the sorting algorithm shown above
scales as O(n²) with respect to the number of connections. Finally, the connections of the
composite network are globally sorted according to their evaluation set index. Following this
procedure then guarantees that connections which can be evaluated together are colocated
in contiguous memory space. This then allows the origin and destination node indices and
connection weights to be perfectly coalesced from DRAM to the individual threads. The
resulting population-level composite network is then ready to be evaluated on a GPU or
other SIMD coprocessor.
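The reindexing step at the heart of the composite construction can be sketched as follows. The grouping of nodes into input/output/hidden blocks and the subsequent evaluation-set sort are omitted, and all names (`Network`, `make_composite`) are illustrative:

```cpp
#include <vector>

struct Conn { int origin, destination; float weight; };

struct Network {
    int num_nodes;                  // nodes indexed 0..num_nodes-1 locally
    std::vector<Conn> connections;  // expressed in local node indices
};

// Merge a population into one composite network by offsetting each
// sub-network's node indices into a single global index space. The
// sub-networks remain mutually unconnected inside the composite, so their
// connections can be evaluated concurrently.
Network make_composite(const std::vector<Network>& population)
{
    Network composite{0, {}};
    for (const auto& net : population) {
        const int offset = composite.num_nodes;
        for (const auto& c : net.connections)
            composite.connections.push_back(
                {c.origin + offset, c.destination + offset, c.weight});
        composite.num_nodes += net.num_nodes;
    }
    return composite;
}
```

After this merge, sorting the composite's connections by evaluation-set index places concurrently evaluatable connections in contiguous memory, which is what enables the coalesced accesses described below.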
In constructing a composite network from a population of neural networks, an additional
optimization that can be made is to refactor all of the input nodes into a single set of inputs.
This can occur if each network in the population will be evaluated given homogeneous sets of
inputs. If, on the other hand, each network will receive heterogeneous inputs, this refactoring
cannot be performed. In the bottom panel of Figure 4.7, both possible composite networks
are shown: one in which all inputs are explicitly specified (heterogeneous inputs), and one
in which a single set of input nodes is fed to the rest of the sub-networks (homogeneous
inputs).
In both cases, to evaluate the resulting neural network, a single memory transfer copies
the input node array (consisting of the nodes of all the networks) to the device, the
evaluation proceeds concurrently, and finally a single copy of the outputs (again in contiguous
memory) finishes the calculation. This is possible because the input and output nodes of
all the networks are positioned in contiguous memory space.
Since this network is quite large and has many unconnected regions, the DAG-based
algorithm described previously takes maximal advantage of the available component-wise
concurrency in the evaluation. Furthermore, the memory accesses to the connection weights and
node indices are coalesced for each connection, and the interaction between the host and
device occurs only twice (on the boundaries of the evaluation), minimizing any possible
communication contention. In summary, by combining the DAG-based concurrent evaluation
described in section 4.3.3 with the organization of a population of neural networks into a
single composite network, instruction execution, memory access, and communication are
fully optimized for execution on many-core processors.
4.5 CUDA implementation
The DAG-based connection sort and construction of the composite neural network described
in the previous sections are performed on the host CPU, after which the network connections
and nodes are copied to the many-core device for evaluation. The network evaluation performance is benchmarked using the compute unified device architecture (CUDA) [157]. The
implemented CUDA kernels can be found in Figure 4.8. There are a total of three kernels
for the three necessary actions to be performed: resetting a node to zero, activating a node,
and applying a network connection.
As a result of the presorting of operations on the host, the CUDA kernels are extremely
simple, bringing the possibility of (thread) branch divergence to an absolute minimum. The
only possible divergence that exists is seen on line 43 of Figure 4.8 in which a self-recurrent
connection8 is handled as a special case. For neural networks with no self-recurrent
connections, this algorithm is free of GPU thread divergence.
8 A self-recurrent connection is a connection whose origin node is also its destination node. These
connections function as memory in the neural network, returning the output of the node from the previous
network evaluation.
    __global__
    void device_clear_nodes ( uint32_t * list , float * nodes , uint32_t n )
    {
        int i = threadIdx.x + blockIdx.x * blockDim.x;
        if ( i