SEISMIC STUDIES OF WESTERN PACIFIC SUBDUCTION ZONES WITH
HIGH-PERFORMANCE COMPUTING AND DEEP LEARNING

By

Ziyi Xi

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

Computational Mathematics, Science and Engineering—Doctor of Philosophy

2024

ABSTRACT

In this dissertation, I explore seismic phenomena across the Western Pacific subduction zones, em-

ploying computational techniques to analyze and interpret complex geophysical data. The study is

structured into focused chapters, each contributing to a comprehensive understanding of the seismic

velocity structures and the mechanisms driving deep earthquakes in this region.

Chapter 1 provides an overview of subduction zones, crucial regions where one tectonic plate is

forced below another, leading to significant geological phenomena such as earthquakes and volcanic

activity. Focusing on the Western Pacific, the chapter examines the complex interactions at the

subduction zones formed by the convergence of the Pacific and Philippine Sea Plates, especially

in the northern regions, and the unique geological setting of the Tonga subduction zone and the

Lau back-arc basin in the south. These zones are vital for understanding the mechanics of plate

tectonics and the resulting seismic activity. Through a detailed examination of the Earth model and

seismicity, the chapter sets the groundwork for the dissertation’s main scientific inquiries, aiming

to unravel the processes governing seismic phenomena in these tectonically active regions using

seismic imaging and machine learning techniques.

Chapter 2 introduces EARA2023, a detailed seismic velocity model of the crust and upper

mantle beneath East Asia and the northwestern Pacific. This model, constructed through adjoint

full-waveform inversion tomography, is based on waveform data from 142 earthquakes and broad-

band records from around 2000 stations. EARA2023 offers enhanced images of the subducted

oceanic plate, detailing complex slab deformations and providing new perspectives on the origins

of notable intraplate volcanoes in East Asia, such as Changbaishan, Datong-Fengzhen, Tengchong,

and Hainan.

Chapter 3 shifts the focus to the application of machine learning for detecting deep earthquakes,

specifically within the Tonga subduction zone. This part of the research outlines the development of

a deep-learning-based workflow centered around PhaseNet-TF, a new phase picker designed for the

time-frequency domain analysis. This method proves particularly effective for analyzing data from

ocean-bottom seismographs (OBSs), improving the detection and picking of P- and S-wave arrivals.

The chapter discusses the integration of this technology with an improved Gaussian Mixture Model

Associator and the application of a semi-supervised learning approach to refine earthquake catalogs,

thereby enhancing our understanding of the seismic activity in the Tonga region.

Chapter 4 further advances the discussion by examining the use of converted seismic phases,

with an emphasis on PS waves from deep earthquakes in the Tonga subduction zone. The chapter

explores the application of PhaseNet-TF to identify PS-converted wave arrivals, alongside direct

P and S phases, and details the use of beam-forming and a Markov chain Monte Carlo method

for associating detected phases. This approach significantly increases the detection of PS arrivals,

contributing to a higher resolution imaging of the subducted slab and offering insights into the

structural seismology of the Tonga subduction zone.

Collectively, these chapters present a narrative that not only highlights the utility of computa-

tional methods in seismic research but also provides a deeper understanding of the mechanisms

underlying seismic activity in the Western Pacific. This dissertation contributes to the field of seis-

mology by demonstrating the potential of combining seismic modeling, machine learning, and the

analysis of converted seismic phases to investigate subduction zone dynamics and deep earthquake

mechanisms.

Copyright by
ZIYI XI
2024

ACKNOWLEDGEMENTS

Embarking on the journey to pursue my PhD in the United States, leaving the familiarity of my

homeland behind, has been one of the most rewarding decisions I’ve made in my early years. This

journey has brought countless invaluable individuals into my life, providing immense support both

academically and personally. This chapter of my life, enriched with learning and growth, will

forever hold a special place in my heart.

I extend my deepest gratitude to my mentors, Dr. Songqiao ”Shawn” Wei and Dr. Min Chen.

Their guidance transcends the boundaries of academia; they have been pillars of wisdom, teaching

me not only the intricacies of research but also the virtues of being a better individual. Their

mentorship is something I hold in high regard.

The memory of my first interaction with Min during a PhD program interview remains vivid.

Despite my limited research experience and language barriers at the time, Min’s patience and

warmth made me feel welcomed, opening the door to this incredible learning opportunity. Min’s

philosophy of keeping an open mind and the importance of communication have been instrumental

in my growth. I cherish the moments when she proudly shared my work with others. Although we

were constrained to virtual interactions due to the pandemic and Min’s unfortunate departure, her

legacy continues to inspire me to forge ahead with resilience.

Shawn has been instrumental in my academic journey, providing me with opportunities to delve

into exciting research in the Tonga subduction zone. I would greatly appreciate it that Shawn took

the responsibility to be my advisor after Min passed away and financially supported the rest of

my PhD study. Our weekly discussions have been a source of significant personal and academic

growth. Shawn’s ability to grasp concepts quickly and offer insightful advice, coupled with our

enjoyable casual conversations, has made our meetings highlights of my week. Shawn’s support

extended beyond academia, encouraging my industry aspirations and facilitating my transition with

flexibility. I am immensely grateful for his guidance and meticulous attention to detail in my thesis

work.

My committee members, Dr. Allen McNamara, Dr. Jeff Freymueller, and Dr. Rongrong Wang,

v

deserve a special thank you for their invaluable advice in Earth science and computational tech-

niques. Their courses have greatly enriched my knowledge base. I also extend my gratitude to Dr.

Jeroen Tromp for his willingness to offer guidance as an unofficial committee member.

The camaraderie and support from my lab colleagues have been pivotal in my PhD journey. I

am thankful for the wisdom shared by Dongdong Tian, Guoliang Li, Jiaqi Li, Ross Maguire, and

Tong Zhou, and the enriching collaborations with Fan Wang, Yaqi Jie, Yurong Zhang, and Zhuoran

Zhang. Their friendship has made this journey all the more enjoyable.

I am grateful for the companionship and support of my peers at MSU, including Binbin Huang,

Hao Wang, He Lyu, Jiaxing Yang, and Kang Yu, Qi Lyu, Shuyang Qin, as well as friends from the

EES department like Guanyuan Shuai, Jiaxing Zhang, Keyi Cheng, Mingda Lyu, and Zechao Zhuo.

My heartfelt thanks go out to my collaborators, Dr. Baoshan Wang, Dr. Gregory C. Beroza,

Dr. Thomas P. Ferrand, Dr. Weiqiang Zhu, and Dr. YoungHee Kim, for their pivotal contributions

and insights. I also wish to acknowledge my undergraduate advisor, Dr. Daoyuan Sun, for his

foundational influence and continued support.

I am deeply appreciative of the financial support from Mr. Raymond Ginther and Mrs. Marie

Ginther through the Ginther graduate fellowship, the National Science Foundation, and MSU for

their fellowships, which have been crucial in my research endeavors.

Above all, my heartfelt gratitude goes to my parents, whose unwavering support and sacrifices

have paved the way for my academic and personal growth. Their strength and love have been my

constant source of motivation, and I hope to make them proud.

This acknowledgment is a small token of my gratitude to everyone who has been a part of this

journey. Each one of you has left an indelible mark on my life, for which I am eternally grateful.

vi

TABLE OF CONTENTS

CHAPTER 1

INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1

CHAPTER 2

EARA2023: A NEW RADIALLY ANISOTROPIC SEISMIC VELOCITY
MODEL FOR THE CRUST AND UPPER MANTLE BENEATH EAST
ASIA AND NORTHWESTERN PACIFIC SUBDUCTION ZONES . . .

3

CHAPTER 3

DEEP LEARNING FOR DEEP EARTHQUAKES: INSIGHTS FROM
OBS OBSERVATIONS OF THE TONGA SUBDUCTION ZONE . . . . 53

CHAPTER 4

DETECTING CONVERTED SEISMIC PHASES OF TONGA DEEP
EARTHQUAKES: IN-DEPTH INSIGHTS FROM DEEP-LEARNING
METHODS . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

.

BIBLIOGRAPHY . .

.

.

.

.

.

.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

vii

CHAPTER 1

INTRODUCTION

The intricate tectonic landscape of the Western Pacific, highlighted by the convergence of the Pacific

and Philippine Sea Plates along notable trenches such as the Kuril, Japan, Izu-Bonin, Mariana, Yap,

and Tonga, presents a dynamic setting for seismic studies. This region’s active subduction zones,

characterized by frequent earthquakes and arc volcanoes, form a pivotal area for understanding the

interactions between tectonic plates and the resultant geological processes [Zhang, 1997, Allégre

et al., 1984, Yin and Harrison, 2000]. To the north, the collision between the Indian and Eurasian

Plates, forming the Tibetan Plateau and Himalayan Mountains, further contributes to the complexity

of this geologically diverse area [Bruce Watson and Brenan, 1987, Tapponnier et al., 1986]. To the

south, the Tonga subduction zone, in particular, stands out due to its unique geological setting,

hosting the coldest and fastest subducting slab on Earth. This area is crucial for investigating deep-

focus earthquakes and understanding slab dehydration processes and the interaction between the

slab and surrounding mantle [Green and Houston, 1995, Zhan, 2020]. The Tonga trench provides

a natural laboratory for studying deep earthquakes, offering insights into slab geometry and the

mechanisms behind intermediate-depth and deep-focus seismic events.

Advances in seismic imaging, notably through the development of spectral-element methods

(SEM) and full-waveform inversion (FWI), have significantly enhanced our ability to model the

Earth’s interior with high fidelity [Komatitsch and Tromp, 2002a, Fichtner et al., 2009]. These

methods, leveraging high-quality seismic data and considering all usable segments of seismic wave-

forms, have led to the development of detailed seismic velocity models such as EARA2023, pro-

viding valuable insights into the crustal and upper mantle structures beneath East Asia and the

northwestern Pacific.

Parallel to seismic imaging, machine learning, especially deep learning, has revolutionized the

analysis of seismic data. Techniques like PhaseNet with the adaptation of the U-Net and EQTrans-

former with the transformer architectures for seismic phase detection and association have markedly

improved the efficiency and accuracy of identifying seismic events [Ross et al., 2018, Zhu and

1

Beroza, 2019, Mousavi et al., 2020]. These developments have been particularly impactful in de-

tecting and analyzing earthquakes in subduction zones, including the challenging offshore regions

covered by ocean-bottom seismographs (OBS) and relatively weak seismic converted phases.

The development of seismic imaging and machine learning has been driven by a confluence of

scientific inquiries, algorithmic advancements, and significant leaps in computational capabilities.

In recent years, computational power has been greatly enhanced by the advent of high-performance

computing (HPC) and the availability of large-scale computing resources, such as those provided

by the Texas Advanced Computing Center (TACC) and the Extreme Science and Engineering Dis-

covery Environment (XSEDE), SUMMIT at Oak Ridge National Laboratory, and Frontera at the

University of Texas at Austin. These resources have enabled the development of sophisticated seis-

mic models and machine learning algorithms, facilitating the analysis of large seismic datasets and

the generation of high-resolution seismic images. The algorithm design and implementation have

also been influenced by the computing architectures, such as the Graphics Processing Unit (GPU)

and the Tensor Processing Unit (TPU), which have accelerated the training of deep learning mod-

els and the computation of seismic wave propagation simulations, such as the GPU-accelerated

SPECFEM3D_GLOBE [Komatitsch and Tromp, 2002a].

This dissertation employs advanced computational techniques and machine learning to explore

the seismic phenomena characterizing the Western Pacific subduction zones. By integrating so-

phisticated seismic modeling with innovative data analysis methods, this work aims to deepen our

understanding of the complex processes driving seismic activity in this region, with a special focus

on the Tonga subduction zone. This research will contribute to the broader scientific community’s

knowledge of subduction zone dynamics and the mechanisms behind deep earthquakes, ultimately

enhancing our ability to predict and mitigate seismic hazards in these tectonically active areas. In

the meantime, the development of advanced computational techniques and machine learning algo-

rithms will continue to play a crucial role in advancing our understanding of the Earth’s interior

and the seismic processes shaping our planet, This dissertation will also serve as a testament to the

power of computational science in the geosciences.

2

CHAPTER 2

EARA2023: A NEW RADIALLY ANISOTROPIC SEISMIC VELOCITY MODEL FOR
THE CRUST AND UPPER MANTLE BENEATH EAST ASIA AND NORTHWESTERN
PACIFIC SUBDUCTION ZONES

An edited version of this chapter is under review at Geophysical Journal International. Ziyi Xi,

Min Chen, Songqiao Shawn Wei, Jiaqi Li, Tong Zhou, Baoshan Wang, and YoungHee Kim.

EARA2023: A New Radially Anisotropic Seismic Velocity Model for the Crust and Upper

Mantle beneath East Asia and Northwestern Pacific Subduction Zones

2.1 Abstract

We present a new 3-D radially anisotropic seismic velocity model EARA2023 of the crust and

mantle beneath East Asia and the northwestern Pacific using adjoint full-waveform inversion to-

mography. We construct the EARA2023 model by iteratively minimizing the waveform similarity

misfit between the synthetic and observed waveforms from 142 earthquakes recorded by about 2,000

broadband stations in East Asia. Compared to previous studies, this new model renders significantly

improved images of the subducted oceanic plate in the upper mantle, mantle transition zone, and

uppermost lower mantle along the Kuril, Japan, Izu-Bonin, and Ryukyu Trenches. Complex slab

deformation and break-offs are observed at different depths. Moreover, our model provides new

insights into the origins of intraplate volcanoes in East Asia, including the Changbaishan, Datong-

Fengzhen, Tengchong, and Hainan volcanic fields.

2.2

Introduction

East Asia and the northwestern Pacific subduction zones consist of various tectonic structures

and host diverse and active geological processes in the crust and upper mantle (Figure 2.1). In

the east, the Pacific Plate subducts westward along the Kuril, Japan, Izu-Bonin, Mariana, and Yap

Trenches, and the Philippine Sea Plate subducts along the Nankai, Ryukyu, Malina, and Philippine

Trenches. These subduction zones, with numerous earthquakes and arc volcanoes, are one of the

most geologically active regions in the world. In the middle, the Xing’an-East Mongolia Block,

North China Block, and South China Block form the east-central end of the Eurasian continent

3

[Zhang, 1997]. In the west, the collision between the Indian and Eurasian Plates forms the Tibetan

Plateau and Himalayan Mountains [Allégre et al., 1984, Yin and Harrison, 2000]. The complex

geological structure and active tectonics in this region provide a natural laboratory for studying

inter-plate and intra-plate tectonics. For instance, the collision of the Indian Plate and Eurasian Plate

results in the Karakorum, Kunlun, Altyn-Tagh Faults and the Tarim, Junggar, and Qaidam Basins

[Bruce Watson and Brenan, 1987]. The resistance of the Yangtze Block in southern China to the

subducted Indian Plate causes numerous intra-plate earthquakes in southwestern China [Tapponnier

et al., 1986]. Furthermore, the subducted Pacific Plate in the mantle transition zone (MTZ) also has

significant influences on the evolution of the East Asian lithosphere [Zhao et al., 2007, 2009, Lei

et al., 2013, He et al., 2014, Tang et al., 2014, Guo et al., 2016, 2018].

High-resolution seismic images provide valuable information about the crustal and upper man-

tle structures and geological processes in this region. Since decades ago, body-wave traveltime

tomography, based on either the high-frequency approximation or 3-D finite-frequency kernels of

seismic wave propagation, has been widely used to image East Asia [Aki and Lee, 1976, Dziewon-

ski et al., 1977, Friederich, 2003, Lei and Zhao, 2005, 2016, Huang and Zhao, 2006, Zhao et al.,

2007, 2009, Kustowski et al., 2008b, Lei et al., 2009, Tian et al., 2009, Li and Van Der Hilst, 2010,

Obrebski et al., 2012]. In addition, ambient noise, surface wave dispersion, and Rayleigh wave el-

lipticity also have been used to image the subsurface structure beneath East Asia [Yao et al., 2006,

2008, Zheng et al., 2006, Witek et al., 2014, Bao et al., 2015, Li et al., 2016, Yang et al., 2019].

Finite-difference methods of simulating wave propagation provide another tool for studying the East

Asian crust and upper mantle [Wang et al., 2013].

The development of the spectral-element method (SEM) provides another way to simulate wave

propagation and naturally considers the finite-frequency effects [Komatitsch and Tromp, 2002a,b].

Its combination with the adjoint state method provides an effective way to compute the gradients of

3-D model parameters and thus can be used for tomographic imaging. The full-waveform inversion

(FWI) method based on SEM and adjoint state methods images the Earth’s interior by iteratively

applying SEM to simulate seismic waveforms in 3-D models and then updating the model with

4

finite-frequency kernels. By considering all usable segments of seismic waveforms, this method

uses much more information than other seismic tomography methods, thus potentially enhancing

the model resolution. Furthermore, there is no need for a crustal correction, a common practice in

conventional tomography methods, because the crust is explicitly included in the simulations. This

approach also makes it easy to consider the effects of the Earth’s ellipticity, bathymetry, topography,

and internal discontinuities in the full waveform simulation. Several global or regional models have

been developed using this method [Tape et al., 2007, Fichtner et al., 2009, 2010, Tape et al., 2009,

2010, Zhu et al., 2012a,b, Lei et al., 2013, French and Romanowicz, 2014, Chen et al., 2015, Bozdağ

et al., 2016, Tao et al., 2018, Lei et al., 2020, Zhou et al., 2022, Ma et al., 2022].

In this work, we use SPECFEM3D_GLOBE [Komatitsch and Tromp, 2002a,b, Peter et al.,

2011] to invert 10 years of high-quality seismic data collected from East Asia and the northwestern

Pacific to image the crustal and mantle structure. This paper presents our latest FWI tomography

model of this region: EARA2023. First, we will introduce our dataset and methods. Second, we

will assess our model with resolution tests and waveform comparisons. Third, we will describe

the model and compare our model with previous models. Last, we will discuss slab morphology

in northwestern Pacific subduction zones and the origin of four intra-plate volcanoes based on our

new model.

2.3 Data and Method

2.3.1 Data selection and processing

We defined our study region in East Asia and the northwestern Pacific as a nearly square region

centered at 36°N, 123°E, with a length of 60° on each side, and anti-clockwise rotated by 10° (Fig-

ure 2.1). We selected 142 earthquakes with Mw 5.2–7.0 from the global centroid moment tensor

(CMT) catalog [Ekström et al., 2012] between 2008 and 2018 in our study region (Figure 2.2).

These earthquakes were carefully chosen to provide balanced data coverage. We excluded events

less than 2 degrees away from the study region boundary to further reduce the numerical noise from

the boundary effect, even though Clayton–Engquist–Stacey absorbing conditions [Clayton and En-

gquist, 1977, Stacey, 1988] had been used in our simulations. 58 of these 142 earthquakes were

5

deeper than 150 km for better constraining deep mantle structures, such as the subducted slabs

and MTZ. We also excluded events with Mw larger than 7.0 or source half duration time longer

than 7 s to avoid complex rupture processes that might complicate the structural inversion. Con-

sequently, our earthquake catalog consisted of 142 events in the Kuril, Japan, Izu-Bonin, Mariana,

Nankai, Ryukyu, Manila, and Philippine subduction zones down to about 650 km depth, as well as

continental events shallower than 30 km in the Eurasian lithosphere.

We used three-component seismic waveforms recorded by about 2,000 stations from the China

Earthquake Administration Array (CEArray) [Zheng et al., 2009], F-net [Okada et al., 2004], NEC-

ESSAray [Tang et al., 2014], INDEPTH IV project [Wei et al., 2010, Yue et al., 2012], the central

Mongolia seismic experiment [Meltzer et al., 2019], the Korean seismic network and other regional

or global seismic networks. More than 90% of the stations had broadband (BH and HH) seismome-

ters, whereas other stations had short-period high-gain (SH) seismometers. After detrending and

tapering, we first removed the instrumental response to get displacement waveforms in three com-

ponents: vertical (Z), east (E), and north (N). We then rotated the two horizontal components (E

and N) to the radial (R) and tangential (T) components. In the next step, the R and T components of

stations with misorientation were corrected according to the methods described in previous studies

for CEArray [Niu and Li, 2011] and the Korean seismic network [Lim et al., 2018]. In the first in-

version stage, we applied a second-order zero-phase Butterworth bandpass filter between 10 and 40

s for body waves and 40 to 120 s for surface waves. In the second stage, we tried to achieve higher

resolution with higher-frequency signals and filtered body waves at 8–40 s and surface waves at

30–120 s. In the third stage, we kept the same bandpass filter range for body waves as the second

stage, and filtered surface waves at 30–120 s. Finally, we down-sampled the filtered data to 10

samples per second.

2.3.2 Configuration of the Spectral-Element-Method Simulation

We use the spectral-element method [Komatitsch and Tromp, 2002a,b] to simulate wave prop-

agation and compute sensitivity kernels based on the adjoint state method. In addition, we use

the conjugate gradient method to maximize the waveform similarity and phase matching between

6

data and synthetics. The 3-D model space extends from the Earth’s surface, as delineated by the

Crust1.0 model [Laske et al., 2013], down to the center of the Earth. We construct the initial model

by combining two regional models, FWEA18 [Tao et al., 2018] and EARA2014 [Chen et al., 2015],

embedded in two global models, S362ANI [Kustowski et al., 2008b] for the mantle and CRUST1.0

for the crust. FWEA18, a radially anisotropic model, contributes to the majority of this initial

model. In regions not covered by FWEA18, the initial model relies on EARA2014 instead. This

approach leverages the consistency and compatibility of the methodologies behind these models

and is driven by the progression in model development, where each new model builds upon the

previous one, supplemented by additional data. It is highly likely that the most recent model pro-

vides a better estimation of the real Earth. By starting with a model that is as close to reality as

possible, we aim to enhance the efficiency and accuracy of our inversion process. Our model con-

struction process involved merging the velocity perturbations of FWEA18 and EARA2014 relative

to S362ANI in the mantle and to Crust1.0 in the crust. To facilitate a smooth integration of these

models and mitigate numerical instability, we apply Gaussian smoothing to the combined pertur-

bations. The extent of these models are shown in Figure 2.25. We fix the topography of the 410-

and 660-km discontinuities to that from S362ANI and the attenuation structure to that from QL6

[Durek and Ekström, 1996] during the inversions.

In the crust and upper mantle, we use five parameters to parameterize our model: density 𝜌,

isotropic bulk sound speed 𝑉𝐶, horizontally propagating S wave speeds 𝑉𝑆𝑉 (polarized in the ver-

tical direction) and 𝑉𝑆𝐻 (polarized in the horizontal direction), and a non-dimensional parameter 𝜂

measuring the anellipticity of P wave phase slowness. Radial anisotropy is introduced to the crustal

part of our model because previous studies indicate the presence of radial anisotropy in this region

[Witek et al., 2021, Xie et al., 2013, Tao et al., 2018]. Since previous studies [e.g., Panning and

Romanowicz, 2006] suggest that the upper part of the lower mantle is nearly isotropic, we parame-

terize it with isotropic P wave speed 𝑉𝑃, isotropic S wave speed 𝑉𝑆, and density 𝜌. Because density

𝜌 cannot be well constrained by our data, we scale 𝜌 with isotropic shear wave speed perturba-

tion 𝛿𝑙𝑛𝑉𝑆 as 𝛿𝜌 = 0.33𝛿𝑙𝑛𝑉𝑆 [Panning and Romanowicz, 2006]. Here the perturbation reference

7

model STW105 [Kustowski et al., 2008a] is crucial in the inversion process, as we update model

perturbations rather than the absolute values of model parameters.

The horizontal dimension of the spectral element is set as ∼20 km in the crust and doubled

in the mantle, and the vertical dimension is ∼10 km in the crust and ∼35 km in the mantle. The

shortest period for waveform forward modeling is 8 s. Each spectral element contains 125 Gauss-

Lobatto-Legendre interpolation points. In total, the simulation involves about 2.3 million elements.

To simulate a 30-minute wave propagation for one event, the SPECFEM_GLOBE takes ∼30 min-

utes wall time for the forward simulation and ∼70 minutes for sensitivity kernel calculation using

441 cores on the SKX nodes of the Stampede2 supercomputer at the Texas Advanced Computing

Center (TACC). The total volume of seismic data and synthetics in each iteration is ∼800 GB. To

improve I/O efficiency, we develop a Python-based data analysis and inversion workflow based on

the Adaptable Seismic Data Format (ASDF) [Krischer et al., 2016] and ObsPy [Beyreuther et al.,

2010] to store and process the seismic data.

2.3.3 Waveform Misfit Measurement

We group our measurements into 6 categories: P-SV body waves on the vertical and radial

components, SH body waves on the transverse component, Rayleigh waves on the vertical and radial

components, and Love waves on the transverse component. We use a conjugate gradient method to

iteratively minimize the misfit between recorded waveforms and synthetics. More specifically, we

define the misfit function as a normalized zero-lag cross-correlation coefficient (NZCC) because

this coefficient reflects the model based on waveform similarity and thus works more effectively

compared to other misfit functions that only use traveltime information [Tao et al., 2017, Zhou

et al., 2021]. The NZCC for each measurement window is defined as:

NZCC𝑐,𝑒,𝑠,𝑤 =

∫

∫

𝑠𝑐,𝑒,𝑠,𝑤 (𝑡) · 𝑢𝑐,𝑒,𝑠,𝑤 (𝑡)𝑑𝑡
∫

𝑠𝑐,𝑒,𝑠,𝑤 (𝑡)𝑑𝑡 ·

𝑢𝑐,𝑒,𝑠,𝑤 (𝑡)𝑑𝑡

(2.1)

where 𝑠(𝑡) is the synthetic and 𝑢(𝑡) is the observed waveform in category 𝑐 and window 𝑤 for event

𝑒 recorded by station 𝑠. We measure all major body-wave and surface-wave phases, including P,

pP, sP, PP, S, sS, SS, ScS, Rayleigh wave, and Love wave.

8

Compared with other misfit measurements that utilize dispersion or frequency-dependent phase

arrival time, the NZCC is a fixed frequency measurement method. Previous studies introduce sev-

eral waveform inversion strategies to address dispersion, including generalized seismological data

functionals [Gee and Jordan, 1992], multi-taper delay time measurements [Tape et al., 2010], in-

stantaneous phase and envelope misfits [Bozdağ et al., 2011, Rickers et al., 2012], exponentiated

phase [Yuan et al., 2020], and time-frequency phase misfit for waveform misfit objectives [Fichtner

et al., 2009, 2010, 2013, Krischer et al., 2018]. However, these methods are predominantly focused

on the first arrival time. The arrival time misfit, dominated by the first arrival’s large amplitude,

means that the corresponding adjoint sources do not account for waveform distortions associated

with later arrivals. In contrast, NZCC considers the entire waveform window, and thus offers bet-

ter performance in complex scenarios, as evidenced by both synthetic tests [Tao et al., 2017] and

real-data studies [Tao et al., 2018, Zhou et al., 2021, 2022]. To take frequency-dependent arrival

times into consideration, we utilize different frequency bands at various stages of our analysis.

Following Tao et al. [2018], we select a measurement window based on a predicted traveltime

because this strategy is more computationally efficient compared to the conventional FLEXWIN

method [Maggi et al., 2009]. We define the body-wave measurement windows from 20 s before

to 50 s after the predicted phase arrival times based on the AK135 model [Kennett et al., 1995].

If multiple body-wave windows overlap with each other, we merge them into a single one. For

surface waves, the measurement window starts 50 s before a predicted phase arrival time, assuming

a wave speed of 4.6 km/s, and ends 50 s after a predicted phase arrival time, assuming a wave

speed of 3.3 km/s. We only measure surface waves from events shallower than 150 km. In total,

we measure ∼650,000 body-wave windows and ∼150,000 surface-wave windows that satisfy our

criteria of signal-to-noise ratio (SNR), cross-correlation coefficient (CC), and time shift. The three

criteria are described in Appendix 2.7.1 in detail. It is worth noticing that the absolute amplitude

information is not taken into account because the NZCC is normalized. Therefore, our inversion

is insensitive to seismic attenuation even though it may slightly distort waveforms. Therefore, we

only focus on the Earth’s elastic properties in this study.

9

We apply categorical and geographical weighting (Table 2.1) to balance the misfits from differ-

ent categories and regions. The weighted misfit function is

𝑁𝑐∑

𝑁𝑒∑

𝑁𝑒,𝑠∑

𝑁𝑐,𝑒,𝑠,𝑤∑

𝜒 =

𝑊𝑐

𝑊𝑒,𝑠

𝑐=1

𝑒=1

𝑠=1

𝑤=1

𝑊𝑤 (1 − NZCC𝑐,𝑒,𝑠,𝑤)

(2.2)

where 𝑁𝑐 represents the number of categories (i.e., 6) used in our inversion, 𝑁𝑒 indicates the total

number of events, 𝑁𝑒,𝑠 is the number of stations for event 𝑒, and 𝑁𝑐,𝑒,𝑠,𝑤 is the number of mea-

surement windows in the corresponding category 𝑐 for event 𝑒 and station 𝑠. 𝑊𝑐 is a categorical

weighting term, balancing the misfit contribution from different categories. 𝑊𝑒,𝑠 is a geographical

weighting term, balancing the misfit contribution from different regions and mitigating the bias

due to nonuniform station distribution [Ruan et al., 2019]. 𝑊𝑤 is a weighting term to control the

waveform quality in each measurement window.

The corresponding adjoint source of NZCC is defined as

where:

𝑁𝑐∑

𝑁𝑒∑

𝑁𝑒,𝑆∑

𝑁𝑐,𝑒,𝑠,𝑤∑

𝛿 𝜒 =

𝑊𝑐

𝑊𝑒,𝑠

𝑐=1

𝑒=1

𝑠=1

𝑤=1

𝑊𝑤

𝜕 𝜒𝑐,𝑒,𝑠,𝑤
𝜕𝑡

𝛿𝑡

𝜕 𝜒𝑐,𝑒,𝑠,𝑤
𝜕𝑡

𝑠𝑐,𝑒,𝑠,𝑤 (𝑡) − 𝐴𝑐,𝑒,𝑠,𝑤𝑢𝑐,𝑒,𝑠,𝑤 (𝑡)
𝑊 𝑐,𝑒,𝑠,𝑤

=

𝐴𝑐,𝑒,𝑠,𝑤 is defined as

∫

𝐴𝑐,𝑒,𝑠,𝑤 =

∫

𝑢𝑐,𝑒,𝑠,𝑤 (𝑡)𝑑𝑡

𝑠𝑐,𝑒,𝑠,𝑤 (𝑡)𝑑𝑡

∫

|𝑢𝑐,𝑒,𝑠,𝑤 (𝑡)|2 𝑑𝑡

And the normalization factor 𝑊 𝑐,𝑒,𝑠,𝑤 is defined as

√∫

∫

𝑊 𝑐,𝑒,𝑠,𝑤 =

|𝑠𝑐,𝑒,𝑠,𝑤 (𝑡)|2𝑑𝑡

|𝑢𝑐,𝑒,𝑠,𝑤 (𝑡)|2𝑑𝑡

The structural inversion details will be described below in Section 2.3.5.

2.3.4 Source Inversion

(2.3)

(2.4)

(2.5)

(2.6)

Earthquake source parameters also influence seismic waveforms, and thus we have to invert for

source and structure iteratively. Here we use the steepest descent method to optimize the source

model and the adjoint state method to compute the gradients following Kim et al. [2011]. The misfit

10

function and the adjoint source of event 𝑒 for the source inversion are defined in similar ways:

𝑁𝑐∑

𝑁𝑒,𝑠∑

𝑁𝑐,𝑒,𝑠,𝑤∑

𝜒𝑒 =

𝑊𝑐

𝑊𝑒,𝑠

𝑐=1

𝑠=1

𝑤=1

𝑊𝑤 (1 − 𝑁 𝑍𝐶𝐶𝑐,𝑒,𝑠,𝑤)

𝜕 𝜒𝑒
𝜕𝑡 =

𝑁𝑐∑

𝑁𝑒,𝑠∑

𝑁𝑐,𝑒,𝑠,𝑤∑

𝑊𝑐

𝑊𝑒,𝑠

𝑐=1

𝑠=1

𝑤=1

𝑊𝑤

𝜕 𝜒𝑐,𝑒,𝑠,𝑤
𝜕𝑡

(2.7)

(2.8)

The subscripts and superscripts here have the same definitions as those in Section 2.3.3. Categorical

weighting is defined only for a single event:

𝑊𝑐 =

∑𝑁𝑒,𝑠

1
𝑠=1 N𝑐,𝑒,𝑠,𝑤

(2.9)

The invertible source parameters include centroid moment tensor 𝑀, 3-D centroid location 𝑋, and

source time function 𝑆(𝑡). Readers are referred to Appendix 2.7.2 for technical details.

2.3.5

Inversion Process

The inversion process is divided into three stages, and each stage includes source inversions

followed by structure inversions using a preconditioned conjugate gradient method [Polak and Ri-

biere, 1969]. In the first stage, after 3 iterations of source inversion based on the initial model 𝑚00,

the source parameters remain unchanged, and we perform 5 iterations of structural inversion, using

body waves filtered at 10-40 s and surface waves filtered at 40-120 s. Then we change the dimension

of the Gaussian smoother to sensitivity kernels from 50 km horizontally and 25 km vertically to 25

km horizontally and 10 km vertically for another 5 iterations. In the second stage, we start from

the output model from the first stage and use higher-frequency body waves filtered at 8-40 s and

surface waves filtered at 30-120 s. In this stage, after 3 iterations of source inversion, we perform

another 10 iterations of structure inversion. In the third stage, we have also performed the same

number of inversions as the second stage, but the surface waves are further filtered at 20-120 s. In

sum, we run 9 iterations of source inversion and 30 iterations of structure inversion. Readers are

referred to Appendix 2.7.3 for technical details about the structure inversion. We conclude each

stage after 10 iterations for efficiency. Although the inversion does not necessarily converge at

the end of each stage, the misfit reduction is small with additional iterations (Figure 2.21). More

importantly, the subsequent stage can further reduce the data misfit with the frequency band used

11

in the previous stage. Figure 2.3 shows that the inversions converge after 30 iterations where misfit

reduction becomes minimal. The whole inversion process consumes about 4.5 million CPU hours

on Stampede2.

2.4 Results and Model Assessment

2.4.1 Average 𝑉𝑃 and 𝑉𝑆 models

The final structure model is named EARA2023 (East Asia Radial Anisotropic Model 2023).

Average 𝑉𝑃 and 𝑉𝑆 are calculated following Panning and Romanowicz [2006]:

𝑉 2

2𝑉 2

𝑆𝑉 + 𝑉 2
𝑆𝐻
3
𝑃𝑉 + 4𝑉 2
𝑉 2
𝑃𝐻
5
The model shows 𝛿𝑙𝑛𝑉 perturbations as large as 6%. The 𝛿𝑙𝑛𝑉𝑆 perturbations are typically

(2.10)

(2.11)

𝑆 =

𝑃 =

𝑉 2

larger than the 𝛿𝑙𝑛𝑉𝑃 perturbations, but both show similar large-scale features (Figures 2.4 and S1).

S-wave radial anisotropy (𝑉𝑆𝐻 − 𝑉𝑆𝑉 )/𝑉𝑆 is about 3% in the lithosphere and varies from -3% to

3% in the asthenosphere (Figure 2.5). For the rest of this paper, we focus on the S-wave model

because it is better resolved than the P-wave model due to the constraints from surface waves.

As discussed by Tao et al. [2018], thin anomalies ∼60 km above and below the 410- and 660-

km discontinuities are likely artifacts due to the discontinuity topography model from S362ANI

[Kustowski et al., 2008a] that is fixed during inversions. For the same reason, some high-velocity

anomalies, interpreted as subducted slabs, seem to be split into two parts separated by a thin low-

velocity layer in between, with the upper part in the MTZ and the lower part in the uppermost

lower mantle. These are common artifacts shown in previous models using the same approach of

handling discontinuities [Tao et al., 2018, Lei et al., 2020], and should not be over-interpreted. To

avoid over-interpretation, we place semi-transparent bands ±60 km around the 410- and 660-km

discontinuities to mask potential artifacts due to the discontinuity topography.

The EARA2023 model reveals several important tectonic structures similar to previous models

in this region [Chen et al., 2015, Tao et al., 2018]. Figure 2.4 shows high-velocity anomalies beneath

the Kuril, Japan, Izu-Bonin, and Ryukyu subduction zones, as well as the Sichuan Basin, eastern

12

Tibet Plateau, and Ordos Block in China at lithospheric depths. Compared to previous models,

one obvious improvement is the high-resolution structure of the Japan slab beneath the Chang-

baishan volcano in the MTZ. Figure 2.6 compares cross-sections of multiple tomography models

EARA2023, FWEA18 [Tao et al., 2018], EARA2014 [Chen et al., 2015], GLAD_M25 [Lei et al.,

2020], SPiRaL [Simmons et al., 2021], and GAP_P4 [Obayashi et al., 2013]. In global models

(GLAD_M25, SPiRaL, and GAP_P4), the flat slab appears to be continuous in the MTZ. However,

Tang et al. [2014] used regional traveltime tomography to first image a gap in the flat slab beneath

the Changbaishan volcanoes. Over the course of developing FWI models EARA2014, FWEA18,

and EARA2023, this gap structure gradually emerges and becomes more complex than previously

thought. The EARA2023 model shows that there are at least 2 holes in the flat slab at 125°E and

129°E. Their implications will be discussed in Section 2.5.2.

Here we only highlight a few complex crustal and upper mantle structures beneath the Tibetan

Plateau, Sichuan Basin, and North China Craton (Figure 2.7) while keeping thorough interpreta-

tions for future studies. The Tibetan Plateau is the world’s highest and largest plateau, resulting

from the ongoing collision of the Indian and Eurasian Plates. The Sichuan Basin, located to the

east of the plateau, is a sedimentary and tectonic basin. We observe high-velocity anomalies in the

upper mantle beneath the Tibetan Plateau and Sichuan Basin (Figures 2.7a, 2.7b, and 2.7c). The

upper crust of the Sichuan Basin is characterized by high-velocity anomalies (Figure 2.20). West of

100°E, we image a large high-velocity anomaly zone with a maximum 𝛿𝑙𝑛𝑉𝑆 perturbation of 4%,

potentially indicating a subducted plate reaching the MTZ and uppermost mantle (see discussions

in Section 2.5.1). Moving eastwards, another high-velocity anomaly is observed below the Sichuan

Basin at about 100 km depth, with a maximum 𝛿𝑙𝑛𝑉𝑆 perturbation of 6%, representing the western

edge of the South China Block and extending to a depth of around 300 km. We find significant

low-velocity anomalies in the crust and uppermost mantle beneath the North China Craton (NCC)

(Figures 2.7d and 2.7e), a stable continental region that has experienced significant tectonic defor-

mation and lithosphere thinning [Zhu et al., 2011]. Beneath the Huabei Plain, the eastern part of the

NCC, the strong low-velocity anomaly at 80–150 km depths reveals a thinned lithosphere above it.

13

In contrast, the lithosphere beneath the Ordos Block, indicated by a high-velocity anomaly, extends

to at least 200 km depth.

In summary, our study provides a detailed model of the crustal and upper mantle structures

beneath different tectonic features in East Asia and the northwestern Pacific. These observations

offer valuable insights into the underlying geological processes in this region.

2.4.2 Model Assessment

To assess the model, we first compute the phase time residual (Δ𝑇) between synthetic (Itera-

tion #30) and observed waveforms. Figure 2.8 shows that the EARA2023 model has significantly

reduced phase time residuals, particularly for surface waves, compared to the initial model. Addi-

tionally, the final model shows improved waveform similarity, as measured by normalized zero-lag

cross-correlation coefficient (NZCC) and cross-correlation coefficient (CC). NZCC measures both

phase time shift and similarity, while CC only measures phase similarity. These results demon-

strate that the final model provides a more accurate representation of the seismic structure in this

region.

To further assess the model resolution, we conduct a point spread function (PSF) test [Fichtner

and Trampert, 2011b], a widely used method for resolution tests in FWI studies [Chen et al., 2015,

Tao et al., 2018, Zhou et al., 2022]. Compared to traditional checkerboard methods, which are

computationally expensive for FWI, the PSF test provides a more efficient way to evaluate smooth-

ing, distortion, and parameter trade-offs in the inversion process. The PSF test results are shown in

Figure 2.9. For a checkerboard-shaped model perturbation to 𝑉𝑆𝑉 (Figure 2.9a), we expect the PSF

response for 𝑉𝑆𝑉 to resemble the input in high-resolution regions. The 𝑉𝑆𝑉 response (Figure 2.9b)

shows that the EARA2023 model has good resolutions in central-eastern China, Korea, the Japan

Sea, the Yellow Sea, and the East China Sea at 100 km depth. However, due to the event distribution,

the 𝑉𝑆𝑉 PSF response is distorted in Mongolia, the South China Sea, and the Izu-Bonin-Mariana

Arc. At greater depths, the 𝑉𝑆𝑉 PSF response remains similar to that at 100 km but with smaller

amplitudes. At depths greater than 900 km, the resolution is low, and the model should not be

over-interpreted. Ideally, perturbations to 𝑉𝑆𝑉 should not affect 𝑉𝑆𝐻, and thus the PSF response

14

for 𝑉𝑆𝐻 should be 0. However, the results show non-zeros 𝑉𝑆𝐻 response (Figure 2.9c), suggest-

ing weak trade-offs between 𝑉𝑆𝑉 and 𝑉𝑆𝐻, as they both rely on the same elastic moduli. In other

words, a small portion of 𝑉𝑆𝑉 heterogeneity is artificially mapped into 𝑉𝑆𝐻, slightly biasing radial

anisotropy. Similarly, Figure 2.9d shows the bulk velocity 𝑉𝐶 response being nearly 0 everywhere

(Figure 2.9d), suggesting that the inversion can perfectly distinguish 𝑉𝑆 and 𝑉𝐶. More technical

details of the PSF tests are presented in Appendix 2.7.4.

Waveform comparisons between observed waveforms, synthetic waveforms based on the initial

structure and source models, and synthetic waveforms based on the final EARA2023 and source

models also demonstrate the improvements. Figure 2.10 shows an example of such comparisons

for a 21-km deep event on May 7, 2008 (GCMT ID: 200805071602A) in the Japan subduction zone

recorded by a CEArray station (JX.JIJ) in southeastern China. The results show that the final models

provide better waveform fitting than the initial models, particularly for surface waves. Figure 2.19

offers another example of fitting triplicated waveforms with the final models. Additionally, Figures

2.22 and 2.23 provide independent validation of our models using recent earthquakes (GCMT ID:

202307011438A and 202306220124A) which are not used for inversions. The waveforms of these

earthquakes at several GSN stations are better fitted with our final models.

Since the final models integrate both source and structural inversions, it is crucial to distinguish

the improvements attributable to source inversions and structural adjustments. Figure 2.8 compares

histograms of Δ𝑇, NZCC, and CC for three different cases: (a) using the initial structural and source

models; (b) using the initial structural model with inverted source models after 3 iterations of the

source inversion updates in the first stage of inversions; and (c) using the final structural and source

models. This approach highlights that while the source inversions mainly adjust phase time shifts

Δ𝑇, structural inversion plays a significant role in improving NZCC and CC. Additionally, Figure

2.24 offers a comparative analysis between observed waveforms and synthetics produced for the

three different cases mentioned above. This comparison shows the collaborative improvements for

the waveform fitting through both the source and structure inversions.

Moreover, we use the phase time residual Δ𝑇 as a key indicator of source parameter variations, as

15

Δ𝑇 is mostly influenced by the change of event origin time in the source inversions. For body waves,

the source inversions greatly reduce the mean and standard deviation of Δ𝑇. For surface waves,

which are more sensitive to the crustal and mantle structures, our analysis leads to insignificant

improvement of the mean and standard deviation of Δ𝑇 in the source inversions. Therefore, the

structure inversions should have contributed most of the improvements for the crustal and mantle

structures. Therefore, we are confident with the EARA2023 model in capturing seismic structures

beneath East Asia and northwestern Pacific subduction zones.

2.5 Discussion

2.5.1 Slab morphology of Northwestern Pacific Subduction Zones

The northwestern Pacific region hosts a number of subduction zones. The Pacific Plate subducts

westwards beneath the Eurasian Plate along the Kuril, Japan Trenches and beneath the Philip-

pine Sea Plate along the Izu-Bonin, Mariana, and Yap Trenches, whereas the Philippine Sea Plate

subducts beneath the Eurasian Plate northwestwards along the Nankai, Ryukyu, Malina, and Philip-

pine Trenches. Numerous earthquakes and arc volcanic activities occur in these subduction zones.

Figure 2.11 presents vertical cross-sections of the 𝛿𝑙𝑛𝑉𝑆 perturbation for the Kuril, Japan, Izu-

Bonin, Nankai, and Ryukyu subduction zones. We also create a 3-D slab model in this region by

identifying all fast-𝑉𝑆 anomalies of > 1.5% (Figure 2.12).

The Kuril, Japan, and Izu-Bonin slabs are the subducted Pacific Plate along the Kuril, Japan,

and Izu-Bonin Trenches, respectively. The Kuril slab is characterized by a nearly constant dipping

angle of ∼36° and a significant maximum 8% 𝛿𝑙𝑛𝑉𝑆 perturbation reaching the 660-km discontinuity

(Figure 2.11a). A substantial low-velocity anomaly zone above 200 km depth represents the vast

mantle wedge, with a maximum 𝛿𝑙𝑛𝑉𝑆 perturbation of 4%. The Japan slab, characterized by high-

velocity anomalies, subducts with a shallower dipping angle of ∼25° in the upper mantle and is

deflected and extends horizontally in the mantle transition zone (MTZ) for at least 800 km. There are

two holes within the slab at ∼125°E and 129°E (Figures 2.11c). Their implications and relationship

to Changbaishan volcanoes will be discussed in Section 2.5.2. The Japan slab may extend further

to ∼115°E, with a big slab hole between ∼120°E and ∼123°E. Figure 2.11b shows the transition

16

between the Kuril slab to the Japan slab, where the subducted Pacific plate starts to flatten in the

MTZ. It is worth noting that the Japan slab beneath the East China Sea (Figure 2.11g) is much thicker

than that beneath Northeast China (Figure 2.11c), extending into the uppermost lower mantle.

The mantle wedge above the Japan slab shows a low 𝛿𝑙𝑛𝑉𝑆 anomaly of as much as -6%, reaching

depths between 200 and 300 km. Another low-velocity anomaly is detected beneath the Japan slab

in the MTZ, with a peak 𝛿𝑙𝑛𝑉𝑆 anomaly of -2%. Because the major part of this anomaly is well

above the 660-km discontinuity, we believe it is a real feature and may indicate sub-slab partial

melting that is entrained by subduction from the shallow depths [Kawakatsu et al., 2009, Wang

et al., 2020].

The Izu-Bonin slab, the Pacific Plate subducted beneath the Philippine Sea Plate along the Izu-

Bonin Trench, is characterized by a maximum 𝛿𝑙𝑛𝑉𝑆 anomaly greater than 6%, and has a steeper

dipping angle of ∼40° in the upper mantle. The slab appears flat but heavily deformed in the mid-

mantle, with pieces of high-velocity anomalies at varying depths in the MTZ and uppermost lower

mantle (Figures 2.11d and 2.11e). The slab deformation is particularly clear in the southern part of

the Izu-Bonin slab, although we lose the resolution of the Mariana slab to the further south. Our

observations of high-velocity anomalies are consistent with a previous study [Zhang et al., 2019a]

that interpreted the anomalies as evidence of a single contiguous slab undergoing tearing.

The Philippine Sea Plate is much younger and thus warmer than the Pacific Plate in this re-

gion. As the Philippine Sea Plate subducts beneath the Eurasian Plate along the Nankai Trough and

Ryukyu Trench, the Nankai and Ryukyu slabs show distinct characteristics. The Nankai slab, one

of the warmest slabs globally, subducts with a dipping angle of ∼20° to 200 km depth, with a maxi-

mum 𝛿𝑙𝑛𝑉𝑆 perturbation of only 1% (Figure 2.11f). Consequently, the Nankai slab does not appear

in our 3-D slab morphology model (Figure 2.13), which shows the 1.5% isosurface of high-velocity

anomalies. The Ryukyu slab subducts with a nearly constant dipping angle of ∼50°and extends to

∼500 km depth (Figure 2.11g), consistent with previous studies [Li et al., 2008, Tao et al., 2018].

Wu et al. [2016] interpret the thick high-velocity anomaly in the MTZ and uppermost lower mantle

beneath the Yellow Sea as the subducted Philippine Sea Plate overlying the Pacific Plate. However,

17

our model shows a ∼300-km-wide gap between the Ryukyu slab beneath the East China Sea and

the Pacific Plate beneath the Yellow Sea. Therefore, the thick high-velocity anomaly beneath the

Yellow Sea is more likely the Japan slab ponding around the 660-km discontinuity, as suggested by

Tao et al. [2018].

Similar to Tao et al. [2018], we also image a NE-SW striking high-velocity anomaly in the MTZ

beneath the South China Block. It appears to connect to the Japan slab beneath the Yellow Sea.

However, this connection may be an artifact due to low resolution, as there is no evidence suggesting

that the subducted Pacific Plate reaches this far to the southwest. Tao et al. [2018] speculate that this

anomaly is related to the subduction of a proto-South China Sea, although a plate reconstruction

model places this structure further to the east [Wu et al., 2016].

2.5.2

Intra-plate Volcanoes

Intra-plate volcanoes are intriguing because they cannot be easily explained by the classic the-

ory of plate tectonics. Some of them, e.g., the Hawaiian volcanoes, are related to mantle plumes

originating from the lower mantle [e.g., Wolfe et al., 2009, French and Romanowicz, 2015]. Some

of them, e.g., Bermuda Island, are rooted in the MTZ [e.g., Mazza et al., 2019]. However, the ori-

gins of other intra-plate volcanoes, especially those on continents, e.g., the Yellowstone volcano,

are under debate [Nelson and Grand, 2018, Zhou, 2018, Zhou et al., 2018]. Here we focus on four

intra-plate volcanic fields, Datong-Fengzhen, Tengchong, Changbaisan, and Hainan in East Asia,

and try to explore their origins. These volcanoes are far away from any active subduction zones and

thus are unlikely to be related to arc or back-arc volcanism. They are also distant from any known

divergent plate boundaries.

The Datong-Fengzhen volcanoes in northern China predominantly consist of Cenozoic basaltic

lava flows and pyroclastic deposits [Xu et al., 2005, Smithsonian-Institution, 2023]. Lei [2012] im-

age the upper mantle beneath the North China Craton with teleseismic wave traveltime tomography

and suggest that Datong-Fengzhen volcanoes are generated by hot materials upwelling from the

lower mantle. In contrast, Kim et al. [2021] image the S-wave velocity structure of East Asia with

teleseismic traveltime tomography and suggest that these volcanoes are generated by slab-induced

18

dehydration of the hydrous mantle transition zone. Moreover, by imaging mantle discontinuities

beneath eastern North China with receiver functions, Zuo et al. [2020] find a normal MTZ thickness

in this region, ruling out the possibility of an abnormally hotter mantle plume. Instead, they sug-

gest that the mantle return flow induced by lithospheric delamination beneath the southern Great

Xing’an Range generates decompression melting to feed these volcanoes. Our results show a low-

velocity zone immediately below the Datong-Fengzhen volcanoes, extending to ∼400 km depth

(Figure 2.13a and Figure 2.18a). The origin of the Datong-Fengzhen volcanoes thus appears to be

not a mantle plume from greater depths. In addition, the subducted Pacific Plate, characterized by

high-velocity anomalies in the MTZ and uppermost lower mantle beneath the Bohai Sea, is hun-

dreds of kilometers away from the low-velocity anomalies beneath Datong-Fengzhen volcanoes.

Instead, we observe high-velocity anomalies in the upper mantle and MTZ beneath the southern

Great Xing’an Range, possibly indicating lithospheric delamination. Therefore, we propose that

the mantle return flow induced by the delaminated lithosphere is the most plausible cause for the

Datong-Fengzhen volcanoes, as suggested by Zuo et al. [2020].

The Tengchong volcanic field in southwestern China is characterized by Cenozoic basalts and

andesites [Wang et al., 2007, Smithsonian-Institution, 2023]. Previous traveltime tomography im-

ages [Wei et al., 2012, Huang et al., 2015, Lei and Zhao, 2016, Lei et al., 2019] show a high-

velocity anomaly, interpreted as the subducted Burma microplate beneath Tengchong, suggesting

that the volcanism is related to slab dehydration and partial melting in the so-called big mantle

wedge (BMW). However, our results show no strong high-velocity anomalies in the MTZ beneath

Tengchong, and the Burma slab subducts with a high dip angle to only the 410-km discontinuity

northwest of Tengchong (Figure 2.13b). Our observation is consistent with the global tomogra-

phy model UU-P07 [Amaru, 2007]. Therefore, we conclude that the Tengchong volcanic field is

not caused by slab dehydration of the Burma slab. Instead, we suspect the volcanism is related to

small-scale mantle convection and upwelling between the subducting Burma slab and the 200-300

km thick lithosphere of the Sichuan Basin.

The Changbaishan volcanoes, located in northeastern China along the border with North Korea,

19

are primarily composed of trachytic and basaltic lava flows, as well as pyroclastic deposits [Zhang

et al., 2018, Smithsonian-Institution, 2023]. These stratovolcanoes have been active since the late

Pleistocene epoch and are renowned for their large-scale eruptions. Early seismic traveltime to-

mography studies of Northeast Asia, such as Huang and Zhao [2006] and Lei and Zhao [2005],

image a flattened slab in the MTZ, suggesting the dehydration of the stagnant Pacific slab releases

water to the so-called big mantle wedge (BMW) and triggers partial melting in the upper mantle

beneath these volcanoes. However, this BMW hypothesis cannot explain the specific location of

these volcanoes. In contrast, recent tomography studies empowered by regional permanent and

temporal seismic stations [e.g., Tang et al., 2014, Tao et al., 2018] have uncovered a slab gap in

the MTZ, suggesting that the sub-slab asthenosphere entrained by subduction escape through the

slab gap and upwells to trigger decompression melting to feed the Changbaishan volcanoes. A new

receiver-function study Wang et al. [2020] observes hydrous melting atop the MTZ and hydrous slab

crust segments in the MTZ, supporting the BMW hypothesis. However, Li et al. [2023] compares

seismic waveform modeling with an electrical resistivity model in this region, arguing that seismic

anomalies atop the MTZ are likely caused by moduli reduction during phase transformation rather

than hydrous melting. Our results show low-velocity anomalies immediately beneath the Chang-

baishan volcanoes, extending into two slab holes in the MTZ (Figure 2.13c). This observation is

consistent with that from Tang et al. [2014] and Tao et al. [2018], suggesting that the Changbaishan

volcanoes are fed by decompression melting from upwelling materials through the slab holes. The

origin of the upwelling materials remains uncertain. Since there is no surface geological evidence

of a deep-rooted mantle plume, Tang et al. [2014] hypothesize that the sub-slab asthenosphere is

entrained by the subduction of the Pacific Plate and escapes through the slab holes. Our results

show a sub-slab low-velocity anomaly in the MTZ at ∼135°E, possibly indicating the entrained

asthenosphere. On the other hand, we cannot rule out another possibility that the upwelling mate-

rials come from the lower mantle, as evidenced by the low-velocity anomalies immediately below

the slab holes extending down to at least 1000 km depth. To verify the source of the upwelling

materials, more tomographic images of the lower mantle in this region are needed.

20

The Hainan volcanic field in the Leizhou Peninsula and Hainan Island in southern China is

covered by late Cenozoic basalts [Zhou and Armstrong, 1982, Smithsonian-Institution, 2023]. An

early global finite-frequency tomography model by Montelli et al. [2004] proposed a mantle plume

in this region. Later regional traveltime tomography studies with higher resolution [Huang and

Zhao, 2006, Lei et al., 2009] showed more details of this Hainan mantle plume. Furthermore, a

receiver function study imaged a thinner-than-normal MTZ beneath this region, suggesting that the

Hainan mantle plume is about 170-200°C hotter than the surrounding mantle [Wei and Chen, 2016].

In our study, Figure 2.13d shows strong low-velocity anomalies (more than a 2% reduction for 𝑉𝑠)

extending from the surface to the lower mantle, suggesting a deep-rooted Hainan mantle plume.

2.6 Conclusions

We present a new 3-D radially anisotropic seismic velocity model EARA2023 of the crust and

mantle down to 1,000 km depth beneath East Asia and northwestern Pacific subduction zones us-

ing full waveform inversion (FWI) tomography. Adjoint tomography based on a spectral element

method is applied to a large dataset of about 640,000 body-wave and 148,000 surface-wave mea-

surement windows. The new model provides high-resolution images of the subducted plates, con-

tinental lithosphere, asthenosphere, and mantle transition zone (MTZ) in this region. Compared

to the previous FWI models of East Asia (FWEA18 and EARA2014), EARA2023 shows more

small-scale velocity anomalies, better delineating the continental lithosphere thickness variations,

slab morphology at different depths, and mantle upwelling. In particular, the shear wave velocity

structures of EARA2023 provide the following new insights into slab morphology and origins of

intraplate volcanoes in this region.

• The Japan slab is flattened in the MTZ west of the Japan Sea. At least two slab holes, ∼200 km

wide each, are imaged in the subducted Pacific Plate beneath the Changbaishan volcanoes.

These holes provide the pathway for mantle upwelling that feeds the intra-plate volcanoes

(Figure 2.6, 2.11c, and 2.13c).

• The Izu-Bonin slab is heavily deformed and fragmented, with pieces of subducted Pacific

21

Plate residing in the MTZ or uppermost lower mantle (Figure 2.11d and 2.11e).

• The Philippine Sea Plate subducts along the Ryukyu Trench and reaches 500 km depth.

Although both are in the MTZ, the subducted Philippine Sea Plate beneath the East China

Sea is not attached to the subducted Pacific Plate beneath the Yellow Sea (Figure 2.11f).

• Beneath the Datong volcanoes, low-velocity anomalies are imaged from the lithosphere down

to ∼400 km depth. The mantle upwelling is more likely induced by lithospheric delamination

beneath the southern Great Xing’an Range (Figure 2.13a).

• The low-velocity anomalies beneath the Tengchong volcanic field are confined within the top

200 km. There is no obvious connection between the Tengchong volcano and the subducted

Burma microplate. The intra-plate volcanism is more likely to be triggered by small-scale

asthenospheric convection and upwelling between the subducting Burma slab and the 200-

300 km thick lithosphere of the Sichuan Basin (Figure 2.13b).

• We clearly image the Hainan mantle plume, characterized by low-velocity anomalies extend-

ing from the surface to the lower mantle (Figure 2.13d).

2.7 Appendix

2.7.1 Data quality control weighting

As shown in Equation (2.2) of the main texts, 𝑊𝑤 is the weighting term to control the data

quality in each measurement window. Following Tao et al. [2018], 𝑊𝑤 is defined as

𝑊𝑤 = 𝑓1(CC) 𝑓2(SNR) 𝑓3(𝛿𝑡CC)

(2.12)

where

𝑓1(CC) =






1
2

+ 1

2 cos

0,
(

𝜋 0.7−CC
0.7−0.5

)

1,

CC < 0.5

, 0.5 ≤ CC < 0.7

(2.13)

CC ≥ 0.7

22

𝑓2(SNR) =

𝑓3(𝛿𝑡CC) =










1
2

+ 1

2 cos

1
2

− 1

2 cos

0,
(

𝜋 15−SNR
15−10

)

1,

0,
(
𝜋 10−|𝛿𝑡CC|
10−8

)

1,

SNR < 10

, 10 ≤ SNR < 15

(2.14)

SNR ≥ 15

|𝛿𝑡CC| ≥ 10

, 8 ≤ |𝛿𝑡CC| < 10

(2.15)

|𝛿𝑡CC| < 8

SNR is the signal-to-noise ratio for each measurement window:

SNR = 10𝑙𝑜𝑔

∫

∫

1
𝑡signal
1
𝑡noise

𝑢2

signal

(𝑡)𝑑𝑡

𝑢2

noise

(𝑡)𝑑𝑡

(2.16)

where 𝑢signal(𝑡) represents displacement within the measurement window, 𝑡signal is the correspond-

ing window length, 𝑢noise(𝑡) is displacement in the noise window defined as from the event origin

time to 10 s before the predicted first P wave arrival time based on the AK135 model [Kennett

et al., 1995], and 𝑡noise is the corresponding window length. CC and 𝛿𝑡CC are the cross-correlation

coefficient and time shift between the data and synthetics in the measurement window.

2.7.2 Source inversion details

The invertible source parameters include centroid moment tensor 𝑀, 3-D centroid location 𝑋,

and source time function 𝑆(𝑡). Assuming a Gaussian source time function, we have

𝑆(𝑡) =

𝑒−( 𝑡 −𝑡

𝜏 )2
0

1
√
𝜋𝜏

(2.17)

where 𝜏 means the half duration time and 𝑡0 represents the event’s origin time. Because the fre-

quency band of our inversion cannot well constrain 𝜏, we only invert for 𝑡0 and use 𝜏 determined

by the magnitude in the Global CMT solution [Ekström et al., 2012].

Following Kim et al. [2011], we have ten source parameters to invert, including 6 elements of

the moment tensor 𝑀11,𝑀22,𝑀33,𝑀12,𝑀13,𝑀23, 3 elements of the 3-D centroid location 𝑥1,𝑥2,𝑥3,

and the centroid origin time 𝑡0. The gradient to the data misfit 𝜒𝑒 is

𝑔 = {

𝜕 𝜒𝑒
𝜕 𝑀11

,

𝜕 𝜒𝑒
𝜕𝑀22

,

𝜕 𝜒𝑒
𝜕 𝑀33

,

𝜕 𝜒𝑒
𝜕 𝑀12

,

𝜕 𝜒𝑒
𝜕 𝑀13

,

𝜕 𝜒𝑒
𝜕 𝑀23

,

𝜕 𝜒𝑒
𝜕𝑥1

,

𝜕 𝜒𝑒
𝜕𝑥2

,

𝜕 𝜒𝑒
𝜕𝑥3

,

𝜕 𝜒𝑒
𝜕𝑡0

}

(2.18)

23

By using 𝜎𝑀 =

)2 + ( 𝜕 𝜒𝑒
𝜕𝑥3

(( 𝜕 𝜒𝑒
𝜕𝑥1
ˆ𝑀 = 𝑀

)2 + ( 𝜕 𝜒𝑒
𝜕𝑥2
𝜎𝑀 , ˆ𝑋 = 𝑋

)2 + ( 𝜕 𝜒𝑒
𝜕𝑀33

)2 + ( 𝜕 𝜒𝑒
𝜕𝑀22

√
2(( 𝜕 𝜒𝑒
𝜕 𝑀11
)2)− 1
2 to non-dimensionalize 𝑀 and 𝑋, we have non-dimensional parameters
𝜕 𝜒𝑒
𝜕 𝑋 . The non-dimensionalization will

)2 + ( 𝜕 𝜒𝑒
𝜕 𝑀13

)2 + ( 𝜕 𝜒𝑒
𝜕 𝑀23

)2 + ( 𝜕 𝜒𝑒
𝜕𝑀12

2 and 𝜎𝑋 =

𝜕 𝜒𝑒
𝜕 ˆ𝑀 = 𝜎𝑀

𝜕 𝜒𝑒
𝜕 ˆ𝑋 = 𝜎𝑋

𝜕 𝜒𝑒
𝜕𝑀 ,

)2)− 1

𝜎𝑋 and gradients

guarantee different types of parameters contribute equally to the gradient’s magnitude.

Each source inversion step contains 3 iterations of the steepest descent update, after which the

source parameters will almost not change. To do a line search of the steepest descent update in each
iteration, we first calculate the waveform synthetics of the perturbed source model {𝑀 − 𝜖 𝜕 𝜒𝑒
𝜕 ˆ𝑀
𝜖 𝜕 𝜒𝑒
𝜕 ˆ𝑋

, 𝑡0}, where 𝜖 = 0.01. By assuming a linear relationship between the source model and the

, 𝑋 −

waveform perturbations, we approximate the waveform synthetics 𝑠𝛼 (𝑥, 𝑡) for the source model
{𝑀 − 𝛼 𝜕 𝜒𝑒
𝜕 ˆ𝑀

, 𝑡′} with arbitrary step length 𝛼 and origin time 𝑡′ as

, 𝑋 − 𝛼 𝜕 𝜒𝑒
𝜕 ˆ𝑋

𝑠𝛼 (𝑥, 𝑡) = 𝑠0(𝑥, 𝑡 − (𝑡′ − 𝑡0)) +

𝛼
𝜖

(𝑠𝜖 (𝑥, 𝑡 − (𝑡′ − 𝑡0)) − 𝑠0(𝑥, 𝑡 − (𝑡′ − 𝑡0)))

(2.19)

where 𝑠0(𝑥, 𝑡) is the synthetics for the starting source model {𝑀, 𝑋, 𝑡0} of this iteration, 𝑠𝜖 (𝑥, 𝑡)
is the perturbed synthetics for the source model {𝑀 − 𝜖 𝜕 𝜒𝑒
, 𝑡0}. 𝑠0(𝑥, 𝑡 − (𝑡′ − 𝑡0))
𝜕 ˆ𝑀
and 𝑠𝜖 (𝑥, 𝑡 − (𝑡′ − 𝑡0)) shift the synthetics by 𝑡′ − 𝑡0 to align the waveforms. We use a Bayesian
optimization method to efficiently find an optimized step length 𝛼𝑜 𝑝𝑡 and the origin time 𝑡′

, 𝑋 − 𝜖 𝜕 𝜒𝑒
𝜕 ˆ𝑋

𝑜 𝑝𝑡 to

minimize the data misfit.

2.7.3 Optimization of the structural model parameters

We use the conjugate gradient descent to update our structure model. In each iteration, the

search direction is determined by the gradient of the current iteration 𝑔𝑖 and the search direction of

the previous iteration 𝑑𝑖−1:

where 𝑃 is the preconditioner and

𝑑𝑖 = −𝑃𝑔𝑖 + 𝛽𝑑𝑖−1

(2.20)

(𝑑𝑖 − 𝑑𝑖−1) · 𝑃𝑔𝑖
𝑑𝑖−1 · 𝑃𝑔𝑖−1
In the first iteration, 𝛽 is set to 0. In the later iterations, we calculate the Powell ratio [Powell

(2.21)

𝛽 =

and Powell, 1981] defined as:

𝑅 =

𝑃𝑔𝑖 · 𝑃𝑔𝑖−1
𝑃𝑔𝑖−1 · 𝑃𝑔𝑖−1

24

(2.22)

If 𝑅 > 0.2 for all the invertible structure parameters, we reset 𝛽 to 0 to restart the optimization

when a critical point is found, thus speeding up our inversion.

The preconditioner 𝑃 is chosen to be close to the inverse Hessian matrix [Fichtner and Trampert,

2011a]. In practice, it is difficult to directly calculate the Hessian matrix. Instead, we use the so-

called pseudo-Hessian matrix [Bozdağ et al., 2016]:

∫

𝐻 (𝑥) =

𝑁𝑒∑

𝑒=1

𝜕2𝑠(𝑥, 𝑡)
𝜕𝑡2

·

𝜕2𝑠†(𝑥, 𝑇 − 𝑡)
𝜕𝑡2

𝑑𝑡

(2.23)

where 𝑠(𝑥, 𝑡) is the wave field of the forward simulation and 𝑠†(𝑥, 𝑇 − 𝑡) is the backward wave field

of the adjoint source. 𝑁𝑒 denotes the total number of events.

By setting an appropriate threshold 𝛿 of the inverse hessian, we have

𝑃(𝑥) =





𝐻 (𝑥) < 𝛿

1
𝛿
𝐻 (𝑥) 𝐻 (𝑥) ≥ 𝛿
1

(2.24)

𝛿 is necessary as 𝑃 acts as weighting to the gradient. We need 𝛿 to suppress a large 𝑃. It should

be noticed that the choice of 𝛿 has a significant impact on the inversion. As 𝑃 can be regarded as

the weighting of the gradient in different regions, a smaller 𝛿 will tolerate a larger value of 𝑃(𝑥).

A region with a larger 𝑃(𝑥) usually means a smaller Hessian matrix, i.e., less data coverage. To

balance model updating in different regions, we should use appropriate weighting to enhance the

gradient for the area with less data coverage [Fichtner and Trampert, 2011a].

For the calculated kernels from the adjoint simulation of each event, we first use Gaussian

spheres centered at the sources and receivers to damp out regions with potential singularity [Zhu

et al., 2015]. The diameters of the Gaussian spheres are 50 km for the source and 10 km for the

receivers. Then we apply the preconditioning after summing up the kernels for all events. In the

next step, we apply Gaussian smoothing on the kernel to help stabilize the inversion. In the first 5

iterations of the inversion, the Gaussian smoothing spheres are configured as 50 km horizontally

and 25 km vertically. In the subsequent 25 iterations, we choose Gaussian smoothing spheres with

25 km horizontally and 10 km vertically to capture small-scale structures.

We use line search in each iteration to find the optimal step length. Based on the line search

direction 𝑑𝑖, which is the negation of the normalized gradient, we forwardly simulate synthetics

25

for the perturbed model 𝑚𝑖 + 𝜖 𝑑𝑖, where 𝜖 = 0.01. Using the linear dependency approximation

of waveform to structure [Warner et al., 2013], we estimate the perturbed synthetics for arbitrary

model 𝑚𝑖 + 𝛼𝑑𝑖 as

𝑠(𝑥, 𝑡; 𝑚𝑖 + 𝛼𝑑𝑖) = 𝑠(𝑥, 𝑡; 𝑚𝑖) +

𝛼
𝜖

(𝑠(𝑥, 𝑡; 𝑚𝑖 + 𝜖 𝑑𝑖) − 𝑠(𝑥, 𝑡; 𝑚𝑖))

(2.25)

where 𝑠(𝑥, 𝑡; 𝑚𝑖) represents the synthetics for model 𝑚𝑖, and 𝑠(𝑥, 𝑡; 𝑚𝑖 + 𝜖 𝑑𝑖) represents the syn-

thetics for model 𝑚𝑖 + 𝜖 𝑑𝑖. It is computationally efficient to calculate 𝑠(𝑥, 𝑡; 𝑚𝑖 + 𝛼𝑑𝑖) without

performing the SEM simulation. We can simply test different values of 𝛼 and find the optimized

one.

2.7.4 Point spread function tests

Following Fichtner and Trampert [2011b], if we apply a single point perturbation to one of

the invertible model parameters, the gradient from the adjoint state method can be regarded as the

negation of the Hessian matrix, also called the point spread function (PSF), of the final model. This

function can quantify the model’s blurring and distortion by showing how it tends to update when

only one point is perturbed. For example, when we apply a single point perturbation to 𝑉𝑆𝑉 , the

PSF response for 𝑉𝑆𝑉 and 𝑉𝑆𝐻 are 𝐻𝑉𝑆𝑉 ,𝑉𝑆𝑉 , 𝐻𝑉𝑆𝑉 ,𝑉𝑆𝐻 , respectively, where 𝐻 represents the Hessian

matrix.

In practice, we apply Gaussian sphere perturbations with a radius of 50 km and a separation of

2 degrees horizontally and 200 km vertically, located at depths of 100 km, 300 km, 500 km, 700

km, and 900 km, to the final model. By using the adjoint state method, the corresponding PSFs are

calculated and shown in the main text.

26

2.8 Figures

Figure 2.1 Topography/bathymetry map of East Asia and the northwestern Pacific. The simulation
region is outlined by navy blue lines. Magenta curves represent slab depth contours with a
100-km interval based on the Slab2 model [Hayes et al., 2018]. The tectonic units are outlined by
black curves, and intraplate volcanoes are marked with red triangles. Red thick lines indicate plate
boundaries from the NUVEL-1 model [DeMets et al., 1990]. Acronyms of certain tectonic units:
CDB - Chuandian Block, JGB - Junggar Basin, QDB - Qaidam Basin, SCB - Sichuan Basin,
TLFB - Tulufan Basin, SGFB - Songpan Ganzi Fold Belt.

27

75°E90°E105°E120°E135°E150°E165°E0°10°N20°N30°N40°N50°N60°NCDBJGBQDBSCBTLFBSGFBArabian SeaAndaman T.Bay of BengalIndian PlateBurma T.Main Boundary ThrustHimalaya BlockLhasa BlockQiangtang BlockPamirsTarim BasinTianshanSayanAltayNanshanBasinQilian FoldTengchong V.BurmaSouth China SeaHainan V.South China BlockNorth ChinaBlockOrdosBlockBohai BayXing’an East MongoliaSongliao BasinJapan SeaRyukyu T.Philippine Sea PlateSea of OkhotskPacific PlateMalina T.Philippine T.Yap T.Mariana T.Izu Bonin T.Japan T.Nankai T.Kuril T.Lake BaikaiKamchatka PeninsulaDatong V.ChangbaishanV.East ChinaSeaYellow Sea−6−4−20246Topography/Bathymetry (km)Figure 2.2 Distribution of earthquakes (a) and seismic stations (b) used in this study. Earthquakes
in (a) are color-coded by depth. The focal mechanisms are from the Global Centroid Moment
Tensor (GCMT) project [Dziewonski and Anderson, 1981, Ekström et al., 2012]. Stations in (b)
are color-coded by networks. Other features in (a) and (b) are the same as Fig. 1. (c-f) Histograms
of earthquake origin time (c), moment magnitude (d), depth (e), and half duration time of the
source time function (f).

28

0102030Number of events2010201320162019Year(c)05101520255.05.56.06.57.0Mw(d)0204060800200400600Depth(km)(e)0510152025246Half Duration(s)(f)75°E90°E105°E120°E135°E150°E0°15°N30°N45°N60°N(a)75°E90°E105°E120°E135°E150°E(b)0200400600Event Depth (km)CEArrayF netKMA NetworkCentral MongoliaSeismic ExperimentNCISP7NECESSArrayINDEPTH IVData obstained fromEarthScope/IRIS DMC:1U, 2F, JP, XR, II, G,RM, CB, PS, MM, AD, 8B,KS, IU, ZV, IM, YC, MY,D0, KG, KS, XF, SS, TM,Z6, BO, IN, TW, XI, YM,MI, HK, YP, ICFigure 2.3 Misfit reduction curves in three stages (stage 1 in red stars, stage 2 in blue diamonds,
and stage 3 in magenta squares). The open symbols represent the misfit before the source
inversion. (a) Overall misfits. (b) Misfits for body waves on the vertical component. (c) Misfits for
body waves on the radial component. (d) Misfits for body waves on the transverse component. (e)
Misfits for Rayleigh waves on the vertical component. (f) Misfits for Rayleigh waves on the radial
component. (g) Misfits for Love waves on the transverse component.

29

MeasurementsBody WavesSurface Waves623120130639639173148334645833158884Stage 1Stage 2Stage 3Measured after the last iteration of each stagePeriod RangeBody WavesSurface Waves10   40 s40   120 s8   40 s30   120 s8   40 s20   120 sStage 1Stage 2Stage 3Starting misfit at each stage:0.150.200.25(a)before 1stsource inversionkernel smoother:h: 50km, v: 25kmkernel smoother:h: 25km, v: 10kmbefore 2ndsource inversionbefore 3rdsource inversion0.150.200.250.300.35Ver. Body Wavep, s, P, S, sP, sS, PP, SS(b)0.150.200.250.300.35Rad. Body Wavep, s, P, S, sP, sS, PP, SS(c)0.200.250.300.350.400.45Tan. Body Waves, S, sS, SS, ScS(d)0.080.100.120.140.160102030Iteration No.Ver. Surface WaveRayleigh Wave(e)0.100.120.140.160.180102030Iteration No.Rad. Surface WaveRayleigh Wave(f)0.100.150102030Iteration No.Tan. Surface WaveLove Wave(g)0.150.200.250.300.35Misfit0.080.100.120.140.16MisfitFigure 2.4 Mapviews of 𝛿𝑙𝑛𝑉𝑆 perturbation at various depths. The reference model is the 1-D
average of the 3-D model. Regions with low resolution based on the point spread function
resolution test are masked out. Magenta curves represent the slab depth contour at each depth
based on the Slab2 model [Hayes et al., 2018]. Other features are the same as Fig. 2.1.

30

15°N30°N45°N(a)100 km(b)200 km(c)300 km15°N30°N45°N(d)400 km(e)500 km(f)600 km90°E120°E150°E15°N30°N45°N(g)700 km90°E120°E150°E(h)800 km90°E120°E150°E(i)900 km−6−5−4−3−2−10123456δlnVS(%)Figure 2.5 Mapviews of radial anisotropy at various depths. Other features are the same as Figure
2.4.

31

15°N30°N45°N(a)100 km(b)200 km(c)300 km90°E120°E150°E15°N30°N45°N(d)400 km90°E120°E150°E(e)500 km90°E120°E150°E(f)600 km−3−2−10123(VSH−VSV)/p(2V2SV+V2SH)/3)(%)Figure 2.6 (a-f) Comparison of different models along the same cross-section across the
Changbaishan volcanoes. The EARA2023, FWEA18 [Tao et al., 2018], EARA2014 [Chen et al.,
2015], GLAD_M25 [Lei et al., 2020], and SPiRal [Simmons et al., 2021] models present the
𝛿𝑙𝑛𝑉𝑆 perturbation with respect to EARA2023_AVG (averaged EARA2023), whereas the
GAP_P4 model [Obayashi et al., 2013] displays the 𝑉𝑃 perturbation with respect to its own
reference model. The 410- and 660-km depths are shown as dashed lines. Earthquakes from 1964
to 2019 from the ISC-EHB catalog [Weston et al., 2018] within 100 km of the cross-section are
shown as magenta circles. Regions with low resolution based on the point spread function
resolution test are masked out. Semi-transparent bands ±60 km around the 410- and 660-km
discontinuities mask potential artifacts due to the discontinuity topography in (a). (g) The location
of the cross-section is shown as the blue arrow. Other features are the same as Fig. 2.1. (h)
Comparison of 1-D velocity models AK135 [Kennett et al., 1995], STW105 [Kustowski et al.,
2008a], and EARA2023_AVG.

32

02004006008001000Depth (km)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%EARA2023Vs(a)−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%FWEA18Vs(b)02004006008001000Depth (km)−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%EARA2014Vs(c)−1.5%−1.5%−1.5%GLAD_M25Vs(d)02004006008001000Depth (km)−1.5%1.5%1.5%GAP_P4Vp(e)02004006008001000Depth (km)120125130135140145Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%SPiRaLVs(f)−202δlnV(%)90°E120°E150°E15°N30°N45°N60°N(g)02004006008001000Depth (km)05101-D Wave Speed (km/s)AK135STW105EARA2023_AVG(reference model)(h)VSVPFigure 2.7 Cross-Sections of 𝛿𝑙𝑛𝑉𝑆 perturbation beneath Eastern Tibet (a-c) and North China
Craton (d-e). The locations of the cross-sections are marked as thick blue arrows on the map. The
annotations on the map are similar to the previous figures. Each cross-section includes a
topography panel on the top with a 60-times vertical exaggeration, an absolute velocity panel from
the surface to 100 km depth, and a panel of 𝛿𝑙𝑛𝑉𝑆 perturbation with respect to EARA2023_AVG
(the 1-D average of the 3-D model) down to 1000 km depth. Earthquakes from the ISC-EHB
catalog [Weston et al., 2018] within 100 km of the cross-section are shown as magenta circles.
White and black curves mark the 1.5% contours. Regions with low resolution based on the point
spread function resolution test are masked out. Semi-transparent bands ±60 km around the 410-
and 660-km discontinuities mask potential artifacts due to the discontinuity topography.

33

02004006008001000Depth (km)95100105110Longitude (degree)−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%Sichuan Basin LithosphereUnknown Subducted Plate?0100(a)Chuandian BlockSouth China Block95100105110Longitude (degree)−1.5%1.5%1.5%1.5%1.5%Burma MicroplateSichuan Basin LithosphereUnknown Subducted Plate?(b)BurmaChuandian BlockSichuan Basin95100105110Longitude (degree)−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%Sichuan Basin LithosphereUnknown Subducted Plate?(c)Qiangtang BlockChuandian BlockSichuan Basin02004006008001000Depth (km)105110115120Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%Japan SlabJapan SlabOrdos BlockLithosphere0100(d)Ordos BlockTaihang MountainsHuabei Plain105110115120Longitude (degree)−1.5%−1.5%−1.5%1.5%1.5%1.5%Japan SlabJapan SlabOrdos BlockLithosphere(e)Ordos BlockTaihang MountainsHuabei Plain90°E120°E150°E15°N30°N45°N60°N(a)(b)(c)(d)(e)−202δlnVs(%)3.63.84.04.24.44.6Vs(km/s) in top 100kmFigure 2.8 Histograms of time shifts, normalized zero-lag cross-correlation coefficients (NZCC),
and cross-correlation coefficients (CC) for all 6 measurement categories. Blue histograms
represent the initial structural and source models, black histograms represent the initial structural
model with inverted source models after 3 iterations in the first stage of source inversions,
whereas red ones represent the final structural and source models. For time shifts, also shown the
mean and standard deviation.

34

0250005000075000Ver. Body Wave∆T =  0.26 ± 1.16∆T =  0.31 ± 1.66∆T = 0.41 ± 2.24(a)0200004000060000(b)01000020000300004000050000(c)0250005000075000Rad. Body Wave∆T =  0.12 ± 1.12∆T =  0.06 ± 1.64∆T = 0.62 ± 2.22(d)0200004000060000(e)01000020000300004000050000(f)02000040000Tan. Body Wave∆T = 0.00 ± 1.68∆T = 0.52 ± 2.29∆T = 1.24 ± 2.67(g)0100002000030000(h)05000100001500020000(i)0500010000Ver. Surface Wave∆T =  0.27 ± 1.52∆T = 1.27 ± 2.63∆T = 1.41 ± 2.82(j)0100002000030000(k)050001000015000200002500030000(l)025005000750010000Rad. Surface Wave∆T =  0.30 ± 1.51∆T = 1.24 ± 2.60∆T = 1.36 ± 2.77(m)0500010000150002000025000(n)050001000015000(o)0500010000Tan. Surface Wave−10−50510∆T(s)∆T = 0.01 ± 1.33∆T = 1.98 ± 2.41∆T = 2.10 ± 2.60(p)01000020000300000.00.20.40.60.81.0NZCC(q)05000100001500020000250000.50.60.70.80.91.0CC(r)Figure 2.9 Point Spread Function (PSF) test for 𝑉𝑆𝑉 perturbation at various depths. (a) The input
model perturbation applied to 𝑉𝑆𝑉 for the final model. (b) The PSF response for 𝑉𝑆𝑉 , which should
mimic the input model in an ideal case. (c) The PSF response for 𝑉𝑆𝐻, which should be null in an
ideal case. (d) The PSF response for bulk velocity 𝑉𝐶, which should be null in an ideal case.

35

15°N30°N45°N100 kmVSV(a)15°N30°N45°N300 kmVSV15°N30°N45°N500 kmVSV15°N30°N45°N700 kmVSV90°E120°E150°E15°N30°N45°N900 kmVSV100 kmVSV(b)300 kmVSV500 kmVSV700 kmVSV90°E120°E150°E900 kmVSV100 kmVSH(c)300 kmVSH500 kmVSH700 kmVSH90°E120°E150°E900 kmVSH100 kmVC(d)300 kmVC500 kmVC700 kmVC90°E120°E150°E900 kmVCFigure 2.10 Waveform comparisons for a selected earthquake (GCMT ID: 200805071602A)
recorded by a station indicated by the red triangle in (a). All other stations used in the inversion
are shown as black triangles for reference. Other features are the same as Fig. 2.2b. (b-d)
Waveform comparisons between observations (black), synthetics based on the starting model
(blue), and synthetics based on the final model (red) at different frequency bands for various
seismic phases. Black boxes indicate the measurement windows based on the theoretical phase
arrival times predicted by the AK135 model [Kennett et al., 1995].

36

2004006008001000Time(s)T = 8   40 s(b)2004006008001000Time(s)T = 20   120 s(c)2004006008001000Time(s)T = 40   120 s(d)120°E150°E15°N30°N45°N200805071602A (Mw 6.2, 21km) [Station:JX.JIJ](a)PSsPsSpPScSFigure 2.11 Cross-sections of 𝛿𝑙𝑛𝑉𝑆 perturbation for northwestern Pacific subduction zones,
plotted similar to Fig. 2.7. The reference model is EARA2023_AVG. Acronyms: PP: Pacific
Plate, PSP: Philippine Sea Plate.

37

02004006008001000Depth (km)5505540453Latitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%Kuril Slab (PP)0100(a)Pacific OceanKurilSea of OkhotskSakhalin125130135140145150Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%Japan Slab (PP)Low-V(b)SiberiaJapan SeaHokkaidoPacific Ocean115120125130135140145Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%Japan Slab (PP)Japan Slab (PP)Low-V(c)changbai mountainsJapan SeaHonshuPacific Ocean02004006008001000Depth (km)125130135140145150Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%Ryukyu Slab (PSP)Izu-BoninSlab (PP)0100(d)Ryukyu TrenchPhilippine SeaIzu-Bonin TrenchPacific Ocean125130135140145150Longitude (degree)1.5%1.5%1.5%1.5%Izu-Bonin Slab (PP)(e)Philippine SeaIzu-Bonin TrenchPacific Ocean0453035202Latitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%NanKaiSlab (PSP)Japan Slab (PP)Izu-Bonin Slab (PP)(f)Japan SeaNankai Trench02004006008001000Depth (km)115120125130Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%Japan Slab (PP)Ryukyu Slab (PSP)Izu-Bonin Slab (PP)0100(g)Taihang MountainsNorth China BlockEast China SeaPhilippine Sea90°E120°E150°E15°N30°N45°N60°NRyukyu T.Malina T.Philippine T..T anairaMIzu-Bonin T.Japan T.Nankai T.Kuril T.(a)(b)(c)(d)(e)(f)(g)−202δlnVs(%)3.54.04.5Vs(km/s) in top 100kmFigure 2.12 3-D images of slab morphology in the northwestern Pacific subduction zones. The
slabs are represented by the 1.5% isosurface of high-velocity anomalies and color-coded by depth.
Red thick curves indicate the plate boundaries from the NUVEL-1 model [DeMets et al., 1990],
white curves show the coastline, and red dots represent intra-plate volcanoes. (a) Mapview. (b)
Viewing from the northwest. (c) Viewing from the southeast.

38

Figure 2.13 Cross-sections of 𝛿𝑙𝑛𝑉𝑆 perturbation for four intra-plate volcanic fields in East Asia,
plotted similar to Fig. 2.7. The reference model is EARA2023_AVG. (a) Datong-Fengzhen
volcanoes. (b) Tengchong volcanic field. (c) Changbaishan volcanoes. (d) Hainan volcanic field.

39

02004006008001000Depth (km)110115120125130135Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%Low-VPacific Slab (PP)DelaminatedLithosphere0100(a)DatongOrdosXing’an-East MongoliaSongliao BasinSanjiang Basin95100105110115Longitude (degree)−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%BurmaMicroplateSichuan Basin LithosphereUnknown Subducted Plate?(b)TengchongBurmaTibetSichuan BasinHuabei Plain02004006008001000Depth (km)120125130135140145150Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%Low-VLow-VSlab HolesPacific Slab (PP)Low-V0100(c)ChangbaishanChangbai MountainsJapan SeaJapanPacific Ocean100105110115120Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%Hainan Mantle Plume(d)HainanHainan−202δlnVs(%)3.54.04.5Vs(km/s) in top 100 km90°E120°E150°E15°N30°N45°N60°N(a)(b)(c)(d)Figure 2.14 Mapviews of 𝛿𝑙𝑛𝑉𝑃 perturbation at various depths, similar to Figure 2.4.

40

15°N30°N45°N(a)100 km(b)200 km(c)300 km15°N30°N45°N(d)400 km(e)500 km(f)600 km90°E120°E150°E15°N30°N45°N(g)700 km90°E120°E150°E(h)800 km90°E120°E150°E(i)900 km−6−5−4−3−2−10123456δlnVP(%)Figure 2.15 Comparison of different models along the same cross-section across the
Changbaishan volcanoes. The EARA2023, FWEA18 [Tao et al., 2018], EARA2014 [Chen et al.,
2015], GLAD_M25 [Lei et al., 2020], and SPiRaL [Simmons et al., 2021] models present the
𝛿𝑙𝑛𝑉𝑃 perturbation with respect to EARA2023_AVG (averaged EARA2023), whereas the
GAP_P4 model [Obayashi et al., 2013] displays the 𝛿𝑙𝑛𝑉𝑃 perturbation with respect to its own
reference model. Annotations are similar to Figure 2.6.

41

02004006008001000Depth (km)−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%EARA2023Vp(a)−1.5%1.5%1.5%1.5%1.5%FWEA18Vp(b)02004006008001000Depth (km)−1.5%−1.5%−1.5%−1.5%1.5%1.5%EARA2014Vp(c)−1.5%−1.5%−1.5%GLAD_M25Vs(d)02004006008001000Depth (km)−1.5%1.5%1.5%GAP_P4Vp(e)02004006008001000Depth (km)120125130135140145Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%SPiRaLVp(f)−202δlnV(%)90°E120°E150°E15°N30°N45°N60°N(g)02004006008001000Depth (km)05101-D Wave Speed (km/s)AK135STW105EARA2023_AVG(reference model)(h)VSVPFigure 2.16 Cross-Sections of 𝛿𝑙𝑛𝑉𝑃 perturbation beneath Eastern Tibet (a-c) and North China
Craton (d-e). Annotations are similar to Figure 2.7.

42

02004006008001000Depth (km)95100105110Longitude (degree)−1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%Sichuan Basin LithosphereUnknown Subducted Plate?0100(a)Chuandian BlockSouth China Block95100105110Longitude (degree)1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%Burma MicroplateSichuan Basin LithosphereUnknown Subducted Plate?(b)BurmaChuandian BlockSichuan Basin95100105110Longitude (degree)−1.5%1.5%1.5%1.5%1.5%1.5%Sichuan Basin LithosphereUnknown Subducted Plate?(c)Qiangtang BlockChuandian BlockSichuan Basin02004006008001000Depth (km)105110115120Longitude (degree)−1.5%−1.5%−1.5%1.5%Japan SlabJapan SlabOrdos BlockLithosphere0100(d)Ordos BlockTaihang MountainsHuabei Plain105110115120Longitude (degree)−1.5%−1.5%1.5%1.5%Japan SlabJapan SlabOrdos BlockLithosphere(e)Ordos BlockTaihang MountainsHuabei Plain90°E120°E150°E15°N30°N45°N60°N(a)(b)(c)(d)(e)−202δlnVp(%)6.57.07.58.08.5Vp(km/s) in top 100kmFigure 2.17 Cross-Sections of 𝛿𝑙𝑛𝑉𝑃 perturbation for northwestern Pacific subduction zones.
Annotations are similar to Figure 2.11.

43

02004006008001000Depth (km)5505540453Latitude (degree)−1.5%1.5%1.5%1.5%1.5%1.5%Kuril Slab (PP)0100(a)Pacific OceanKurilSea of OkhotskSakhalin125130135140145150Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%Japan Slab (PP)Low-V(b)SiberiaJapan SeaHokkaidoPacific Ocean115120125130135140145Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%Japan Slab (PP)Japan Slab (PP)Low-V(c)changbai mountainsJapan SeaHonshuPacific Ocean02004006008001000Depth (km)125130135140145150Longitude (degree)−1.5%−1.5%−1.5%−1.5%1.5%1.5%Ryukyu Slab (PSP)Izu-BoninSlab (PP)0100(d)Ryukyu TrenchPhilippine SeaIzu-Bonin TrenchPacific Ocean125130135140145150Longitude (degree)1.5%Izu-Bonin Slab (PP)(e)Philippine SeaIzu-Bonin TrenchPacific Ocean0453035202Latitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%NanKaiSlab (PSP)Japan Slab (PP)Izu-Bonin Slab (PP)(f)Japan SeaNankai Trench02004006008001000Depth (km)115120125130Longitude (degree)−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%Japan Slab (PP)Ryukyu Slab (PSP)Izu-Bonin Slab (PP)0100(g)Taihang MountainsNorth China BlockEast China SeaPhilippine Sea90°E120°E150°E15°N30°N45°N60°NRyukyu T.Malina T.Philippine T..T anairaMIzu-Bonin T.Japan T.Nankai T.Kuril T.(a)(b)(c)(d)(e)(f)(g)−202δlnVp(%)6.57.07.58.08.5Vp(km/s) in top 100kmFigure 2.18 Cross-Sections of 𝛿𝑙𝑛𝑉𝑃 perturbation for 4 intra-plate volcanic fields in East Asia.
Annotations are similar to Figure 2.13.

44

02004006008001000Depth (km)110115120125130135Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%Low-VPacific Slab (PP)DelaminatedLithosphere0100(a)DatongOrdosXing’an-East MongoliaSongliao BasinSanjiang Basin95100105110115Longitude (degree)−1.5%−1.5%−1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%1.5%BurmaMicroplateSichuan Basin LithosphereUnknown Subducted Plate?(b)TengchongBurmaTibetSichuan BasinHuabei Plain02004006008001000Depth (km)120125130135140145150Longitude (degree)−1.5%−1.5%−1.5%−1.5%1.5%1.5%1.5%Low-VLow-VSlab HolesPacific Slab (PP)Low-V0100(c)ChangbaishanChangbai MountainsJapan SeaJapanPacific Ocean100105110115120Longitude (degree)−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%−1.5%1.5%1.5%Hainan Mantle Plume(d)HainanHainan−202δlnVp(%)6.57.07.58.08.5Vp(km/s) in top 100 km90°E120°E150°E15°N30°N45°N60°N(a)(b)(c)(d)Figure 2.19 Waveform comparisons and S-wave triplication for Earthquake GCMT ID:
201104160219A. (a-c) Waveform comparisons across stations (black triangles in panel (d)) for the
vertical (BHZ), radial (BHR), and tangential (BHT) components, featuring observed data (black),
starting model synthetics (blue), and final model synthetics (red). Each station’s waveforms are
normalized and aligned with theoretical S wave arrival times based on the AK135 model [Kennett
et al., 1995], indicated by black dashed lines. Station names and great circle distances to the event
are displayed adjacent to (c). Waveforms are ranked by the station’s longitude.

45

051015202530ID0100200Time(s)(a) BHZ0100200Time(s)(b) BHR0100200Time(s)BO.ONS,0.9°BO.SGN,1.2°BO.KZK,1.5°BO.TTO,1.6°BO.JIZ,1.7°BO.FUJ,1.7°BO.KNY,2.2°BO.KNM,2.4°BO.NAA,2.4°BO.TGA,3.2°BO.WTR,3.4°BO.ABU,3.9°BO.YAS,4.0°BO.KIS,4.2°BO.KMT,4.6°BO.YZK,4.7°BO.ISI,5.1°BO.NRW,5.5°BO.UMJ,5.6°BO.OKW,5.9°BO.TGW,6.3°BO.TSA,6.7°BO.NSK,6.8°BO.YTY,7.6°BO.INN,7.7°BO.TKD,7.9°BO.TMC,8.4°BO.SBR,8.5°BO.TKO,8.5°BO.IZH,9.1°BO.STM,9.1°(c) BHT120°E150°E15°N30°N45°N201104160219A (Mw 5.8, 81km)(d)Figure 2.20 Maps of 𝛿𝑙𝑛𝑉𝑆 perturbations above 100 km depth. The reference model is the 1-D
average of the 3-D model. Other features are the same as Figure 2.4.

46

15°N30°N45°N(a)10 km(b)20 km(c)30 km15°N30°N45°N(d)40 km(e)50 km(f)60 km90°E120°E150°E15°N30°N45°N(g)70 km90°E120°E150°E(h)80 km90°E120°E150°E(i)90 km−6−5−4−3−2−10123456δlnVS(%)Figure 2.21 Misfit reduction curves with respect to frequency bands in three stages: Stage 1 (red
stars), Stage 2 (blue diamonds), and Stage 3 (magenta squares), similar to Figure 2.3. Open
symbols represent the misfit prior to source inversion. Unlike Figure 2.3, this figure additionally
displays the misfit for each stage, encompassing more than just their respective iterations.

47

MeasurementsBody WavesSurface Waves623120130639639173148334645833158884Band 1Band 2Band 3Measured after the last iteration of each stagePeriod RangeBody WavesSurface Waves10   40 s40   120 s8   40 s30   120 s8   40 s20   120 sBand 1Band 2Band 3Starting misfit at each stage:0.150.200.250.30(a)0.20.30.4Ver. Body Wavep, s, P, S, sP, sS, PP, SS(b)0.20.30.4Rad. Body Wavep, s, P, S, sP, sS, PP, SS(c)0.20.30.4Tan. Body Waves, S, sS, SS, ScS(d)0.10.20.30102030Iteration No.Ver. Surface WaveRayleigh Wave(e)0.10.20.30102030Iteration No.Rad. Surface WaveRayleigh Wave(f)0.10.20.30102030Iteration No.Tan. Surface WaveLove Wave(g)0.20.30.4Misfit0.10.20.3MisfitFigure 2.22 Waveform comparisons for an earthquake (GCMT ID: 202307011438A) recorded by
selected stations marked by the red triangle in (e). The waveforms are filtered between 20 to 120
s. Other features follow the layout of Figure 2.10. (a-d) Waveforms at different stations,
comparing observed data (black) with synthetics from the starting model and the GCMT source
model (blue), and synthetics from the final model and the GCMT source (as we have not inverted
the source for this event) (red).

48

2004006008001000Time(s)IC.ENH(a)2004006008001000Time(s)IU.ULN(b)2004006008001000Time(s)IC.KMI(c)2004006008001000Time(s)IC.BJT(d)120°E150°E15°N30°N45°NIC.ENHIU.ULNIC.KMIIC.BJT202307011438A (Mw 4.9, 39km)(e)Figure 2.23 Similar to Figure 2.22 but for a different earthquake (GCMT ID: 202306220124A).

49

2004006008001000Time(s)IC.ENH(a)2004006008001000Time(s)IU.ULN(b)2004006008001000Time(s)IC.KMI(c)2004006008001000Time(s)IC.BJT(d)120°E150°E15°N30°N45°NIC.ENHIU.ULNIC.KMIIC.BJT202306220124A (Mw 5.0, 33km)(e)Figure 2.24 Similar to Figure 2.22, this figure relates to a different earthquake (GCMT ID:
200805071602A). As this event is included in the inversion process, measurement windows are
marked with black boxes for clarity. Additionally, synthetics from the initial model and the
inverted source model in stage 1 are displayed in green, closely aligning with those from the
starting model and the GCMT source model, illustrated in blue.

50

2004006008001000Time(s)IC.ENH(a)2004006008001000Time(s)IU.ULN(b)2004006008001000Time(s)IC.KMI(c)2004006008001000Time(s)IC.BJT(d)120°E150°E15°N30°N45°NIC.ENHIU.ULNIC.KMIIC.BJT200910300703A (Mw 6.8, 37km)(e)Figure 2.25 Study region comparison for EARA2023 (navy blue), FWEA18 (green), and
EARA2014 (blue), with plate boundaries from the NUVEL-1 model [DeMets et al., 1990]
delineated by red dashed lines. Tectonic units within China are outlined in black.

51

60°E90°E120°E150°E180°0°30°N60°N−6−4−20246Topography/Bathymetry (km)Table 2.1 Summary of weighting terms in the misfit function

Formula

Applied on

Comment

∑𝑁𝑒
𝑒=1

1
∑𝑁𝑒,𝑠
𝑠=1

𝑁𝑐,𝑒,𝑠,𝑤

Each category

Weighting
term
categorical
weight-
ing
𝑊𝑐

1
− (

𝑒

Δ𝑠 𝑗
Δ0

)2

geographical
weight-
ing
𝑊𝑒,𝑠

∑𝑁𝑒,𝑠
𝑗=1

Each station

𝑓1(CC) 𝑓2(SNR) 𝑓3(𝛿𝑡CC)Each measurement win-
dow

quality
weight-
ing
𝑊𝑤

𝑒: event; 𝑠: station; 𝑐: category; 𝑤:
window; 𝑁𝑒: number of events; 𝑁𝑒,𝑠:
number of stations for event 𝑒; 𝑁𝑐,𝑒,𝑠,𝑤:
number of measurement windows in
category 𝑐 for event 𝑒 and station 𝑠
𝑠: current station; 𝑗: other stations but
include 𝑠 itself; 𝑒: event; 𝑁𝑒,𝑠: number
of stations for event 𝑒; Δ𝑠 𝑗 : Epicenter
distance between station s and station
j; Δ0: reference distance, with detail in
note𝑎
CC:
cross-correlation coefficient;
SNR: signal to noise ratio; 𝛿𝑡CC: time
shift obtained from cross-correlation;
𝑓1, 𝑓2, and 𝑓3 are defined in the same
manner as equation (3) to equation (6)
in Tao et al. [2018]

𝑎For the geographical weighting, we select Δ0 as

nition is consistent with Ruan et al. [2019].

1
− (

𝑒

Δ𝑠 𝑗
Δ0

)2

= 1

3max
𝑥

1
𝑒− (

Δ𝑠 𝑗
𝑥 )2

∑𝑁𝑒,𝑠
𝑗=1

Σ

𝑁𝑒,𝑠
𝑗=1

. This defi-

52

CHAPTER 3

DEEP LEARNING FOR DEEP EARTHQUAKES: INSIGHTS FROM OBS
OBSERVATIONS OF THE TONGA SUBDUCTION ZONE

An edited version of this chapter is under review at Geophysical Journal International. Ziyi Xi,

Songqiao Shawn Wei, Weiqiang Zhu, Gregory C. Beroza, Yaqi Jie, and Nooshin Saloor. Deep

Learning for Deep Earthquakes: Insights from OBS Observations of the Tonga Subduction Zone

3.1 Abstract

Applications of machine learning in seismology have greatly improved our capability of detect-

ing earthquakes in large seismic data archives. Most of these efforts have been focused on conti-

nental shallow earthquakes, but here we introduce an integrated deep-learning-based workflow to

detect deep earthquakes recorded by a temporary array of ocean-bottom seismographs (OBSs) and

land-based stations in the Tonga subduction zone. We develop a new phase picker, PhaseNet-TF,

to detect and pick P- and S-wave arrivals in the time-frequency domain. The frequency-domain

information is critical for analyzing OBS data, particularly the horizontal components because they

are contaminated by signals of ocean-bottom currents and other noise sources in certain frequency

bands. PhaseNet-TF shows a much better performance in picking S waves compared to its predeces-

sor PhaseNet. The predicted phases are associated using an improved Gaussian Mixture Model As-

sociator GaMMA-1D and then relocated with a double-difference package teletomoDD. We further

enhance the model performance with a semi-supervised learning approach by iteratively refining

labeled data and retraining PhaseNet-TF. This approach effectively suppresses false picks and sig-

nificantly improves the detection of small earthquakes. The new catalog of Tonga deep earthquakes

contains more than 10 times more events compared to the reference catalog that was analyzed man-

ually. This deep-learning-enhanced catalog reveals Tonga seismicity in unprecedented detail, and

better defines the lateral extent of the double-seismic zone at intermediate depths and the location

of 4 large deep-focus earthquakes relative to background seismicity. It also offers new potential for

deciphering deep earthquake mechanisms, refining tomographic models, and understanding sub-

duction processes.

53

3.2

Introduction

Detecting and locating earthquakes in subduction zones plays a pivotal role in advancing the

understanding of subduction processes and earthquake physics. In particular, earthquakes deeper

than 50 km provide critical information on slab geometry, slab mineral dehydration and transfor-

mation, and the interaction between the slab and surrounding mantle [Green and Houston, 1995,

Zhan, 2020]. Initial studies, such as Ruff and Kanamori [1980], established meaningful connec-

tions between seismicity and physical attributes of subduction zones, such as the lateral extent

and penetration depth of the Wadati-Benioff zone, the age of the subducting lithosphere, conver-

gence rates, and back-arc spreading. In recent years, comprehensive global slab models like Slab1.0

[Hayes et al., 2012] and Slab2 [Hayes et al., 2018] have leveraged high-accuracy regional seismic-

ity catalogs to refine slab geometry. Precise earthquake distributions also help reveal the underly-

ing mechanisms of intermediate-depth (∼70–300 km) and deep-focus (300–700 km) earthquakes

[Wiens et al., 1993, Brudzinski et al., 2007, Kita et al., 2010, Wei et al., 2017, Chen et al., 2019b,

Florez and Prieto, 2019]. However, most global studies suffer from limited local station coverage,

especially offshore, and most regional studies with temporary deployments lack sufficient duration,

which limits high-precision earthquake locations in large numbers.

Recent advances in the applications of deep learning methods in seismology greatly increase the

information content that can be extracted from seismic datasets by detecting many more earthquakes

[Mousavi and Beroza, 2022, 2023]. Initial efforts, such as Gentili and Michelini [2006], Ross et al.

[2018], used simple neural networks for detecting seismic phase arrivals, a foundational step in

earthquake localization. Subsequent developments incorporated biomedical image segmentation

algorithms, notably the U-Net architecture [Ronneberger et al., 2015], to create highly effective

deep-neural network (DNN) phase pickers like PhaseNet [Zhu and Beroza, 2019]. The Transformer

architecture [Vaswani et al., 2017] has further inspired new models, such as the EQTransformer

[Mousavi et al., 2020], which leverages attention mechanisms to enhance phase detectability. For

seismic phase association that links seismic arrivals to preliminary event origins, significant im-

provements have been achieved through Gaussian mixture models [Zhu et al., 2022] and graph

54

neural networks (GNN) [McBrearty and Beroza, 2023]. These machine-learning-based techniques

outperform traditional methods in both phase-picking [Baer and Kradolfer, 1987, Sleeman and van

Eck, 1999] and phase association [Zhang et al., 2019b].

Most machine-learning studies have been focused on continental, shallow earthquakes. Limited

attention has been given to deep earthquakes in subduction zones. For instance, PhaseNet and

GaMMA were developed using seismic data from Northern California, where most earthquakes

occur at depths shallower than 20 km. EQTransformer was trained with the STanford EArthquake

Dataset (STEAD) [Mousavi et al., 2019], which, despite its global scope, contains earthquakes

predominantly shallower than 100 km. Generalized Seismic Phase Detection [Ross et al., 2018]

is developed with vast hand‐labelled data archives of the Southern California Seismic Network,

which is also dominated by continental earthquakes. Studies utilizing these methods, [e.g., Chai

et al., 2020, Liu et al., 2020, Park et al., 2020, Ross et al., 2020, Tan et al., 2021, Wilding et al.,

2023, Liu et al., 2023, Gong et al., 2023] similarly concentrate on continental earthquakes.

Since many subduction zones are covered by oceans, offshore seismic data is critical for inves-

tigating subduction zone earthquakes. However, data from ocean-bottom seismographs (OBSs) is

generally noisier than that from land-based stations because of ocean-bottom currents, seismometer

tilting, instrument coupling, etc. Recent efforts, such as PickBlue, utilize existing packages such

as PhaseNet and EQTransformer to process OBS data but show a lower performance compared to

continental data [Bornstein et al., 2023]. Therefore, a new phase picker is needed for processing

OBS data.

The Tonga subduction zone hosts abundant intermediate-depth earthquakes and produces the

majority of deep-focus earthquakes, and thus serves as a unique natural laboratory for studying

deep earthquakes. However, studying Tonga’s earthquakes has been challenging. Global catalogs,

such as the ISC EHB catalog, mainly rely on a handful of land-based stations on the islands of

Tonga and Fiji. These catalogs provide foundational information on Tonga earthquakes, but the

hypocentre precisions, particularly in the vertical direction, are limited due to the lack of local

stations. Since 1993, a few temporary seismic deployments, including broadband OBSs, greatly

55

improved the data coverage and earthquake hypocentre precisions in this region, leading to ground-

breaking discoveries [Wiens et al., 1993, 1994, Wei et al., 2017]. However, the short duration of

these deployments has limited the number of analyzed earthquakes and motivates a systematic effort

to mine these datasets for more small earthquakes.

Innovative approaches are needed to tackle earthquake detection and location in the Tonga sub-

duction zone. First, the phase picker must be capable of handling noise in OBS data that is some-

times enriched in specific frequency ranges. To solve this problem, we introduce a new phase picker

PhaseNet-TF to detect seismic arrivals in the time-frequency domain. Second, the phase associator

should be more efficient than the conventional back-projection-based methods and account for the

change in seismic velocity with respect to depth. We develop a new associator GaMMA-1D to

associate arrivals output from PhaseNet-TF. Finally, given the limited data, optimizing the use of

existing data is crucial. Here we build a new semi-supervised learning-based workflow to analyze

seismic data from a 1-year temporary deployment.

3.3 Data

In this study, we analyzed seismic data recorded by a temporary seismic array deployed in the

Tonga subduction zone from November 2009 to December 2010 (Figure 3.1). This array included

49 OBSs (network ID YL) with either Guralp CMG3T_120sec (100 Hz sample rate) or LDEO

OBS Sensor Mk2 seismometers (40 Hz sample rate), 17 island-based stations (network ID Z1) with

Guralp CMG40T, Streckeisen STS-2, or Nanometrics Trillium 120 seismometers (40 Hz sample

rate), and one GSN station MSVF (network ID II) with a Geotech KS-54000 Borehole seismometer

(20 Hz sample rate).

We compiled a reference catalog of local earthquakes, consisting of 1,163 events, 42,256 P-

wave arrivals, and 14,852 S-wave arrivals (Table 3.1) that were manually picked with the Antelope

software [Wei et al., 2017]. This catalog is hereinafter called the manually picked reference catalog

or the reference catalog. We created a reference dataset by windowing three-component waveforms

5 minutes before and 5 minutes after each P-wave arrival. This window is sufficiently long to

include the corresponding S arrival for the same event. We subsequently removed the instrumental

56

response to obtain displacement waveforms in three components: vertical (Z), east (E or 1), and

north (N or 2). We further resample the data to 40 Hz. No additional preprocessing, such as filtering

and component rotation, was applied.

We also created a continuous dataset by partitioning the three-component continuous waveforms

recorded by this array into 10-day segments and subsequently removing the instrumental response.

Since not all stations had complete data from November 2009 to December 2010, we replaced the

missing data with zeros. Incomplete components were also accepted, with missing channels filled

with zeros. Given the large volume of the data, exceeding 2TB in the miniSEED format, we used

the mseedindex package and an ObsPy wrapper [Beyreuther et al., 2010] to construct a miniSEED

database. This facilitated efficient data analysis and improved machine learning I/O performance.

3.4 Methods

3.4.1 Phase arrival-time picking by PhaseNet-TF

We develop a new phase picker PhaseNet-TF, based on its predecessor PhaseNet [Zhu and

Beroza, 2019], to leverage the benefits of the time-frequency domain, which excels in capturing both

temporal and spectral features of seismic data. PhaseNet-TF adapts the architecture of DeepLabv3+

[Chen et al., 2018] to accommodate data in the time-frequency domain (Figure 3.2). DeepLabv3+ is

a state-of-the-art semantic image segmentation model that incorporates an encoder-decoder struc-

ture to refine object boundaries in segmentation tasks. As part of the renowned DeepLab model

series, it offers top-tier performance in a wide array of applications, ranging from autonomous driv-

ing to medical image analysis, and outperforms earlier models such as U-Net [Ronneberger et al.,

2015], which was used in PhaseNet. A seismic spectrogram acts as an image that represents the

time-frequency distribution of phase signal and noise. A spectrogram is generated by applying the

short-time Fourier transform (STFT) to three-component time-domain waveforms, and thus con-

sists of 6 components: i.e., the real and imaginary parts of the three-component waveform spectra.

Using the spectrogram as an input, DeepLabv3+ produces a pixel-level classification image that

matches the dimension of the spectrogram. This output highlights the relative positions of signal

and noise and is subsequently processed by a multilayer perceptron to estimate phase and noise

57

probabilities in the time domain.

The manually picked reference dataset is divided into training, validation, and test datasets using

stratified sampling, with a distribution ratio of 90:5:5. The ratio for the training dataset is relatively

high as we have a limited amount of data. This approach ensures an equitable representation of both

P and S waves, especially given the fact that the number of S wave picks is about 1/5 of that of P wave

picks. The input waveform window is 120 seconds long, with the P wave arrival initially centered

at the 10-second timestamp. We augment the training dataset in 3 ways. First, we randomly shift

the waveform windows to prevent the model from overfitting to specific phase arrival positions.

The 10-minute long waveform in our dataset is sufficient for cutting and shifting the 120-second

window. Second, we randomly stack two signal-bearing windows or one signal-bearing and one

noise-only window. The ratio for stacking is fine-tuned as a hyperparameter, allowing the model

to adapt to more complex real-world scenarios and preventing it from mistakenly learning that a

120-second window always contains only two phases (P and S). Third, we stabilize the input data

through normalization by subtracting the mean and dividing by the standard deviation of the 120-

second window. When applying random stacking, each window is also normalized before stacking,

and then the entire stacked waveform is normalized. For the validation and testing sets, we do not

shift or stack windows and only stabilize the input data through normalization.

We formulate the training labels to represent the probability of phase arrival times and use the

Kullback-Leibler (KL) divergence for the loss function. The KL divergence differs from the cross-

entropy loss used in PhaseNet only by a constant value, so they are equivalent for optimization. We

define the probability at time t as follows:

𝑦true(𝑡) = 𝑒− (𝑡 −𝑡

0
2 𝛿2

)2

, where

|𝑡 − 𝑡0| ≤ 3𝛿

(3.1)

Here, 𝑡0 is the phase arrival time, and 𝛿 is the width of the label (20 points, or 0.5 s in our case).

This label definition smooths the phase arrival time and allows for the quantification of classifica-

tion uncertainty through the shape of the model predictions. The KL divergence, measuring the

58

similarity between two probability distributions, is defined as:

𝐿(𝑦true, 𝑦predict) = 𝑦true ∗ log

𝑦true
𝑦predict

(3.2)

where 𝑦predict is the model prediction. A lower KL divergence value indicates a higher similarity be-

tween 𝑦true and 𝑦predict. The KL divergence is evaluated across three output channels: probabilities

of P, S, and noise. The sum of these probabilities is fixed as 1 at each time stamp.

We use the open-source PyTorch library, including AdamW [Loshchilov and Hutter, 2019] for

optimization and MultiStepLR for learning rate scheduling. AdamW is widely used in computer

vision tasks and deviates from the traditional Adam optimizer by decoupling the weight decay from

the gradient update. MultiStepLR adjusts the learning rate at specific epochs, decreasing it by a

fixed rate of 0.6 in our case. Our model is trained for 400 epochs, starting with a learning rate of

0.0004, which decays at epochs 15, 30, 45, and 60. We also add an L2 regularization term with a

weight decay of 0.001 to the loss function to mitigate overfitting. Early stopping is implemented to

prevent overfitting and save computational time. Training is halted if the validation loss does not

improve for 30 consecutive epochs. For our reference dataset, the training took about 5 hours on

16 NVIDIA Tesla V100 GPU cards at the MSU HPCC.

We apply the trained PhaseNet-TF model to the continuous dataset for phase detection. The

output is continuous probability distributions for P waves, S waves, and noise. We first partition the

continuous waveforms into 120-second segments with a 60-second overlap between consecutive

segments. Each segment is normalized in the same way used for model training. Then we applied

the model to the entire continuous dataset, which took 16 hours on 4 NVIDIA Tesla V100 GPUs

to process. The output is 120-second segments of probability distributions for P wave, S wave, and

noise. We combine these 120-second segments into a single continuous time series by taking the

final output probability as the maximum value from the overlapping predictions. Peak probabilities

larger than 0.5 are counted as positive picks.

3.4.2 Phase association by GaMMA-1D

Associating phase picks to specific earthquakes is necessary for locating events and eliminating

unreliable picks. We use GaMMA-1D, a Bayesian Gaussian Mixture Model Associator with a 1D

59

velocity structure, which is an improved version of GaMMA [Zhu et al., 2022]. While GaMMA-1D

retains the Gaussian mixture model framework of its predecessor GaMMA for phase association,

it improves calculating phase arrival times by using a fast-sweeping method to solve the Eikonal

equation based on a 1D velocity model AK135 [Kennett et al., 1995]. In contrast to GaMMA which

used a uniform half-space for arrival time predictions, Gamma-1D uses a 1D velocity model, which

is critical for the large depth range of earthquakes in Tonga. Events associated with less than 10

picks are discarded. Figure 3.3 shows the association results for a densely packed sequence of phase

picks.

3.4.3 Earthquake relocation by teletomoDD

We use teletomoDD [Pesicek et al., 2010], a package for double-difference seismic tomography

and relocation, to relocate all events associated with GaMMA-1D in the previous step. The 3D

seismic velocity model is fixed during inversions and is adopted from the TX2019slab model [Lu

et al., 2019]. We apply a bootstrap resampling technique to estimate relocation uncertainties and

filter out events with large uncertainties. We create 1,000 subsets of the data by randomly excluding

30% of the stations from each subset. After relocating events in these 1,000 subsets, we compute

the mean and standard deviation of the hypocentre and origin time of each event. We eliminate

events with a standard deviation in longitude and latitude greater than 0.1 degrees, in depth of

greater than 10 km, or in origin time of greater than 1 second. This approach effectively removes

unreliable picks from PhaseNet-TF and GaMMA-1D as well as events that are poorly constrained.

The relocation output catalog contains the hypocentres and origin times of the remaining events

and the corresponding P- and S-wave arrival times.

3.4.4 Semi-supervised-learning-based workflow

Since there are only 1,163 manually picked events out of presumably tens of thousands of earth-

quakes in the Tonga subduction, the reference catalog and dataset may limit phase detection capa-

bility. Therefore, we utilize a semi-supervised learning strategy to iteratively refine labeled picks

and retrain PhaseNet-TF (Figure 3.4). This approach integrates a limited labeled dataset with a

larger pool of unlabeled data for model training. In Iteration #1, we train PhaseNet-TF with the

60

original labeled dataset, i.e., the manually picked reference dataset. This model is then applied

to the continuous dataset, generating new phase picks that may include false detections. The sub-

sequent steps of phase association by GaMMA-1D and event relocation by teletomoDD filter out

unreliable picks and events with large uncertainties. Compared to the reference catalog, the output

catalog thus contains a larger number of reliable picks. Similar to the reference dataset, we create

a new labeled dataset by windowing three-component waveforms 5 minutes before and 5 minutes

after each P wave arrival from the output catalog. This newly labeled dataset is divided into the

training, validation, and test datasets at a ratio of 90:5:5 to train the PhaseNet-TF model in the next

iteration. As more picks predicted by deep learning are added to the training dataset, one can expect

more picks and events to be detected, at the cost of increasing arrival time residuals compared to

the reference dataset. We continue this iterative workflow for several iterations until the number of

events reaches a plateau and the arrival time residuals do not increase dramatically.

3.5 Results

3.5.1 PhaseNet-TF model assessment

We assess the PhaseNet-TF model in each semi-supervised-learning iteration using the test

dataset, which is 5% of the labeled dataset in the corresponding iteration (Table 3.2). The evalua-

tion metrics include precision, recall, F1 score, and arrival-time residuals compared to the labeled

catalog. This labeled catalog is the manually picked reference catalog for Iteration #1 and is the

output catalog from the previous iteration for Iteration #2 and #3. Predicted picks with arrival-time

residuals smaller than 1 second are considered true positives, whereas predicted picks with larger

arrival-time residuals are false positives. Labeled picks that are not predicted by PhaseNet-TF are

considered false negatives. Precision, recall, and F1 score are defined as

Precision: 𝑃 =

𝑇𝑃
𝑇𝑃 + 𝐹𝑃

Recall: 𝑅 =

F1: 𝐹1 =

𝑇𝑃
𝑇𝑃 + 𝐹𝑁
2𝑃𝑅
𝑃 + 𝑅

61

(3.3)

(3.4)

(3.5)

where 𝑇𝑃, 𝐹𝑃, and 𝐹𝑁 are the numbers of true positives, false positives, and false negatives, respec-

tively.

In Iteration #1, the PhaseNet-TF output is evaluated against the manually picked reference cata-

log. For the P wave, the model exhibited a precision of 0.99, a recall of 0.99, and an F1 score of 0.99.

For the S wave, the corresponding values are 0.97, 0.99, and 0.98, respectively. Figure 3.5 shows

examples of seismograms, spectrograms, and prediction probabilities from iteration #1 at 2 OBSs

and 2 land-based stations. In the following iterations, the evaluation metrics remain at the same

high level, validating the semi-supervised learning approach and the robustness of PhaseNet-TF in

accurately identifying phase arrivals.

3.5.2 Phase association and earthquake relocation assessments

Phase association and earthquake relocation serve as critical filters for eliminating unreliable

predicted picks and poorly constrained events. When manual picks are associated with GaMMA-

1D, some of them are missed in the association catalog and considered false negative picks. The

picks associated with GaMMA-1D are subsequently used for event relocation by teletomoDD with

bootstrap resampling. Events with large uncertainties and their corresponding picks are discarded

during the relocation process. We assess these processes (Table 3.3) using the reference catalog that

contains manual picks associated with the Antelope software [Wei et al., 2017]. When comparing

the reference catalog and the output catalog by GaMMA-1D or teletomoDD, events with origin-time

residuals smaller than 15 seconds are considered true positive events, whereas events with larger

origin-time residuals are false positives. If some picks that were associated with a single event by

Antelope are associated with multiple events by GaMMA-1D, the new events are also counted as

false positives. Because there are no picks added during this processing, the recall for P- or S-wave

arrivals reflects the picks eliminated during association and relocation, and a high recall value is

desired. In contrast, new events may be added during the association process, lowering precision,

whereas existing events may be discarded during the association and relocation processes, lowering

recall. Thus, the F1 score for events that balance precision and recall serves as a better indicator of

the filtering performance.

62

As shown in Table 3.3, the recall for P-wave arrivals is 0.97 after association and 0.95 after

relocation, suggesting that GaMMA-1D and teletomoDD are highly effective in retaining manually

picked P-wave arrivals. However, the recall for S-wave arrivals is 0.92 after association and 0.86

after relocation. These numbers indicate that about 8% of the manually picked S-wave arrivals

are not successfully associated with GaMMA-1D, which impacts the subsequent relocation per-

formance. This could be attributed to either the limitations of GaMMA-1D in associating S-wave

arrivals or inaccurate manual S-wave picks. Nonetheless, the overall performance of the association

and relocation filtering processes remains promising.

3.5.3 Phase detection and event relocation on continuous data

In Iteration #1, the PhaseNet-TF model is trained by the manually picked reference dataset.

When applying this model to the continuous data, PhaseNet-TF detects 294,050 P-wave arrivals

and 112,547 S-wave arrivals, which is substantially more than the number of picks in the reference

catalog. Figure 3.6 demonstrates the performance on one hour of continuous data. These arrivals

are associated with GaMMA-1D in a preliminary catalog. In this step, about 10% of P- and 30%

of S-wave arrivals are discarded, and the associated catalog consists of 13,111 events with 265,439

P-wave, and 79,380 S-wave arrivals. These events generally align with the reference catalog but are

more scattered (Figures 3.7a, 3.7b, 3.8a, and 3.8b). Many events in the mantle wedge are not reliable

as they have a large azimuthal coverage gap. The subsequent relocation and error estimation filter

out most of these outlier events, leaving a new catalog of 9,427 events with 217,254 P-wave, and

63,590 S-wave arrivals (Figures. 3.7c and 3.8c, Table 3.1). When comparing this catalog with the

reference catalog, the recall for P-waves, S-waves, and events are 0.94, 0.83, and 0.96, respectively,

and the standard deviations of arrival-time residuals are 0.14 and 0.15 seconds for P and S waves,

respectively (Figures 3.9a and 3.9b), suggesting that our workflow can effectively detect seismic

arrivals and earthquakes that were manually picked. More importantly, the new catalog contains

dramatically more P- and S-wave arrivals and events (Table 3.1).

Leveraging this new catalog, we assemble a new labeled dataset enriched with phase arrivals

detected in Iteration #1 of the semi-supervised learning workflow. This new dataset (120 GB) is

63

substantially larger than our initial reference dataset (22 GB). The increased dataset size requires

additional computational resources, extending the training time from 5 to 24 hours while utilizing

the same number of GPUs. In Iteration #2, this new PhaseNet-TF model is applied to the continuous

dataset again, resulting in significantly more arrivals and events (Figures. 3.7d and 3.8d). The

standard deviation of arrival-time residuals for P waves remains 0.15 seconds (Figure 3.9c), but

that for S waves increases from 0.15 to 0.23 seconds (Figure 3.9d). In Iteration #3 which uses the

output catalog from Iteration #2 for training, the numbers of arrivals and events and arrival-time

residuals remain stable (Figs. 3.7e and 3.8e). We thus cease the semi-supervised learning workflow

after Iteration #3, anticipating diminishing returns in further iterations.

Our final catalog from Iteration #3 contains 13,406 relocated events with 372,774 P-wave ar-

rivals and 78,853 S-wave arrivals. Compared with the manually picked reference catalog, our final

compilation boasts a factor of 11 times more events, 8 times more P-wave phases, and 5 times

more S-wave phases. Figures 3.7 and 3.8 show that our final catalog offers enhanced delineation of

both the slab geometry and double seismic zone, demonstrating its superiority over the reference

catalog.

3.6 Discussion

3.6.1 Comparison with previous packages

In this study, PhaseNet-TF detects seismic arrivals in the time-frequency domain, different from

most other deep-learning phase pickers that work in the time domain. Using a manually picked ref-

erence catalog and dataset, we conduct a quantitative and fair comparison between PhaseNet and

PhaseNet-TF. We first test the original PhaseNet model that was trained by the Northern California

data [Zhu and Beroza, 2019]. We also retrain the PhaseNet architecture with our training dataset

from Tonga. The PhaseNet-TF model in Iteration #1 is used for comparison. All models are eval-

uated on the same testing dataset to ensure a fair comparison.

Table 3.2 highlights the superiority of PhaseNet-TF over PhaseNet in detecting seismic arrivals,

particularly for OBS data. PhaseNet with its original weights shows poor performance for both P

and S waves. This is an unsurprising outcome given that it was trained with a dataset from a different

64

tectonic setting and on land-based vs. OBS instruments. When retrained with the Tonga dataset,

PhaseNet’s performance is similar to PhaseNet-TF for detecting P-wave arrivals but displays a lower

performance for S waves. Furthermore, the standard deviation of S-waves arrival-time residuals

for PhaseNet-TF (Iteration #1) are significantly smaller than those for PhaseNet models (Figure

3.9b and 3.9h). These differences indicate that including the time-frequency domain and the new

architecture enhances the model’s capability to detect S waves. This is because the horizontal

components that record S waves are much noisier for OBS data compared to land-based stations,

due to seismometer tilting and ocean-bottom currents [Webb and Crawford, 1999, Wei et al., 2015].

For associating seismic arrivals from Tonga deep earthquakes, GaMMA-1D exhibits higher

performance compared to GaMMA, which was designed for California [Zhu et al., 2022]. That

is because GaMMA-1D uses a 1D velocity model AK135, whereas GaMMA assumes a uniform

velocity model. We test both GaMMA and GaMMA-1D to associate all manual picks and com-

pare the output catalogs against the reference catalog. When comparing the origin-time and depth

residuals, GaMMA-1D consistently achieves superior accuracy to its predecessor GaMMA (Figure

3.10). Table 3.3 lists the evaluation metrics for GaMMA-1D and GaMMA. Compared to GaMMA,

GaMMA-1D achieves a similar performance for associating P- and S-wave picks to certain events.

However, the low precision of event association (0.76) suggests that GaMMA tends to break a sin-

gle event into multiple events. This problem will impact the next step of earthquake relocation,

resulting in more events with poorer constraints and/or misassociated phases.

Our workflow is readily adaptable to a cloud computing setting through modifications to the

machine learning models employed in Quakeflow [Zhu et al., 2023b]. Quakeflow is a cloud-based

earthquake monitoring system designed for detecting seismic activity and analyzing source charac-

teristics from continuous waveform data. It currently utilizes PhaseNet for phase picking, GaMMA

for phase association, and HypoDD for event relocation. Given the compatibility in model inputs

and outputs, these can be smoothly swapped with PhaseNet-TF, GaMMA-1D, and teletomoDD. A

future version of Quakeflow potentially provides an efficient and effective solution for enhancing

real-time earthquake surveillance and for in-depth analysis of historical seismic data, particularly

65

for deep earthquakes in subduction zones and OBS data.

3.6.2 Tonga deep earthquakes revealed by the new catalog

With only 1 year of data, our results show unprecedented detail in the Wadati-Benioff zone.

Figure 3.11 compares the new catalog against the manually picked reference catalog [Wei et al.,

2017] and the ISC EHB Bulletin from 1964 to 2020 [International-Seismological-Centre, 2023].

The latter uses the EHB algorithm [Engdahl et al., 1998] to minimize hypocentre errors, particularly

in the vertical direction, and is arguably the most precise global catalog. The general pattern of

earthquake distribution remains similar across all catalogs.

Double seismic zones (DSZs), in which intermediate-depths earthquakes occur along two planes

parallel to the dip of the slab, are observed in many subduction zones and are attributed to meta-

morphic reactions that release fluids or volatiles in the slab [e.g., Hacker et al., 2003, Yamasaki and

Seno, 2003, Brudzinski et al., 2007]. The Tonga DSZ was initially discovered by Kawakatsu [1985]

using focal mechanisms constrained by global data. Using the 2009-2010 temporary deployment,

Wei et al. [2017] confirmed the existence of the Tonga DSZ and revealed more details. Our new

catalog shows a much clearer DSZ along cross-section B-B’ that extends to about 300 km depth.

Although hinted by Wei et al. [2017], the new results explicitly suggest that the lower plane of the

DSZ is confined between the latitudes of 19°S and 21°S (the cyan box in Figure 3.11a), diminishing

to the north and south. A DSZ with a limited extent in Tonga is in agreement with similar obser-

vations in Japan [Igarashi et al., 2001] and Alaska [Wei et al., 2021], suggesting that deeper parts

of the slab mantle are not uniformly hydrated. We do not observe a deeper DSZ at 350–460 km

depths that was interpreted as the edges of a metastable olivine wedge in the slab [Wiens et al.,

1993]. This could be because there were not enough earthquakes occurring at these depths during

the 1-year deployment to delineate that feature.

The Tonga subduction zone hosts the majority of deep-focus earthquakes in the world, including

4 notable large events: Mw 7.6 on 9 March 1994, Mw 7.3 on 9 November 2009, Mw 8.2 on 19

August 2018, and Mw 7.9 on 6 September 2018. Our results show that the 1994 Mw 7.6 and 2018

Mw 8.2 events occurred at the bottom of a highly seismogenic region (Figure 3.11d), consistent

66

with the previous suggestion that these events initiated rupturing in the cold slab [McGuire et al.,

1997, Fan et al., 2019].

In contrast, the 2018 Mw 7.9 event occurred in a previously aseismic

region (Figure 3.11g), possibly rupturing through a dissipative process at the edge of a warm fossil

slab [Fan et al., 2019, Jia et al., 2020]. The 2009 Mw 7.3 event occurred at the western end of a

seismicity band corresponding to a fossil slab subducted at the now inactive Vitiaz Trench [Cai and

Wiens, 2016].

3.7 Conclusions

Our integrated workflow has proven highly effective at detecting and locating deep earthquakes

in the Tonga subduction zone. We use PhaseNet-TF to detect P- and S-wave arrivals in the time-

frequency domain, showing superior performance over traditional ways of picking arrivals in the

time domain. Detecting arrivals in the time-frequency domain is critical for analyzing OBS data,

particularly the horizontal components with much higher noise levels compared with land-based

stations. We use GaMMA-1D to associate arrivals, teletomoDD to relocate events, and a bootstrap

resampling approach to estimate uncertainties. This workflow effectively removes artificial arrivals

and events with poor constraints. Furthermore, through semi-supervised learning, the PhaseNet-

TF model improves iteratively, leading to a more comprehensive and accurate earthquake catalog

compared to the initial manual picks.

This research opens new avenues for in-depth studies in subduction zones, particularly those

with limited local data coverage like Tonga. The new catalog with more events and arrivals poten-

tially benefits future work of high-resolution tomography imaging and earthquake similarity anal-

yses, helping us better understand deep earthquake mechanisms and subduction processes. While

our study focuses on Tonga, the methods and workflow are readily adaptable for application in

other subduction zones, offering a versatile tool for both real-time monitoring and historical data

analyses.

67

3.8 Figures

Figure 3.1 Map of the Tonga subduction zone and adjacent regions. Earthquakes in the manually
picked reference catalog are shown as dots color-coded by depth. Black triangles, inverted
triangles, and squares represent land-based stations, ocean-bottom seismographs, and a GSN
station, respectively, deployed from November 2009 to December 2010. Land areas are shaded in
grey. The bathymetry contours of 1 km highlight features such as the Tonga Ridge, Lau Ridge,
and Fiji Plateau. Additional bathymetry contours at 7, 8, 9, and 10 km delineate the Tonga Trench.
The black arrow indicates the Pacific Plate’s motion relative to Tonga. Inset provides a global
context for the study region.

68

176°E178°E180°178°W176°W174°W172°W26°S24°S22°S20°S18°S16°S14°STonga TrenchLau BasinFiji0200400600Earthquake depth (km)YLZ1IIFigure 3.2 PhaseNet-TF architecture. The input consists of 120-second three-component
seismograms with a sample rate of 40 Hz, so the input has a dimension of 3×4800. The output
includes three probability time series, corresponding to P-wave arrival, S-wave arrival, and noise,
with the same length as the input. Dimensions for each layer are denoted adjacent to the layer,
with the format of ”number of channels × layer height × layer width”. The input seismograms are
first transformed into spectrograms before being processed through the DeepLabv3+ network.
DeepLabv3+ includes an encoder and a decoder and is equipped with skip layers. The decoder
features a dilated ResNet34 and Atrous Spatial Pyramid Pooling (ASPP). The output of
DeepLabv3+ is subsequently refined through a multilayer perceptron.

69

Batch normalization + ReLUBatch normalizationDilated convolutionalDepthwise convolutionalPointwise convolutionalBilinear upsamplingReLUFully connectedConvolutionalLayersWaveforms (3components,120s)3 x 4800Spectrograms(real + imag)6 x 64 x 4800DilatedResNet3464 x 16 x 120064 x 16 x 1200256 x 4 x 300512 x 4 x 30064 x 32 x 2400256 x 1 x 1256 x 4 x300256 x 4 x300256 x 4 x300256 x 4 x300256 x 4 x3001024 x 4 x 300256 x 4 x 300256 x 4 x 300256 x 16 x 120048 x 16 x 1200304 x 16 x 1200256 x 16 x 12003 x 16 x 12003 x 64 x 48003 x 32 x 48003 x 4800Atrous Spatial PyramidPooling (ASPP)ConcateIdentical128 x 8 x 600ConcateEncoderDecoderProbabilities (Pwave, S wave,and noise)Atrous: 12Atrous: 24Atrous: 36Avg. PoolingShortcutShortcutShortcutShortcutBlockx 3Blockx 4Blockx 6Blockx 3DeepLabv3+Figure 3.3 Examples of GaMMA-1D association from UTC 13:43 to 13:53 on September 15,
2010. The top panel shows associated arrivals with respect to longitude, whereas the bottom panel
shows them against latitude. In both panels, individual colors denote distinct associated events.
Dots and triangles indicate P- and S-wave arrivals, respectively, while crosses mark the origin
time and locations of the associated events. Events associated with less than 10 phase arrivals
(e.g., the green-colored event and arrivals) will be discarded in the following step.

70

−25−20−15Latitude (degree)13:4413:4613:4813:5013:522010 09 15PSEventPSEventPSEventPSEventPSEventPSEvent175180185190Longitude (degree)PSEventPSEventPSEventPSEventPSEventPSEventFigure 3.4 Schematic of our semi-supervised learning workflow. The process begins with training
the PhaseNet-TF model using manually picked arrivals. The trained model is then applied to
continuous seismic data to obtain P- and S-wave arrival probabilities. The following phase
association by GaMMA-1D, event relocation by teletomoDD, and error estimation/filtering with a
bootstrap resampling approach together produce a refined event catalog and associated phase
arrivals. This updated catalog serves as a new labeled dataset for training and validating
PhaseNet-TF in the next iteration. The entire workflow is iteratively repeated until no significant
improvements are observed, culminating in the final event catalog and corresponding phase
arrivals.

71

Labeled picksPhaseNet-TF (Fig. 2)Continious 3-component waveformTrainingPredictingPredicted P and S arrivals(Fig. 6)Phase association(GaMMA-1D)Event relocation(teletomoDD)New catalog andcorresponding arrivalsError estimation and filtering(Bootstrap resampling)RefiningRetrainingFinal catalog andcorresponding arrivalsFigure 3.5 Examples of PhaseNet-TF prediction on the test dataset. (a) and (b) show deep
earthquakes recorded at OBSs, whereas (c) and (d) show events recorded at land-based stations.
In each panel, the title includes the origin time, hypocenter, and station name. The top 3 sections
display the three-component waveforms in the time domain, followed by 3 sections of
spectrograms (power value) of the three components. The bottom section shows the probabilities
of predicted P- and S-wave arrivals. Manually picked P- and S-wave arrivals are indicated by
dashed lines.

72

−0.80.00.8Amplitude1(a)2010 02 17 06:32:08.00 21.56S,178.18W 452.8km YL.C11W−0.80.00.8Amplitude2−0.80.00.8AmplitudeZ048Frequency1048Frequency2048FrequencyZ0.00.40.8ProbabilityPredicted PPredicted SManually picked PManually picked S1(b)2010 09 26 17:20:38.87 20.78S,178.39W 618.2km YL.A012Z12ZPredicted PPredicted SManually picked PManually picked S−0.80.00.8AmplitudeE(c)2010 07 11 10:29:21.54 25.11S,179.80W 655.2km Z1.FONI−0.80.00.8AmplitudeN−0.80.00.8AmplitudeZ048FrequencyE048FrequencyN048FrequencyZ0.00.40.8Probability020406080100120Time (s)Predicted PPredicted SManually picked PManually picked SE(d)2010 03 07 23:30:23.92 21.94S,175.70W 148.1km Z1.NMKANZENZ020406080100120Time (s)Predicted PPredicted SManually picked PManually picked S0.00.51.01.52.0Specrogram AmplitudeFigure 3.6 Examples of PhaseNet-TF prediction on continuous data from UTC 00:00 to 01:00 on
7 April 2010 on an OBS B04W (a) and a land-based station VAVP (b). (a) and (b) are similar to
Figure 3.5, except that there are no manual picks. (c) Zoomed-in waveform filtered at 3-10 Hz and
P/S probability within the time window indicated by the magenta box in (a). (d) Zoomed-in
waveform filtered at 3-10 Hz and P/S probability within the time window indicated by the
magenta box in (b). The P and S arrivals are confirmed to be true by the following steps of phase
association and event relocation.

73

−0.80.00.8AmplitudeYL.B04W 2010 04 07 00:00:00 to 01:00:001(a)−0.80.00.8Amplitude2−0.80.00.8AmplitudeZ048Frequency (Hz)1048Frequency (Hz)2048Frequency (Hz)Z0.00.51.0Possibility0100020003000Time (s)Predicted PPredicted S−0.80.00.8Zoomed in waveform and P/S probability(c)1−0.80.00.82−0.80.00.8125012601270128012901300Time (s)Z−0.80.00.8AmplitudeZ1.VAVP 2010 04 07 00:00:00 to 01:00:00E(b)−0.80.00.8AmplitudeN−0.80.00.8AmplitudeZ048Frequency (Hz)E048Frequency (Hz)N048Frequency (Hz)Z0.00.51.0Possibility0100020003000Time (s)Predicted PPredicted S−0.80.00.8Filtered between 3 and 10 Hz(d)E−0.80.00.8N−0.80.00.8147014801490150015101520Time (s)Z0.00.20.4Specrogram AmplitudeFigure 3.7 Maps of event distributions at various stages of the analysis. (a) The manually picked
reference catalog by Wei et al. [2017]. (b) Events predicted by PhaseNet-TF and associated by
GaMMA-1D in Iteration #1. (c) Relocated events serve as the output catalog of Iteration #1. (d)
Output catalogue of Iteration #2. (e) Output catalog of Iteration #3, considered the final catalog.
All panels are plotted similarly to Figure 3.1. The number of events for each step is shown in the
bottom-left corner of each panel.

74

25°S20°S15°S(a) Reference1163 events(b) Associated (Iteration #1)13111 events180°175°W(c) Iteration #19427 events25°S20°S15°S(d) Iteration #213799 events180°175°W(e) Iteration #313406 events0200400600Depth (km)Figure 3.8 Cross-sections of event distributions at various stages of the analysis. (a) The manually
picked reference catalog by Wei et al. [2017]. (b) Events predicted by PhaseNet-TF and associated
by GaMMA-1D in Iteration #1. (c) Relocated events serve as the output catalog of Iteration #1.
(d) Output catalogue of Iteration #2. (e) Output catalog of Iteration #3, considered the final
catalog. (f) The location of the cross-section (blue line). In each of (a) – (e), black circles indicate
earthquakes within 70 km away from the cross-section. Red and magenta curves show the slab
upper interface according to the Slab1.0 [Hayes et al., 2012] and Slab2 [Hayes et al., 2018]
models, respectively. Neither of these models is accurate in the Tonga subduction zone.

75

0200400600Depth (km)Slab2Slab 1.0(a) ReferenceSlab2Slab 1.0(b) Associated (Iteration #1)Slab2Slab 1.0(c) Iteration #10200400600Depth (km)0200400600Distance (km)Slab2Slab 1.0(d) Iteration #20200400600Distance (km)Slab2Slab 1.0(e) Iteration #3180°175°W25°S20°S15°S(f)Tonga TrenchLau BasinFijiFigure 3.9 Arrival-time residuals of P (top) and S (bottom) waves across different models and
iterations. When each model is applied to the continuous dataset, the predicted arrivals are
compared to manual picks when they exist, and the arrival-time differences contribute to the
histogram. (a-f) Results of PhaseNet-TF from three consecutive iterations. (g-h) Outcomes of
PhaseNet retrained with the Tonga dataset. (i-j) Results of PhaseNet with its original model
weights trained with data from Northern California.

76

0200040006000800010000Count(Iteration #1)PhaseNet TFP Wavemean: 0.02 sstd: 0.14 s(a)(Iteration #2)PhaseNet TFP Wavemean: 0.02 sstd: 0.15 s(c)(Iteration #3)PhaseNet TFP Wavemean: 0.02 sstd: 0.16 s(e)Tonga modelPhaseNetP Wavemean:  0.01 sstd: 0.15 s(g)origional modelPhaseNetP Wavemean: 0.00 sstd: 0.21 s(i)05001000150020002500Count−0.8−0.40.00.40.8Arrival time residual (s)S Wavemean:  0.02 sstd: 0.15 s(b)−0.8−0.40.00.40.8Arrival time residual (s)S Wavemean:  0.06 sstd: 0.23 s(d)−0.8−0.40.00.40.8Arrival time residual (s)S Wavemean:  0.10 sstd: 0.28 s(f)−0.8−0.40.00.40.8Arrival time residual (s)S Wavemean:  0.03 sstd: 0.29 s(h)−0.8−0.40.00.40.8Arrival time residual (s)S Wavemean: 0.09 sstd: 0.31 s(j)Figure 3.10 Origin-time (top) and depth (bottom) residuals for GaMMA-1D and GaMMA. When
manual picks are associated by GaMMA-1D or GaMMA, the event origin times and depths are
compared against the reference catalog that was associated by the Antelope software [Wei et al.,
2017].

77

0100200300400Count−20020Origin time Residual (s)GaMMA 1D(a)−20020Origin time Residual (s)GaMMA(b)050100150200250Count−160−80080160Depth residual (km)(c)−160−80080160Depth residual (km)(d)Figure 3.11 Event distributions across 3 distinct catalogs. (Left column) Our final catalog in this
study. (Middle column) Reference catalog by Wei et al. [2017]. (Right column) ISC-EHB
catalogue [International-Seismological-Centre, 2023]. In each column, the top panel shows the
map view, similar to Figure 3.7, and the following panels show 4 cross-sections, similar to Figure
3.8. Beachballs show the focal mechanisms of 4 notable large deep earthquakes (1994/3/9 Mw
7.6, 2009/11/9 Mw 7.3, 2018/8/19 Mw 8.2, and 2018/9/6 Mw 7.9) from the Global CMT catalog
[Dziewonski et al., 1981, Ekström et al., 2012].

78

(11/2009-12/2010)Final catalogue of this study180°175°W25°S20°S15°STonga TrenchLau BasinFijiAA’BB’CC’DD’09/11/200906/09/201809/03/199419/08/2018(a)(11/2009-12/2010)Manually picked reference catalogue180°175°WTonga TrenchLau BasinFijiAA’BB’CC’DD’09/11/200906/09/201809/03/199419/08/2018(b)(1964-2020)ISC-EHB catalogure180°175°WTonga TrenchLau BasinFijiAA’BB’CC’DD’09/11/200906/09/201809/03/199419/08/2018(c)0200400600Slab2Slab 1.009/03/199419/08/2018’AA(d)0200400600Slab2Slab 1.009/11/200906/09/2018’BB(g)0200400600Slab2Slab 1.0’CC(j)020040060005001000Distance (km)Slab2Slab 1.0’DD(m)Slab2Slab 1.009/03/199419/08/2018’AA(e)Slab2Slab 1.009/11/200906/09/2018’BB(h)Slab2Slab 1.0’CC(k)05001000Distance (km)Slab2Slab 1.0’DD(n)Slab2Slab 1.009/03/199419/08/2018’AA(f)Slab2Slab 1.009/11/200906/09/2018’BB(i)Slab2Slab 1.0’CC(l)05001000Distance (km)Slab2Slab 1.0’DD(o)0200400600Depth (km)DSZFossil slabSlab tear3.9 Tables

Table 3.1 Numbers of picks and events in the manually picked reference catalogue and output
catalogues. Recalls are evaluated against the reference catalogue. The picks with arrival-time
residuals < 1 s compared to the reference catalogue are counted as true positive picks, whereas the
events with origin-time residuals < 15 s are considered true positive events. The output catalogue
of Iteration #3 is considered the final catalogue. The low recall values of S-wave arrivals result
from the filtering processes of phase association and relocation, as many S-wave arrivals are
discarded.

Reference

Iteration #1 output

Iteration #2 output

Iteration #3 output (Final)

P-wave arrival

S-wave arrival

Event

Recall Number Recall Number Recall Number

N/A

0.94

0.94

0.89

42,256

217,254

343,247

372,774

N/A

0.83

0.80

0.75

14,852

63,590

79,593

78,853

N/A

0.96

0.96

0.91

1,163

9,427

13,799

13,406

Table 3.2 Evaluation metrics of phase pickers on the test dataset for different architectures and
models. Picks with arrival-time residuals < 1 s compared to the manually picked reference
catalogue are counted as true positive picks. The metrics in the table are evaluated on the test
dataset partitioned from the labelled dataset, which is the manually picked reference catalogue for
PhaseNet-TF Iteration #1 and two PhaseNet models, and the output catalogue from the previous
iteration for PhaseNet-TF Iterations #2 and #3.

PhaseNet-TF (Iteration #1)

PhaseNet-TF (Iteration #2)

PhaseNet-TF (Iteration #3)

PhaseNet retrained by the Tonga dataset

PhaseNet with original model weights
(trained with Northern California data)

P-wave arrival

S-wave arrival

Precision Recall

F1

Precision Recall

F1

0.99

0.99

0.99

0.97

0.99

0.99

0.98

0.97

0.99

0.99

0.99

0.97

0.97

0.96

0.96

0.85

0.99

0.99

0.99

0.94

0.98

0.98

0.98

0.90

0.89

0.66

0.76

0.48

0.28

0.36

79

Table 3.3 Evaluation of association and relocation filtering on the manually picked reference
catalogue (Iteration #1). Events with origin time residuals < 15 s are counted as true positive
events, and phase arrival-time residuals < 1 s are counted as true positive picks.

Association by GaMMA-1D

Relocation using GaMMA-1D
output and filtering
(epicentre uncertainty < 0.1°,
depth uncertainty < 10 km,
and origin-time uncertainty < 1 s)

Association by GaMMA

P-wave arrival

S-wave arrival

Event

Recall

0.97

Recall

0.92

Precision Recall

F1

0.97

0.98

0.97

0.95

0.97

0.86

0.91

0.98

0.76

0.95

0.96

0.96

0.85

80

CHAPTER 4

DETECTING CONVERTED SEISMIC PHASES OF TONGA DEEP EARTHQUAKES:
IN-DEPTH INSIGHTS FROM DEEP-LEARNING METHODS

An edited version of this chapter is in preparation for submission to Geophysics Journal

International.

4.1 Abstract

Converted seismic phases have been instrumental in advancing structural seismology. Recently,

the collection of seismic data in the Tonga subduction zone offers a good opportunity to explore the

Tonga slab. Among this data are numerous P-to-S (PS) waves from deep earthquakes converted at

the slab top interface, which are critical to understanding the geometry and internal structure of the

subducted slab. However, identifying PS phases in a large seismic dataset is challenging, primarily

due to the inconsistent distribution of energy in the frequency domain, varying picking criteria by

human analysts, and non-unique interpretations of the signals. To address these challenges, we

adopt a deep-learning-based package, PhaseNet-TF, specifically to detect PS-wave arrivals in a

dataset collected from an amphibious seismic array in the Tonga-Fiji area. We further use filters

of locations, travel-time, and energy and beamforming analyses to quality control the predicted

arrivals. With a semi-supervised learning strategy, we iteratively refine the arrival labels and retrain

PhaseNet-TF. This approach has significantly increased the number of PS arrival measurements,

which will be used to image the Tonga slab in detail. Our study demonstrates that deep learning

offers a promising avenue for improving the detection of converted phases in structural seismology.

4.2

Introduction

Detecting converted seismic waves is essential for understanding the complexities of slab struc-

tures and the dynamics of subduction processes. The analysis of P-slab-S and S-slab-P waves is

particularly crucial, as these waves provide insights distinct from those offered by direct P and S

waves, which primarily map the crust and mantle’s structure. Converted waves are sensitive to

variations in seismic wave speed within the slab and at its interface, offering critical information

for subduction studies. Research in northeastern Japan, the eastern Aleutian, and the Nazca sub-

81

duction zone has emphasized the role of PS waves in providing detailed information on the slab’s

upper interface and identifying low-velocity zones within the slab [Matsuzawa et al., 1986, 1990,

Zhao et al., 1997a, Shiina et al., 2013, 2014, 2017, Helffrich and Abers, 1997, Bock et al., 2000].

Converted SP waves have been studied predominantly in northeastern Japan, influenced by the ori-

gin of these waves in the lower plane of the double seismic zone [Ohmi and Hori, 2000]. Very few

such data are available from other subduction zones.

Recent advancements in applying deep learning techniques for seismic wave detection have sig-

nificantly enhanced the ability to identify seismic events without compromising accuracy. The evo-

lution from early applications of simple neural networks for phase detection [Gentili and Michelini,

2006, Ross et al., 2018] to the adoption of U-Net [Ronneberger et al., 2015]-inspired deep neural

networks like PhaseNet [Zhu and Beroza, 2019] and transformer [Vaswani et al., 2017]-inspired

architectures such as EQTransformer [Mousavi et al., 2020] demonstrates deep learning’s reliabil-

ity and efficiency in seismic wave detection. The development of PhaseNet-TF [Xi et al., 2023],

which integrates time-frequency domain information and semantic image segmentation techniques

to identify noisy seismic signals from ocean bottom seismometers, has motivated a focus on de-

tecting weak signals alongside the traditionally more emphasized strong seismic signals.

However, the detection of converted seismic phases with current deep learning methods faces

challenges, including limited labeled datasets and the presence of weak and noisy signals. One

significant challenge is that PS waves are not confined to fixed frequency bands; their behavior

is greatly influenced by source parameters and wave propagation effects, such as attenuation and

scattering. As a result, PS wave signals often mix with noise, complicating their identification for

phase picking. The concept of applying time-frequency domain analysis, akin to methods used in

models like PhaseNet-TF, emerges as a promising approach for detecting PS waves. Additionally,

leveraging array analysis techniques, such as beam-forming [Gong and McGuire, 2021], could also

help refine phase-picking results derived from deep learning models.

The Tonga subduction zone offers a unique opportunity to study converted seismic waves. Being

the coldest and fastest subducting slab on Earth, it serves as an exceptional case for examining the

82

mechanisms behind intermediate-depth earthquakes. These earthquakes are widely believed to be

associated with slab dehydration, although the precise mechanisms remain a subject of debate.

Given the cold and rapid subduction of the Tonga slab, it’s highly plausible that hydrous minerals

carried by the subduction could reach greater depths, potentially extending into the mantle transition

zone (MTZ). This scenario could significantly revise our understanding of Earth’s deep water cycle

and our estimates of the global water budget. Previously, the Tonga slab was not extensively imaged

due to limited data availability. The Lau Basin Ocean Bottom Seismometer (OBS) Experiment,

which operated from 2009 to 2010, has provided valuable data, complemented by years of manual

phase picking by human experts and a recent catalog developed using deep learning methods [Xi

et al., 2023]. However, the availability of converted phases through manual picking has been notably

lacking.

Detected converted seismic waves in the Tonga subduction zone can accurately pinpoint the slab

interface location and provide direct evidence of the low-velocity zone (LVZ) which can support

the presence of a hydrous slab crust and uppermost mantle. Since the lower plane of the Tonga

double seismic zone is less active compared to the upper plane, which experiences compressional

stress across the slab [Isacks and Molnar, 1971], PS waves are particularly relevant for studies in

this area. Previous estimates of the slab’s upper interface in the Tonga subduction zone, such as

Slab1.0 [Hayes et al., 2012] and Slab2 [Hayes et al., 2018], were primarily derived from seismicity

data. Since intermediate-depth earthquakes typically occur within the slab crust, these estimates

often suggest a deeper interface than its actual depth. Inverting PS waves could more precisely

constrain converting points on the slab interface, with potential errors less than 1 km [Matsuzawa

et al., 1986, Zhao et al., 1997a]. Previous tomographic imaging of the Tonga slab using direct waves

[van der Hilst, 1995, Zhao et al., 1997b, Conder and Wiens, 2006, Wei et al., 2015, 2016, Adam

et al., 2016] lacked resolution on the slab crust. Incorporating PS waves promises improvements in

imaging this structure.

To accurately and efficiently detect PS waves in the Tonga subduction zone, new methods must

be developed. First, the phase picker should have the capability to identify weak PS wave signals

83

with high precision and recall, even in the presence of noise commonly found in OBS data. Sec-

ond, implementing a robust quality control process is essential to minimize the risk of false phase

detection. This process should ideally incorporate techniques from seismic array analysis, such as

beamforming, to enhance detection accuracy. Lastly, considering the scarcity of labeled data, it

is imperative to employ a semi-supervised learning approach that leverages both labeled and unla-

beled datasets. This approach has already shown efficiency in detecting direct phases in the Tonga

subduction zone [Xi et al., 2023] and in picking seismic arrival times on distributed acoustic sensing

(DAS) data [Zhu et al., 2023a].

4.3 Data

In this study, we analyzed seismic data recorded by an amphibious array in Tonga and Fiji from

November 2009 to December 2010 (Figure 4.1b). This array included 49 OBSs (network ID YL),

17 island-based stations (network ID Z1), and one Global Seismograph Network (GSN) station

MSVF (network ID II). 1,163 local earthquakes with 42,256 P-wave and 14,852 S-wave arrivals

were manually picked by Wei et al. [2017]. In addition, 5,127 PS-wave arrivals, usually 1̃0 s after

the corresponding P-wave arrivals, were manually picked in the time-frequency domain. These

manually picked events and arrivals are hereinafter called the reference dataset. In our previous

study [Xi et al., 2023], we developed a semi-supervised deep-learning workflow to detect and locate

13,406 earthquakes with 372,774 P-wave and 78,853 S-wave arrivals from this dataset. In this study,

we further extend the deep-learning workflow to include the manually picked PS-wave arrivals and

focus on detecting PS waves from events that were previously located.

Similar to Xi et al. [2023] we created the training dataset by windowing three-component wave-

forms 5 minutes before and 5 minutes after each P-wave arrival. We also subsequently removed the

instrumental response to get displacement waveforms of the three components and further resam-

pled the data to 40 HZ with no additional processing. We also created a 1-year continuous dataset

by segmenting three-component waveforms into 10-day segments and subsequently removing the

instrumental response. All 3 channels were included, and any missing data from November 2009

to December 2010 were filled with zeros to ensure the continuity of the dataset.

84

4.4 Methods

4.4.1 Phase picking by PhaseNet-TF

In this study, we use PhaseNet-TF, a deep-learning-based phase picker developed by Xi et al.

[2023], to detect PS waves in the time-frequency domain. PhaseNet-TF is modeled after the ar-

chitecture of DeepLabv3+ [Chen et al., 2018], designed to handle data within the time-frequency

domain efficiently. The performance of PhaseNet-TF is comparable to that of PhaseNet [Zhu and

Beroza, 2019] for detecting P waves and showing superior efficacy in identifying S waves with

OBS data [Xi et al., 2023]. Originally, PhaseNet-TF processed three-component waveform inputs

to predict the probability of phase arrivals for P and S waves, along with noise probability. Here

we introduce an additional output channel dedicated to the probability of PS wave arrivals, while

the rest of the model architecture remains unchanged. This modification allows PhaseNet-TF to

maintain its accuracy in predicting P and S wave arrivals, as established in the previous study while

adding the capability of accurately predicting PS wave arrivals.

In refining PhaseNet-TF, we retain the original model’s loss function, optimizer, and learning

rate scheduler but modify the weighting strategy for loss functions to account for PS waves. Given

the scarcity of labeled PS wave arrivals compared to P- and S-wave arrivals, addressing the im-

balance in the dataset is crucial. We implement a strategy of differential weighting for our loss

functions, specifically using the Kullback-Leibler (KL) divergence for distinct categories:

∑

∑

𝑖 𝑊𝑖

𝐿 =

𝑥 𝑦𝑡𝑟𝑢𝑒,𝑖 (𝑥) ∗ log
∑

𝑦𝑡𝑟𝑢𝑒,𝑖 (𝑥)
𝑦 𝑝𝑟 𝑒𝑑𝑖𝑐𝑡 ,𝑖 (𝑥)

(4.1)

𝑖 𝑊𝑖

where 𝑥 denotes the summation performed across all points, and 𝑖 represents one of the four cat-

egories utilized in constructing the loss function: P-wave arrival probability, S-wave arrival prob-

ability, PS-wave arrival probability, and noise probability. The sum of these probabilities at any

given timestamp equals one. The weights for these categories 𝑊𝑖 are set as 1:1:3:3 for noise, P, S,

and PS probabilities, respectively. 𝑦 𝑝𝑟𝑒𝑑𝑖𝑐𝑡,𝑖 (𝑥) is the model prediction for the 𝑖-th category, and

and 𝑦𝑡𝑟𝑢𝑒,𝑖 (𝑥) corresponds to the reference.

PhaseNet-TF is trained by the manually picked reference dataset with 42,256 P-wave, 14,852

85

S-wave, and 5,127 PS-wave arrivals. This trained PhaseNet-TF model is then applied to the continu-

ous dataset to predict PS waves. Following Xi et al. [2023], we partition the continuous waveforms

into 120-second segments with a 60-second overlap between segments. The output segments of

PS-wave probabilities are combined into a single continuous time series by selecting the maximum

value from overlapping predictions. Since PS signals are typically weaker than P and S signals, peak

probabilities exceeding 0.3 are considered positive picks for PS waves, lower than the threshold of

0.5 for P and S waves. We call these PS-wave predications X arrivals before confirming they are the

PS waves converted at the slab top interface. We leverage the pre-established earthquake catalog

from Xi et al. [2023] and associate each X arrival with a P-wave arrival immediately preceding it,

and thus with the corresponding earthquake in the earthquake catalog.

These X arrivals inevitably include false PS arrivals because (1) the probability threshold of 0.3

is low and (2) the DeepLabv3+ architecture is mainly focused on signal processing and does not

carry much seismological knowledge. Consequently, the predictions may represent some seismic

signals other than the PS waves converted at the slab-top interface. As described in the follow-

ing subsections 4.4.2 and 4.4.3, we apply several quality control processes based on seismological

knowledge to filter X arrivals and only keep P-slab-S arrivals. These processes will eliminate most

false positive PS detections but cannot address false negative predictions. Since the scientific mo-

tivation of this study is to obtain reliable PS arrivals for seismic imaging, this approach will lead to

a robust, although not exhaustive, dataset of PS waves.

4.4.2 Quality control (QC): source depth, station location, travel time, and energy

We first examine the properties of individual X arrivals. According to Fig. 4.1a, the PS wave

source should be intra-slab intermediately-depth or deep-focus earthquakes. Intra-slab earthquakes

shallower than 50 km may also generate P-slab-S waves converted at the megathrust. However,

it is difficult to distinguish P-slab-S waves from P-moho-S waves converted at the overriding plate

Moho from these shallow intra-slab earthquakes. In addition, these P-slab-S arrivals, if successfully

picked, can help constrain the megathrust at <50 km depth, which is usually well constrained by

active-source or surface-wave studies, thus are less critical than the P-slab-S arrivals from deep

86

earthquakes. Therefore, we discard all X predictions from earthquakes shallower than 50 km. Given

the ray geometry, we also require the station to be between the earthquake and the Tonga Trench,

discarding all X predictions recorded at Fijian stations.

Travel-time is an effective measurement to filter X arrivals [Matsuzawa et al., 1986]. Given

the earthquake catalog from Xi et al. [2023], we first require each X arrival to be between the P-

and S-wave arrivals of the same event. If an event has no S-wave arrival detected, we estimate the

S-wave arrival based on a linear regression of all S-P differential travel-times versus hypocentral

distances (Fig. 4.3). X predictions that miss this requirement are discarded. Second, we exclude

any predicted X arrivals occurring within 5 seconds after the corresponding P-wave arrivals. This

step helps us eliminate P-moho-S waves converted at the overriding plate Moho (Fig. 4.1a) because

the crustal thickness in this region is only about 6–36 km from the Lau back-arc basin to the Tonga

fore-arc [Wei et al., 2016, Chen et al., 2019a]. This step can also eliminate S-slab-P waves converted

at the slab top interface, which usually arrive less than 3 seconds after the corresponding P waves.

In addition, we analyze the energy of X arrivals on 3 components to exclude any signals that

arrive as P waves. Given the shear wave nature of PS waves, the PS-wave arrivals should have the

strongest energy in the radial direction. Although the PS-wave energy may leak to the vertical com-

ponent due to the varying incident angle and complex site effects, we still expect higher energy on

horizontal components. Therefore, we discard X arrivals with the energy on the vertical component

higher than that on the horizontal components.

4.4.3 Quality control (QC): Source-side beamforming analysis

While straightforward quality control measures based on location, travel-time, and energy in-

formation successfully eliminate a significant number of falsely detected PS arrivals, they fall short

in filtering out phases that share similar arrival times and receiver-side polarity with the PS wave,

such as the SPS wave, along with other falsely detected signals not corresponding to any meaningful

phases. In addressing this limitation, we use a source-side beam-forming method to investigate the

nature of X arrivals near earthquakes. Specifically, we estimate the wave speed, azimuthal angle,

and take-off angle of X arrivals derived from source-side beamforming analyses and compare them

87

with the theoretical values of PS waves.

In contrast to previous 3D source-side beamforming studies [Spudich and Bostwick, 1987,

Nakata and Shelly, 2018, Gong and McGuire, 2021], we do not stack waveforms for several rea-

sons: (1) Unlike station-side beamforming, the waveforms in source-side beamforming are heavily

influenced by the event source mechanism, and directly stacking waveforms may not effectively

amplify the signal. (2) The X signals are noisy in the time domain, even if we filter the waveforms

within the optimal frequency band. Instead, we stack delta functions that represent the X arrivals

because only the travel-time information is needed for our purpose.

We follow Gong and McGuire [2021] to perform the source-side beamforming analysis. In ap-

plying this method to real data, waveform segments are aligned with the direct P wave, respectively.

The synthetic delta function-shaped pulse at the phase arrival time predicted by the PhaseNet-TF

model is then applied to represent the PS wave arrivals. The slowness vector, expressed in spher-

ical coordinates, depends on the takeoff angle, azimuth angle, and near-source velocity, and can

be obtained through a grid search to maximize the PS wave signal stacking energy. For a single

experiment of the source-side beamforming analysis, we have a group of X arrivals as the input

and the slowness vector as the output. Meanwhile, the deduced slowness vector should match the

tomographic data, the earthquake-station geometry, and ray tracing results, from where we have

the theoretical or expected slowness vector. Our goal is to discard those unreliable X arrivals and

obtain trustable PS arrivals as the output of this stage.

The synthetic delta function-shaped pulse is defined as:

𝐷 (𝑡) = 𝑒

)2
− (𝑡 −𝑡
0
2( 𝑤
)2
6

(4.2)

where 𝐷 (𝑡) represents the displacement at time 𝑡, 𝑡0 denotes the PS arrival time, and 𝑤 is a parameter

controlling the pulse’s width, set to 1 second in our study.

We use a Markov chain Monte Carlo (MCMC) approach to evaluate the trustworthiness of

individual X arrivals based on the collective assessment from the source-side beamforming analysis.

The trustworthiness of the 𝑖-th X arrival is considered a random variable 𝑅𝑖 with its expectation

𝐸 (𝑅𝑖) = 𝑡𝑖 which serves as a measure to determine the retention or discard of this X arrival. If

88

we denote 𝐷 𝑗 as the beamforming results of the 𝑗-th subset, the Bayesian framework allows us to

express the probability of the trustworthiness of all X arrivals as

𝑃(𝑅1, 𝑅2, ..., 𝑅𝑚 |𝐷1, 𝐷2, ..., 𝐷𝑛) =

𝑃(𝐷1, 𝐷2, ..., 𝐷𝑛|𝑅1, 𝑅2, ..., 𝑅𝑚)𝑃(𝑅1, 𝑅2, ..., 𝑅𝑚)
𝑃(𝐷1, 𝐷2, ..., 𝐷𝑛)

(4.3)

To obtain 𝑃(𝑅1, 𝑅2, ..., 𝑅𝑚 |𝐷1, 𝐷2, ..., 𝐷𝑛), we aim to derive the best estimation of the distribu-

tion for 𝑅1, 𝑅2, ..., 𝑅𝑚 based on the current information provided by 𝐷1, 𝐷2, ..., 𝐷𝑛, known as the

posterior distribution. Here, 𝑚 denotes the total number of X arrivals, while 𝑛 indicates the total

number of beamforming experiments conducted. 𝑃(𝐷1, 𝐷2, ..., 𝐷𝑛|𝑅1, 𝑅2, ..., 𝑅𝑚) represents the

likelihood. The prior distribution 𝑃(𝑅1, 𝑅2, ..., 𝑅𝑚) reflects our initial assumptions about the dis-

tribution of 𝑅1, 𝑅2, ..., 𝑅𝑚 before beamforming. We adopt a set of independent and identically dis-

tributed (iid) random variables, modeled by a Beta distribution characterized by parameters 𝛼 = 5

and 𝛽 = 1. This prior distribution aims to balance our initial belief about the trustworthiness of X

arrivals with the adjustability based on the beamforming outcomes.

Acknowledging the conditional independence of 𝐷1, 𝐷2, ..., 𝐷𝑛 given 𝑅1, 𝑅2, ..., 𝑅𝑚, the like-

lihood function can be represented as a product of individual likelihoods for each beamforming

result:

𝑃(𝐷1, 𝐷2, ..., 𝐷𝑛|𝑅1, 𝑅2, ..., 𝑅𝑚) =

∏

𝑗 ≤𝑛

𝑃(𝐷 𝑗 |𝑅1, 𝑅2, ..., 𝑅𝑚)

(4.4)

For each beamforming result 𝐷 𝑗 , its likelihood given the trustworthiness of X arrivals is

𝑃(𝐷 𝑗 |𝑅1, 𝑅2, . . . , 𝑅𝑚) = 𝑃(𝐷 𝑗 |all 𝑅𝑖, 𝑖 ∈ 𝐼 𝑗 )

∑

=

1
Length(𝐼 𝑗 )

𝑅𝑖 𝐿 (𝑉𝑗 ; 𝑉𝑡ℎ𝑒𝑜𝑟𝑖𝑡𝑖𝑐𝑎𝑙, 𝜎𝑉,𝑔𝑜𝑜𝑑)𝐿 (𝜃 𝑗 ; 𝜃𝑡ℎ𝑒𝑜𝑟𝑖𝑡𝑖𝑐𝑎𝑙, 𝜎𝜃,𝑔𝑜𝑜𝑑)

𝑖∈𝐼 𝑗

+ (1 − 𝑅𝑖)𝐿 (𝑉𝑗 ; 𝑉𝑡ℎ𝑒𝑜𝑟𝑖𝑡𝑖𝑐𝑎𝑙, 𝜎𝑉,𝑏𝑎𝑑)𝐿(𝜃 𝑗 ; 𝜃𝑡ℎ𝑒𝑜𝑟𝑖𝑡𝑖𝑐𝑎𝑙, 𝜎𝜃,𝑏𝑎𝑑)

(4.5)

where 𝐼 𝑗 denotes the set of all X arrivals utilized in obtaining 𝐷 𝑗 . The likelihood functions

𝐿(𝑉𝑗 ; 𝑉𝑡ℎ𝑒𝑜𝑟𝑒𝑡𝑖𝑐𝑎𝑙, 𝜎𝑉,𝑔𝑜𝑜𝑑) and 𝐿 (𝑉𝑗 ; 𝑉𝑡ℎ𝑒𝑜𝑟𝑒𝑡𝑖𝑐𝑎𝑙, 𝜎𝑉,𝑏𝑎𝑑) reflect the probability of observing the

beamforming wave speed 𝑉𝑗 under the assumptions of ”good” and ”bad” PS wave pickings, respec-

tively, modeled by normal distributions with theoretical wave speed 𝑉𝑡ℎ𝑒𝑜𝑟𝑒𝑡𝑖𝑐𝑎𝑙 from the IASP91

89

Earth model [Kennett and Engdahl, 1991] and standard deviations 𝜎𝑉,𝑔𝑜𝑜𝑑 = 3 km/s and 𝜎𝑉,𝑏𝑎𝑑 = 5

km/s to account for uncertainties in beamforming. Similarly, 𝐿 (𝜃 𝑗 ; 𝜃𝑡ℎ𝑒𝑜𝑟𝑒𝑡𝑖𝑐𝑎𝑙, 𝜎𝜃,𝑔𝑜𝑜𝑑) and

𝐿 (𝜃 𝑗 ; 𝜃𝑡ℎ𝑒𝑜𝑟𝑒𝑡𝑖𝑐𝑎𝑙, 𝜎𝜃,𝑏𝑎𝑑) assess take-off angles with 𝜎𝜃,𝑔𝑜𝑜𝑑 = 15 degrees for ”good” picks and

𝜎𝜃,𝑏𝑎𝑑 = 45 degrees for ”bad” picks. The azimuthal angle is not considered here for computational

efficiency, as its observed values consistently align well with theoretical predictions through our

experiments.

We use the No-U-Turn sampling algorithm [Hoffman and Gelman, 2014] to estimate the poste-

rior distribution 𝑃(𝑅1, 𝑅2, ..., 𝑅𝑚 |𝐷1, 𝐷2, ..., 𝐷𝑛). The marginal posterior distribution 𝑃(𝑅𝑖 |𝐷1, 𝐷2, ..., 𝐷𝑛)

for any 𝑖 ≤ 𝑚 can be directly computed through the MCMC sampling. This distribution effectively

represents the trustworthiness of any given X arrival post-analysis. By integrating this distribution,

we can calculate the expectation value 𝑡𝑖 for the trustworthiness of the 𝑖-th X arrival:
∫

1

𝑡𝑖 = 𝐸 (𝑅𝑖) =

0

𝑃(𝑅𝑖 (𝑥)|𝐷1, 𝐷2, ..., 𝐷𝑛)𝑑𝑥

(4.6)

This expectation 𝑡𝑖 serves as a quantitative measure of trustworthiness for each PS arrival, al-

lowing us to make informed decisions about which arrivals to retain for further analysis based on

their calculated trustworthiness levels. We discard all X arrivals with 𝑡𝑖 < 0.5.

The accuracy of estimating 𝑡𝑖, the trustworthiness of X arrivals, relies on the results of the

beam-forming analysis, 𝐷 𝑗 . To obtain these crucial data points, we conduct extensive source-side

beam-forming analyses on subsets of X arrivals. Given the wide distribution of earthquakes in our

study, we divide the X arrivals into subsets for beamforming. We first construct a grid with step

lengths of 0.25◦ in longitude and latitude, and 25 km in depth, organizing the study region into

boxes that are 1◦ wide in longitude, 1◦ wide in latitude, and 100 km wide in depth. Each box is

designed to contain at least 20 X arrivals from the same station. This approach excludes X arrivals

from events that are far from the others. Within each box, we subsample the events to generate

more beamforming observations. We repeat this process 1000 times with a subsampling rate of

70%. In total, we obtain more than 1 million beamforming results for uniquely different subsets of

X arrivals.

90

4.4.4 Semi-supervised-learning-based workflow

Integrating Semi-Supervised Learning into our workflow significantly enhances the detection

capabilities of PhaseNet-TF, employing a strategy that iteratively refines labeled PS arrival picks

and retrains the model. This semi-supervised learning approach can effectively leverage a blend of

limited labeled phases with an extensive pool of unlabeled data. As illustrated in the diagram of

Figure 4.6, in the initial iteration, given the scarcity of manually picked PS arrivals, we bypass the

quality control steps related to timing and energy, operating under the assumption that most manu-

ally picked phase arrivals are reliable. This iteration of the model is then applied to the continuous

dataset, which is constrained by the catalog from a prior study [Xi et al., 2023], to generate new

phase picks, including potential false detections.

The introduction of subsequent steps, such as quality control through energy and timing anal-

ysis, along with PS wave association via beamforming and the MCMC method, assists in filtering

out less reliable picks. These refined PS arrivals from the first iteration serve as labels for train-

ing PhaseNet-TF in the second iteration. With each iteration, as the training dataset is augmented

with more accurate picks, the model’s proficiency in detecting PS arrivals is expected to improve.

This iterative process is repeated until the addition of significant new PS arrivals ceases, ensuring

a progressively refined and accurate detection mechanism.

Both deep learning and beamforming require a large amount of PS arrivals. However, the manu-

ally picked reference catalog has only 5,127 PS-wave arrivals from 1,163 local earthquakes. Similar

to Xi et al. [2023], we use a semi-supervised learning strategy to iteratively increase the number

of PS arrivals, refine them with QC filters, and retrain PhaseNet-TF (Figure 4.6). This approach

can effectively leverage a blend of limited labeled phases with an extensive pool of unlabelled data.

Before Iteration #1, we only apply the QC filters of source location, station location, travel-time,

and energy to clean the reference dataset (Section 4.4.2), and do not use the QC filter of beamform-

ing analysis (Section 4.4.3), given the scarcity of manually picked arrivals. In Iteration #1, we use

the cleaned reference dataset to train PhaseNet-TF and apply the trained model to the continuous

dataset to predict new X arrivals. The predictions are subsequently filtered by all QC measures. In

91

the following iterations, we use the cleaned X picks from the previous iteration to retrain PhaseNet-

TF and predict more X arrivals, which are cleaned by QC filters explained in Sections 4.4.2 and

4.4.3. This iterative process is repeated until the addition of significant new PS arrivals ceases,

ensuring a progressively refined and accurate detection mechanism.

4.5 Results

4.5.1 PhaseNet-TF Model assessment

The performance of PhaseNet-TF in detecting PS arrivals is rigorously evaluated using pre-

cision, recall, and F1 score metrics on a test dataset, which is randomly selected as 5% of the

reference catalog for the initial iteration and 5% of the refined catalog for subsequent iterations.

Table 4.1 summarizes these results, showing that PhaseNet-TF consistently achieves high accuracy

in predicting P and S arrivals, with metrics above 95% across all iterations, and maintains over 95%

precision and recall in detecting PS arrivals. Figure 4.7 also shows an example of newly detected

PS arrival at the second supervised learning stage. This level of precision is comparable to manual

picks by human analysts, indicating the model’s robustness.

However, an essential aspect of the semi-supervised learning process involves the exclusion of

some PS arrivals from the second iteration onwards. This is not due to inaccuracies in their detection

or direct dismissal by the MCMC process but originates from the requirement for a minimum of

20 PS arrivals to form a box for beamforming analysis, which is a prerequisite not always met.

Such exclusions, while potentially omitting valid PS arrivals, represent a necessary compromise to

adhere to the structural requirements of our workflow. This is also suggested in Figure 4.8, where

the prediction and the reference residuals of the PS wave keep almost the same at different stages

of the semi-supervised framework, as well as a low standard deviation, but with a smaller number

of PS arrivals preserved in catalogs of stages 2 and beyond.

4.5.2

catalog refined with beamforming

In refining our PS arrival catalog, we began by training the PhaseNet-TF model on 5,127 man-

ually selected PS arrivals from an existing reference catalog, which enabled the model to predict

approximately 46,210 PS arrivals from a dataset spanning an entire year. Utilizing constraints from

92

the previous study (Xi et al. [2023]) as described in the data section, we focused the initial large set

of predictions down to 18,290 PS arrivals. As shown in 4.3(b), through subsequent quality control

measures that emphasized precise timing and energy distribution analysis, this number was further

refined to 15,939 PS arrivals. Only around 12,000 PS arrivals can perform beam-forming through

the boxing constraint. The further refinement process incorporated beam-forming analysis and

MCMC simulations as critical components, enabling a detailed evaluation of each PS arrival pre-

diction. Figure 4.4 shows the PS arrivals are clearly clustered into two groups, i.e., trustable or not,

through this approach. This stage was instrumental in selecting approximately 8,979 PS arrivals,

enriching the catalog with both previously recognized arrivals and 6,200 new detections. Building

on this enhanced dataset, the PhaseNet-TF model underwent additional training iterations. With the

detection of new PS arrivals plateauing after the third iteration, we concluded the semi-supervised

learning process. Figure 4.9 compares the station-counted PS arrivals between the reference and

the subsequent stages’ catalog, which showcases this improvement.

In total, for the reference catalogue, it takes around 5 hours and 400 CPU cores on MSU ICER

for the beam-forming calculation, and around 1 hour and 4 Nvidia V100S GPUs for the 4-chains

MCMC simulation. For the PS arrivals predicted by the PhaseNet-TF model (first semi-supervised

iteration, see the next section), it takes around 36 hours and 400 CPUS for the beam-forming calcu-

lation, and around 4 hours and 4 Nvidia V100S GPUs for the 4 chains MCMC simulation. Figure

4.5 shows a comparison before and after applying the abovementioned method to the PS arrivals

from the reference catalog, which shows its effectiveness in detecting the potentially falsely detected

PS arrivals.

This methodical and iterative approach to catalog refinement not only expanded the number

of detected PS arrivals but also significantly improved the accuracy and reliability of the catalog.

With ongoing developments in Ocean Bottom Seismometer (OBS) deployments, such as the con-

tributions from the SaLOON Project, our refined methodology is set to further advance PS arrival

detection in future seismic research, ensuring a more comprehensive and precise understanding of

seismic activities.

93

4.6 Discussion and Conclusions

In our study, we have developed a comprehensive workflow for the detection and analysis of

converted seismic waves within the Tonga subduction zone. This workflow represents a signifi-

cant advancement in seismic research, particularly in the context of improving the accuracy and

reliability of PS wave arrival detection.

To be specific, we find:

• Our study introduces a robust workflow that combines the enhanced PhaseNet-TF model with

a semi-supervised learning approach. This integration is crucial for addressing the complex

challenge of detecting weak and noisy signals in seismic data, thereby enhancing the precision

of phase picking.

• A novel aspect of our workflow is the incorporation of Markov chain Monte Carlo simulations

and source-side beamforming as quality control measures. These techniques allow us to

refine our detection process significantly, ensuring that our analysis accurately distinguishes

between true seismic events and background noise.

• By applying our workflow, we have substantially extended the existing catalog of PS arrivals

in the Tonga subduction zone. This expanded catalog is not only a valuable resource for ongo-

ing research but also holds great potential for future studies focused on seismic tomography

and the dynamics of earthquakes within this subduction zone.

94

4.7 Figures

Figure 4.1 (a) Diagrams showing direct P waves and PS waves from an earthquake near the slab
surface. The figures illustrate the paths of seismic waves as they travel through the Earth’s layers.
(b) Map showing the Tonga subduction zone and surrounding areas. Triangles represent
land-based seismic stations, inverted triangles denote ocean-bottom seismographs, and the square
marks a GSN (Global Seismograph Network) station. These stations were operational from
November 2009 to December 2010. The color coding of the stations indicates the number of PS
waves recorded at each station for the reference catalog. The distribution of earthquakes from the
manually picked reference catalog is depicted with dots, color-coded according to their depth,
showcasing the spatial and depth variation of seismic activity in the region. Land regions are
marked in grey. The bathymetry, with 1 km contours, highlights geographical features such as the
Tonga Ridge, Lau Ridge, and Fiji Plateau, while deeper contours at 7, 8, 9, and 10 km outline the
Tonga Trench. The black arrow shows the direction of the Pacific Plate’s movement relative to the
Tonga area. An inset map provides a global perspective of the study area.

95

PSDirect POverriding platePacific PlateStationEarthquakeSlab surfaceSlab Mohobac176°E178°E180°178°W176°W174°W172°W26°S24°S22°S20°S18°S16°S14°STonga TrenchLau BasinFiji0200400600Manually picked earthquake depth (km)YLZ1II200400600Number of manually picked PS arrivalsFigure 4.2 PhaseNet-TF model predictions on selected examples from the test dataset, derived
from the reference dataset. Panels (a) and (b) illustrate recordings of deep earthquakes at
land-based fore-arc stations, while panels (c) and (d) present recordings from back-arc stations
equipped with Ocean Bottom Seismometers (OBS). Each panel is labeled with the earthquake’s
origin time, hypocenter, and the respective station name. The initial three sections in each panel
depict waveform data across three components (vertical, north-south, and east-west) in the time
domain. These are followed by three corresponding sections showing spectrograms, which
represent the power values of each waveform component. The final section in each panel
visualizes the probabilities of detected P, S, and PS wave arrivals as predicted by the model.
96
Manually identified arrivals for P, S, and PS waves are marked with dashed lines for comparison.

−0.80.00.8AmplitudeE(a)4_54845(2010 09 16 11:37:39.51 18.00S,177.87W 573.5km) TNGA−0.80.00.8AmplitudeN−0.80.00.8AmplitudeZ048FrequencyE048FrequencyN048FrequencyZ0.00.40.8ProbabilityPredicted PPredicted SPredicted PSManually picked PManually picked SManually picked PSE(b)4_54746(2010 08 16 22:09:48.45 22.47S,178.12W 366.5km) NMKANZENZPredicted PPredicted SPredicted PSManually picked PManually picked SManually picked PS−0.80.00.8Amplitude1(c)1_51849(2009 12 01 08:07:01.29 17.67S,178.56W 561.4km) A01−0.80.00.8Amplitude2−0.80.00.8AmplitudeZ048Frequency1048Frequency2048FrequencyZ0.00.40.8Probability020406080100120Time (s)Predicted PPredicted SPredicted PSManually picked PManually picked SManually picked PS1(d)11_52113(2009 12 02 02:25:13.61 21.00S,176.06W 189.5km) A12W2Z12Z020406080100120Time (s)Predicted PPredicted SPredicted PSManually picked PManually picked PS0.00.51.01.52.02.53.0Specrogram AmplitudeFigure 4.3 Comparison of PS-P and S-P arrival time differences versus hypocentral distance
between the reference and final catalogs. The final catalog was developed by applying energy,
timing, and beamforming-based quality control measures to PhaseNet-TF predictions of PS
arrivals, represented as colored dots. The solid black line illustrates the least squares fit of the
PS-P time difference curve of the final catalog. The two dashed black lines indicate the 5% and
95% boundaries, marking the regions where the majority of PS arrivals from the final catalog fall
within these limits.

97

020406080100Differencial traveltime (sec)020040060080010001200Hypocentral distance (km)ManualS PFinal PS PNumber of selected PS arrivals: 5127020040060080010001200Hypocentral distance (km)FinalS PPS P removed due to energyPS P removed due to beamformingPS P removed due to travel timeFinal PS PNumber of selected PS arrivals: 8979Figure 4.4 Distribution of trustworthiness values derived from beam-forming MCMC simulations,
where higher values indicate greater reliability of the corresponding PS arrival. Panel (a) displays
the evaluation of this method using the reference dataset, while panel (b) shows the outcomes after
the first iteration of our semi-supervised learning framework.

98

02004006008001000Count0.00.20.40.60.81.0Trustworthiness Expection(a)manual010002000300040000.00.20.40.60.81.0Trustworthiness Expection(b)first round PhaseNet TFFigure 4.5 Comparison of beamforming results before and after applying MCMC-based quality
control on the reference dataset. The top-left map displays station FONI and associated
earthquake events, represented as either black or red dots. Red dots indicate events whose PS
arrivals are retained post-MCMC analysis, while black dots represent discarded arrivals. The
bottom-left plot shows waveform data for PS arrivals, aligned at the P arrival time, with red
indicating arrivals kept after beamforming analysis and black indicating those discarded. This
data has been filtered around the center frequency of manually identified PS waves, with a
bandwidth of 4 Hz. On the right, beamforming results are presented in a series of plots: the top
plot shows results for all P arrivals; the middle plot displays results for all PS arrivals before
MCMC filtering; the bottom plot illustrates results for PS arrivals post-MCMC filtering.
Following the method described by Gong and McGuire [2021], we optimize the power spectrum,
and the plots depict this spectrum as a function of various parameters: wave speed, azimuthal
angle, and takeoff angle, as well as their combinations, such as takeoff angle and azimuthal angle,
takeoff angle and wave speed, and azimuthal angle and wave speed.

99

180°178°W176°W174°W20°S18°S16°S(a)RemovedKeptFONI360380400420440460480500Hypocentral Distance (km)010203040Time (s)(b)P arrivalPS arrival−50050Takeoff Angle (degree)0100200300Azimuth Angle (degree)0.00.51.0Power6810Wave Speed (km/s)(e) PS beamforming after QC−50050Takeoff Angle (degree)0100200300Azimuth Angle (degree)0.00.51.0Power6810Wave Speed (km/s)(d) PS beamforming before QC−50050Takeoff Angle (degree)0100200300Azimuth Angle (degree)0.00.51.0Power6810Wave Speed (km/s)(c) P beamforming−50050Takeoff Angle (degree)6810Wave Speed (km/s)0.00.51.0Power0100200300Azimuth Angle (degree)−50050Takeoff Angle (degree)6810Wave Speed (km/s)0.00.51.0Power0100200300Azimuth Angle (degree)−50050Takeoff Angle (degree)6810Wave Speed (km/s)0.00.51.0Power0100200300Azimuth Angle (degree)0100200300Azimuth Angle (degree)6810Wave Speed (km/s)0.00.20.40.60.81.0Power0.00.51.0Power−50050Takeoff Angle (degree)0100200300Azimuth Angle (degree)6810Wave Speed (km/s)0.00.20.40.60.81.0Power0.00.51.0Power−50050Takeoff Angle (degree)0100200300Azimuth Angle (degree)6810Wave Speed (km/s)0.00.20.40.60.81.0Power0.00.51.0Power−50050Takeoff Angle (degree)Figure 4.6 Schematic illustration of our semi-supervised learning workflow. The process initiates
with the training of the PhaseNet-TF model on a dataset of manually picked seismic arrivals.
Subsequently, this trained model is deployed on continuous seismic records to predict
probabilities for P, S, and PS wave arrivals. These predictions, specifically for PS arrivals, are
then cross-referenced with a finalized catalog from previous studies, such as the one by Xi et al.
(2023), to ensure consistency and reliability. Quality control measures, including analysis of
timing discrepancies, the ratio of horizontal to vertical energy, and beamforming-based MCMC
simulations, are employed to refine this set of PS arrivals, resulting in an updated catalog. This
refined catalog is then utilized as a newly labeled dataset for further training and validation of the
PhaseNet-TF model in subsequent iterations. The workflow is iteratively executed until the
enhancements in the catalog become marginal, culminating in a comprehensive and accurate final
catalog of seismic events and their associated phase arrivals.

100

labeled picksPhaseNet-TFContinious unlabeled datasetTrainingPredictingPredicted PS arrivalsEnergy and TimingBased Quality ControlNew catalog andcorresponding arrivalsFinal catalog andcorresponding arrivalsRetrainingIndependent earthquake catalog (Xi et al., 2023, EarthArXiv)WaveformsSpectrogrmDeeplabV3+SpectrogramSegmentationPredictionsConstrainBeforeAfterQuality Control fromBeam-forming & MCMCRefiningFigure 4.7 Examples of PhaseNet-TF predictions from the test dataset, which was sampled from
the training dataset of the second-stage semi-supervised learning. Annotations and layout follow
those described in figure 4.2.

101

−0.80.00.8Amplitude1(a)3232(2009 11 25 16:42:22.31 20.19S,177.45W 511.4km) A01−0.80.00.8Amplitude2−0.80.00.8AmplitudeZ048Frequency1048Frequency2048FrequencyZ0.00.40.8ProbabilityPredicted PPredicted SPredicted PSManually picked PManually picked SManually picked PSE(b)4103(2009 11 30 06:52:49.41 21.74S,175.56W 89.0km) EUAPNZENZPredicted PPredicted SPredicted PSManually picked PManually picked SManually picked PS−0.80.00.8AmplitudeE(c)4386(2009 11 12 16:15:25.07 20.33S,177.32W 437.4km) NMKA−0.80.00.8AmplitudeN−0.80.00.8AmplitudeZ048FrequencyE048FrequencyN048FrequencyZ0.00.40.8Probability020406080100120Time (s)Predicted PPredicted SPredicted PSManually picked PManually picked SManually picked PSE(d)4478(2009 11 15 12:37:33.24 19.22S,177.01W 540.1km) NMKANZENZ020406080100120Time (s)Predicted PPredicted SPredicted PSManually picked PManually picked SManually picked PS0.00.51.01.52.0Specrogram AmplitudeFigure 4.8 Arrival-time residuals for PS waves throughout various iterations of semi-supervised
learning. Each histogram represents the differences between predicted arrival times, as
determined by the PhaseNet-TF model at each iteration, and manual picks, for instances where
both are available. Panel (a) showcases the residuals from the first iteration of the PhaseNet-TF
model application, while panel (b) displays the outcomes from the second iteration.

102

0200400600800Count−0.8−0.40.00.40.8Time Difference (s)(Iteration #1)PhaseNet TFmean:  0.01 sstd: 0.15 s(a)−0.8−0.40.00.40.8Time Difference (s)(Iteration #2)PhaseNet TFmean:  0.03 sstd: 0.25 s(b)Figure 4.9 Comparison of the reference and final catalogs, with stations color-coded according to
the number of PS arrivals recorded at each location (a-b), and events color-coded according to the
depth (c-d). In (c-d), the sizes of events are scaled based on the number of PS arrivals. The map
includes features similar to those described in Figure 4.1, such as station symbols, land shading,
and bathymetry contours.

103

26°S24°S22°S20°S18°S16°S14°STonga TrenchLau BasinFijiYLZ1II200400600Number of manually picked PS arrivals(a) ReferenceTonga TrenchLau BasinFijiYLZ1II50010001500Number of PS arrivals predicted by PhaseNet TF (Iteration #1)(b) PhaseNet TF (Iteration #1)180°175°W26°S24°S22°S20°S18°S16°S14°STonga TrenchLau BasinFiji0200400600Earthquake depth (km)(c) Reference180°175°WTonga TrenchLau BasinFiji0200400600Earthquake depth (km)(d) PhaseNet TF (Iteration #1)4.8 Tables

Table 4.1 Evaluation metrics for phase pickers applied to the test dataset across different stages of
semi-supervised learning models. Picks are considered true positives if their arrival-time
residuals are less than 1 second when compared to the corresponding labels. The table presents
metrics evaluated on a test dataset, which is partitioned from the labelled dataset. For
PhaseNet-TF Iteration #1, this labelled dataset is the manually picked reference catalogue. For
subsequent iterations, such as PhaseNet-TF Iteration #2, the labelled dataset is derived from the
output catalogue of the preceding iteration.

P-wave arrival

S-wave arrival

PS-wave arrival

Precision Recall

F1

Precision Recall

F1

Precision Recall

F1

PhaseNet-TF
(Iteration #1)

PhaseNet-TF
(Iteration #2)

0.99

0.99

0.99

0.95

0.99

0.97

0.98

0.999

0.99

0.98

0.97

0.97

0.94

0.99

0.97

0.93

0.99

0.97

104

BIBLIOGRAPHY

John Adam, Simon Turner, and Tracy Rushmer. The genesis of silicic arc magmas in shallow crustal
cold zones. Lithos, 264:472–494, 2016. doi: 10.1016/j.lithos.2016.07.036. Publisher: Elsevier.

Keiiti Aki and W. H. K. Lee. Determination of three-dimensional velocity anomalies under a seis-
mic array using first P arrival times from local earthquakes: 1. A homogeneous initial model.
Journal of Geophysical Research, 81(23):4381–4399, August 1976.
ISSN 01480227. doi:
10.1029/JB081i023p04381.

C. J. Allégre, V. Courtillot, P. Tapponnier, A. Hirn, M. Mattauer, C. Coulon, J. J. Jaeger, J. Achache,
U. Schärer, J. Marcoux, J. P. Burg, J. Girardeau, R. Armijo, C. Gariépy, C. Göpel, Li Tin-
dong, Xiao Xuchang, Chang Chenfa, Li Guangqin, Lin Baoyu, Teng Jiwen, Wang Naiwen, Chen
Guoming, Han Tonglin, Wang Xibin, Den Wanming, Sheng Huaibin, Cao Yougong, Zhou Ji,
Qiu Hongrong, Bao Peisheng, Wang Songchan, Wang Bixiang, Zhou Yaoxiu, and Ronghua Xu.
Structure and evolution of the Himalaya–Tibet orogenic belt. Nature, 307(5946):17–22, January
1984. ISSN 0028-0836, 1476-4687. doi: 10.1038/307017a0.

M. L. Amaru. Global travel time tomography with 3-D reference models, volume 274. Utrecht

University, 2007.

M. Baer and U. Kradolfer. An automatic phase picker for local and teleseismic events. Bulletin of
the Seismological Society of America, 77(4):1437–1445, August 1987. ISSN 0037-1106. doi:
10.1785/BSSA0770041437.

Xuewei Bao, Xiaodong Song, and Jiangtao Li. High-resolution lithospheric structure beneath Main-
land China from ambient noise and earthquake surface-wave tomography. Earth and Planetary
Science Letters, 417:132–141, May 2015. ISSN 0012821X. doi: 10.1016/j.epsl.2015.02.024.

M. Beyreuther, R. Barsch, L. Krischer, T. Megies, Y. Behr, and J. Wassermann. ObsPy: A Python
ISSN

Toolbox for Seismology. Seismological Research Letters, 81(3):530–533, May 2010.
0895-0695. doi: 10.1785/gssrl.81.3.530.

Günter Bock, Bernd Schurr, and Günter Asch. High‐resolution image of the oceanic moho in the
subducting Nazca Plate from P‐S converted waves. Geophysical Research Letters, 27(23):3929–
3932, December 2000. ISSN 0094-8276, 1944-8007. doi: 10.1029/2000GL011881.

Thomas Bornstein, Dietrich Lange, Jannes Münchmeyer, Jack Woollam, Andreas Rietbrock, Grace
Barcheck, Ingo Grevemeyer, and Frederik Tilmann. PickBlue: Seismic phase picking for ocean
bottom seismometers with deep learning, April 2023. arXiv:2304.06635 [physics].

Ebru Bozdağ, Jeannot Trampert, and Jeroen Tromp. Misfit functions for full waveform inversion
based on instantaneous phase and envelope measurements. Geophysical Journal International,
185(2):845–870, May 2011. ISSN 0956-540X. doi: 10.1111/j.1365-246X.2011.04970.x.

105

Ebru Bozdağ, Daniel Peter, Matthieu Lefebvre, Dimitri Komatitsch, Jeroen Tromp, Judith Hill,
Norbert Podhorszki, and David Pugmire. Global adjoint tomography: first-generation model.
Geophysical Journal International, 207(3):1739–1766, December 2016.
ISSN 0956-540X,
1365-246X. doi: 10.1093/gji/ggw356.

E. Bruce Watson and James M. Brenan. Fluids in the lithosphere, 1. Experimentally-determined
wetting characteristics of CO2H2O fluids and their implications for fluid transport, host-rock
physical properties, and fluid inclusion formation. Earth and Planetary Science Letters, 85(4):
497–515, October 1987. ISSN 0012821X. doi: 10.1016/0012-821X(87)90144-0.

Michael R. Brudzinski, Clifford H. Thurber, Bradley R. Hacker, and E. Robert Engdahl. Global
Prevalence of Double Benioff Zones. Science, 316(5830):1472–1474, June 2007. ISSN 0036-
8075, 1095-9203. doi: 10.1126/science.1139204.

Chen Cai and Douglas A. Wiens. Dynamic triggering of deep earthquakes within a fossil slab.
ISSN 0094-8276, 1944-

Geophysical Research Letters, 43(18):9492–9499, September 2016.
8007. doi: 10.1002/2016GL070347.

Chengping Chai, Monica Maceira, Hector J. Santos-Villalobos, Singanallur V. Venkatakrishnan,
Martin Schoenball, Weiqiang Zhu, Gregory C. Beroza, Clifford Thurber, and EGS Collab Team.
Using a Deep Neural Network and Transfer Learning to Bridge Scales for Seismic Phase Picking.
Geophysical Research Letters, 47(16):e2020GL088651, 2020. ISSN 1944-8007. doi: 10.1029/
2020GL088651.

Jie Chen, Yongshun John Chen, Douglas A. Wiens, S. Shawn Wei, Yang Zha, Jordi Julià, and Chen
Cai. Crustal and lithospheric structure of inactive volcanic arc terrains in Fiji. Tectonophysics,
750:394–403, 2019a. Publisher: Elsevier.

Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam. Encoder-
Decoder with Atrous Separable Convolution for Semantic Image Segmentation, August 2018.
arXiv:1802.02611 [cs].

Min Chen, Fenglin Niu, Qinya Liu, Jeroen Tromp, and Xiufen Zheng. Multiparameter adjoint
tomography of the crust and upper mantle beneath East Asia: 1. Model construction and com-
parisons. Journal of Geophysical Research: Solid Earth, 120(3):1762–1786, March 2015. ISSN
21699313. doi: 10.1002/2014JB011638.

Min Chen, Vlad Constantin Manea, Fenglin Niu, S. Shawn Wei, and Eric Kiser. Genesis of
Intermediate-Depth and Deep Intraslab Earthquakes beneath Japan Constrained by Seismic To-
mography, Seismicity, and Thermal Modeling. Geophysical Research Letters, 46(4):2025–2036,
2019b. ISSN 1944-8007. doi: 10.1029/2018GL080025.

Robert Clayton and Björn Engquist. Absorbing boundary conditions for acoustic and elastic wave
equations. Bulletin of the Seismological Society of America, 67(6):1529–1540, December 1977.
ISSN 1943-3573, 0037-1106. doi: 10.1785/BSSA0670061529.

106

James A. Conder and Douglas A. Wiens. Seismic structure beneath the Tonga arc and Lau back‐arc
basin determined from joint Vp, Vp/Vs tomography. Geochemistry, Geophysics, Geosystems, 7
(3):2005GC001113, March 2006. ISSN 1525-2027, 1525-2027. doi: 10.1029/2005GC001113.

C. DeMets, R. G. Gordon, D. F. Argus, and S. Stein. Current plate motions. Geophysi-
ISSN 0956540X, 1365246X. doi:

cal Journal International, 101(2):425–478, May 1990.
10.1111/j.1365-246X.1990.tb06579.x.

Joseph J. Durek and Göran Ekström. A radial model of anelasticity consistent with long-period
surface-wave attenuation. Bulletin of the Seismological Society of America, 86(1A):144–158,
February 1996. ISSN 0037-1106. doi: 10.1785/BSSA08601A0144.

A. M. Dziewonski, T.‐A. Chou, and J. H. Woodhouse. Determination of earthquake source pa-
rameters from waveform data for studies of global and regional seismicity. Journal of Geo-
physical Research: Solid Earth, 86(B4):2825–2852, April 1981.
doi:
10.1029/JB086iB04p02825.

ISSN 0148-0227.

Adam M. Dziewonski and Don L. Anderson. Preliminary reference Earth model. Physics
doi:

of the Earth and Planetary Interiors, 25(4):297–356, June 1981.
10.1016/0031-9201(81)90046-7.

ISSN 00319201.

Adam M. Dziewonski, Bradford H. Hager, and Richard J. O’Connell. Large-scale heterogeneities
ISSN

in the lower mantle. Journal of Geophysical Research, 82(2):239–255, January 1977.
01480227. doi: 10.1029/JB082i002p00239.

G. Ekström, M. Nettles, and A.M. Dziewoński. The global CMT project 2004–2010: Centroid-
moment tensors for 13,017 earthquakes. Physics of the Earth and Planetary Interiors, 200-201:
1–9, June 2012. ISSN 00319201. doi: 10.1016/j.pepi.2012.04.002.

E. Robert Engdahl, Rob van der Hilst, and Raymond Buland. Global teleseismic earthquake re-
location with improved travel times and procedures for depth determination. Bulletin of the
Seismological Society of America, 88(3):722–743, 1998. Publisher: The Seismological Society
of America.

Wenyuan Fan, S. Shawn Wei, Dongdong Tian, Jeffrey J. McGuire, and Douglas A. Wiens. Complex
and Diverse Rupture Processes of the 2018 M w 8.2 and M w 7.9 Tonga‐Fiji Deep Earthquakes.
Geophysical Research Letters, 46(5):2434–2448, March 2019.
ISSN 0094-8276, 1944-8007.
doi: 10.1029/2018GL080997.

Andreas Fichtner and Jeannot Trampert. Hessian kernels of seismic data functionals based upon
ISSN

adjoint techniques. Geophysical Journal International, 185(2):775–798, May 2011a.
0956540X. doi: 10.1111/j.1365-246X.2011.04966.x.

Andreas Fichtner and Jeannot Trampert. Resolution analysis in full waveform inversion. Geo-
ISSN 0956540X. doi:

physical Journal International, 187(3):1604–1624, December 2011b.

107

10.1111/j.1365-246X.2011.05218.x.

Andreas Fichtner, Brian L. N. Kennett, Heiner Igel, and Hans-Peter Bunge. Full seismic wave-
form tomography for upper-mantle structure in the Australasian region using adjoint meth-
ods. Geophysical Journal International, 179(3):1703–1725, December 2009. ISSN 0956540X,
1365246X. doi: 10.1111/j.1365-246X.2009.04368.x.

Andreas Fichtner, Brian L.N. Kennett, Heiner Igel, and Hans-Peter Bunge. Full waveform tomogra-
phy for radially anisotropic structure: New insights into present and past states of the Australasian
upper mantle. Earth and Planetary Science Letters, 290(3-4):270–280, February 2010. ISSN
0012821X. doi: 10.1016/j.epsl.2009.12.003.

Andreas Fichtner, Jeannot Trampert, Paul Cupillard, Erdinc Saygin, Tuncay Taymaz, Yann Capdev-
ille, and Antonio Villaseñor. Multiscale full waveform inversion. Geophysical Journal Interna-
tional, 194(1):534–556, July 2013. ISSN 0956-540X. doi: 10.1093/gji/ggt118.

M. A. Florez and G. A. Prieto. Controlling Factors of Seismicity and Geometry in Double Seismic
Zones. Geophysical Research Letters, 46(8):4174–4181, April 2019. ISSN 0094-8276, 1944-
8007. doi: 10.1029/2018GL081168.

S. W. French and B. A. Romanowicz. Whole-mantle radially anisotropic shear velocity structure
from spectral-element waveform tomography. Geophysical Journal International, 199(3):1303–
1327, December 2014. ISSN 1365-246X, 0956-540X. doi: 10.1093/gji/ggu334.

Scott W. French and Barbara Romanowicz. Broad plumes rooted at the base of the Earth’s mantle
beneath major hotspots. Nature, 525(7567):95–99, September 2015. ISSN 0028-0836, 1476-
4687. doi: 10.1038/nature14876.

Wolfgang Friederich. The S -velocity structure of the East Asian mantle from inversion of shear
and surface waveforms. Geophysical Journal International, 153(1):88–102, April 2003. ISSN
0956540X, 1365246X. doi: 10.1046/j.1365-246X.2003.01869.x.

Lind S. Gee and Thomas H. Jordan. Generalized seismological data functionals. Geophysical
ISSN 0956-540X. doi: 10.1111/j.

Journal International, 111(2):363–390, November 1992.
1365-246X.1992.tb00584.x.

S. Gentili and A. Michelini. Automatic picking of P and S phases using a neural tree. Journal of
Seismology, 10(1):39–63, January 2006. ISSN 1573-157X. doi: 10.1007/s10950-006-2296-6.

Jianhua Gong and Jeffrey J. McGuire. Constraints on the Geometry of the Subducted Gorda Plate
From Converted Phases Generated by Local Earthquakes. Journal of Geophysical Research:
Solid Earth, 126(2):e2020JB019962, February 2021. ISSN 2169-9313, 2169-9356. doi: 10.
1029/2020JB019962.

Jianhua Gong, Wenyuan Fan, and Ross Parnell-Turner. Machine Learning-Based New Earthquake

108

Catalog Illuminates On-Fault and Off-Fault Seismicity Patterns at the Discovery Transform Fault,
East Pacific Rise. Geochemistry, Geophysics, Geosystems, 24(9):e2023GC011043, 2023. ISSN
1525-2027. doi: 10.1029/2023GC011043.

Harry W. Green and Heidi Houston. The Mechanics of Deep Earthquakes. Annual Review of
Earth and Planetary Sciences, 23(1):169–213, May 1995. ISSN 0084-6597, 1545-4495. doi:
10.1146/annurev.ea.23.050195.001125.

Zhen Guo, Y. John Chen, Jieyuan Ning, Yingjie Yang, Juan Carlos Afonso, and Youcai Tang.
Seismic evidence of on-going sublithosphere upper mantle convection for intra-plate volcanism
in Northeast China. Earth and Planetary Science Letters, 433:31–43, January 2016.
ISSN
0012821X. doi: 10.1016/j.epsl.2015.09.035.

Zhen Guo, Kai Wang, Yingjie Yang, Youcai Tang, Y. John Chen, and Shu-Huei Hung. The Origin
and Mantle Dynamics of Quaternary Intraplate Volcanism in Northeast China From Joint Inver-
sion of Surface Wave and Body Wave. Journal of Geophysical Research: Solid Earth, 123(3):
2410–2425, March 2018. ISSN 21699313. doi: 10.1002/2017JB014948.

Bradley R. Hacker, Simon M. Peacock, Geoffrey A. Abers, and Stephen D. Holloway. Subduc-
tion factory 2. Are intermediate‐depth earthquakes in subducting slabs linked to metamorphic
dehydration reactions? Journal of Geophysical Research: Solid Earth, 108(B1):2001JB001129,
January 2003. ISSN 0148-0227. doi: 10.1029/2001JB001129.

Gavin P. Hayes, David J. Wald, and Rebecca L. Johnson. Slab1.0: A three-dimensional model of
global subduction zone geometries. Journal of Geophysical Research: Solid Earth, 117(B1),
2012. ISSN 2156-2202. doi: 10.1029/2011JB008524.

Gavin P. Hayes, Ginevra L. Moore, Daniel E. Portner, Mike Hearne, Hanna Flamme, Maria Furtney,
and Gregory M. Smoczyk. Slab2, a comprehensive subduction zone geometry model. Science,
362(6410):58–61, October 2018. ISSN 0036-8075, 1095-9203. doi: 10.1126/science.aat4723.

Chuansong He, Shuwen Dong, Xuanhua Chen, M. Santosh, and Shuyin Niu. Seismic evidence for
plume-induced rifting in the Songliao Basin of Northeast China. Tectonophysics, 627:171–181,
July 2014. ISSN 00401951. doi: 10.1016/j.tecto.2013.07.015.

George Helffrich and Geoffrey A. Abers. Slab low-velocity layer in the eastern Aleutian subduction
zone. Geophysical Journal International, 130(3):640–648, 1997. doi: 10.1111/j.1365-246X.
1997.tb01858.x. Publisher: Blackwell Publishing Ltd Oxford, UK.

Matthew D. Hoffman and Andrew Gelman. The No-U-Turn sampler: adaptively setting path lengths

in Hamiltonian Monte Carlo. J. Mach. Learn. Res., 15(1):1593–1623, 2014.

Jinli Huang and Dapeng Zhao. High-resolution mantle tomography of China and surrounding
ISSN 0148-0227. doi:

regions. Journal of Geophysical Research, 111(B9):B09305, 2006.
10.1029/2005JB004066.

109

Zhouchuan Huang, Pan Wang, Mingjie Xu, Liangshu Wang, Zhifeng Ding, Yan Wu, Mijian Xu,
Ning Mi, Dayong Yu, and Hua Li. Mantle structure and dynamics beneath SE Tibet revealed by
new seismic images. Earth and Planetary Science Letters, 411:100–111, February 2015. ISSN
0012821X. doi: 10.1016/j.epsl.2014.11.040.

Toshihiro Igarashi, Toru Matsuzawa, Norihito Umino, and Akira Hasegawa. Spatial distribution
of focal mechanisms for interplate and intraplate earthquakes associated with the subducting
Pacific plate beneath the northeastern Japan arc: A triple‐planed deep seismic zone. Journal of
Geophysical Research: Solid Earth, 106(B2):2177–2191, February 2001. ISSN 0148-0227. doi:
10.1029/2000JB900386.

International-Seismological-Centre. ISC-EHB dataset. 2023.

Bryan Isacks and Peter Molnar. Distribution of stresses in the descending lithosphere from a global
survey of focal‐mechanism solutions of mantle earthquakes. Reviews of Geophysics, 9(1):103–
174, February 1971. ISSN 8755-1209, 1944-9208. doi: 10.1029/RG009i001p00103.

Zhe Jia, Zhichao Shen, Zhongwen Zhan, Chenyu Li, Zhigang Peng, and Michael Gurnis. The 2018
Fiji Mw 8.2 and 7.9 deep earthquakes: One doublet in two slabs. Earth and Planetary Science
Letters, 531:115997, 2020. Publisher: Elsevier.

Hitoshi Kawakatsu. Double seismic zone in Tonga. Nature, 316(6023):53–55, 1985. Publisher:

Nature Publishing Group UK London.

Hitoshi Kawakatsu, Prakash Kumar, Yasuko Takei, Masanao Shinohara, Toshihiko Kanazawa, Ei-
ichiro Araki, and Kiyoshi Suyehiro. Seismic Evidence for Sharp Lithosphere-Asthenosphere
Boundaries of Oceanic Plates. Science, 324(5926):499–502, April 2009.
ISSN 0036-8075,
1095-9203. doi: 10.1126/science.1169499.

B. L. N. Kennett and E. R. Engdahl. Traveltimes for global earthquake location and phase identifi-
cation. Geophysical Journal International, 105(2):429–465, May 1991. ISSN 0956-540X. doi:
10.1111/j.1365-246X.1991.tb06724.x.

B. L. N. Kennett, E. R. Engdahl, and R. Buland. Constraints on seismic velocities in the Earth from
traveltimes. Geophysical Journal International, 122(1):108–124, July 1995. ISSN 0956540X,
1365246X. doi: 10.1111/j.1365-246X.1995.tb03540.x.

Yong‐Woo Kim, Sung‐Joon Chang, Michael Witek, Jieyuan Ning, and Jingchong Wen. S‐Velocity
Mantle Structure of East Asia From Teleseismic Traveltime Tomography: Inferred Mechanisms
for the Cenozoic Intraplate Volcanoes. Journal of Geophysical Research: Solid Earth, 126(3),
March 2021. ISSN 2169-9313, 2169-9356. doi: 10.1029/2020JB020345.

YoungHee Kim, Qinya Liu, and Jeroen Tromp. Adjoint centroid-moment tensor inversions. Geo-
physical Journal International, 186(1):264–278, July 2011. ISSN 0956540X. doi: 10.1111/j.
1365-246X.2011.05027.x.

110

Saeko Kita, Tomomi Okada, Akira Hasegawa, Junichi Nakajima, and Toru Matsuzawa. Anoma-
lous deepening of a seismic belt in the upper-plane of the double seismic zone in the Pacific slab
beneath the Hokkaido corner: Possible evidence for thermal shielding caused by subducted fore-
arc crust materials. Earth and Planetary Science Letters, 290(3-4):415–426, 2010. Publisher:
Elsevier.

Dimitri Komatitsch and Jeroen Tromp. Spectral-element simulations of global seismic wave
propagation-I. Validation. Geophysical Journal International, 149(2):390–412, May 2002a.
ISSN 0956540X, 1365246X. doi: 10.1046/j.1365-246X.2002.01653.x.

Dimitri Komatitsch and Jeroen Tromp. Spectral-element simulations of global seismic wave
propagation-II. Three-dimensional models, oceans, rotation and self-gravitation. Geophysi-
cal Journal International, 150(1):303–318, July 2002b.
ISSN 0956540X, 1365246X. doi:
10.1046/j.1365-246X.2002.01716.x.

Lion Krischer, James Smith, Wenjie Lei, Matthieu Lefebvre, Youyi Ruan, Elliott Sales De An-
drade, Norbert Podhorszki, Ebru Bozdağ, and Jeroen Tromp. An Adaptable Seismic Data For-
mat. Geophysical Journal International, 207(2):1003–1011, November 2016. ISSN 0956-540X,
1365-246X. doi: 10.1093/gji/ggw319.

Lion Krischer, Andreas Fichtner, Christian Boehm, and Heiner Igel. Automated Large-Scale
Full Seismic Waveform Inversion for North America and the North Atlantic. Journal of Geo-
physical Research: Solid Earth, 123(7):5902–5928, 2018.
ISSN 2169-9356. doi: 10.1029/
2017JB015289.

B. Kustowski, G. Ekström, and A. M. Dziewoński. Anisotropic shear-wave velocity structure of
the Earth’s mantle: A global model. Journal of Geophysical Research, 113(B6):B06306, June
2008a. ISSN 0148-0227. doi: 10.1029/2007JB005169.

B. Kustowski, G. Ekström, and A. M. Dziewoński. The shear-wave velocity structure in the upper
mantle beneath Eurasia. Geophysical Journal International, 174(3):978–992, September 2008b.
ISSN 0956540X, 1365246X. doi: 10.1111/j.1365-246X.2008.03865.x.

Gabi Laske, Guy Masters, Zhitu Ma, and Mike Pasyanos. Update on CRUST1. 0—A 1-degree

global model of Earth’s crust. In Geophys. res. abstr, volume 15, page 2658, 2013.

Jianshe Lei. Upper-mantle tomography and dynamics beneath the North China Craton. Journal
of Geophysical Research: Solid Earth, 117(B6), June 2012. ISSN 01480227. doi: 10.1029/
2012JB009212.

Jianshe Lei and Dapeng Zhao. P-wave tomography and origin of the Changbai intraplate volcano
in Northeast Asia. Tectonophysics, 397(3-4):281–295, March 2005. ISSN 00401951. doi: 10.
1016/j.tecto.2004.12.009.

Jianshe Lei and Dapeng Zhao. Teleseismic P‐wave tomography and mantle dynamics beneath

111

Eastern Tibet. Geochemistry, Geophysics, Geosystems, 17(5):1861–1884, May 2016.
1525-2027, 1525-2027. doi: 10.1002/2016GC006262.

ISSN

Jianshe Lei, Dapeng Zhao, Bernhard Steinberger, Bateer Wu, Fanluan Shen, and Zhixiong Li. New
seismic constraints on the upper mantle structure of the Hainan plume. Physics of the Earth and
Planetary Interiors, 173(1-2):33–50, March 2009. ISSN 00319201. doi: 10.1016/j.pepi.2008.
10.013.

Jianshe Lei, Furen Xie, Qicheng Fan, and M. Santosh. Seismic imaging of the deep structure under
the Chinese volcanoes: An overview. Physics of the Earth and Planetary Interiors, 224:104–123,
November 2013. ISSN 00319201. doi: 10.1016/j.pepi.2013.08.008.

Jianshe Lei, Dapeng Zhao, Xiwei Xu, Yi-Gang Xu, and Mofei Du. Is there a big mantle wedge
under eastern Tibet? Physics of the Earth and Planetary Interiors, 292:100–113, July 2019.
ISSN 00319201. doi: 10.1016/j.pepi.2019.04.005.

Wenjie Lei, Youyi Ruan, Ebru Bozdağ, Daniel Peter, Matthieu Lefebvre, Dimitri Komatitsch,
Global adjoint
Jeroen Tromp, Judith Hill, Norbert Podhorszki, and David Pugmire.
tomography—model GLAD-M25. Geophysical Journal International, 223(1):1–21, October
2020. ISSN 0956-540X, 1365-246X. doi: 10.1093/gji/ggaa253.

Chang Li and Robert D. Van Der Hilst. Structure of the upper mantle and transition zone beneath
Southeast Asia from traveltime tomography. Journal of Geophysical Research, 115(B7):B07308,
July 2010. ISSN 0148-0227. doi: 10.1029/2009JB006882.

Chang Li, Robert D. Van Der Hilst, E. Robert Engdahl, and Scott Burdick. A new global model for
P wave speed variations in Earth’s mantle. Geochemistry, Geophysics, Geosystems, 9(5), May
2008. ISSN 15252027. doi: 10.1029/2007GC001806.

Guoliang Li, Haichao Chen, Fenglin Niu, Zhen Guo, Yingjie Yang, and Jun Xie. Measurement
of Rayleigh wave ellipticity and its application to the joint inversion of high‐resolution S wave
velocity structure beneath northeast China. Journal of Geophysical Research: Solid Earth, 121
(2):864–880, February 2016. ISSN 2169-9313, 2169-9356. doi: 10.1002/2015JB012459.

Shaohua Li, Jiaqi Li, Thomas P. Ferrand, Tong Zhou, Mingda Lv, Ziyi Xi, Ross Maguire, Guangjie
Han, Juan Li, Xiyuan Bao, Yiran Jiang, and Tiezhao Bao. Deep Geophysical Anomalies Be-
Journal of Geophysical Research: Solid Earth, 128(4):
neath the Changbaishan Volcano.
e2022JB025671, April 2023. ISSN 2169-9313, 2169-9356. doi: 10.1029/2022JB025671.

Hobin Lim, YoungHee Kim, Teh-Ru Alex Song, and Xuzhang Shen. Measurement of seismome-
ter orientation using the tangential P-wave receiver function based on harmonic decomposition.
Geophysical Journal International, 212(3):1747–1765, March 2018. ISSN 0956-540X, 1365-
246X. doi: 10.1093/gji/ggx515.

Min Liu, Miao Zhang, Weiqiang Zhu, William L. Ellsworth, and Hongyi Li. Rapid Characterization

112

of the July 2019 Ridgecrest, California, Earthquake Sequence From Raw Seismic Data Using
Machine-Learning Phase Picker. Geophysical Research Letters, 47(4):e2019GL086189, 2020.
ISSN 1944-8007. doi: 10.1029/2019GL086189.

Min Liu, Lu Li, Miao Zhang, Xinglin Lei, Mladen R. Nedimović, Alexandre P. Plourde, Rumeng
Guo, Weitao Wang, and Hongyi Li. Complexity of initiation and evolution of the 2013 Yunlong
earthquake swarm. Earth and Planetary Science Letters, 612:118168, June 2023. ISSN 0012-
821X. doi: 10.1016/j.epsl.2023.118168.

Ilya Loshchilov and Frank Hutter. Decoupled Weight Decay Regularization, January 2019.

arXiv:1711.05101 [cs, math].

Chang Lu, Stephen P. Grand, Hongyu Lai, and Edward J. Garnero. TX2019slab: A New P and S
Tomography Model Incorporating Subducting Slabs. Journal of Geophysical Research: Solid
Earth, 124(11):11549–11567, 2019. ISSN 2169-9356. doi: 10.1029/2019JB017448.

Jincheng Ma, Hans-Peter Bunge, Solvi Thrastarson, Andreas Fichtner, Dirk-Philip van Herwaarden,
You Tian, Sung-Joon Chang, and Tingting Liu. Seismic full-waveform inversion of the crust-
mantle structure beneath China and adjacent regions. Journal of Geophysical Research: Solid
Earth, 127(9), 2022.

Alessia Maggi, Carl Tape, Min Chen, Daniel Chao, and Jeroen Tromp. An automated time-window
selection algorithm for seismic tomography. Geophysical Journal International, 178(1):257–
281, July 2009. ISSN 0956540X, 1365246X. doi: 10.1111/j.1365-246X.2009.04099.x.

Toru Matsuzawa, Norihito Umino, Akira Hasegawa, and Akio Takagi. Upper mantle velocity struc-
ture estimated from PS-converted wave beneath the north-eastern Japan Arc. Geophysical Jour-
nal International, 86(3):767–787, 1986. doi: 10.1111/j.1365-246X.1986.tb00659.x. Publisher:
Blackwell Publishing Ltd Oxford, UK.

Toru Matsuzawa, Toshio Kono, Akira Hasegawa, and Akio Takagi. Subducting plate boundary
beneath the northeastern Japan arc estimated from SP converted waves. Tectonophysics, 181
(1-4):123–133, 1990. doi: 10.1016/0040-1951(90)90012-W. Publisher: Elsevier.

Sarah E. Mazza, Esteban Gazel, Michael Bizimis, Robert Moucha, Paul Béguelin, Elizabeth A.
Johnson, Ryan J. McAleer, and Alexander V. Sobolev. Sampling the volatile-rich transition zone
beneath Bermuda. Nature, 569(7756):398–403, May 2019. ISSN 0028-0836, 1476-4687. doi:
10.1038/s41586-019-1183-6.

Ian W. McBrearty and Gregory C. Beroza. Earthquake phase association with graph neural net-
works. Bulletin of the Seismological Society of America, 113(2):524–547, 2023. Publisher:
Seismological Society of America.

Jeffrey J. McGuire, Douglas A. Wiens, Patrick J. Shore, and Michael G. Bevis. The March 9, 1994
( M w 7.6), deep Tonga earthquake: Rupture outside the seismically active slab. Journal of

113

Geophysical Research: Solid Earth, 102(B7):15163–15182, July 1997. ISSN 0148-0227. doi:
10.1029/96JB03185.

Anne Meltzer, Joshua C. Stachnik, Demberel Sodnomsambuu, Ulziibat Munkhuu, Baasanbat
Tsagaan, Mungunsuren Dashdondog, and Raymond Russo. The Central Mongolia Seismic
Experiment: Multiple Applications of Temporary Broadband Seismic Arrays. Seismological
Research Letters, 90(3):1364–1376, May 2019. ISSN 0895-0695, 1938-2057. doi: 10.1785/
0220180360.

Raffaella Montelli, Guust Nolet, F. A. Dahlen, Guy Masters, E. Robert Engdahl, and Shu-Huei
Hung. Finite-Frequency Tomography Reveals a Variety of Plumes in the Mantle. Science, 303
(5656):338–343, January 2004. ISSN 0036-8075, 1095-9203. doi: 10.1126/science.1092485.

S. Mostafa Mousavi and Gregory C. Beroza. Deep-learning seismology. Science, 377(6607):

eabm4470, August 2022. ISSN 0036-8075, 1095-9203. doi: 10.1126/science.abm4470.

S. Mostafa Mousavi and Gregory C. Beroza. Machine Learning in Earthquake Seismology. Annual
Review of Earth and Planetary Sciences, 51(1):105–129, May 2023. ISSN 0084-6597, 1545-
4495. doi: 10.1146/annurev-earth-071822-100323.

S. Mostafa Mousavi, Yixiao Sheng, Weiqiang Zhu, and Gregory C. Beroza. STanford EArthquake
Dataset (STEAD): A Global Data Set of Seismic Signals for AI. IEEE Access, 7:179464–179476,
2019. ISSN 2169-3536. doi: 10.1109/ACCESS.2019.2947848. Conference Name: IEEE Ac-
cess.

S. Mostafa Mousavi, William L. Ellsworth, Weiqiang Zhu, Lindsay Y. Chuang, and Gregory C.
Beroza. Earthquake transformer—an attentive deep-learning model for simultaneous earthquake
detection and phase picking. Nature Communications, 11(1):3952, August 2020. ISSN 2041-
1723. doi: 10.1038/s41467-020-17591-w. Number: 1 Publisher: Nature Publishing Group.

Nori Nakata and David R. Shelly. Imaging a Crustal Low‐Velocity Layer Using Reflected Seismic
Waves From the 2014 Earthquake Swarm at Long Valley Caldera, California: The Magmatic
System Roof? Geophysical Research Letters, 45(8):3481–3488, April 2018. ISSN 0094-8276,
1944-8007. doi: 10.1029/2018GL077260.

Peter L. Nelson and Stephen P. Grand. Lower-mantle plume beneath the Yellowstone hotspot re-
vealed by core waves. Nature Geoscience, 11(4):280–284, April 2018. ISSN 1752-0894, 1752-
0908. doi: 10.1038/s41561-018-0075-y.

Fenglin Niu and Juan Li. Component azimuths of the CEArray stations estimated from P-wave
particle motion. Earthquake Science, 24(1):3–13, February 2011. ISSN 1674-4519, 1867-8777.
doi: 10.1007/s11589-011-0764-8.

Masayuki Obayashi, Junko Yoshimitsu, Guust Nolet, Yoshio Fukao, Hajime Shiobara, Hiroko Sug-
ioka, Hiroki Miyamachi, and Yuan Gao. Finite frequency whole mantle P wave tomography: Im-

114

provement of subducted slab images. Geophysical Research Letters, 40(21):5652–5657, Novem-
ber 2013. ISSN 00948276. doi: 10.1002/2013GL057401.

Mathias Obrebski, Richard M. Allen, Fengxue Zhang, Jiatie Pan, Qingju Wu, and Shu-Huei Hung.
Shear wave tomography of China using joint inversion of body and surface wave constraints.
Journal of Geophysical Research: Solid Earth, 117(B1), January 2012. ISSN 01480227. doi:
10.1029/2011JB008349.

Shiro Ohmi and Sadaki Hori. Seismic wave conversion near the upper boundary of the Pacific plate
beneath the Kanto district, Japan. Geophysical Journal International, 141(1):136–148, April
2000. ISSN 0956-540X. doi: 10.1046/j.1365-246X.2000.00086.x.

Yoshimitsu Okada, Keiji Kasahara, Sadaki Hori, Kazushige Obara, Shoji Sekiguchi, Hiroyuki Fuji-
wara, and Akira Yamamoto. Recent progress of seismic observation networks in Japan —Hi-net,
F-net, K-NET and KiK-net—. Earth, Planets and Space, 56(8):xv–xxviii, August 2004. ISSN
1880-5981. doi: 10.1186/BF03353076.

Mark Panning and Barbara Romanowicz. A three-dimensional radially anisotropic model of shear
velocity in the whole mantle. Geophysical Journal International, 167(1):361–379, October 2006.
ISSN 0956540X, 1365246X. doi: 10.1111/j.1365-246X.2006.03100.x.

Yongsoo Park, S. Mostafa Mousavi, Weiqiang Zhu, William L. Ellsworth, and Gregory C. Beroza.
Machine-Learning-Based Analysis of the Guy-Greenbrier, Arkansas Earthquakes: A Tale of Two
Sequences. Geophysical Research Letters, 47(6):e2020GL087032, 2020. ISSN 1944-8007. doi:
10.1029/2020GL087032.

J. D. Pesicek, C. H. Thurber, H. Zhang, H. R. DeShon, E. R. Engdahl, and S. Widiyantoro. Teleseis-
mic double-difference relocation of earthquakes along the Sumatra-Andaman subduction zone
using a 3-D model. Journal of Geophysical Research, 115(B10):B10303, October 2010. ISSN
0148-0227. doi: 10.1029/2010JB007443.

Daniel Peter, Dimitri Komatitsch, Yang Luo, Roland Martin, Nicolas Le Goff, Emanuele Casarotti,
Pieyre Le Loher, Federica Magnoni, Qinya Liu, Céline Blitz, Tarje Nissen-Meyer, Piero Basini,
and Jeroen Tromp. Forward and adjoint simulations of seismic wave propagation on fully unstruc-
tured hexahedral meshes. Geophysical Journal International, 186(2):721–739, August 2011.
ISSN 0956540X. doi: 10.1111/j.1365-246X.2011.05044.x.

E. Polak and G. Ribiere. Note sur la convergence de méthodes de directions conjuguées. Revue
française d’informatique et de recherche opérationnelle. Série rouge, 3(16):35–43, 1969. ISSN
0373-8000. doi: 10.1051/m2an/196903R100351.

M. J. D. Powell and Professor of Applied Numerical Analysis M. J. D. Powell. Approximation
ISBN 978-0-521-29514-7.

Theory and Methods. Cambridge University Press, March 1981.
Google-Books-ID: ODZ1OYR3w4cC.

115

Florian Rickers, Andreas Fichtner, and Jeannot Trampert.

Imaging mantle plumes with instan-
taneous phase measurements of diffracted waves. Geophysical Journal International, 190(1):
650–664, August 2012. ISSN 0956-540X. doi: 10.1111/j.1365-246X.2012.05515.x.

Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional Networks for Biomed-
ical Image Segmentation. In Nassir Navab, Joachim Hornegger, William M. Wells, and Alejan-
dro F. Frangi, editors, Medical Image Computing and Computer-Assisted Intervention – MICCAI
2015, Lecture Notes in Computer Science, pages 234–241, Cham, 2015. Springer International
Publishing. ISBN 978-3-319-24574-4. doi: 10.1007/978-3-319-24574-4_28.

Zachary E. Ross, Men‐Andrin Meier, Egill Hauksson, and Thomas H. Heaton. Generalized Seismic
Phase Detection with Deep Learning. Bulletin of the Seismological Society of America, 108(5A):
2894–2901, August 2018. ISSN 0037-1106. doi: 10.1785/0120180080.

Zachary E. Ross, Elizabeth S. Cochran, Daniel T. Trugman, and Jonathan D. Smith. 3D fault
architecture controls the dynamism of earthquake swarms. Science, 368(6497):1357–1361, June
2020. doi: 10.1126/science.abb0779. Publisher: American Association for the Advancement of
Science.

Youyi Ruan, Wenjie Lei, Ryan Modrak, Rıdvan Örsvuran, Ebru Bozdağ, and Jeroen Tromp. Bal-
ancing unevenly distributed data in seismic tomography: a global adjoint tomography exam-
ple. Geophysical Journal International, 219(2):1225–1236, November 2019. ISSN 0956-540X,
1365-246X. doi: 10.1093/gji/ggz356.

Larry Ruff and Hiroo Kanamori. Seismicity and the subduction process. Physics of the Earth
doi: 10.1016/

ISSN 0031-9201.

and Planetary Interiors, 23(3):240–252, October 1980.
0031-9201(80)90117-X.

Takahiro Shiina, Junichi Nakajima, and Toru Matsuzawa. Seismic evidence for high pore pressures
in the oceanic crust: Implications for fluid‐related embrittlement. Geophysical Research Letters,
40(10):2006–2010, May 2013. ISSN 0094-8276, 1944-8007. doi: 10.1002/grl.50468.

Takahiro Shiina, Junichi Nakajima, Genti Toyokuni, and Toru Matsuzawa. Guided wave obser-
vations and evidence for the low-velocity subducting crust beneath Hokkaido, northern Japan.
Earth, Planets and Space, 66(1):69, December 2014.
doi: 10.1186/
1880-5981-66-69.

ISSN 1880-5981.

Takahiro Shiina, Junichi Nakajima, Toru Matsuzawa, Genti Toyokuni, and Saeko Kita. Depth vari-
ations in seismic velocity in the subducting crust: Evidence for fluid‐related embrittlement for
intermediate‐depth earthquakes. Geophysical Research Letters, 44(2):810–817, January 2017.
ISSN 0094-8276, 1944-8007. doi: 10.1002/2016GL071798.

N A Simmons, S C Myers, C Morency, A Chiang, and D R Knapp. SPiRaL: a multiresolution
global tomography model of seismic wave speeds and radial anisotropy variations in the crust
and mantle. Geophysical Journal International, 227(2):1366–1391, November 2021.
ISSN

116

0956-540X. doi: 10.1093/gji/ggab277.

Reinoud Sleeman and Torild van Eck. Robust automatic P-phase picking: an on-line implemen-
tation in the analysis of broadband seismogram recordings. Physics of the Earth and Planetary
Interiors, 113(1):265–275, June 1999. ISSN 0031-9201. doi: 10.1016/S0031-9201(99)00007-2.

Smithsonian-Institution. Global Volcanism Program, 2023. [Database] Volcanoes of the World (v.
5.0.4; 17 Apr 2023), April 2023. Distributed by Smithsonian Institution, compiled by Venzke,
E.

Paul Spudich and Todd Bostwick. Studies of the seismic coda using an earthquake cluster as
a deeply buried seismograph array. Journal of Geophysical Research: Solid Earth, 92(B10):
10526–10546, September 1987. ISSN 0148-0227. doi: 10.1029/JB092iB10p10526.

Richard Stacey. Improved transparent boundary formulations for the elastic-wave equation. Bulletin
of the Seismological Society of America, 78(6):2089–2097, December 1988. ISSN 1943-3573,
0037-1106. doi: 10.1785/BSSA0780062089.

Yen Joe Tan, Felix Waldhauser, William L. Ellsworth, Miao Zhang, Weiqiang Zhu, Maddalena
Michele, Lauro Chiaraluce, Gregory C. Beroza, and Margarita Segou. Machine‐Learning‐Based
High‐Resolution Earthquake Catalog Reveals How Complex Fault Structures Were Activated
during the 2016–2017 Central Italy Sequence. The Seismic Record, 1(1):11–19, May 2021.
ISSN 2694-4006. doi: 10.1785/0320210001.

Youcai Tang, Masayuki Obayashi, Fenglin Niu, Stephen P. Grand, Yongshun John Chen, Hitoshi
Kawakatsu, Satoru Tanaka, Jieyuan Ning, and James F. Ni. Changbaishan volcanism in northeast
China linked to subduction-induced mantle upwelling. Nature Geoscience, 7(6):470–475, June
2014. ISSN 1752-0894, 1752-0908. doi: 10.1038/ngeo2166.

Kai Tao, Stephen P. Grand, and Fenglin Niu. Full-waveform inversion of triplicated data using
a normalized-correlation-coefficient-based misfit function. Geophysical Journal International,
210(3):1517–1524, September 2017. ISSN 0956-540X, 1365-246X. doi: 10.1093/gji/ggx249.

Kai Tao, Stephen P. Grand, and Fenglin Niu. Seismic Structure of the Upper Mantle Beneath Eastern
Asia From Full Waveform Seismic Tomography. Geochemistry, Geophysics, Geosystems, 19(8):
2732–2763, August 2018. ISSN 1525-2027, 1525-2027. doi: 10.1029/2018GC007460.

Carl Tape, Qinya Liu, and Jeroen Tromp. Finite-frequency tomography using adjoint methods-
Methodology and examples using membrane surface waves. Geophysical Journal International,
168(3):1105–1129, March 2007. ISSN 0956540X, 1365246X. doi: 10.1111/j.1365-246X.2006.
03191.x.

Carl Tape, Qinya Liu, Alessia Maggi, and Jeroen Tromp. Adjoint Tomography of the Southern
California Crust. Science, 325(5943):988–992, August 2009. ISSN 0036-8075, 1095-9203. doi:
10.1126/science.1175298.

117

Carl Tape, Qinya Liu, Alessia Maggi, and Jeroen Tromp. Seismic tomography of the southern Cal-
ifornia crust based on spectral-element and adjoint methods. Geophysical Journal International,
180(1):433–462, January 2010. ISSN 0956540X, 1365246X. doi: 10.1111/j.1365-246X.2009.
04429.x.

P. Tapponnier, G. Peltzer, and R. Armijo. On the mechanics of the collision between India and
Asia. Geological Society, London, Special Publications, 19(1):113–157, January 1986. ISSN
0305-8719, 2041-4927. doi: 10.1144/GSL.SP.1986.019.01.07.

You Tian, Dapeng Zhao, Ruomei Sun, and Jiwen Teng. Seismic imaging of the crust and upper
mantle beneath the North China Craton. Physics of the Earth and Planetary Interiors, 172(3-4):
169–182, February 2009. ISSN 00319201. doi: 10.1016/j.pepi.2008.09.002.

Ron van der Hilst. Complex morphology of subducted lithosphere in the mantle beneath the Tonga
trench. Nature, 374(6518):154–157, 1995. doi: 10.1038/374154a0. Publisher: Nature Publish-
ing Group UK London.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez,
Łukasz Kaiser, and Illia Polosukhin. Attention is All you Need. In Advances in Neural Informa-
tion Processing Systems, volume 30. Curran Associates, Inc., 2017.

Suyun Wang, Fenglin Niu, and Guomin Zhang. Velocity structure of the uppermost mantle beneath
East Asia from Pn tomography and its dynamic implications. Journal of Geophysical Research:
Solid Earth, 118(1):290–301, January 2013. ISSN 21699313. doi: 10.1002/jgrb.50085.

Xin Wang, Qi-Fu Chen, Fenglin Niu, Shengji Wei, Jieyuan Ning, Juan Li, Weijun Wang, Johannes
Buchen, and Lijun Liu. Distinct slab interfaces imaged within the mantle transition zone. Nature
Geoscience, 13(12):822–827, December 2020.
ISSN 1752-0894, 1752-0908. doi: 10.1038/
s41561-020-00653-5.

Yu Wang, Xuemin Zhang, Chaosong Jiang, Haiquan Wei, and Jinglin Wan. Tectonic controls on
the late Miocene–Holocene volcanic eruptions of the Tengchong volcanic field along the south-
eastern margin of the Tibetan plateau. Journal of Asian Earth Sciences, 30(2):375–389, April
2007. ISSN 13679120. doi: 10.1016/j.jseaes.2006.11.005.

Michael Warner, Andrew Ratcliffe, Tenice Nangoo, Joanna Morgan, Adrian Umpleby, Nikhil Shah,
Vetle Vinje, Ivan Štekl, Lluís Guasch, Caroline Win, Graham Conroy, and Alexandre Bertrand.
Anisotropic 3D full-waveform inversion. GEOPHYSICS, 78(2):R59–R80, March 2013. ISSN
0016-8033, 1942-2156. doi: 10.1190/geo2012-0338.1.

Spahr C. Webb and Wayne C. Crawford. Long-period seafloor seismology and deformation under
ocean waves. Bulletin of the Seismological Society of America, 89(6):1535–1542, 1999. Pub-
lisher: The Seismological Society of America.

S. Shawn Wei and Y. John Chen. Seismic evidence of the Hainan mantle plume by receiver function

118

analysis in southern China. Geophysical Research Letters, 43(17):8978–8985, September 2016.
ISSN 00948276. doi: 10.1002/2016GL069513.

S. Shawn Wei, Douglas A. Wiens, Yang Zha, Terry Plank, Spahr C. Webb, Donna K. Blackman,
Robert A. Dunn, and James A. Conder. Seismic evidence of effects of water on melt transport
in the Lau back-arc mantle. Nature, 518(7539):395–398, 2015. Publisher: Nature Publishing
Group UK London.

S. Shawn Wei, Yang Zha, Weisen Shen, Douglas A. Wiens, James A. Conder, and Spahr C. Webb.
Upper mantle structure of the Tonga‐Lau‐Fiji region from Rayleigh wave tomography. Geochem-
istry, Geophysics, Geosystems, 17(11):4705–4724, November 2016.
ISSN 1525-2027, 1525-
2027. doi: 10.1002/2016GC006656.

S. Shawn Wei, Douglas A. Wiens, Peter E. van Keken, and Chen Cai. Slab temperature controls on
the Tonga double seismic zone and slab mantle dehydration. Science Advances, 3(1):e1601755,
January 2017. doi: 10.1126/sciadv.1601755. Publisher: American Association for the Advance-
ment of Science.

S. Shawn Wei, Philipp Ruprecht, Sydney L. Gable, Ellyn G. Huggins, Natalia Ruppert, Lei Gao,
and Haijiang Zhang. Along-strike variations in intermediate-depth seismicity and arc magmatism
along the Alaska Peninsula. Earth and Planetary Science Letters, 563:116878, 2021. Publisher:
Elsevier.

Songqiao Wei, Y. John Chen, Eric Sandvol, Shiyong Zhou, Han Yue, Ge Jin, Thomas M. Hearn,
Mingming Jiang, Haiyang Wang, Wenyuan Fan, Zheng Liu, Zengxi Ge, Yanbin Wang, Yongge
Feng, and James Ni. Regional earthquakes in northern Tibetan Plateau: Implications for litho-
spheric strength in Tibet. Geophysical Research Letters, 37(19):n/a–n/a, October 2010. ISSN
00948276. doi: 10.1029/2010GL044800.

Wei Wei, Jiandong Xu, Dapeng Zhao, and Yaolin Shi. East Asia mantle tomography: New insight
into plate subduction and intraplate volcanism. Journal of Asian Earth Sciences, 60:88–103,
October 2012. ISSN 13679120. doi: 10.1016/j.jseaes.2012.08.001.

J Weston, E R Engdahl, J Harris, D Di Giacomo, and D A Storchak. ISC-EHB: reconstruction of
a robust earthquake data set. Geophysical Journal International, 214(1):474–484, July 2018.
ISSN 0956-540X, 1365-246X. doi: 10.1093/gji/ggy155.

Douglas A. Wiens, Jeffrey J. McGuire, and Patrick J. Shore. Evidence for transformational faulting
from a deep double seismic zone in Tonga. Nature, 364(6440):790–793, 1993. Publisher: Nature
Publishing Group UK London.

Douglas A. Wiens, Jeffrey J. McGuire, Patrick J. Shore, Michael G. Bevis, Kitione Draunidalo,
Gajendra Prasad, and Saimon P. Helu. A deep earthquake aftershock sequence and implications
for the rupture mechanism of deep earthquakes. Nature, 372(6506):540–543, 1994. Publisher:
Nature Publishing Group UK London.

119

John D. Wilding, Weiqiang Zhu, Zachary E. Ross, and Jennifer M. Jackson. The magmatic web
beneath Hawai‘i. Science, 379(6631):462–468, February 2023. doi: 10.1126/science.ade5755.
Publisher: American Association for the Advancement of Science.

M. Witek, S.-J. Chang, D. Y. Lim, S. Ning, and J. Ning. Radial Anisotropy in East Asia From
Multimode Surface Wave Tomography. Journal of Geophysical Research: Solid Earth, 126(7):
e2020JB021201, 2021. ISSN 2169-9356. doi: 10.1029/2020JB021201.

Michael Witek, Suzan van der Lee, and Tae-Seob Kang. Rayleigh wave group velocity distributions
for East Asia using ambient seismic noise. Geophysical Research Letters, 41(22):8045–8052,
November 2014. ISSN 00948276. doi: 10.1002/2014GL062016.

Cecily J. Wolfe, Sean C. Solomon, Gabi Laske, John A. Collins, Robert S. Detrick, John A. Orcutt,
David Bercovici, and Erik H. Hauri. Mantle Shear-Wave Velocity Structure Beneath the Hawai-
ian Hot Spot. Science, 326(5958):1388–1390, December 2009. ISSN 0036-8075, 1095-9203.
doi: 10.1126/science.1180165.

Jonny Wu, John Suppe, Renqi Lu, and Ravi Kanda. Philippine Sea and East Asian plate tectonics
since 52 Ma constrained by new subducted slab reconstruction methods: PHILIPPINE SEA
PLATE TECTONICS-SLABS. Journal of Geophysical Research: Solid Earth, 121(6):4670–
4741, June 2016. ISSN 21699313. doi: 10.1002/2016JB012923.

Ziyi Xi, Songqiao Shawn Wei, Weiqiang Zhu, Greg Beroza, Yaqi Jie, and Nooshin Saloor. Deep
Learning for Deep Earthquakes: Insights from OBS Observations of the Tonga Subduction Zone.
2023. doi: 10.31223/X5C105. Publisher: EarthArXiv.

Jiayi Xie, Michael H. Ritzwoller, Weisen Shen, Yingjie Yang, Yong Zheng, and Longquan
Zhou. Crustal radial anisotropy across Eastern Tibet and the Western Yangtze Craton. Jour-
nal of Geophysical Research: Solid Earth, 118(8):4226–4252, 2013.
ISSN 2169-9356. doi:
10.1002/jgrb.50296.

Yi-Gang Xu, Jin-Long Ma, Frederick A. Frey, Mark D. Feigenson, and Jiang-Feng Liu. Role of
lithosphere–asthenosphere interaction in the genesis of Quaternary alkali and tholeiitic basalts
from Datong, western North China Craton. Chemical Geology, 224(4):247–271, December
2005. ISSN 00092541. doi: 10.1016/j.chemgeo.2005.08.004.

Tadashi Yamasaki and Tetsuzo Seno. Double seismic zone and dehydration embrittlement of the
subducting slab. Journal of Geophysical Research: Solid Earth, 108(B4):2002JB001918, April
2003. ISSN 0148-0227. doi: 10.1029/2002JB001918.

Yu Yang, Jianshe Lei, Yinshuang Ai, Guangwei Zhang, Changqing Sun, Enbo Fan, Long Li, Qi Mi,
Mingwen Lu, Jing He, Jian Wang, Mofei Du, Bing Zhang, Fanfan Tian, Chen Ma, and Zemin
Liu. Crustal structure beneath Northeast China from ambient noise tomography. Physics of the
Earth and Planetary Interiors, 293:106257, August 2019. ISSN 00319201. doi: 10.1016/j.pepi.
2019.04.008.

120

Huajian Yao, Robert D. Van Der Hilst, and Maarten V. De Hoop. Surface-wave array tomogra-
phy in SE Tibet from ambient seismic noise and two-station analysis - I. Phase velocity maps.
Geophysical Journal International, 166(2):732–744, August 2006. ISSN 0956540X, 1365246X.
doi: 10.1111/j.1365-246X.2006.03028.x.

Huajian Yao, Caroline Beghein, and Robert D. Van Der Hilst. Surface wave array tomography
in SE Tibet from ambient seismic noise and two-station analysis - II. Crustal and upper-mantle
structure. Geophysical Journal International, 173(1):205–219, April 2008. ISSN 0956540X,
1365246X. doi: 10.1111/j.1365-246X.2007.03696.x.

An Yin and T. Mark Harrison. Geologic Evolution of the Himalayan-Tibetan Orogen. Annual
Review of Earth and Planetary Sciences, 28(1):211–280, May 2000. ISSN 0084-6597, 1545-
4495. doi: 10.1146/annurev.earth.28.1.211.

Yanhua O Yuan, Ebru Bozdağ, Caio Ciardelli, Fuchun Gao, and F J Simons. The exponenti-
ated phase measurement, and objective-function hybridization for adjoint waveform tomogra-
phy. Geophysical Journal International, 221(2):1145–1164, May 2020. ISSN 0956-540X. doi:
10.1093/gji/ggaa063.

Han Yue, Y. John Chen, Eric Sandvol, James Ni, Thomas Hearn, Shiyong Zhou, Yongge Feng,
Zengxi Ge, Andrea Trujillo, Yanbin Wang, Ge Jin, Mingming Jiang, Youcai Tang, Xiaofeng
Liang, Songqiao Wei, Haiyang Wang, Wenyuan Fan, and Zheng Liu. Lithospheric and upper
mantle structure of the northeastern Tibetan Plateau. Journal of Geophysical Research: Solid
Earth, 117(B5), May 2012. ISSN 01480227. doi: 10.1029/2011JB008545.

Zhongwen Zhan. Mechanisms and Implications of Deep Earthquakes. Annual Review of Earth and
Planetary Sciences, 48(1):147–174, May 2020. ISSN 0084-6597, 1545-4495. doi: 10.1146/
annurev-earth-053018-060314.

Haijiang Zhang, Fan Wang, Robert Myhill, and Hao Guo. Slab morphology and deformation be-
neath Izu-Bonin. Nature Communications, 10(1):1310, March 2019a. ISSN 2041-1723. doi:
10.1038/s41467-019-09279-7.

K.-J Zhang. North and South China collision along the eastern and southern North China mar-
ISSN 00401951. doi: 10.1016/

gins. Tectonophysics, 270(1-2):145–156, February 1997.
S0040-1951(96)00208-9.

Maoliang Zhang, Zhengfu Guo, Jiaqi Liu, Guoming Liu, Lihong Zhang, Ming Lei, Wenbin
Zhao, Lin Ma, Vincenzo Sepe, and Guido Ventura. The intraplate Changbaishan volcanic field
(China/North Korea): A review on eruptive history, magma genesis, geodynamic significance,
recent dynamics and potential hazards. Earth-Science Reviews, 187:19–52, December 2018.
ISSN 00128252. doi: 10.1016/j.earscirev.2018.07.011.

Miao Zhang, William L. Ellsworth, and Gregory C. Beroza. Rapid Earthquake Association and
Location. Seismological Research Letters, 90(6):2276–2284, September 2019b. ISSN 0895-

121

0695. doi: 10.1785/0220190052.

Dapeng Zhao, Toru Matsuzawa, and Akira Hasegawa. Morphology of the subducting slab boundary
in the northeastern Japan arc. Physics of the Earth and Planetary Interiors, 102(1):89–104, June
1997a. ISSN 0031-9201. doi: 10.1016/S0031-9201(96)03258-X.

Dapeng Zhao, Yingbiao Xu, Douglas A. Wiens, LeRoy Dorman, John Hildebrand, and Spahr
Webb. Depth Extent of the Lau Back-Arc Spreading Center and Its Relation to Subduction
Processes. Science, 278(5336):254–257, October 1997b. ISSN 0036-8075, 1095-9203. doi:
10.1126/science.278.5336.254.

Dapeng Zhao, Shigenori Maruyama, and Soichi Omori. Mantle dynamics of Western Pacific and
East Asia: Insight from seismic tomography and mineral physics. Gondwana Research, 11(1-2):
120–131, January 2007. ISSN 1342937X. doi: 10.1016/j.gr.2006.06.006.

Dapeng Zhao, You Tian, Jianshe Lei, Lucy Liu, and Sihua Zheng. Seismic image and origin of
the Changbai intraplate volcano in East Asia: Role of big mantle wedge above the stagnant Pa-
cific slab. Physics of the Earth and Planetary Interiors, 173(3-4):197–206, April 2009. ISSN
00319201. doi: 10.1016/j.pepi.2008.11.009.

Tianyu Zheng, Ling Chen, Liang Zhao, Weiwei Xu, and Rixiang Zhu. Crust–mantle structure
difference across the gravity gradient zone in North China Craton: Seismic image of the thinned
continental crust. Physics of the Earth and Planetary Interiors, 159(1-2):43–58, November 2006.
ISSN 00319201. doi: 10.1016/j.pepi.2006.05.004.

Xiu-Fen Zheng, Biao Ouyang, Dong-Ning Zhang, Zhi-Xiang Yao, Jian-Hong Liang, and Jie Zheng.
Construction of the Technical System of Data Backup Centre for China Seismograph Network
and Its Support to Wenchuan Earthquake Researches. Chinese Journal of Geophysics, 52(3):
657–662, May 2009. ISSN 08989591. doi: 10.1002/cjg2.1387.

Quan Zhou, Lijun Liu, and Jiashun Hu. Western US volcanism due to intruding oceanic mantle
driven by ancient Farallon slabs. Nature Geoscience, 11(1):70–76, January 2018. ISSN 1752-
0894, 1752-0908. doi: 10.1038/s41561-017-0035-y.

Tong Zhou, Ziyi Xi, Min Chen, and Jiaqi Li. Assessment of seismic tomographic models of
the contiguous United States using intermediate-period 3-D wavefield simulation. Geophysi-
cal Journal International, 228(2):1392–1409, November 2021. ISSN 0956-540X, 1365-246X.
doi: 10.1093/gji/ggab406.

Tong Zhou, Jiaqi Li, Ziyi Xi, Guoliang Li, and Min Chen. CUSRA2021: A Radially Anisotropic
Model of the Contiguous US and Surrounding Regions by Full‐Waveform Inversion. Journal of
Geophysical Research: Solid Earth, 127(8), August 2022. ISSN 2169-9313, 2169-9356. doi:
10.1029/2021JB023893.

X Zhou and R Armstrong. Cenozoic volcanic rocks of eastern China — secular and geographic

122

trends in chemistry and strontium isotopic composition. Earth and Planetary Science Letters,
58(3):301–329, May 1982. ISSN 0012821X. doi: 10.1016/0012-821X(82)90083-8.

Ying Zhou. Anomalous mantle transition zone beneath the Yellowstone hotspot track. Na-
ISSN 1752-0894, 1752-0908. doi: 10.1038/

ture Geoscience, 11(6):449–453, June 2018.
s41561-018-0126-4.

Hejun Zhu, Ebru Bozdağ, Daniel Peter, and Jeroen Tromp. Seismic wavespeed images across the
Iapetus and Tornquist suture zones. Geophysical Research Letters, 39(18), September 2012a.
ISSN 00948276. doi: 10.1029/2012GL053053.

Hejun Zhu, Ebru Bozdağ, Daniel Peter, and Jeroen Tromp. Structure of the European upper mantle
revealed by adjoint tomography. Nature Geoscience, 5(7):493–498, July 2012b. ISSN 1752-
0894, 1752-0908. doi: 10.1038/ngeo1501.

Hejun Zhu, Ebru Bozdağ, and Jeroen Tromp. Seismic structure of the European upper mantle
based on adjoint tomography. Geophysical Journal International, 201(1):18–52, April 2015.
ISSN 1365-246X, 0956-540X. doi: 10.1093/gji/ggu492.

RiXiang Zhu, Ling Chen, FuYuan Wu, and JunLai Liu. Timing, scale and mechanism of the de-
struction of the North China Craton. Science China Earth Sciences, 54(6):789–797, June 2011.
ISSN 1674-7313, 1869-1897. doi: 10.1007/s11430-011-4203-4.

Weiqiang Zhu and Gregory C Beroza. PhaseNet: a deep-neural-network-based seismic arrival-
time picking method. Geophysical Journal International, 216(1):261–273, January 2019. ISSN
0956-540X. doi: 10.1093/gji/ggy423.

Weiqiang Zhu, Ian W. McBrearty, S. Mostafa Mousavi, William L. Ellsworth, and Gregory C.
Beroza. Earthquake Phase Association Using a Bayesian Gaussian Mixture Model. Journal
of Geophysical Research: Solid Earth, 127(5):e2021JB023249, 2022. ISSN 2169-9356. doi:
10.1029/2021JB023249.

Weiqiang Zhu, Ettore Biondi, Jiaxuan Li, Jiuxun Yin, Zachary E. Ross, and Zhongwen Zhan. Seis-
mic Arrival-time Picking on Distributed Acoustic Sensing Data using Semi-supervised Learning,
March 2023a. arXiv:2302.08747 [physics].

Weiqiang Zhu, Alvin Brian Hou, Robert Yang, Avoy Datta, S Mostafa Mousavi, William L
Ellsworth, and Gregory C Beroza. QuakeFlow: a scalable machine-learning-based earthquake
monitoring workflow with cloud computing. Geophysical Journal International, 232(1):684–
693, January 2023b. ISSN 0956-540X. doi: 10.1093/gji/ggac355.

Jiahui Zuo, Liwei Wang, and Fenglin Niu. Multiple source downwellings beneath eastern North
China revealed by 3-D CCP migration of receiver function data. Journal of Asian Earth Sciences,
192:104266, May 2020. ISSN 13679120. doi: 10.1016/j.jseaes.2020.104266.

123