NEW QUANTUM ALGORITHMS AND ANALYSES FOR HAMILTONIAN SIMULATION

By Jacob Watkins

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Physics—Doctor of Philosophy

2024

ABSTRACT

Digital quantum Hamiltonian simulation is, by now, a relatively mature field of study; however, new investigations are justified by the importance of quantum simulation for scientific and societal applications. In this dissertation, we discuss several advances in circuit-based Hamiltonian simulation. First, following two introductory chapters, we consider the mitigation of Trotter errors using Chebyshev interpolation, a standard yet powerful function approximation technique. Implications for estimating time-evolved expectation values are discussed, and a rigorous analysis of errors and complexity shows near-optimal estimation of dynamical expectation values using only Trotter steps and constant overhead. We supplement our theoretical findings with numerical demonstrations on a 1D random Heisenberg model.

Next, we introduce a computational reduction from time dependent to time independent Hamiltonian simulation based on the standard (𝑑, 𝑑′) technique. Our approach achieves two advances. First, we provide an algorithm for simulating time dependent Hamiltonians using qubitization, an optimal algorithm that cannot handle time-ordering directly. Second, we provide an algorithm for time dependent simulation using a natural generalization of multiproduct formulas, achieving higher accuracies than product formulas while retaining commutator scaling. Rigorous performance analyses are carried out for both algorithms, and simple numerics demonstrate the effectiveness of the multiproduct procedure at reducing Trotter error.

Finally, we consider several practical methods for near-term quantum simulation. First, we consider the analog quantum simulation of bound systems with discrete scale invariance using trapped-ion systems, with applications to Efimov physics. Next, we discuss the Projected Cooling Algorithm, a method for preparing bound states of non-relativistic quantum systems with localized interactions based on the dispersion of unbound states. Lastly, we discuss the Rodeo Algorithm, a probabilistic, iterative, phase-estimation-like protocol which is resource-frugal and effective at measuring and preparing eigenstates. Concluding remarks and possible future directions of research are given in a brief final chapter.

Dedicated to my late grandmother, Elizabeth, who swore to be the first to call me "Dr. Watkins." (I told her my committee would beat her to it. She insisted otherwise.)

ACKNOWLEDGEMENTS

This work is the culmination of six years of graduate study in physics and quantum computing, which itself builds upon a long journey of educational and personal development. I could not have done this without the many people and institutions that have supported me along the way.

First, I thank the Earth for supporting lifeforms who do silly things like study quantum computing. May these pages not be a waste of your resources. I thank my family, who I've missed dearly, for their encouragement and support.
In particular: my mother Diana, for setting an example of good character, for being tender, and for knowing how to handle a son who was hard on himself; my grandfather Rob, for contributing to my love for the outdoors and showing me how to cut the Thanksgiving turkey; my Aunt Dawn, for being the only one who enjoys board games and karaoke as much as I do; my Uncle Gene, for teaching me to like the taste of oyster and pause to appreciate the flowers; and my cousins Ethel and Elizabeth, for being like siblings to me.

Friends, past and present, give warmth and meaning to my life. I thank my "camp friends" (you know who you are), and Camp Orkila itself, for letting me sing silly songs and instilling values I retain today. I thank the friends I made at the University of Washington, especially Brian, Frances, Katrina, and Mark. My visits to you, and yours to me, have been a source of strength during my PhD. Thanks to Keigan and Elliott for uplifting phone calls and for bringing Rosemary into this world. To Kyndra and Josh: I will always appreciate our adventures into the woods.

A long list of mentors and educators has guided me since childhood. I thank my Taekwondo instructor Eric Shields for teaching me about perseverance and indomitable spirit. I thank the wonderful educators of Shaw, Kalles, and Puyallup High who have given me a well-rounded liberal arts education. Particular thanks go to Ms. Kreiger, for calling my mother to tell her I did well on an assignment, and to Ms. Kooser, for an engaging, hands-on social studies curriculum. Thanks to Mr. Ryan, for facilitating most of my social life via the PHS band program. Of course, a heartfelt thanks to Mr. Segers, my high school physics teacher, for tolerating my (frequent) class-entry skits and my (occasional) anxieties about grades.

More recently, thanks to Ben Hall, my first office mate and friend in graduate school, for showing me how cool quantum computing is, and to Maria Violaris, for singing Taylor Swift parodies with me at the premier quantum information conference and showing me that "punting" meant more than kicking an American football. Thanks to Kirtimaan Mohan, Katie Hinko, Nick Ivanov, Rachel Barnard, Matt Oney and others who have helped me explore my interests in education while being supportive mentors and collaborators.

Though I was housed in a major nuclear science lab for most of my studies, the reader may notice that my research has strayed quite far from nuclear physics proper. I thank Dean Lee for giving me ample freedom and encouragement to explore my interests. Thanks as well to Nathan Wiebe for hosting me for a summer in Toronto, and for mentoring me in the kind of rigorous quantum algorithms research exhibited in much of this thesis. I also thank Ale Roggero for being a supportive and patient mentor and collaborator, and for being there for a young graduate student struggling through his first real research talk.

Finally, I thank my partner, Brita, for her excellent pies and shakshuka, introducing me to Margaret Atwood, and converting me to coffee. We have supported each other through the difficulties of the pandemic and come out stronger people. I love you like a rock.
TABLE OF CONTENTS

CHAPTER 1 HOUSEKEEPING
CHAPTER 2 BACKGROUND
  2.1 Quantum Mechanics and Challenges to Realism
  2.2 Computer Science and the Role of Physics
  2.3 Quantum Computing, an Overview
  2.4 The Case for Quantum-Based Quantum Simulation
  2.5 Quantum Hamiltonians: A Closer Look
  2.6 Classical Simulation Algorithms
  2.7 Quantum Simulation Algorithms
  2.8 Mathematical Reference
CHAPTER 3 TROTTER ERROR MITIGATION
  3.1 Introduction and Motivation
  3.2 Polynomial Interpolation
  3.3 Stability Analysis
  3.4 The Effective Hamiltonian
  3.5 Application to Dynamical Observables
  3.6 Numerical Demonstration
  3.7 Discussion
  3.8 Proofs
CHAPTER 4 TIME DEPENDENT HAMILTONIAN SIMULATION THROUGH DISCRETE CLOCK CONSTRUCTIONS
  4.1 Introduction and Motivation
  4.2 The Clock Space
  4.3 Finite Clock Spaces
  4.4 Time Dependent Qubitization
  4.5 Discussion
CHAPTER 5 MULTIPRODUCT FORMULAS FOR TIME DEPENDENT SIMULATION
  5.1 Introduction and Background
  5.2 Definition and Effectiveness
  5.3 Time Dependent Multiproduct Simulation
  5.4 Error Analysis
  5.5 Time Step Analysis
  5.6 Query Complexity
  5.7 Numerical Demonstrations
  5.8 Discussion
  5.9 Algorithm for Time Mesh
CHAPTER 6 A SIMULATION MIXED BAG
  6.1 Discrete Scale Invariance on Trapped-Ion Systems
  6.2 Projected Cooling Algorithm
  6.3 Rodeo Algorithm
CHAPTER 7 CONCLUSION AND OUTLOOK
BIBLIOGRAPHY

CHAPTER 1

HOUSEKEEPING

This dissertation is concerned primarily with the task of simulating a Hamiltonian on a quantum computer. What is a Hamiltonian?
Why do we want to simulate it? What is a quantum computer? Those already in the know, who want to skip to the technical advances of this thesis, are welcome to survey the chapters to find what they are looking for. I expect they are already inclined to do so! Those who want a little more context for this thesis, and to learn some of the big ideas underlying the field, are encouraged to continue on to the next chapter. There I will provide a sweeping, but necessarily brief, survey of the big ideas in the field of digital quantum Hamiltonian simulation, steadily working towards the more technical advances of this work.

The Abstract of this dissertation provides, as expected, a synopsis of all topics covered. All chapters are essentially independent of each other, with the partial exception of Chapter 5, which relies on some ideas from Chapter 4 to support a conjecture. Moreover, each chapter but Chapter 6 corresponds to a single, self-contained research project. The penultimate Chapter 6 covers a mixed bag of projects to which I contributed.

All of the projects discussed in this thesis have corresponding publications or preprints, and references to these works are provided near the beginning of the relevant chapter or section. When deciding what to include here, and what to leave to those works, I used several criteria. First, since this thesis serves as a compendium of my work, I focused chiefly on my contributions to a given project. Sometimes though, for the sake of a self-contained document, I include results and derivations that are primarily due to my collaborators. I will indicate clearly when this is the case. Finally, for various reasons, there are results from these projects that did not make their way into the corresponding publication, especially in the miscellaneous Chapter 6. By including those here, I hope to complement the publications by supporting and extending their findings through numerics or derivations.

The flavor of this thesis is analytical, in at least two senses. First, in the mathematical sense. To approach a problem "analytically" means to utilize tools of mathematical proof and derivation, in contrast to numerical calculation. The central results are proofs and analytical bounds on error and computational complexity. Numerics, however, are used to provide assurance and to see the "actual" performance in a way that complexities cannot showcase. These benchmarks are usually far from complete, suggesting an obvious path for additional research.

Second, this thesis is analytical in that it is primarily concerned with analyzing something, namely quantum algorithms. Although we propose novel simulation methods, they are typically variations on existing tools. The performance and error analyses are likely the major technical advancement of this dissertation. I believe that such careful analyses provide firm guideposts for those who wish to apply algorithms to specific use cases. Hopefully, our methods for estimating algorithmic resources can be useful for the analysis of quantum algorithms developed in the future.

Often in practice, analytical error bounds fall short of representing the typical error of a simulation method [25, 66]. This is mostly good news, meaning performance is often much better than expected. What, then, is the value of such bounds if they fail to capture the "actual" behavior of the method?
Worst-case error and resource bounds represent a first important step towards understanding the behavior and capability of a method, providing us the most robust guarantees of how well an algorithm will perform. This is only part of the picture, and while numerical experiments can provide more insight, there is additional work for theorists as well. For example, recent work on average-case hardness for Trotter simulations likely represents a step towards a fuller understanding of the "typical" hardness [23].

Without further ado, please enjoy what this dissertation has to offer. I hope you find these chapters helpful not only for technical content, but for inspiration and ideas for your own pursuits.

CHAPTER 2

BACKGROUND

I expect there will be as many readers of this thesis who are newcomers to quantum computing as those who are quantum algorithms experts. Thus, I am motivated to dedicate a chapter to provide both background and inspiration for the technical results that follow. We will start with broad scope and little detail, gradually narrowing our focus to the new stuff. More detailed background on specific research is given at the beginning of each chapter.

Quantum information science, which includes quantum computing, is a relatively young discipline which overlaps several technical fields, particularly physics and computer science. My approach to this chapter is to discuss each of these domains separately, then their surprising interconnection made most obvious by (but not reliant on) quantum computing. With these ingredients in place, we then introduce Hamiltonian simulation, which relates to the computation of either closed, naturally occurring quantum systems or problems with equivalent mathematical structure. I hope the reader finds these short surveys valuable to understand the more particular and technical work of later chapters.

2.1 Quantum Mechanics and Challenges to Realism

The developments in physics that began at the turn of the 20th century were, in many respects, parallel with those found in the arts in that same period. As the modernists eschewed accurate portrayals for abstract figures and geometries, the physicists grappled with phenomena increasingly removed from regular experience. And, like modernist art, these new ways of doing physics were met with some backlash. Despite this, the resulting theories, namely relativity and quantum mechanics, were better at explaining the world around them than the earlier "classical" physics. Yet their character was so strange that it led some physicists, notably Dirac, to emphasize mathematics over the senses in formulating physical theories. ("I learned to distrust all physical concepts as the basis for a theory. Instead one should put one's trust in a mathematical scheme, even if the scheme does not appear at first sight to be connected with physics." [92])

This is especially true for quantum mechanics, which on the surface said a number of very strange things. Particles were neither here nor there until measured, the story goes, seemingly defying the scientific tenet of realism. Particles may be waves, waves may be particles. More truthful than such common refrains is that quantum mechanics provides a relatively well-defined framework for accurate calculations of particles, atoms, molecules, and nuclei, even as the theory appeared unintuitive or even nonsensical. The mathematics of quantum mechanics nicely captured a variety of phenomena which eluded classical treatment, but the classical theory made more sense.
Yet even today, the meaning of quantum mechanics remains largely unresolved. In response to this absurdity, the prevailing attitude amongst quantum physics practitioners is captured in the pithy mandate: "Shut up and calculate." The meaning: don't worry about what the theory means, per se, just worry about what it predicts. While it's easy to criticize this point of view, delaying thorny questions of interpretation arguably allowed for more rapid understanding of physical phenomena in the decades following the invention of quantum mechanics.

One unsettling aspect of quantum mechanics is its intrinsic nondeterminism. The theory only predicts probabilities of certain outcomes in a physical experiment, where "experiment" is interpreted broadly as any means by which observers (such as people) experience the world around them. While probabilities had appeared earlier in statistical mechanics and its connection to thermodynamic entropy, their appearance here in a fundamental physical theory was a notable break from the past, and carried unsettling philosophical implications. It led Einstein, Podolsky, and Rosen to argue that quantum mechanics was actually an incomplete theory of reality [42], and that a more complete understanding, even if impossible to achieve by mere mortals, would reveal an underlying determinism. Some alternate theories, most notably Bohmian mechanics [55], purport to restore determinism to quantum mechanics by relegating all chance to inaccessible knowledge of the particle trajectories. It was shown in a groundbreaking work by John Bell that certain reasonable "hidden variable theories" make predictions distinct from those of quantum mechanics [12]. Experiments on the matter came out in favor of quantum theory [7, 49], and these contributions were rewarded with the 2022 Nobel Prize in Physics. The bottom line is that it is hard to restore determinism, or even conventional probability, to quantum mechanics without violating other cherished physical principles such as locality. In contrast to hidden-variable theories, which seem to deny quantum mechanics as it is, the many worlds interpretation [38] asks us to take quantum theory at face value, including the reality of the quantum wavefunction as a description of all phenomena.

So far, I have emphasized that quantum physics is a radical departure, conceptually, from classical physics. Yet this belies the fact that, in formulating quantum mechanical models, classical models are often used as starting points. It is easiest to start with existing tools when trying to create new ones. The process of taking a classical theory and tweaking it to describe a quantum system is known as "quantization." It turns out that not all ideas from classical physics are equally suitable for quantization. For example, the Newtonian framework, in which changes in motion are generated by forces, does not correspond well to quantum mechanical principles. Rather, the most natural jumping-off point for quantum mechanics is the Hamiltonian formulation, named after the Irish mathematician, astronomer, and physicist William Rowan Hamilton. In this framework, a classical system has physical configurations given by a number of coordinates π‘ž. For example, the location and orientation of an airplane may be exactly represented by a set of 6 numbers. Each coordinate π‘ž has a corresponding conjugate momentum 𝑝, which in simple cases may be seen as expressing a "velocity" for π‘ž.
Specifying all coordinates π‘ž and momenta 𝑝 gives a complete specification of the system in the sense that any "observable quantity" 𝑂 is a function 𝑂(π‘ž, 𝑝) of the coordinates and momenta. One uniquely special observable is the Hamiltonian 𝐻(π‘ž, 𝑝), which provides the total energy of the system as a function of its coordinates and momenta. It also contains all information about future states of the physical system. That is, the classical Hamiltonian defines a set of differential equations

π‘‘π‘ž/𝑑𝑑 = πœ•π»/πœ•π‘, 𝑑𝑝/𝑑𝑑 = βˆ’πœ•π»/πœ•π‘ž (2.1)

which, when solved, provide the state of the system at any subsequent time. More succinctly, 𝐻 encodes the dynamics of the physical system in question. The importance to physics is immediate. One of the primary goals of physics is to understand a phenomenon well enough to make future predictions given current data. Prediction is more powerful, and impressive, than retroactive explanation. The Hamiltonian provides all the information needed to make these predictions.

In an exactly analogous manner, the quantum Hamiltonian 𝐻 (we use the same symbol) encodes all of the dynamics of a closed quantum system. Given a system described by wavefunction |πœ“βŸ©, the dynamics are found by solving the famous SchrΓΆdinger equation

π‘–πœ•π‘‘ |πœ“π‘‘βŸ© = 𝐻 |πœ“π‘‘βŸ© (2.2)

Formally, 𝐻 in the quantum setting is a Hermitian operator on a Hilbert space, and |πœ“π‘‘βŸ© is a vector-valued function of time taking values in that space. The main point is that solving (2.2) is a fundamentally important task for understanding the dynamics of physical phenomena. Solving this equation allows for understanding the formation of the elements, the properties of molecules and materials, and the fundamental constituents of nature.

In practice, solving (2.2) is far too difficult with even today's best computational devices and cleverest tricks. Instead, people come up with a number of clever partial solutions and approximation schemes, to various degrees of success. In talking about what can be computed efficiently and what cannot, however, we have already begun to enter a different domain worth discussion: computer science.

2.2 Computer Science and the Role of Physics

In Quantum Computing Since Democritus, Scott Aaronson writes that computer science "is a bit of a misnomer." Rather than being about computers, in the particular sense of desktops, servers, and smart phones, he views it as "the study of the capacity of finite beings such as us to learn mathematical truths." [2] Aaronson understands that "mathematical truths" encompasses more than what is sought by professional mathematicians. It could involve finding the shortest route to work (Dijkstra's Algorithm), or predicting protein structure from an amino acid sequence [74]. Such tasks, in light of modern science, are likely viewed as being intrinsically mathematical and thus within the domain of computation. However, with recent developments in artificial intelligence, particularly Large Language Models (e.g., ChatGPT), even more domains of human activity have been made amenable to computational treatment.

The diversity of computational problems is paralleled by the diversity of entities which can serve as a computational medium. Indeed, prior to the development of general-purpose, digital computers in the mid-20th century (and even following that), the term "computer" referred to human computers, who performed much of the calculations for scientific, industrial, and governmental applications (see, e.g., [117]).
While standard, modern computers are electronic, computers could in principle be made from billiard balls [48] or even water [3]. In some sense, however, all of these computers are equivalent to a Turing machine, an idealized computer consisting of a tape for manipulating symbols, an input program, and several internal states. One way to express this equivalence is that a Turing machine may be used to simulate the calculations performed by any of these other models. The hypothesis that "computable by Turing machine" captures the notion of what is computable is referred to as the Church-Turing thesis [32]. No serious challenge to this thesis has been sustained in the near-century since it was proposed. If the thesis holds, it seems to suggest that details about the physical system performing the computation may be abstracted away. If so, one shouldn't expect questions in physics to have much bearing on computer science, besides the practicalities of engineering an effective computational device.

Although understanding computability, i.e. what problems may be solved by computation, is important, we are left without an understanding of which problems are "practically" solvable on real-world computers. For example, let's return to the SchrΓΆdinger equation (2.2). This equation can be solved straightforwardly, to arbitrary accuracy, given enough time, space, and energy. However, the amount of these resources needed is prohibitively large for interesting instances. No one wants to wait 1000 years for a single result. Questions of what may be computed efficiently fall under the purview of the subfield of computational complexity. Nowadays, most aspects of theoretical computer science are concerned with complexity, and computability is a relatively closed subject.

If we are actually interested in computational complexity, rather than computability, it makes sense to consider a modified Church-Turing thesis which deals more with the former than the latter. In particular, we may ask: Is "efficiently computable by Turing machine" equivalent to "efficiently computable by actual, physical computers"? The so-called strong Church-Turing thesis is the assertion that this is the case, and because of the new word "efficiently" in the above question, the claim is indeed stronger than what either Church or Turing originally proposed.

Though the claim is stronger, the evidence for its truth is correspondingly weaker. In fact, existing evidence suggests that quantum computers, if constructed, could efficiently solve problems that Turing machines cannot [118, 120]. The potential for quantum computers to solve problems relevant to society has driven major investment from government and industry [53]. From a more fundamental perspective, the power of quantum computing suggests a greater interplay between physics and computer science than has been historically explored. Are there other physically realizable models of computation even more powerful than classical or quantum computers? In this direction, work by Aaronson has shown how computers based on hidden variable theories would be slightly more powerful than standard quantum computers [1]. The upshot of these developments is that physics seems to play an essential role in a fundamental computer science question: what computational tasks may be efficiently performed?

2.3 Quantum Computing, an Overview

In the last section, we rapidly converged on the notion of a quantum computer, and here we discuss in more detail what this means.
As we approach our primary topic, Hamiltonian simulation on a quantum computer, I will use increasingly precise and technical language, and no longer avoid mathematics. Readers with background in linear algebra and complex numbers are encouraged to consult standard resources for more thorough introductions to quantum computing [101].

At a high level, a quantum computer is nothing more than a computer based on the laws of quantum mechanics. Any computation requires, abstractly, the encoding of information and its manipulation by certain operations to achieve a result. For example, a classical computer may perform addition by storing two numbers in binary registers, then manipulating these registers in a specified way to get the sum on one of the two registers. A quantum computer, by contrast, stores its information as quantum states, and manipulates these states. As a caution, although any laptop is describable, in principle, in quantum mechanical terms, the way it stores information and performs operations is most aptly described as "classical." This is true of essentially any computational device, save the handful of quantum computers under development today.

How do we model the workings of a quantum computer concretely? Just like with classical computers, many possible computational models exist. Quantum Turing machines [37], measurement-based [111], and adiabatic quantum computation [4] are several well-explored examples. But the most popular approach by far is the circuit-based model of quantum computation, which we will now explain in detail. The reader will benefit from having some background in the classical circuit model of computation, or experience with real digital logic circuits.

Figure 2.1 provides an example of a quantum circuit. As with the classical circuit model, quantum circuits have wires and gates that feed forward (no feedback loops), but here the wires contain quantum information. Instead of well-defined bits in the 0 or 1 state, wires carry quantum bits, or qubits. In isolation, a qubit can have a state of the form

𝑐0|0⟩ + 𝑐1|1⟩. (2.3)

Here, the symbols |0⟩ and |1⟩ are "kets" which take on a precise meaning as orthonormal vectors in a two-dimensional complex inner product space, but can be thought of informally as the "definite" states that the qubit can take. The coefficients 𝑐𝑖 are complex numbers that are often called "amplitudes." The fact that 𝑐𝑖 does not have to be a positive real number is the crucial difference between quantum computing and probabilistic, a.k.a. Monte Carlo, computing. The probability of measuring 0 or 1 is given by |𝑐0|^2 and |𝑐1|^2, respectively, but before a measurement is performed the amplitudes can exhibit interference. In order to have total probability one, we must have the following normalization condition.

|𝑐0|^2 + |𝑐1|^2 = 1 (2.4)

For vectors |πœ“βŸ© and |πœ™βŸ©, their inner product is denoted βŸ¨πœ™|πœ“βŸ©. In this language, the normalization condition for a qubit in state |πœ“βŸ© can be written as βŸ¨πœ“|πœ“βŸ© = 1. In this thesis, the term state vector will mean the normalized vector used to mathematically represent the state of our quantum system. Many readers will be familiar with the more general density matrix representation of quantum states, but because of our focus on closed-system dynamics we will not have much need for this more complicated formalism.

Figure 2.1 Example of a quantum circuit on three qubits (the gates pictured include 𝑋, 𝐻, 𝑇, and 𝑆).
Information is carried on the wires as a collective quantum state, which is a superposition of possible values of the bitstrings on the register. This information is manipulated by gates which act on one or more of the qubits. Partial information about the quantum state is obtained by measurements, here represented by meters. The measurements also affect the state in accordance with quantum mechanics.

For any interesting computation, combining multiple qubits together will be necessary. Such a collection will be referred to as a quantum register. As with combining any two (distinguishable) quantum mechanical systems, joint qubit states are described formally through the tensor product of the individual state spaces. Any state vector |Ψ⟩ on the joint system is a linear combination of product vectors of the form

|πœ™βŸ© βŠ— |πœ’βŸ© (2.5)

where |πœ™βŸ© and |πœ’βŸ© are state vectors on each individual space. As an important example, a state vector on a collection of 𝑛 qubits may be generally expressed as a sum

|πœ“βŸ© = βˆ‘_{π‘βˆˆ{0,1}^𝑛} 𝑐𝑏 |π‘βŸ© (2.6)

over all bitstrings 𝑏 = 𝑏1𝑏2 . . . 𝑏𝑛. There is an underlying tensor product |π‘βŸ© ≑ |𝑏1⟩ βŠ— |𝑏2⟩ βŠ— Β· Β· Β· βŠ— |π‘π‘›βŸ© ≑ |𝑏1𝑏2 . . . π‘π‘›βŸ© that is often convenient to leave implicit. Whenever the joint state |Ψ⟩ cannot be written as a product vector, it is said to be entangled. Generically speaking, almost all quantum states are entangled, in the sense that a randomly chosen state has vanishingly small probability of being a product state for systems with more than a small number of constituents [144]. Entanglement is a necessary condition for quantum computers to exhibit superior performance over classical computers, though identifying the source of the "power" of quantum computation is a somewhat subtle issue [69].

To summarize, the objects of our quantum computer are qubits, and their collective state is given by a quantum state vector, i.e. a normalized vector in the tensor product of the individual qubit vector spaces. More concisely, this is just a normalized vector in C^(2^𝑛) for 𝑛 qubits. Measuring the qubits returns a bitstring with probability given by the squared amplitude.

We must now establish the appropriate operations for our quantum computer. Naturally, our operations should not take us outside the set of allowed states, namely the quantum state vectors described above. Moreover, empirical evidence suggests that quantum mechanical operations are all linear, so that operations on the full state may be understood by considering the operations on each component of the superposition. We are therefore led to consider gates as unitary operations acting on a subset of qubits. Unitary operations can be defined in a number of equivalent ways, all of which relate to the idea of preserving the norm of the state vector. More precisely, a unitary π‘ˆ is a linear operator on an inner product space such that

βˆ₯π‘ˆ|π‘£βŸ©βˆ₯ = βˆ₯|π‘£βŸ©βˆ₯ (2.7)

for any vector |π‘£βŸ© in the space, where βˆ₯ Β· βˆ₯ represents the Euclidean, or 𝐿2, norm. A quantum gate, or simply "gate" in this context, is a unitary operation acting on a small number of qubits in a circuit. We generally represent gates as boxes with wires passing through.

Today, computer programmers rarely work at the level of digital logic and gates on their laptops. Instead, programmers work at higher levels of abstraction to skirt minute details, accomplishing more as a result.
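Before climbing the ladder of abstraction ourselves, it is worth seeing the low-level linear algebra once in executable form. The following is a minimal state-vector sketch in Python (an illustration added here, not drawn from the referenced works): it builds a two-qubit register via tensor products, applies unitary gates, and checks that the norm, and hence total probability, is preserved.

```python
import numpy as np

# Computational basis states |0> and |1> as vectors in C^2.
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# Hadamard gate: a unitary taking |0> to the superposition (|0> + |1>)/sqrt(2).
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

# CNOT on two qubits: flips the second qubit when the first qubit is |1>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# Joint states live in the tensor product space, cf. (2.5); kron implements it.
psi = np.kron(ket0, ket0)                 # |00>
psi = CNOT @ np.kron(H, np.eye(2)) @ psi  # CNOT (H tensor I) |00>, a Bell state

print(np.linalg.norm(psi))   # ~1.0: unitaries preserve the norm, cf. (2.7)
print(np.abs(psi) ** 2)      # [0.5, 0, 0, 0.5]: measurement probabilities, cf. (2.4)
```

The final state cannot be written as a product vector, so the two qubits end the circuit entangled.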
In this thesis, we will seek to analyze quantum algorithms, and it will be easier and more insightful if we consider larger chunks, or subroutines, used to carry out the method. In the analysis, we will call these subroutines oracles: "black boxes" that are used as part of the algorithm, whose inner workings we either don't know or delay considering. In the quantum context, the oracle 𝑂 will be a unitary operation on some number of qubits.

As an example of an oracle, consider a function 𝑓 from π‘š-bit to 𝑛-bit strings. From this classical operation, we can define a unitary oracle π‘ˆ_𝑓 that computes 𝑓. Given two quantum registers of length π‘š and 𝑛 in state |π‘₯⟩ βŠ— |π‘ŽβŸ© ≑ |π‘₯⟩|π‘ŽβŸ©, define π‘ˆ_𝑓 as

π‘ˆ_𝑓 |π‘₯⟩|π‘ŽβŸ© = |π‘₯⟩|𝑓(π‘₯) βŠ• π‘ŽβŸ© , (2.8)

where βŠ• is addition modulo 2, and extend the definition to general vectors by linearity. It can be checked that π‘ˆ_𝑓 is unitary, and when π‘Ž = 0 it is clear that π‘ˆ_𝑓 computes 𝑓(π‘₯). Such oracles are used, for example, in Shor's order-finding algorithm, where 𝑓(π‘₯) = π‘Žπ‘₯ mod 𝑁 is modular multiplication by some integer π‘Ž. In the context of this dissertation, we will define oracles that compute the parameters of a Hamiltonian or evolve a quantum register according to some Hamiltonian. Oracles may also encode specific observables to be measured during a quantum algorithm.

Without knowing the computational cost of implementing an oracle 𝑂, it is impossible to know the cost of any algorithm utilizing 𝑂 as a subroutine. One of the benefits of using oracles is to break down the problem into smaller pieces: first implementing the oracle, then implementing the algorithm given the oracle. Another benefit is abstraction: we can analyze the general oracle problem, then apply those results to any particular instance of that oracle. In the example above, this might entail different functions 𝑓, some of which are easy to compute, others of which are uncomputable!

2.4 The Case for Quantum-Based Quantum Simulation

The advent of quantum mechanics carried great promise to better understand a range of physical phenomena, but challenges remained that were more about computational feasibility than theoretical understanding. As expressed by Paul Dirac [39] regarding the invention of quantum mechanics,

The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.

Though there appears "only" one barrier to solving all of chemistry, it is indeed a very large one. Massive supercomputing resources are needed to solve, starting from quantum mechanics, even relatively small molecules. As a result, heuristic and phenomenological techniques are employed in practice, which are less expensive but also less reliable and general. Examples of computational techniques for quantum many-body problems include a slew of variational techniques, such as Hartree-Fock and Coupled Cluster, and dynamical methods such as Molecular Dynamics [33]. Despite these clever and well-established approaches, we emphasize that there remains a lack of general, efficient methods for classical simulation of quantum dynamics.
Although efficient quantum simulation would likely not obviate the need for high-level, domain-specific concepts in chemistry, materials science, etc., such simulations would nevertheless be valuable to science and human knowledge. Unfortunately, it is believed that classical computers could never simulate quantum physics efficiently, meaning they couldn't solve the SchrΓΆdinger equation (2.2) in interesting instances without intractable amounts of resources. After a hundred years of effort, effective classical methods for computing general quantum properties have simply not been found, although discovering such methods would be both useful and philosophically profound.

Some of the earliest explorations into quantum computing were motivated by the desire for efficient quantum simulation [46, 37]. As it happens, not only can quantum computers simulate physical Hamiltonians efficiently, as we shall see in Section 2.7, but also this task fully captures the power of quantum computing. More precisely, it turns out that π‘˜-local Hamiltonian simulation is BQP-complete. Here π‘˜-local refers to Hamiltonians of the form

𝐻 = βˆ‘_{𝛾=1}^{Ξ“} 𝐻𝛾 (2.9)

where each 𝐻𝛾 only operates on at most π‘˜ qubits. In words, interactions across the system are built up from interactions involving only a small number of constituents. BQP is, loosely speaking, the class of problems which may be efficiently solved by a quantum computer. "BQP-complete" refers to the fact that π‘˜-local Hamiltonian simulation is both in BQP (quantum computers can do it) and also that any problem in BQP may be encoded as a π‘˜-local simulation problem (in fact, 2-local; for experts, this encoding may require at most polynomial overhead in spatial and temporal resources). Thus, we may view digital quantum circuits and local Hamiltonian simulations as two sides of the same coin. This may not come as a surprise to many physicists, who are used to viewing all quantum operations as arising from an underlying Hamiltonian. The importance is that, if quantum computers can do anything interesting at all compared to classical computers, then Hamiltonian simulation should be one of those things.

A heuristic, imperfect argument that quantum computers can simulate quantum systems effectively is that they too are quantum mechanical. By having the computer mimic the quantum transformations occurring in the physical system, one might hope to achieve what the classical computer cannot replicate efficiently. This argument provides some insight but also, by itself, is unconvincing. What is the essential difference between quantum simulation and simply "watching" a quantum system of interest "do its thing"? There are, in fact, major differences:

1. You may not have much access to the system of interest. For example, it is challenging, expensive, or impossible to send a probe to the sun's core to gain information on nuclear processes. A protein under study may behave differently in a test tube than in situ. Essentially, there are many phenomena that are hard to measure directly under the desired circumstances.

2. There may be very little control over the parameters of the system of interest, or little ability to measure a wide variety of properties. Digital quantum simulation gives an enormous (though not limitless) degree of control over the model parameters and readout of the desired results. Identical trials of the same simulation could, in principle, be prepared as desired.
The ability to tweak aspects of the simulation leads to a greater understanding of the phenomenon under study. By contrast, barring a highly tunable experimental setup, a system "just is" and doesn't necessarily provide insight. In short, simulation is more than just experiment, both classically and quantumly. In Section 2.7, we will see how Hamiltonian simulation algorithms can be quite abstracted from any particular information about the system, thus earning designations such as "computation" and "algorithm".

The classical intractability of quantum simulation is a more subtle issue than is often let on in popular, or even research, presentations, and we take a moment to challenge these oversimplifications. The common argument starts and ends with the exponential growth in the number of elementary states as the quantum system size increases. Recall the expression (2.6) for an arbitrary 𝑛-qubit state. Without additional structure, it appears the state |πœ“βŸ© requires 2^𝑛 complex numbers to describe, one for each bitstring 𝑏. Each additional qubit doubles the number of states, and 2^100 is already an enormous number for just 100 particles. Normalization conditions do not help much, nor does the irrelevance of global phase. This amount of data is inefficient to store in memory, let alone perform operations on.

The problem with the above argument is that any physical system, quantum or not, has such a scaling: a system of 𝑛 particles with 𝑑 single-particle states has a total of 𝑑^𝑛 states. Why are there no objections to exponentiality in this context? To explore this, we look at standard probability theory, which is the closest jumping-off point for quantum mechanics. We may express the state |π‘βŸ© of 𝑛 probabilistic bits (the ket notation here is not meant to suggest anything quantum; it is just notation) in the following suggestive form.

|π‘βŸ© = βˆ‘_{π‘βˆˆ{0,1}^𝑛} 𝑝𝑏 |π‘βŸ© (2.10)

Here the collection of 2^𝑛 values 𝑝𝑏 forms a probability distribution, where 𝑝𝑏 is the probability of observing bitstring 𝑏. Compare with equation (2.6), and we seem to be facing the same conundrum. And indeed we are, if the goal is to represent the full distribution |π‘βŸ©. However, in applications, we would usually rather sample from the distribution than express it. The difference is the same as flipping a fair coin vs. writing a list (1/2, 1/2). Sampling typically requires much less spatial overhead, and can be performed using probabilistic bits and operations. The situation in quantum computing is very similar. We cannot "see" the output state |πœ“βŸ© in its entirety; rather, we measure and obtain some outcome with some probability. Thus, the source of quantum computing's power appears to come not just from an exponential number of amplitudes, but also from the way these amplitudes interfere under unitary operations.

To summarize, although we have a comprehensive theory of quantum phenomena, it is currently very difficult to compute the consequences of this theory. Further, we should not believe that generic, efficient classical methods will ever be found. On the other hand, the use of quantum computers for quantum simulation is an intuitive and promising solution to this challenge. We shall see in Section 2.7 that there are several efficient quantum algorithms for Hamiltonian simulation. For the most part, the only remaining obstacle is a very difficult engineering problem: building better quantum computers (part of the solution will likely entail better protocols for correcting errors).
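To make the sampling-versus-representation contrast above concrete, here is a tiny Python sketch (my own illustration, not taken from the cited literature). Writing down the uniform distribution over 60-bit strings is hopeless, while drawing a sample from it costs only 60 coin flips.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60  # sixty probabilistic bits

# Representing the full distribution |p> requires 2^n probabilities,
# here about 10^18 numbers, far beyond any realistic memory.
# probs = np.ones(2 ** n) / 2 ** n   # hopeless: do not run

# Sampling from the very same distribution costs only n coin flips per shot.
sample = rng.integers(0, 2, size=n)
print("".join(map(str, sample)))  # one bitstring drawn from the uniform |p>
```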
Building effective quantum hardware is, in its own right, a fascinating and difficult problem. At the time of writing, many different platforms for quantum computing are being actively developed or researched [19, 17, 65, 47], and after enormous investments [53] the technology is improving at a steady clip. However, robust error-corrected quantum computing remains out of reach. Nevertheless, this dissertation will, for the sake of analysis, typically assume the kind of noiseless (or error-corrected) quantum computing capabilities that we hope for in the future. Occasional comments on the likely effects of hardware noise are made, but detailed analysis is left for future work.

2.5 Quantum Hamiltonians: A Closer Look

Having motivated digital quantum Hamiltonian simulation, let's now elaborate on what this entails. First, strictly speaking, Hamiltonians describe closed quantum systems. Closed systems are idealizations of any real physical system, where the system of interest is perfectly isolated from any external influence. Though this is a useful concept in all areas of physics, closed systems are in some sense far from typical. Indeed, the science of decoherence suggests that the apparent classicality of our world emerges from open system dynamics [143]. For open systems, a more general framework of quantum states and operations needs to be invoked, but for our purposes the mathematical representations we've discussed thus far (state vectors, unitary operations, etc.) will suffice.

Taking physical time 𝑑 to be continuous, we imagine the state |πœ“π‘‘βŸ© at any time 𝑑 to be related to the state at a previous time 𝑠 ≀ 𝑑 via some unknown unitary operation, which we denote π‘ˆ(𝑑, 𝑠).

|πœ“π‘‘βŸ© = π‘ˆ(𝑑, 𝑠)|πœ“π‘ βŸ© (2.11)

What properties should π‘ˆ possess? Tacit in our notation is the assumption that π‘ˆ does not depend on the states |πœ“π‘ βŸ© and |πœ“π‘‘βŸ©, meaning that the dynamical laws themselves do not care about the specific states involved. This is also the situation we find in classical mechanics. As another reasonable property, we might expect various π‘ˆ to chain together naturally when applied in succession.

π‘ˆ(𝑑, 𝑠) = π‘ˆ(𝑑, π‘Ÿ)π‘ˆ(π‘Ÿ, 𝑠), 𝑠 ≀ π‘Ÿ ≀ 𝑑 (2.12)

As a reasonable corollary, we have π‘ˆ(𝑑, 𝑑) = 𝐼. It is sensible to define π‘ˆ(𝑠, 𝑑) ≑ π‘ˆ(𝑑, 𝑠)^† for 𝑠 < 𝑑, and with this the transitive property above generalizes to any 𝑠, π‘Ÿ, 𝑑 ∈ R. We might reasonably impose some degree of continuity on the wavefunction |πœ“π‘‘βŸ©, hence on the unitary π‘ˆ. Let's assume, at the very least, that π‘ˆ(𝑑, 𝑠) is differentiable in 𝑑 (and by symmetry, 𝑠). Taking a derivative of equation (2.11) with respect to 𝑑, we obtain

|πœ“β€²_π‘‘βŸ© = π‘ˆβ€²(𝑑, 𝑠)|πœ“π‘ βŸ© = π‘ˆβ€²(𝑑, 𝑠)π‘ˆ(𝑠, 𝑑)|πœ“π‘‘βŸ© . (2.13)

Now we have a differential equation in |πœ“π‘‘βŸ©. For consistency, we must have that π‘ˆβ€²(𝑑, 𝑠)π‘ˆ(𝑠, 𝑑) is independent of 𝑠 (this can be verified explicitly). Moreover, by using some basic properties of π‘ˆ and the product rule, it is easy to check that it is also anti-Hermitian. Thus, if

βˆ’π‘–π»(𝑑) ≑ π‘ˆβ€²(𝑑, 𝑠)π‘ˆ(𝑠, 𝑑), (2.14)

then 𝐻 is a Hermitian operator. We call 𝐻 the Hamiltonian, and with this notation we recover the famous SchrΓΆdinger equation

π‘–πœ•π‘‘ |πœ“π‘‘βŸ© = 𝐻(𝑑)|πœ“π‘‘βŸ© . (2.15)
More generally, using relation (2.11), we obtain an operator SchrΓΆdinger equation

π‘–πœ•π‘‘π‘ˆ(𝑑, 𝑠) = 𝐻(𝑑)π‘ˆ(𝑑, 𝑠) (2.16)

which, in conjunction with (2.11), provides an expression for a quantum state at any time 𝑑 following an initial time 𝑠.

To the practitioner of quantum physics, this might seem like an odd approach to the foundations of quantum dynamics. We started by postulating a unitary evolution operator π‘ˆ with certain reasonable properties, then derived a Hamiltonian which generates π‘ˆ via the SchrΓΆdinger equation. In physics, one typically starts with a Hamiltonian, then attempts to solve for π‘ˆ. It is insightful to ponder the reasons for this. As discussed in Section 2.1, the Hamiltonian concept is closely linked to classical mechanics, and hence serves as a more natural starting point for humans to model quantum phenomena. Perhaps more fundamentally, a compact, elementary description of π‘ˆ instead of 𝐻 would be like having the cheat code for solving any quantum physics problem. By analogy, while the forces on classical systems may be more or less easy to describe, predicting the resulting trajectories is a straightforward but expensive computational task. In short, having π‘ˆ instead of 𝐻 feels "too good to be true."

The primary problem of quantum simulation, then, is to solve (2.16), or (2.15) more specifically, given a description of 𝐻(𝑑). As needed for sensible physics, the solution for π‘ˆ(𝑑, 𝑠) exists and is unique with the initial condition π‘ˆ(𝑠, 𝑠) = 𝐼. More interesting is that a succinct description of the solution exists. We may write it as a so-called time ordered operator exponential

π‘ˆ(𝑑, 𝑠) = exp_T{βˆ’π‘– ∫_𝑠^𝑑 𝐻(𝜏) π‘‘πœ} (2.17)

which may be understood in a number of ways. One that is particularly relevant to this thesis is the product integration approach. Given a family of partitions {𝑑𝑗}_{𝑗=1}^𝑛 of the interval [𝑠, 𝑑], with maximum width 𝛿𝑛 tending to zero as 𝑛 β†’ ∞, a solution is given by the product integral [73]

π‘ˆ(𝑑, 𝑠) = lim_{π‘›β†’βˆž} ∏_{𝑗=1}^{π‘›βˆ’1} 𝑒^{βˆ’π‘–π»(𝑑𝑗)𝛿𝑑𝑗} (2.18)

where 𝛿𝑑𝑗 = 𝑑_{𝑗+1} βˆ’ 𝑑𝑗. One feature of this approach is that, for sufficiently large but finite 𝑛, the product represents an approximation that is also unitary, in contrast with the more common Dyson series representation. Solution (2.18) is closely linked to the idea of product formulas, which will be discussed in subsequent chapters.

An even simpler expression for π‘ˆ can be found when 𝐻 is independent of time. Philosophically, this condition amounts to the notion that the laws of physics should not change over time. In this case, (2.18) simplifies to a simple operator exponential

π‘ˆ(𝑑, 𝑠) = 𝑒^{βˆ’π‘–π»Ξ”π‘‘} (2.19)

where Δ𝑑 = 𝑑 βˆ’ 𝑠. This expression can be understood through a power series expansion or the spectral theorem for normal operators. It is remarkable that such a simple and succinct expression can be written for the solution to essentially all closed quantum dynamics. If 𝐻 is over a finite-dimensional space, i.e., a matrix, computing a partial sum

π‘ˆ(𝑑, 𝑠) β‰ˆ βˆ‘_{𝑗=0}^{𝑁} (βˆ’π‘–π»Ξ”π‘‘)^𝑗 / 𝑗! (2.20)

for a sufficiently large 𝑁 will yield an arbitrarily accurate approximation of π‘ˆ(𝑑, 𝑠). Hence, given the matrix 𝐻, the computation of π‘ˆ reduces to "just" matrix multiplication and addition. Similarly, equation (2.18) can be approximated by taking 𝑛 sufficiently large (but finite) and calculating the product of matrix exponentials.
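Both approximations are easy to try directly. The sketch below (an illustration I include here, assuming a random 4 Γ— 4 Hermitian matrix as a stand-in Hamiltonian; it is not code from the referenced works) computes the truncated Taylor sum (2.20) and a finite product of short-time exponentials in the spirit of (2.18), comparing both against scipy's matrix exponential.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Hmat = (A + A.conj().T) / 2      # random 4x4 Hermitian "Hamiltonian"
dt = 1.0

# Truncated Taylor series, cf. (2.20): U ~ sum_j (-i H dt)^j / j!
U_taylor = np.zeros_like(Hmat)
term = np.eye(4, dtype=complex)
for j in range(25):
    U_taylor = U_taylor + term
    term = term @ (-1j * Hmat * dt) / (j + 1)

# Finite product of short-time exponentials, cf. (2.18); for a
# time-independent H the product is exact up to round-off error.
nsteps = 1000
U_prod = np.linalg.matrix_power(expm(-1j * Hmat * dt / nsteps), nsteps)

U_exact = expm(-1j * Hmat * dt)
print(np.linalg.norm(U_taylor - U_exact))  # tiny truncation error
print(np.linalg.norm(U_prod - U_exact))    # near machine precision
```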
Systems that aren't finite dimensional may nevertheless be approximated to arbitrary accuracy by a sufficiently large, but finite, quantum system via discretization. Consistent with our previous discussion, we have shown that quantum dynamics may be computed using standard computational techniques; however, the exponentially large matrices involved ensure that the above approach will not be efficient.

In light of the elegant solutions to the SchrΓΆdinger equation given by expressions such as (2.19), and more generally (2.18), a mathematically inclined reader may conclude that closed-system quantum dynamics is, at the broadest level, a solved problem. However, without computational methods, we cannot extract useful information from these solutions, such as what's needed to make concrete predictions about the behavior of a physical system. In the next two sections, our goal is to clearly state a high-level procedure for carrying out such computations on both classical and quantum computers.

2.6 Classical Simulation Algorithms

We've already discussed the obstacles to classical simulation of quantum mechanics in prior sections. Despite these, the importance of the problem to physics, and more broadly natural science, has led practitioners to develop very clever methods that provide insight in limited but interesting cases. Exact diagonalization [86] of the Hamiltonian, expressed as a matrix in a suitable basis, is a guaranteed approach in principle but intractable for large systems. Quantum Monte Carlo [8] methods refer to a broad range of techniques which utilize random sampling, and are especially effective at calculating the low-lying energies of a Hamiltonian. However, these cannot be used directly for general calculations of quantum dynamics. Moreover, Monte Carlo techniques suffer a notorious sign problem, in which large quantities of various signs need to be added together to get a relatively small result. This leads to severe round-off errors from floating-point arithmetic. Indeed, it would be both interesting and surprising if Monte Carlo methods were more successful at computing quantum properties. This would suggest that, computationally, the randomness of quantum theory could be reduced to mere coin flips. For this author, the sign problem is an indication that there is more to quantum probability theory than ordinary randomness.

More recently, tensor network methods, based on Matrix Product States (MPS) or Projected Entangled Pair States (PEPS), represent the state of the art for dynamical simulation. As the name suggests, such methods involve the representation of quantum states as tensor networks rather than a vector of amplitudes. For systems with low entanglement, the "rank" of the tensors involved is not too large. This allows for efficient representation and manipulation of quantum states. Area-law bounds on entanglement, as found in lattice systems with geometrically local interactions, aid the effectiveness of such methods [43]. Moreover, noise in imperfectly isolated computers can reduce the coherence and entanglement, which can be further exploited. Thus, tensor networks have brought more quantum systems within the capabilities of classical computers, and have become a standard benchmark for testing the classical feasibility of quantum circuits.

Despite the aforementioned techniques, efficient simulation by classical means is likely unobtainable. This conclusion rests both on complexity-theoretic arguments and on the sheer effort that has already gone into making effective simulation methods.
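For a sense of scale, here is what exact diagonalization looks like in practice (a minimal sketch of my own, assuming a small 1D Heisenberg chain; it is not code from [86]). The full spectrum gives exact dynamics through the spectral theorem, but the 2^𝑛 Γ— 2^𝑛 matrices put all but small systems out of reach.

```python
import numpy as np

# Pauli matrices for a Heisenberg chain H = sum_j (XjXj+1 + YjYj+1 + ZjZj+1).
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def two_site(op, j, n):
    """op acting on neighboring sites j and j+1 of an n-site chain."""
    mats = [I2] * n
    mats[j], mats[j + 1] = op, op
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)  # build the 2^n-dimensional operator
    return out

n = 8  # each added site doubles the matrix dimension: 2^8 = 256 here
H = sum(two_site(P, j, n) for P in (X, Y, Z) for j in range(n - 1))

# Exact diagonalization: the full spectrum, hence exact time evolution...
evals, evecs = np.linalg.eigh(H)
U_t = evecs @ np.diag(np.exp(-1j * evals * 0.5)) @ evecs.conj().T  # e^{-iHt}, t=0.5

# ...but the exponential dimension makes this hopeless for large systems.
print(H.shape, evals[0])  # (256, 256) and the ground-state energy
```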
2.7 Quantum Simulation Algorithms

As we've anticipated, quantum computers provide a platform for efficient quantum simulation given reasonable assumptions on the input Hamiltonian. Figure 2.2, reprinted from [52], gives a high-level overview of the Hamiltonian simulation workflow on quantum devices.

Figure 2.2 Schematic of the quantum simulation workflow. The system of interest, represented by the upper "cloud," is first represented on the quantum computer through some correspondence given by the dashed lines. The three main tasks for simulation are state preparation |πœ“(0)⟩, time evolution by π‘ˆ(𝑑, 0), and a measurement procedure. Each of these steps has its own set of tools and methods. Reprinted from [52] with permission. Copyright 2014 by the American Physical Society.

Before any computation can be performed, the problem of interest must be encoded onto a quantum computer. This could be accomplished in several ways depending on the nature of the problem. For example, effective mappings from fermionic [94] and bosonic [116] systems to qubits have been extensively studied in the literature. We will not say much more on how to properly map a problem of interest onto a collection of qubits, though this is a crucial starting step for any attempt at quantum Hamiltonian simulation.

Having identified a mapping, one can turn to the simulation proper. An initial state |πœ“0⟩ of the simulation must be prepared by a quantum circuit given an initial "fiducial state" of the device, typically |0⟩^βŠ—π‘›. The difficulty of preparing |πœ“0⟩ depends on the nature of the problem and the qubit encoding used. Entire subdomains of digital Hamiltonian simulation are devoted to preparing effective initial states in various contexts [121, 40, 9, 99]. Although special states such as ground states and wavepackets are often sought, the arbitrariness of the initial state choice makes it difficult to do a general analysis of such methods.

In contrast, given a specific input model for the Hamiltonian 𝐻(𝑑), one can express general algorithms for generating a quantum circuit 𝑉 which approximates the time evolution operator π‘ˆ. Often, this step alone is what is referred to as "Hamiltonian simulation", though it is necessarily one part of the full process needed to extract useful information. Essentially, the problem is: given an input Hamiltonian 𝐻(𝑑), a desired simulation interval [0, 𝑑], and a desired tolerance πœ– > 0, construct a quantum circuit 𝑉 such that the output 𝑉|πœ“0⟩ is within πœ– of π‘ˆ(𝑑, 0)|πœ“0⟩.

Once the state is evolved in time, it needs to be measured. Simply having the evolved state on a quantum register is not enough, and is nothing like having the state vector written on paper. Learning the full output state |πœ“π‘‘βŸ©, known as full-state tomography, is extremely inefficient. However, in actual quantum calculations we are typically interested in a handful {𝑂𝑖} of observables. Learning these observables given |πœ“π‘‘βŸ© turns out to be a much more approachable task; standard routines such as phase estimation and amplitude estimation are sufficient [56, 67]. In Section 6.3, we will discuss the Rodeo Algorithm, a new addition to the suite of phase estimation algorithms which can perform resource-efficient measurements in the eigenbasis of a Hamiltonian using time evolution.

Despite the utility of the Preparation-Evolution-Measurement schematic, actual simulation protocols may not be purely sequential.
For example, optimal protocols for eigenvalue and expectation value measurement, particularly quantum phase estimation protocols, incorporate the Evolution step as a subroutine. We will in fact see this in the Iterative Amplitude Estimation protocol of Chapter 3. Even so, the framework is still useful in providing a program for developing simulation algorithms. In any conceivable case where a simulation is needed, we will need a way to (a) prepare a state, (b) evolve the state, and (c) measure the state.

Without additional assumptions, even quantum computers cannot solve Hamiltonian simulation efficiently in all instances. A simple counting argument shows that an $\epsilon$-approximation to an $n$ qubit unitary $U$ requires, in general, a number of elementary quantum gates exponentially large in $n$ [101]. Any such unitary may, if desired, be viewed as a time evolution operator for some Hamiltonian $H$:

$$U = e^{-iH}. \qquad (2.21)$$

Evidently, there are some Hamiltonians $H$ whose simulation requires exponential quantum resources, and is hence intractable.

This point aside, most Hamiltonians of interest are not of this nature. Physical Hamiltonians, as described previously, are naturally $k$-local, and we shall leverage this in the product formula algorithms below. In contexts removed from physics, such as solving linear systems, the Hamiltonian $H$ is a sparse matrix (with efficiently computable and locatable nonzero entries). Such assumptions ensure efficient simulations are possible, and are typically baked into the definition of "Hamiltonian simulation" without further mention. Over the past several decades, enormous progress has been made, such that there are now multiple good quantum algorithms for approximating $U(t,s)$. We now outline the three categories which, broadly speaking, classify all such methods.

2.7.1 Product Formulas

Early thinkers like Feynman and Deutsch had long claimed that quantum computers could efficiently simulate quantum mechanics, but it was Seth Lloyd's seminal algorithm, based on Trotterization, that first proved this was the case [87]. Lloyd considered $k$-local Hamiltonians of the form (2.9), in which case the exponentials $\exp\{-iH_\gamma \Delta t\}$ of each term are unitary operations on only $k$ qubits. Since $k$ remains fixed as the number of qubits $n$ grows, these exponentials can be implemented as quantum circuits of fixed depth. Moreover, these unitaries can be combined in sequence to approximate the full time evolution $\exp\{-iHt\}$. In his paper, Lloyd used the so-called first-order Trotter formula $S_1$, defined as

$$S_1(t) := e^{-iH_1 t} e^{-iH_2 t} \cdots e^{-iH_\Gamma t}. \qquad (2.22)$$

This is a unitary operator which, for small $t$, accurately approximates $U(t)$. In particular,

$$S_1(t) - e^{-iHt} \in O(t^2), \qquad (2.23)$$

where $t$ is taken asymptotically to zero. Longer simulation times can be achieved by dividing the full interval into $r$ steps:

$$S_1(t/r)^r = \left( e^{-iH_1 t/r} e^{-iH_2 t/r} \cdots e^{-iH_\Gamma t/r} \right)^r = e^{-iHt} + O(t^2/r). \qquad (2.24)$$

By taking $r$ sufficiently large, the error can be arbitrarily diminished.
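The scaling in (2.24) is easy to observe numerically. The following minimal sketch (ours, not from the original; it uses random Hermitian matrices as stand-ins for the Hamiltonian terms) checks that the error of $S_1(t/r)^r$ decreases as $1/r$.

```python
import numpy as np
from scipy.linalg import expm

# A minimal check of first-order Trotter error, eqs. (2.23)-(2.24), using
# two non-commuting terms H1, H2 (random Hermitian matrices).
rng = np.random.default_rng(0)

def rand_herm(d):
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (A + A.conj().T) / 2

d, t = 4, 1.0
H1, H2 = rand_herm(d), rand_herm(d)
U_exact = expm(-1j * (H1 + H2) * t)

def S1(dt):
    """First-order Trotter step e^{-i H1 dt} e^{-i H2 dt}."""
    return expm(-1j * H1 * dt) @ expm(-1j * H2 * dt)

for r in [1, 10, 100, 1000]:
    V = np.linalg.matrix_power(S1(t / r), r)
    err = np.linalg.norm(V - U_exact, 2)
    print(f"r = {r:5d}   spectral-norm error = {err:.2e}")
# The error decreases roughly as 1/r, consistent with O(t^2/r).
```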
More generally, product formulas are unitary approximations to $e^{-iHt}$ made by splitting the exponential along the terms $H_\gamma$ in a specified sequence. The order of a product formula $\mathcal{P}$ characterizes the degree of approximation to $U$, and is the largest integer $p \in \mathbb{Z}^+$ such that

$$U(t) - \mathcal{P}(t) \in O(t^{p+1}). \qquad (2.25)$$

In other words, the order is the largest $p$ for which $U(t)$ and $\mathcal{P}(t)$ share the same $p$th Taylor polynomial. This definition justifies our referring to (2.22) as 1st order. Product formulae exist for all orders $p$, and in fact there is a recursive procedure for generating higher order formulas from lower ones [123]. The standard examples are the so-called Suzuki-Trotter formulas. To define these, let $S_1$ be given by (2.22), and define

$$S_2(t) := \mathrm{rev}[S_1(t/2)]\, S_1(t/2) \qquad (2.26)$$

where $\mathrm{rev}$ reverses the order of the terms in the product $S_1$. It can be quickly verified via Taylor expansion that $S_2$ is second order. It is also symmetric, both in the ordering of its terms and in a time reversal sense:

$$S_2(-t) = S_2(t)^\dagger. \qquad (2.27)$$

Any product formula $\mathcal{P}$ satisfying condition (2.27) will be termed "symmetric." Symmetric product formulas have the useful property that their error series, namely the power series of $U(t) - \mathcal{P}(t)$, is an odd function. Thus, any procedure which seeks to eliminate errors term by term can skip all even powers. For any $k \in \mathbb{Z}^+$ with $k > 1$, we define $S_{2k}$ recursively as

$$S_{2k}(t) = S_{2(k-1)}^2(u_k t)\, S_{2(k-1)}((1 - 4u_k)t)\, S_{2(k-1)}^2(u_k t), \qquad (2.28)$$

where $u_k = (4 - 4^{1/(2k-1)})^{-1}$. This formula is symmetric and of order $2k$. Thus, product formulas of arbitrary order exist. However, we observe that this recursive procedure generates an exponentially increasing number of unitaries as a function of $k$, leading to impractically high costs for modest accuracy gains. It is possible to show that this feature is present for any high order formula, by considering the number of terms needed to eliminate errors term by term. Thus, in practice, only the lowest order formulas are used. Nevertheless, the existence of arbitrary order formulas is valuable theoretically for understanding the asymptotic scaling of simulation costs. Moreover, the forward and backward evolutions present in (2.28) give rise to fractal behavior for large $k$ that is itself interesting [122].
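The recursion (2.26)-(2.28) is short to implement for explicit matrices. Below is a sketch (ours, with random Hermitian matrices standing in for the Hamiltonian terms) that builds $S_{2k}$ and checks the order numerically.

```python
import numpy as np
from scipy.linalg import expm
from functools import reduce

# Suzuki-Trotter recursion (2.26)-(2.28); "terms" is the list of pieces H_l.
def S1(terms, t):
    return reduce(np.matmul, [expm(-1j * H * t) for H in terms])

def S2(terms, t):
    # rev[S1(t/2)] S1(t/2): reversed ordering times forward ordering
    return S1(terms[::-1], t / 2) @ S1(terms, t / 2)

def S2k(terms, t, k):
    """Order-2k Suzuki-Trotter formula via the recursion (2.28)."""
    if k == 1:
        return S2(terms, t)
    u = 1.0 / (4 - 4 ** (1 / (2 * k - 1)))      # u_k = (4 - 4^{1/(2k-1)})^{-1}
    A = S2k(terms, u * t, k - 1)
    B = S2k(terms, (1 - 4 * u) * t, k - 1)
    return A @ A @ B @ A @ A

# Order check against the exact evolution: for k = 2 the local error is
# O(t^5), so halving t should shrink the error by roughly a factor of 32.
rng = np.random.default_rng(1)
X = rng.normal(size=(4, 4)); H1 = (X + X.T) / 2
Y = rng.normal(size=(4, 4)); H2 = (Y + Y.T) / 2
for t in [0.1, 0.05]:
    err = np.linalg.norm(S2k([H1, H2], t, 2) - expm(-1j * (H1 + H2) * t), 2)
    print(t, err)
```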
Product formulas satisfy the useful property that, in the case where all the $H_\gamma$ commute, the error in the Trotterization vanishes, for the same reason that $e^w e^z = e^{w+z}$ for $w, z \in \mathbb{C}$. More generally, one should expect the simulation errors to be small even when the commutator

$$[H_i, H_j] := H_i H_j - H_j H_i \qquad (2.29)$$

is small but nonzero. We term this feature commutator scaling. It took a surprisingly long time to show this rigorously, but it was finally done using the calculus of matrix-valued functions [27]. These results apply to a general class of staged product formulas of the form

$$\mathcal{P}(t) = \prod_{j=1}^{\Upsilon} \prod_{\gamma=1}^{\Gamma} e^{-it\tau_{j\gamma} H_{\pi_j(\gamma)}}. \qquad (2.30)$$

Here, $\Upsilon$ is the number of "stages," $\pi_j$ is a permutation of the first $\Gamma$ positive integers, and the $\tau_{j\gamma}$ are real numbers. The Suzuki-Trotter formulas are examples of staged formulas. It was shown that the additive and multiplicative errors $\mathcal{A}, \mathcal{M}$ in the estimation of $e^{-iHt}$ for a $p$th order staged product formula $\mathcal{P}_p$ scale as

$$\mathcal{A}, \mathcal{M} \in O(\tilde{\alpha}_{\mathrm{comm}}\, t^{p+1}) \qquad (2.31)$$

where

$$\tilde{\alpha}_{\mathrm{comm}} = \sum_{\gamma_1=1}^{\Gamma} \sum_{\gamma_2=1}^{\Gamma} \cdots \sum_{\gamma_{p+1}=1}^{\Gamma} \left\| [H_{\gamma_{p+1}}, \ldots, [H_{\gamma_2}, H_{\gamma_1}], \ldots ] \right\| \qquad (2.32)$$

is a sum of the norms of all nested commutators of $p+1$ terms of $H$ [27]. Though $\tilde{\alpha}_{\mathrm{comm}}$ may be practically difficult to compute, this result gives a tighter characterization of the kinds of errors to expect from a Trotter simulation. Moreover, these commutators can be computed once for specific classes of Hamiltonians, such as lattice systems, and the results can be applied thereafter.

Despite these sophisticated theoretical characterizations, product formulas have been observed to perform even better than expected [25, 66]. The relative simplicity and flexibility of Trotter methods, as compared to the more recent and theoretically improved methods described in subsequent sections, makes this approach the current frontrunner in practical quantum simulation. However, as quantum hardware continues to improve, we may find that so-called post-Trotter methods become increasingly attractive as the overhead costs become less burdensome.

Using the error bounds (2.31), we can derive a rigorous upper bound on the number of Trotter steps $r$ needed to simulate a $k$-local $H$ for time $T$ to accuracy $\epsilon$ using a $p$th order product formula:

$$r \in O\left( \frac{\tilde{\alpha}_{\mathrm{comm}}^{1/p}\, T^{1+1/p}}{\epsilon^{1/p}} \right). \qquad (2.33)$$

For a fixed order $p$, the total number of exponentials $e^{-iH_j \delta t}$ scales as $r$ up to constant factors. Since each exponential requires at most a constant number of quantum gates, the above formula also gives the scaling of the number of two-qubit gates. What we see is that we have achieved efficient simulation in terms of $T$ and $\epsilon$. Moreover, the number of qubits needed is only the $n$ which generate the state space of the system. Contrast this with the $2^n$ needed, naively, to write the full state classically.

Product formulas will recur throughout this thesis. One of the deficiencies of this approach to simulation is its relatively low accuracy, especially compared to post-Trotter methods with $O(\log 1/\epsilon)$ scaling in the accuracy. In Chapter 3, we explore the use of polynomial interpolation to improve this accuracy without employing additional quantum resources. Trotterization also arises in the time dependent setting, where each $H_\gamma(t)$ depends on time, by using (2.18) for finite $n$ and Trotterizing each $H(t_j)$. The use of an auxiliary "clock space" connects the notions of time independent and time dependent simulation (and Trotterization), which we use to propose and analyze several approaches to time dependent Hamiltonian simulation. This is considered in Chapter 4.

2.7.2 Linear Combination of Unitaries

For a couple of decades (a long time relative to the field), product formulas were the only algorithm in town for Hamiltonian simulation on quantum computers. Then, in 2012, Childs and Wiebe introduced [24] a new primitive for quantum computation: applying a linear combination of unitaries

$$\sum_{j=1}^{L} \alpha_j U_j \qquad (2.34)$$

to a quantum register. Figure 2.3 gives a schematic circuit for implementing this sum. Because a sum of unitaries is not unitary, we should expect that some measurements are required for the successful implementation of the operation, and indeed, it is conditioned on measuring all 0's on the auxiliary register.

Figure 2.3 A schematic of the linear combination of unitaries (LCU) circuit. Conditioned upon measuring 0 on every measurement, the result applied is $\sum_j \alpha_j U_j$. In this case, there are 3 auxiliary qubits, hence $2^3 = 8$ possible unitaries to add. The PREP circuits are any circuit satisfying $\mathrm{PREP}\,|0\rangle_n = \sum_j \sqrt{\alpha_j}\,|j\rangle$.

Measuring "success" is more likely the closer (2.34) is to a unitary operator. Success can be achieved through repeated trials or, more efficiently, through quantum amplitude amplification.

The linear combination of unitaries (LCU) circuit consists of two subroutines. The first is a PREP ("prepare") unitary which acts on a quantum register initialized to $|0\rangle^{\otimes k}$ as

$$\mathrm{PREP}\,|0\rangle^{\otimes k} = \sum_{j=1}^{L} \sqrt{\frac{\alpha_j}{\|\alpha\|_1}}\, |j\rangle \qquad (2.35)$$

with $\|\alpha\|_1 := \sum_j |\alpha_j|$. The second is a SEL ("select") unitary which applies $U_j$ to the main register controlled on state $|j\rangle$ of the auxiliary:

$$\mathrm{SEL}\,|j\rangle |\psi\rangle = |j\rangle\, U_j |\psi\rangle. \qquad (2.36)$$

By applying the operation $(\mathrm{PREP}^\dagger \otimes I)\,\mathrm{SEL}\,(\mathrm{PREP} \otimes I)$, then measuring the appropriate outcome on the auxiliary (all zeros), the operation (2.34) may be implemented on the main register up to normalization.
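The PREP/SEL construction is readily verified with a statevector calculation. Here is a small sketch (ours; the single-qubit unitaries and coefficients are hypothetical) confirming that post-selecting the ancilla on $|0\cdots 0\rangle$ applies $\sum_j \alpha_j U_j / \|\alpha\|_1$.

```python
import numpy as np

# Statevector sketch of LCU: k = 2 ancilla qubits, one main qubit.
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

alphas = np.array([0.5, 0.3, 0.2, 0.1])       # positive coefficients
Us = [I2, X, Z, X @ Z]                        # unitaries to combine
L = len(Us)

# PREP: any unitary whose first column is sqrt(alpha_j/||alpha||_1);
# build one by completing that column to an orthonormal basis via QR.
rng = np.random.default_rng(4)
col = np.sqrt(alphas / alphas.sum())
Q, _ = np.linalg.qr(np.column_stack([col, rng.normal(size=(L, L - 1))]))
if Q[0, 0] < 0:
    Q[:, 0] *= -1                             # fix QR's sign ambiguity
PREP = Q

# SEL: block diagonal, applies U_j controlled on ancilla state |j>.
SEL = np.zeros((2 * L, 2 * L), dtype=complex)
for j, U in enumerate(Us):
    SEL[2 * j:2 * j + 2, 2 * j:2 * j + 2] = U

W = np.kron(PREP.conj().T, I2) @ SEL @ np.kron(PREP, I2)
psi = np.array([1, 1j]) / np.sqrt(2)
full = W @ np.kron(np.eye(L)[0], psi)         # ancilla starts in |0>
out = full[:2]                                # component with ancilla = |0>
target = sum(a * U for a, U in zip(alphas, Us)) @ psi / alphas.sum()
print(np.allclose(out, target))               # True
```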
The LCU primitive led to several distinct Hamiltonian simulation algorithms, the first post-Trotter methods. Here we discuss the one most relevant to this thesis, which is also the one considered in the original Childs and Wiebe paper. This is the quantum implementation of the multiproduct formula (MPF), a linear combination of product formulas producing a more accurate approximation to $e^{-iHt}$. The best way to understand multiproduct formulas is as an instance of Richardson extrapolation, and we will elaborate on this point in Chapter 5. The major point is that we wish to extrapolate $r \to \infty$, or equivalently $1/r \to 0$, to achieve increasing accuracies. While higher-order Trotter formulas require an exponential number of terms, it is much less demanding to cancel error terms using summation. The upshot is that MPFs, and other well-known LCU algorithms, achieve an exponential improvement in accuracy over product formulas alone.

2.7.3 Qubitization

In our discussion of the Linear Combination of Unitaries (LCU) primitive above, we saw that the unitary

$$U = (\mathrm{PREP}^\dagger \otimes I)\,\mathrm{SEL}\,(\mathrm{PREP} \otimes I) \qquad (2.37)$$

encodes the desired operation (a sum of unitaries) provided a certain measurement result is achieved on a part of the full quantum register. This turns out to point towards a more general phenomenon. We say that $U$ block encodes the desired operation $\sum_j \alpha_j U_j$ via the state $|0\rangle^{\otimes k}$, in the sense that

$$\sum_j \alpha_j U_j = (\langle 0|^{\otimes k} \otimes I)\, U\, (|0\rangle^{\otimes k} \otimes I) \qquad (2.38)$$

up to normalization. In our context, this block-encoded operator may be the Hamiltonian of interest. Remarkably, this simple requirement, the encoding of $H$ in a subblock of a larger unitary matrix, is all that is necessary for qubitization [88], the first asymptotically optimal approach to Hamiltonian simulation. By this, we mean that the cost of simulating for time $T$ to accuracy $\epsilon$ scales as

$$O\left( T + \frac{\log 1/\epsilon}{\log\log 1/\epsilon} \right), \qquad (2.39)$$

which saturates the known lower bounds [14] for $T$ and $1/\epsilon$ and is additive rather than multiplicative.

Besides the concept of block encoding, the other major ingredient of qubitization is Quantum Signal Processing (QSP), which generates polynomial functions of the eigenvalues of the block encoded operator. It does so by interleaving unitary rotations that act independently on a two-dimensional subspace corresponding to each eigenvalue. Since $e^{-i\lambda t}$ can be approximated by polynomials $P(\lambda t)$ uniformly over an interval $[0, \lambda T]$, constructing these polynomial transformations provides a means to Hamiltonian simulation.
Qubitization was later generalized to the Quantum Singular Value Transformation (QSVT) [54], which performs polynomial transformations of the singular values of a block encoded matrix. The method is effective, general, conceptually rich and, unfortunately, rather complicated to understand fully.

One notable drawback of qubitization is that it only directly works for time independent Hamiltonian simulation. Of course, one could Trotterize via (2.18) and perform qubitization on each factor, but the error from the Trotterization would overshadow any gains from qubitization. In Chapter 4, we will embed $H(t)$ in a larger system with a time independent Hamiltonian, then apply qubitization "directly." We construct a suitable block encoding of the augmented Hamiltonian, then derive bounds on the query complexity. While our analysis does not show improvements compared to other time dependent schemes, we believe this is due to imperfections in the analysis rather than a feature of the method.

2.7.4 Analog Quantum Simulation

In order to anticipate the material of Section 6.1, we digress momentarily from our discussion of quantum computers and consider an alternative simulation platform: analog quantum simulators. Recall from Figure 2.2 the need to map the problem of interest onto a simulator. Instead of mapping to a "digital" setting, in which a universal set of discrete, unitary quantum gates is applied in sequence to construct the time evolution operator, we could instead map to a system which, though not a quantum computer, nevertheless allows for a great deal of control and emulation of a system of interest. This is potentially less technologically demanding, because we no longer need to perform arbitrary unitary operations, but rather implement specific Hamiltonians of interest. For excellent discussions of analog quantum simulators, see e.g. [63, 52]. For our purposes, we can roughly model analog simulators as devices which can implement a class of Hamiltonians with terms

$$H = \sum_j J_j(t) H_j \qquad (2.40)$$

such that the time dependence $J_j(t)$ is controllable to some degree. Depending on the collection of $H_j$ and the degree of control over the $J_j$, we might be able to achieve a universal device. Yet this may not be necessary for the application we are interested in. Analog simulators are often considered a more achievable route to quantum Hamiltonian simulation. In Section 6.1 we will explore some interesting phenomena accessible to trapped-ion quantum simulators, with implications in nuclear physics.

2.7.5 Measurement and Hamiltonian Evolution

As already emphasized, time evolution represents only one piece of a full quantum simulation algorithm. Although the time evolved state $|\psi_t\rangle$ is prepared on a quantum register, no information is gained without measurement. By analogy, a random variable that is never sampled yields nothing. There is nontrivial work to be done in the extraction of information, as sampling in the quantum case (i.e., measurement) is more nuanced than simple probability sampling: multiple different bases can be measured, and choosing the right basis is important for extracting valuable information.

One basic fact we might like to know about a Hamiltonian is its eigenvalues, which correspond to the "allowed" energies of a physical system. Additionally, general observables $O$ can often be simulated as if they were Hamiltonians, assuming they are, say, sparse or $k$-local. Suppose we are able to prepare an (approximate) eigenvector $|E\rangle$ of $H$ on a quantum register.
Performing Hamiltonian simulation will then produce an output state that picks up merely a phase:

$$U(t)|E\rangle = e^{-iEt}|E\rangle. \qquad (2.41)$$

This phase cannot be measured as is, because only relative phase shifts, not overall phases, are measurable in general wavelike phenomena. Our strategy will be to introduce a reference unshifted state and produce interference, such that the phase $\varphi = Et$ is measurable. One way to do this in the setting of quantum computing is to apply a controlled-$U$ gate. By putting an auxiliary register in a superposition $|0\rangle + |1\rangle$ and applying $U$ to the main register conditioned on $|1\rangle$, we've introduced a relative phase shift into the auxiliary qubit. This can be measured through a change of basis.

What we've heuristically described above is a general framework for performing phase estimation on a quantum computer, which allows for the extraction of eigenvalues. Figure 2.4 gives the basic circuit for the simplest phase estimation algorithm.

Figure 2.4 Schematic of the simplest quantum phase estimation algorithm, which is useful for measuring in the eigenbasis of a Hamiltonian. The measurement outcomes associated with the top auxiliary qubit are directly related to the eigenvalues of $H$, and by repeated trials these eigenvalues can be estimated. The phase rotation $P(\theta)$ has an angle parameter $\theta$ which may be varied to resolve certain ambiguities due to $\cos^2$ not being one-to-one on $[0, \pi]$.

Physically, the circuit is essentially a Mach-Zehnder interferometer [67], with Hadamards

$$\mathrm{Had} := \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \qquad (2.42)$$

acting as beam splitters and the controlled time evolution introducing a phase shift. The family of phase estimation algorithms is, by now, enormous [67, 77, 125, 136]. This thesis will discuss a recent addition to this family known as the Rodeo Algorithm, a resource-efficient, randomized procedure which performs a selective search over the space of possible eigenvalues. Like all phase estimation protocols, it works well provided the initial state has reasonable overlap with the eigenstates of interest (formally, if the overlap decreases only polynomially with problem size).

Perhaps even more foundational than phase estimation, in a sense, is the estimation of an amplitude on a quantum computer. In fact, the simple scheme above encodes the phase in an amplitude to be measured. Slightly more generally, we suppose an operator $V$ acting on an initial state $|0\rangle$ of a single qubit as $V|0\rangle = a_0|0\rangle + a_1|1\rangle$, and wish to estimate $a_0$ (without loss of generality we may take $a_0 \geq 0$). This estimation may be done through repeated computational basis measurements, but more clever approaches using amplitude amplification lead to a quadratic improvement in scaling. See [59] for details on an efficient iterative procedure for amplitude estimation.
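As a concrete illustration (ours, not from the original), the following sketch simulates the Figure 2.4 circuit directly with statevectors: the probability of measuring the auxiliary qubit in $|0\rangle$ is $\cos^2((\theta - Et)/2)$ when the register holds an eigenstate, so sweeping $\theta$ (or $t$) reveals $E$.

```python
import numpy as np
from scipy.linalg import expm

# Single-ancilla phase estimation (cf. Figure 2.4), simulated directly.
rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4)); H = (A + A.T) / 2
E, V = np.linalg.eigh(H)
psi = V[:, 0]                   # eigenstate with eigenvalue E[0]
t = 1.0

def prob0(theta):
    # After Had, controlled-U(t), P(theta), Had, the ancilla-|0> component
    # of the state is (|psi> + e^{i theta} U(t)|psi>)/2.
    evolved = expm(-1j * H * t) @ psi
    amp0 = (psi + np.exp(1j * theta) * evolved) / 2
    return np.linalg.norm(amp0) ** 2

for theta in [0.0, 0.5, 1.0]:
    predicted = np.cos((theta - E[0] * t) / 2) ** 2
    print(f"theta={theta:.1f}  simulated={prob0(theta):.6f}  cos^2={predicted:.6f}")
```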
< √ 2πœ‹π‘› (cid:17) 𝑛 (cid:16) 𝑛 𝑒 𝑒1/(12𝑛) (2.43) These bounds are extremely tight, even for small 𝑛. The multinomial coefficient is a generalization of the more common binomial coefficient, and it arises in several combinatorial situations. It is defined by (cid:19) (cid:18) 𝑛 𝑛1, ..., π‘›π‘˜ := 𝑛! 𝑛1!𝑛2!...π‘›π‘˜ ! (2.44) where 𝑛 ∈ Z+ and the (𝑛ℓ) π‘˜ β„“=1 are nonnegative integers which sum to 𝑛. It is a positive integer corresponding to the number of distinct ways of placing 𝑛 distinguishable items into π‘˜ boxes, where each box has a fixed number 𝑛ℓ of items. In this work, we will find occasion to make use of the multinomial when evaluating high-order derivatives of a product. (cid:19) 𝑛 (cid:18) 𝑑 𝑑𝑑 𝑓1(𝑑) 𝑓2(𝑑) . . . π‘“π‘˜ (𝑑) (2.45) Here, ( 𝑓ℓ) π‘˜ β„“=1 are 𝑛-differentiable functions of 𝑑 ∈ R. Employing the product rule, one is left to count all the possible combinations of derivatives of each 𝑓ℓ. It turns out that the multinomial is suited for this. (cid:18) 𝑑 𝑑𝑑 (cid:19) 𝑛 π‘˜ (cid:214) β„“=1 𝑓ℓ (𝑑) = (cid:18) βˆ‘οΈ 𝑁 𝑛 𝑛1, . . . , π‘›π‘˜ (cid:19) π‘˜ (cid:214) β„“=1 (cid:19) 𝑛ℓ (cid:18) 𝑑 𝑑𝑑 𝑓ℓ (𝑑) (2.46) The sum is taken over the set 𝑁 of sequences of nonnegative integers (𝑛ℓ) π‘˜ β„“=1 summing to 𝑛. A useful property is that (cid:18) βˆ‘οΈ 𝑁 (cid:19) 𝑛 𝑛1, . . . , π‘›π‘˜ = π‘˜ 𝑛 (2.47) for nonnegative integers π‘˜, 𝑛 (with convention 00 = limπ‘₯β†’0 π‘₯π‘₯ = 1). Besides derivatives of products, we will also need to bound derivatives of ordinary exponentials of a time dependent matrix. Useful for this purpose is an expression for derivatives of exponentials 32 of a scalar function π‘Ž(𝑑). (cid:19) 𝑛 (cid:18) 𝑑 𝑑𝑑 π‘’π‘Ž(𝑑) The solution we rely on is FaΓ  di Bruno’s formula, which asserts that (cid:19) 𝑛 (cid:18) 𝑑 𝑑𝑑 π‘’π‘Ž(𝑑) = π‘’π‘Ž(𝑑)π‘Œπ‘› (π‘Žβ€²(𝑑), π‘Žβ€(𝑑), . . . , π‘Ž (𝑛) (𝑑)) where π‘Œπ‘› is the complete exponential Bell polynomial [31]. An explicit formula is given by π‘Œπ‘› (π‘₯1, π‘₯2, . . . , π‘₯𝑛) = βˆ‘οΈ 𝐢 𝑛! 𝑐1!𝑐2! . . . 𝑐𝑛! (cid:19) 𝑐 𝑗 𝑛 (cid:214) 𝑗=1 (cid:18) π‘₯ 𝑗 𝑗! where the sum is taken over the set 𝐢 of all sequences (𝑐 𝑗 )𝑛 𝑗=1 such that 𝑐 𝑗 β‰₯ 0 and 𝑐1 + 2𝑐2 + Β· Β· Β· + 𝑛𝑐𝑛 = 𝑛. (2.48) (2.49) (2.50) (2.51) Essentially, each coefficient in π‘Œπ‘› counts the ways one can partition a set of fixed size 𝑛 into subsets of given sizes and number. When one simply wants to count the total number of possible partitions, one is led to the Bell numbers 𝑏𝑛. These are related to the π‘Œπ‘› by evaluating all arguments to 1. More generally, for any π‘₯ ∈ R, 𝑏𝑛 = π‘Œπ‘› (1, 1, ...1) π‘Œπ‘› (π‘₯, π‘₯2, . . . , π‘₯𝑛) = π‘₯𝑛𝑏𝑛, (2.52) (2.53) which can be seen directly from (2.50) along with the sum rule (2.51). The Bell numbers 𝑏𝑛 grow combinatorially; in particular, the following upper bound [13] is useful. 𝑏𝑛 < (cid:19) 𝑛 (cid:18) .792𝑛 log(𝑛 + 1) , βˆ€π‘› ∈ Z+ (2.54) More generally, the single-variable Bell polynomial, or Touchard polynomial 𝐡𝑛 (π‘₯), is simply π‘Œπ‘› with all arguments evaluated to π‘₯. 𝐡𝑛 (π‘₯) = π‘Œπ‘› (π‘₯, π‘₯, . . . , π‘₯) (2.55) Of course, 𝑏𝑛 = 𝐡𝑛 (1). The 𝑛th Bell polynomial 𝐡𝑛 (π‘₯) is also the value of the 𝑛th moment of the Poisson distribution with mean π‘₯. From [5] we have the following upper bound on 𝐡𝑛 (cid:18) (cid:19) 𝑛 𝐡𝑛 (π‘₯) ≀ 𝑛 log(1 + 𝑛 π‘₯ ) , βˆ€π‘₯ β‰₯ 0 (2.56) 33 which we observe is very close to that for the Bell numbers (π‘₯ = 1) in equation (2.54). 
2.8.2 Norms

Norms are used widely throughout this work to characterize the size of mathematical objects and to quantify simulation costs. For finite-dimensional vectors $v = (v_1, \ldots, v_n)$, real or complex valued, the vector $p$-norm ($\ell^p$ norm) is defined as

$$\|v\|_p := \left( \sum_{j=1}^n |v_j|^p \right)^{1/p} \qquad (2.57)$$

for any $p \in [1, \infty)$, and for $p = \infty$ as $\|v\|_\infty = \max_j |v_j|$. We make particular use of the 1 and $\infty$ norms, to express our results or to quote previous ones.

We also make use of functional norms, which are defined analogously. Given a scalar function $f(t)$ with scalar input over an interval $[0, T]$, the $p$-norm, or $L^p$ norm, for $p \in [1, \infty)$ is given by

$$\|f(t)\|_p := \left( \int_0^T |f(\tau)|^p\, d\tau \right)^{1/p} \qquad (2.58)$$

for functions such that it is defined. Analogously, the $\infty$-norm is given by the supremum $\sup |f(t)|$ over $[0, T]$. For the piecewise smooth functions we consider, this is just the maximum value on the interval, so we might write $\|f(t)\|_{\max}$.

In Table 1, we use the notation $\|\cdot\|_{p,q}$ to denote nested norms whenever our objects have both a (finite) vector and a functional character. Specifically, this notation means: take the vector $p$-norm first, then take the $q$-norm of the resulting scalar function. For example, if $\alpha(t) = (\alpha_1(t), \alpha_2(t), \ldots, \alpha_n(t))$, then

$$\|\alpha\|_{1,1} = \left\| \sum_{j=1}^n |\alpha_j(t)| \right\|_1 = \int_0^T \sum_{j=1}^n |\alpha_j(\tau)|\, d\tau \qquad (2.59)$$

and

$$\|\alpha\|_{1,\max} = \max_{\tau \in [0,T]} \sum_{j=1}^n |\alpha_j(\tau)|. \qquad (2.60)$$

In the main text we claim our Hamiltonian simulation algorithm exhibits $L^1$-norm scaling. This means it has complexity $O(\|f\|_1)$, where $f$ is a function whose value is some measure of the size of the Hamiltonian and its derivatives at each $t \in [0, T]$.

Finally, we make use of the spectral norm for linear operators, also known as the induced 2-norm. It is defined for any bounded operator $A$ on a Hilbert space $\mathcal{H}$ by

$$\|A\| := \sup_{v \in \mathcal{H} \setminus \{0\}} \frac{\|Av\|_2}{\|v\|_2}. \qquad (2.61)$$

In our case, $A$ will always be finite dimensional, and $\|A\|$ is the largest singular value of $A$. This norm is invariant under left or right multiplication by a unitary operator $U$, and $\|U\| = 1$. The spectral norm is submultiplicative, a property we make frequent use of:

$$\|AB\| \leq \|A\| \|B\|. \qquad (2.62)$$
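For concreteness, a small sketch (ours) of the nested norm $\|\cdot\|_{1,1}$ from (2.59), evaluated numerically on a grid with hypothetical coefficient functions.

```python
import numpy as np

# Nested norm ||alpha||_{1,1} of eq. (2.59): vector 1-norm at each time,
# then the L^1 norm of the resulting scalar function over [0, T].
T = 2.0
tau = np.linspace(0.0, T, 2001)

# Example coefficient functions alpha_j(t) (hypothetical, for illustration).
alpha = np.vstack([np.cos(3 * tau), 0.5 * np.sin(tau), 0.1 * tau])

vec_1norm = np.abs(alpha).sum(axis=0)      # sum_j |alpha_j(tau)| pointwise
norm_1_1 = np.trapz(vec_1norm, tau)        # integrate over [0, T]
norm_1_max = vec_1norm.max()               # eq. (2.60)
print(norm_1_1, norm_1_max)
```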
CHAPTER 3

TROTTER ERROR MITIGATION

This chapter is based on recent work [113] concerning the reduction of Trotter error by use of standard Chebyshev interpolation, and is outlined as follows. After providing some background and motivation, we spend two sections describing the interpolation procedure and proving some results on its robustness to noisy data. We then apply the framework to the important task of measuring expectation values of time-evolved observables. Numerical demonstrations are provided for a random 1D Heisenberg model, and show the expected behavior. Discussion of the implications of this work is given, and technical results are proven at the end of the chapter. With the exception of the numerics of Section 5.7, which are new, additional details, applications, and numerics can be found in the main publication [113].

3.1 Introduction and Motivation

In Section 2.7, we gave an overview of product formulas as a means for Hamiltonian simulation. Efficient, versatile, and simple, product formulas may well remain the preferred method of simulation on quantum computers for the foreseeable future. This motivates the search for techniques to further bolster the method, particularly by mitigating its biggest flaws. Chief among these, perhaps, is its relative inaccuracy compared to post-Trotter methods such as qubitization. As an example, the 1st order Trotter formula $S_1$, which splits the exponential $e^{-iHt}$ in the simplest imaginable way,

$$S_1(t) = e^{-iH_1 t} e^{-iH_2 t} \cdots e^{-iH_\Gamma t}, \qquad (3.1)$$

has an error scaling as $O(t^2/r)$. From this we deduce that the simulation cost scales with the error $\epsilon$ as $O(1/\epsilon)$. Contrast this with, say, qubitization, which scales slightly better than $O(\log 1/\epsilon)$, an exponential improvement. More generally, every major post-Trotter method scales polylogarithmically in $1/\epsilon$ (that is, $O(p(\log 1/\epsilon))$ for some polynomial $p$). Higher order product formulas have better accuracy, but are generally impractical due to the exponential scaling of cost with the order $k$ of the formula.

Can the accuracy of product formulas be improved with additional techniques? Indeed, multiproduct formulas, mentioned in Section 2.7.2 and elaborated in Chapter 5, achieve exponentially improved accuracy compared to product formulas alone by summing formulas of different Trotter step size. Unfortunately, the additional quantum overhead and controlled operations required for implementing the requisite LCU procedure are noteworthy, and introduce barriers which are especially burdensome in the current era of noisy hardware.

Faced with these limitations, we might look beyond multiproduct formulas based on LCU and ask if there is a way forward using only classical resources, such as randomness. Indeed, within the past few years there have been claims of using multiproduct formulas without LCU-type procedures [45, 130]. These schemes don't produce a true multiproduct formula, in the sense of applying the operation

$$\mathrm{MPF}(t) = \sum_{j=1}^{L} c_j\, \mathcal{P}(t/r_j)^{r_j} \qquad (3.2)$$

to a quantum register, where $\mathcal{P}$ is a product formula, $r_j \in \mathbb{Z}^+$, and $c_j \in \mathbb{R}$. But in any case, this is not strictly necessary. After all, preparing $e^{-iHt}|\psi\rangle$ is not a full algorithm, but a possible intermediate step, after which measurements must be performed to extract the desired information. For example, one might be interested in computing the dynamics of observables via

$$\langle O(t) \rangle = \langle \psi_t | O | \psi_t \rangle. \qquad (3.3)$$

This suggests the possibility of summing not operations, but classical data coming from measurement schemes. This amounts to a standard Richardson extrapolation of the expectation values [44]. Let $s = 1/r$ be the "normalized Trotter step." For a given product formula $\mathcal{P}$ in $r$ steps, the expectation value

$$\langle O_s(t) \rangle := \langle \psi_0 |\, \mathcal{P}^\dagger(st)^{1/s}\, O\, \mathcal{P}(st)^{1/s}\, | \psi_0 \rangle \qquad (3.4)$$

is a real-valued, smooth function $f(s)$. By estimating $f(s_j)$ for various $s_j$, an extrapolation can be performed to the desired $s = 0$. What is attractive about this scheme is that no additional quantum resources, beyond the product formula simulations and expectation value measurements, are needed.
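To illustrate the idea in isolation (our sketch; the synthetic even error series below stands in for real Trotter data), here is a Richardson extrapolation of $f(s)$ samples at $s_j = 1/r_j$ down to $s = 0$ by fitting a polynomial in $s^2$.

```python
import numpy as np

# Richardson extrapolation of Trotter-step data to s = 0.
# Synthetic model of an expectation value from a symmetric formula:
# f(s) = f0 + c2 s^2 + c4 s^4 + ... (even in s), sampled at s_j = 1/r_j.
f0 = 0.731
f = lambda s: f0 + 0.4 * s**2 - 0.9 * s**4 + 0.2 * s**6 + 0.1 * s**8

rs = np.array([2, 3, 4, 5])           # Trotter step counts r_j
s = 1.0 / rs
# Fit a polynomial in s^2 through the data; its value at 0 is the estimate.
coeffs = np.polyfit(s**2, f(s), deg=len(rs) - 1)
estimate = np.polyval(coeffs, 0.0)
print(abs(estimate - f0))             # ~7e-6, vs ~5e-2 raw bias at r = 2
```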
There are several directions in which these ideas can be extended. First, the function $f(s)$ could represent a variety of measurements obtained from product formula simulations, beyond expectation values. For example, it could represent eigenvalue estimates from phase estimation. Second, we might consider other means of estimating $f(0)$ besides Richardson extrapolation. One possibility is to produce a uniform approximation to $f$ in a neighborhood of $s = 0$. Function approximation is an old science, and many numerical techniques are available for the task. Modern machine learning approaches such as neural networks might achieve excellent approximations of $f$, but finding the simplest effective tool for the task is desirable; moreover, it is difficult to provide rigorous guarantees for machine learning.

3.2 Polynomial Interpolation

Among the collection of function approximation methods available, we choose one that is simple to implement and easy to analyze: polynomial interpolation. Essentially, our goal is to use interpolation to "extrapolate" the Trotter step size $s$ to the ideal of $s = 0$.¹ There are many quantities that we could be interested in extrapolating. For this thesis, we will be primarily concerned with expectation value estimation:

$$\langle O_s(t) \rangle = \mathrm{Tr}\, \rho\, O_s(t), \qquad O_s(t) := \tilde{U}_s(t)^\dagger\, O\, \tilde{U}_s(t). \qquad (3.5)$$

For purposes of analysis, we'll assume these expectation values are estimated on a quantum computer using one of the Suzuki-Trotter (ST) formulas of equation (2.28). However, in principle our approach should work for any product formula simulation, not just ST. While the interpolation is classical and independent of the method by which the data is generated, we will assume a quantum simulation is used when considering the computational cost. We assume all quantum operations are executed perfectly, including the exponentials $\exp(-iH_j t)$ for simulation. This is not to say that the interpolation method could not be applied to noisy quantum systems, but rather that our cost analysis does not account for it. Consequently, the only sources of error considered are the interpolation error and the error in the calculation of the data points (e.g., the Hamiltonian energies or expectation values at various points $s_i$). Error in the data points may arise from hardware noise, but even in its absence, a measurement protocol such as phase estimation induces a systematic error that cannot be removed.

¹We occasionally interchange between the terminology "extra-" and "interpolation." We view our method as an extrapolation beyond the data using a numerical technique commonly known as polynomial interpolation.

Without further ado, we now describe the interpolation framework. Let $f \in C^\infty([-a, a])$ be a smooth, real-valued function of a single variable $s \in [-a, a]$, and suppose we have calculated $f$ (perfectly) at $n$ distinct points $s_1, s_2, \ldots, s_n \in [-a, a]$. That is, we have data in the form of a set of tuples $D = \{(s_i, f_i)\}_{i=1}^n$, where $f_i = f(s_i)$. Let $P_{n-1}f$ be the unique degree-$(n-1)$ polynomial interpolating $D$, i.e. $P_{n-1}f(s_i) = f_i$ for each $i = 1, \ldots, n$. For any $s \in [-a, a]$, standard results in polynomial interpolation [107] tell us that the signed error is given by

$$E_{n-1}(s) := f(s) - P_{n-1}f(s) = \frac{f^{(n)}(\xi)}{n!}\, \omega_n(s) \qquad (3.6)$$

for some $\xi \in I_s$, where $I_s \subset [-a, a]$ is the smallest interval containing $s$ and the interpolation points $\{s_i\}$. Throughout this work, superscripts such as in $f^{(n)}$ refer to $n$th-order derivatives. The $n$th degree nodal polynomial $\omega_n(s)$ is defined as the unique monic polynomial with zeros at the interpolation points:

$$\omega_n(s) := \prod_{i=1}^n (s - s_i). \qquad (3.7)$$

Our estimate for $f(0)$ is $P_{n-1}f(0)$. Since we are interested in $s = 0$, $\omega_n(0)$ becomes a (signed) product of the interpolation points. We can bound the interpolation error $E_{n-1}(0)$ in a way that is independent of the precise value of $\xi$ (which is unknown and difficult to find) by maximizing over $\xi \in I_s$:

$$|E_{n-1}(0)| \leq \max_{s \in I_s} \frac{|f^{(n)}(s)|}{n!} \prod_{i=1}^n |s_i|. \qquad (3.8)$$

Much of the technical work in this dissertation involves finding suitable bounds on the size of the derivatives $f^{(n)}$; in particular, for the expectation values of equation (3.5), $f(s) = \langle O_s(t) \rangle$.
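As a numerical sanity check (ours), the following sketch interpolates a smooth function at $n$ points in $[-a, a]$ and compares the extrapolation error at $s = 0$ with the bound (3.8).

```python
import math
import numpy as np

# Check the interpolation error bound (3.8) for f(s) = exp(s) on [-a, a],
# where every derivative satisfies |f^(n)(s)| <= e^a.
a, n = 0.5, 6
nodes = np.linspace(-a, a, n)          # any n distinct points work here
vals = np.exp(nodes)

# Degree n-1 exact fit through the n points, evaluated at s = 0.
coeffs = np.polyfit(nodes, vals, deg=n - 1)
est = np.polyval(coeffs, 0.0)

err = abs(np.exp(0.0) - est)
bound = math.exp(a) / math.factorial(n) * np.prod(np.abs(nodes))
print(err, bound, err <= bound)        # error sits below the bound
```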
For reasons which we discuss in the following section, we choose the Chebyshev nodes on $[-a, a]$ as our interpolation points:

$$s_i = a \cos\left( \frac{2i - 1}{2n}\pi \right). \qquad (3.9)$$

This allows us to specialize our interpolation error in the manner described in the following lemma.

Lemma 3.2.1. Let $s_i$, $i = 1, 2, \ldots, n$ be the collection of Chebyshev interpolation points on the interval $[-a, a]$. In the notation above, we have

$$|E_{n-1}(0)| < \max_{s \in [-a, a]} |f^{(n)}(s)| \left( \frac{\pi a}{2n} \right)^n.$$

Proof. For $n$ odd, $s = 0$ is one of the interpolation points, so the error is zero and the bound holds automatically. Hereafter, we only consider $n$ even (which will be the case of practical interest). Using the generic bound (3.8) with the Chebyshev nodes,

$$|E_{n-1}(0)| \leq \max_{\xi \in [-a, a]} |f^{(n)}(\xi)|\, \frac{a^n}{n!} \prod_{i=1}^n \left| \cos\left( \frac{2i-1}{2n}\pi \right) \right|. \qquad (3.10)$$

To obtain the lemma, we just need to appropriately bound the product of cosines. Since $n$ is even, $n = 2m$ for some $m \in \mathbb{Z}^+$. Moreover, we have a reflectional symmetry about $m$, in the sense that

$$\left| \cos\left( \frac{2i-1}{2n}\pi \right) \right| = \left| \cos\left( \frac{2(n-i+1)-1}{2n}\pi \right) \right|. \qquad (3.11)$$

Hence, we only need to take the product over $i = 1, \ldots, m$ and square it:

$$\prod_{i=1}^n \left| \cos\left( \frac{2i-1}{2n}\pi \right) \right| = \left( \prod_{i=1}^m \cos\left( \frac{2i-1}{4m}\pi \right) \right)^2. \qquad (3.12)$$

To proceed further, let's reindex the remaining product by $i \to m - i + 1$. This gives

$$\prod_{i=1}^m \cos\left( \frac{2i-1}{4m}\pi \right) = \prod_{i=1}^m \cos\left( \frac{\pi}{2} - \frac{2i-1}{4m}\pi \right) = \prod_{i=1}^m \sin\left( \frac{2i-1}{4m}\pi \right) < \prod_{i=1}^m \frac{2i-1}{4m}\pi \qquad (3.13)$$

where we used the fact that $\sin(x) < x$ for all $x > 0$. Factoring out $\pi$ and the denominator from the product, the remaining terms become a double factorial:

$$\prod_{i=1}^m \frac{2i-1}{4m} = \frac{(2m-1)!!}{(4m)^m}. \qquad (3.14)$$

The double factorial can be bounded as

$$(2m-1)!!^2 < (2m-1)!!\,(2m)!! = (2m)!, \qquad (3.15)$$

so that $(2m-1)!! < \sqrt{(2m)!}$. Returning to the original product of equation (3.12), and reintroducing $n = 2m$, the resulting bound is

$$\prod_{i=1}^n \left| \cos\left( \frac{2i-1}{2n}\pi \right) \right| < \left( \frac{\pi^{n/2}\sqrt{n!}}{(2n)^{n/2}} \right)^2 = \frac{\pi^n\, n!}{(2n)^n}. \qquad (3.16)$$

Reinserting this result into the last line of equation (3.10) gives the bound stated in the lemma. □
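A quick numerical check (ours) of the cosine-product bound used in the proof:

```python
import math
import numpy as np

# Verify the key step of Lemma 3.2.1: for even n, the product of
# |cos((2i-1)pi/(2n))| over the Chebyshev angles sits below pi^n n!/(2n)^n.
for n in [2, 4, 8, 16]:
    i = np.arange(1, n + 1)
    prod = np.prod(np.abs(np.cos((2 * i - 1) * np.pi / (2 * n))))
    bound = math.pi**n * math.factorial(n) / (2 * n) ** n
    print(n, prod, bound, prod < bound)   # exact product equals 2**(1-n)
```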
Though Chebyshev interpolation enjoys nice mathematical properties, it presents a challenge for Trotter simulation because of the need for noninteger time steps in equation (3.44). In the face of this obstacle, there are several options one could take: rounding to integer time steps, or performing fractional queries using, say, the Quantum Singular Value Transformation (QSVT). First, consider rounding to integer time steps, i.e., gathering data at the nearest reciprocal integer $1/\tilde{r}$ to the Chebyshev node $s$. For the symmetric interval $[-a, a]$, the rounding error $|s - 1/\tilde{r}|$ goes as $O(a^2)$ as $a \to 0$. From here, one could either (a) take the estimate for $f(1/\tilde{r})$ as the estimate for $f(s)$, accruing some error in the process, or (b) perform the interpolation at the approximate Chebyshev nodes given by the collection of points $1/r_i$. Unfortunately, for our purposes, option (a) leads to unacceptable errors of order $O(a)$ in the data, eliminating accuracy gains. As for option (b), it is possible to use robustness results on Chebyshev interpolation [132] to argue that almost-Chebyshev nodes should be almost as well-conditioned. However, we find that our scaling of the number of nodes is such that the node displacements must be quite small, again leading to poorer scaling. Because of this, for most of this work we choose to invoke access to fractional queries using the QSVT [54]. While fractional queries increase the overhead compared to Trotter alone, this overhead is constant. In practice, it may be that taking approximate Chebyshev nodes is perfectly acceptable and stable, and if so, this would likely be the preferred method.

3.3 Stability Analysis

Polynomial interpolation is a valuable numerical tool, but some implementations can lead to numerical instability [36]. However, the situation is not as bad as often presented in textbooks [126]. While linear algebraic approaches involving Vandermonde matrices suffer instability for high degree polynomials [51], methods such as barycentric formulas are provably stable with respect to floating point arithmetic [68]. A particularly important consideration is the choice of interpolation nodes. It is well known that equally spaced nodes can lead to the Runge phenomenon: rapid oscillations near the ends of the interval that grow with the polynomial degree [107]. These oscillations can be overcome with a superior choice of nodes, such as the zeros of the Chebyshev polynomials. Interpolants through this set of nodes are guaranteed to converge, as $n \to \infty$, for any function that is Lipschitz continuous. Moreover, they are well-conditioned with respect to small errors in the data values. Finally, because they anti-cluster around $s = 0$, they are relatively cheap to compute with Trotter formulas. In this work, we will always interpolate at the $n$th-degree Chebyshev nodes, or approximations thereof, on a symmetric interval $[-a, a]$ about the origin, as defined in (3.9). We choose even $n$ so as to avoid the origin (which has infinite cost to compute), and we also utilize the reflectional symmetry of $f(s)$.

To compute the interpolant $P_{n-1}f$ linear algebraically, we overcome the limitations of the standard Vandermonde approach by expanding in terms of orthonormal Chebyshev polynomials rather than monomials $x^j$.
$$P_{n-1}f(s) = \sum_{j=0}^{n-1} c_j\, p_j(s) \qquad (3.17)$$

Here, $p_j$ is defined by

$$p_j(s) := \begin{cases} \sqrt{\tfrac{1}{n}}\, T_0(s), & j = 0 \\ \sqrt{\tfrac{2}{n}}\, T_j(s), & j = 1, 2, \ldots \end{cases} \qquad (3.18)$$

where $T_j$ is the standard $j$th Chebyshev polynomial,

$$T_j(x) := \cos(j \cos^{-1} x). \qquad (3.19)$$

By orthonormality, we are referring to the condition [93]

$$\sum_{k=1}^n p_i(s_k)\, p_j(s_k) = \delta_{ij} \qquad (3.20)$$

for all $0 \leq i, j < n$, with $s_k$ being the zeros of $T_n$ given in (3.9). This immediately implies that the matrix

$$\mathrm{V} := \begin{pmatrix} p_0(s_1) & p_1(s_1) & \ldots & p_{n-1}(s_1) \\ p_0(s_2) & p_1(s_2) & \ldots & p_{n-1}(s_2) \\ \vdots & \vdots & \ddots & \vdots \\ p_0(s_n) & p_1(s_n) & \ldots & p_{n-1}(s_n) \end{pmatrix} \qquad (3.21)$$

is orthogonal, and therefore has condition number $\kappa(\mathrm{V}) := \|\mathrm{V}\|\|\mathrm{V}^{-1}\|$ equal to one. This is the source of well-conditioning in our approach. The coefficients $c = (c_0, c_1, \ldots, c_{n-1})$ in equation (3.17) satisfy

$$y = \mathrm{V} c \qquad (3.22)$$

for the vector of values $y = (f(s_1), f(s_2), \ldots, f(s_n))$, since $P_{n-1}f$ is an interpolant. Hence, $c = \mathrm{V}^T y$ gives the vector of coefficients.
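The orthogonality of $\mathrm{V}$ is easy to confirm numerically (our sketch, using $a = 1$):

```python
import numpy as np

# Build V of eq. (3.21) from the Chebyshev nodes (3.9) with a = 1 and
# confirm V^T V = I, hence condition number kappa(V) = 1.
n = 8
i = np.arange(1, n + 1)
s = np.cos((2 * i - 1) * np.pi / (2 * n))              # Chebyshev nodes
j = np.arange(n)
V = np.cos(np.outer(np.arccos(s), j))                  # T_j(s_k)
V *= np.where(j == 0, np.sqrt(1 / n), np.sqrt(2 / n))  # normalize columns
print(np.allclose(V.T @ V, np.eye(n)))                 # True
print(np.linalg.cond(V))                               # 1.0 up to roundoff
```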
We now develop our argument for well-conditioning. Unless otherwise subscripted, all logarithms are natural.

Lemma 3.3.1. Let $s_1, s_2, \ldots, s_n$ be the standard Chebyshev nodes (3.9) on $[-a, a]$, with $n$ even. Then the nodes satisfy

$$\sum_{k=1}^n \frac{1}{|s_k|} \leq \frac{4n}{\pi a}\left(\gamma + \log(2n+2)\right),$$

where $\gamma \approx 0.577$ is the Euler-Mascheroni constant.

Proof. We focus on the case $a = 1$, since the general result follows by a simple rescaling. Because sine and cosine are phase shifted by $\pi/2$,

$$\sum_{k=1}^n \frac{1}{|s_k|} = \sum_{k=1}^n \frac{1}{\left|\cos\left(\frac{2k-1}{2n}\pi\right)\right|} = \sum_{k=1}^n \frac{1}{\left|\sin\left(\frac{n-2k+1}{2n}\pi\right)\right|}. \qquad (3.23)$$

Taking advantage of the symmetry about $s = 0$,

$$\sum_{k=1}^n \frac{1}{\left|\sin\left(\frac{n-2k+1}{2n}\pi\right)\right|} = 2 \sum_{k=1}^{n/2} \frac{1}{\sin\left(\frac{2k-1}{2n}\pi\right)}. \qquad (3.24)$$

Next, we use the lower bound

$$\sin x \geq x/2 \quad (0 \leq x \leq \pi/2) \qquad (3.25)$$

in order to bound the terms of the sum:

$$2 \sum_{k=1}^{n/2} \frac{1}{\sin\left(\frac{2k-1}{2n}\pi\right)} \leq \frac{8n}{\pi} \sum_{k=1}^{n/2} \frac{1}{2k-1} = \frac{8n}{\pi}\left( H_n - \frac{1}{2} H_{n/2} \right). \qquad (3.26)$$

Here, $H_n$ denotes the $n$th harmonic number. From the relation $H_n = \gamma + \psi(n+1)$, where $\psi$ is the digamma function,

$$H_n - \frac{1}{2} H_{n/2} = \gamma/2 + \psi(n+1) - \frac{1}{2}\psi(n/2+1). \qquad (3.27)$$

Moreover, since $\psi(x) \in (\log(x - 1/2), \log(x))$ for any $x > 1/2$, this is upper bounded by

$$H_n - \frac{1}{2} H_{n/2} < \gamma/2 + \log(n+1) - \frac{1}{2}\log\left(\frac{n+1}{2}\right) = \frac{\gamma + \log 2}{2} + \frac{\log(n+1)}{2}. \qquad (3.28)$$

Reinserting this into (3.26), one obtains the bound

$$\sum_{k=1}^n \frac{1}{|s_k|} \leq \frac{4n}{\pi}\left(\gamma + \log(2n+2)\right). \qquad (3.29)$$

The general lemma follows from a rescaling by $1/a$. □

Observe that $1/|s_k|$ is essentially the number of Trotter steps needed to compute the $k$th interpolation point. Thus, Lemma 3.3.1 amounts to a bound on the total number of Trotter steps, and we see this grows as $O(a^{-1} n \log n)$.

Lemma 3.3.2. Let $p(s) = (p_0(s), p_1(s), \ldots, p_{n-1}(s))$ be the vector of (normalized) Chebyshev polynomials on $[-a, a]$. Then

$$\|\mathrm{V} p(0)\|_1 < \frac{2}{\pi}\log(n+1) + 1$$

where $\|\cdot\|_1$ denotes the vector 1-norm.

Proof. Let $t(s) = \mathrm{V} p(s)$. For each $k = 1, 2, \ldots, n$ we have

$$t_k(s) = \sum_{j=0}^{n-1} \mathrm{V}_{kj}\, p_j(s) = \sum_{j=0}^{n-1} p_j(s_k)\, p_j(s) = \frac{1}{n} + \frac{2}{n} \sum_{j=1}^{n-1} \cos\left( j\,\frac{2k-1}{2n}\pi \right) \cos(j \cos^{-1} s). \qquad (3.30)$$

At $s = 0$, $\cos(j \cos^{-1} 0) = \cos(j\pi/2)$, which is zero for odd $j$. Hence,

$$t_k(0) = \frac{1}{n} + \frac{2}{n} \sum_{j=2,\,\mathrm{even}}^{n-2} \cos\left( j\,\frac{2k-1}{2n}\pi \right) (-1)^{j/2} = \frac{1}{n} + \frac{2}{n} \sum_{j'=1}^{n/2-1} (-1)^{j'} \cos\left( \pi j'\,\frac{2k-1}{n} \right). \qquad (3.31)-(3.32)$$

The sum can be evaluated exactly (we used Mathematica), yielding

$$t_k(0) = \frac{1}{n} - \frac{2}{n} \cdot \frac{1 - \cos((k+n/2)\pi)\tan\left(\frac{2k-1}{2n}\pi\right)}{2} = \frac{1}{n} - \frac{1}{n}\left( 1 - (-1)^{k+n/2}\tan\left(\frac{2k-1}{2n}\pi\right) \right) = \frac{1}{n}(-1)^{k+n/2}\tan\left(\frac{2k-1}{2n}\pi\right). \qquad (3.33)-(3.34)$$

With coefficients in hand, we now compute the one norm of $t(0)$:

$$\|t(0)\|_1 = \frac{1}{n} \sum_{k=1}^n \left| \tan\left(\frac{2k-1}{2n}\pi\right) \right|. \qquad (3.35)$$

We have a reflectional symmetry under $k \to n - k + 1$, allowing us to cut the sum in half and remove the absolute value sign:

$$\|t(0)\|_1 = \frac{2}{n} \sum_{k=1}^{n/2} \tan\left(\frac{2k-1}{2n}\pi\right) = \frac{1}{m} \sum_{k=1}^{m} \tan\left(\frac{2k-1}{4m}\pi\right). \qquad (3.36)$$

Here, $m \equiv n/2$. We observe that the sum increases as $k$ approaches $m$ due to the first order pole of the tangent at $\pi/2$. We can upper bound $\tan(x) \leq 1/(\pi/2 - x)$ on $[0, \pi/2)$, and therefore the sum above, as follows:

$$\frac{1}{m} \sum_{k=1}^{m} \tan\left(\frac{2k-1}{4m}\pi\right) \leq \frac{1}{m} \sum_{k=1}^{m} \frac{1}{\frac{\pi}{2} - \frac{2k-1}{4m}\pi} = \frac{4}{\pi} \sum_{k=1}^{m} \frac{1}{2(m-k)+1} = \frac{4}{\pi} \sum_{j=1}^{m} \frac{1}{2j-1}. \qquad (3.37)$$

In the last step, we reindexed by $j = m - k + 1$. Borrowing the reasoning from the prior lemma,

$$\frac{4}{\pi} \sum_{j=1}^{m} \frac{1}{2j-1} < \frac{2}{\pi}\left(\gamma + \log(2n+2)\right). \qquad (3.38)$$

Tracing back, this is an upper bound on $\|t(0)\|_1$. Hence,

$$\|t(0)\|_1 < \frac{2}{\pi}\log(n+1) + \frac{2(\gamma + \log 2)}{\pi} < \frac{2}{\pi}\log(n+1) + 1. \qquad (3.39)$$

□

The benefit of well-conditioning comes from relaxing the need for exquisitely precise data to achieve good interpolations. This property is captured by the following theorem.

Theorem 3.3.3. Let $y = (f(s_1), f(s_2), \ldots, f(s_n))^T$, and let $\tilde{y} \in \mathbb{R}^n$ be an approximation of $y$ in the sense that, for all $1 \leq j \leq n$, $|f(s_j) - \tilde{y}_j| \leq \epsilon/(\frac{2}{\pi}\log(n+1) + 1)$ with probability at least $1 - \delta/n$. Let $p(s) = (p_0(s), \ldots, p_{n-1}(s))^T$ be the vector of orthonormal Chebyshev polynomials. Then $\tilde{y}^T \mathrm{V} p(0)$ is an estimate of the interpolant $P_{n-1}f$ at $s = 0$ to precision

$$|P_{n-1}f(0) - \tilde{y}^T \mathrm{V} p(0)| \leq \epsilon$$

with probability at least $1 - \delta$.

Proof. First, observe that $P_{n-1}f(s) = p(s)^T c = p(s)^T \mathrm{V}^T y$ by the discussion surrounding (3.22). Hence,

$$|P_{n-1}f(0) - p(0)^T \mathrm{V}^T \tilde{y}| = |(\mathrm{V}p(0))^T (y - \tilde{y})|. \qquad (3.40)$$

By Hölder's inequality,

$$|(\mathrm{V}p(0))^T (y - \tilde{y})| \leq \|\mathrm{V}p(0)\|_1\, \|y - \tilde{y}\|_\infty. \qquad (3.41)$$

From Lemma 3.3.2, and from the assumptions on the distance between $y$ and $\tilde{y}$,

$$\|\mathrm{V}p(0)\|_1\, \|y - \tilde{y}\|_\infty \leq \left( \frac{2}{\pi}\log(n+1) + 1 \right) \frac{\epsilon}{\frac{2}{\pi}\log(n+1) + 1} = \epsilon \qquad (3.42)$$

with probability at least $(1 - \delta/n)^n$. In fact, since the probability of each component of $\tilde{y}$ exceeding the specified distance is at most $\delta/n$, by the union bound the total probability of at least one component exceeding this distance is less than $n \times (\delta/n) = \delta$. Thus, the inequality is satisfied with probability at least $1 - \delta$. This completes the proof. □
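Both lemmas are straightforward to spot-check numerically (our sketch, with $a = 1$):

```python
import numpy as np

# Numerical check of Lemmas 3.3.1 and 3.3.2 for several even n, a = 1.
gamma = 0.5772156649  # Euler-Mascheroni constant
for n in [4, 8, 16, 32]:
    k = np.arange(1, n + 1)
    s = np.cos((2 * k - 1) * np.pi / (2 * n))                # Chebyshev nodes
    lhs1 = np.sum(1 / np.abs(s))
    rhs1 = 4 * n / np.pi * (gamma + np.log(2 * n + 2))       # Lemma 3.3.1

    j = np.arange(n)
    norm = np.where(j == 0, np.sqrt(1 / n), np.sqrt(2 / n))
    V = np.cos(np.outer(np.arccos(s), j)) * norm             # eq. (3.21)
    p0 = np.cos(j * np.pi / 2) * norm                        # p_j(0)
    lhs2 = np.sum(np.abs(V @ p0))
    rhs2 = 2 / np.pi * np.log(n + 1) + 1                     # Lemma 3.3.2
    print(n, lhs1 <= rhs1, lhs2 < rhs2)                      # True True
```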
Theorem 3.3.3 is what suggests that our interpolation approach has the potential to achieve accuracy improvements without increasing costs compared to standard Trotter. It tells us that the error in the Trotter data can be as large as the error of the final estimate, up to a factor which is only logarithmically small in the number of interpolation points, and therefore these data $\tilde{y}_i$ can be computed "cheaply enough." Thus, Theorem 3.3.3 plays an important role in the proof of Lemma 3.5.1, presented in Section 3.5.

3.4 The Effective Hamiltonian

Trotter formulas approximate $U(t)$ only in a neighborhood around $t = 0$; thus the standard procedure for product formula simulations, as described in Section 2.7.1, is to divide the simulation interval $[0, t]$ into $r$ subintervals, each sufficiently small that the Trotter approximation is valid. For the simple case of a uniform mesh of $r$ subintervals, this becomes

$$S_{2k}(t/r)^r = U(t) + O(t^{2k+1}/r^{2k}) \qquad (3.43)$$

where big $O$ is understood as taking $r$ large. However, it is simpler for our subsequent analysis to consider $s = 1/r$ as a "dimensionless step size," and instead think about $s \to 0$. In terms of $s$, we define

$$\tilde{U}_s(t) := S_{2k}(st)^{1/s} \qquad (3.44)$$

as the approximate evolution operator for $s \neq 0$. The discontinuity at $s = 0$ in (3.44) may be filled by the exact evolution $\tilde{U}_0(t) := U(t)$. Though we defined $s$ as a reciprocal integer, definition (3.44) suggests an extension allowing $s$ to be real-valued. In fact, the resulting function $\tilde{U}_s$ is smooth on a neighborhood of $s = 0$, a fact that will allow us to precisely characterize the interpolation error. For our purposes, we will only consider $|s| \leq 1$. When $1/s$ is not an integer, we may implement $\tilde{U}_s$ using fractional queries [54] by splitting $1/s$ into integer and fractional parts,

$$1/s = r + f. \qquad (3.45)$$

Here, $r = \mathrm{rnd}(1/s) \in \mathbb{Z}$ is $1/s$ rounded to the nearest integer, and $f \in [-1/2, 1/2]$. Finally, we note that $\tilde{U}_s$ is an even function of $s$, which we will use to cut the number of interpolation points in half by reflecting across $s = 0$.

Prior work has demonstrated the value of considering the effective Trotter Hamiltonian in the analysis of Trotter formulas [139]. This approach also helps us calculate the high order derivatives of $\tilde{U}_s$ needed for our error bounds. We define an effective Hamiltonian $\tilde{H}_s$ so that

$$\tilde{H}_s := \frac{i}{st} \log S_{2k}(st) \qquad (3.46)$$

$$\tilde{U}_s(t) = e^{-i\tilde{H}_s t}. \qquad (3.47)$$

Note that $\tilde{H}_s$ depends on $t$ as well, though this dependence will be left implicit.
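For explicit matrices, the effective Hamiltonian can be inspected directly (our sketch, using a second-order splitting and a matrix logarithm in place of the formal series below): $\|\tilde{H}_s - H\|$ vanishes as $s \to 0$ at the rate set by the order of the formula.

```python
import numpy as np
from scipy.linalg import expm, logm

# Effective Hamiltonian H~_s = (i/(st)) log S_2(st) of eq. (3.46) for a
# second-order (k = 1) symmetric formula, via the matrix logarithm.
rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4)); H1 = (A + A.T) / 2
B = rng.normal(size=(4, 4)); H2 = (B + B.T) / 2
H, t = H1 + H2, 1.0

def S2(dt):
    h = expm(-1j * H1 * dt / 2)
    return h @ expm(-1j * H2 * dt) @ h    # symmetric second-order splitting

for s in [0.5, 0.25, 0.125]:
    H_eff = 1j / (s * t) * logm(S2(s * t))
    print(s, np.linalg.norm(H_eff - H, 2))   # decreases as O(s^2)
```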
For the purposes of bounding the interpolation error, we require a bound on the norm of $\tilde{H}_s$. This is supplied by the following lemma.

Lemma 3.4.1. In the notation introduced above, let $s$ be chosen such that

$$k (5/3)^k m \max_{l \in [1,m]} \|H_l\|\, |s| t \leq \pi/20.$$

Then the following bound on the derivatives of $\tilde{H}_s$ with respect to $s$ holds:

$$\|\partial_s^n \tilde{H}_s\| \leq 2 t^{-1} n^n \left( e^2 k (5/3)^k m \max_{l \in [1,m]} \|H_l\|\, t \right)^{n+1}.$$

Note that our bounds are uniformly worse for larger $k$, i.e., higher order ST formulas. Assuming that this is not an artifact of our mathematical treatment, this suggests low order formulas are unconditionally preferred over high order ones for interpolation. Numerical studies could help determine the true impact of higher order formulas on the interpolation procedure.

We conclude this section with the proof of the above lemma, which will be essential to our subsequent error analysis. The upper bound will prove useful because, as we will see, the error in polynomial interpolation can be expressed using a formula akin to the Taylor remainder, which involves a high-order derivative.

Proof of Lemma 3.4.1. Recall the definition of the effective Hamiltonian (3.46), defined for $s \in \mathbb{R} \setminus \{0\}$ and, for $s = 0$, by $\tilde{H}_0 := \lim_{s \to 0} \tilde{H}_s = H$. We will understand $\log S_{2k}(st)$ through a power series expansion about the identity:

$$\log S_{2k}(st) = \sum_{j=0}^{\infty} \frac{(-1)^j}{j+1} \left( S_{2k}(st) - I \right)^{j+1}. \qquad (3.48)$$

This series converges precisely when

$$\|S_{2k}(st) - I\| \leq 1. \qquad (3.49)$$

Using the fundamental theorem of calculus, we can derive a suitable condition for convergence in a neighborhood of $s = 0$. Since $S_{2k}(0) = I$, the condition above may be rewritten as

$$\left\| \int_0^{st} \frac{d}{dx} S_{2k}(x)\, dx \right\| \leq 1, \qquad (3.50)$$

which is satisfied provided

$$|st| \max_{x \in [0, st]} \left\| \frac{d}{dx} S_{2k}(x) \right\| \leq 1. \qquad (3.51)$$

Writing out $S_{2k}(x) = \prod_{l=1}^{N_k} \exp(-i H_{j_l} \tau_l x)$, where $H_{j_l}$ is some Hamiltonian piece $H_j$ indexed by $l$, the derivative can be upper bounded as

$$\max_{x \in [0, st]} \left\| \frac{d}{dx} S_{2k}(x) \right\| \leq \sum_{l=1}^{N_k} \|H_{j_l}\| |\tau_l| \leq \max_j \|H_j\|\, \|\tau\|_1 \qquad (3.52)$$

where $\tau = (\tau_l)_{l=1}^{N_k}$ is the vector of ST coefficients, and in the second inequality we used a Hölder inequality. We have $\|\tau\|_1 \leq N_k \max_l |\tau_l|$, and from Appendix A of [137] we have

$$\max_l |\tau_l| \leq 2k/3^k. \qquad (3.53)$$

Thus, the requirement for convergence of the logarithm becomes

$$\frac{4}{3} k (5/3)^{k-1} m |st| \max_j \|H_j\| \leq 1, \qquad (3.54)$$

where we used the expression $N_k = (2m) 5^{k-1}$ for the number of ST exponentials.

We now assume $s$ is within the symmetric interval defined by (3.54), such that (3.48) is convergent. Since $\log S_{2k}(0) = 0$, $s = 0$ is a zero of order at least one. We want to absorb the diverging $1/s$ term and better understand the leading dependence on $s$. To facilitate this, we write

$$\tilde{H}_s = -\frac{1}{it} \sum_{j=0}^{\infty} \frac{(-1)^j}{j+1}\, s^j\, \Delta S_{2k}(st)^{j+1} \qquad (3.55)$$

where we defined

$$\Delta S_{2k}(st) := \frac{S_{2k}(st) - I}{s}. \qquad (3.56)$$

Note that $\Delta S_{2k}$ is analytic in $s$, and is a finite difference around $s = 0$, such that

$$\lim_{s \to 0} \Delta S_{2k}(st) = -iHt. \qquad (3.57)$$

Through the series expansion (3.55) we may bound derivatives of $\tilde{H}_s$ via bounds on derivatives of $\Delta S_{2k}$. We first obtain a power series for $\Delta S_{2k}$ by Taylor expanding every term in the product formula $S_{2k}$. Regrouping in powers of $st$, the result is

$$\Delta S_{2k}(st) = \sum_{j=1}^{\infty} \frac{s^{j-1} (-it)^j}{j!} \sum_J \binom{j}{j_1 \ldots j_{N_k}} \prod_{l=1}^{N_k} (H_l \tau_l)^{j_l} \qquad (3.58)$$

where the parenthetical symbol is the multinomial coefficient, and the sum $\sum_J$ is over all values of $J = (j_1, \ldots, j_{N_k})$ such that $\sum_k j_k = j$. The derivatives with respect to $s$ are now easy to compute.
𝑠 π‘—βˆ’π‘›βˆ’1 for 𝑗 > 𝑛 (and zero otherwise), we have πœ•π‘› 𝑠 Δ𝑆2π‘˜ (𝑠𝑑) = βˆ₯πœ•π‘› 𝑠 Δ𝑆2π‘˜ (𝑠𝑑)βˆ₯ ≀ ∞ βˆ‘οΈ 𝑗=𝑛+1 ∞ βˆ‘οΈ 𝑗=𝑛+1 𝑠 π‘—βˆ’π‘›βˆ’1(βˆ’π‘–π‘‘) 𝑗 𝑗! ( 𝑗 βˆ’ 1)! ( 𝑗 βˆ’ 1 βˆ’ 𝑛)! (cid:18) βˆ‘οΈ 𝐽 𝑗 𝑗1 . . . π‘—π‘π‘˜ (cid:19) π‘π‘˜(cid:214) 𝑙=1 (π»π‘™πœπ‘™) 𝑗𝑙 𝑑 𝑗 ( 𝑗 βˆ’ 𝑛 βˆ’ 1)! 𝑠 π‘—βˆ’π‘›βˆ’1(𝜏maxπ‘π‘˜ Ξ›) 𝑗 (3.59) (3.60) 50 where Ξ› := max 𝑗 βˆ₯𝐻 𝑗 βˆ₯ and 𝜏max = max𝑙 |πœπ‘™ |. Factoring out powers of 𝑛 + 1 and reindexing, we are left with the following bound on derivatives of Δ𝑆2π‘˜ . βˆ₯πœ•π‘› 𝑠 Δ𝑆2π‘˜ (𝑠𝑑) βˆ₯ ≀ (𝜏maxπ‘π‘˜ Λ𝑑)𝑛+1π‘’π‘ πœmaxπ‘π‘˜Ξ›π‘‘ (3.61) This expression is quite elegant; it is as if we were taking 𝑛 + 1 derivatives of the exponential 𝑒𝑐𝑠 with 𝑐 := 𝜏maxπ‘π‘˜ Λ𝑑 ≀ π‘˜ (5/3) π‘˜ π‘šΞ›π‘‘. (3.62) Factors of 𝑐 will occur frequently in what follows, so we find it convenient to adopt this symbol as shorthand. We return to bounding the derivatives of powers of Δ𝑆2π‘˜ (𝑠𝑑) as in equation (3.55). πœ•π‘› 𝑠 (cid:2)Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:3) We reduce this to the previous case by performing a multinomial expansion. 𝑠 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1 = πœ•π‘› (cid:18) βˆ‘οΈ 𝑁 𝑛 𝑛0 . . . 𝑛 𝑗 (cid:19) 𝑗 (cid:214) 𝑙=0 πœ•π‘›π‘™ 𝑠 Δ𝑆2π‘˜ (𝑠𝑑) (3.63) (3.64) As usual, the capital letter 𝑁 denotes the set of all nonnegative indices 𝑛0, . . . , 𝑛 𝑗 summing to 𝑛. Applying the triangle inequality and submultiplicativity, and employing the bound (3.61), βˆ₯πœ•π‘› 𝑠 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1βˆ₯ ≀ (cid:18) βˆ‘οΈ 𝑁 𝑛 𝑛0 . . . 𝑛 𝑗 (cid:19) 𝑗 (cid:214) βˆ₯πœ•π‘›π‘™ 𝑠 Δ𝑆2π‘˜ (𝑠𝑑) βˆ₯ (cid:19) (cid:18) ≀ βˆ‘οΈ 𝑛 𝑛0 . . . 𝑛 𝑗 𝑙=0 (cid:18) = 𝑒( 𝑗+1)𝑐𝑠𝑐𝑛+ 𝑗+1 βˆ‘οΈ 𝑁 𝑙=0 𝑗 (cid:214) 𝑁 𝑐𝑛𝑙+1𝑒𝑐𝑠 𝑛 𝑛0 . . . 𝑛 𝑗 (cid:19) , (3.65) where we’ve used the sum property of the 𝑛𝑙 where appropriate. The remaining sum over the multinomial coefficient is given by ( 𝑗 + 1)𝑛. Hence, βˆ₯πœ•π‘› 𝑠 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1βˆ₯ ≀ (( 𝑗 + 1)𝑐)𝑛 (𝑐𝑒𝑐𝑠) 𝑗+1. (3.66) 51 Notice that, when 𝑗 = 0, this is consistent with equation (3.61). With result (3.66) in hand, we return to the power series (3.55). Differentiating term by term πœ•π‘› 𝑠 Λœπ»π‘  = βˆ’ 1 𝑖𝑑 ∞ βˆ‘οΈ 𝑗=0 (βˆ’1) 𝑗 𝑗 + 1 πœ•π‘› 𝑠 (cid:16) 𝑠 𝑗 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) and performing a binomial expansion for each term πœ•π‘› 𝑠 (cid:16) 𝑠 𝑗 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) = 𝑛 βˆ‘οΈ π‘ž=0 (cid:19) (cid:18)𝑛 π‘ž 𝑠 𝑠 𝑗 (cid:1) (cid:16) (cid:0)πœ•π‘ž 𝑠 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) πœ•π‘›βˆ’π‘ž (3.67) (3.68) will allow us to apply our previous results. It will be helpful to consider two cases separately: 𝑗 ≀ 𝑛 and 𝑗 > 𝑛. These regimes are somewhat qualitatively different, since the derivatives of 𝑠 𝑗 may or may not vanish depending on the number of derivatives. Focusing on the case 𝑗 ≀ 𝑛, we have πœ•π‘› 𝑠 (cid:16) 𝑠 𝑗 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) = 𝑗 βˆ‘οΈ π‘ž=0 (cid:19) (cid:18)𝑛 π‘ž 𝑗! ( 𝑗 βˆ’ π‘ž)! 𝑠 π‘—βˆ’π‘ž (cid:16) 𝑠 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) πœ•π‘›βˆ’π‘ž . (3.69) Note that the sum runs only to 𝑗, not 𝑛. Taking a triangle inequality upper bound using (3.66), we may upper bound (3.69) as 𝑗 βˆ‘οΈ π‘ž=0 (cid:19) (cid:18)𝑛 π‘ž 𝑗! ( 𝑗 βˆ’ π‘ž)! 𝑠 π‘—βˆ’π‘ž (( 𝑗 + 1)𝑐)π‘›βˆ’π‘ž (𝑐𝑒𝑐𝑠) 𝑗+1 = (𝑐𝑒𝑐𝑠) 𝑗+1 𝑗 βˆ‘οΈ π‘ž=0 (cid:19) (cid:18) 𝑗 π‘ž 𝑛! (𝑛 βˆ’ π‘ž)! 𝑠 π‘—βˆ’π‘ž (( 𝑗 + 1)𝑐)π‘›βˆ’π‘ž (3.70) where we have factored out terms not involving π‘ž from the sum, and manipulated the factorials for reasons which will be seen presently. Taking the upper bound 𝑛!/(𝑛 βˆ’ π‘ž)! 
< π‘›π‘ž, and factoring out 𝑛 βˆ’ 𝑗 powers of ( 𝑗 + 1)𝑐, we may upper bound the above expression by (𝑐𝑒𝑐𝑠) 𝑗+1(( 𝑗 + 1)𝑐)π‘›βˆ’ 𝑗 𝑗 βˆ‘οΈ π‘ž=0 (cid:19) (cid:18) 𝑗 π‘ž π‘›π‘ž (( 𝑗 + 1)𝑐𝑠) π‘—βˆ’π‘ž = (𝑐𝑒𝑐𝑠) 𝑗+1(( 𝑗 + 1)𝑐)π‘›βˆ’ 𝑗 (𝑛 + ( 𝑗 + 1)𝑐𝑠) 𝑗 . Thus, with some minor polishing, we may express the bound on (3.68) for 𝑗 ≀ 𝑛 as βˆ₯πœ•π‘› 𝑠 (cid:16) 𝑠 𝑗 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) βˆ₯ ≀ 𝑒( 𝑗+1)𝑐𝑠𝑐𝑛+1( 𝑗 + 1)𝑛 (cid:18) 𝑛 𝑗 + 1 (cid:19) 𝑗 . + 𝑐𝑠 52 (3.71) (3.72) Now let’s move on to the 𝑗 > 𝑛 case. Here, there are not enough derivatives to kill off the 𝑠 𝑗 term, so the binomial sum in (3.69) will run from π‘ž = 0 to 𝑛. πœ•π‘› 𝑠 (cid:16) 𝑠 𝑗 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) = 𝑛 βˆ‘οΈ π‘ž=0 (cid:19) (cid:18)𝑛 π‘ž 𝑗! ( 𝑗 βˆ’ π‘ž)! 𝑠 π‘—βˆ’π‘ž (cid:16) 𝑠 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) πœ•π‘›βˆ’π‘ž (3.73) Similar to before, we use the bound (3.66), to obtain βˆ₯πœ•π‘› 𝑠 (cid:16) 𝑠 𝑗 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) βˆ₯ ≀ 𝑛 βˆ‘οΈ π‘ž=0 (cid:19) (cid:18)𝑛 π‘ž 𝑠 π‘—βˆ’π‘ž (( 𝑗 + 1)𝑐)π‘›βˆ’π‘ž (𝑐𝑒𝑐𝑠) 𝑗+1 = (𝑐𝑒𝑐𝑠) 𝑗+1𝑠 π‘—βˆ’π‘› (cid:19) (cid:18)𝑛 π‘ž 𝑗! ( 𝑗 βˆ’ π‘ž)! (( 𝑗 + 1)𝑐𝑠)π‘›βˆ’π‘ž. (3.74) 𝑗! ( 𝑗 βˆ’ π‘ž)! 𝑛 βˆ‘οΈ π‘ž=0 Taking 𝑗!/( 𝑗 βˆ’ π‘ž)! < 𝑗 π‘ž, a simpler upper bound is given by (𝑐𝑒𝑐𝑠) 𝑗+1𝑠 π‘—βˆ’π‘› 𝑛 βˆ‘οΈ π‘ž=0 (cid:19) (cid:18)𝑛 π‘ž 𝑗 π‘ž (( 𝑗 + 1)𝑐𝑠)π‘›βˆ’π‘ž = (𝑐𝑒𝑐𝑠) 𝑗+1𝑠 π‘—βˆ’π‘› ( 𝑗 + ( 𝑗 + 1)𝑐𝑠)𝑛. (3.75) With some minor rearrangements, this gives the following upper bound for 𝑗 > 𝑛. βˆ₯πœ•π‘› 𝑠 (cid:16) 𝑠 𝑗 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) βˆ₯ ≀ 𝑒( 𝑗+1)𝑐𝑠𝑐𝑛+1( 𝑗 + 1)𝑛 (𝑐𝑠) π‘—βˆ’π‘› (cid:18) 𝑗 𝑗 + 1 (cid:19) 𝑛 + 𝑐𝑠 (3.76) With the bounds (3.72) and (3.76), we can return to bounding πœ•π‘› 𝑠 Λœπ»π‘ . Still separating the two cases 𝑗 ≀ 𝑛 and 𝑗 > 𝑛, we can write βˆ₯πœ•π‘› 𝑠 Λœπ»π‘  βˆ₯𝑑 ≀ 𝑛 βˆ‘οΈ 𝑗=0 1 𝑗 + 1 = 𝐡𝑙 + π΅β„Ž βˆ₯πœ•π‘› 𝑠 (cid:16) 𝑠 𝑗 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) βˆ₯ + ∞ βˆ‘οΈ 𝑗=𝑛+1 1 𝑗 + 1 βˆ₯πœ•π‘› 𝑠 (cid:16) 𝑠 𝑗 Δ𝑆2π‘˜ (𝑠𝑑) 𝑗+1(cid:17) βˆ₯ (3.77) where 𝐡𝑙 and π΅β„Ž refer to bounds on the "low" and "high" parts of the series. Employing the bounds from equations (3.72) and (3.76), we have 𝐡𝑙 ≀ 𝑛 βˆ‘οΈ 𝑗=0 = 𝑐𝑛+1 1 𝑗 + 1 𝑛 βˆ‘οΈ 𝑒( 𝑗+1)𝑐𝑠𝑐𝑛+1( 𝑗 + 1)𝑛 (cid:18) 𝑛 𝑗 + 1 (cid:19) 𝑗 + 𝑐𝑠 𝑒( 𝑗+1)𝑐𝑠 ( 𝑗 + 1)π‘›βˆ’1 (cid:18) 𝑐𝑠 + (cid:19) 𝑗 𝑛 𝑗 + 1 (3.78) 𝑗=0 53 and π΅β„Ž ≀ ∞ βˆ‘οΈ 𝑗=𝑛+1 1 𝑗 + 1 𝑒( 𝑗+1)𝑐𝑠𝑐𝑛+1( 𝑗 + 1)𝑛 (𝑐𝑠) π‘—βˆ’π‘› (cid:18) 𝑗 𝑗 + 1 (cid:19) 𝑛 + 𝑐𝑠 ≀ 𝑐𝑛+1 ∞ βˆ‘οΈ 𝑗=𝑛+1 𝑒( 𝑗+1)𝑐𝑠 ( 𝑗 + 1)π‘›βˆ’1(𝑐𝑠) π‘—βˆ’π‘› (1 + 𝑐𝑠)𝑛 (3.79) = 𝑐𝑛+1 (1 + 𝑐𝑠)𝑛 ∞ βˆ‘οΈ 𝑗=𝑛+1 𝑒( 𝑗+1)𝑐𝑠 ( 𝑗 + 1)π‘›βˆ’1(𝑐𝑠) π‘—βˆ’π‘›. Let’s begin by simplifying the bound on 𝐡𝑙. We will at this point make the assumption that 𝑠 is sufficiently small such that 𝑐𝑠 < 1. This will necessarily factor into the cost later. This simplification yields 𝐡𝑙 ≀ 𝑐𝑛+1 ≀ 𝑐𝑛+1 𝑛 βˆ‘οΈ 𝑗=0 𝑛 βˆ‘οΈ 𝑗=0 𝑒 𝑗+1( 𝑗 + 1)π‘›βˆ’1 (cid:18) 1 + (cid:19) 𝑗 𝑛 𝑗 + 1 𝑒 𝑗+1( 𝑗 + 1)π‘›βˆ’1𝑒𝑛 ≀ 1 𝑒 (𝑒2𝑐)𝑛+1 𝑛+1 βˆ‘οΈ 𝑗=1 𝑗 π‘›βˆ’1. The remaining sum can be bounded by (𝑛 + 1)𝑛, hence, 𝐡𝑙 ≀ (𝑒2(𝑛 + 1)𝑐)𝑛+1 𝑒(𝑛 + 1) ≀ (𝑒2𝑐)𝑛+1𝑛𝑛, (3.80) (3.81) where the definition that 00 = 1 handles the edge case. Let’s turn our attention to π΅β„Ž. We will start by reindexing so that the series begins at 𝑗 = 0 in (3.79). π΅β„Ž ≀ 𝑐𝑛+1(1 + 𝑐𝑠)𝑛 ∞ βˆ‘οΈ 𝑗=0 𝑒( 𝑗+𝑛+2)𝑐𝑠 ( 𝑗 + 𝑛 + 2)π‘›βˆ’1(𝑐𝑠) 𝑗+1 = (𝑐𝑒𝑐𝑠)𝑛+1(1 + 𝑐𝑠)𝑛 ∞ βˆ‘οΈ 𝑗=0 (𝑐𝑠𝑒𝑐𝑠) 𝑗+1( 𝑗 + 𝑛 + 2)π‘›βˆ’1 The series converges if and only if 𝑐𝑠𝑒𝑐𝑠 < 1. 
This condition is slightly stronger than the condition (3.54) that we need for convergence of the logarithm (3.48), and is equivalent to \(cs < W(1) \approx 0.567\), where \(W\) is the principal branch of the Lambert W function. Returning to (3.83), we have the bound
\[
(j+n+2)^{n-1} = (n+2)^{n-1}\left(1 + \frac{j}{n+2}\right)^{n-1} \le (n+2)^{n-1} e^j. \tag{3.85}
\]
Thus, we have
\[
B_h \le (c\, e^{cs})^{n+1}(1+cs)^n (n+2)^{n-1} \sum_{j=0}^{\infty} (e\, cs\, e^{cs})^j = (c\, e^{cs})^{n+1}(1+cs)^n (n+2)^{n-1}\, \frac{1}{1 - e\, cs\, e^{cs}}. \tag{3.86}
\]
To be concrete, let's take \(e\, cs\, e^{cs} < 1/2\), which is implied by \(cs < \pi/20\). Coupled with the inequality in (3.62), this condition can be met provided that
\[
k (5/3)^k m \Lambda s t \le \pi/20, \tag{3.87}
\]
which is exactly the assumption of Lemma 3.4.1. This allows us to upper bound \(B_h\) further as
\[
B_h \le 2 e^{\pi/20} (cs)^{n+1} (3 e^{\pi/20}/2)^n (n+2)^{n-1} \le 4 (cs)^{n+1} (9/5)^n (n+2)^{n-1}. \tag{3.88}
\]
Since \((n+2)^{n-1} \le e^2 n^n / 2\) (using \(0^0 := 1\) for the edge case \(n = 0\)), we have
\[
B_h \le 2 e^2 (cs)^{n+1} (9/5)^n n^n. \tag{3.89}
\]
Altogether, using \(s \le 1\),
\[
\|\partial_s^n \tilde{H}_s\|\, t \le n^n (e^2 c)^{n+1} \left( 1 + 2\left(\frac{9}{5e^2}\right)^n \right) \le 2 n^n (e^2 c)^{n+1}. \tag{3.90}
\]
The final result then follows from substituting for \(c\) and noting that the duration of each time step is at most \(2k/3^{k-1}\) using the results of [138]. β–‘

3.5 Application to Dynamical Observables

We now consider the application of Chebyshev interpolation to estimate expectation values, a fundamental task in quantum computation. The setting is as follows: given a quantum state \(\rho\) and observable \(O\), the expectation value is given by \(\langle O\rangle = \mathrm{Tr}(\rho O)\). We evolve our system according to a \(2k\)-th order ST formula \(\tilde{U}_s\) given by (3.44). The time evolved expectation value of interest is captured by the function
\[
f(s) := \frac{\mathrm{Tr}(\rho\, O_s(t))}{\|O\|} \tag{3.91}
\]
where \(O_s(t)\) is given by equation (3.5). We've normalized the expectation values by \(\|O\|\) because the relative error is a useful and natural metric, and also because the normalized operators may be block encoded for amplitude estimation. Alternatively, we may simply restrict our attention to normalized observables with \(\|O\| = 1\).

The interpolation algorithm we propose can be summarized as follows.

1. Given Hamiltonian \(H\), simulation time \(t\), and tolerance \(\epsilon\) for the estimate of \(\langle O(t)\rangle/\|O\|\), choose the appropriate interpolation interval \([-a, a]\) and an even number \(n\) of Chebyshev nodes. We neglect the cost of this step. The error analysis we will perform subsequently will inform the choices of \(a\) and \(n\).

2. Compute estimates \(\tilde{y}_i\) of the expectation values \(\langle O_{s_i}(t)\rangle\) for each \(s_i\) with \(i = 1, \ldots, n/2\), to an accuracy depending on \(\epsilon\) and \(n\). We will assume this step is done with Iterative Quantum Amplitude Estimation (IQAE) [56], a recent approach to amplitude estimation that exhibits low quantum overhead. Our metric of cost is the number of \(H_j\) exponentials executed on a quantum circuit, where \(H = \sum_j H_j\). Note that by symmetry, we need not compute \(\tilde{y}_i\) for \(i > n/2\): we have \(f(s_i) = f(s_{n-i+1})\) for all \(i \in \{1, \ldots, n\}\).

3. Perform the polynomial fit \(\tilde{P}_{n-1} f\) through the points \((s_i, \tilde{y}_i)\) using a Chebyshev expansion (3.17). Note that \(\tilde{P}_{n-1} f\) will automatically be even. This fit is well-conditioned, and we neglect the cost of this step.

4. Evaluate \(\tilde{P}_{n-1} f(0)\) and take it as our estimate of \(\langle O(t)\rangle\). (A minimal classical sketch of this post-processing is given below.)
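Since steps 3 and 4 are purely classical, it may help to see them spelled out concretely. The following is a minimal Python sketch, assuming the node estimates \(\tilde{y}_i\) have already been gathered; the helper name chebyshev_estimate and the use of NumPy's Chebyshev module are our illustrative choices, not part of the algorithm specification.

import numpy as np
from numpy.polynomial import chebyshev as C

def chebyshev_estimate(y_nodes, a):
    """Steps 3-4: fit the node data and evaluate the fit at s = 0.

    y_nodes : estimates y~_i of f(s_i) at the n Chebyshev nodes on [-a, a]
    a       : half-width of the interpolation interval
    """
    n = len(y_nodes)
    # Chebyshev nodes (first kind) on [-a, a], matching the order of y_nodes
    s = a * np.cos((2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n))
    # Degree n-1 fit through n points: interpolation, and well-conditioned
    coeffs = C.chebfit(s / a, y_nodes, deg=n - 1)
    # The estimate of <O(t)>/||O|| is the fitted polynomial evaluated at s = 0
    return C.chebval(0.0, coeffs)

For an even number of nodes and symmetric data, the odd Chebyshev coefficients vanish up to noise, consistent with the evenness of \(\tilde{P}_{n-1}f\) noted in step 3.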
To summarize, one performs amplitude estimation to acquire the time evolved expectation value at each Chebyshev node, then performs a polynomial interpolation of the data. The estimate is the value of the interpolant at \(s = 0\).

Given an even set of Chebyshev nodes \(\{s_1, \ldots, s_n\}\), and making use of Lemma 3.2.1, the interpolation error \(E_{n-1}\) assuming perfect data points is given by
\[
|E_{n-1}(0)| \le \frac{|\mathrm{Tr}\,\rho\, \partial_s^n O_s(t)|}{\|O\|\, n!} \prod_{i=1}^{n} |s_i| \le \max_{s\in[-a,a]} \frac{\|\partial_s^n O_s(t)\|}{\|O\|} \left(\frac{a}{2n}\right)^n. \tag{3.92}
\]
With a suitable bound on \(\partial_s^n O_s(t)\), we can provide an upper bound on the interpolation error at \(s = 0\). This bound is provided by the following lemma. In what follows, it will be helpful to define the parameter
\[
c := k (5/3)^k m \max_{l\in[1,m]} \|H_l\|\, t \tag{3.93}
\]
for ease of notation. This parameter is proportional to the Hamiltonian size and the "total Trotter time," meaning the sum of all the forward and backward time steps, in absolute value, for a \(2k\)-th order ST formula.

Lemma 3.5.1 (Extrapolation Error Bound for Time-Evolved Observables). Under the conditions of Lemma 3.4.1 (\(ca \le \pi/20\)), the following bounds hold on the Trotterized evolution \(O_s(t)\) with step parameter \(s \in (0, a]\):

1. For \(c > n\) we have that
\[
\frac{\|\partial_s^n O_s(t)\|}{\|O\|} < \left( c \sqrt{e^3\left(1 + \sqrt{8/\pi}\, e^2\right)} \right)^{2n}
\]
which gives an interpolation error
\[
|E_{n-1}(0)| < \left( \frac{129\, c^2 a}{n} \right)^n.
\]

2. For \(c \le n\), we have
\[
\frac{\|\partial_s^n O_s(t)\|}{\|O\|} \le \sqrt{\frac{2n}{\pi}}\; n!\; e^{4 c e^2 \sqrt{2/\pi}} \left( \frac{e^4 c}{2} \right)^n
\]
giving an interpolation error
\[
|E_{n-1}(0)| \le 2\sqrt{2}\, n\, (6ca)^n\, e^{24c}.
\]

The proof of this lemma is a tedious exercise in the combinatorics of large derivatives and repeated applications of the triangle inequality, and is left to the end of this chapter. Note that once the derivative bound holds, the interpolation error bound follows immediately from Lemma 3.2.1.

One motivation for these bounds is deriving asymptotic expressions for the algorithmic complexity. The following theorem gives an asymptotic query complexity for the number \(N_{\exp}\) of Trotter exponentials \(\exp(-iH_j\tau)\).

Theorem 3.5.2. Let \(O(t) = U^\dagger(t)\, O\, U(t)\) be a time-evolved observable under a Hamiltonian \(H = \sum_{l=1}^m H_l\) on \(n\) qubits, so that \(U(t) = e^{-iHt}\). Suppose there exists a \(\gamma \in \mathbb{R}^+\) such that \(O/\gamma\) can be block encoded via a unitary \(U_{\mathrm{enc}}\) by a state \(|G\rangle\) on a set of \(L\) auxiliary qubits. Let \(\rho\) be a quantum state on \(n\) qubits, and suppose \(\gamma/\|O\| \in O(1)\). Then, the number of exponentials \(N_{\exp}\) required to estimate \(\mathrm{Tr}(\rho O(t))/\|O\|\) to precision \(\epsilon\) with confidence \(1 - \delta\) using a \(2k\) order Suzuki-Trotter formula satisfies
\[
N_{\exp} \in \tilde{O}\left( c \max\{c, \log(1/\epsilon)\}\, \epsilon^{-1} \log(1/\delta) \right).
\]
Here, \(\tilde{O}\) is big-\(O\) with multiplicative terms suppressed which are logarithmically smaller in \(1/\epsilon\) and \(c\). Moreover, the number of auxiliary qubits needed is \(O(L)\).

We give a sketch of the proof. Given a choice of interval \([-a, a]\) and (even) number of interpolation points \(n\), we have from Lemma 3.3.1 that the number of exponentials needed to perform evolutions for all Chebyshev nodes goes as
\[
O\left( \frac{n \log n}{a} \right). \tag{3.94}
\]
However, this is not the total cost, since these circuits need to be repeated to perform the appropriate measurement protocols. Since \(O\) can be block encoded, the expectation value can be obtained via an amplitude estimation protocol.
By the well-conditioning result of Lemma 3.3.3, each data point needs to be within \(\epsilon\) of the exact Trotter value, up to a logarithmic factor in \(n\). This robustness is why our result maintains a \(\tilde{O}(\epsilon^{-1})\) scaling. In our proof, we assume the IQAE protocol is used, requiring only a single qubit of overhead. The fractional queries for the noninteger time step also require \(O(1)\) overhead, meaning the total overhead is \(O(L)\) due to the block encoding. To relate \(n\) and \(a\) to the required precision \(\epsilon\), simulation time \(t\), and Hamiltonian \(H\), Lemma 3.5.1 can be used. Thus, we can relate \(N_{\exp}\) to these basic parameters. We carry out the formal proof in Section 3.8.

As advertised, we see there is a "near-Heisenberg" scaling of \(1/\epsilon\), up to logarithmic factors. However, there is an unsavory quadratic scaling in the simulation time in cases without high accuracy demands. I believe this can be improved, because our approach forces us to have \(n\) scale linearly in \(T\), which seems overly pessimistic. Finally, our results suggest that the best performance comes from using low order formulas, since our bounds are strictly worse for increasing ST order \(k\).

3.6 Numerical Demonstration

To support the theoretical findings of this chapter, we numerically emulate the polynomial interpolation procedure on the 1D random Heisenberg model. I thank James Watson for his collaboration on these numerics. The Hamiltonian of interest is
\[
H = \sum_{i=0}^{n-1} \sigma_i \cdot \sigma_{i+1} + h_i Z_i \tag{3.95}
\]
where each \(h_i \in [-h, h]\) is sampled randomly and uniformly, with \(h > 0\) setting the "disorder strength." Moreover, \(\sigma_i \cdot \sigma_{i+1} = X_i X_{i+1} + Y_i Y_{i+1} + Z_i Z_{i+1}\) is the dot product of the vector of Paulis \(\sigma_i = (X_i, Y_i, Z_i)\). We take the chain to be a circle, so the last qubit is adjacent to the first (\(\sigma_n \equiv \sigma_0\)). We choose this system because it is simple, yet sufficiently interesting from the perspective of condensed matter physics, providing a model for the often-studied phenomena of many-body localization and closed-system thermalization [25].

Although the product formula simulations are meant to be done on a quantum computer, here we instead perform all computations classically. Specifically, we compute the product of matrix exponentials for each product formula. Although not a true quantum simulation, this still provides accurate information about the Trotter error mitigated by the polynomial interpolation procedure.

To proceed, we first need to decide on a decomposition of the Hamiltonian (3.95), i.e. a partition of the various terms such that each partition is easy to simulate. To this end, it is helpful to observe that this system can be represented as a circular graph of \(n\) nodes, with links representing the hopping interaction \(\sigma_i \cdot \sigma_{i+1}\). The interactions commute if the links don't meet at a vertex. Thus, we might seek to color the edges of our graph such that two edges of the same color don't meet. If \(n\) is even, only two colors are necessary for this, by coloring alternate edges. Taking \(n\) even for simplicity, we thus partition \(H\) as
\[
H = H_{\mathrm{even}} + H_{\mathrm{odd}} + H_{\mathrm{pot}} \tag{3.96}
\]
where
\[
H_{\mathrm{even}} = \sum_{i\ \mathrm{even}} \sigma_i \cdot \sigma_{i+1}, \qquad H_{\mathrm{odd}} = \sum_{i\ \mathrm{odd}} \sigma_i \cdot \sigma_{i+1}, \qquad H_{\mathrm{pot}} = \sum_i h_i Z_i. \tag{3.97}
\]
Within each partition, the terms commute, and therefore the exponential can be split without error. As a circuit, each term can be implemented in parallel. Exact methods for computing a two-qubit unitary can be used [129].
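For concreteness, a minimal NumPy sketch of this Hamiltonian and its three-part decomposition is given below. The helper names op_on and heisenberg_partition are ours, and only dense matrices are used, which suffices for the small \(n\) considered here.

import numpy as np

# Pauli matrices
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0])

def op_on(site_ops, n):
    """Kronecker product placing single-qubit operators at given sites of an n-qubit chain."""
    out = np.array([[1.0]])
    for q in range(n):
        out = np.kron(out, site_ops.get(q, I2))
    return out

def heisenberg_partition(n, h, rng=np.random.default_rng(0)):
    """Return (H_even, H_odd, H_pot) for the periodic random Heisenberg chain, eqs. (3.95)-(3.97)."""
    hopping = lambda i: sum(op_on({i: P, (i + 1) % n: P}, n) for P in (X, Y, Z))
    H_even = sum(hopping(i) for i in range(0, n, 2))
    H_odd = sum(hopping(i) for i in range(1, n, 2))
    fields = rng.uniform(-h, h, size=n)  # disorder strengths h_i
    H_pot = sum(fields[i] * op_on({i: Z}, n) for i in range(n))
    return H_even, H_odd, H_pot

Within each of the three returned pieces, all terms mutually commute, so each exponential factors exactly as described above.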
We use the decomposition (3.97) for all of the results in this Section. Multiple different observables and initial states could potentially be considered. Here we choose the following:
\[
O = Z_{n-1} Z_0 X_1, \qquad |\psi_0\rangle = |1\rangle_{n/2} \tag{3.98}
\]
where \(|1\rangle_{n/2}\) is the computational basis state on \(n\) qubits with a single 1 on qubit \(n/2\). Intuitively, we imagine the 1 representing an excitation, and across the circle is the observable of interest \(O\). The hopping terms will cause the 1 to "travel" to the other side, but the potential \(H_{\mathrm{pot}}\) will cause more nontrivial behavior. To be concrete, we take the following parameter values in all of the data presented below: \(T = 1\), \(h = 1\), \(n = 6\), and 2nd order Trotter.

We first wish to observe the error mitigation in action, as a proof of concept. To do so, we need an "exact" estimate of \(\langle O(t)\rangle\), meaning an estimate far more accurate than the simulation methods employed. We use direct matrix exponentiation with 20 digits of precision to accommodate this need, and all errors are measured with respect to this "exact" calculation. Next, we fix the interpolation interval \([-a, a]\), with \(a = (\|H\|T)^{-1}\) a reasonable choice for the largest step size \(s\). We then vary the number \(N\) of interpolation points, and for each \(N\) (from 2 to 14 in even increments) we perform the interpolation procedure and calculate the error. By comparing this error with the error of the "best" data point, i.e., the data point with the smallest Trotter step size, we can get a sense of how much the Trotter error is mitigated.

Figure 3.1 Additive error of the time-evolved expectation value \(\langle O(T)\rangle\) plotted against maximum Trotter depth. The seven blue points represent seven distinct numbers of Chebyshev nodes \(N\) ranging from 2 to 14 in even increments. All data points are calculated to high precision using matrix multiplication, and thus the dominant error is due to Trotter. The orange denotes the best single data point gathered for the interpolation, representing the best estimate without classical post-processing. We see that the polynomial procedure, as the theory suggests, provides an exponential reduction in Trotter error with each additional data point. Meanwhile, the orange line appears, as expected, to follow only a polynomial trend.

Figure 3.1 shows the results of this numerical experiment. Each blue point, going left to right, represents the Chebyshev interpolation estimate of \(\langle O(t)\rangle\) with an increasing number of interpolation points \(N\), while the corresponding orange point is the best single data point. We see that the interpolation post-processing dramatically improves the accuracy of the expectation value calculation. The downward trend continues until flatlining where numerical roundoff takes over as the dominant error.

One respect in which the results of Figure 3.1 are limited is that data points will not be computed to near perfection (or be limited by digital roundoff) even in the case of perfect time evolution. Expectation values must be estimated through a quantum measurement protocol which can achieve, at best, a Heisenberg limited scaling of \(O(1/\epsilon_{\mathrm{data}})\) for the number of operations needed to reach precision \(\epsilon_{\mathrm{data}}\). And while Chebyshev interpolation provides robustness to data errors, a numerical demonstration of this is desirable.
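A compact classical sketch of the noiseless experiment just described is given below, under a few conventions of our own: the second-order step is the symmetric splitting of (3.96)-(3.97), and we use SciPy's fractional matrix powers in place of the fractional queries a quantum implementation would require. None of the function names are part of the original procedure.

import numpy as np
from functools import reduce
from scipy.linalg import expm, fractional_matrix_power
from numpy.polynomial import chebyshev as C

def s2_step(Hs, tau):
    """One symmetric second-order Trotter step of duration tau for H = sum(Hs)."""
    left = [expm(-0.5j * H * tau) for H in Hs[:-1]]
    U = reduce(np.matmul, left, np.eye(Hs[0].shape[0], dtype=complex))
    U = U @ expm(-1j * Hs[-1] * tau)
    for V in reversed(left):
        U = U @ V
    return U

def f_of_s(Hs, psi0, O, T, s):
    """Normalized Trotterized expectation f(s) = <O_s(T)>/||O|| for step parameter s."""
    # 1/s steps of size s*T; fractional powers stand in for fractional queries
    U = fractional_matrix_power(s2_step(Hs, s * T), 1.0 / s)
    psi = U @ psi0
    return np.real(psi.conj() @ O @ psi) / np.linalg.norm(O, 2)

def mitigated_estimate(Hs, psi0, O, T, a, N):
    """Chebyshev-interpolated estimate of f(0) from N nodes on [-a, a]."""
    nodes = a * np.cos((2 * np.arange(1, N + 1) - 1) * np.pi / (2 * N))
    ys = [f_of_s(Hs, psi0, O, T, abs(s)) for s in nodes]  # f is even in s
    coeffs = C.chebfit(nodes / a, ys, deg=N - 1)
    return C.chebval(0.0, coeffs)

Comparing mitigated_estimate against the most accurate single node should reproduce the qualitative behavior of Figure 3.1.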
To work within our classical setup, we model data imperfection as Gaussian noise of fixed width on top of the numerically computed value at each \(s\). We change this noise parameter \(\epsilon_{\mathrm{data}}\) and observe the effect on the true error \(\epsilon\) in the final interpolation estimate. Figure 3.2 gives the results for various degrees \(N\) of Chebyshev interpolation. We plot in terms of inverse errors, so that moving up along either axis corresponds to increased precision in the data or the final estimate.

We observe, for each \(N\), two regimes: a linearly-sloped regime at low data precision and a plateau for sufficiently precise Trotter data. The first regime corresponds to the error in the final estimate being dominated by errors in the measurement itself, rather than the systematic Trotter error. In this regime, the Trotter error has been effectively mitigated, and only the data error remains. However, for each \(N\) there is a crossover point where the Trotter error can no longer be brought smaller than \(\epsilon_{\mathrm{data}}\). At this point, further increases in data precision no longer improve the final estimate, because the dominant source is fundamentally the Trotter error. Unsurprisingly, larger \(N\) increases the mitigation of Trotter error, delaying this crossover point and improving the final estimate.

Figure 3.2 Error in the expectation value plotted against data noise for several different degrees \(N - 1\) of Chebyshev interpolation. Going up along the y axis indicates improved performance. For small values of \(1/\epsilon_{\mathrm{data}}\) (large data errors), the final estimate error essentially tracks the data error, and the Trotter error is negligible. Once \(\epsilon_{\mathrm{data}}\) is brought under a certain threshold, the final error flatlines to some plateau, indicating that the final error is dominated by Trotter (interpolation) error. As \(N\) increases, the crossover point happens at smaller \(\epsilon\), and the final estimate is more accurate.

Beyond verifying our interpolation approach, this graph illustrates how, for measurement data gathered on Trotter simulations, there is little value in achieving measurement accuracy beyond the crossover point. Absent hardware error, Trotter mitigation appears to remove barriers to higher accuracies through more precise data acquisition.

We haven't said much about the effects of simulation time, and we conclude with a brief investigation into this point. Figure 3.3 shows the error in interpolation calculations across four orders of magnitude of the simulation time. Other fundamental simulation parameters are held fixed, but following our definitions above, our interpolation interval \([-a, a]\) will change and hence so will the simulation cost. For all values of \(T\) considered, we see an exponential decay in error leading, ultimately, to the plateau of machine precision. As \(T\) increases, the decay remains exponential but at a slower rate.

Figure 3.3 Chebyshev approximation error for the observable (3.98), plotted against Chebyshev degree for various total simulation times \(T\). Across several orders of magnitude for \(T\) we observe an exponential decrease in the algorithmic error with respect to the number of data points until floating-point precision is reached. However, the decay is slower for larger times.

3.7 Discussion

In this chapter, we considered the mitigation of Trotter error by use of a standard numerical technique: polynomial interpolation. This approach is inspired by multiproduct formulas, which systematically cancel errors due to nonzero Trotter step. Here, however, we are cancelling the errors "offline," i.e. following the measurements of dynamical observables. This offers accuracy improvements without enormous quantum overhead, which is especially important as near-term quantum hardware is noisy and limited. Classical resources, though perhaps less powerful, are relatively abundant and cheap.
It is interesting to consider to what extent classical resources can chip away at the advantages of post-Trotter methods over product formulas. The fact that time evolution is only a piece of a full simulation has allowed us to achieve an exponential reduction in the Trotter error for measuring dynamical observables. A natural follow-up question is whether classical techniques, coupled with first-order Trotter, can reduce the dependence on simulation time \(T\) to near-linear. An intuitive argument against this may be that such a scheme would require evolutions much shorter than \(T\), and extrapolating the behavior to later times would be infeasible. This question is likely intimately related to the \(O(T^2)\) dependence we've derived for our method. Numerical studies may ultimately shed light on whether the \(T^2\) scaling is "real" and whether it can be improved.

Polynomial interpolation is by no means the only approach to understanding the functional relation \(f(s)\) between the Trotter step \(s\) and the quantity of interest. We could also apply Richardson extrapolation to \(f\) to estimate \(f(0)\) given nearby points. Alternatively, rational approximations or machine learning techniques could be used to uniformly approximate \(f(s)\). Extending our tests with the randomized Heisenberg model to include, say, Richardson extrapolation would be beneficial to facilitate this comparison.

3.8 Proofs

We now provide proofs for Lemma 3.5.1 and Theorem 3.5.2, whose statements were given in Section 3.5.

Proof of Lemma 3.5.1. For scalar functions \(f(s)\), derivatives of \(\exp(f(s))\) can be expressed through the complete Bell polynomials via FaΓ  di Bruno's formula.
\[
\partial_s^n e^{f(s)} = Y_n\left( f'(s), f''(s), \ldots, f^{(n)}(s) \right) e^{f(s)} \tag{3.99}
\]
For operator exponentials such as \(\exp(-i\tilde{H}_s t)\), derivatives can be expressed via repeated application of Duhamel's formula. Yet these expressions are always upper bounded by the commuting (scalar) case [133], so that
\[
\|\partial_s^n e^{-i\tilde{H}_s t}\| \le Y_n\left( t\|\partial_s \tilde{H}_s\|,\, t\|\partial_s^2 \tilde{H}_s\|,\, \ldots,\, t\|\partial_s^n \tilde{H}_s\| \right). \tag{3.100}
\]
Note that the exponential disappeared in the bound since it has norm one. Applying Lemma 3.4.1 and invoking the fact that \(Y_n\) is monotonic in each argument, this is upper bounded by
\[
Y_n\left( \left( 2 j^j (e^2 c)^{j+1} \right)_{j=1}^{n} \right). \tag{3.101}
\]
An explicit formula for this is given by
\[
Y_n\left( \left( 2 j^j (e^2 c)^{j+1} \right)_{j=1}^{n} \right) = \sum_D \frac{n!}{d_1! \cdots d_n!} \prod_{j=1}^{n} \left( \frac{2 j^j (e^2 c)^{j+1}}{j!} \right)^{d_j} \tag{3.102}
\]
where \(D\) is a sum over all indices \((d_j)_{j=1}^n\) such that \(d_j \ge 0\) and
\[
\sum_{j=1}^{n} d_j\, j = n. \tag{3.103}
\]
Using a Stirling-type bound
\[
\frac{1}{j!} \le \frac{1}{\sqrt{2\pi}} \left( \frac{e}{j} \right)^j \tag{3.104}
\]
allows us to write
\[
Y_n\left( \left( 2 j^j (e^2 c)^{j+1} \right)_{j=1}^{n} \right) \le \sum_D \frac{n!}{d_1!\cdots d_n!} \prod_{j=1}^{n} \left( \sqrt{\frac{2}{\pi}}\, e^j (e^2 c)^{j+1} \right)^{d_j} = (e^3 c)^n \sum_D \frac{n!}{d_1!\cdots d_n!} \prod_{j=1}^{n} \left( \sqrt{\frac{2}{\pi}}\, e^2 c \right)^{d_j} = (e^3 c)^n\, Y_n\left( \sqrt{2/\pi}\, e^2 c, \ldots, \sqrt{2/\pi}\, e^2 c \right) = (e^3 c)^n\, B_n\left( \sqrt{2/\pi}\, e^2 c \right). \tag{3.105}
\]
In the second equality we brought out \(n\) factors of \(e^3 c\) using the condition (3.103) on the indices \(D\), and we identified \(Y_n\) evaluated at the same value in every argument as the single-variable Bell (or Touchard) polynomial \(B_n\). We can bound the size of \(B_n(\sqrt{2/\pi}\, e^2 c)\) [5] by
\[
B_n\left( \sqrt{2/\pi}\, e^2 c \right) \le \left( \frac{n}{\log\left( 1 + \sqrt{\pi/2}\; n/(e^2 c) \right)} \right)^n \tag{3.106}
\]
for all \(n > 0\), with \(n = 0\) defined by the limit (which is 1). With this,
\[
\|\partial_s^n e^{-i\tilde{H}_s t}\| \le \left( \frac{e^3 c\, n}{\log\left( 1 + \sqrt{\pi/2}\; n/(e^2 c) \right)} \right)^n \le \left( \frac{e^3 c\, n}{2} \right)^n \left( 1 + \sqrt{\frac{8}{\pi}}\, \frac{e^2 c}{n} \right)^n \tag{3.107}
\]
where we've used the bound \(1/\log(1+x) \le 1/2 + 1/x\). Again, this inequality is valid for \(n = 0\) via the limit, which is always one.

With this bound on the ST formula derivatives, we now turn to bounding \(\partial_s^n O_s(t)\). Applying the binomial theorem and triangle inequality to (3.5),
\[
\frac{\|\partial_s^n O_s(t)\|}{\|O\|} \le \sum_{p=0}^{n} \binom{n}{p} \|\partial_s^p e^{it\tilde{H}_s}\|\, \|\partial_s^{n-p} e^{-it\tilde{H}_s}\| \le \left( \frac{e^3 c}{2} \right)^n \sum_{p=0}^{n} \binom{n}{p} p^p (n-p)^{n-p} \left( 1 + \sqrt{\frac{8}{\pi}} \frac{e^2 c}{p} \right)^p \left( 1 + \sqrt{\frac{8}{\pi}} \frac{e^2 c}{n-p} \right)^{n-p}. \tag{3.108}
\]
At this point, it will be fruitful to consider two regimes. Recall that \(c\) encodes information about the simulation time.

In the case where \(c > n\), we have
\[
\frac{\|\partial_s^n O_s\|}{\|O\|} \le \left( \frac{e^3 c}{2} \right)^n \sum_{p=0}^{n} \binom{n}{p} \left( c + \sqrt{8/\pi}\, e^2 c \right)^p \left( c + \sqrt{8/\pi}\, e^2 c \right)^{n-p} = \left( \frac{e^3 c}{2} \right)^n c^n \left( 1 + \sqrt{8/\pi}\, e^2 \right)^n \sum_{p=0}^{n} \binom{n}{p} = \left( c \sqrt{e^3\left( 1 + \sqrt{8/\pi}\, e^2 \right)} \right)^{2n}. \tag{3.109}
\]
This implies a relative error in the polynomial fit bounded by
\[
|E_{n-1}(0)| < \left( \frac{129\, c^2 a}{n} \right)^n. \tag{3.110}
\]
In the case where \(c \le n\), the approximation
\[
\left( 1 + \sqrt{\frac{8}{\pi}}\, \frac{e^2 c}{p} \right)^p < e^{c e^2 \sqrt{8/\pi}} \tag{3.111}
\]
holds and is not so crude. Applying this to (3.108),
\[
\frac{\|\partial_s^n O_s\|}{\|O\|} \le \left( \frac{e^3 c}{2} \right)^n n!\; e^{4 c e^2 \sqrt{2/\pi}} \sum_{p=0}^{n} \frac{p^p}{p!} \frac{(n-p)^{n-p}}{(n-p)!}. \tag{3.112}
\]
Regrouping and employing a Stirling bound where appropriate (the boundary terms \(p = 0, n\) contribute at most \(2e^n/\sqrt{2\pi n}\), while the interior terms contribute at most \((e^n/2\pi) \sum_{p=1}^{n-1} 1/\sqrt{p(n-p)}\)), we arrive at
\[
\frac{\|\partial_s^n O_s\|}{\|O\|} \le \sqrt{\frac{2n}{\pi}}\; n!\; e^{4 c e^2 \sqrt{2/\pi}} \left( \frac{e^4 c}{2} \right)^n. \tag{3.113}
\]
After another Stirling bound, this gives a corresponding interpolation error of
\[
E_{n-1}(0) < 2\sqrt{2}\, n\, (6ca)^n\, e^{24c}. \tag{3.114}
\]
β–‘

Proof of Theorem 3.5.2. Let \(f(s) = \langle O_s(t)\rangle/\|O\|\) be the normalized expectation value under Trotter evolution. Our interpolation algorithm produces an estimate \(\bar{f}\) of \(f(0)\) which we require to be accurate within \(\epsilon\).
\[
|f(0) - \bar{f}| \le \epsilon \tag{3.115}
\]
There is the interpolation error from the polynomial \(P_{n-1}f\) fitting \(f\), assuming perfect interpolation points \((s_i, f(s_i))\). But \(f(s_i)\) can only be estimated; let's call \(\tilde{y}_i\) this estimate.
The error in Λœπ‘¦π‘– in our analysis comes from the statistical error inherent in the estimation protocol as well as the error in the fractional query procedure for a 1/𝑠 evolution. We can independently consider the interpolation error and the data error via the triangle inequality. | 𝑓 (0) βˆ’ ¯𝑓 | ≀ | 𝑓 (0) βˆ’ π‘ƒπ‘›βˆ’1 𝑓 (0)| + |π‘ƒπ‘›βˆ’1 𝑓 (0) βˆ’ Λœπ‘ƒπ‘›βˆ’1 𝑓 (0)| (3.116) ≀ πœ–int + πΏπ‘›πœ–data Here 𝐿𝑛 is the Lebesgue constant of the interpolation, essentially a condition number, and πœ–data is an upper bound on the error in the data. Λœπ‘ƒπ‘›βˆ’1 𝑓 is the fit to the imperfect data and π‘ƒπ‘›βˆ’1 𝑓 the fit to the perfect data (𝑠𝑖, 𝑓 (𝑠𝑖)). For generic interpolation nodes, 𝐿𝑛 can grow rapidly; however, for the set of Chebyshev nodes we obtain a near-optimal value [114]. 𝐿𝑛 ≀ 2 πœ‹ log(𝑛 + 1) + 1 Since we want the total error to be within a threshold πœ–, we can require πœ–data ≀ πœ– 2𝐿𝑛 , πœ–int ≀ . πœ– 2 (3.117) (3.118) Given these error bounds, we can now turn to the cost of acquiring the data points. Because 𝑂/𝛾 can be block encoded, the expectation value calculation can be encoded as an amplitude estimation problem. Specifically, a Hadamard test circuit gives the amplitude 1 + βŸ¨π‘‚ 𝑠𝑖 (𝑑)⟩/𝛾 2 . (3.119) If we estimate this amplitude to within accuracy πœ–dataβˆ₯𝑂 βˆ₯/2𝛾, we can estimate 𝑓 (𝑠𝑖) within πœ–data. Using Iterative Quantum Amplitude Estimation [56], we can obtain this estimate using a Grover 68 iterate 𝐺 constructed from two Hadamard test oracles. The number of Grover oracles 𝑁𝐺 required is given by 𝑁𝐺 ≀ 200𝛾𝐿𝑛 βˆ₯𝑂 βˆ₯πœ–data log (cid:18) 2𝑛 𝛿 log2 (cid:19)(cid:19) (cid:18) π›ΎπΏπ‘›πœ‹ βˆ₯𝑂 βˆ₯πœ–data (3.120) to ensure probability 1 βˆ’ 𝛿 of all data being within πœ–data of the true value. Each 𝐺 requires two Hadamard tests, and each Hadamard oracle calls a (controlled) ST evolution once. The number of controlled exponentials needed for a single data point at value 𝑠𝑖 is in 𝑂 (cid:18) π‘π‘˜ |𝑠𝑖 | (cid:19) log 1/πœ–data (3.121) where π‘π‘˜ = 2π‘š5π‘˜βˆ’1, and where the logarithm comes from the need for fractional queries with QSVT. There is also a 𝑂 (1) overhead associated with the fractional queries and IQAE. Altogether, the number of exponentials for a single data point is in (cid:18) 𝑂 𝑁𝐺 Γ— 2 Γ— log 1/πœ–data (cid:19) . π‘π‘˜ |𝑠𝑖 | (3.122) Therefore, the total number 𝑁exp of generating all 𝑛/2 data points (we only need half due to symmetry) is in (cid:32) 𝑁exp ∈ 𝑂 𝑁𝐺 π‘π‘˜ (cid:33) log(1/πœ–data) . 𝑛/2 βˆ‘οΈ 𝑖=1 1 𝑠𝑖 Plugging in (3.120) for 𝑁𝐺 above and summing over 1/𝑠𝑖 using Lemma 3.3.1, 𝑁exp ∈ 𝑂 (cid:18) π›Ύπ‘π‘˜ 𝐿𝑛𝑛 βˆ₯𝑂 βˆ₯πœ– π‘Ž (log 𝑛) log (cid:18) 2𝑛 𝛿 log2 (cid:19)(cid:19) (cid:18) π›ΎπΏπ‘›πœ‹ βˆ₯𝑂 βˆ₯πœ– (cid:19) log(1/πœ–data) βŠ‚ Λœπ‘‚ (cid:16) 𝑛 π‘Žπœ– log 1/𝛿 (cid:17) (3.123) (3.124) where Λœπ‘‚ suppresses factors logarithmic in 𝑛 and πœ–. We also employed our assumption that 𝛾/βˆ₯𝑂 βˆ₯ ∈ 𝑂 (1). The number of nodes 𝑛 and the interpolation interval [βˆ’π‘Ž, π‘Ž] will be determined by πœ–int, the interpolation error assuming perfect data. To apply our error bounds from the previous subsection, choose π‘Ž to satisfy Lemma 3.4.1, i.e. π‘π‘Ž < πœ‹/20, while also taking 1/π‘Ž ∈ 𝑂 (𝑐). Choose 𝑛 β‰₯ βŒˆπ‘βŒ‰. Then the second bound of Lemma 3.5.1 holds. From the interpolation error, we must satisfy √ 2 2𝑛(6π‘π‘Ž)𝑛𝑒24𝑐 < πœ–/2 (3.125) 69 which in turn can be satisfied provided that √ 2𝑛 4 (cid:16) 6𝑒24π‘π‘Ž (cid:17) 𝑛 < πœ– (3.126) since 𝑛 β‰₯ 𝑐. 
Choose π‘Ž such that 6𝑒24π‘π‘Ž = 1/2, which is consistent with our previous conditions on π‘Ž. Then, to satisfy the error bound, 𝑛 can satisfy √ 2𝑛2βˆ’π‘› < πœ– . 4 This can be solved using the βˆ’1 branch of the LambertW function. LambertWβˆ’1 (cid:16) βˆ’πœ– log 2/(4 √ 2)) (cid:17) log 2 𝑛 > βˆ’ The appropriate asymptotics is 𝑛 ∈ 𝑂 (log(1/πœ–). By taking 𝑛 = π‘›βˆ— where (cid:40) (cid:38) π‘›βˆ— = max βŒˆπ‘βŒ‰, βˆ’ LambertWβˆ’1(βˆ’πœ– log 2/(4 √ 2) (cid:39)(cid:41) log 2 ∈ 𝑂 (max{𝑐, log(1/πœ–)}) we satisfy all required constraints and arrive at our final asymptotic scaling. 𝑁exp ∈ Λœπ‘‚ (cid:16) max{𝑐, log(1/πœ–)}π‘πœ– βˆ’1 log(1/𝛿) (cid:17) (3.127) (3.128) (3.129) (3.130) (3.131) β–‘ In the proof above, we set 𝑛 > 𝑐 from the beginning, in order to use the second of the two bounds from Lemma 3.5.1. Together with 1/π‘Ž ∈ 𝑂 (𝑐) this condemns us to a suboptimal 𝑐2 scaling in the large 𝑐 limit. However, using the first bound instead of the second would not help us, since the 𝑛𝑐2/π‘Ž term in that bound must be order one. 70 CHAPTER 4 TIME DEPENDENT HAMILTONIAN SIMULATION THROUGH DISCRETE CLOCK CONSTRUCTIONS The previous chapter was entirely concerned with time independent Hamiltonians. However, this is only a special (though important) case of general time dependent Hamiltonians that can occur in systems of interest. Interestingly, a reduction from time dependence to time independence is always possible, though this has not been properly utilized in the Hamiltonian simulation community. This chapter concerns itself with a discretized version of the reduction, known as the (𝑑, 𝑑′) method, that is finite dimensional and therefore amenable to computation. We will consider, as an application of our formalism, a simulation by qubitization of the encoded time dependent Hamiltonian. After a brief introduction and motivation for time dependent simulations, we review the standard (𝑑, 𝑑′) formalism. Then we discretize the clock variable suitably and prove asymptotic error bounds on the accuracy compared to the time ordered operation. Finally, we describe a possible simulation by qubitization of the full clock system, before ending with some discussion. The subsequent Chapter 5 makes use of the clock space construction to argue in favor of the conjecture that certain multiproduct formulas serve as good approximants to π‘ˆ. However, in that chapter, the clock space is only used for a theoretical proof, and plays no role in the simulation. Here, we take the clock space quite seriously and consider what it would take to fully implement it using the powerful method of qubitization. Both chapter and the next are subjects of ongoing research. An early preprint has been posted [133] which will be updated as these projects reach a close. 4.1 Introduction and Motivation When a fundamental physical law is expressed as a Hamiltonian, it should generally be expected to be time independent. This conveys the expected constancy of the theory under consideration. Such expectations are validated, for example, by experiments which seek to measure time variations in fundamental physical constants, such as ℏ [127]. To date, no such variations have been measured, though we must remember there is no theoretical reason, beyond elegance, to disallow them. 71 In any case, there are many practical situations in which a quantum system is naturally modeled as closed (hence unitary), while still exhibiting time-varying laws. 
For example, we may imagine impinging a molecule with electromagnetic pulses in a laboratory setting. These pulses may be considered large enough to be unaffected by the state of the molecule, yet the molecule is certainly influenced by the pulse train. Since the amplitude of this pulse may vary with time, so will the Hamiltonian describing the system. If the state of the quantum system did have an effect on the incoming electrical pulse, a distinct formalism of open system dynamics should be invoked.

Even if the dynamics aren't inherently time-varying, it is often useful in the mathematics to shift to an interaction picture. When a Hamiltonian \(H = H_0 + H_1\) has two pieces as shown, where \(H_0\) is some baseline, "trivial" Hamiltonian that we know how to solve, moving into a frame of reference "rotating" with respect to \(H_0\) generates a new time dependent \(H_I(t)\) that encapsulates the nontrivial part. Such a splitting of \(H\) is common in perturbation theory, where we imagine \(H_1\) small yet important. Thus, the study of time independent \(H\) can benefit from understanding how to simulate time dependent Hamiltonians.

As often occurs in physics, symmetries (in this case related to time translation invariance) lead to useful simplifications. We have seen in Section 2.5 how for time independent \(H\), the time evolution operator \(U(t) = e^{-iHt}\) is a simple matrix exponential, rather than the more nuanced time-ordered exponential of equation (2.18). Moreover, the existence of stationary states in the time independent setting provides a useful characterization of the "allowed energies" of a system, and a set of states whose dynamics are trivial. Despite these simplifications, the study of quantum systems with time independent Hamiltonians is sufficiently rich as to warrant its own focused study.

In Chapter 2, we provided examples of time independent Hamiltonian simulation being used in generic quantum algorithms applications, particularly linear systems solvers. There are also applications for time dependent Hamiltonian simulations, such as adiabatic evolution. Generally speaking, a process is 'adiabatic' if it involves the slow transformation of parameters from one value to another. Here the meaning of "slow" depends on the nature of the problem at hand. Specializing to Hamiltonians, an adiabatic evolution is a dynamical evolution under a Hamiltonian \(H(t)\) that changes slowly with time. A standard example is the linear adiabatic evolution
\[
H(t) = \left( 1 - \frac{t}{T} \right) H_0 + \frac{t}{T} H_1 \tag{4.1}
\]
which, from \(t = 0\) to \(t = T\), changes the Hamiltonian from \(H_0\) to \(H_1\). By "slow" we mean \(T\) is much larger than \(\max_t 1/\delta E(t)\), the inverse of the smallest gap \(\delta E(t)\) between two eigenenergies which overlap the state of interest. Interestingly, such adiabatic evolutions approximately preserve eigenstates, in the sense that, starting with an initial state that is an eigenstate \(|E_0\rangle\) of \(H_0\), the final state evolved under \(H(t)\) will be approximately \(|E_1\rangle\), an eigenstate of \(H_1\). This is especially useful when trying to prepare ground states of a nontrivial Hamiltonian \(H_1\) given a simpler one \(H_0\).

While the adiabatic approach is valuable for physical applications [128, 131], it can also enable solutions to optimization problems when we think of the ground state energy as minimizing, or "optimizing," a function. One can imagine the ground state of \(H_1\) exhibiting information related to an optimization problem, unrelated to a physics context.
For example, take Quadratic Unconstrained Binary Optimization (QUBO), which seeks a binary vector \(x \in \{-1, 1\}^n\) that minimizes
\[
\sum_{ij} x_i Q_{ij} x_j \tag{4.2}
\]
for some real symmetric matrix \(Q\). The solution can be encoded as a computational basis state \(|x\rangle\) that is the ground state of the Hamiltonian
\[
H_Q = \sum_{ij} Z_i Q_{ij} Z_j. \tag{4.3}
\]
(A brute-force sketch of this encoding appears shortly below.) While this highlights the utility of ground state preparation for general optimization, recent literature points to other methods, such as Quantum Imaginary Time Evolution, as preferred for these sorts of tasks as opposed to adiabatic preparation [6, 142]. Nevertheless, it highlights how time dependent simulations can manifest in broader algorithmic settings.

Because time independent Hamiltonians are special cases of time dependent ones, any algorithm for general \(H(t)\), and any analysis thereof, immediately specializes to an algorithm and analysis for the time independent case. While true, in practice our grasp of time independent simulation algorithms outstrips our knowledge of the time dependent case. For example, qubitization is only viable for simulating time independent Hamiltonians, as its basis of operation is a polynomial approximation to \(f(\lambda) = e^{-i\lambda T}\) on the interval \([0, T]\) [88]. One could proceed by approximating \(U(T, 0)\) via evaluating the expression (2.18) for fixed, large \(n\). However, the resulting algorithm is, in general, quite inferior to its time independent version, since the errors from the truncation wash out any gains in precision. As of writing, there is no time dependent simulation algorithm which saturates known lower bounds.

Product formulas can be used in a similar manner, applying the formula to each term in the truncation. This is perhaps the simplest approach to time dependent simulation on a quantum computer. Because rapidly fluctuating Hamiltonians will require finer time meshes, the final cost will depend on the smoothness of the Hamiltonian and the size of derivatives within the mesh points [137]. More general time dependencies can be handled, without smoothness requirements, by a certain generalization of the Trotter scheme in which the time integrals (without time ordering) are retained [105]. The dropping of the smoothness requirement demonstrates that a much broader class of Hamiltonians is feasibly simulatable. Within their arguments, the authors use a generalization of product formulas in which the time ordered exponential is split, but the integrals are retained [70]. In this chapter, we will consider an entirely distinct generalization of the product formula in the time dependent setting. In fact, we will show that it relates to a standard product formula in an augmented clock space.

4.2 The Clock Space

Mathematically, and more broadly than the Hamiltonian setting, the distinction between time dependent and time independent systems can be cast as a distinction between autonomous and nonautonomous dynamical systems. Dynamical systems are differential equations in a single evolution parameter \(t\), which can always be cast as a first-order initial value problem
\[
\dot{x} = f(x, t), \qquad x(0) = x_0, \tag{4.4}
\]
possibly by standard reduction-of-order techniques. Here \(x \in \mathbb{R}^n\) consists of \(n\) evolution parameters which implicitly depend on time \(t\). Reasonable smoothness conditions on \(f\) may be imposed. The autonomous case corresponds to \(f\) being independent of \(t\). Such equations are valuable because they admit a geometric description in terms of phase space and flows.
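Before continuing, here is the minimal sketch of the QUBO encoding (4.2)-(4.3) promised above. Because \(H_Q\) is diagonal in the computational basis, a brute-force scan over spin configurations suffices to illustrate the correspondence; the function name is ours, and the exhaustive search is exactly the exponential classical cost one hopes to beat.

import numpy as np
from itertools import product

def qubo_ground_state(Q):
    """Brute-force the ground state of H_Q = sum_ij Z_i Q_ij Z_j, eq. (4.3).

    Since H_Q is diagonal in the computational basis, its ground state is the
    basis state |x> whose spin vector x in {-1,+1}^n minimizes (4.2).
    """
    n = Q.shape[0]
    best_x, best_E = None, np.inf
    for bits in product([1, -1], repeat=n):  # Z|0> = +|0>, Z|1> = -|1>
        x = np.array(bits)
        E = x @ Q @ x                        # the QUBO cost (4.2)
        if E < best_E:
            best_x, best_E = x, E
    return best_x, best_E

The adiabatic approach described above replaces this scan by a slow interpolation (4.1) from an easily prepared ground state of \(H_0\) to that of \(H_Q\).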
Classically, time independent Hamiltonian dynamics are a special case of autonomous systems, and the relevant initial value problem is given by Hamilton's equations.
\[
(\dot{q}, \dot{p}) = \left( \frac{\partial H}{\partial p}, -\frac{\partial H}{\partial q} \right), \qquad (q(0), p(0)) = (q_0, p_0) \tag{4.5}
\]
This system is autonomous when \(H\) is independent of the evolution parameter \(t\).

It has long been recognized that a simple transformation allows for the reduction of nonautonomous systems to autonomous ones [60]. The trick is to promote \(t\) to a coordinate, thereby making \(f(x, t)\) satisfy the requirement of only depending on coordinates. Letting \(s\) take the place of the evolution parameter (time), we still want \(t\) and \(s\) to be essentially the same. This is supplied by the simple equation
\[
\dot{t} = \frac{dt}{ds} = 1, \qquad t(0) = 0. \tag{4.6}
\]
With this, we have the following autonomous system
\[
(\dot{x}, \dot{t}) = (f(x, t), 1), \qquad (x(0), t(0)) = (x_0, 0) \tag{4.7}
\]
whose solution encodes the solution to the original (4.4). While it seems that not much has been gained, this framework finds much application in the consideration of periodically driven systems. In this case, a cylindrical phase space, with \(t\) representing the angle, allows for an interesting geometric understanding of the dynamics.

When a classical Hamiltonian \(H(t)\) is time dependent (nonautonomous), can we perform a similar trick to reduce to the time independent (autonomous) case? Promoting \(t\) to a coordinate, we must formally introduce a conjugate momentum \(-E\), choosing notation suggestive of the time-energy correspondence. We call the full Hamiltonian \(K(q, p; t, E)\), to distinguish it from the original Hamiltonian \(H(q, p, t)\), and we wish to determine if a suitable \(K\) exists. As before, we want \(dt/ds = 1\), which because of Hamilton's equations implies
\[
\frac{\partial K}{\partial(-E)} = 1. \tag{4.8}
\]
We conclude that \(K = -E + F(q, p, t)\) for some \(F\). But we need \(K\) to reproduce the same dynamics for \(q, p\) as \(H\). This implies \(F = H\). Interestingly, the final Hamilton equation
\[
\frac{dE}{ds} = \frac{\partial H}{\partial t} \tag{4.9}
\]
corresponds with the expected equation for energy change, up to a sign. This explains our choice of a minus sign in the conjugate momentum \(-E\).

In summary, we have a prescription for converting nonautonomous Hamiltonians into autonomous ones depending only on coordinates. In hindsight, the way this is accomplished is rather silly. The coordinate \(t\) has dynamics completely independent of the values of any other coordinate or momentum, including \(E\), and gets pulled in a straight line at a constant velocity. Time marches forward. As \(t\) changes, so does \(H\), which affects all of the other coordinates in the desired way. This is much like a reel of movie tape moving through the machine at a constant rate to change what's on the screen, mimicking the forward flow of time.

What about quantum Hamiltonians? We could try to quantize the above, imposing the usual commutation relation
\[
[\hat{t}, -\hat{E}] = iI. \tag{4.10}
\]
Absent periodicity, a natural choice of Hilbert space for \(t\) is \(L^2([0, T])\): square integrable functions on the interval of simulation. We then get a representation of \(t\) and \(E\) as multiplication by \(t\) and \(t\)-derivatives, respectively.
\[
(\hat{t}\psi)(t) = t\psi(t), \qquad (\hat{E}\psi)(t) = i\partial_t \psi \tag{4.11}
\]
The augmented "clock" Hamiltonian, so named because of the time coordinate \(t\), has the form
\[
H_c = H - i\partial_t \tag{4.12}
\]
which looks eerily similar to a rearranged SchrΓΆdinger operator.
Indeed, if πœ“sol(𝑑, π‘ž) is a solution to the SchrΓΆdinger equation encoded on the full "clock" Hilbert space, then we see that the state πœ“π›Ό (𝑑, π‘ž) = π‘’βˆ’π‘–π›Όπ‘‘πœ“sol, 𝛼 ∈ R (4.13) 76 is formally an eigenstate of 𝐻𝑐 with eigenvalue 𝛼. As a caution, this state is not properly normalized, nor is it normalizable unless the simulation interval [0, 𝑇] is finite. The above manipulations are purely formal, helpful only for calculational or interpretational purposes. From a physical point of view, 𝐻𝑐 is not bounded from below, allowing for potentially infinite energy extraction if such as system existed and could be coupled with. In particular, this system transitions to lower energies as wavepackets travel faster and faster to the left of the clock space, a seemingly senseless possibility. The quantum mechanical version of this trick, sometimes called the (𝑑, 𝑑′)-formalism because of the two distinct "times", finds use in periodically-driven quantum systems [21]. But our purposes, the elimination of explicit dependence on the evolution parameter is most exciting, because it implies the time evolution operator requires no time-ordering, while still encoding the full time dynamics [103]. Thus, while time independent 𝐻 is a special case of time dependence, time dependent simulation 𝐻 reduces, formally, to time independent simulation on an augmented space. To understand the nature of the encoding better, we have to be somewhat more careful. Let’s provide a more concrete description of the situation. The full Hilbert space is given by H = H𝑠 βŠ— H𝑐 (4.14) where H𝑐 (cid:27) 𝐿2(M), and M is the (connected) one-dimensional smooth manifold representing 𝑑. We have considered M (cid:27) [0, 𝑇] here, but we might also consider a circle (periodic dynamics) or the real line, where translations are a bit more natural. On H𝑐, 𝐸 acts as a generator of translations, but is an unbounded operator. Nevertheless, the exponentials of 𝐸 above are well defined through the spectral theorem and functional calculus for unbounded operators [61]. States πœ“ ∈ H can then be expressed as certain functions on M whose value πœ“(𝑑) is a state on H𝑠. The inner product on H is the natural one. βŸ¨πœ™|πœ“βŸ© := ∫ M βŸ¨πœ™(𝑑)|πœ“(𝑑)βŸ©π‘  𝑑𝑑 (4.15) Here, ⟨·|Β·βŸ©π‘  denotes the inner product on H𝑠. From now on, we use 𝜏 to denote the evolution parameter in order to avoid confusion with the clock coordinate 𝑑. 77 The interval 𝐼 over which the dynamics of the quantum system take place needs to be embedded within M. If 𝐼 is not exactly M, then the definition a time dependent observable 𝐴(𝑑) will need to be extended to cover the entire clock space. Once done, 𝐴(𝑑) is promoted to an (time independent) observable on H , denoted A, by acting on πœ“ ∈ H in a manner corresponding with the original space. (Aπœ“) (𝑑) := 𝐴(𝑑)πœ“(𝑑) (4.16) All such operators A are seen to be local in H𝑐, since they act with simple multiplication. Having laid the above groundwork, we can return to the question of dynamics. Let H be the promoted Hamiltonian operator as discussed in the previous paragraph. Let U(𝜏) be the unitary operator given by U(𝜏) = π‘’π‘–πΈπœπ‘’βˆ’π‘–(Hβˆ’πΈ)𝜏. One can verify that U solves the following SchrΓΆdinger equation. Here, π‘–πœ•πœU(𝜏) = H(𝜏)U(𝜏) U(0) = 𝐼 H(𝜏) ≑ π‘’π‘–πΈπœHπ‘’βˆ’π‘–πΈπœ (4.17) (4.18) (4.19) is a 𝜏-dependent Hamiltonian corresponding to simple, uniform translation along the clock space. 
For any state \(\Psi_0 \in \mathcal{H}\), the function
\[
\Psi(\tau) := \mathcal{U}(\tau)\Psi_0 \tag{4.20}
\]
solves the SchrΓΆdinger equation generated by \(\mathsf{H}(\tau)\), but more importantly, it encodes solutions to the dynamics under \(H(t)\). Indeed, for any \(t \in \mathcal{M}\), we have a state \(\psi(\tau, t) \in \mathcal{H}_s\) defined by
\[
\psi(\tau, t) := [\Psi(\tau)](t) = [\mathcal{U}(\tau)\Psi_0](t) \tag{4.21}
\]
which solves the SchrΓΆdinger equation of interest.
\[
i\partial_\tau \psi(\tau, t) = i\partial_\tau [\mathcal{U}(\tau)\Psi_0](t) = [\mathsf{H}(\tau)\mathcal{U}(\tau)\Psi_0](t) = H(\tau + t)\psi(\tau, t) \tag{4.22}
\]
The interpretation is that each \(t\) constitutes an initial time for performing the simulation, so we have a family of solutions parametrized by \(t\) with initial state \(\psi(0, t)\). The evolution parameter \(\tau\) acts, as expected, as the total time elapsed in the simulation. Finally, we can obtain a collection of induced time-evolution operators \(U(t' + \Delta t, t')\) on \(\mathcal{H}_s\), one for each \(t' \in \mathcal{M}\). Such an operator acts on states \(\psi_0 \in \mathcal{H}_s\) as follows
\[
U(t + \tau, t)\psi_0 = [\mathcal{U}(\tau)\Psi_0](t) \tag{4.23}
\]
where \(\Psi_0 \in \mathcal{H}\) is any state for which \(\Psi_0(t) = \psi_0\). This operator is unitary and solves the SchrΓΆdinger equation in the usual sense. Therefore it is exactly equivalent to the time-ordered exponential of expression (2.17).

We've now shown that the clock space encodes a collection of solutions to the dynamics under \(H(\tau)\), one for each initial time \(t\), with initial state \(\Psi_0(t) \in \mathcal{H}_s\). However, one might only care about one solution, say at \(t = 0\), and the ability to extract that solution from the encoding. Suppose the desired initial state is \(|\psi_0\rangle \in \mathcal{H}_s\), and our initial time is \(t_0 \in \mathcal{M}\). The idea is to prepare an initial product state \(\Psi_0 = \psi_0 \otimes \phi_0 \in \mathcal{H}\), where \(\phi_0 \in \mathcal{H}_c\) has overlap \(1 - \delta\) with an \(\epsilon\)-neighborhood of \(t_0\). After performing the evolution under \(\mathcal{U}\) for the desired length \(\tau\), perform a measurement of \(t\). The probability of measuring \(t\) within \(\epsilon\) of \(t_0 + \tau\) is \(1 - \delta\). Moreover, provided that the variation of \(H(t)\) is small under variations of \(t\) by \(\pm\epsilon\), any value within the \(\epsilon\)-neighborhood will suffice. Of course, any real measurement of \(t\) will have an uncertainty width, and this must be brought within the size of the variation of \(H\). The actual state will be slightly mixed, but very close to pure.

Preparing the initial state \(\psi_0 \otimes \phi_0\) is just as hard as preparing each separately, so we focus on \(\phi_0\). One way to prepare a sharply peaked state is an initial \(t\) measurement of accuracy \(\epsilon_{\mathrm{meas}}\) according to the requirements above. If \(H(t)\) can be shifted appropriately so that the measurement result aligns with the desired start time, or the state can be shifted, then we've successfully prepared the desired state, and the clock Hamiltonian \(\mathsf{H} - E\) can be turned on. However, no serious attempt towards an implementation on physical quantum hardware has been made. In terms of digital quantum computing, a proposal for simulating the clock system will be supplied in Section 4.4, which follows our construction of a discrete clock space.

To summarize, we have shown how the propagator \(U\) generated by a time dependent \(H\) can be cast as an ordinary operator exponential, via the inclusion of a 1-dimensional clock space. Interesting in its own right, this framework also allows for a natural unification of ideas regarding "Trotterization." This term is used to refer to both (a) the splitting up of an (ordinary) operator exponential of \(H = \sum_j H_j\) into exponentials of the various \(H_j\), or (b) the simulation of a time dependent Hamiltonian by time independent simulations over small time intervals, as indicated in (2.18).
These can, in fact, be viewed as manifestations of the same phenomenon: a splitting of operator exponentials. To illustrate this with an example relevant to this chapter, let's consider a simple symmetric Trotterization of equation (4.17).
\[
U_2(t_0 + \Delta t, t_0) \equiv e^{iE\Delta t}\left( e^{-iE\Delta t/2}\, e^{-iH(t_0)\Delta t}\, e^{-iE\Delta t/2} \right) = e^{-iH(t_0 + \Delta t/2)\Delta t} \tag{4.24}
\]
We have just derived the midpoint formula [137, 124] from scratch. The Trotter product theorem says that
\[
\lim_{k\to\infty} e^{iE\Delta t}\left( e^{-iE\Delta t/2k}\, e^{-iH(t_0)\Delta t/k}\, e^{-iE\Delta t/2k} \right)^k = U(t_0 + \Delta t, t_0) \tag{4.25}
\]
even though \(E\) is unbounded [97]. Thus, we can expect that
\[
U_2^{(k)}(t_0 + \Delta t, t_0) \equiv e^{iE\Delta t}\left( e^{-iE\Delta t/2k}\, e^{-iH(t_0)\Delta t/k}\, e^{-iE\Delta t/2k} \right)^k \tag{4.26}
\]
constitutes a good approximation to \(U\) for sufficiently large \(k \in \mathbb{Z}^+\) and small \(\Delta t\).

This opens up the possibility of a more unified approach to Hamiltonian simulation algorithms, which has not yet been properly considered. For example, a natural generalization of product formulas to time dependent \(H\) could be a regular product formula of \(H_c\) on the enlarged space. For simplicity, let's take \(H(t)\) in whole as a single term, so that \(H_c = H - E\) has two terms. We can define time dependent product formulas by taking product formulas along these two terms. This allows us to define, in particular, time dependent generalizations of the recursive Suzuki-Trotter formulas of (2.28). Mathematical difficulties emerge in classifying the approximation order of these formulas, arising from the unboundedness of \(E\). We seek to ameliorate this in our current work, focusing our attention on the 2nd order symmetric formula.

4.3 Finite Clock Spaces

There are several reasons we are motivated to consider a discretization of the clock space introduced in the previous section. First, any real computation performed using the clock space will require a finite number of states. A natural choice for discretization is along the time ("position") basis, and we will consider that here.

There are also formal reasons to consider a finite clock space. We've already seen how the clock space aids in the understanding of time dependent generalizations of product formulas. Taking this idea further, we might consider time dependent multiproduct formulas, i.e. how to construct MPFs for time dependent problems. To ensure these MPFs work, we would like to show that an error series exists whose terms can be cancelled order by order through linear combinations of time dependent product formulas. However, such error series are more difficult to establish on generic separable Hilbert spaces, and moreover the operator \(E\) is unbounded. By making the clock space finite, performing the relevant analysis, then taking the limit, we can potentially avoid these difficulties. This programme is described in more detail in Chapter 5, but a full proof is the subject of an ongoing research project.

Without further ado, we introduce our finite dimensional clock space, which we will sometimes call the "clock register." We discretize the clock variable \(t\) into \(N_c = N_p \times N_q\) basis states, where \(N_p \in \mathbb{Z}^+\) will represent the number of Trotter steps used in the simulation. Each "Trotter step" is further divided into \(N_q \in \mathbb{Z}^+\) steps for reasons that will be discussed shortly. We label these orthonormal basis states \(|j\rangle\) for \(j \in [0, N_c - 1] \cap \mathbb{Z}\).
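Before discretizing, the midpoint formula (4.24) is easy to sanity-check numerically. Below is a minimal Python sketch for a single qubit; the test Hamiltonian and helper names are arbitrary choices of ours. A single midpoint step is compared against a finely resolved reference for the time-ordered propagator.

import numpy as np
from scipy.linalg import expm

X = np.array([[0, 1], [1, 0]])
Z = np.diag([1.0, -1.0])

def H(t):
    # an arbitrary smooth time dependent Hamiltonian, chosen only for testing
    return Z + 0.7 * np.sin(2 * t) * X

def U_exact(t0, dt, r=4096):
    """Time-ordered propagator via many tiny midpoint micro-steps (reference value)."""
    U = np.eye(2, dtype=complex)
    h = dt / r
    for k in range(r):
        U = expm(-1j * H(t0 + (k + 0.5) * h) * h) @ U
    return U

def U_midpoint(t0, dt):
    """Single midpoint step, eq. (4.24)."""
    return expm(-1j * H(t0 + dt / 2) * dt)

dt = 0.1
err = np.linalg.norm(U_exact(0.0, dt) - U_midpoint(0.0, dt), 2)
print(err)  # expect O(dt^3) local error for a single second-order step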
We will find it useful to consider, for our purposes, only periodic Hamiltonians. This is natural, since translation operators like \(E\) act most naturally on \(\mathbb{R}\) or the circle (periodic boundary conditions), and the circle is bounded. Nonperiodic Hamiltonians can be accommodated by a simple reflection, defining \(H(T + t) := H(T - t)\) for \(t \in [0, T]\). In our work below, we will want \(H(t)\) to be a differentiable bounded function within the grid points, and although the reflection introduces nonsmoothness, we can simply take one of the grid points to be the midpoint of the simulation.

For simplicity, and for lack of a reason otherwise, we will take these grid points \((t_j)_{j=0}^{N_c-1}\) to be uniformly spaced over the interval \([0, T]\): \(t_j = Tj/N_c\) (taking \(N_c\) to be an even integer, so that the midpoint requirement discussed directly above is satisfied). We let \(\delta t := T/N_c\) denote the grid width. We also take the natural discretization of \(H(t)\) onto the clock space
\[
H(t) \mapsto \sum_{j=0}^{N_c-1} H_j \otimes |j\rangle\langle j| \equiv C(H) \tag{4.27}
\]
where \(H_j \equiv H(t_j)\). Observe that \(C(H)\) has no dependence on the evolution parameter, i.e., it is autonomous. The notation \(C(H)\) is used to suggest a controlled operation, where the control is on the clock register.

Choosing the appropriate discretization of \(E\) is somewhat more tricky, though the choice appears obvious in hindsight. Since \(E\) acts as a derivative, it makes sense to take the discretized version to be a finite difference operator. For example,
\[
\Delta := -i\, \frac{U_+ - U_-}{2\delta t} \tag{4.28}
\]
where \(U_+\) is the shift operator defined by \(U_+|j\rangle = |j+1\rangle\) and \(U_- = U_+^\dagger\) is the backwards shift (all increments taken mod \(N_c\)). This is the approach we ultimately take. However, we note that the author of this dissertation, and collaborators, began by considering a distinct approach via the logarithm of the translation operator
\[
\tilde{\Delta} = i \log U_+. \tag{4.29}
\]
While apparently sensible, given the analogous relation between \(E\) and shifts on the clock space, this operator is not nicely behaved. For example, its commutator with the "position operator" \(\sum_j t_j |j\rangle\langle j|\), rather than being near-identity, has long off-diagonal tails. This behavior may be of independent interest, but from now on we will concern ourselves with \(\Delta\) as the discrete version of \(E\). With these choices, our full clock Hamiltonian becomes
\[
H_c := C(H) - \Delta. \tag{4.30}
\]
Already, we can show some reasonable properties carry over to this setting.

Lemma 4.3.1. In the notation above, let \(H : [0, T] \to \mathrm{Herm}(\mathcal{H})\) be a time dependent Hamiltonian on a finite-dimensional vector space \(\mathcal{H}\). Then
\[
[\Delta, C(H)] = i \operatorname{Re}\left( U_+ \sum_j \frac{H_{j+1} - H_j}{\delta t} \otimes |j\rangle\langle j| \right)
\]
where \(\operatorname{Re}(A) := (A + A^\dagger)/2\) denotes the Hermitian part of \(A\). If \(H\) is differentiable in each subinterval with bounded derivative, then we further have
\[
\|[\Delta, C(H)]\| \le \max_{t\in[0,T]} \|\dot{H}(t)\|.
\]
We remark here the connection to the canonical commutation relation \([f(x), p] = i f'(x)\). The additional shift by \(U_+\) is a relatively small deviation from a finite difference approximation being performed on the Hamiltonian.

Proof. We proceed in several steps, first by computing \([U_+, C(H)]\). We have
\[
[U_+, C(H)] = \sum_{j=0}^{N_c-1} H_j \otimes [U_+, |j\rangle\langle j|] = \sum_{j=0}^{N_c-1} H_j \otimes \left( |j+1\rangle\langle j| - |j\rangle\langle j-1| \right). \tag{4.31}
\]
By splitting the sum and reindexing (all increments modulo $N_c$), we can move the difference onto the $H_j$, giving

$$[U_+, C(H)] = \sum_j (H_j - H_{j+1}) \otimes |j+1\rangle\langle j| = -U_+ \sum_j (H_{j+1} - H_j) \otimes |j\rangle\langle j|. \tag{4.32}$$

Next, we have that $[U_-, C(H)] = -[U_+, C(H)]^\dagger$. Thus,

$$[U_+ - U_-, C(H)] = -2\,\mathrm{Re}\left(U_+ \sum_j (H_{j+1} - H_j) \otimes |j\rangle\langle j|\right) \tag{4.33}$$

and the full result follows almost immediately from the definition of $\Delta$ given in equation (4.28). As for the upper bound, we note that $\|\mathrm{Re}(A)\| \le \|A\|$ for any finite-dimensional $A$, and by unitary invariance of the spectral norm we have

$$\|[\Delta, C(H)]\| \le \left\|\sum_j \frac{H_{j+1} - H_j}{\delta t} \otimes |j\rangle\langle j|\right\| = \max_j \left\|\frac{H_{j+1} - H_j}{\delta t}\right\|. \tag{4.34}$$

The upper bound then follows from the claim

$$\left\|\frac{H_{j+1} - H_j}{\delta t}\right\| \le \max_{t \in [t_j, t_{j+1}]} \|\dot{H}(t)\| \tag{4.35}$$

coming from the fundamental theorem of calculus and the triangle inequality. □

Having defined the clock space and Hamiltonian, we wish to prepare a suitable initial state. A seemingly adequate and natural choice is $|\psi_0\rangle \otimes |0\rangle$, where $|\psi_0\rangle$ is the initial state of the system of interest and $|0\rangle$ is the clock state at the initial time $t = 0$. However, problems immediately arise, traceable to the fact that the continuous analogue of $|0\rangle$ is $\delta(t)$, which is not a normalizable state vector. This formal problem finds its way into the discrete setting: the finite difference $\Delta$ does not properly compute a derivative of $|0\rangle$. Thus $\Delta$ fails to translate $|0\rangle$ properly into later times, and the time dependent simulation fails.

To fix this issue, we take a cue from the continuous setting, where the best we can do is take a wavepacket of small enough width for our purposes. For simplicity, this wavepacket may as well be Gaussian, with some width $\sigma$ to be chosen with care. Thus, we introduce Gaussian functions

$$\phi_\mu(t; \sigma) = \frac{1}{\sqrt{\mathcal{N}}}\, e^{-|t-\mu|_c^2/\sigma^2} \tag{4.36}$$

of width $\sigma \in \mathbb{R}_+$ and center $\mu \in [0, T)$. Here $|\cdot|_c$ is the shortest distance to $0$ modulo $T$,

$$|t|_c = \min\{|t|, |T - t|\} \tag{4.37}$$

so that, with $0$ and $T$ identified, $\phi_\mu$ is smooth everywhere except at $\mu + T/2 \bmod T$. Moreover, $\mathcal{N} \in \mathbb{R}_+$ is chosen such that the discretized vector

$$|\phi_\mu\rangle = \sum_j \phi_\mu(t_j; \sigma)\, |j\rangle \tag{4.38}$$

is normalized in the Euclidean sense (i.e., is a quantum state vector). Technically, $\mathcal{N}$ has some dependence on $\mu$, but in our case we will only consider $\mu = t_j$ for some $j$, in which case $\mathcal{N}$ depends only on parameters such as $N_c$ and $\sigma$. Because of this choice, we write more simply $|\phi_j\rangle \equiv |\phi_{t_j}\rangle$.

We are now ready to elucidate the overall strategy of the clock space construction. Figure 4.1 gives a schematic of the relevant components. We imagine $N_p$ chunks of time steps, each containing $N_q$ subdivisions. As stated above, each of the $N_p$ chunks should be thought of as a single Trotter step of the evolution under $H(t)$. The $N_q$ substates ensure that $\delta t$ is small enough that $\Delta$ approximates a derivative on $\phi_j$; in particular, we will want $\sigma \gg \delta t$. On the other hand, we want the variation of $H$ within the envelope of $\phi_j$ to be small. That is, we want $\sigma < T/N_p$; since each Trotter step is presumably chosen sufficiently small, this ensures that $H$ is approximately constant over the bulk of $|\phi_j\rangle$.
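Before turning to the accuracy of this construction, the pieces defined so far (the grid, $C(H)$, $\Delta$, and a periodic coefficient) are simple to assemble numerically. The sketch below uses toy sizes and a single-qubit coefficient of our own choosing, and spot-checks the commutator bound of Lemma 4.3.1:

```python
# Build the discrete clock pieces and check Lemma 4.3.1 numerically.
# Toy sizes and the coefficient alpha(t) are illustrative assumptions.
import numpy as np

T, Np, Nq = 1.0, 8, 16
Nc = Np * Nq
dt = T / Nc
tj = T * np.arange(Nc) / Nc

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
alpha = lambda t: np.cos(2 * np.pi * t / T)                 # smooth, T-periodic
dalpha = lambda t: -(2 * np.pi / T) * np.sin(2 * np.pi * t / T)

# C(H) = sum_j H(t_j) (x) |j><j|, eq. (4.27)
ket = np.eye(Nc)
CH = sum(np.kron(alpha(t) * X + Z, np.outer(ket[j], ket[j]))
         for j, t in enumerate(tj))

# Delta = -i (U+ - U-) / (2 dt), eq. (4.28), with cyclic clock shifts
Uplus = np.roll(np.eye(Nc), 1, axis=0)                      # U+|j> = |j+1 mod Nc>
Delta = np.kron(np.eye(2), -1j * (Uplus - Uplus.T) / (2 * dt))

comm = Delta @ CH - CH @ Delta
print(np.linalg.norm(comm, 2))        # ||[Delta, C(H)]||
print(np.max(np.abs(dalpha(tj))))     # max_t ||Hdot(t)||, since ||X|| = 1
```

The first printed number should not exceed the second, in line with the lemma.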
Of course, we will want to ensure all of the above conditions are met using as few resources (such as clock register states) as possible while still obtaining an accurate simulation.

[Figure 4.1: Schematic of the discrete clock Hilbert space. The clock register holds an initially prepared Gaussian state which is translated uniformly under the clock Hamiltonian. Its location controls the Hamiltonian applied to the system of interest. The Hamiltonian varies little over each of the $N_p$ large steps, and the Gaussian is wide compared to the $N_q$ subdivisions within each large step.]

Let's now characterize the accuracy of this construction. First, it will be helpful to have a characterization of the size of the normalization $\mathcal{N}$.

Lemma 4.3.2. In the notation above, the normalization constant $\mathcal{N} \in \mathbb{R}_+$ for Gaussian states $|\phi_j\rangle$ peaked at $\mu = t_j$ satisfies

$$\frac{1}{\sqrt{\mathcal{N}}} \in O\left(\sqrt{\delta t/\sigma}\right).$$

Proof. By cyclicity, the normalization $\mathcal{N}$ is the same for all $|\phi_j\rangle$, so we consider $j = 0$. Because $|\phi_0\rangle$ is normalized in the Euclidean norm, we have

$$\mathcal{N} = \sum_{j=0}^{N_c-1} e^{-2|j\,\delta t|_c^2/\sigma^2} = \sum_{j=0}^{N_c/2-1} e^{-2j^2\delta t^2/\sigma^2} + \sum_{j=N_c/2}^{N_c-1} e^{-2(N_c-j)^2\delta t^2/\sigma^2} = \sum_{j=0}^{N_c/2-1} e^{-2j^2\delta t^2/\sigma^2} + \sum_{j=0}^{N_c/2} e^{-2j^2\delta t^2/\sigma^2} - 1. \tag{4.39}$$

We may lower bound the sums as Riemann approximations to a Gaussian integral, giving error functions:

$$\mathcal{N} \ge \sqrt{\frac{\pi}{8}}\,\frac{\sigma}{\delta t}\left(\mathrm{erf}\left(\frac{T + 2\delta t}{\sqrt{2}\,\sigma}\right) + \mathrm{erf}\left(\frac{T}{\sqrt{2}\,\sigma}\right)\right) - 1 > \sqrt{\frac{\pi}{2}}\,\frac{\sigma}{\delta t}\,\mathrm{erf}\left(\frac{T}{\sqrt{2}\,\sigma}\right) - 1, \tag{4.40}$$

which then implies

$$\frac{1}{\mathcal{N}} \le \frac{\sqrt{2/\pi}\,(\delta t/\sigma)}{\mathrm{erf}\left(\frac{T}{\sqrt{2}\,\sigma}\right) - \sqrt{2/\pi}\,(\delta t/\sigma)} = \sqrt{\frac{2}{\pi}}\,(\delta t/\sigma) + O\left((\delta t/\sigma)\left(\frac{\delta t}{\sigma} + e^{-T^2/2\sigma^2}\right)\right) \in O(\delta t/\sigma). \tag{4.41}$$

The result follows by taking a square root. □

With this technical lemma in hand, we turn to showing that $\Delta$ indeed generates translations of $|\phi_j\rangle$ on the clock space, provided $\sigma$ is large relative to $\delta t$ and the Gaussian is not truncated by small $T$.

Lemma 4.3.3. In the notation introduced in this section, for any $m \in \mathbb{Z}_+$ we have

$$e^{i\Delta m\delta t}|\phi_j\rangle = |\phi_{j+m}\rangle + O\left(m(\delta t/\sigma)^2 + m\sqrt{\delta t/\sigma}\, e^{-(T/2\sigma)^2}\right)$$

where the asymptotics $O$ are understood as $\delta t/\sigma \to 0$ and $\sigma/T \to 0$.

Proof. Performing a 1st order Taylor expansion of the exponential,

$$e^{i\Delta\delta t}|\phi_j\rangle = |\phi_j\rangle + i\delta t\,\Delta|\phi_j\rangle + R_1(\delta t)|\phi_j\rangle, \tag{4.42}$$

where $R_1$ is the integral Taylor remainder operator

$$R_1(\delta t) = -\int_0^{\delta t} (\delta t - \tau)\, e^{i\Delta\tau}\,\Delta^2\, d\tau. \tag{4.43}$$

The Taylor error can be bounded, via the triangle inequality for integrals, as

$$\|R_1(\delta t)|\phi_j\rangle\| \le \delta t^2\, \|\Delta^2|\phi_j\rangle\|. \tag{4.44}$$

The action of $\Delta$ on discretized functions $|g\rangle$ of the clock space is given by

$$\Delta|g\rangle = -i \sum_{j=0}^{N_c-1} g(t_j)\left(\frac{|j+1\rangle - |j-1\rangle}{2\delta t}\right) = i \sum_j \frac{g(t_{j+1}) - g(t_{j-1})}{2\delta t}\,|j\rangle = i\,|D_{\delta t}\, g\rangle. \tag{4.45}$$

Here $D_{\delta t} f(x) := \frac{f(x+\delta t) - f(x-\delta t)}{2\delta t}$ is the symmetric finite difference of halfwidth $\delta t$ at the point $x$. Thus, $\Delta^2|\phi_j\rangle = -|D^2_{\delta t}\,\phi_j\rangle$.
We consider the error of this finite difference, as an approximation to the derivative, for values of $t$ within $T/2 - 2\delta t$ of $t_j$ in circle distance. On this part of the domain, $\phi_j(t \pm 2\delta t)$ is smooth, hence

$$|D^2_{\delta t}\,\phi_j\rangle = |\partial_t^2\,\phi_j\rangle + O\left(\delta t^2\,\phi_j^{(4)}\right) \tag{4.46}$$

where the superscript $(4)$ indicates a fourth derivative. Near the edge of the Gaussian, the second-derivative approximation does not hold; however, these parts of the state vector have amplitude of order $O(\mathcal{N}^{-1/2} e^{-(T/2\sigma)^2})$, which by Lemma 4.3.2 is $O(\sqrt{\delta t/\sigma}\, e^{-(T/2\sigma)^2})$. This gets multiplied by $\delta t^{-2}$ due to the second finite difference $D_{\delta t}$ being taken. Note that the smooth part $-|\partial_t^2\phi_j\rangle$ of $\Delta^2|\phi_j\rangle$ is not genuinely erroneous: it recombines with the zeroth and first order terms of the expansion to build up the translated Gaussian. Taking the two genuinely erroneous sources independently as an upper bound, the deviation satisfies

$$\left\|\Delta^2|\phi_j\rangle + |\partial_t^2\,\phi_j\rangle\right\| \in O\left(\delta t^2/\sigma^4 + (\sigma\,\delta t^3)^{-1/2}\, e^{-(T/2\sigma)^2}\right) \tag{4.47}$$

where the $\sigma^{-4}$ comes from the four derivatives of the Gaussian. Thus, the corresponding contribution to the Taylor remainder may be upper bounded using (4.44) as

$$\|R_1(\delta t)|\phi_j\rangle\| \in O\left((\delta t/\sigma)^4 + \sqrt{\delta t/\sigma}\, e^{-(T/2\sigma)^2}\right) \tag{4.48}$$

up to the absorbed second-derivative term. To complete the proof we return to the linear Taylor expansion in (4.42). Using reasoning similar to the above,

$$|\phi_j\rangle + i\delta t\,\Delta|\phi_j\rangle = |\phi_j\rangle - \delta t\,|D_{\delta t}\,\phi_j\rangle = |\phi_j\rangle - \delta t\,|\partial_t\,\phi_j\rangle + O\left((\delta t/\sigma)^2 + \sqrt{\delta t/\sigma}\, e^{-(T/2\sigma)^2}\right). \tag{4.49}$$

Finally, what remains is a linear approximation to $|\phi_{j+1}\rangle$, with error also $O((\delta t/\sigma)^2)$. Keeping only the leading terms (the Taylor remainder (4.48) is subdominant), we conclude

$$e^{i\Delta\delta t}|\phi_j\rangle = |\phi_{j+1}\rangle + O\left((\delta t/\sigma)^2 + \sqrt{\delta t/\sigma}\, e^{-(T/2\sigma)^2}\right). \tag{4.50}$$

So far, we have proved the result for $m = 1$. The full result follows by noting that $e^{i\Delta m\delta t} = (e^{i\Delta\delta t})^m$ and taking, as an upper bound, $m$ times the error of a single step. □

We note that the error in $\Delta$ generating translations comes from two sources: discretization at small scales and boundary effects at large scales. In the language of lattice field theory, we might name these ultraviolet and infrared truncation effects, respectively.

Our next intermediate result concerns the time evolution of the system under $C(H)$, controlled on the Gaussian state $|\phi_j\rangle$. We want the result to be, approximately, an evolution under $H(t_j)$ on the main register of interest. In what follows, it will be convenient to take $\tau := T/N_p$ as the time duration of one of the larger subdivisions.

Lemma 4.3.4. Let $H : [0, T] \to \mathrm{Herm}(\mathcal{H})$ be a bounded differentiable function with bounded derivative. For any $\eta \in \mathbb{R}$, we have

$$e^{-iC(H)\eta}|\psi\rangle|\phi_j\rangle = e^{-iH(t_j)\eta}|\psi\rangle|\phi_j\rangle + O\left(\eta\tau\max_{t\in[0,T]}\|\dot{H}(t)\| + \left(1 + \eta\max_{t\in[0,T]}\|H(t)\|\right)e^{-\tau^2/4\sigma^2}\right)$$

where $\tau := T/N_p$.

Proof. We begin by grouping the terms of $C(H)$ into two chunks: one with significant overlap with the Gaussian, the other with small overlap. Specifically, we take $C(H) = H_{\mathrm{av}} + H_\perp$, with

$$H_{\mathrm{av}} := \sum_{k=j-N_q/2}^{j+N_q/2-1} H_k \otimes |k\rangle\langle k|, \qquad H_\perp := C(H) - H_{\mathrm{av}}. \tag{4.51}$$

Because $H_{\mathrm{av}}$ and $H_\perp$ commute, we can Trotterize with no error:

$$e^{-iC(H)\eta}|\psi\rangle|\phi_j\rangle = e^{-iH_\perp\eta}\, e^{-iH_{\mathrm{av}}\eta}\,|\psi\rangle|\phi_j\rangle. \tag{4.52}$$

We will show that the $H_{\mathrm{av}}$ term gives approximately $H(t_j)$, while $H_\perp$ acts approximately as the identity (for the right parameter values). First, consider $e^{-iH_{\mathrm{av}}\eta}$.
Define $P_j = \sum_k |k\rangle\langle k|$ as the projector onto the clock states on which $H_{\mathrm{av}}$ has support ($k \in \mathbb{Z} \cap [j - N_q/2,\ j + N_q/2 - 1]$). We have

$$\|e^{-iH_{\mathrm{av}}\eta} - e^{-iH_j \otimes P_j\,\eta}\| \le \eta\,\|H_{\mathrm{av}} - H_j \otimes P_j\|. \tag{4.53}$$

Meanwhile,

$$\|H_{\mathrm{av}} - H_j \otimes P_j\| = \left\|\sum_{k=j-N_q/2}^{j+N_q/2-1} (H_k - H_j) \otimes |k\rangle\langle k|\right\| = \max_k \|H_k - H_j\|. \tag{4.54}$$

By a simple Taylor bound, $\|H_k - H_j\| \le (\tau/2)\max_t\|\dot{H}(t)\|$, where the max is over $[t_k, t_j]$ (taking the appropriate ordering of $t_j, t_k$ if needed). We can therefore say

$$\|e^{-iH_{\mathrm{av}}\eta} - e^{-iH_j \otimes P_j\,\eta}\| \le \eta\tau\max_{t\in[0,T]}\|\dot{H}\| \tag{4.55}$$

so that, up to this error, we can replace a simulation under $H_{\mathrm{av}}$ with one under $H_j \otimes P_j$. Moving on to this situation, we have

$$e^{-iH_j \otimes P_j\,\eta}\,|\psi\rangle \otimes |\phi_j\rangle = e^{-iH_j\eta}|\psi\rangle\, P_j|\phi_j\rangle + |\psi\rangle\,(I - P_j)|\phi_j\rangle. \tag{4.56}$$

Thinking of $\sigma < \tau$ with $\tau/\sigma$ increasing, we have $P_j|\phi_j\rangle = |\phi_j\rangle + O(e^{-\tau^2/4\sigma^2})$. Thus, combining with (4.55),

$$e^{-iH_{\mathrm{av}}\eta}|\psi\rangle|\phi_j\rangle = e^{-iH_j\eta}|\psi\rangle|\phi_j\rangle + O\left(\eta\tau\max_{t\in[0,T]}\|\dot{H}\| + e^{-\tau^2/4\sigma^2}\right). \tag{4.57}$$

For the remainder of the proof, write $|\psi'\rangle = e^{-iH_j\eta}|\psi\rangle$ for notational convenience. We now consider the action of $H_\perp$ on the remaining state, which we anticipate to be small. First,

$$\left\|e^{-iH_\perp\eta}|\psi'\rangle|\phi_j\rangle - |\psi'\rangle|\phi_j\rangle\right\| \le \eta\,\|H_\perp|\psi'\rangle|\phi_j\rangle\|. \tag{4.58}$$

Let $\mathcal{J}$ be the index set of the time steps included in the summation $H_{\mathrm{av}}$. We have

$$\|H_\perp|\psi'\rangle|\phi_j\rangle\| = \left\|\sum_{k\notin\mathcal{J}} H_k|\psi'\rangle \otimes |k\rangle\langle k|\phi_j\rangle\right\| \le \sqrt{\sum_{k\notin\mathcal{J}} \frac{1}{\mathcal{N}}\, e^{-2|t_j - t_k|_c^2/\sigma^2}\,\|H_k\|^2}. \tag{4.59}$$

Employing a Hölder inequality on the inner product, followed by Lemma 4.3.2,

$$\sqrt{\sum_{k\notin\mathcal{J}} \frac{1}{\mathcal{N}}\, e^{-2|t_j - t_k|_c^2/\sigma^2}\,\|H_k\|^2} \le \max_{k\notin\mathcal{J}}\|H_k\|\,\sqrt{\sum_{k\notin\mathcal{J}} \frac{e^{-2|t_j - t_k|_c^2/\sigma^2}}{\mathcal{N}}} \in O\left(\max_t\|H(t)\|\,\sqrt{(\delta t/\sigma)\sum_{k=N_q/2}^{\infty} e^{-2k^2\delta t^2/\sigma^2}}\right). \tag{4.60}$$

Following a procedure similar to before, we convert to an error function and take an exponential upper bound. Doing so gives

$$\|H_\perp|\psi'\rangle|\phi_j\rangle\| \in O\left(\max_{t\in[0,T]}\|H(t)\|\, e^{-(\tau/2\sigma)^2}\right). \tag{4.61}$$

Thus, $e^{-iH_\perp\eta}$ acts trivially on this state up to $O(\eta\max_t\|H(t)\|\,e^{-\tau^2/4\sigma^2})$. Combining the errors together, we take the widest exponential $e^{-\tau^2/4\sigma^2}$ as a simple upper bound for all exponentials that appear. Putting all of the error sources together yields the statement of the lemma. □

With the previous two lemmas, we have the ingredients needed for a clock space simulation: controlled operations and time shifts. We combine them to show that our clock space indeed encodes time dependent dynamics.
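Before stating the theorem, a small numerical sanity check of the translation property (Lemma 4.3.3), which drives the whole construction, may be instructive. The sketch below (toy sizes of our choosing) applies $e^{i\Delta\,\delta t}$ to a discretized Gaussian and compares with the Gaussian shifted by one grid point:

```python
# Verify Lemma 4.3.3 numerically: exp(i Delta dt) translates the Gaussian
# clock state by one grid point up to O((dt/sigma)^2). Sizes are toy choices.
import numpy as np
from scipy.linalg import expm

T, Nc = 1.0, 256
dt = T / Nc
sigma = 16 * dt                               # sigma >> dt and sigma << T
tj = T * np.arange(Nc) / Nc

def phi(mu):
    d = np.abs((tj - mu) % T)
    circ = np.minimum(d, T - d)               # circle distance |t - mu|_c, eq. (4.37)
    v = np.exp(-circ ** 2 / sigma ** 2)
    return v / np.linalg.norm(v)

Uplus = np.roll(np.eye(Nc), 1, axis=0)
Delta = -1j * (Uplus - Uplus.T) / (2 * dt)    # eq. (4.28)
step = expm(1j * Delta * dt)

err = np.linalg.norm(step @ phi(0.0) - phi(dt))
print(err, (dt / sigma) ** 2)                 # comparable magnitudes expected
```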
Theorem 4.3.5. Let $H : [0, T] \to \mathrm{Herm}(\mathcal{H})$ be a time dependent Hamiltonian on a finite dimensional vector space $\mathcal{H}$, such that $H(t)$ is bounded and differentiable with bounded derivative. Then the clock Hamiltonian, with Gaussian input $|\phi_0\rangle$, approximately applies the time evolution operator $U(T, 0)$ to an initial state $|\psi_0\rangle \in \mathcal{H}$. Precisely,

$$e^{-iH_c T}|\psi_0\rangle|\phi_0\rangle = \left(U(T,0)|\psi_0\rangle\right)|\phi_0\rangle + O\left(\frac{T\delta t}{\sigma^2} + \sqrt{\frac{N_c T}{\sigma}}\, e^{-T^2/4\sigma^2} + \max_t\|\dot{H}\|\frac{T^2}{N_p} + e^{-\tau^2/4\sigma^2}\left(N_p + \max_t\|H\|\, T\right)\right).$$

Proof. Let $\tau := T/N_p$. We begin with a first-order Trotterization of $H_c$ into $N_p$ steps,

$$e^{-iH_c T} = \left(e^{i\Delta\tau}\, e^{-iC(H)\tau}\right)^{N_p} + O\left(\max_{t\in[0,T]}\|\dot{H}(t)\|\,\frac{T^2}{N_p}\right) \tag{4.62}$$

where the Trotter error is controlled by the commutator bound of Lemma 4.3.1. With initial state $|\psi_0\rangle|\phi_0\rangle$, combining Lemmas 4.3.4 and 4.3.3 gives the following error for a single Trotter step:

$$e^{i\Delta\tau}\, e^{-iC(H)\tau}\,|\psi_0\rangle|\phi_0\rangle = e^{-iH_0\tau}|\psi_0\rangle|\phi_{N_q}\rangle + O\left(\frac{\tau\delta t}{\sigma^2} + \sqrt{\frac{N_q\tau}{\sigma}}\, e^{-T^2/4\sigma^2} + \tau^2\max_t\|\dot{H}\| + \left(1 + \tau\max_t\|H\|\right)e^{-\tau^2/4\sigma^2}\right). \tag{4.63}$$

Thus, after all $N_p$ steps, multiplying the single-step error by $N_p$ gives the upper bound

$$\left(e^{i\Delta\tau}\, e^{-iC(H)\tau}\right)^{N_p}|\psi_0\rangle|\phi_0\rangle = e^{-iH_{N_q(N_p-1)}\tau}\cdots e^{-iH_{N_q}\tau}\, e^{-iH_0\tau}\,|\psi_0\rangle|\phi_0\rangle + O\left(\frac{T\delta t}{\sigma^2} + \sqrt{\frac{N_c T}{\sigma}}\, e^{-T^2/4\sigma^2} + \max_t\|\dot{H}\|\frac{T^2}{N_p} + e^{-\tau^2/4\sigma^2}\left(N_p + \max_t\|H\|\, T\right)\right). \tag{4.64}$$

The right-hand side, without the error, is a 1st order Suzuki-Trotter splitting, which approximates $U(T, 0)$ to order $\max_{t\in[0,T]}\|\dot{H}(t)\|\, T^2/N_p$. This can be absorbed into the third term of the big-$O$, giving the result stated in the theorem. □

With this result in hand, we now show that the clock parameters $(N_p, N_q, \sigma)$ can be chosen such that any desired degree of approximation to $U(T, 0)$ is achieved.

Theorem 4.3.6. In the context of the previous theorem, for any $\epsilon \in \mathbb{R}_+$ there exist clock parameters $(N_p, N_q, \sigma)$ such that

$$\left\|e^{-iH_c T}|\psi_0\rangle|\phi_0\rangle - U(T,0)|\psi_0\rangle|\phi_0\rangle\right\| < \epsilon$$

with $(N_p, N_q, \sigma)$ scaling as

$$N_p \in \Theta\left(\max_{t\in[0,T]}\|\dot{H}\|\,\frac{T^2}{\epsilon}\right), \qquad N_q \in \Theta\left(\frac{\max_t\|\dot{H}\|\, T^2 x^2}{\epsilon^2}\right), \qquad \sigma \in \Theta\left(\frac{\epsilon}{\max_t\|\dot{H}\|\, T x}\right).$$

Here,

$$x := \sqrt{\log\left(\frac{\Gamma T}{\epsilon}\right)}, \qquad \Gamma := \max\left\{\max_{t\in[0,T]}\|\dot{H}\|\,\frac{T}{\epsilon},\ \max_{t\in[0,T]}\|H\|\right\}.$$

In particular, there exists a sequence $(N_p(j), N_q(j), \sigma(j))$ of clock space parameters such that

$$\lim_{j\to\infty} \mathrm{Tr}_c\left(e^{-iH_c T}|\psi_0\rangle|\phi_0\rangle\right) = U(T,0)\,|\psi_0\rangle\langle\psi_0|\,U(T,0)^\dagger$$

where $\mathrm{Tr}_c$ is the partial trace over the clock register, and we use the shorthand $\mathrm{Tr}_c(|\Psi\rangle) \equiv \mathrm{Tr}_c(|\Psi\rangle\langle\Psi|)$.

Proof. To ensure a total error within $\epsilon$, it suffices to ensure that each of the five terms constituting the error in Theorem 4.3.5 is within $O(\epsilon)$ independently. From the onset, we choose $N_p \in \Theta\left(\max_{t\in[0,T]}\|\dot{H}\|\, T^2/\epsilon\right)$ to satisfy the third term. We next determine the necessary $\sigma$ scaling, parametrizing it as

$$\sigma = \tau/x \tag{4.65}$$

with the hope that $x$ can be chosen to grow slowly (i.e., that the Gaussian states have width only slightly smaller than the Trotter step size).
For this, we focus on the last two terms, since they have no $N_q$ dependence ($N_q$ will set the smallest scales). We seek

$$\frac{\max_t\|\dot{H}\|\, T^2}{\epsilon}\, e^{-x^2/4} \in O(\epsilon), \qquad \max_t\|H\|\, T\, e^{-x^2/4} \in O(\epsilon), \tag{4.66}$$

which can be satisfied provided $x$ is asymptotically lower bounded as

$$x^2 \in \Omega\left(\log\max\left\{\frac{\max_t\|\dot{H}\|\, T^2}{\epsilon^2},\ \frac{\max_t\|H\|\, T}{\epsilon}\right\}\right) = \Omega\left(\log(\Gamma T/\epsilon)\right). \tag{4.67}$$

This sets the scaling for $\sigma$. We move next to the first term to fix $N_q$, since the second term is expected to be quite small. We require $T\delta t/\sigma^2 \in O(\epsilon)$, which is equivalent to

$$\frac{N_p x^2}{N_q} \in O(\epsilon). \tag{4.68}$$

Therefore, there exists an $N_q \in \Theta\left(N_p x^2/\epsilon\right)$ satisfying the bound. All that remains is the second term, whose contribution is easily shown to be subdominant compared to the other sources. Therefore, the chosen parameter scalings suffice to achieve the desired error $\epsilon$.

We have shown that any desired precision $\epsilon$ for the dynamical simulation can be accommodated by appropriate choice of clock space parameters. Taking a sequence $\epsilon_j \to 0$, we see there exists a sequence of clock space evolutions whose limit, restricted to the main register, is $U(T, 0)$. □

The asymptotic scalings provided in the above theorem will be important in our discussion of qubitization, where we use them to derive a query complexity.

4.4 Time Dependent Qubitization

In the previous section, we developed a clock space construction which encodes a time dependent Hamiltonian as a time independent one on an augmented, finite-dimensional space. The removal of time-ordering using a clock register opens the door for quantum algorithms for time independent Hamiltonian simulation to simulate the full clock-system dynamics directly. In particular, qubitization is an asymptotically optimal simulation method [88] that can only be applied to time independent $H$. In this section, we propose the simulation of time dependent Hamiltonians through qubitization using finite clock registers. To be concrete, we work with an input model in which $H(t)$ is a linear combination of fixed unitaries with time-varying coefficients. This describes, for example, Pauli operators on $n$ qubits with fluctuating coefficients.

4.4.1 Pseudo-Algorithm

We take $n_c$ qubits to provide a clock register of size $N_c = 2^{n_c}$, appended to the main register of interest. The initial state $|\psi_0\rangle|\phi_0\rangle$ must be prepared on this joint register. We take $|\psi_0\rangle$ of the main register as given, since it is necessarily application dependent. We must, however, prepare a Gaussian state $|\phi_0\rangle$ of width $\sigma$ on the clock register of $n_c$ qubits. Unsurprisingly, much effort has been devoted to this task [80, 110, 109, 71, 57]. For our purposes, the approach of Kitaev and Webb [79, 10] is efficient enough. The Gaussian will have nonnegligible support over $O(N_q)$ clock states, and their algorithm scales polynomially in the number of qubits $n_q = \log N_q$ spanning the Gaussian. This cost is negligible compared to the simulation costs that we are about to discuss.

Once the initial state is prepared, we employ qubitization to approximate $e^{-iH_c T}$ on the full register. Given $H(t)$ in LCU form, we need to express $H_c$ in LCU form as well, which is not immediate. This is done through several applications of the Signature Matrix Decomposition.
We also truncate $\Delta$ at high frequencies to reduce computational cost, with little loss in accuracy. Details of the LCU decomposition are provided in the next subsection. Once $H_c$ is in LCU form, select (SEL) and prepare (PREP) circuits may be constructed to block encode $H_c$ as

$$H_c/\|c\|_1 = \left(\langle 0|\mathrm{PREP}^\dagger \otimes I\right)\,\mathrm{SEL}\,\left(\mathrm{PREP}|0\rangle \otimes I\right) \tag{4.69}$$

where $\|c\|_1$ is the one-norm of the LCU coefficients. Standard qubitization can then be performed on this block encoded Hamiltonian [88]. The PREP circuit must create a "quasi-uniform" distribution over some number $N$ of states, in the sense that, on the LCU auxiliary register,

$$|\mathrm{PREP}\rangle = \sum_{j=1}^{K-1} \sqrt{\delta}\,|j\rangle + \sum_{j=K}^{N} \sqrt{\delta'}\,|j\rangle \tag{4.70}$$

with $\delta$, $\delta'$, $K$ and $N$ determined by the parameters of the simulation. Meanwhile, the SEL circuit will need to apply controlled $U_i$ operations, where $U_i$ is a unitary in the $H(t)$ decomposition, and controlled signature matrices. The latter can be implemented with classical comparator circuits. Each SEL will also require a Quantum Fourier Transform and its inverse on the clock register.

4.4.2 Block Encoding

To make progress, it seems $H(t)$ itself should be expressible neatly in terms of unitaries. Thus, we assume $H(t)$ is of the form

$$H(t) = \sum_{i=1}^{L} \alpha_i(t)\, U_i \tag{4.71}$$

where the $U_i$ are Hermitian and unitary (e.g., $n$-qubit signed Pauli operators) and the $\alpha_i(t)$ are nonnegative real-valued functions on $[0, T]$. Upon discretization, the coefficients $\alpha_{ij} \equiv \alpha_i(t_j)$ will be particularly important. Expanding $C(H)$ of equation (4.27) using (4.71),

$$C(H) = \sum_{j=0}^{N_c-1}\left(\sum_{i=1}^{L} \alpha_{ij}\, U_i\right) \otimes |j\rangle\langle j| = \sum_{i=1}^{L} U_i \otimes D_i \tag{4.72}$$

where

$$D_i := \sum_{j=0}^{N_c-1} \alpha_{ij}\,|j\rangle\langle j| \tag{4.73}$$

is a diagonal operator on the clock register. There is a general technique for LCU constructions of diagonal, or easily diagonalized, operators via a Signature Matrix Decomposition, which we digress to discuss.

4.4.3 Interlude: Signature Matrix Decomposition, or the "Alternating Sign Trick"

In our manipulations of $H_c$, we have come across the problem of expressing a diagonal Hermitian matrix $D$ in LCU form, which we handle in the present section. Although I did not invent this technique, I had a sufficiently hard time finding a clear reference to it in the literature that an overview seemed appropriate, and possibly helpful to future researchers.

Many unitary bases exist, but our requirements here are quite stringent. We ultimately want our decomposition to consist of Hermitian operators as well, meaning they should be reflections ($U^2 = I$). Moreover, it is sensible to look for diagonal unitaries, because our operator $D$ is diagonal. These requirements alone enforce that our unitaries are, in fact, signature matrices: $U = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ where $\lambda_j = \pm 1$.

For the moment, imagine the entries of $D$ are all positive integers, and that we allow ourselves to add unitaries only in integer amounts. Think of each entry as a bucket of size $\lambda_j$. We want to fill each bucket, and we can only do so in units of $+1$ or $-1$ (by unitarity). Each time we add a unitary, say the identity, we add one unit to every bucket. Some of the buckets fill up faster than others because they are smaller, but we are not allowed to stop adding to a bucket, per se: by unitarity we must add or remove one unit at each step, never zero.
The next best thing we can do is, while the other buckets are being filled, add and remove one unit in alternating sequence. We continue until the largest bucket has been filled, at which point we stop.

Let's put this more formally. We want a sequence of unitaries that keeps track of whether each entry is above or below the number $k$ of additions that have already occurred. To this end, define

$$U_k := \sum_{j=1}^{n} (-1)^{k[k > \lambda_j]}\,|j\rangle\langle j| \tag{4.74}$$

where $[P]$ is the Boolean function of the proposition $P$, assigning 1 to true and 0 to false. We see that, for even $k$, $U_k = I$ is the identity operator, while for odd $k$, $U_k$ has eigenvalue $-1$ exactly on those $j$ with $k > \lambda_j$. Then, if we take a sum "until the largest bucket" $\|D\|$ has been filled, we should obtain $D$. In fact,

$$\sum_{k=1}^{\|D\|} U_k = \sum_{j=1}^{n} |j\rangle\langle j| \sum_{k=1}^{\|D\|} (-1)^{k[k > \lambda_j]} \tag{4.75}$$

and the inner sum can be written as

$$\sum_{k=1}^{\lambda_j} 1 + \sum_{k=\lambda_j+1}^{\|D\|} (-1)^k = \lambda_j + \epsilon \tag{4.76}$$

where $\epsilon \in \{-1, 0, 1\}$ is an $O(1)$ error. We will see shortly how to boost the precision, but first we note that, generalizing to positive real-valued entries, the same procedure for $U_k$ (taking $\lceil\|D\|\rceil$ as the upper summation limit) generates $\lfloor\lambda_j\rfloor$ on the entries to accuracy $\pm 1$ at worst. Thus, for real values the error is less than 2, which is still $O(1)$.

This might not seem like a good approximation, especially when $\lambda_j$ is small. But we can artificially increase the size of $\lambda_j$ by performing the same procedure on $D/\delta$ for suitably small $\delta > 0$, then multiplying by $\delta$. Let $L_\delta := \lceil\|D\|/\delta\rceil$. Then

$$D/\delta = \sum_{k=1}^{L_\delta} U_k + O(1) \tag{4.77}$$

so

$$D = \sum_{k=1}^{L_\delta} \delta\, U_k + O(\delta) \tag{4.78}$$

with $U_k$ the same as in (4.74) but with the replacement $\lambda_j \to \lambda_j/\delta$. We have succeeded at expressing $D$ in LCU form to accuracy $O(\delta)$ using $L_\delta$ terms.

We still haven't handled negative eigenvalues. This is accomplished by attaching the appropriate sign. Altogether, the $L_\delta$ matrices

$$U_k := \sum_{j=1}^{n} \mathrm{sgn}(\lambda_j)\,(-1)^{k[k > |\lambda_j|/\delta]}\,|j\rangle\langle j| \tag{4.79}$$

suffice to approximate $D$ to within $2\delta$ in each entry. Assuming the eigenvalues $\lambda_j$ are known and classically computable, unitaries such as (4.79) can be implemented on a quantum computer using comparator circuits. Observe that this procedure generates an LCU-type expansion more generally whenever the Hermitian operator $H$ is easily diagonalizable. Moreover, the fact that all coefficients in the LCU are equal allows for a simple implementation in a select-and-prepare block encoding.

4.4.4 Block Encoding (cont.)

Let $\Lambda_i(\delta) \equiv \lceil \max_j |\alpha_{ij}|/\delta \rceil$. Using a signature matrix decomposition, we can write, for $\delta > 0$,

$$D_i = \sum_{k=1}^{\Lambda_i(\delta)} \delta\, S_{ik}(\delta) + O(\delta) \tag{4.80}$$

where

$$S_{ik}(\delta) = \sum_{j=0}^{N_c-1} (-1)^{k[k > \alpha_{ij}/\delta]}\,|j\rangle\langle j| \tag{4.81}$$

and $[P]$ is the Boolean function of proposition $P$, with $[\mathrm{True}] = 1$ and $[\mathrm{False}] = 0$. Thus, we obtain an LCU decomposition of $C(H)$ as

$$C(H) = \delta \sum_{i=1}^{L} \sum_{k=1}^{\Lambda_i(\delta)} U_i \otimes S_{ik}(\delta) + O(L\delta). \tag{4.82}$$

The prepare circuit PREP is simple because the linear combination is uniform; it can therefore be accomplished using a Hadamard gate on each of

$$n_{C(H)} \in O\left(\log \sum_{i=1}^{L} \max_j |\alpha_{ij}|/\delta\right) \tag{4.83}$$

auxiliary qubits needed for a binary encoding.
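The alternating sign trick is easy to verify numerically. The following sketch (random toy eigenvalues, chosen only for illustration) reconstructs a diagonal $D$ from the signature matrices of (4.79):

```python
# Signature-matrix ("alternating sign") decomposition, eq. (4.79):
# approximate a diagonal Hermitian D by delta * sum_k U_k.
# The eigenvalues below are a random toy choice.
import numpy as np

rng = np.random.default_rng(7)
lam = rng.uniform(-3, 3, size=8)          # diagonal entries of D
delta = 0.05
L_delta = int(np.ceil(np.max(np.abs(lam)) / delta))

def U_k(k):
    # Diagonal of sgn(lam_j) * (-1)^(k * [k > |lam_j|/delta])
    flip = (k > np.abs(lam) / delta).astype(int)
    return np.sign(lam) * (-1.0) ** (k * flip)

approx = delta * sum(U_k(k) for k in range(1, L_delta + 1))
print(np.max(np.abs(approx - lam)))       # entrywise error; at most ~2*delta
```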
The unitaries $U_i \otimes S_{ik}(\delta)$ can be selected using two different SEL circuits: one for the original $U_i$ (presumed available to us) and one for the signature matrices $S_{ik}(\delta)$. The latter can be constructed from classical comparator circuits, provided each $\alpha_{ij}$ is computable.

We turn our attention now to $\Delta$. Although it is already in LCU form, the coefficient has size $2/\delta t$, which is larger than desirable. As discussed above, the problem stems from unnecessary high-frequency modes, which we wish to truncate. We start by converting $\Delta$ to Fourier space, i.e., diagonalizing via the Quantum Fourier Transform. The result may be computed by diagonalizing $U_+$, and is found to be

$$\Delta = \mathrm{QFT}\left[\sum_{j=0}^{N_c-1} \frac{N_c}{T}\sin\left(\frac{2\pi j}{N_c}\right)|j\rangle\langle j|\right]\mathrm{QFT}^\dagger = \mathrm{QFT}\left[\sum_{j=-N_c/2}^{N_c/2-1} \frac{N_c}{T}\sin\left(\frac{2\pi j}{N_c}\right)|j\rangle\langle j|\right]\mathrm{QFT}^\dagger \tag{4.84}$$

where, in the second expression, we define the indices $-j \equiv N_c - j$ for $j > 0$ and write the diagonalized $\Delta$ symmetrically about $j = 0$. The benefit of this parametrization is that small $|j|$ correspond to low-frequency modes, as we shall see. Let $\Delta_J$ be $\Delta$ truncated at frequencies above those determined by an index $J \in [0, N_c/2] \cap \mathbb{Z}$:

$$\Delta_J := \mathrm{QFT}\left[\sum_{j=-J}^{J} \frac{N_c}{T}\sin\left(\frac{2\pi j}{N_c}\right)|j\rangle\langle j|\right]\mathrm{QFT}^\dagger. \tag{4.85}$$

The error in a clock space evolution using $\Delta_J$ rather than $\Delta$ is upper bounded by $T\|\Delta|\phi_0\rangle - \Delta_J|\phi_0\rangle\|$, which can be evaluated and upper bounded as

$$T\|\Delta|\phi_0\rangle - \Delta_J|\phi_0\rangle\| = T\left\|\sum_{|j|>J} \frac{N_c}{T}\sin\left(\frac{2\pi j}{N_c}\right)|j\rangle\langle j|\,\mathrm{QFT}^\dagger|\phi_0\rangle\right\| \le N_c\sqrt{\sum_{|j|>J} |\langle j|\mathrm{QFT}^\dagger|\phi_0\rangle|^2}. \tag{4.86}$$

We thus desire a characterization of $\mathrm{QFT}^\dagger|\phi_0\rangle$, which we naturally expect to be another Gaussian, up to errors arising from the difference between discrete and continuous Fourier transforms. This analysis was performed in Appendix C of [113], and we adapt that work to our present situation. As the reference shows, the error in each component $j$ arises from three sources:

1. Truncation of the time variable to $O(T)$, which we denote $\epsilon_{\mathrm{trunc}}$.
2. Truncation of the frequency variable to $O(N_c/T)$ ("aliasing"), which we denote $\epsilon_{\mathrm{alias}}$.
3. Differences between normalizing in the continuum and in the discrete setting, which we denote $\epsilon_{\mathrm{norm}}$.

In our notation and setting, Rendon et al. [113] show that these errors satisfy the following asymptotic bounds:

$$\epsilon_{\mathrm{trunc}} \in O\left(\sqrt{\frac{\sigma}{T}}\, e^{-\Omega(T^2/\sigma^2)}\right), \qquad \epsilon_{\mathrm{alias}} \in O\left(\sqrt{\frac{\sigma}{T}}\, e^{-\Omega(N_c^2\sigma^2/T^2)}\right), \qquad \epsilon_{\mathrm{norm}} \in O\left(e^{-\Omega(N_c)}\right). \tag{4.87}$$

Let's take these errors to all be $O(\epsilon_{\mathrm{QFT}})$, with the required $\epsilon_{\mathrm{QFT}}$ to be determined. The results of Theorem 16 and Appendix C of [113] imply that

$$\mathrm{QFT}^\dagger|\phi_0\rangle = \sum_{j=-N_c/2}^{N_c/2-1} \left(\sqrt{\frac{\pi N_c}{\mathcal{N}}}\,\frac{\sigma}{T}\, e^{-(\pi j\sigma/T)^2} + O(\epsilon_{\mathrm{QFT}})\right)|j\rangle. \tag{4.88}$$

With this in hand, we return to (4.86). First,

$$|\langle j|\mathrm{QFT}^\dagger|\phi_0\rangle|^2 = \frac{\pi N_c}{\mathcal{N}}\,\frac{\sigma^2}{T^2}\, e^{-2(\pi j\sigma/T)^2} + O\left(\sqrt{\frac{\sigma}{T}}\, e^{-(\pi j\sigma/T)^2}\,\epsilon_{\mathrm{QFT}}\right) \tag{4.89}$$

where we assume (as will be justified) that the error $\epsilon_{\mathrm{QFT}}$ is asymptotically smaller than the amplitude itself. Taking the sum over high frequencies,

$$\sqrt{\sum_{|j|>J} |\langle j|\mathrm{QFT}^\dagger|\phi_0\rangle|^2} \in O\left(\sqrt{\frac{N_c}{\mathcal{N}}}\,\frac{\sigma}{T}\, e^{-\Omega(J^2\sigma^2/T^2)} + \sqrt{\frac{\sigma}{T}}\, e^{-\Omega(J^2\sigma^2/T^2)}\,\epsilon_{\mathrm{QFT}}\right) \subseteq O\left(\sqrt{\frac{\sigma}{T}}\, e^{-\Omega(J^2\sigma^2/T^2)}\,(1 + \epsilon_{\mathrm{QFT}})\right). \tag{4.90}$$
We next observe that $\epsilon_{\mathrm{QFT}} \in O(1)$ by the previous assumptions, so it can be dropped. From (4.86), we get the full simulation error by multiplying by $N_c$:

$$\epsilon_J \in O\left(N_c\sqrt{\frac{\sigma}{T}}\, e^{-\Omega(J^2\sigma^2/T^2)}\right). \tag{4.91}$$

In order for $\epsilon_J \in O(\epsilon)$, we want the cutoff $J$ to satisfy

$$e^{-J^2\sigma^2/T^2} \in O\left(\sqrt{\frac{T}{\sigma}}\,\frac{\epsilon}{N_c}\right) \tag{4.92}$$

which can be satisfied provided $J$ scales as

$$J \in \Theta\left(\frac{T}{\sigma}\sqrt{\log(\sigma/T) + \log N_c + \log(1/\epsilon)}\right) \subseteq \tilde{\Theta}(T/\sigma). \tag{4.93}$$

Letting $\tilde{\Delta} \equiv \Delta_J$ for this choice of $J$, we now consider the simulation of $\tilde{\Delta}$. Let $\delta' > 0$, and let $\Gamma(\delta') := \lceil (N_c/T\delta')\sin(2\pi J/N_c)\rceil$. We have

$$\sum_{j=-J}^{J} \frac{N_c}{T}\sin\left(\frac{2\pi j}{N_c}\right)|j\rangle\langle j| = \delta' \sum_{\ell=1}^{\Gamma(\delta')} S^{(\Delta)}_\ell(\delta') + O(\delta') \tag{4.94}$$

where

$$S^{(\Delta)}_k(\delta') := \sum_{j=-J}^{J} \mathrm{sgn}(j)\,(-1)^{k\left[k > (N_c/T\delta')\sin(2\pi|j|/N_c)\right]}\,|j\rangle\langle j|. \tag{4.95}$$

Defining the unitaries $V_\ell(\delta') := \mathrm{QFT}\, S^{(\Delta)}_\ell(\delta')\,\mathrm{QFT}^\dagger$, we have obtained an LCU decomposition of $\tilde{\Delta}$. The PREP circuit is, as with $C(H)$, only a column of Hadamards on

$$n_\Delta \in O\left(\log\left((N_c/T\delta')\sin(2\pi J/N_c)\right)\right) \subseteq \tilde{O}\left(\log\frac{1}{\sigma\delta'}\right) \tag{4.96}$$

auxiliary qubits. Meanwhile, the SEL circuit may be constructed as $\mathrm{QFT}\cdot\mathrm{SEL}'\cdot\mathrm{QFT}^\dagger$, where $\mathrm{SEL}'$ is a select circuit over the signature matrices $S^{(\Delta)}_\ell$ which, as before, can be implemented with comparator circuits that compute the sine. Combining with (4.82), we obtain an approximate LCU decomposition of the approximate clock Hamiltonian $\tilde{H}_c$:

$$\tilde{H}_c = \delta\sum_{i=1}^{L}\sum_{k=1}^{\Lambda_i(\delta)} U_i \otimes S_{ik}(\delta) + \delta'\sum_{\ell=1}^{\Gamma(\delta')} I \otimes V_\ell(\delta') + O(\epsilon/T + L\delta + \delta'). \tag{4.97}$$

To achieve an $\epsilon$-accurate simulation, we will require $\delta \in O(\epsilon/LT)$ and $\delta' \in O(\epsilon/T)$. The 1-norm $\|c\|_1$ of all of the coefficients is given by

$$\|c\|_1 = \delta\sum_{i=1}^{L}\Lambda_i(\delta) + \delta'\Gamma(\delta') \in O\left(\sum_{i=1}^{L}\max_j|\alpha_{ij}| + \frac{N_c}{T}\sin(2\pi J/N_c)\right) \subseteq O\left(\|\alpha\|^{\mathrm{rev}}_{\infty,1} + J/T\right) \subseteq \tilde{O}\left(\|\alpha\|^{\mathrm{rev}}_{\infty,1} + \sigma^{-1}\right) \subseteq \tilde{O}\left(\|\alpha\|^{\mathrm{rev}}_{\infty,1} + \frac{\max_t\|\dot{H}\|\, T}{\epsilon}\right) \tag{4.98}$$

where $\|\alpha\|^{\mathrm{rev}}_{\infty,1} \equiv \sum_{i=1}^{L}\max_t|\alpha_i(t)|$ and $\tilde{O}$ suppresses multiplicative logarithmic factors. Thus, the number of queries to the SEL and PREP circuits in an LCU encoding scales as

$$Q \in \tilde{O}\left(\|\alpha\|^{\mathrm{rev}}_{\infty,1}\, T + \frac{\max_t\|\dot{H}\|\, T^2}{\epsilon} + \frac{\log 1/\epsilon}{\log\log 1/\epsilon}\right). \tag{4.99}$$

The number of auxiliary qubits needed for the clock register is

$$n_c = \log N_p + \log N_q \in O\left(\log\left(\max_t\|\dot{H}\|\, T^2\right) + \log 1/\epsilon\right) \tag{4.100}$$

while the number of auxiliary qubits needed for the LCU block encoding is given by

$$n_{\mathrm{LCU}} = n_{C(H)} + n_\Delta \in O\left(\log\frac{\|\alpha\|^{\mathrm{rev}}_{\infty,1}}{\delta} + \log\frac{1}{\sigma\delta'}\right) \subseteq O\left(\log\frac{TL\|\alpha\|^{\mathrm{rev}}_{\infty,1}}{\epsilon} + \log\frac{\max_t\|\dot{H}\|\, T^2}{\epsilon^2}\right) \subseteq O\left(\log L + \log(\|\alpha\|^{\mathrm{rev}}_{\infty,1}\, T) + \log(\max_t\|\dot{H}\|\, T^2) + \log 1/\epsilon\right) \tag{4.101}$$

for a total number of auxiliary qubits $n \in O(n_{\mathrm{LCU}})$.

With our bounds, improvements over 1st order Trotter in query complexity appear only for very slow time variation. We may be able to improve this through a better characterization of the error stemming from $H_c$; we use Trotter bounds in those calculations, but a smarter analysis may be needed to avoid the limitations seen here.
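As a quick numerical illustration of the truncation step (4.85), the sketch below (toy sizes of our own choosing) builds $\Delta$ and $\Delta_J$ in Fourier space and measures their difference on a Gaussian clock state:

```python
# Frequency truncation of Delta, eq. (4.85): its action on a wide Gaussian
# clock state is barely changed. Sizes here are toy assumptions.
import numpy as np

T, Nc = 1.0, 256
dt = T / Nc
sigma = 16 * dt
tj = T * np.arange(Nc) / Nc
circ = np.minimum(tj, T - tj)                  # circle distance to t = 0
phi0 = np.exp(-circ ** 2 / sigma ** 2)
phi0 /= np.linalg.norm(phi0)

F = np.fft.fft(np.eye(Nc)) / np.sqrt(Nc)       # one convention for the QFT matrix
freqs = np.arange(Nc)
freqs = np.where(freqs >= Nc // 2, freqs - Nc, freqs)   # symmetric indices
diag_full = (Nc / T) * np.sin(2 * np.pi * freqs / Nc)   # eigenvalues, eq. (4.84)

Delta = F @ np.diag(diag_full) @ F.conj().T
for J in [4, 8, 16, 32]:
    diag_J = np.where(np.abs(freqs) <= J, diag_full, 0.0)
    DeltaJ = F @ np.diag(diag_J) @ F.conj().T
    print(J, np.linalg.norm((Delta - DeltaJ) @ phi0))
```

With $T/\sigma = 16$ here, the printed error drops sharply once $J$ passes roughly $T/\sigma$, consistent with (4.93).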
4.5 Discussion

In the circuit-based Hamiltonian simulation community, time dependent Hamiltonians are often treated on a separate footing from time independent ones. The main contribution of this project is a new way of thinking about time dependent dynamics that lets us replace time-ordered operator exponentials with ordinary operator exponentials acting on a higher-dimensional, finite space. We apply the discretized $(t, t')$ trick and show that it encodes the time-ordered exponential for sufficiently large clock sizes.

The clock space framework can also be used directly in quantum simulation methods to extend the capabilities of certain quantum algorithms; specifically, it can be used to extend qubitization to time dependent systems. While in many circumstances a truncated Dyson series simulation method will be more convenient than this approach, our work shows how discrete clock spaces can be used to construct new quantum simulation algorithms that would otherwise be challenging to obtain.

Besides an LCU encoding, natural block encodings of $H_c$ may be possible. For example, a very general input model for $H(t)$ is a $d$-sparse matrix with query access to the nonzero entries. This seems a quite promising avenue, because $H_c = C(H) - \Delta$ is then $(d+2)$-sparse, and there is a natural way to query the entries of $H_c$. Hence, such a Hamiltonian should be immediately simulatable by qubitization (or other quantum walk methods). The trouble is that the largest entry of $H_c$ in absolute value, $\|H_c\|_{\max}$, comes from $\Delta$ and is of size $N_c/2T$. This is too large to yield an effective simulation algorithm.

Of course, there is something odd about needing to care about the operator norm $\|\Delta\|$, since the typical state being acted on is a Gaussian $|\phi_j\rangle$. Thinking of $\Delta$ in frequency space, modes of frequency $\Omega(\sigma^{-1})$ should not be relevant for Gaussian states of width $O(\sigma)$ on the clock register. This suggests that a high-frequency truncation of $\Delta$, say $\tilde{\Delta}$, would act approximately the same on the Gaussians while decreasing the norm. However, there is no guarantee that the modified operator $\tilde{\Delta}$ is sparse in the basis of clock times. Perhaps a reduced clock Hamiltonian $\tilde{H}_{ij} = \langle\phi_i|H_c|\phi_j\rangle$, with all small elements set to zero, would have the required sparsity, along with a subspace norm $\|\Delta\|_\phi \in O(\sigma^{-1})$. Investigating such sparse encodings would make an interesting avenue for future work.

CHAPTER 5
MULTIPRODUCT FORMULAS FOR TIME DEPENDENT SIMULATION

One of the key developments of Chapter 4 is the construction of a discrete clock space which reduces the simulation of time dependent Hamiltonians to that of time independent ones. This reduction provides a useful way to translate techniques and concepts between the time dependent and time independent settings. In this chapter, we investigate one of these connections: a generalization of multiproduct formulas (MPFs) to the time dependent setting, based on product formula simulations of the clock space. After arguing that such "time dependent MPFs" should form good approximations to the time evolution operator $U$ for sufficiently smooth $H(t)$, we propose an algorithm based on these MPFs. We then provide a rigorous characterization of the error in these formulas, and from this derive a query complexity in a natural Hamiltonian input model. Numerical demonstrations validate the effectiveness of time dependent MPFs at achieving high-accuracy simulations.
Our findings for the MPF algorithm (and the qubitization algorithm) are summarized in Table 5.1, alongside other leading algorithms for time dependent Hamiltonian simulation. Overall, the MPF algorithm has performance comparable to the Dyson series method, with strengths and weaknesses on both sides. For example, unlike the Dyson method, the MPF simulation exhibits commutator scaling, meaning that the simulation is exact in the limit of commuting Hamiltonian terms and no time dependence. It also scales with the more favorable $L^1$ norm of the Hamiltonian over time, rather than with the maximum value at any given time, as the Dyson series does. On the other hand, the Dyson series is unconcerned with large derivatives, depending only on the size of $H$. Overall, time dependent MPF simulation enlarges the collection of tools available to the future practitioner looking for the right algorithm for their problem of interest.

This chapter is the subject of ongoing research, particularly with respect to the proof (or disproof) of Conjecture 1. An early preprint has been posted [133] and will be updated as the project reaches completion.

Method        | Query Complexity | Auxiliary Qubits | CS?
Trotter [137] | $O\left(L(\|\Lambda\|_1)^{1+o(1)}/\epsilon^{o(1)}\right)$ | 0 | Yes
QDrift [16]   | $O\left(\|\alpha\|_{1,1}^2/\epsilon\right)$ | 0 | No
Dyson [89, 75] | $O\left(\|\alpha\|_{1,\infty}\, T \log(1/\epsilon)\right)$ | $\tilde{O}\left(\log(\|\dot{\alpha}\|_{1,1}/\epsilon) + \log(\|\alpha\|_{1,\infty}T/\epsilon)\right)$ | No
Qubitization* | $\tilde{O}\left(\|\alpha\|^{\mathrm{rev}}_{\infty,1}\, T + \max_t\|\dot{H}\|T^2/\epsilon + \log 1/\epsilon\right)$ | $O\left(\log(L\|\alpha\|^{\mathrm{rev}}_{\infty,1}T/\epsilon) + \log(L\max_t\|\dot{H}\|T^2/\epsilon)\right)$ | No
MPF           | $\tilde{O}\left(L\|\Lambda\|_1 \log^2(1/\epsilon)\right)$ | $\tilde{O}\left(\log(L\|\Lambda\|_1\|\dot{\alpha}\|_{\infty,\infty}T^2/\epsilon)\right)$ | Yes

Table 5.1 Summary of our results (the Qubitization* and MPF rows) and comparison to leading quantum simulation methods for time dependent Hamiltonians. We assume $H = \sum_{j=1}^{L}\alpha_j(t)U_j$ for Hermitian unitaries $U_j$ and real-valued $\alpha_j(t)$. $\Lambda$ is a positive, time dependent function with the dimensions of $H$ that quantifies the size of the $\alpha_j$ and their derivatives (see Definition 3). $\|\alpha\|_{p,q}$ refers to a nested vector-$p$ and functional-$q$ norm of the coefficients $\alpha = (\alpha_j)_{j=1}^{L}$, and $\|\alpha\|^{\mathrm{rev}}_{p,q}$ indicates that these are taken in reverse order. Commutator scaling (CS) here means the simulation error vanishes in the limit where $H$ is time independent and $[U_j, U_k] = 0$ for all $j, k \in [L]$.

5.1 Introduction and Background

Multiproduct formulas (MPFs) are a generalization of the celebrated product formulas and span two of the pillars of quantum simulation: product formula and LCU methods. The aim of an MPF is to approximate the time evolution operator $U$ by a linear combination of lower-order Trotter formulas, in such a way that higher order errors cancel [28, 24, 90]. They are, fundamentally, nothing more than a Richardson extrapolation of a product formula $\mathcal{P}$ in the Trotter step size $s \to 0$. This summation addresses the primary deficiency of product formulas: the cost of constructing a high order product formula is exponentially large. This is true not only of the well-known Suzuki-Trotter formulas, but of any similar construction, due to the need to cancel error terms that grow exponentially in the number of products considered. In contrast, since an MPF is a sum of product formula approximations, the number of error terms of a given order does not grow exponentially.
This allows us to approximate the quantum dynamics using polynomially many, rather than exponentially many, operator exponentials.

The central result concerning the use of MPFs for time independent simulation is the following theorem of Low, Kliuchnikov, and Wiebe [90].

Theorem 5.1.1 (Time independent MPFs (Theorem 1 of [90])). Let $H$ be a bounded, time independent Hamiltonian, and let $U_2(t)$ be the 2nd order Suzuki-Trotter formula for the time evolution operator $U(t) = e^{-iHt}$. Let $a = (a_1, a_2, \ldots, a_m) \in \mathbb{R}^m$ and $\vec{k} = (k_1, k_2, \ldots, k_m) \in \mathbb{Z}_+^m$. There exist choices of $a$ and $\vec{k}$ such that the multiproduct formula

$$U_{2,m}(t) := \sum_{j=1}^{m} a_j\, U_2^{k_j}(t/k_j)$$

is order $2m$ and satisfies

$$\max_j k_j \in O(m^2), \qquad \|a\|_1 \in O(\mathrm{polylog}(m)).$$

As a caution, we remark that, despite the notation, the MPF $U_{2,m}$ is not generally unitary for $m > 1$, though when suitably constructed it approximates the unitary $U$ (and is hence approximately unitary). The proof of Theorem 5.1.1 may be found in [90], but at a high level, the MPF $U_{2,m}$ is a Richardson extrapolation of $U_2$ with respect to the Trotter step size parameter $1/k$. Such an extrapolation is possible for arbitrary $m$ because there exists an error series [18]

$$U_2^k(t/k) - U(t) = \sum_{j=1}^{\infty} E_{2j+1}\, \frac{t^{2j+1}}{k^{2j}} \tag{5.1}$$

with $E_{2j+1}$ independent of $k$ (but not, generically, of $t$). The existence of this series suffices for a $1/k \to 0$ Richardson extrapolation [119]. In particular, cancellation occurs for coefficients $a_j$ satisfying the following Vandermonde linear system:

$$\begin{pmatrix} 1 & \cdots & 1 \\ k_1^{-2} & \cdots & k_m^{-2} \\ \vdots & \ddots & \vdots \\ k_1^{-2m+2} & \cdots & k_m^{-2m+2} \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \tag{5.2}$$

Though the matrix is ill-conditioned, this is irrelevant to numerical stability, as the inverse Vandermonde matrix admits an analytic solution that may be derived from the theory of polynomial interpolation. What matters for our application is the one-norm $\|a\|_1$ of the coefficients, which serves as our "condition number" because of how it amplifies small errors in the base formula $U_2$ [90]. The content of Theorem 5.1.1 is that the Trotter numbers $\vec{k}$ may be chosen such that $\|a\|_1$ is not too large.

For time-ordered $U$, the analysis of [18] does not carry over, although reasonable "time dependent" MPFs can be defined heuristically. One of our motivations in constructing a clock space is to eliminate time ordering and, in that setting, show that these formulas work. As discussed in [90], specific choices of $k_j$ can be found numerically to minimize $\|a\|_1$, and this may be the best approach in practice.
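For concreteness, solving (5.2) takes only a few lines. The sketch below (a toy $\vec{k}$, our choice) computes the coefficients and their one-norm, and checks them against the closed form from Lagrange interpolation at $1/k^2 \to 0$, namely $a_j = \prod_{i \ne j} k_j^2/(k_j^2 - k_i^2)$:

```python
# Solve the Vandermonde system (5.2) for the MPF coefficients a_j.
import numpy as np

def mpf_coeffs(ks):
    ks = np.asarray(ks, dtype=float)
    m = len(ks)
    V = np.vander(ks ** -2.0, m, increasing=True).T   # rows: k_j^0, k_j^-2, ...
    b = np.zeros(m); b[0] = 1.0
    return np.linalg.solve(V, b)

ks = [1, 2, 3, 4]                 # toy Trotter numbers
a = mpf_coeffs(ks)
print(a, np.abs(a).sum())         # coefficients and the "condition number" ||a||_1

# Closed form from Lagrange interpolation (extrapolation to 1/k^2 -> 0):
a_closed = [np.prod([kj**2 / (kj**2 - ki**2) for ki in ks if ki != kj])
            for kj in ks]
print(np.allclose(a, a_closed))   # True
```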
However, for our analytical results it will be most appropriate to use the specific $k_j$ chosen in their constructive proof of well-conditioned MPFs. Thus, for all results we take the powers $k_j$ as follows:

$$k_j = \left\lceil \frac{\sqrt{8}\, m}{\pi}\left|\sin\left(\frac{\pi(2j-1)}{8m}\right)\right|^{-1} \right\rceil, \qquad j = 1, \ldots, m. \tag{5.3}$$

We will use these same powers even in the time dependent MPFs introduced in the subsequent section. For the error analysis, it will be useful to have simple, concrete bounds on $k_j$. We can achieve these by noting that $\sin(x) \le x$ and $\sin(x) \ge 4x/5$ for $x \in [0, 1]$. This gives the lower bound

$$k_j \ge \left\lceil \frac{8^{3/2}\, m^2}{\pi^2(2j-1)} \right\rceil \ge \left\lceil \frac{8^{3/2}\, m^2}{\pi^2(2m-1)} \right\rceil > \frac{\sqrt{128}\, m}{\pi^2} > m \tag{5.4}$$

and the upper bound

$$k_j \le \left\lceil \frac{\sqrt{8}\, m^2 \cdot 5 \cdot 8}{4\pi^2(2j-1)} \right\rceil \le \left\lceil \frac{\sqrt{8}\, m^2 \cdot 5 \cdot 8}{4\pi^2} \right\rceil < 3m^2. \tag{5.5}$$

Note the consistency of (5.5) with the big-$O$ scaling of Theorem 5.1.1.

5.2 Definition and Effectiveness

Multiproduct formulas have already been considered extensively in the Hamiltonian simulation community [24, 45, 141]; however, they have yet to be seriously considered for time dependent Hamiltonian simulations. Because $U$ generically involves time-ordering, the techniques used in [18], which rely on Baker-Campbell-Hausdorff-type expansions, do not carry over directly. An approach based instead on the Magnus expansion might be expected to work in its place, but no subset of terms in that expansion represents the exact evolution separated from error terms. Without such a generalization, MPFs cannot be applied to interaction picture algorithms, nor to simulations of physical systems with intrinsic time dependence.

It is rather easy to propose a reasonable generalization of the MPFs of Theorem 5.1.1 that would be expected to work well in the time dependent case: simply replace the $k_j$th power with a sequence of $k_j$ unitaries, one per time slice. We present this definition shortly. First, as a preliminary step, we clearly define a notion of approximation order for two-parameter operator functions (such as the propagator $U(t, t_0)$) that will suit our purposes.

Definition 1. For finite-dimensional $\mathcal{H}$, let $L : [0,T]^2 \to \mathcal{L}(\mathcal{H})$. We say that $L_p : [0,T]^2 \to \mathcal{L}(\mathcal{H})$ is a $p$th-order approximation to $L$ if, for all $t \in [0, T)$,

$$\|L(t+\tau, t) - L_p(t+\tau, t)\| \in O(\tau^{p+1})$$

where $\tau$ is taken asymptotically to 0.

Observe that this aligns with the usual notion of $p$th order formulas when $L$ is considered as a function of the single variable $\tau$ with $t$ fixed. With $p$th order approximants defined, we now propose a generalization of MPFs to two-parameter operators such as the general propagator $U$.

Definition 2 (Time Dependent Multiproduct Formulas). For finite dimensional $\mathcal{H}$ and $L : [0,T]^2 \to \mathcal{L}(\mathcal{H})$, let $L_p : [0,T]^2 \to \mathcal{L}(\mathcal{H})$ be a $p$th-order formula for $L$. Given $m \in \mathbb{Z}_+$, $\vec{k} \in \mathbb{Z}_+^m$, and $a \in \mathbb{R}^m$, define the time dependent multiproduct formula $L_{m,p} : [0,T]^2 \to \mathcal{L}(\mathcal{H})$ by

$$L_{m,p}(t, t_0) := \sum_{j=1}^{m} a_j\, L_p^{(k_j)}(t, t_0)$$

where

$$L_p^{(k)}(t, t_0) := \prod_{\ell=0}^{k-1} L_p(t_{\ell+1}, t_\ell)$$

and $t_\ell = t_0 + (t - t_0)\ell/k$.

The choice of equally spaced $t_\ell$ is not entirely coincidental, for the same reason that, in the time independent setting, we take $U_2^k(t/k)$ instead of, say,

$$\prod_{j=1}^{k} U_2(s_j\, t) \tag{5.6}$$

where $s = (s_1, \ldots, s_k)$ is a probability vector.
. . , π‘ π‘˜ ) is a probability vector. Taking a simple power of π‘˜ makes reasoning with the BCH expansion possible. While these definitions are likely applicable in more general contexts (such as classical time dependent symplectic dynamics), our interest in simulation means we will consider 𝐿 = π‘ˆ to be a time evolution operator, satisfying all the corresponding properties. Moreover, we will assume 𝐻 is in the so-called linear combination of Hamiltonians (LCH) form 𝐻 = 𝐿 βˆ‘οΈ 𝑖=1 𝐻𝑖 (𝑑) (5.7) which is suitable for product formula simulations. Later we will make the assumption that the exponentials of each 𝐻 𝑗 (𝑑) can be efficiently computed, a standard assumption. Because we utilize the well-conditioning results of [90], we want the base formula to be 2nd order and symmetric. Thus, we will take 𝐿 𝑝 = π‘ˆ2 to being the 2nd order symmetric midpoint formula π‘ˆ2(𝑑 + 𝜏, 𝑑) := (cid:110) βˆ’π‘–π»π‘– (cid:16) 𝑑 + exp 1 (cid:214) 𝑖=𝐿 (cid:17) (cid:111) 𝜏 𝜏 2 𝐿 (cid:214) 𝑖=1 (cid:110) βˆ’π‘–π»π‘– (cid:16) 𝑑 + exp (cid:17) (cid:111) 𝜏 , 𝜏 2 (5.8) where 𝐻 is the Hamiltonian generating π‘ˆ. That π‘ˆ2 is second-order can be seen from Taylor expanding the Dyson series of π‘ˆ about 𝜏 = 0 (𝐻 must be at least twice differentiable). Moreover, π‘ˆ2 is time-reversal symmetric in the same sense as π‘ˆ: π‘ˆ2(𝑑, 𝑑0) = π‘ˆ2(𝑑0, 𝑑)†. This gives the nice property that the error series for π‘ˆ (𝑑 + 𝜏, 𝑑) βˆ’ π‘ˆ (π‘˜) (𝑑 + 𝜏, 𝑑) has only even terms, such that higher order formulas can be reached with approximately half the number of summands. Thus, from now on we will be interested in the MPF π‘ˆ2,π‘š (𝑑, 0) = π‘š βˆ‘οΈ 𝑗=1 π‘Ž π‘—π‘ˆ (π‘˜ 𝑗 ) 2 (𝑑, 0) (5.9) for the rest of this chapter. We finally turn to the question of whether the time dependent MPFs of Definition 2 may be constructed for improved approximants. At the beginning of this section, we mentioned the difficulty presented by time-ordering in adopting the techniques from [18]. The reader of the previous chapter may recognize that clock spaces may be used to remove time ordering, circumventing the issue. However, when the clock variable 𝑑 is continuous, the shift term βˆ’πΈ in the clock Hamiltonian is 108 an unbounded operator, complicating a BCH-type analysis. We conjecture, and provide a heuristic argument, that time dependent MPFs indeed boost the approximation order for sufficiently smooth Hamiltonians. Conjecture 1. Let 𝐻 = (cid:205)𝐿 𝑖=1 𝐻𝑖 (𝑑), and let π‘ˆ2(𝑑 + 𝜏, 𝑑) = π‘’βˆ’π‘–π»π‘– (𝑑+𝜏/2)𝜏 1 (cid:214) 𝑖=𝐿 𝐿 (cid:214) 𝑖=1 π‘’βˆ’π‘–π»π‘– (𝑑+𝜏/2)𝜏 be the symmetric, 2nd order Trotterized midpoint formula. Suppose each 𝐻𝑖 is 2π‘š + 1 time differentiable. Then the time dependent multiproduct formula π‘ˆ2,π‘š (𝑑 + 𝜏, 𝑑) with base formula π‘ˆ2 approximates π‘ˆ (𝑑 + 𝜏, 𝑑) to order 2π‘š in 𝑑. We now discuss a potential path to proof of this conjecture. Without loss of generality, we take 𝑑 = 0. Let π‘˜ ∈ Z+, and consider a sequence of discrete clock constructions on interval [0, 𝜏], with parameters (𝑁 𝑝 (β„“), π‘π‘ž (β„“), 𝜎(β„“)), such that π‘˜ always divides 𝑁𝑐 = 𝑁 𝑝 π‘π‘ž, and such that the limit reproduces the dynamics of 𝐻 (𝑑) on the main register, as per Theorem 4.3.6. Consider one of the elements of this sequence. Using the form of 𝐻 given in the conjecture statement, we may write 𝐢 (𝐻) = 𝐿 βˆ‘οΈ 𝑖=1 𝐢 (𝐻𝑖). Thus, the clock Hamiltonian 𝐻𝑐 admits the following 2nd order symmetric Trotterization. 
$$V_2(\tau) = e^{i\Delta\tau/2}\left(\prod_{i=L}^{1} e^{-iC(H_i)\tau/2} \prod_{i=1}^{L} e^{-iC(H_i)\tau/2}\right) e^{i\Delta\tau/2}. \tag{5.11}$$

From [18], we have that

$$V(\tau) - V_2^k(\tau/k) = \sum_{j=1}^{m-1} \mathcal{E}_{2j+1}(\tau)\,\frac{\tau^{2j+1}}{k^{2j}} + \mathcal{E}(\tau, k) \tag{5.12}$$

where $\mathcal{E} \in O(\tau^{2m+1})$ is analytic in $\tau$. Thus the standard, well-conditioned multiproduct formula $V_{2,m}$ of Theorem 5.1.1 with base formula $V_2$ satisfies

$$V(\tau) - V_{2,m}(\tau) = \sum_{j=1}^{m} a_j\,\mathcal{E}(\tau, k_j). \tag{5.13}$$

We now wish to look at the action on the main register. Applying equation (5.13) to a state $|\psi\rangle|\phi_0\rangle$ of the full register, where $|\psi\rangle$ is arbitrary, and then taking the trace $\mathrm{Tr}_c$ over the clock register, one obtains

$$\mathrm{Tr}_c(V(\tau)|\psi\rangle|\phi_0\rangle) - \mathrm{Tr}_c(V_{2,m}(\tau)|\psi\rangle|\phi_0\rangle) = \sum_{j=1}^{m} a_j\, E(\tau, k_j)(|\psi\rangle) \tag{5.14}$$

where $E(\tau, k)$ is a linear map on the main register defined by

$$E(\tau, k)(|\psi\rangle) := \mathrm{Tr}_c(\mathcal{E}(\tau, k)|\psi\rangle|\phi_0\rangle). \tag{5.15}$$

The above holds for every clock space in the sequence defined by $(N_p(\ell), N_q(\ell), \sigma(\ell))$. Taking the limit $\ell \to \infty$ of equation (5.14), we may pass the limits through the finite sums and scalar multiplications:

$$\lim_{\ell\to\infty} \mathrm{Tr}_c(V(\tau)|\psi\rangle|\phi_0\rangle) - \lim_{\ell\to\infty} \mathrm{Tr}_c(V_{2,m}(\tau)|\psi\rangle|\phi_0\rangle) = \sum_{j=1}^{m} a_j \lim_{\ell\to\infty} E(\tau, k_j)(|\psi\rangle) \tag{5.16}$$

provided that these limits exist. Indeed, by Theorem 4.3.6,

$$\lim_{\ell\to\infty} \mathrm{Tr}_c(V(\tau)|\psi\rangle|\phi_0\rangle) = U(\tau, 0)|\psi\rangle. \tag{5.17}$$

As for the MPF, taking $k$ steps of the Trotterization, we should find that

$$\lim_{\ell\to\infty} \mathrm{Tr}_c\left(V_2(\tau/k)^k\,|\psi\rangle|\phi_0\rangle\right) = U_2^{(k)}(\tau, 0)|\psi\rangle \tag{5.18}$$

though this must be shown. This shouldn't be too hard, as the idea is clear: perform a sequence of clock shifts, each followed by a 2nd order Trotter step on the main register. Passing the limit through the multiproduct sum,

$$\lim_{\ell\to\infty} \mathrm{Tr}_c(V_{2,m}(\tau, 0)|\psi\rangle|\phi_0\rangle) = U_{2,m}(\tau, 0)|\psi\rangle. \tag{5.19}$$

It remains to show that the limit $\lim_\ell E(\tau, k)$ exists and, moreover, lies in $O(\tau^{2m+1})$. This is where the main challenge lies. To show that the limit of a sequence whose terms are $O(\tau^{2m+1})$ is itself $O(\tau^{2m+1})$, we can show that the $(2m+1)$st derivative is bounded at $\tau = 0$. Unfortunately, in our current clock constructions the width $\sigma$ of the clock state shrinks to zero along the sequence, which means the derivatives grow as well. If a different clock construction can be provided in which the clock state has width $\sigma \in O(1)$, a bound can be placed, and thus the limit will be $O(\tau^{2m+1})$.

Ongoing work aims to fill in the gaps of the previous argument. However, the numerics of Section 5.7 strongly suggest that time dependent MPFs indeed work as expected. Moreover, the form of the time dependent MPF of Definition 2 can be obtained by a naive Trotterization of the continuous clock space, which is very suggestive that, beyond formal issues, the approach is reasonable. Thus, we proceed under the assumption that Conjecture 1 is true.
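Though the proof remains open, the expected order boost is easy to observe numerically. The following sketch (a toy two-term $H(t)$ and step numbers of our own choosing, in the spirit of the numerics of Section 5.7 but not taken from them) builds the time dependent MPF of Definition 2 with the midpoint base formula (5.8) and watches the single-interval error shrink like $\tau^{2m+1}$:

```python
# Time dependent MPF (Definition 2) with the midpoint base formula (5.8).
# Toy H(t) = cos(t) X + (1 + 0.5 sin(t)) Z; model and parameters are assumptions.
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
terms = [lambda t: np.cos(t) * X, lambda t: (1 + 0.5 * np.sin(t)) * Z]

def U2(t, tau):                      # symmetric midpoint formula, eq. (5.8)
    half = [expm(-1j * h(t + tau / 2) * tau / 2) for h in terms]
    return np.linalg.multi_dot(half[::-1] + half)

def U2k(t0, t1, k):                  # k equal substeps, as in Definition 2
    out = np.eye(2, dtype=complex)
    for l in range(k):
        out = U2(t0 + (t1 - t0) * l / k, (t1 - t0) / k) @ out
    return out

def exact(t0, t1):                   # reference propagator via an ODE solver
    H = lambda t: sum(h(t) for h in terms)
    rhs = lambda t, u: (-1j * H(t) @ u.reshape(2, 2)).ravel()
    sol = solve_ivp(rhs, (t0, t1), np.eye(2, dtype=complex).ravel(),
                    rtol=1e-13, atol=1e-13)
    return sol.y[:, -1].reshape(2, 2)

ks = [1, 2, 3]                       # m = 3, so we expect order 2m = 6
V = np.vander(np.asarray(ks, float) ** -2.0, 3, increasing=True).T
a = np.linalg.solve(V, np.array([1.0, 0.0, 0.0]))   # eq. (5.2)

for tau in [0.4, 0.2, 0.1]:
    mpf = sum(aj * U2k(0.0, tau, kj) for aj, kj in zip(a, ks))
    print(tau, np.linalg.norm(exact(0.0, tau) - mpf, 2))
# Halving tau should cut the error by roughly 2^7, consistent with order 6.
```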
5.3 Time Dependent Multiproduct Simulation

Having argued that good time dependent MPFs exist, we now propose a Hamiltonian simulation algorithm based on these formulas. We provide some accompanying discussion to explain our choices, and at the end we state the approach more directly.

To present a concrete computational model for our Hamiltonian, we further specify that our LCH Hamiltonian $H(t)$ is of the form

$$H(t) = \sum_{i=1}^{L} \alpha_i(t)\, H_i \tag{5.20}$$

where each $\alpha_i(t) \in \mathbb{R}$ is assumed to be $2m+1$ times differentiable for an $m$-term MPF. Without loss of generality we take $\|H_i\| \le 1$.

From the onset, there are a couple of choices to make. The MPFs could, in principle, approximate the evolution over the entire interval $[0, T]$, provided the Trotter numbers $k_i$ are sufficiently large. However, this has several disadvantages. First, there is no flexibility to treat some subintervals of $[0, T]$ as more difficult than others and to allocate resources appropriately. Second, the well-conditioned scheme of [90] would have to be abandoned or modified to accommodate larger $\vec{k}$. Instead, we divide $[0, T]$ into a mesh of $r$ subintervals, not necessarily uniform, but constructed to account for the more difficult parts of the simulation. We provide a greedy algorithm for constructing such a mesh in Section 5.9 (a rough sketch follows the list below). The algorithm requires a computable $\Lambda_{2m+1}$-bound to work (see Definition 3), though a practitioner might prefer a more heuristic approach to constructing the time mesh. For the moment, we simply say that, given $t_i$, the next time point $t_{i+1}$ is incremented roughly as $1/\Lambda_{2m+1}(t)$ for $t$ in a neighborhood of $t_i$, where $\Lambda_{2m+1}$ is a positive real-valued function of $H$ and its derivatives that grows for larger or faster-fluctuating $H$.

Once the mesh points $t_0, t_1, \ldots, t_r$ are determined, a time dependent MPF is performed over each subinterval $[t_i, t_{i+1}]$ in sequence. We assume the MPF is implemented using the LCU technique. The base midpoint formula $U_2$ must be implemented by some scheme that depends on the structure of $H(t)$, though the approximating unitary $W_2$ should be at least 2nd order and should preserve the time-reversal symmetry of $U_2$ (and $U$). It is known that product formulas exhibit commutator scaling, meaning that, in the limit where all the $H_i$ commute pairwise and all the $\alpha_i$ are constant functions, the simulation error goes to zero. Hence, the MPF inherits this desirable property; it is precisely for this reason that Table 5.1 credits our MPFs with commutator scaling.

Let us now state our procedure for multiproduct simulation. Given the fundamental parameters $[0, T]$ and $\epsilon$, and a description of $H(t)$:

1. Compute a $\Lambda_{2m+1}$-bound (Definition 3) for some $m$ larger than the expected number of MPF terms. This is, more or less, a bound on the generalized "size" of $H(t)$.

2. Construct a time mesh of $r$ steps using the algorithm of Section 5.9.

3. Perform a sequence of MPFs over the time slices, with a 2nd order base formula $W_2$ approximating the midpoint formula.
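As promised, here is a rough sketch of the greedy mesh rule of step 2. The full algorithm appears in Section 5.9; the simple stepping rule $t_{i+1} \approx t_i + c/\Lambda_{2m+1}(t_i)$ and the example bound below are simplifying assumptions of ours:

```python
# Greedy time mesh: step sizes shrink where the Lambda bound is large.
# The stepping rule and the example Lambda are illustrative assumptions.
import numpy as np

def greedy_mesh(Lam, T, c):
    """Lam: a computable Lambda_{2m+1}-bound; c: step-size control."""
    ts = [0.0]
    while ts[-1] < T:
        ts.append(min(T, ts[-1] + c / Lam(ts[-1])))
    return np.array(ts)

# A coefficient that fluctuates rapidly near t = 0.5 forces a finer mesh there.
Lam = lambda t: 1.0 + 5.0 * np.exp(-((t - 0.5) / 0.1) ** 2)
mesh = greedy_mesh(Lam, T=1.0, c=0.05)
print(len(mesh) - 1, "subintervals")
print(np.round(np.diff(mesh), 3))      # step widths dip near t = 0.5
```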
We introduce a useful definition to quantify errors succinctly. It is well understood that MPFs, like regular product formulas, have smoothness requirements to ensure convergence. To quantify the errors and costs of MPFs, we provide a metric which captures the "size" of 𝐻 and its derivatives at each point in time, in order to characterize the difficulty of simulation.

Definition 3. Let 𝐻(𝑑) = βˆ‘_{𝑖=1}^{𝐿} 𝛼𝑖(𝑑)𝐻𝑖 be a time dependent, finite-dimensional Hamiltonian with 𝐻𝑖 Hermitian and 𝛼𝑖(𝑑) ∈ R having 𝑛 ∈ N βˆͺ {∞} continuous derivatives. For each 𝑖, define a Λ𝑖,𝑛-bound ("Lambda i n bound") as any continuous function Λ𝑖,𝑛 : [0, 𝑇] β†’ R+ satisfying the following bound with respect to 𝐻 and its derivatives:

Λ𝑖,𝑛(𝑑) β‰₯ sup_{π‘—βˆˆ[𝑛]} |𝛼𝑖^{(𝑗)}(𝑑)|^{1/(𝑗+1)} for all 𝑑 ∈ [0, 𝑇],

where 𝑓^{(𝑛)} represents an 𝑛th derivative of 𝑓, and [𝑛] := {𝑗 ∈ N | 𝑗 ≀ 𝑛}. Assuming such bounds exist for all 𝑖 = 1, . . . , 𝐿, we say that 𝐻(𝑑) is Λ𝑛-bounded. We further say that 𝐻(𝑑) is Λ𝑛-boundable if it admits some Λ𝑛-bound. For convenience, we define Λ𝑖 ≑ Λ𝑖,∞. We also define a Λ𝑛-bound as any continuous function on [0, 𝑇] satisfying

Λ𝑛(𝑑) β‰₯ max_{π‘–βˆˆ[𝐿]} Λ𝑖,𝑛(𝑑).

For near-constant 𝛼𝑖(𝑑), Λ𝑖,𝑛 is simply an upper bound on |𝛼𝑖|, while for rapid oscillations the derivative terms will dominate. Observe that for finite 𝑛, our assumptions imply that Λ𝑖,𝑛(𝑑) exists (𝐻 is Λ𝑖,𝑛-boundable), since |𝛼𝑖^{(𝑗)}| is continuous on a compact interval and hence a bounded function. Also in the finite case, the supremum may be replaced with a simple max, and Λ𝑖,𝑛(𝑑) may be taken as equal to the right hand side, because the maximum of a finite set of continuous functions is continuous. For this "minimal choice," Λ𝑖,𝑛(𝑑) is a nondecreasing sequence in 𝑛. For each 𝑛, there also exists a Λ𝑖,𝑛 that is constant in 𝑑. Allowing Λ𝑖,𝑛 to vary in time, however, takes into consideration the possibility that the expense of simulating 𝐻 will vary with time. We note that Λ𝑖,𝑛-bounds are additive in the sense that, for 𝐻(𝑑) and 𝐺(𝑑) admitting Λ𝐻𝑖,𝑛- and Λ𝐺𝑖,𝑛-bounds, respectively, Λ𝐻𝑖,𝑛 + Λ𝐺𝑖,𝑛 is a Λ𝑖,𝑛-bound on 𝐻 + 𝐺.

In contrast to finite 𝑛, the existence of a Λ𝑖,∞-bound is not guaranteed, and amounts to the assumption that the derivatives of 𝐻 grow at most exponentially for asymptotically large 𝑗 and fixed 𝑑. There are smooth, even analytic functions which do not satisfy this, many of which are physically interesting. A simple example is a Gaussian pulse

𝛼(𝑑) = 𝑒^{βˆ’π‘‘Β²}, (5.22)

whose derivatives, which generate the Hermite polynomials, grow factorially in the order of differentiation at 𝑑 = 0. Other interesting cases, such as harmonic oscillations or exponential growth and decay, do admit a Ξ›-bound. Despite these restrictions, we adopt this approach for simplicity and in order to facilitate comparison with prior work on general-order Suzuki-Trotter formulas [137]. Admittedly, a modification of Definition 3 to be an upper bound on

max_𝑗 𝑗^{βˆ’1} |𝛼𝑖^{(𝑗)}(𝑑)|^{1/(𝑗+1)} (5.23)

would expand the class of functions admitting Ξ›-bounds to analytic functions (though not generic smooth functions).

We now begin the error analysis of (5.21) in earnest. From a triangle inequality, the total error can be bounded by the sum of the errors in each step:

βˆ₯π‘ˆ(𝑇, 0) βˆ’ Λœπ‘ˆ(𝑇, 0)βˆ₯ ≀ βˆ‘_{𝑖=1}^{π‘Ÿ} βˆ₯π‘ˆ(𝑑𝑖, π‘‘π‘–βˆ’1) βˆ’ π‘ˆ2,π‘š(𝑑𝑖, π‘‘π‘–βˆ’1)βˆ₯. (5.24)

Therefore, to ensure an error at most πœ–, it suffices that each subinterval has error at most πœ–/π‘Ÿ. We thus focus on a single subinterval.
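The "minimal choice" in Definition 3 is straightforward to evaluate numerically. The sketch below (using SymPy for the derivatives; the harmonic coefficient is an illustrative assumption) computes Λ𝑖,𝑛(𝑑) = max_{𝑗≀𝑛} |𝛼𝑖^{(𝑗)}(𝑑)|^{1/(𝑗+1)} and shows that for 𝛼(𝑑) = cos(πœ”π‘‘), whose 𝑗th derivative has magnitude at most πœ”^𝑗, the bound approaches πœ” from below.

import sympy as sp

def minimal_lambda_bound(alpha_expr, t_sym, n, t_val):
    """'Minimal choice' Lambda_{i,n}(t): the max over j <= n of the
    (j+1)-th root of |alpha^{(j)}(t)| (Definition 3)."""
    vals = []
    deriv = alpha_expr
    for j in range(n + 1):
        vals.append(abs(float(deriv.subs(t_sym, t_val))) ** (1.0 / (j + 1)))
        deriv = sp.diff(deriv, t_sym)
    return max(vals)

t = sp.symbols('t')
omega = 4.0  # illustrative frequency (assumption)
# For alpha(t) = cos(omega*t) the bound saturates near omega, consistent
# with harmonic oscillations admitting a Lambda-bound.
print(minimal_lambda_bound(sp.cos(omega * t), t, n=7, t_val=0.3))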
An upper bound on this error is supplied by the following theorem, which is the main technical result of this section.

Theorem 5.4.1. Let 𝐻 : [𝑑0, 𝑑1] β†’ Herm(H) be a time dependent Hamiltonian on finite-dimensional H with 2π‘š + 1 continuous derivatives on [𝑑0, 𝑑1] and a Ξ›2π‘š+1-bound. Suppose further that

𝑒𝐿 max_{𝜏∈[𝑑0,𝑑1]} Ξ›2π‘š+1(𝜏) (𝑑1 βˆ’ 𝑑0) < 1.

Then for any π‘š ∈ Z+ there exist π‘˜βƒ— ∈ Z^π‘š_+ and π‘Ž ∈ R^π‘š such that

βˆ₯π‘ˆ(𝑑1, 𝑑0) βˆ’ π‘ˆ2,π‘š(𝑑1, 𝑑0)βˆ₯ < (βˆ₯π‘Žβˆ₯1/√(πœ‹π‘š)) (5𝐿 max_{𝜏∈[𝑑0,𝑑1]} Ξ›2π‘š+1(𝜏) (𝑑1 βˆ’ 𝑑0))^{2π‘š+1}

and βˆ₯π‘Žβˆ₯1 ∈ 𝑂(log(π‘š)).

Observe that convergence of the above error bound to zero as π‘š β†’ ∞ is conditioned on sufficiently small 𝑑1 βˆ’ 𝑑0. This is perhaps unsurprising, as the Suzuki-Trotter formulas also do not provide an unconditionally converging sequence of approximations to the time evolution operator. Note as well the parallel roles between π‘š and the Suzuki-Trotter order π‘˜ in reducing the error. In our case, however, we shall see that the simulation cost increases only polynomially in π‘š, whereas for product formulas the cost is necessarily exponential in π‘˜. The term βˆ₯π‘Žβˆ₯1/√(πœ‹π‘š) is π‘œ(1) for large π‘š and can be more or less ignored. Unfortunately, the combination entering the bound scales as the worst 𝛼𝑖 times the number of terms 𝐿, which is likely too pessimistic. However, improving on this may greatly complicate the proof of the error bound.

Theorem 5.4.1 will be the important result that informs the algorithmic choices and complexity analysis of subsequent sections. Having characterized the error on a single subinterval of [0, 𝑇], the full error over π‘Ÿ subintervals may be found simply using (5.24).

We prove Theorem 5.4.1 using a strategy similar to that used to provide error estimates for the Suzuki-Trotter formulas [14, 137, 26]. As 𝐻 is continuously differentiable at least 2π‘š + 1 times, π‘ˆ2,π‘š is a valid extrapolant under our assumptions, and cancels the first π‘š terms in the error series. We can thus express the difference π‘ˆ2,π‘š βˆ’ π‘ˆ using the integral Taylor remainder formulas:

π‘ˆ2,π‘š(𝑑, 𝑑0) βˆ’ π‘ˆ(𝑑, 𝑑0) = 𝑅2π‘š βˆ’ R2π‘š (5.25)

with

R2π‘š := (1/(2π‘š)!) ∫_{𝑑0}^{𝑑} (𝑑 βˆ’ 𝜏)^{2π‘š} π‘ˆ^{(2π‘š+1)}(𝜏, 𝑑0) π‘‘πœ, (5.26)

𝑅2π‘š := (1/(2π‘š)!) ∫_{𝑑0}^{𝑑} (𝑑 βˆ’ 𝜏)^{2π‘š} π‘ˆ2,π‘š^{(2π‘š+1)}(𝜏, 𝑑0) π‘‘πœ, (5.27)

where π‘ˆ^{(𝑛)} refers to derivatives in the first argument. By the triangle inequality,

βˆ₯π‘ˆ2,π‘š(𝑑, 𝑑0) βˆ’ π‘ˆ(𝑑, 𝑑0)βˆ₯ ≀ βˆ₯R2π‘šβˆ₯ + βˆ₯𝑅2π‘šβˆ₯, (5.28)

and we upper bound each remainder in separate lemmas. The easier bound is R2π‘š, so we begin with the corresponding lemma.

Lemma 5.4.2. The remainder term R2π‘š in equation (5.26) satisfies

βˆ₯R2π‘šβˆ₯ < (1/(2√(πœ‹π‘š))) (2𝐿 max_{𝜏∈[𝑑0,𝑑]} Ξ›2π‘š+1(𝜏) (𝑑 βˆ’ 𝑑0))^{2π‘š+1}.

Proof. Recall that π‘ˆ, as the exact propagator, satisfies the SchrΓΆdinger equation (2.16). Higher derivatives can easily be found through repeated application of the product rule. The result will be a polynomial in the derivatives of 𝐻 times π‘ˆ itself. Under the spectral norm, using the triangle and submultiplicative properties, the ordering of terms doesn't matter, and the bound is therefore equivalent to the expression one gets taking derivatives of a scalar exponential. Noting that βˆ₯π‘ˆβˆ₯ = 1, the resulting polynomial is the complete exponential Bell polynomial from FaΓ  di Bruno's formula (see Section 2.8).
Letting 𝑛 = 2π‘š + 1, we have βˆ₯πœ•π‘› 𝑑 π‘ˆ (𝑑, 𝑑0)βˆ₯ ≀ π‘Œπ‘› (cid:16) βˆ₯𝐻 (𝑑) βˆ₯, βˆ₯ (cid:164)𝐻 (𝑑) βˆ₯, . . . , βˆ₯𝐻 (π‘›βˆ’1) (𝑑) βˆ₯ (cid:17) . (5.29) From the definition of Λ𝑖,𝑛, we have βˆ₯𝐻 ( 𝑗) (𝑑) βˆ₯ ≀ ≀ 𝐿 βˆ‘οΈ 𝑖=1 βˆ‘οΈ 𝑖 |𝛼( 𝑗) 𝑖 (𝑑)| Λ𝑖,𝑛 (𝑑) 𝑗+1 ≀ (𝐿Λ𝑛 (𝑑)) 𝑗+1 and since the Bell polynomials π‘Œπ‘› are monotonic in each argument, (cid:16) π‘Œπ‘› βˆ₯𝐻 βˆ₯, βˆ₯ (cid:164)𝐻 βˆ₯, . . . , βˆ₯𝐻 (π‘›βˆ’1) βˆ₯ (cid:17) ≀ π‘Œπ‘› (𝐿Λ𝑛 (𝑑), (𝐿Λ𝑛 (𝑑))2, . . . , (𝐿Λ𝑛 (𝑑))𝑛) where 𝑏𝑛 are the Bell numbers (Section 2.8). Thus, = (𝐿Λ𝑛 (𝑑))𝑛𝑏𝑛 βˆ₯πœ•π‘› 𝑑 π‘ˆ (𝑑, 𝑑0) βˆ₯ ≀ (𝐿Λ𝑛 (𝑑))𝑛𝑏𝑛. 116 (5.30) (5.31) (5.32) Finally, returning to the bound on R2π‘š, we have from the integral triangle inequality that βˆ₯R2π‘š βˆ₯ ≀ ≀ 1 (2π‘š)! 1 (2π‘š)! ∫ 𝑑 𝑑0 ∫ 𝑑 𝑑0 (𝑑 βˆ’ 𝜏)2π‘š βˆ₯πœ•2π‘š+1 𝜏 π‘ˆ (𝜏, 𝑑0) βˆ₯π‘‘πœ (𝑑 βˆ’ 𝜏)2π‘š (𝐿Λ2π‘š+1(𝜏))2π‘š+1𝑏2π‘š+1π‘‘πœ (5.33) where we made use of equation (5.32). This, in turn, can be bounded by maximizing Ξ›2π‘š+1 over [𝑑0, 𝑑]. βˆ₯R2π‘š βˆ₯ ≀ 𝑏2π‘š+1 (2π‘š)! Ξ›2π‘š+1(𝜏))2π‘š+1 ∫ 𝑑 𝑑0 π‘‘πœ(𝑑 βˆ’ 𝜏)2π‘š (𝐿 max 𝜏∈[𝑑0,𝑑] (cid:18) ≀ 𝑏2π‘š+1 (2π‘š + 1)! 𝐿 max 𝜏∈[𝑑0,𝑑] Ξ›2π‘š+1(𝜏) (𝑑 βˆ’ 𝑑0) (cid:19) 2π‘š+1 (5.34) Finally, we upper bound the prefactor using a Stirling bound and bounds from [13] on the Bell numbers. For all π‘š ∈ Z+, 𝑏2π‘š+1 (2π‘š + 1)! < (cid:17) 2π‘š+1 (cid:16) 0.792(2π‘š+1) log(2π‘š+2) √︁2πœ‹(2π‘š + 1) (cid:17) 2π‘š+1 (cid:16) 2π‘š+1 𝑒 = 1 √︁2πœ‹(2π‘š + 1) (cid:18) .792𝑒 log(2π‘š + 2) (cid:19) 2π‘š+1 . Plugging this into equation (5.34), βˆ₯R2π‘š βˆ₯ < < 1 √︁2πœ‹(2π‘š + 1) (cid:18) 1 √ πœ‹π‘š 2 2𝐿 max 𝜏∈[𝑑0,𝑑] (cid:18) 0.792𝑒 log(2π‘š + 2) 𝐿 max 𝜏∈[𝑑0,𝑑] Ξ›2π‘š+1(𝜏) (𝑑 βˆ’ 𝑑0) (cid:19) 2π‘š+1 Ξ›2π‘š+1(𝜏) (𝑑 βˆ’ 𝑑0) (cid:19) 2π‘š+1 . The last line is the result of the lemma. We now state the bound on the Taylor 𝑅2π‘š for the time dependent MPF. Lemma 5.4.3. In the notation above, suppose that 𝑒𝐿 max 𝜏∈[𝑑0,𝑑1] Ξ›2π‘š+1(𝜏) (𝑑1 βˆ’ 𝑑0) < 1. Then the remainder term 𝑅2π‘š in equation (5.27) satisfies βˆ₯𝑅2π‘š βˆ₯ < (cid:18) βˆ₯π‘Žβˆ₯1 √ πœ‹π‘š 2 5𝐿 max 𝜏∈[𝑑0,𝑑] Ξ›2π‘š+1(𝜏) (𝑑 βˆ’ 𝑑0) (cid:19) 2π‘š+1 . 117 (5.35) (5.36) β–‘ The proof is more technical than the previous bound, and is given at the end of this section. First, we quickly prove Theorem 5.4.1 assuming the truth of the above Taylor remainder lemmas. Proof of Theorem 5.4.1. First, we note that βˆ₯π‘Žβˆ₯1 β‰₯ 1, since π‘Ž necessarily satisfies (cid:205) 𝑗 π‘Ž 𝑗 = 1 from the Vandermonde constraints (5.2). From equation (5.28), the error βˆ₯π‘ˆ (𝑑, 𝑑0) βˆ’ π‘ˆ2,π‘š (𝑑, 𝑑0)βˆ₯ is bounded by the sum of the remainder upper bounds derived in Lemmas 5.4.3 and 5.4.2. Comparing the two, we see that 𝑅2π‘š dominates R2π‘š for all π‘š β‰₯ 1. We thus take twice the larger as an upper bound βˆ₯π‘ˆ (𝑑, 𝑑0) βˆ’ π‘ˆ2,π‘š (𝑑, 𝑑0)βˆ₯ < 2βˆ₯𝑅2π‘š βˆ₯ (cid:18) < βˆ₯π‘Žβˆ₯1√ πœ‹π‘š 5𝐿 max 𝜏∈[𝑑0,𝑑] This completes the proof. Ξ›2π‘š+1(𝜏) (𝑑 βˆ’ 𝑑0) (cid:19) 2π‘š+1 . (5.37) β–‘ To prove Lemma 5.4.3, we will first need a technical lemma that bounds the size of ordinary exponentials of time dependent matrices. Lemma 5.4.4. Let 𝐴(𝑑) be an anti-Hermitian valued function of 𝑑 ∈ R with 𝑛 bounded derivatives. Then βˆ₯𝑑𝑛 𝑑 𝑒 𝐴(𝑑) βˆ₯ ≀ π‘Œπ‘› (cid:16) βˆ₯𝑑𝑑 𝐴(𝑑) βˆ₯, βˆ₯𝑑2 𝑑 𝐴(𝑑) βˆ₯, . . . , βˆ₯𝑑𝑛 𝑑 𝐴(𝑑) βˆ₯ (cid:17) where π‘Œπ‘› is the complete exponential Bell polynomial. 
In the scalar case, FaΓ  di Bruno’s bound is an exact expression (see Section 2.8), so the content of our result is that a corresponding bound holds even in the non-scalar case. The exponential disappears because 𝑒 𝐴(𝑑) is unitary. Proof of Lemma 5.4.4. From the Trotter product theorem, we have 𝑑 exp( 𝐴(𝑑)) = πœ•π‘› πœ•π‘› 𝑑 (exp( 𝐴(𝑑)/π‘Ÿ))π‘Ÿ . lim π‘Ÿβ†’βˆž (5.38) Using the fact that the series converges uniformly, we may interchange the order of differentiation and the limit. This leads to βˆ₯πœ•π‘› 𝑑 exp( 𝐴(𝑑))βˆ₯ ≀ lim π‘Ÿβ†’βˆž (cid:18) βˆ‘οΈ 𝑆 𝑛 𝑠1, . . . , π‘ π‘Ÿ (cid:19) π‘Ÿ (cid:214) π‘ž=1 (cid:13)πœ•π‘ π‘ž (cid:13) 𝑑 exp( 𝐴(𝑑)/π‘Ÿ)(cid:13) (cid:13) . (5.39) 118 Here the sum over 𝑆 is constrained such that 𝑠 𝑗 β‰₯ 0 and 𝑠1 + Β· Β· Β· + π‘ π‘Ÿ = 𝑛. Then using Taylor’s theorem we have (cid:13)πœ•π‘ π‘ž (cid:13) 𝑑 exp( 𝐴(𝑑)/π‘Ÿ)(cid:13) (cid:13) ≀ βˆ₯ 𝐴(π‘ π‘ž) (𝑑) βˆ₯ π‘Ÿ + 𝑂 (1/π‘Ÿ 2). (5.40) for π‘ π‘ž > 0, where the 𝑂 (1/π‘Ÿ 2) terms will vanish as π‘Ÿ β†’ ∞. The π‘ π‘ž = 0 case has upper bound 1 by unitarity. Hence, put together, βˆ₯πœ•π‘› 𝑑 exp( 𝐴(𝑑))βˆ₯ ≀ lim π‘Ÿβ†’βˆž (cid:18) βˆ‘οΈ 𝑛 𝑠1, . . . , π‘ π‘Ÿ (cid:19) π‘Ÿ (cid:214) (cid:32) βˆ₯ 𝐴(π‘ π‘ž) (𝑑) βˆ₯(1 βˆ’ π›Ώπ‘ π‘ž,0) π‘Ÿ (cid:33) + π›Ώπ‘ π‘ž,0 . (5.41) π‘ž=1 Now let us define a scalar function π‘Ž(π‘₯) defined for π‘₯ in a neighborhood of 𝑑 such that, for any 𝑆 π‘˜ such that 0 ≀ π‘˜ ≀ 𝑛, π‘Ž (π‘˜) (𝑑) = βˆ₯ 𝐴(π‘˜) (𝑑) βˆ₯(1 βˆ’ π›Ώπ‘˜,0) (5.42) for a particular π‘₯ = 𝑑. Such a function can be seen to exist by considering the 𝑛th degree Taylor polynomial. We may apply the standard FaΓ  di Bruno formula (2.49) to π‘Ž, so that π‘₯ π‘’π‘Ž(π‘₯) πœ•π‘› (cid:12) (cid:12) (cid:12) (cid:12)π‘₯=𝑑 = π‘’π‘Ž(𝑑)π‘Œπ‘› (βˆ₯ 𝐴(1) (𝑑)βˆ₯, . . . , βˆ₯ 𝐴(𝑛) (𝑑) βˆ₯) = π‘Œπ‘› (βˆ₯ 𝐴(1) (𝑑) βˆ₯, . . . , βˆ₯ 𝐴(𝑛) (𝑑) βˆ₯). (5.43) On the other hand we can split π‘Ž(𝑑) into π‘Ÿ steps and compute the 𝑛th derivative, just as for the Trotter product theorem. π‘₯ π‘’π‘Ž(π‘₯) πœ•π‘› (cid:12) (cid:12) (cid:12) (cid:12)π‘₯=𝑑 = lim π‘Ÿβ†’βˆž (cid:18) βˆ‘οΈ 𝑆 𝑛 𝑠1, . . . , π‘ π‘Ÿ (cid:19) π‘Ÿ (cid:214) π‘ž=1 (cid:32) βˆ₯ 𝐴(π‘ π‘ž) (Δ𝑑) βˆ₯(1 βˆ’ π›Ώπ‘ π‘ž,0) π‘Ÿ (cid:33) + π›Ώπ‘ π‘ž,0 By comparing expressions (5.41) and (5.44), we see that βˆ₯πœ•π‘› 𝑑 exp 𝐴(𝑑) βˆ₯ ≀ πœ•π‘› π‘₯ π‘’π‘Ž(π‘₯) (cid:12) (cid:12) (cid:12) (cid:12)π‘₯=𝑑 and applying (5.43), we reach our desired bound FaΓ  di Bruno bound. βˆ₯πœ•π‘› 𝑑 exp( 𝐴(𝑑))βˆ₯ ≀ π‘Œπ‘› (βˆ₯ 𝐴(1) (𝑑) βˆ₯, . . . , βˆ₯ 𝐴(𝑛) (𝑑) βˆ₯) (5.44) (5.45) (5.46) We evaluate the derivatives of 𝐴(𝑑), and express them in terms of the derivatives of the Hamiltonian, 𝐻 ( 𝑗) (for simplicity, we leave off the evaluation point. The derivative is with respect to the Hamiltonian’s single argument). The result is πœ• 𝑗 𝑑 𝐴(𝑑) = βˆ’π‘– π‘˜ (cid:34)(cid:18) π‘ž βˆ’ 1/2 π‘˜ (cid:19) 𝑗 (𝑑 βˆ’ 𝑑0)𝐻 ( 𝑗) + 𝑗 (cid:18) π‘ž βˆ’ 1/2 π‘˜ (cid:19) π‘—βˆ’1 (cid:35) . 𝐻 ( π‘—βˆ’1) (5.47) 119 Employing the Λ𝑛-bound from Definition 3, we have that βˆ₯πœ• 𝑗 𝑑 𝐴(𝑑)βˆ₯ ≀ (cid:34) (cid:18) π‘ž βˆ’ 1/2 π‘˜ 1 π‘˜ (cid:19) 𝑗 (𝑑 βˆ’ 𝑑0)Ξ› 𝑗+1 𝑛,π‘ž + 𝑗 (cid:19) π‘—βˆ’1 (cid:18) π‘ž βˆ’ 1/2 π‘˜ (cid:35) Λ𝑛,π‘žπ‘ž 𝑗 (cid:19) 𝑗 (cid:18) π‘ž βˆ’ 1/2 π‘˜ = 𝑗 𝑛,π‘ž Ξ› (cid:20) 𝑗 π‘ž βˆ’ 1/2 + 1 π‘˜ (𝑑 βˆ’ 𝑑0)Λ𝑛,π‘ž (cid:21) . 
Here, Λ𝑛,π‘ž := max πœβˆˆπΌπ‘ž Λ𝑛 (𝜏) (5.48) (5.49) and πΌπ‘ž = [𝑑0 + (π‘ž βˆ’ 1)(𝑑 βˆ’ 𝑑0)/π‘˜, 𝑑0 + π‘ž(𝑑 βˆ’ 𝑑0)/π‘˜] is the π‘žth interval in the mesh from 𝑑0 to 𝑑 with π‘˜ even spaces. Since Λ𝑛,π‘ž ≀ max𝜏∈[𝑑0,𝑑] Λ𝑛 (𝜏), from the assumptions of the lemma, Λ𝑛,π‘ž (𝑑 βˆ’ 𝑑0) < 1. Hence, where ΛœΞ›π‘›,π‘ž ≑ Λ𝑛,π‘ž (π‘ž βˆ’ 1/2)/π‘˜. βˆ₯πœ• 𝑗 𝑑 𝐴(𝑑) βˆ₯ ≀ ΛœΞ› 𝑗 𝑛,π‘ž (cid:20) 𝑗 π‘ž βˆ’ 1/2 (cid:21) 1 π‘˜ + (5.50) Plugging this into the formula into (5.46) and using the definition of π‘Œπ‘› given by (2.50), our bound becomes βˆ₯πœ•π‘› 𝑑 π‘ˆ2(𝑑)βˆ₯ ≀ βˆ‘οΈ 𝐢 𝑛! 𝑐1! . . . 𝑐𝑛! 𝑛 (cid:214) (cid:32) ( 𝑗=1 π‘˜ ) ΛœΞ› 𝑗 π‘žβˆ’1/2 + 1 𝑗! (cid:33) 𝑐 𝑗 𝑗 𝑛,π‘ž . Using the sum property of the coefficients 𝑐 𝑗 , we can move the ΛœΞ› 𝑗 𝑛,π‘ž out of the sum. βˆ₯πœ•π‘› 𝑑 π‘ˆ2(𝑑)βˆ₯ ≀ (cid:18) Λ𝑛,π‘ž π‘ž βˆ’ 1/2 π‘˜ (cid:19) 𝑛 βˆ‘οΈ 𝐢 𝑛! 𝑐1! . . . 𝑐𝑛! (cid:32) 𝑛 (cid:214) 𝑗=1 (cid:18) = Λ𝑛,π‘ž (cid:19) 𝑛 π‘ž βˆ’ 1/2 π‘˜ π‘Œπ‘› (cid:17) (cid:16) (cid:174)π‘₯ (𝑛) π‘ž,π‘˜ (cid:33) 𝑐 𝑗 𝑗 π‘žβˆ’1/2 + 1 𝑗! π‘˜ (5.51) (5.52) In the last line, we reapplied the definition of π‘Œπ‘› and of the vectors (cid:174)π‘₯ (𝑛) π‘ž,π‘˜ . This completes our bound for the π‘ˆ2 formula for the π‘žth segment of mesh defined by π‘˜ 𝑗 . β–‘ We conclude this section with a proof of the bound on 𝑅2π‘š. Proof of Lemma 5.4.3. Without loss of generality, we take 𝑑0 = 0. The relevant expressions are π‘ˆ2,π‘š (𝑑, 0) = π‘Ž π‘—π‘ˆ (π‘˜ 𝑗 ) 2 (𝑑, 0) π‘š βˆ‘οΈ 𝑗=1 120 (5.53) and π‘ˆ (π‘˜) 2 (𝑑, 0) := π‘˜ (cid:214) β„“=1 π‘ˆ2(𝑑ℓ, π‘‘β„“βˆ’1) with 𝑑ℓ := 𝑑ℓ/π‘˜. The Taylor remainder in integral form is given by 𝑅2π‘š = = 1 (2π‘š)! 1 (2π‘š)! ∫ 𝑑 0 π‘š βˆ‘οΈ 𝑗=1 (𝑑 βˆ’ 𝜏)2π‘š 𝑑2π‘š+1 π‘‘πœ2π‘š+1 (𝑑 βˆ’ 𝜏)2π‘š 𝑑2π‘š+1 π‘‘πœ2π‘š+1 ∫ 𝑑 π‘Ž 𝑗 0 π‘ˆ2,π‘š (𝜏, 0)π‘‘πœ π‘ˆ (π‘˜ 𝑗 ) 2 (𝜏, 0)π‘‘πœ. With a couple triangle inequalities, this is upper bounded as βˆ₯𝑅2π‘š βˆ₯ ≀ ≀ π‘š βˆ‘οΈ 1 (2π‘š)! βˆ₯π‘Žβˆ₯1 (2π‘š + 1)! 𝑗=1 |π‘Ž 𝑗 | 𝑑2π‘š+1 2π‘š + 1 max 𝜏∈[0,𝑑] βˆ₯𝑑2π‘š+1 𝜏 π‘ˆ (π‘˜ 𝑗 ) 2 (𝜏, 0) βˆ₯ 𝑑2π‘š+1 max 𝑗,𝜏 βˆ₯𝑑2π‘š+1 𝜏 π‘ˆ (π‘˜ 𝑗 ) 2 (𝜏, 0) βˆ₯ (5.54) (5.55) (5.56) where in the last line we employed a HΓΆlder inequality. Our focus is now on bounding the derivative, which we unravel layer by layer using frequent multinomial expansions. First, πœπ‘ˆ (π‘˜) 𝑑𝑛 2 (𝜏, 0) = (cid:18) βˆ‘οΈ 𝑁 𝑛 𝑛1, . . . , π‘›π‘˜ (cid:19) π‘˜ (cid:214) β„“=1 𝑑𝑛ℓ 𝜏 π‘ˆ2(πœβ„“, πœβ„“βˆ’1). (5.57) Next, we write where π‘ˆ2(πœβ„“, πœβ„“βˆ’1) = = 1 (cid:214) 𝑖=𝐿 2𝐿 (cid:214) 𝑖=1 π‘’βˆ’π‘–π»π‘–π›Όπ‘– (πœβ„“ βˆ’1/2)𝜏/π‘˜ 𝐿 (cid:214) 𝑖=1 π‘’βˆ’π‘–π»π‘–π›Όπ‘– (πœβ„“ βˆ’1/2)𝜏/π‘˜ 𝑒 𝐴𝑖,β„“ 𝐴𝑖,β„“ := βˆ’π‘–π»π‘–π›Όπ‘– (πœβ„“βˆ’1/2)𝜏/π‘˜ and 𝑖 is defined by reflection for 𝑖 > 𝐿. Once again performing a multinomial expansion, 𝑑𝑛 πœπ‘ˆ2(πœβ„“, πœβ„“βˆ’1) = (cid:18) βˆ‘οΈ 𝑁 𝑛 𝑛1, . . . , 𝑛2𝐿 (cid:19) 2𝐿 (cid:214) 𝑖=1 𝜏 𝑒 𝐴𝑖,β„“ . 𝑑𝑛𝑖 We now bound the individual ordinary operator exponentials. Invoking Lemma 5.4.4, βˆ₯𝑑𝑛 𝜏 𝑒 𝐴𝑖,β„“ βˆ₯ ≀ π‘Œπ‘› (cid:0)βˆ₯π‘‘πœ 𝐴𝑖,β„“ βˆ₯, . . . , βˆ₯𝑑𝑛 𝜏 𝐴𝑖,β„“ βˆ₯(cid:1) . 
121 (5.58) (5.59) (5.60) (5.61) In turn, we have 𝑑𝑛 𝜏 𝐴𝑖,β„“ = βˆ’π‘– = βˆ’π‘– 𝐻𝑖 π‘˜ 𝐻𝑖 π‘˜ 𝑑𝑛 𝜏 (𝛼𝑖 (πœβ„“βˆ’1/2)𝜏) (cid:34) (cid:18) β„“ βˆ’ 1/2 π‘˜ (cid:19) 𝑛 πœπ›Ό(𝑛) 𝑖 (πœβ„“βˆ’1/2) + 𝑛 (cid:19) π‘›βˆ’1 (cid:18) β„“ βˆ’ 1/2 π‘˜ 𝛼(π‘›βˆ’1) 𝑖 (πœβ„“βˆ’1/2) (cid:35) (5.62) where 𝛼(𝑛) (π‘₯) refers to the 𝑛th derivative of 𝛼 with respect to its argument, then evaluated at π‘₯ (i.e., not a 𝜏 derivative). Since βˆ₯𝐻𝑖 βˆ₯ ≀ 1 we have βˆ₯𝑑𝑛 𝜏 𝐴𝑖,β„“ βˆ₯ < 1 π‘˜ (β„“/π‘˜)π‘›βˆ’1 (cid:16) (β„“/π‘˜)𝜏|𝛼(𝑛) 𝑖 (πœβ„“βˆ’1/2)| + 𝑛|𝛼(π‘›βˆ’1) 𝑖 (πœβ„“βˆ’1/2)| (cid:17) . (5.63) From Definition 3, 𝛼( 𝑗) 𝑖 (𝑑) ≀ Λ𝑖,𝑛 (𝑑) 𝑗+1. Dropping the 𝑛 and 𝑑 dependence for the moment, βˆ₯𝑑𝑛 𝜏 𝐴𝑖,β„“ βˆ₯ < (β„“/π‘˜)𝑛 (cid:16) (𝜏/π‘˜)Λ𝑛+1 𝑖 + (𝑛/β„“)Λ𝑛 𝑖 (cid:17) = (Λ𝑖ℓ/π‘˜)𝑛 (Ξ›π‘–πœ/π‘˜ + 𝑛/β„“) . (5.64) We’ve reached the bottom, and now proceed to work our way back up to the Taylor remainder 𝑅2π‘š, starting with (5.61). Using Lemma 5.4.4, (cid:32) 𝑛 (cid:214) (cid:33) 𝑐 𝑗 βˆ₯𝑑 𝑗 𝜏 𝐴𝑖,β„“ βˆ₯ 𝑗! βˆ₯𝑑𝑛 𝜏 𝑒 𝐴𝑖,β„“ βˆ₯ ≀ βˆ‘οΈ 𝐢 𝑛! 𝑐1!𝑐2! . . . 𝑐𝑛! < βˆ‘οΈ 𝐢 𝑛! 𝑐1!𝑐2! . . . 𝑐𝑛! 𝑗=1 𝑛 (cid:214) 𝑗=1 (cid:18) (Λ𝑖ℓ/π‘˜) 𝑗 (Ξ›π‘–πœ/π‘˜ + 𝑗/β„“) 𝑗! (cid:19) 𝑐 𝑗 . (5.65) Using the sum rule for 𝐢 we can pull out a factor of (Λ𝑖ℓ/π‘˜). With the upper bound 𝑗 ≀ 𝑛 and the monotonicity of π‘Œπ‘›, we obtain the bound βˆ₯𝑑𝑛 𝜏 𝑒 𝐴𝑖,β„“ βˆ₯ < (Λ𝑖ℓ/π‘˜)𝑛𝐡𝑛 (Ξ›π‘–πœ/π‘˜ + 𝑛/β„“) where 𝐡𝑛 is the Bell polynomial (see Section 2.8). For simplicity, define π‘₯𝑖,β„“,𝑛 = Ξ›π‘–πœ/π‘˜ + 𝑛/β„“ as the argument to 𝐡𝑛. Employing the bound (2.56), βˆ₯𝑑𝑛 𝜏 𝑒 𝐴𝑖,β„“ βˆ₯ < (Λ𝑖ℓ/π‘˜)𝑛 (cid:18) 𝑛 log(1 + 𝑛/π‘₯𝑖,β„“,𝑛) (cid:19) 𝑛 122 (5.66) (5.67) (5.68) which is valid for all 𝑛 > 0, and for 𝑛 = 0 when defined by the 0+ limit. We can simplify the reciprocal log with the bound 1 log(1 + 𝑛/π‘₯𝑖,β„“,𝑛) < = (cid:18) 1 2 1 2𝑛 + (cid:18) (cid:19) 𝑛 π‘₯𝑖,β„“,𝑛 𝑛 2π‘₯𝑖,β„“,𝑛 𝑛 1 + (cid:19) 𝑛 . This gives us the simplified exponential derivative βˆ₯𝑑𝑛 𝜏 𝑒 𝐴𝑖,β„“ βˆ₯ < (Λ𝑖ℓ/2π‘˜)𝑛 (𝑛 + 2π‘₯𝑖,β„“,𝑛)𝑛. We now move up a level to reconsider (5.60). Employing a triangle inequality, βˆ₯𝑑𝑛 πœπ‘ˆ2(πœβ„“, πœβ„“βˆ’1)βˆ₯ ≀ βˆ‘οΈ 𝑁 < βˆ‘οΈ 𝑁 (cid:18) (cid:18) 𝑛 𝑛1, . . . , 𝑛2𝐿 𝑛 𝑛1, . . . , 𝑛2𝐿 (cid:19) 2𝐿 (cid:214) 𝑖=1 (cid:19) 2𝐿 (cid:214) 𝑖=1 βˆ₯𝑑𝑛𝑖 𝜏 𝑒 𝐴𝑖,β„“ βˆ₯ (Λ𝑖ℓ/2π‘˜)𝑛𝑖 (𝑛𝑖 + 2π‘₯𝑖,β„“,𝑛𝑖 )𝑛𝑖 . (5.69) (5.70) (5.71) Maximize Λ𝑖 over all 𝑖 = 1, . . . , 𝐿 and call it Ξ›. We can factor out the corresponding term, and with some rewriting obtain (Ξ›β„“/2π‘˜)𝑛 βˆ‘οΈ 𝑁 (cid:18) 𝑛 𝑛1, . . . , 𝑛2𝐿 (cid:19) 2𝐿 (cid:214) 𝑖=1 (𝑛𝑖 + 2π‘₯β„“,𝑛𝑖 )𝑛𝑖 (5.72) where we’ve also let π‘₯β„“,𝑛𝑖 be π‘₯𝑖,β„“,𝑛𝑖 with the subscript dropped on Λ𝑖. Focusing on the rightmost product over 𝑖, one can show using a Lagrange multiplier that the maximum is given by 𝑛𝑖 = 𝑛/2𝐿 for all 𝑖 (we maximize over 𝑛𝑖 ∈ R+, which is an upper bound). This is intuitive from symmetry of the product as well. Taking this as an upper bound, we have + 2Ξ›πœ π‘˜ + 𝑛 𝐿ℓ (cid:19) 𝑛 (cid:18) βˆ‘οΈ 𝑁 (cid:19) 𝑛 𝑛1, . . . , 𝑛2𝐿 βˆ₯𝑑𝑛 πœπ‘ˆ2(πœβ„“, πœβ„“βˆ’1)βˆ₯ < (Ξ›β„“/2π‘˜)𝑛 = (Ξ›β„“/2π‘˜)𝑛 (cid:18) 𝑛 2𝐿 (cid:18) 𝑛 + 4Ξ›πœπΏ π‘˜ (cid:19) 𝑛 (cid:19) 𝑛 . + 2𝑛 β„“ 2Ξ›πœπΏβ„“ π‘˜ (cid:18) = (Ξ›/π‘˜)𝑛 𝑛 + 𝑛ℓ/2 + (5.73) where in going to the second line we evaluated the multinomial sum as (2𝐿)𝑛 and simplified. 123 With this in hand, we return to (5.57) and bound it as βˆ₯𝑑𝑛 πœπ‘ˆ (π‘˜) 2 (𝜏, 0)βˆ₯ ≀ (cid:18) βˆ‘οΈ 𝑁 𝑛 𝑛1, . . . , π‘›π‘˜ (cid:19) π‘˜ (cid:214) βˆ₯𝑑𝑛ℓ 𝜏 π‘ˆ2(πœβ„“, πœβ„“βˆ’1) βˆ₯ 𝑛 𝑛1, . . . 
, π‘›π‘˜ Using the upper bound β„“ ≀ π‘˜ and factoring out the (Ξ›/π‘˜)𝑛ℓ using the sum rule, 𝑛ℓ + 𝑛ℓℓ/2 + < βˆ‘οΈ 𝑁 (Ξ›/π‘˜)𝑛ℓ 2Ξ›πœπΏβ„“ π‘˜ β„“=1 (cid:18) (cid:18) β„“=1 π‘˜ (cid:214) (cid:19) (5.74) (cid:19) 𝑛ℓ . βˆ₯𝑑𝑛 𝑛 𝑛1, . . . , π‘›π‘˜ Similar to, we upper bound the product using 𝑛ℓ = 𝑛/π‘˜ for all β„“, which can be justified through a (𝜏, 0)βˆ₯ < (Ξ›/π‘˜)𝑛 βˆ‘οΈ (𝑛ℓ + 𝑛ℓ π‘˜/2 + 2Ξ›πœπΏ)𝑛ℓ . πœπ‘ˆ (π‘˜) (5.75) β„“=1 𝑁 2 (cid:18) (cid:19) π‘˜ (cid:214) maximization using Lagrange multipliers. The corresponding bound is βˆ₯𝑑𝑛 πœπ‘ˆ (π‘˜) 2 (𝜏, 0)βˆ₯ < (Ξ›/π‘˜)𝑛 (𝑛/π‘˜ + 𝑛/2 + 2Ξ›πœπΏ)𝑛 βˆ‘οΈ 𝑁 (cid:19) (cid:18) 𝑛 𝑛1, . . . , π‘›π‘˜ = (𝑛Λ)𝑛 (cid:18) 1 π‘˜ + 1 2 + 2Ξ›πœπΏ 𝑛 (cid:19) 𝑛 . (5.76) We are finally ready to return to equation (5.56) and bound 𝑅2π‘š. We recall that Ξ› has 𝜏 dependence, and let Ξ›max := max𝜏∈[0,𝑑] Ξ›(𝜏). We also upper bound any appearance of 𝜏 otherwise by 𝑑 because these are always in the numerator. So far, using 𝑛 = 2π‘š + 1, these reductions give βˆ₯𝑅2π‘š βˆ₯ < βˆ₯π‘Žβˆ₯1 (2π‘š + 1)! ((2π‘š + 1)Ξ›max𝑑)2π‘š+1 max 𝑗 (cid:18) 1 π‘˜ 𝑗 + 1 2 + 2Λ𝑑 𝐿 2π‘š + 1 (cid:19) 2π‘š+1 . (5.77) Employing a Stirling bound on the factorial, and factoring out an additional 𝐿 from the rightmost term, βˆ₯𝑅2π‘š βˆ₯ < βˆ₯π‘Žβˆ₯1 √︁2πœ‹(2π‘š + 1) (𝑒𝐿Λmax𝑑)2π‘š+1 max 𝑗 (cid:18) 1 πΏπ‘˜ 𝑗 + 1 2𝐿 + 2Λ𝑑 2π‘š + 1 (cid:19) 2π‘š+1 . (5.78) We now apply the assumption that 𝑒𝐿Λmax𝑑 < 1 to upper bound the max 𝑗 term, along with π‘˜ 𝑗 , 𝐿 β‰₯ 1. Thus, (cid:18) 1 πΏπ‘˜ 𝑗 + 1 2𝐿 + 2Λ𝑑 2π‘š + 1 max 𝑗 (cid:19) 2π‘š+1 (cid:19) 2π‘š+1 < (cid:18) 3 2 + 2 3𝑒 βˆ₯𝑅2π‘š βˆ₯ < < βˆ₯π‘Žβˆ₯1 √ πœ‹π‘š 2 βˆ₯π‘Žβˆ₯1 √ πœ‹π‘š 2 (cid:18)(cid:18) 3𝑒 2 + (cid:19) 2 3 𝐿 max 𝜏∈[0,𝑑] Ξ›2π‘š+1(𝜏)𝑑 (cid:18) 5𝐿 max 𝜏∈[0,𝑑] Ξ›2π‘š+1(𝜏)𝑑 (cid:19) 2π‘š+1 . (cid:19) 2π‘š+1 (5.79) (5.80) In these last lines, we remind ourselves that Ξ› has the subscript 2π‘š + 1 as per Definition 3. β–‘ 124 5.5 Time Step Analysis The next ingredient we need for a complexity analysis is asymptotic bounds on the number of subintervals π‘Ÿ needed in the time mesh. This will be the concern of this section. Unfortunately, in pursuing best-case bounds on π‘Ÿ, we eschew a practical procedure for generating the time points 𝑑𝑖. Section 5.9 provides a concrete procedure which is based on the analysis of this section. For time dependent Hamiltonians, because the cost per unit time can vary with 𝑑 in general, one should adaptively choose the step size depending on the cost. For our purposes, this means choosing a step size inversely proportional to the energy measure Ξ›2π‘š+1(𝑑). We will explore this adaptive time stepping and show 𝐿1-norm scaling with Ξ›2π‘š+1(𝑑) here. To derive bounds on π‘Ÿ, we will need to assume something about size of the derivative (cid:164)Ξ›2π‘š+1 compared to Ξ›2π‘š+1 itself. Given a Λ𝑛-bound, a differentiable (smooth, even) Λ𝑛-bound exists. From now on, we consider Λ𝑛-bounds for which there exists a 𝐾 ∈ R+ be such that | (cid:164)Λ𝑛 (𝑑)| ≀ 𝐾Λ𝑛 (𝑑)2 for all 𝑑 ∈ [0, 𝑇]. Given 𝐻 that is Λ𝑛 boundable, there is always, in fact, a Λ𝑛 bound such that 𝐾 exists and is arbitrarily close to zero. For example, we may take a constant bound Ξ›β€² 𝑛 := max𝑑 Λ𝑛 (𝑑), noting that Λ𝑛 is continuous on a compact interval. Of course, Ξ›β€² does not capture the changing behavior of 𝐻 (𝑑), and is therefore suboptimal. Nevertheless, we’ve demonstrated that our additional assumptions are not much more restrictive than those we’ve already made. 
Note that (in natural units) 𝐾 is dimensionless. With these preliminaries in place, the following result provides an upper bound on the number of time steps needed for our MPF algorithm.

Lemma 5.5.1. Let 𝐻 satisfy the assumptions of Theorem 5.4.1, and let Ξ›2π‘š+1 be a Ξ›2π‘š+1-bound for 𝐻 such that, for some 𝐾 ∈ R+, |Λ̇2π‘š+1(𝑑)| ≀ 𝐾Λ2π‘š+1(𝑑)Β² for all 𝑑 ∈ [0, 𝑇]. For every πœ– > 0, there exists a list (𝑑0, 𝑑1, . . . , π‘‘π‘Ÿ) of monotonically increasing times 𝑑𝑗 ∈ [0, 𝑇], with 𝑑0 = 0 and π‘‘π‘Ÿ = 𝑇, such that

βˆ₯π‘ˆ(𝑇, 0) βˆ’ ∏_{𝑖=1}^{π‘Ÿ} π‘ˆ2,π‘š(𝑑𝑖, π‘‘π‘–βˆ’1)βˆ₯ ≀ πœ–,

with the number of time steps π‘Ÿ bounded above as

π‘Ÿ ≀ ⌈(5(1 + (3/2)𝐾) 𝐿 βˆ₯Ξ›2π‘š+1βˆ₯1)^{1+1/(2π‘š)} (βˆ₯π‘Žβˆ₯1/(√(πœ‹π‘š) πœ–))^{1/(2π‘š)}βŒ‰.

Here, βˆ₯Ξ›2π‘š+1βˆ₯1 is the 𝐿1 norm

βˆ₯Ξ›2π‘š+1βˆ₯1 := ∫_0^𝑇 Ξ›2π‘š+1(𝑑) 𝑑𝑑.

Proof. As discussed in Section 5.4, in order to satisfy the πœ–-error constraint of Lemma 5.5.1, it suffices that the error on each subinterval is less than πœ–/π‘Ÿ. Using Theorem 5.4.1, the sum is bounded as

βˆ‘_{𝑖=1}^{π‘Ÿ} βˆ₯π‘ˆ(𝑑𝑖, π‘‘π‘–βˆ’1) βˆ’ π‘ˆ2,π‘š(𝑑𝑖, π‘‘π‘–βˆ’1)βˆ₯ ≀ (βˆ₯π‘Žβˆ₯1/√(πœ‹π‘š)) βˆ‘_{𝑖=1}^{π‘Ÿ} (5𝐿 max_{𝜏∈[π‘‘π‘–βˆ’1,𝑑𝑖]} Ξ›2π‘š+1(𝜏) (𝑑𝑖 βˆ’ π‘‘π‘–βˆ’1))^{2π‘š+1}. (5.81)

To ensure an overall error πœ–, it therefore suffices to produce a mesh such that for each 𝑖,

(βˆ₯π‘Žβˆ₯1/√(πœ‹π‘š)) (5𝐿 max_{𝜏∈[π‘‘π‘–βˆ’1,𝑑𝑖]} Ξ›2π‘š+1(𝜏) (𝑑𝑖 βˆ’ π‘‘π‘–βˆ’1))^{2π‘š+1} ≀ πœ–/π‘Ÿ. (5.82)

Rearranging, this corresponds to choosing 𝑑𝑖, given all other parameters, that satisfy

𝐿 max_{𝜏∈[π‘‘π‘–βˆ’1,𝑑𝑖]} Ξ›2π‘š+1(𝜏) (𝑑𝑖 βˆ’ π‘‘π‘–βˆ’1) ≀ (1/5) (πœ– √(πœ‹π‘š)/(βˆ₯π‘Žβˆ₯1 π‘Ÿ))^{1/(2π‘š+1)}. (5.83)

We now digress in order to relate max_𝜏 Ξ›2π‘š+1(𝜏) and its average. Here is where we will make use of the 𝐾-bound on the derivative Λ̇2π‘š+1; we closely follow arguments found in [137]. From the inequality in the lemma statement, we have

|Λ̇2π‘š+1(𝑑)|/Ξ›2π‘š+1(𝑑)Β² = |𝑑/𝑑𝑑 (1/Ξ›2π‘š+1(𝑑))| ≀ 𝐾. (5.84)

Suppose the time π‘‘π‘–βˆ’1 has been chosen by the previous iteration (if 𝑖 = 1, 𝑑0 = 0). Let 𝑑 > π‘‘π‘–βˆ’1 and integrate the above inequality from π‘‘π‘–βˆ’1 to 𝑑:

|1/Ξ›2π‘š+1(𝑑) βˆ’ 1/Ξ›2π‘š+1(π‘‘π‘–βˆ’1)| = |∫_{π‘‘π‘–βˆ’1}^{𝑑} (𝑑/π‘‘πœ)(1/Ξ›2π‘š+1(𝜏)) π‘‘πœ| ≀ ∫_{π‘‘π‘–βˆ’1}^{𝑑} |(𝑑/π‘‘πœ)(1/Ξ›2π‘š+1(𝜏))| π‘‘πœ ≀ 𝐾(𝑑 βˆ’ π‘‘π‘–βˆ’1). (5.85)

Let us rearrange this in terms of Ξ›2π‘š+1(𝑑) alone. From

βˆ’πΎ(𝑑 βˆ’ π‘‘π‘–βˆ’1) ≀ 1/Ξ›2π‘š+1(𝑑) βˆ’ 1/Ξ›2π‘š+1(π‘‘π‘–βˆ’1) ≀ 𝐾(𝑑 βˆ’ π‘‘π‘–βˆ’1),

we obtain

Ξ›2π‘š+1(π‘‘π‘–βˆ’1)/(1 + 𝐾(𝑑 βˆ’ π‘‘π‘–βˆ’1)Ξ›2π‘š+1(π‘‘π‘–βˆ’1)) ≀ Ξ›2π‘š+1(𝑑) ≀ Ξ›2π‘š+1(π‘‘π‘–βˆ’1)/(1 βˆ’ 𝐾(𝑑 βˆ’ π‘‘π‘–βˆ’1)Ξ›2π‘š+1(π‘‘π‘–βˆ’1)). (5.86)

The lower bound holds for all 𝑑 > π‘‘π‘–βˆ’1, while the upper bound only holds when

(𝑑 βˆ’ π‘‘π‘–βˆ’1)Ξ›2π‘š+1(π‘‘π‘–βˆ’1)𝐾 < 1. (5.87)

We restrict our attention to 𝑑 for which both bounds hold.
Consider, for the moment, only the leftmost inequality. The lower bound on the left is monotonically decreasing with 𝑑. This means that it is also a uniform lower bound on Ξ›2π‘š+1(𝑑′) for any 𝑑′ ∈ [π‘‘π‘–βˆ’1, 𝑑]. Therefore, it is a lower bound for the average Β―Ξ›2π‘š+1(𝑑) on the interval [π‘‘π‘–βˆ’1, 𝑑]. That is, Β―Ξ›2π‘š+1(𝑑, π‘‘π‘–βˆ’1) := 1 𝑑 βˆ’ π‘‘π‘–βˆ’1 ∫ 𝑑 π‘‘π‘–βˆ’1 Ξ›2π‘š+1(𝜏)π‘‘πœ Ξ›2π‘š+1(π‘‘π‘–βˆ’1) 1 + 𝐾 (𝑑 βˆ’ π‘‘π‘–βˆ’1)Ξ›2π‘š+1(π‘‘π‘–βˆ’1) ≀ Β―Ξ›2π‘š+1(𝑑, π‘‘π‘–βˆ’1), or, after isolating for Ξ›2π‘š+1(π‘‘π‘–βˆ’1) Ξ›2π‘š+1(π‘‘π‘–βˆ’1) ≀ Β―Ξ›2π‘š+1(𝑑, π‘‘π‘–βˆ’1) 1 βˆ’ 𝐾 (𝑑 βˆ’ π‘‘π‘–βˆ’1) Β―Ξ›2π‘š+1(𝑑, π‘‘π‘–βˆ’1) . (5.88) (5.89) (5.90) At this point, let’s now consider the upper bound in equation (5.86). This bound is monotonically increasing in 𝑑, and therefore also upper bounds Ξ›2π‘š+1(𝜏) for any 𝜏 in [π‘‘π‘–βˆ’1, 𝑑]. Therefore, it is also a bound for the maximum. max 𝜏∈[π‘‘π‘–βˆ’1,𝑑] Ξ›2π‘š+1(𝜏) ≀ Ξ›2π‘š+1(π‘‘π‘–βˆ’1) 1 βˆ’ 𝐾 (𝑑 βˆ’ π‘‘π‘–βˆ’1)Ξ›2π‘š+1(π‘‘π‘–βˆ’1) (5.91) Substituting bounds for Ξ›2π‘š+1(π‘‘π‘–βˆ’1) from equation (5.90) gives us a bound on the maximum value in terms of the average. max 𝜏∈[π‘‘π‘–βˆ’1,𝑑] Ξ›2π‘š+1(𝜏) ≀ Β―Ξ›2π‘š+1(𝑑, π‘‘π‘–βˆ’1) 1 βˆ’ 3 2 𝐾 Β―Ξ›2π‘š+1(𝑑, π‘‘π‘–βˆ’1) (𝑑 βˆ’ π‘‘π‘–βˆ’1) (5.92) 127 Solving for the average value of Ξ›2π‘š+1, and multiplying by 𝑑 βˆ’ π‘‘π‘–βˆ’1 on both sides, (𝑑 βˆ’ π‘‘π‘–βˆ’1) Β―Ξ›2π‘š+1(𝑑, π‘‘π‘–βˆ’1) β‰₯ (𝑑 βˆ’ π‘‘π‘–βˆ’1) max𝜏∈[π‘‘π‘–βˆ’1,𝑑] Ξ›2π‘š+1(𝜏) 𝐾 (𝑑 βˆ’ π‘‘π‘–βˆ’1) max𝜏∈[π‘‘π‘–βˆ’1,𝑑] Ξ›2π‘š+1(𝜏) 1 + 3 2 . (5.93) Let us finally choose a 𝑑 = 𝑑𝑖 which will serve as the next time step in the adaptive scheme. We would like come as close as possible to saturating equation (5.83) while staying within the constraint imposed by the maximum bound of equation (5.86). Thus, we choose 𝑑𝑖 such that max 𝜏∈[π‘‘π‘–βˆ’1,𝑑𝑖] Ξ›2π‘š+1(𝜏)(𝑑𝑖 βˆ’ π‘‘π‘–βˆ’1) = min (cid:40) 1 𝐾 , 1 5𝐿 √ (cid:18) πœ– πœ‹π‘š βˆ₯π‘Žβˆ₯1π‘Ÿ (cid:19) 1/(2π‘š+1)(cid:41) . (5.94) Since 𝐾 is a constant, for asymptotic purposes we will assume sufficiently small πœ– such that the right term is smaller. Plugging in to (5.93) yields Β―Ξ›2π‘š+1(𝑑𝑖, π‘‘π‘–βˆ’1)(𝑑𝑖 βˆ’ π‘‘π‘–βˆ’1) β‰₯ 1 5𝐿 (cid:17) 1/(2π‘š+1) (cid:16) πœ– √ πœ‹π‘š βˆ₯π‘Žβˆ₯1π‘Ÿ √ (cid:16) πœ– πœ‹π‘š βˆ₯π‘Žβˆ₯1π‘Ÿ (cid:17) 1/(2π‘š+1) . 𝐾 (5.95) < 1 and by summing over 𝑖 = 1, . . . , π‘Ÿ 1 + 3 2 1 5𝐿 (cid:17) 1/(2π‘š+1) (cid:16) πœ– √ πœ‹π‘š βˆ₯π‘Žβˆ₯1π‘Ÿ We then find, by using the fact that 1 5𝐿 in (5.95), that βˆ₯Ξ›βˆ₯1 β‰₯ π‘Ÿ 2π‘š 2π‘š+1 1 5𝐿 (cid:18) πœ– √ πœ‹π‘š βˆ₯π‘Žβˆ₯1 (cid:19) 1/(2π‘š+1) (cid:18) 1 + (cid:19) βˆ’1 . 𝐾 3 2 (5.96) Finally, rearranging the above, this implies that the number of steps required for the MPF algorithm is upper bounded as (cid:18) (cid:18) 5 1 + π‘Ÿ ≀ (cid:19) 𝐾 3 2 𝐿βˆ₯Ξ›βˆ₯1 (cid:19) 1+ 1 2π‘š (cid:18) βˆ₯π‘Žβˆ₯1 √ πœ‹π‘š πœ– (cid:19) 1 2π‘š . The result then directly follows from the requirement that π‘Ÿ is an integer. (5.97) β–‘ To summarize, we’ve provided an upper bound on the number of steps π‘Ÿ needed given as- sumptions on the derivative of Ξ›2π‘š+1. What is perhaps objectionable is that, in determining our subsequent time stepping, we seemed to need information about the total number of steps π‘Ÿ that we would end up with. While this does not detract from the correctness of our result, it does indicate possible difficulty in constructing a suitable set of 𝑑 𝑗 for which the Lemma holds. 
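For a sense of scale, the bound of Lemma 5.5.1 is simple to evaluate numerically. The sketch below does so for illustrative (assumed) parameter values, using βˆ₯π‘Žβˆ₯1 ≀ 2 as in the well-conditioned scheme used later in this chapter.

import numpy as np

def r_bound(m, L, lam_l1, eps, K, a_norm=2.0):
    """Upper bound on the number of mesh intervals r from Lemma 5.5.1."""
    return int(np.ceil((5 * (1 + 1.5 * K) * L * lam_l1) ** (1 + 1 / (2 * m))
                       * (a_norm / (np.sqrt(np.pi * m) * eps)) ** (1 / (2 * m))))

# Illustrative (assumed) parameters: the 1/eps**(1/2m) factor flattens
# out quickly as the order m grows.
for m in (1, 2, 4, 8):
    print(m, r_bound(m, L=2, lam_l1=10.0, eps=1e-6, K=0.5))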
One approach is to guess the final number π‘Ÿtry of steps needed, construct the mesh according to the proof, then see if π‘Ÿtry can be made correct. This approach is considered in Section 5.9. 128 5.6 Query Complexity With the results of the previous two sections, we proceed to bound the query complexity needed to perform a time dependent MPF simulation. First, we define a set of oracles that are appropriate for this simulation problem. We reemphasize that the most natural input model in our setting is the linear combinations of Hamiltonians model 𝐻 = 𝐿 βˆ‘οΈ 𝑗=1 𝛼 𝑗 (𝑑)𝐻 𝑗 , (5.98) where 𝛼 𝑗 : [0, 𝑇] β†’ R has 2π‘š + 1 continuous derivatives and 𝐻 𝑗 ∈ Herm(C2𝑛 We discretize [0, 𝑇] into 2𝑛𝑑 uniform grid points π‘‘π‘˜ = π‘˜π‘‡/2𝑛𝑑 for π‘˜ ∈ [0, 2𝑛𝑑 ) ∩ Z, and define ) satisfies βˆ₯𝐻 𝑗 βˆ₯ ≀ 1. 𝛼 𝑗 π‘˜ := 𝛼 𝑗 (π‘‘π‘˜ ). Let 𝛿𝑑 := 𝑇/2𝑛𝑑 . Let π‘ˆπ›Ό and π‘ˆπ» be unitary oracles which provide the input Hamiltonian as follows. π‘ˆπ›Ό | π‘—βŸ© |π‘˜βŸ© |𝜏⟩ |0⟩ := | π‘—βŸ© |π‘˜βŸ© |𝜏⟩ (cid:12) (cid:12)𝛼 𝑗 π‘˜ 𝜏(cid:11) |πœ“βŸ© := | π‘—βŸ© (cid:12) π‘ˆπ» | π‘—βŸ© (cid:12) (cid:12)𝛼 𝑗 π‘˜ 𝜏(cid:11) (cid:12)𝛼 𝑗 π‘˜ 𝜏(cid:11) exp{βˆ’π‘–π» 𝑗 𝛼 𝑗 π‘˜ 𝜏} |πœ“βŸ© (5.99) The oracle π‘ˆπ›Ό encodes a reversible classical computation and may be taken as self-inverse. Here |𝜏⟩ encodes a step of size 𝜏 ∈ R in binary using 𝑛𝑐 qubits. Such step sizes are always nonnegative for the low-order formulas we consider, and therefore we take 𝜏 ∈ [0, 𝑇]. Hence, 𝛿𝑑 = 𝑇/2𝑛𝑐 is the rounding error for the step sizes. We neglect rounding effects due to the values 𝛼 𝑗 π‘˜ 𝜏. Our first result concerns the approximate implementation of π‘ˆ2 using the two oracles. Lemma 5.6.1. Let π‘ˆ2(𝜏 + 𝑑, 𝑑) be the 2nd-order Suzuki-Trotter formula for the midpoint formula of equation (5.8). Then an approximation π‘Š2 can be constructed using at most 6𝐿 βˆ’ 3 queries to π‘ˆπ» and π‘ˆπ›Ό, such that βˆ₯π‘ˆ2(𝑑 + 𝜏, 𝑑) βˆ’ π‘Š2(𝑑 + 𝜏, 𝑑) βˆ₯ ≀ 𝐿 max 𝑗,π‘‘βˆˆ[0,𝑇] | (cid:164)𝛼 𝑗 (𝑑 + 𝜏/2)| 𝑇 2 2𝑛𝑐 . Proof. Define π‘Š2 as π‘ˆ2 but with each 𝛼 𝑗 evaluated at the nearest discrete times in {π‘‘π‘˜ }. Using the techniques of [137], two queries to π‘ˆπ›Ό and one query to π‘ˆπ» are needed to exactly simulate each of the 2𝐿 βˆ’ 1 exponentials present in π‘Š2. Thus 3 Γ— (2𝐿 βˆ’ 1) queries are needed total. To evaluate the 129 discretization error, by Box 4.1 of [101] we have that βˆ₯π‘Š2 βˆ’ π‘ˆ2βˆ₯ ≀ 2 𝐿 βˆ‘οΈ 𝑗=1 βˆ₯π‘’βˆ’π‘–π» 𝑗 𝛼 𝑗 (rnd[𝑑+𝜏/2])𝜏/2 βˆ’ π‘’βˆ’π‘–π» 𝑗 𝛼 𝑗 (𝑑+𝜏/2)𝜏/2βˆ₯ (5.100) which in turn is upper bounded, through an application of the fundamental theorem of calculus, by 2 𝐿 βˆ‘οΈ 𝑗=1 (cid:13) (cid:13)𝐻 𝑗 𝛼 𝑗 (rnd (cid:2)𝑑 + (cid:3)) 𝜏 2 𝜏 2 βˆ’ 𝐻 𝑗 𝛼 𝑗 (𝑑 + 𝜏 2 ) 𝜏 2 (cid:13) (cid:13) (5.101) where rnd rounds to the nearest 𝑛𝑐-bit value. Since βˆ₯𝐻 𝑗 βˆ₯ ≀ 1 this is merely upper bounded as 𝜏 𝐿 βˆ‘οΈ 𝑗=1 (cid:12) (cid:12)𝛼 𝑗 (rnd (cid:2)𝑑 + 𝜏 2 (cid:3)) βˆ’ 𝛼 𝑗 (𝑑 + 𝜏 2 )(cid:12) (cid:12). (5.102) By the fundamental theorem of calculus, with an integral upper bound, each term is upper bounded as 𝛿𝑑 maxπ›Ώπ‘‘βˆˆπ‘‘Β±π›Ώπ‘‘ |πœ•π‘‘π›Ό 𝑗 (𝑑 + 𝜏/2)|. Maximizing over [0, 𝑇] instead, and making other simplifying choices,we get a crude upper bound βˆ₯π‘Š2 βˆ’ π‘ˆ2βˆ₯ ≀ πœπΏπ›Ώπ‘‘ max 𝑗,[0,𝑇] | (cid:164)𝛼 𝑗 (𝑑)| ≀ 𝐿 𝑇 2 2𝑛𝑐 | (cid:164)𝛼 𝑗 (𝑑)|. max 𝑗,[0,𝑇] Rearranging this gives the inequality of the lemma statement. 
(5.103) β–‘ Having supplied an approximate base formula π‘Š2 with our queries, we next need to implement an approximate MPF π‘Š2,π‘š over a subinterval [𝑑0, 𝑑1]. This is conventionally done through the use of "select" SEL and "prepare" PREP circuits. βˆšοΈ„ π‘š βˆ‘οΈ PREP |0⟩ := |π‘Ž 𝑗 | βˆ₯π‘Žβˆ₯1 | π‘—βŸ© 𝑗=1 SEL | π‘—βŸ© |πœ“βŸ© := sgn(π‘Ž 𝑗 ) | π‘—βŸ© π‘Š (π‘˜ 𝑗 ) (𝑑1, 𝑑0) |πœ“βŸ© (5.104) 2 The circuit PREP can be implemented without any queries to π‘ˆπ›Ό or π‘ˆπ» whereas SEL requires 𝑂 (𝐿βˆ₯ (cid:174)π‘˜ βˆ₯∞) queries. Following the well-conditioned MPF scheme of [90] we have that π‘˜ 𝑗 ≀ 3π‘š2. This implies that a query to SEL requires 𝑂 (πΏπ‘š2) queries to π‘ˆπ» and π‘ˆπ›Ό. We can use the SEL and PREP for a standard LCU block encoding in order to construct a time dependent MPF with base formula π‘Š2. 130 Lemma 5.6.2. Under the assumptions of Theorem 5.4.1 and the query model above, for any [𝑑0, 𝑑1] βŠ† [0, 𝑇] the time dependent MPF π‘Š2,π‘š with base formula π‘Š2 satisfies βˆ₯π‘Š2,π‘š (𝑑1, 𝑑0) βˆ’ π‘ˆ (𝑑1, 𝑑0) βˆ₯ ∈ 𝑂 βˆ₯π‘Žβˆ₯1 (cid:32) (cid:18) max π‘‘βˆˆ[𝑑0,𝑑1] Ξ›2π‘š+1(𝑑)𝑇 (cid:19) 2π‘š+1(cid:33) , provided that (cid:32) 𝑛𝑐 β‰₯ log √ πœ‹π‘š5/2𝐿 max𝑑, 𝑗 |πœ•π‘‘π›Ό 𝑗 (𝑑)|(𝑑1 βˆ’ 𝑑0)2 3 (cid:0)5𝐿 maxπ‘‘βˆˆ[𝑑0,𝑑1] Ξ›2π‘š+1(𝑑) (𝑑1 βˆ’ 𝑑0)(cid:1) 2π‘š+1 (cid:33) , and can be constructed with a number of queries to π‘ˆπ» and π‘ˆπ›Ό scaling as 𝑂 (π‘š2𝐿). Proof. From Lemma 4 of [15], we have (⟨0| βŠ— 𝐼)(PREP†)SEL(PREP) (|0⟩ βŠ— 𝐼) = 1 βˆ₯π‘Žβˆ₯1 π‘š βˆ‘οΈ 𝑗=1 π‘Ž π‘—π‘Š (π‘˜ 𝑗 ) 2 (𝑑1, 𝑑0) Let 𝛿′ > 0 be such that, for all 𝑗 and β„“ ∈ {1, . . . , π‘˜ 𝑗 }, (cid:18) (cid:19) (cid:18) (cid:13) (cid:13) π‘Š2 (cid:13) (cid:13) Δ𝑑 β„“ π‘˜ 𝑗 + 𝑑0, Δ𝑑 + 𝑑0 βˆ’ π‘ˆ2 Δ𝑑 β„“ βˆ’ 1 π‘˜ 𝑗 = π‘Š2,π‘š (𝑑1, 𝑑0)/βˆ₯π‘Žβˆ₯1. β„“ π‘˜ 𝑗 + 𝑑0, Δ𝑑 β„“ βˆ’ 1 π‘˜ 𝑗 + 𝑑0 (cid:19)(cid:13) (cid:13) (cid:13) (cid:13) ≀ 𝛿′ where Δ𝑑 = 𝑑1 βˆ’ 𝑑0. Then, by invoking Box 4.1 from [101], (cid:13) π‘ˆ (π‘˜ 𝑗 ) (cid:13) (cid:13) 2 (𝑑1, 𝑑0) βˆ’ π‘Š (π‘˜ 𝑗 ) 2 (𝑑1, 𝑑0) (cid:13) (cid:13) (cid:13) ≀ π‘˜ 𝑗 𝛿′ which, since π‘˜ 𝑗 ≀ 3π‘š2, implies that (5.105) (5.106) (5.107) βˆ₯𝑉2,π‘š (𝑑1, 𝑑0) βˆ’ π‘Š2,π‘š (𝑑1, 𝑑0) βˆ₯ ≀ 3π‘š2𝛿′βˆ₯π‘Žβˆ₯1. (5.108) We supply 𝛿′ using Lemma 5.6.1, obtaining 3π‘š2𝛿′βˆ₯π‘Žβˆ₯1 ≀ 3π‘š2βˆ₯π‘Žβˆ₯1𝐿 max 𝑗,π‘‘βˆˆ[0,𝑇] | (cid:164)𝛼 𝑗 (𝑑 + 𝜏/2)| 𝑇 2 2𝑛𝑐 , (5.109) giving us a bound on the discretized MPF π‘Š2,π‘š relative to the undiscretized 𝑉2,π‘š. It then follows from the triangle inequality and Theorem 5.4.1 that (cid:13)π‘Š2,π‘š (𝑑1, 𝑑0) βˆ’ π‘ˆ (𝑑1, 𝑑0)(cid:13) (cid:13) (cid:13) ≀ (cid:13) (cid:13) + βˆ₯π‘Š2,π‘š (𝑑1, 𝑑0) βˆ’ π‘ˆ2,π‘š (𝑑1, 𝑑0) βˆ₯ (cid:13)π‘ˆ2,π‘š (𝑑1, 𝑑0) βˆ’ π‘ˆ (𝑑1, 𝑑0)(cid:13) βˆ₯π‘Žβˆ₯1√ πœ‹π‘š 5𝐿 max π‘‘βˆˆ[0,𝑇] Ξ›2π‘š+1(𝑑)𝑇 (cid:18) ≀ (cid:19) 2π‘š+1 + 3π‘š2𝐿βˆ₯π‘Žβˆ₯1 max 𝑗,𝑑 | (cid:164)𝛼 𝑗 (𝑑)| 𝑇 2 2𝑛𝑐 . (5.110) 131 Under the assumption that 𝑛𝑐 β‰₯ log (cid:32) √ πœ‹π‘š5/2𝐿 max 𝑗,𝑑 | (cid:164)𝛼 𝑗 (𝑑)|𝑇 2 3 (cid:0)5𝐿 maxπ‘‘βˆˆ[0,𝑇] Ξ›2π‘š+1(𝜏)𝑇 (cid:1) 2π‘š+1 (cid:33) (5.111) the second term is bounded by the first (5.110), so we have an upper bound π‘Ž 𝑗 (cid:13) (cid:13) (cid:13) (cid:13) (cid:13) (cid:13) 𝑀 βˆ‘οΈ π‘˜ 𝑗 (cid:214) (cid:13) (cid:13) (cid:13) (cid:13) (cid:13) (cid:13) Since π‘ˆ (𝑇, 0) is unitary, we know that the MPF implemented by our algorithm is close to a unitary. (cid:0)𝑇 π‘ž/π‘˜ 𝑗 , 𝑇 (π‘ž βˆ’ 1)/π‘˜ 𝑗 (cid:1) βˆ’ π‘ˆ (𝑇, 0) 5𝐿 max π‘‘βˆˆ[0,𝑇] 2βˆ₯π‘Žβˆ₯1√ πœ‹π‘š Ξ›2π‘š+1(𝑑)𝑇 (5.112) (cid:19) 2π‘š+1 π‘Š2 π‘ž=1 𝑗=1 ≀ (cid:18) . 
This means that we satisfy the preconditions of robust oblivious amplitude amplification given by Lemma 5 of [15]. This result implies that, using 𝑂(βˆ₯π‘Žβˆ₯1) applications of the unitary given by (5.105), we can implement an operator Λœπ‘Š(𝑇, 0) such that (for constant π‘š)

βˆ₯π‘Š2,π‘š(𝑑1, 𝑑0) βˆ’ π‘ˆ(𝑑1, 𝑑0)βˆ₯ ∈ 𝑂(βˆ₯π‘Žβˆ₯1 (max_{π‘‘βˆˆ[0,𝑇]} Ξ›2π‘š+1(𝑑) (𝑑1 βˆ’ 𝑑0))^{2π‘š+1}). (5.113)

The number of queries scales as

𝑄step ∈ 𝑂(βˆ₯π‘Žβˆ₯1 π‘šΒ²πΏ) βŠ† Γ•(π‘šΒ²πΏ). (5.114) β–‘

With the short-time simulation costs in place, we are now ready to state our main theorem, which bounds the number of queries needed to perform the full multiproduct simulation of a time dependent Hamiltonian.

Theorem 5.6.3. In the query setting above, and under the assumptions of Theorem 5.4.1 and Lemma 5.5.1 (Ξ›2π‘š+1-bounded 𝐻 with a 𝐾-bound on Λ̇2π‘š+1), the number of queries 𝑄tot to π‘ˆπ›Ό and π‘ˆπ» needed to construct an operator π‘Štot(𝑇, 0) that simulates a time dependent Hamiltonian of the form βˆ‘_{𝑗=1}^{𝐿} 𝛼𝑗(𝑑)𝐻𝑗, in the sense that

βˆ₯(⟨0| βŠ— 𝐼)π‘Štot(𝑇, 0)(|0⟩ βŠ— 𝐼) βˆ’ π‘ˆ(𝑇, 0)βˆ₯ ≀ πœ–,

satisfies

𝑄tot ∈ Γ•(𝐿(1 + 𝐾)βˆ₯Ξ›2π‘š+1βˆ₯1 logΒ²(1/πœ–)),

and the total number of auxiliary qubits is in

Γ•(log(𝐿(1 + 𝐾)βˆ₯Ξ›2π‘š+1βˆ₯1 max_{𝑗,𝑑} |𝛼̇𝑗(𝑑)| 𝑇²/πœ–)).

Proof. From Lemma 5.5.1 we have that the number of segments needed to perform the simulation within error πœ– obeys

π‘Ÿ ∈ Γ•(((1 + 𝐾)βˆ₯Ξ›2π‘š+1βˆ₯1)^{1+1/(2π‘š)}/πœ–^{1/(2π‘š)}). (5.115)

Therefore, using Lemma 5.6.2,

𝑄tot ∈ Γ•(π‘šΒ²πΏπ‘Ÿ) βŠ† Γ•(π‘šΒ²πΏ ((1 + 𝐾)βˆ₯Ξ›2π‘š+1βˆ₯1)^{1+1/(2π‘š)}/πœ–^{1/(2π‘š)}). (5.116)

An approximation to the optimal value of π‘š ∈ Z+ can be obtained by equating the exponentially shrinking component of the cost to the polynomially increasing factor π‘šΒ². We choose π‘š to satisfy

π‘šΒ² = ((1 + 𝐾)βˆ₯Ξ›2π‘š+1βˆ₯1/πœ–)^{1/(2π‘š)}. (5.117)

Solving for π‘š yields

π‘š = log((1 + 𝐾)βˆ₯Ξ›2π‘š+1βˆ₯1/πœ–) / (4 LambertW(log((1 + 𝐾)βˆ₯Ξ›2π‘š+1βˆ₯1/πœ–)/4)) ∈ Γ•(log((1 + 𝐾)βˆ₯Ξ›2π‘š+1βˆ₯1/πœ–)). (5.118)

This implies that the query complexity 𝑄tot is in

Γ•(𝐿(1 + 𝐾)βˆ₯Ξ›2π‘š+1βˆ₯1 logΒ²(1/πœ–)). (5.119)

The number of auxiliary qubits needed in the construction is in 𝑂(log(π‘š)) to implement the MPF and ⌈log πΏβŒ‰ + 𝑛𝑐 to implement the π‘ˆπ›Ό oracle. From the result of Lemma 5.6.2 we see that 𝑛𝑐 dominates this cost. We thus have a number of auxiliary qubits scaling as

𝑛aux ∈ 𝑂(log(π‘šΒ²πΏ max |πœ•π‘‘π›Όπ‘—(𝑑)| 𝑇² / (max_{π‘‘βˆˆ[0,𝑇]} Ξ›2π‘š+1(𝑑)𝑇)^{2π‘š+1}))
 ∈ 𝑂(log(π‘šΒ²πΏπ‘Ÿ βˆ₯π‘Žβˆ₯1 max |πœ•π‘‘π›Όπ‘—(𝑑)| 𝑇²/πœ–))
 ∈ Γ•(log(𝐿(1 + 𝐾)βˆ₯Ξ›2π‘š+1βˆ₯1 max |πœ•π‘‘π›Όπ‘—(𝑑)| 𝑇²/πœ–)), (5.120)

where we used Eq. (5.116) and Eq. (5.118) above. β–‘

This shows that the cost of quantum simulation using MPFs broadly conforms to the cost scalings that one would expect of previous methods. In particular, similar to the truncated Dyson series simulation method [89, 75], we obtain that the cost of simulating a time dependent Hamiltonian scales near-linearly with time 𝑇 and poly-logarithmically with 1/πœ–.
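The order selection (5.118) in the proof is also easy to reproduce numerically; the sketch below evaluates it with SciPy's Lambert-W function, for illustrative (assumed) values of 𝐾, βˆ₯Ξ›2π‘š+1βˆ₯1, and πœ–.

import numpy as np
from scipy.special import lambertw

def optimal_m(K, lam_l1, eps):
    """Near-optimal MPF order from (5.118):
    m = log(x) / (4 * W(log(x)/4)) with x = (1 + K) * ||Lambda||_1 / eps."""
    logx = np.log((1 + K) * lam_l1 / eps)
    return float(np.real(logx / (4 * lambertw(logx / 4))))

# Illustrative (assumed) parameters: a handful of MPF terms suffice
# even at quite small target error.
print(optimal_m(K=0.5, lam_l1=10.0, eps=1e-8))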
5.7 Numerical Demonstrations

In the above sections, we developed and characterized MPFs for time dependent simulations by showing their existence and proving error bounds. However, these bounds are unlikely to be the final word on the performance of the algorithm. For example, we already mentioned that, for time independent 𝐻, the MPF of Definition 2 is exact in cases where the Hamiltonian consists of only commuting terms. Yet this behavior is not captured in the bound of Theorem 5.4.1, because Ξ›2π‘š is at least as large as βˆ₯𝐻βˆ₯. This discrepancy is unrelated to the fact that, in practice, the 2nd-order formula π‘ˆ2 can only be computed approximately.

To begin bridging the gap between the algorithm's actual performance and our bounds, we investigate time dependent MPFs empirically through two numerical examples. We compute π‘ˆ2,π‘š for these systems on a classical computer (using matrix computations) and compare the result with the exact propagator (computed within machine πœ–). The vector π‘˜βƒ— ∈ Z^π‘š_+ we will use comes from the bottom half of Table I of [90], which minimizes βˆ₯π‘˜βƒ—βˆ₯ for βˆ₯π‘Žβˆ₯1 ≀ 2.

In general, deriving an analytical solution for the propagator given a time dependent Hamiltonian is challenging or impossible. To bypass this problem, we will consider a time independent Hamiltonian which is viewed from a "non-inertial" frame, thereby rendering the dynamics time dependent in the new frame. More specifically, suppose 𝐻 is a time independent Hamiltonian with propagator π‘ˆ(𝑑) = 𝑒^{βˆ’π‘–π»π‘‘} (henceforth the initial time is set to zero). Let |πœ“π‘‘βŸ© be the solution to the SchrΓΆdinger equation π‘–πœ•π‘‘|πœ“π‘‘βŸ© = 𝐻|πœ“π‘‘βŸ©. Under a frame transformation 𝑇(𝑑), which transforms vectors as |Λœπœ“π‘‘βŸ© = 𝑇(𝑑)|πœ“π‘‘βŸ©, the Hamiltonian and propagator transform as

Λœπ‘ˆ(𝑑) = 𝑇(𝑑)π‘ˆ(𝑑), ˜𝐻(𝑑) = 𝑖 (πœ•π‘‡(𝑑)/πœ•π‘‘) 𝑇(𝑑)† + 𝑇(𝑑)𝐻(𝑑)𝑇(𝑑)†. (5.121)

Thus, in order to benchmark the error of the MPF, we compute Λœπ‘ˆ2,π‘š for the Hamiltonian ˜𝐻, then compare with the exact propagator (accurate to machine precision):

πœ–π‘ = βˆ₯Λœπ‘ˆ2,π‘š(𝑑) βˆ’ 𝑇(𝑑)π‘ˆ(𝑑)βˆ₯. (5.122)

5.7.1 Example 1: Electron in Magnetic Field, Rotating Frame

As a very simple first demonstration, consider a spin-1/2 particle (say, an electron) in a homogeneous external magnetic field 𝐡. Choose a coordinate system such that 𝐡 makes an angle πœƒ with respect to the 𝑧-axis, and lies within the π‘₯𝑧 plane. This system can be described by the Hamiltonian

𝐻 = πœ‡π΅(cos πœƒ 𝑍/2 + sin πœƒ 𝑋/2), (5.123)

where 𝑍 and 𝑋 (and later π‘Œ) are Pauli operators, and πœ‡ is a coupling parameter that will henceforth be set to one. The propagator π‘ˆ(𝑑) = 𝑒^{βˆ’π‘–π»π‘‘} is easy to compute, and corresponds to precession about the magnetic field axis with frequency 𝐡.

To obtain a time dependent problem, let's shift to a reference frame that rotates with angular frequency πœ” about the 𝑧-axis. The transformation is given by 𝑅𝑧(πœ”π‘‘), where π‘…π‘Ž is the usual π‘†π‘ˆ(2) rotation operator about axis π‘Ž. The Hamiltonian in the rotating frame is

˜𝐻(𝑑) = (πœ” + 𝐡 cos πœƒ)𝑍/2 + 𝐡 sin πœƒ (cos πœ”π‘‘ 𝑋/2 + sin πœ”π‘‘ π‘Œ/2). (5.124)

Because we know that this Hamiltonian is just a transformed time independent system, it is easy to compute the exact propagator Λœπ‘ˆ(𝑑):

Λœπ‘ˆ(𝑑) = 𝑅𝑧(πœ”π‘‘)π‘ˆ(𝑑). (5.125)

Though it is not strictly necessary to run the algorithm, let's compute an appropriate Ξ›(𝑑) upper bound.
The spectral norm of ˜𝐻 may be upper bounded as

βˆ₯ ˜𝐻βˆ₯ ≀ |πœ” + 𝐡 cos πœƒ|/2 + |𝐡 sin πœƒ|, (5.126)

while the derivatives ˜𝐻^{(𝑛)}(𝑑) have the bound βˆ₯ ˜𝐻^{(𝑛)}(𝑑)βˆ₯ ≀ |𝐡 sin πœƒ| πœ”^𝑛, so that

βˆ₯ ˜𝐻^{(𝑛)}(𝑑)βˆ₯^{1/(𝑛+1)} ≀ πœ” |𝐡 sin πœƒ/πœ”|^{1/(𝑛+1)}. (5.127)

For πœ” not too much larger than 𝐡, we see then that Ξ›(𝑑) = πœ” is an appropriate choice.
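This example is small enough to reproduce end to end on a classical computer. The self-contained sketch below builds ˜𝐻(𝑑) of (5.124) and the exact propagator (5.125), forms an MPF from midpoint steps, and estimates the running power 𝑝(𝑑, 𝑑′) defined shortly in (5.128). The 2-term vector π‘˜ = (1, 2) and its Vandermonde coefficients π‘Ž = (βˆ’1/3, 4/3) are illustrative assumptions rather than the optimized vectors of [90], and the convention 𝑅𝑧(πœ™) = exp(βˆ’π‘–πœ™π‘/2) is assumed since it reproduces (5.124).

import numpy as np
from scipy.linalg import expm

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0 + 0j, -1.0])

B, omega, theta = 1.0, 4.0, np.pi / 6           # parameters from Figure 5.1
H_lab = B * (np.cos(theta) * Z + np.sin(theta) * X) / 2   # eq. (5.123), mu = 1

def H_rot(t):
    # Rotating-frame Hamiltonian, eq. (5.124).
    return ((omega + B * np.cos(theta)) * Z / 2
            + B * np.sin(theta) * (np.cos(omega * t) * X + np.sin(omega * t) * Y) / 2)

def U_exact(t):
    # Exact rotating-frame propagator, eq. (5.125).
    return expm(-1j * omega * t * Z / 2) @ expm(-1j * H_lab * t)

def u2_steps(H, t, k):
    # k midpoint steps: U_2(t_l, t_{l-1}) = exp(-i * dt * H(midpoint)).
    U, dt = np.eye(2, dtype=complex), t / k
    for l in range(k):
        U = expm(-1j * dt * H((l + 0.5) * dt)) @ U
    return U

def mpf_error(t, ks, a):
    # eps_p of eq. (5.122) for U_{2,m}(t, 0) = sum_j a_j * U_2^{(k_j)}(t, 0).
    U_mpf = sum(aj * u2_steps(H_rot, t, kj) for aj, kj in zip(a, ks))
    return np.linalg.norm(U_mpf - U_exact(t), ord=2)

# Running power p(t, t') of eq. (5.128); for m = 2 a value near
# 2m + 1 = 5 is expected at small simulation times.
ks, a = (1, 2), (-1.0 / 3.0, 4.0 / 3.0)
t, tp = 0.2, 0.3
print(np.log(mpf_error(t, ks, a) / mpf_error(tp, ks, a)) / np.log(t / tp))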
Indeed, there are good reasons to believe the absolute convergence property holds more gener- ically than this example. No matter how large the order π‘š, we are still using a low order formula (such as the midpoint formula π‘ˆ2) as a base. Moreover, recall that the MPF is essentially a sum of product formulas with different numbers of time steps (for the same time interval). As the order π‘š increases, higher weight is given to terms in the multiproduct sum with finer meshes. Corre- spondingly, terms which have larger time steps, and therefore may not converge properly, become suppressed at large π‘š. Such behavior is not reflected in our derived error bounds, so there is likely room for improvement. 137 Practitioners in quantum simulation will likely want to know how MPFs fare against the more- familiar and simpler Trotter techniques. To facilitate this, numerical studies across a broad range of physically interesting systems would be desirable. Such a comprehensive analysis must be left to future work; here we will be satisfied with comparing MPFs with Trotterization for our spin-1/2 example. Our Trotterization is just an MPF with π‘š = 1, corresponding to a midpoint- formula approximation. To facilitate as fair a comparison as possible, we will keep the number of midpoint-formula queries between the two methods the same. That is, we will enforce the requirement π‘Ÿtrot = π‘Ÿmpf max 𝑗 |π‘˜ 𝑗 | (5.129) where π‘Ÿtrot and π‘Ÿmpf are the number of time steps for Trotter and MPF, respectively. Note that the number of midpoint queries per time step for Trotter and MPFs are 1 and max 𝑗 |π‘˜ 𝑗 | respectively. Figure 5.7.1 shows the results of these head-to-head comparisons for the several values of the magnetic field 𝐡 and rotation frequency πœ”. The number of MPF steps π‘Ÿmpf is fixed at 10, a reasonable value since it makes ΛΔ𝑑 ∼ 1 on each subinterval. As the MPF order increases, so does the number of Trotter steps π‘Ÿtrot by the condition (5.129). These results show that, for π‘š not too large, MPFs outperform Trotterization, at a value of the error πœ– which is large enough to be of practical significance for scientific or industrial applications. Admittedly, the spin-1/2 system considered above is rather simplistic. However, we anticipate most of the inferences drawn above to hold even as we increase the dimensionality of the Hilbert space. For example, though the complexity of simulating π‘ˆ2 generally increases as dim(𝐻) grows, it does so both for MPFs and Trotterization. Nevertheless, benchmarking of MPFs on more complex systems would be a welcomed proof (or disproof) of concept. 5.7.2 Example 2: Spin Chain in Interaction Picture As a first step towards more complicated many-body quantum systems, we investigate the use of MPFs for a particular one-dimensional chain of spins with nearest-neighbor interactions. As before, we will take advantage of a change of reference frame, allowing us to compare the multiproduct simulations with a machine-precise simulation in an equivalent, time independent 138 Figure 5.3 Simulation error (spectral norm) of MPFs and midpoint-formula Trotterization, for the spin-1/2 system, with number of midpoint-formula queries kept fixed between the two. Each plot corresponds to different values for the parameters 𝐡 and πœ”, always with πœƒ = πœ‹/6. The number of MPF steps π‘Ÿmpf is fixed at 10. The crossover point tends to occur for error πœ– > 10βˆ’3, which is large enough for practical significance. 
Such error tolerances can be orders of magnitude larger than those required in many quantum simulation proposals [112, 84].

frame. In pursuit of a good case study, we seek a (time independent) Hamiltonian 𝐻 = 𝐻0 + 𝐻1 which produces nontrivial time dependence in the so-called "interaction picture." We also ask that it satisfies a simple conservation law. A special instance of the 1D 𝑋𝑋 model will suffice to meet these conditions. Consider a circular chain of 𝑁 qubits with nearest-neighbor hopping interactions, with Hamiltonian 𝐻 = 𝐻0 + 𝐻1 of the form

𝐻0 = βˆ‘_{π‘˜=1}^{𝑁} (πœ”π‘˜/2) π‘π‘˜,
𝐻1 = βˆ‘_{π‘˜=1}^{𝑁} (π½π‘˜/2) (π‘‹π‘˜π‘‹π‘˜+1 + π‘Œπ‘˜π‘Œπ‘˜+1). (5.130)

Here, πœ”π‘˜, π½π‘˜ are real, site-dependent parameters, and any index increments are done modulo 𝑁. For any value of the parameters, the Hamiltonian conserves the total magnetization πœ‡ := βˆ‘π‘˜ π‘π‘˜:

[πœ‡, 𝐻] = 0. (5.131)

Conceptually, we will think of 𝐻0 as a "base" Hamiltonian, with perturbation 𝐻1 generating interactions, though we make no assumptions as to the smallness of 𝐻1. We will switch to an interaction picture which is comoving with the simple dynamics of 𝐻0. In this frame, the Hamiltonian ˜𝐻(𝑑) is given by

˜𝐻(𝑑) = 𝑒^{𝑖𝐻0𝑑} 𝐻1 𝑒^{βˆ’π‘–π»0𝑑} = βˆ‘_{π‘˜=1}^{𝑁} (π½π‘˜/2) (π‘‹π‘˜(𝑑)π‘‹π‘˜+1(𝑑) + π‘Œπ‘˜(𝑑)π‘Œπ‘˜+1(𝑑)), (5.132)

where

π‘‹π‘˜(𝑑) := 𝑒^{𝑖𝐻0𝑑} π‘‹π‘˜ 𝑒^{βˆ’π‘–π»0𝑑} = cos(πœ”π‘˜π‘‘)π‘‹π‘˜ βˆ’ sin(πœ”π‘˜π‘‘)π‘Œπ‘˜,
π‘Œπ‘˜(𝑑) := 𝑒^{𝑖𝐻0𝑑} π‘Œπ‘˜ 𝑒^{βˆ’π‘–π»0𝑑} = cos(πœ”π‘˜π‘‘)π‘Œπ‘˜ + sin(πœ”π‘˜π‘‘)π‘‹π‘˜ (5.133)

correspond to rotating the Pauli vectors about the 𝑧-axis with frequency πœ”π‘˜. We can express equation (5.132) in terms of the time independent π‘‹π‘˜ and π‘Œπ‘˜ of the original frame,

˜𝐻(𝑑) = βˆ‘_{π‘˜=1}^{𝑁} (π½π‘˜/2) {cos(Ξ”πœ”π‘˜π‘‘)(π‘‹π‘˜π‘‹π‘˜+1 + π‘Œπ‘˜π‘Œπ‘˜+1) + sin(Ξ”πœ”π‘˜π‘‘)(π‘‹π‘˜π‘Œπ‘˜+1 βˆ’ π‘Œπ‘˜π‘‹π‘˜+1)}, (5.134)

where Ξ”πœ”π‘˜ = πœ”π‘˜+1 βˆ’ πœ”π‘˜. We see that having different qubit frequencies πœ”π‘˜ on neighboring sites should give rise to a nontrivial time dependence in ˜𝐻. Another indication is gleaned from the commutator of 𝐻0 and 𝐻1:

[𝐻0, 𝐻1] = βˆ’π‘– βˆ‘π‘˜ (π½π‘˜/2)(π‘‹π‘˜π‘Œπ‘˜+1 βˆ’ π‘Œπ‘˜π‘‹π‘˜+1) Ξ”πœ”π‘˜. (5.135)

The time dependence in 𝐻𝐼 will be nontrivial when the commutator does not vanish, as occurs when Ξ”πœ”π‘˜ β‰  0. A simple choice is to set

π½π‘˜ = 𝐽, πœ”π‘˜ = (βˆ’1)^π‘˜ πœ”. (5.136)

That is, the qubit frequency alternates sign at each site, and the coupling is translation invariant. For simplicity, we consider only even numbers of qubits to avoid frequency-matching at π‘˜ = 𝑁. Plugging (5.136) into the expression for ˜𝐻 in (5.134),

˜𝐻(𝑑) = (𝐽/2)(cos(2πœ”π‘‘)𝐺1 + sin(2πœ”π‘‘)𝐺2), (5.137)

where

𝐺1 = βˆ‘_{π‘˜=1}^{𝑁} (π‘‹π‘˜π‘‹π‘˜+1 + π‘Œπ‘˜π‘Œπ‘˜+1),
𝐺2 = βˆ‘_{π‘˜=1}^{𝑁} (βˆ’1)^π‘˜ (π‘‹π‘˜π‘Œπ‘˜+1 βˆ’ π‘Œπ‘˜π‘‹π‘˜+1). (5.138)

As a final check, one can see that 𝐺1 and 𝐺2 do not commute with each other, yet they both commute with πœ‡. Thus, ˜𝐻(𝑑) given in (5.137) is our model system to investigate.

Assuming ˜𝐻 commutes with an observable πœ‡, to what degree does the MPF π‘ˆ2,π‘š conserve πœ‡? Since π‘ˆ2,π‘š is an algebraic combination of exponentials of ˜𝐻, π‘ˆ2,π‘š also commutes with πœ‡.
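These algebraic facts are easy to verify numerically for small 𝑁. The sketch below builds 𝐺1, 𝐺2, and πœ‡ for 𝑁 = 4 and checks that both commute with πœ‡ while [𝐺1, 𝐺2] β‰  0; sites are 0-indexed here, which at most flips the overall sign of 𝐺2 relative to (5.138).

import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0 + 0j, -1.0])
I2 = np.eye(2, dtype=complex)

def op(P, k, N):
    """Single-site Pauli P acting on site k (mod N) of an N-qubit chain."""
    mats = [I2] * N
    mats[k % N] = P
    return reduce(np.kron, mats)

N = 4  # parameter value from Figure 5.4
G1 = sum(op(X, k, N) @ op(X, k + 1, N) + op(Y, k, N) @ op(Y, k + 1, N)
         for k in range(N))
G2 = sum((-1) ** k * (op(X, k, N) @ op(Y, k + 1, N) - op(Y, k, N) @ op(X, k + 1, N))
         for k in range(N))
mu = sum(op(Z, k, N) for k in range(N))   # total magnetization

comm = lambda A, B: A @ B - B @ A
print(np.linalg.norm(comm(G1, mu)), np.linalg.norm(comm(G2, mu)))  # both ~0
print(np.linalg.norm(comm(G1, G2)))                                # nonzero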
If π‘ˆ2,π‘š were truly unitary, then the operator πœ‡ would evolve in the Heisenberg picture as πœ‡2,π‘š (𝑑) := π‘ˆβ€  2,π‘š (𝑑)πœ‡π‘ˆ2,π‘š (𝑑) = πœ‡ as it would under the exact propagator π‘ˆ. However, π‘ˆ2,π‘š is not necessarily unitary. π‘ˆβ€  2,π‘š (𝑑)π‘ˆ2,π‘š (𝑑) β‰  𝐼 This implies that conservation laws are only approximately conserved. (5.139) (5.140) πœ‡2,π‘š (𝑑) βˆ’ πœ‡ = Because π‘ˆ2,π‘š (𝑑) βˆ’ π‘ˆ (𝑑) ∈ 𝑂 (𝑑2π‘š+1), so is (cid:17) (cid:16) π‘ˆβ€  2,π‘š (𝑑)π‘ˆ2,π‘š (𝑑) βˆ’ 𝐼 (cid:17) (cid:16) π‘ˆβ€  2,π‘š (𝑑)π‘ˆ2,π‘š (𝑑) βˆ’ 𝐼 . πœ‡ β‰  0 (5.141) Figure 5.7.2 plots the deviations in the conserved πœ‡, βˆ₯πœ‡βˆ’ πœ‡2,π‘š (𝑑) βˆ₯, with respect to the simulation time. As the simulation time tends to zero, we see the expected power-law scaling, as evidence by the linear relationship on a log-log plot. For larger π‘š, the slope and hence power 𝑝 increases, corresponding to improved performance. We can extract the power as the slope of the line, and this is plotted in the right frame. Notice there are sudden dips in the error at specific simulation times, which tend to occur before reaching the power law scaling regime. This could be due to cancellation between two terms in an error series of comparable magnitude. Similar phenomenon occurs in several other contexts, such as the error from adiabatic evolution [135]. Conclusive identification of these phenomenon will require further study. Naively, we would expect 𝑝 = 2π‘š + 1, but here we actually get slightly better: 𝑝 = 2π‘š + 2. In fact, this scaling can be justified. The following argument, a variant of which can be found in [28], shows that the integrator is nearly unitary. Theorem 5.7.1. The deviation of π‘ˆ2,π‘š from being unitary obeys βˆ₯π‘ˆβ€  2,π‘š (𝑑)π‘ˆ2,π‘š (𝑑) βˆ’ 𝐼 βˆ₯ ∈ 𝑂 (𝑑2π‘š+2). 141 Figure 5.4 (left) Deviations from the conservation of magnetization πœ‡ under time-evolution by MPFs. Note that the order π‘š = 1 is simply a product formula evolution, which conserves πœ‡ exactly. For small simulation times, the expected power-law scaling is observed, with larger powers as π‘š increases. (right) The running power 𝑝(𝑑, 𝑑′) as defined in (5.128), with 𝑑′ = .3. Note the plateau at 2π‘š + 2, which indicates slightly better convergence than naively expected (𝑝 = 2π‘š + 1). This phenomenon generalizes to other systems and is formalized by Theorem 5.7.1. Parameter values: 𝑁 = 4, 𝐽 = 1, πœ” = 4. Proof. We suppress all function evaluations at 𝑑 when convenient. Let 𝐸 := π‘ˆ2,π‘š βˆ’ π‘ˆ, so that π‘ˆ2,π‘š = π‘ˆ + 𝐸. Then, using the unitarity of π‘ˆ and the fact that 𝐸 ∈ 𝑂 (𝑑2π‘š+1), where 2,π‘šπ‘ˆ2,π‘š = 𝐼 + 𝑁 + 𝑂 (𝑑4π‘š+2) π‘ˆβ€  𝑁 := π‘ˆβ€ πΈ + 𝐸 β€ π‘ˆ. (5.142) (5.143) Since 𝑁 ∈ 𝑂 (𝑑2π‘š+1), all of its derivatives up to degree 2π‘š vanish when evaluated at 𝑑 = 0. Hence, it suffices to show that 𝑁 (2π‘š+1) (0) = 0. (5.144) We can expand this derivative in terms of 𝐸 and π‘ˆ using the binomial theorem. When we evaluate at 𝑑 = 0, those terms with derivative less than degree 2π‘š + 1 in 𝐸 vanish. We are left with 𝑁 (2π‘š+1) (0) = 𝐸 †(2π‘š+1) (0)π‘ˆ (0) + π‘ˆβ€ (0)𝐸 (2π‘š+1) (0). (5.145) We have π‘ˆ (0) = π‘ˆβ€ (0) = 𝐼. Moreover, by the time-symmetric property of π‘ˆ and π‘ˆ2,π‘š, 𝐸 (𝑑) is also symmetric. Therefore 𝐸 †(2π‘š+1) (0) = 𝐸 (2π‘š+1) (βˆ’π‘‘) (cid:12) (cid:12) (cid:12)𝑑=0 = βˆ’πΈ (2π‘š+1) (0). (5.146) 142 Hence, the two terms in (5.145) cancel, yielding 𝑁 (2π‘š+1) (0) = 0. This completes the proof. 
In summary, though MPFs do not inherently preserve conservation laws, the error is due to nonunitarity in U_{2,m}. This can be bounded and reduced in a systematic way, either by decreasing the time step or by increasing the MPF order.

5.8 Discussion

In this chapter, we presented an algorithm for time dependent Hamiltonian simulation that uses multiproduct formulas to boost the accuracy compared to product formula simulation. Our algorithm inherits the commutator scaling of product formulas, giving a benefit over comparable methods such as the Dyson series approach. We provide a rigorous characterization of the simulation error as well as the query complexity. Numerical demonstrations validate the effectiveness of time dependent MPFs in achieving high-accuracy simulations.

Several avenues for future research are immediately apparent from this chapter. First, a proof of Conjecture 1 is highly desirable. Currently, we are investigating modifications of the clock space construction that keep the clock state width Οƒ fixed, allowing for completion of the argument. Numerical demonstrations beyond the simple examples here would be desirable for showing scaling to larger systems, and of course, eventually the method should be tested on actual quantum hardware. To summarize, time dependent MPF simulation is a new algorithm which complements many existing approaches and will likely perform well on systems with a large degree of locality.

5.9 Algorithm for Time Mesh

For completeness, I include the greedy algorithm for generating the time mesh used in the MPF simulation. I thank Alessandro Roggero for devising the approach presented below.

The mesh construction of Section 5.5, although theoretically sound, is not directly implementable, since it requires knowing the total number of steps while constructing each new point based on local data. To avoid this issue, as well as the restriction |Λ̇(Ο„)| ≀ K Λ²(Ο„), we seek a simple-to-use greedy algorithm.

One possibility is to use a direct approach which first selects a candidate number of steps r_try. Starting from r_try = 1, we then build recursively a sequence of times using the condition (see Eq. (5.83) in the main text)

    max_{t ∈ [t_{iβˆ’1}, t_i]} Ξ›_{2m}(t) (t_i βˆ’ t_{iβˆ’1}) ≀ (1/41) ( Ο΅ / (0.32 βˆ₯aβˆ₯_1 r) )^{1/(2m+1)},    (5.147)

with r = r_try. Starting from t_0 = 0 and looking for the largest t_i that satisfies the condition, we finally check whether the generated number of intervals is greater than r_try, in which case we increase r_try by one and repeat. When the algorithm stops at the optimal value r_opt, we have performed a total of r_opt(r_opt + 1)/2 non-linear optimization steps, each one requiring multiple evaluations of the left-hand side of Eq. (5.147). This can be very demanding when the left-hand side of Eq. (5.147) is expensive to evaluate and the optimal number of intervals is around half the upper bound

    r_max = ( 41 (t βˆ’ t_0) max_{Ο„ ∈ [t_0, t]} Ξ›_{2m}(Ο„) )^{(2m+1)/(2m)} ( 0.32 βˆ₯aβˆ₯_1 / Ο΅ )^{1/(2m)},    (5.148)

obtained by considering identical intervals and bounding Ξ›_{2m}(t) with its maximum value over the whole simulation interval [0, T]. In this case, finding an approximation to the optimal decomposition requires O(rΒ²_max) optimization steps, each one requiring multiple evaluations of the left-hand side of Eq. (5.147).
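A minimal sketch of this direct approach follows. The callable lam_max (returning max Ξ›_{2m} over a subinterval) and all other names are ours, introduced for illustration; this is a sketch under those assumptions, not the code used in the thesis:

    import numpy as np
    from scipy.optimize import brentq

    def direct_mesh(lam_max, T, m, eps, a_norm, r_cap=10_000):
        """Direct greedy mesh: try r_try = 1, 2, ... until the intervals built
        greedily from Eq. (5.147) cover [0, T] within r_try steps.
        lam_max(t0, t1) must return max of Lambda_{2m} over [t0, t1]."""
        def rhs(r):   # right-hand side of Eq. (5.147)
            return (eps / (0.32 * a_norm * r)) ** (1.0 / (2 * m + 1)) / 41.0
        for r_try in range(1, r_cap + 1):
            mesh, t = [0.0], 0.0
            while t < T and len(mesh) - 1 < r_try:
                # f is nondecreasing in ti, so its root is the largest valid t_i
                f = lambda ti, t0=t: lam_max(t0, ti) * (ti - t0) - rhs(r_try)
                t = T if f(T) <= 0 else brentq(f, t + 1e-12, T)
                mesh.append(t)
            if t >= T:
                return mesh
        raise RuntimeError("no admissible mesh found below r_cap")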
We now describe an alternative approach which determines r_opt within a factor of 2 and uses only r_max evaluations of max_{t∈[t_{iβˆ’1},t_i]} Ξ›_{2m}(t), plus an additional O(log(r_max) r_max) simple arithmetic operations. This procedure can be used to find a viable, and approximately optimal, decomposition of the time interval, or as a good starting point for finding the optimal one using a procedure such as the one described above.

The idea is to start by decomposing the interval [0, T] into r_max segments of equal length and storing the maximum of Ξ›_{2m}(t) in each segment in an array A of size r_max. We then introduce an additional array of the same size,

    L_m = [ max_{k ≀ m} A_k ] m T / r_max,    (5.149)

together with an additional set of vectors of the same size,

    R^{(n)}_m = [ max_{n β‰₯ k > m} A_k ] (n βˆ’ m) T / r_max,    (5.150)

with n an additional index between 1 and r_max. The first vector stores the left-hand side of Eq. (5.147) for the interval up to the m-th time, while the second stores the same information for the interval starting at the m-th time and ending at the n-th one.

The algorithm proceeds by splitting the time interval recursively into two parts so that the left-hand side of Eq. (5.147) takes (approximately) the same value on both halves (i.e., we are splitting the error equally between the two sides). At every iteration the number of intervals doubles and the right-hand side of Eq. (5.147) shrinks accordingly. We stop the procedure once Eq. (5.147) is satisfied on one interval (since we are then guaranteed it will be, approximately, on all others). The procedure will stop at some r_K, at which point we know the optimal value r_opt is in [⌈r_K/2βŒ‰, r_K]. The algorithm can then be described as follows; a sketch in code is given after the list.

1. Compute L_m for all m = 1, ..., r_max
2. Set n = r_max and r = 2
3. Compute the elements of R^{(n)}_m for all m = 1, ..., n βˆ’ 1
4. Initialize an auxiliary array D_m as D_m = L_m βˆ’ R^{(n)}_m
5. Find the least index k for which D_k > 0
6. If L_k is less than the right-hand side of Eq. (5.147) with the current value of r, set r_K = r and exit
7. If 2r β‰₯ r_max, set r_K = r_max and exit
8. Set r = 2r, n = k and repeat from step 3

Step 1 requires r_max operations, while Steps 3 and 4 cost n operations each. Since the number of iterations is bounded by log_2(r_max), their combined cost is bounded by 2 log_2(r_max) r_max. If we use binary search, Step 5 costs log_2(n) operations, so its total cost is at most log_2(r_max)Β² operations. From this analysis we see that Steps 3 and 4 are the most expensive and dominate the cost of the scheme. On exit we have r_K β‰ˆ r_opt together with the first interval [t_0, t_1]. The rest of the intervals can then be found keeping r = r_K fixed, with an additional O(r_max) operations.
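The following compact sketch implements steps 1-8 above under our own illustrative conventions (lam is a callable for Ξ›_{2m}(t), and the per-segment maxima are approximated by endpoint sampling); it returns r_K, which bounds r_opt within a factor of 2:

    import numpy as np

    def doubling_mesh_order(lam, T, m, eps, a_norm, r_max):
        dt = T / r_max
        # per-segment maxima of Lambda_{2m}, approximated at segment endpoints
        A = np.array([max(lam(i * dt), lam((i + 1) * dt)) for i in range(r_max)])

        def rhs(r):   # right-hand side of Eq. (5.147)
            return (eps / (0.32 * a_norm * r)) ** (1.0 / (2 * m + 1)) / 41.0

        L = np.maximum.accumulate(A) * np.arange(1, r_max + 1) * dt   # Step 1, Eq. (5.149)
        n, r = r_max, 2                                               # Step 2
        while True:
            # Step 3, Eq. (5.150): suffix maxima over A give R^{(n)}_m for m = 1..n
            suf = np.maximum.accumulate(A[:n][::-1])[::-1]
            R = np.append(suf[1:], 0.0) * (n - np.arange(1, n + 1)) * dt
            D = L[:n] - R                                             # Step 4
            k = int(np.argmax(D > 0)) + 1                             # Step 5 (least index)
            if L[k - 1] <= rhs(r):                                    # Step 6
                return r
            if 2 * r >= r_max:                                        # Step 7
                return r_max
            r, n = 2 * r, k                                           # Step 8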
CHAPTER 6
A SIMULATION MIXED BAG

Each of the prior chapters of this dissertation, starting with Chapter 3, focuses on a (mostly) self-contained theoretical project. However, I also contributed to several other projects during my PhD, especially during the earlier years. These projects are more concerned with applications on noisy, small-scale devices than with methods suited for relatively noiseless processors. While my contributions were often of a mathematical and theoretical flavor, the ultimate goal was to apply our algorithmic gadgets to existing quantum devices, emphasizing demonstration over rigorous proof. Each section of the present chapter corresponds to one of these projects, presented in the order they were completed, with the corresponding publication referenced within the first few paragraphs.

I will begin with a topic that technically falls outside of quantum computing proper, in which we propose investigating discrete scale invariance and anomalous symmetry breaking with trapped-ion quantum simulators. Moving back to quantum computing, I will present a heuristic quantum algorithm for preparing low-lying energy states known as the Projected Cooling Algorithm. Finally, I will discuss the aptly but oddly named Rodeo Algorithm, which is a randomized iterative phase estimation algorithm for determining eigenvalues and preparing eigenstates of an observable, provided that observable can be efficiently time evolved.

6.1 Discrete Scale Invariance on Trapped-Ion Systems

Closely related to the idea of simulation by quantum computation is the emulation of a desired Hamiltonian on a controllable and accessible quantum system, namely, analog Hamiltonian simulation. See Subsection 2.7.4 for a short background and references to more detailed introductions. In one of my first projects as a PhD student, I proposed, along with collaborators [82], a scheme for simulating scale invariant Hamiltonians on trapped-ion quantum simulators. As of writing, trapped ions are one of the most mature platforms for exquisite manipulation of quantum systems. We begin this topic by introducing some of the deep physics ideas motivating our study, such as anomalous symmetry breaking. Then we will discuss our proposal for investigating these concepts with certain Hamiltonians accessible to trapped-ion simulators. We will conclude with some of my findings on the self-similar dynamics which these systems exhibit for certain nearly-unbound states.

6.1.1 Scale Invariance and Quantum Anomalies

Understanding how a system changes with scale is a fundamental question in physics. For example: How do aspects of a microscopic system affect behavior at larger scales? Which of these aspects are relevant, and which get washed out by averaging or some other mechanism? In modern parlance, such questions are formalized in the theory of renormalization [91], which touches on essentially all aspects of modern physics, from the Standard Model to phase transitions. Closely related is the concept of emergent phenomena, which deals with how complex many-body systems acquire their properties, not through the microscopic properties themselves, but through the complex interactions among them. It is remarkable that the new descriptions produced by these microscopic interactions are even intelligible. Renormalization provides a partial answer, showing how the effect of changing scales can often be incorporated, to good approximation, by changes to couplings in the physical theory.

Another important concept in physics is symmetry. Symmetries provide such a powerful framework for understanding a physical system that little progress in physics could be made without them. Noether's theorem relates certain continuous symmetries to conservation laws, and in practice symmetries allow for a reduction of the computation and analysis needed to solve a problem.
While symmetries are by nature elegant and simplifying, the breaking of symmetry has proven an essential idea in modern physics. In this project, we explored how quantum anomalies, the breaking of a symmetry of a classical Hamiltonian by quantization, can be studied using a class of Hamiltonians that are implementable on existing trapped-ion simulators.

A symmetry is merely a change that, in fact, leads to no change at all. When a scale transformation is performed, yet the system looks the same at the new scale, we have scale invariance. This is an important idea in field theory and renormalization (fixed points and conformal field theory). Even more "mundane" standard quantum mechanical systems can exhibit scale invariance. Take, for example, the 1/rΒ² potential

    H = Ξ± pΒ²/2 + Ξ²/rΒ²    (6.1)

for Ξ± ∈ R+ and Ξ² ∈ R. Performing a canonical scale transformation r ↦ Ξ»r and p ↦ Ξ»^{βˆ’1} p for any Ξ» ∈ R+, we find that H(Ξ»r, p/Ξ») = Ξ»^{βˆ’2} H(r, p). Classically, this leads to a trivial change in the phase space dynamics whereby the trajectories are the same, but time rescales as λ²t. Quantum mechanically, the time evolution operator is rescaled by the same factor. Remarkably, even this simple example leads to a quantum anomaly for sufficiently attractive potentials [20, 62]. The full scale invariance is broken to a discrete one, giving rise to a tower of bound states approaching E β†’ 0 related by a geometric sequence. Thus, this simple Hamiltonian, which may arise from the interaction of an electromagnetic dipole and a point charge, already exhibits a surprising richness. What's more, the potential turns out to be intimately related to the Efimov effect [41, 100], a curious phenomenon in which bosons with short range interactions cannot form a two-body bound state, yet three bosons together have an infinite number of bound states. The binding of three particles, but not two, recalls the intriguing Borromean rings [34], which are linked such that cutting any single ring unlinks all the rings. Since the original discovery, Efimov physics has been linked to a broad class of phenomena, and theoretical interest has recently been growing [100].

6.1.2 Trapped-Ion Systems

One of the most developed hardware platforms for quantum information processing is the ion trap [19], first proposed as a platform for universal quantum computation shortly after Shor's groundbreaking factoring algorithm [30]. Atomic ions, typically ytterbium or one of several alkaline earth metals, are confined by electromagnetic fields and laser-cooled into a 1D chain. External control and readout of the system are mediated by laser pulses, and interactions between the ions are mediated, interestingly, through coupling with the phononic modes [101] of the chain. Trapped ions possess a number of internal states that can be used as the informational degree of freedom (e.g., qubit), including hyperfine, Zeeman, optical, and fine transitions. The impressive degree of control, all-to-all connectivity, high fidelity of operations, and long coherence times make trapped ions a promising platform for quantum hardware experiments and computations.

Although the technology is being pursued for general computation, these systems can also be operated as analog simulators. Our study is motivated by a particular long range interaction that can be engineered between the ions. Using two hyperfine "clock" states, the following effective spin Hamiltonian can be approximately generated [104].
βˆ‘οΈ βˆ‘οΈ 𝐻 = 𝑖, 𝑗 π‘˜=π‘₯,𝑦,𝑧 𝑖 𝑗 𝜎 (𝑖) 𝐽 π‘˜ π‘˜ 𝜎 ( 𝑗) π‘˜ + βˆ‘οΈ βˆ‘οΈ 𝑖 π‘˜=π‘₯,𝑦,𝑧 π΅π‘–π‘˜ 𝜎 (𝑖) π‘˜ (6.2) Here 𝜎 (𝑖) π‘˜ is the π‘˜th Pauli matrix on site for ion 𝑖. The parameters 𝐽 are related to the spatial coordinates of the ions and can be tuned separately for each π‘˜ using laser pulses. At long ranges, the couplings fall off as 𝐽 π‘˜ 𝑖 𝑗 β‰ˆ 𝐽 π‘˜ 0 |𝑖 βˆ’ 𝑗 |𝛼 (6.3) for chosen 𝛼 ∈ (0, 3) [104, 72] and couplings 𝐽 π‘˜ 0 . It is the long-range character (6.3) that we find particularly interesting. By factorizing (6.2) into and π‘₯𝑦 piece and a 𝑧 piece, and choosing appropriate coupling parameters, we can created effective kinetic-potential Hamiltonians with scale invariant properties. First, making the choice 𝑖 𝑗 = 𝐽 𝑦 𝐽π‘₯ 𝑖 𝑗 ≑ 𝐽𝑖 𝑗 , 𝐽 𝑧 𝑖 𝑗 = 𝑉𝑖 𝑗 , π΅π‘–π‘˜ = π›Ώπ‘˜ π‘§π‘ˆπ‘– (6.4) we find that the Hamiltonian factorizes as 𝐻 = 𝑇 + 𝑉 + π‘ˆ, where 𝑇 is a kinetic term, 𝑉 is a potential interaction term, and π‘ˆ is a on-site potential. With this parametrization, the full Hamiltonian becomes 𝐻 = βˆ‘οΈ 𝑖≠ 𝑗 𝐽𝑖 𝑗 (𝑋𝑖 𝑋 𝑗 + π‘Œπ‘–π‘Œπ‘— ) + βˆ‘οΈ 𝑖≠ 𝑗 𝑉𝑖 𝑗 𝑍𝑖 𝑍 𝑗 + βˆ‘οΈ π‘ˆπ‘– 𝑍𝑖. 𝑖 (6.5) The "particles" of the system are given by the spin state: we say that there is a particle at site 𝑗 if a 𝑍 𝑗 measurement yields +1, i.e., we equate particle occupancy with the binary value of the 𝑍 𝑗 observable. Notice that (cid:205)𝑖 𝑍𝑖 is conserved by 𝐻, hence particle number is conserved. We can interpret these particles as hard-core bosons, with raising and lowering operators given by 𝜎±, and strong repellant interactions at short distances preventing multiple occupancy of a site. Hard-core bosons might serve as a reasonable model for a collection of nuclei with even nucleon number, since nuclei are repulsive at low energies. 149 We want to consider the Hamiltonian (6.5) as a discretization of some continuous-variable system with scale invariance. As such, we will need to consider low energies where wavelengths are large compared to the lattice. Considering only the kinetic term 𝑇 alone, with asymptotic form (6.3) of the hopping coefficient, and neglecting boundary effects by assuming a large chain, the eigenenergies are given by 𝐸 ( 𝑝) = 4𝐽0 ∞ βˆ‘οΈ π‘š=1 1 π‘šπ›Ό cos( π‘π‘š). Performing a low momentum expansion, we find that for 𝛼 < 3 𝐸 ( 𝑝) = 2𝐽0 sin(π›Όπœ‹/2)Ξ“(1 βˆ’ 𝛼)| 𝑝|π›Όβˆ’1 + 𝑂 ( 𝑝2). (6.6) (6.7) Here is the first part of our scale invariant Hamiltonian. Now we consider the potential part. We fix one of our bosons ("spin up") on one of the sites, say site 0, with a very deep on-site potential π‘ˆ0. We then add a second boson and tune the potential interaction 𝑉𝑖 𝑗 such that, at large distances, 𝑉𝑖 𝑗 ∼ 𝑉0 |𝑖 βˆ’ 𝑗 |π›Όβˆ’1 . Then, we see that the total Hamiltonian 𝐻 = 2𝐽0 sin(π›Όπœ‹/2)Ξ“(1 βˆ’ 𝛼)| 𝑝|π›Όβˆ’1 + 𝑉0 |π‘Ÿ |π›Όβˆ’1 + 𝑂 ( 𝑝2) (6.8) (6.9) is scale invariant at low momenta. With choice of parameters 𝐽0 < 0, 𝑉0 < 0, this Hamiltonian has an attractive potential. In our paper’s Supplemental Material [82], we provide the limit cycle boundary 𝐸 = 0 state as well as the discrete scaling factor relating the bound state energies near this threshold. 6.1.3 Self-Similar Dynamics In the previous sections, we have discussed discrete scale invariance in the bound state spectra of certain Hamiltonians, and how these could be constructed on trapped-ion systems. Discrete scale invariance implies self-similarity as one changes their "field of view" by certain discrete amounts. 
6.1.3 Self-Similar Dynamics

In the previous sections, we have discussed discrete scale invariance in the bound state spectra of certain Hamiltonians, and how these could be constructed on trapped-ion systems. Discrete scale invariance implies self-similarity as one changes one's "field of view" by certain discrete amounts. Besides providing equilibrium information via the Gibbs state, the bound state spectrum impacts the closed-system dynamics. Can the self-similarity of the bound state spectrum manifest in the dynamical evolution of a chosen initial state? We will show that the answer is yes, though the initial state must be chosen carefully and may be difficult to prepare.

The idea is to prepare a weakly bound state that overlaps many of the eigenstates just below the E = 0 threshold. Take as initial state

    |ψ(0)⟩ = \sum_{n=0}^{∞} c_n |n⟩,    (6.10)

where n indexes the bound states of energy E_n < 0, with some chosen E_0 as the lowest energy state. These energies are related in geometric sequence as E_n = E_0/Ξ»^n for some Ξ» > 1. The time evolved state is given by

    |ψ(t)⟩ = \sum_{n=0}^{∞} c_n e^{βˆ’iE_n t} |n⟩ = \sum_{n=0}^{∞} c_n e^{βˆ’iE_0 t/Ξ»^n} |n⟩.    (6.11)

Suppose now we rescale t by Ξ». Because E_{nβˆ’1} = E_n Ξ», this is simply \sum_n c_n e^{βˆ’iE_{nβˆ’1} t} |n⟩. Without relating the coefficients c_n, there isn't much more that can be said. Imposing the condition c_n = Ξ³ c_{nβˆ’1}, however, we find

    |ψ(Ξ»t)⟩ = Ξ³ \sum_{n=1}^{∞} c_{nβˆ’1} e^{βˆ’iE_{nβˆ’1} t} |n⟩ + c_0 e^{βˆ’iE_0 Ξ»t} |0⟩ = Ξ³ U_+ |ψ(t)⟩ + c_0 e^{βˆ’iE_0 Ξ»t} |0⟩,    (6.12)

where U_+ |n⟩ := |n + 1⟩. Besides the piece proportional to |0⟩, we have a representation of the initial state |ψ(0)⟩ on the subspace span{|n⟩}_{n=1}^{∞}. The scaling factor Ξ³ ∈ C must satisfy |Ξ³| < 1 for normalization purposes, and in fact |c_0|Β² + |Ξ³|Β² = 1. Already, we see some manifestation of self-similarity.

Some feasible measurement must be performed to extract information about the state (6.12). Let's consider the overlap observable W(t) := ⟨ψ(0)|ψ(t)⟩, which might be obtained with two copies of the initial state, one of them time-evolved, then performing some version of the SWAP test. We find

    W(t) = (1 βˆ’ |Ξ³|Β²) \sum_{n=0}^{∞} |Ξ³|^{2n} e^{βˆ’iE_0 t/Ξ»^n}.    (6.13)

Up to normalization and nomenclature, the real part of this is nothing more than a Weierstrass function, defined in the original paper as

    \sum_{n=0}^{∞} a^n cos(b^n Ο€ x).    (6.14)

In particular, we make the identifications

    a = |Ξ³|Β²,    b = Ξ»^{βˆ’1},    x = E_0 t/Ο€.    (6.15)

For certain values of the parameters a, b the function (6.14) is a fractal, being continuous everywhere but differentiable nowhere. This is not the case we are currently in; our evolution is mathematically smooth. As n increases in the series, the frequencies b^n decrease exponentially, leading to an utterly smooth behavior. This is not surprising, since all of the bound eigenstates present in our initial state have frequencies within [E_0, 0].

Despite this, we can recover an approximate self-similarity at long time scales. Consider Ξ³ = 1 βˆ’ Ξ΄, with Ξ΄ > 0 taken very small. Then, for values N ∈ Z+ such that nΞ΄ β‰ͺ 1 for all n < N, the first N levels have approximately equal weight in the superposition. Shifting t ↦ Ξ»t then leaves the function approximately unchanged, up to a small high-frequency component. For these N levels, then, we get

    \sum_{n=0}^{N} cos(Ξ»^{βˆ’n} E_0 t),    (6.16)

exhibiting self-similar behavior. Moreover, if we rescale t ↦ Ξ»^N t and reindex n ↦ N βˆ’ n, we get something which looks like a truncation of a Weierstrass function:

    \sum_{n=0}^{N} cos(Ξ»^n E_0 t).    (6.17)

In short, by zooming out to larger time scales, the short time scales appear fractal.
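A quick way to see the approximate self-similarity is to evaluate the overlap (6.13) numerically and compare W(t) with W(Ξ»t). The sketch below does this for illustrative parameter values of our choosing (all names are ours); the discrepancy shrinks as |Ξ³| β†’ 1:

    import numpy as np

    lam, gamma, E0 = 2.0, 0.95, -1.0         # illustrative parameters
    n = np.arange(60)                        # number of retained bound states
    En = E0 / lam**n                         # geometric tower E_n = E_0 / lam^n
    p = (1 - gamma**2) * gamma**(2 * n)      # weights |c_n|^2 with c_n = gamma c_{n-1}

    def W(t):
        # overlap of Eq. (6.13), truncated to the retained states
        return np.sum(p[:, None] * np.exp(-1j * En[:, None] * t[None, :]), axis=0)

    t = np.logspace(0, 4, 2000)
    # W(lam t) tracks W(t) up to a boundary term that vanishes as |gamma| -> 1
    print(np.max(np.abs(W(lam * t).real - W(t).real)))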
The Weierstrass function is famous as an example of a continuous function that is differentiable nowhere. It is also a fractal, with fractal dimension

    D = 2 + log a / log b = 2 ( 1 βˆ’ log|Ξ³| / log Ξ» ).    (6.18)

In our case D = 2, so our curve is space filling. Different choices of the coefficients c_n, in which lower frequencies have higher amplitude, could produce different values of a and therefore different fractal dimensions.

6.1.4 Discussion

In this section, we showed how trapped-ion quantum simulators can be used to investigate Hamiltonians with anomalous symmetry breaking due to quantization, with applications to Efimov physics. We characterized the nature of the discrete scale invariance of the bound state spectrum for a family of scale-invariant Hamiltonians. Finally, we indicated how self-similar dynamics, reminiscent of the Weierstrass function, can be obtained through particular state preparation and measurement.

Unfortunately, technological challenges remain for implementing the low energy Hamiltonian (6.9), particularly due to the need for long wavelengths relative to the ion spacing. As of writing, trapped-ion quantum computers are of size at most 60 ions [50, 140], which appears small enough to introduce unwanted boundary effects. While there are no fundamental limitations to the size of the ion trap, implementing interactions between the ions becomes increasingly challenging and expensive as the size scales up [98]. We hope these challenges may be resolved by future improvements in technology or by clever implementation of our approach.

6.2 Projected Cooling Algorithm

Having considered analog quantum simulators, we return to digital quantum computing, but still with an analog flavor. Absent error correction and logical qubits, current quantum computers mainly reside in the regime of Noisy Intermediate Scale Quantum (NISQ) devices. Such devices implement imperfect operations and are subject to decoherence, severely restricting the maximum computation time. For algorithm developers looking to find near-term applications of quantum computers, it is important that the proposed algorithms be relatively noise insensitive.

In this section, we will discuss the Projected Cooling Algorithm [83], which prepares ground states, or more generally low-lying states, of a kinetic-potential Hamiltonian with localized interactions and a translation-invariant kinetic term. Such systems are common in nuclear physics, as nuclear interactions are spatially local and often only a few deeply bound states exist. I will discuss the main idea of the algorithm, followed by an application to the Dirac delta potential.

6.2.1 Background

Preparing ground states of a Hamiltonian is valuable for both scientific computing and mathematical optimization. This is known to be a hard problem in general [78], and even determining basic properties, such as the existence of a spectral gap, is undecidable [35]. However, the importance of determining ground state properties is so great that generic hardness is not a deterrent to trying. Moreover, generic hardness of the ground state problem does not necessarily imply hardness among the instances of "physical interest." Several heuristic or partial algorithms exist for preparing ground states, or low-lying eigenstates.
One of these is the adiabatic algorithm, in which the ground state of a simple Hamiltonian H_0 is prepared, then evolved according to a slowly varying time dependent H(t) which takes H_0 to the Hamiltonian H of interest. The premise of the algorithm, as the name suggests, rests on adiabaticity: provided that the evolution is "sufficiently slow," the state |ψ(t)⟩ will remain in the ground state of H(t) at every time t. The slowness required is related to the inverse gap between the ground state and the next excited state at any given time. Unfortunately, it can be challenging to ensure that the gap does not decrease exponentially with system size, despite clever choices of initial Hamiltonian and trajectory H(t) in model space. As another example, we briefly mention the Variational Quantum Eigensolver (VQE) [22], an optimization approach to finding the ground state whereby circuit parameters are optimized to reduce the energy of the state. Unfortunately, variational methods are known to suffer from barren plateaus [95], which provide a poor optimization landscape for gradient-based methods to succeed.

A distinct class of approaches, in which we include projected cooling, may be characterized as measurement-based. Simple projective measurements, as the name suggests, project onto a state, or subspace, determined by the measurement outcome. By performing a measurement compatible with the energy basis, it is possible to post-select on measuring a result compatible with the ground state. The main limitation arises when the probability of a successful measurement is vanishingly small. Indeed, for a randomly prepared initial state, the expected overlap with the ground state should decrease exponentially with system size. A natural objection to this grim outlook is the existence of clever ansatzes, such as Hartree-Fock states, developed in the domain sciences long before quantum computing, which can be expected to have much higher overlap than a randomly chosen state.

The canonical and highly general measurement-based scheme for state preparation is the suite of phase estimation algorithms, to be discussed in greater detail in the next section. At a high level, phase estimation is nothing more than a projective measurement in the eigenbasis of a chosen unitary U. When U is the time-evolution operator for a time independent H, this corresponds to a measurement of energy states. Given an initial state |ψ_0⟩ with fidelity p_gs with respect to the ground space of H, a phase estimation protocol will produce the (approximate) ground state with (approximate) probability p_gs. Thus, repeating O(1/p_gs) times is sufficient to produce the ground state.

The generality of phase estimation suggests a tradeoff in the form of computational difficulty. Phase estimation algorithms require at least one auxiliary qubit to use as a control register for a controlled time evolution. Moreover, these operations do not necessarily respect hardware connectivity, as the control register must talk to all qubits in the main system. For present hardware, these demands can be prohibitive.

6.2.2 Projected Cooling

To avoid the demands imposed by phase estimation, focusing on a more specific class of Hamiltonians is desirable. We take inspiration from nuclear physics. Nucleons exhibit strong, short-range interactions, and thus their bound states are also spatially localized. Many nuclei have only a few deeply bound states, and possibly many more shallow bound states.
For example, the simple deuteron is known to have only one bound state. When nucleons are freed from the short-range potential, they enter scattering, or continuum, states that evolve according to a translationally-invariant kinetic Hamiltonian. These scattering states tend to disperse away from the interaction region, while the bound states naturally remain localized in the interaction region.

The above discussion suggests a simple criterion for distinguishing bound and unbound states for such systems: long-term localization. Evolved under the nuclear Hamiltonian, bound states will stay localized in the interaction region, while unbound states will disperse away. A measurement of the nuclear configuration (i.e., a many-particle position measurement) should distinguish the two cases. To retain coherence, a binary measurement should be made which only records whether particles are found outside the range of interaction. Assuming none are, we can expect the system to have lower energy content, since the particle distance from the interaction region is correlated with energy.

Such reasoning is the basis of the Projected Cooling Algorithm. We begin by preparing an initial state localized with respect to the interaction. We then perform a time evolution according to the natural Hamiltonian of the system for some time T. Finally, a binary measurement is performed which asks whether or not particles are found outside the region in which the potential V has greatest support. It is reasonable to expect the initial state to have higher overlap with the ground state, barring special symmetry, than a generic state far from the interaction region. This suggests simple yet effective ansatzes are available for these localized systems. The simulation time T required depends, in principle, on the nature of the potential. Resonances will tend to weaken the effectiveness of the algorithm by keeping continuum states around longer. At the same time, T cannot be chosen so large that the dispersive component of the wavefunction, traveling at finite velocity, reaches the edge of the finite simulation box and reflects back. The final measurement M is in the configuration basis, which can be made simple by choosing it as our computational basis. Altogether, no auxiliary registers or additional controlled operations are necessary. The time evolution itself can be expected, generally, to be the most difficult subroutine of the algorithm.

6.2.3 Application to Approximate 1D Dirac Delta Potential Well

We demonstrate the Projected Cooling procedure with a 1D particle in an attractive, localized, deep potential square well. The Hamiltonian is

    H = pΒ²/2m + V,    (6.19)

where

    V(x) = βˆ’V_0 for |x| < L, and V(x) = 0 for |x| β‰₯ L,    (6.20)

with V_0, L > 0. We take a discretization of this onto 2N + 1 sites labeled n = βˆ’N, ..., N. For this simple example, we are not concerned so much with the interior details of the potential as with its localization. As such, we will assume only the n = 0 point is located within the potential; that is, the lattice spacing a is larger than L. The discretized potential then becomes simply

    V̂ = βˆ’V_0 |0⟩⟨0|.    (6.21)

For the kinetic term, we take a symmetric finite difference approximation. That is,

    KΜ‚ = (1/2maΒ²) (2I βˆ’ U_+ βˆ’ U_βˆ’),    (6.22)

where U_+ = U†_βˆ’ is the right-shifting unitary operation U_+ |n⟩ = |n + 1⟩.
The discretized Hamiltonian Ĥ = K̂ + V̂ has a bound state spectrum that can be analyzed using an ansatz borrowed from the continuum Dirac delta potential. Neglecting the finite boundary, we take an ansatz bound state of the form

    |κ⟩ = \sum_{n=βˆ’βˆž}^{∞} e^{βˆ’ΞΊa|n|} |n⟩.    (6.23)

Applying Ĥ to |κ⟩, we see that |κ⟩ is an eigenstate with energy E provided the following two conditions are satisfied:

    E maΒ² = 1 βˆ’ cosh(ΞΊa),    1 βˆ’ e^{βˆ’ΞΊa} = maΒ²(V_0 + E).    (6.24)

This pair of transcendental equations admits a unique solution with ΞΊ > 0 and E < 0. Thus, provided our cutoff N is sufficiently large, we might expect to find such a state following Projected Cooling. For our numerical example, we work in lattice units where a = 1, and we will also take m = V_0 = 1. The full discretized Hamiltonian is then

    Ĥ = (I βˆ’ Β½U_+ βˆ’ Β½U_βˆ’) βˆ’ |0⟩⟨0|.    (6.25)

We now need to discuss a mapping of this Hamiltonian onto a set of qubits. One natural choice is to encode position into the computational basis. This has the advantage of requiring O(log N) qubits to represent the system, which is valuable because Projected Cooling requires a large enough region for the unbound states to disperse into. In contrast, we could directly represent each lattice site with a qubit. This leads to a less favorable O(N) scaling, but the advantage is that multiple particles could be allowed, with hard-core repulsion preventing multiple occupancy. Moreover, the operations U_+, U_βˆ’ are much simpler to implement in this "unary" encoding. The case of unary encoding is discussed in the paper (Model 1A) [83]. Here, we supplement this work with a discussion of the binary encoding approach.

The number of qubits needed to represent the system scales as n ∈ O(log N). Preparing Gaussian wavepackets on a quantum register is a well-studied problem [110, 80, 79], and we assume this can be done efficiently. For simulating Ĥ, several methods could be employed. Trotterizing along the two terms K̂ and V̂, one could simulate V̂ using a C^{nβˆ’1}(R_z) gate, and K̂ by diagonalizing via the Quantum Fourier Transform. Alternatively, we observe that Ĥ can be expressed as a linear combination of unitaries (LCU), hence it is amenable to simulation by qubitization. The number of required queries to the block-encoding "select" SEL and "prepare" PREP circuits scales as O(T + log 1/Ο΅) for simulation time T and accuracy Ο΅. The PREP circuit acts on only two qubits, because once the identity terms are removed, there are only 3 unitary pieces. The SEL cost is dominated by the controlled incrementer, which requires O(nΒ²) = O(logΒ² N) CNOT gates [85]. The potential part of SEL requires the reflection operator I βˆ’ 2|0^{βŠ—n}⟩⟨0^{βŠ—n}|, controlled on the "prepare" register. This requires only O(n) gates. The number of auxiliary qubits for the LCU is 2, and does not change with system size.

Once the time evolution is performed, a measurement must be made to determine whether the particle is found significantly outside the region of interaction. A simple computational basis (i.e., position) measurement won't do, since this will destroy the state of interest. Instead, a binary measurement must be done which asks whether the position of the particle is outside a range R determined by the locality of the state. This can be done with a comparator circuit [115], which requires O(n) gates and a couple of auxiliary qubits to be measured.
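As a sanity check on (6.24) and (6.25), the following sketch (our own illustrative script, not the simulations of [83]) solves the transcendental conditions in lattice units and compares against exact diagonalization of the discretized Hamiltonian; both should give E β‰ˆ βˆ’0.414:

    import numpy as np
    from scipy.optimize import brentq

    # Lattice units a = m = V0 = 1, matching Eq. (6.25).
    def energy(kappa):       # first condition of Eq. (6.24): E = 1 - cosh(kappa)
        return 1.0 - np.cosh(kappa)

    def residual(kappa):     # second condition: 1 - e^{-kappa} = V0 + E
        return (1.0 - np.exp(-kappa)) - (1.0 + energy(kappa))

    kappa = brentq(residual, 1e-6, 5.0)
    print(kappa, energy(kappa))   # kappa = arcsinh(1) ~ 0.881, E = 1 - sqrt(2) ~ -0.414

    # Cross-check: exact diagonalization of Eq. (6.25) on sites n = -N..N.
    N = 200
    dim = 2 * N + 1
    H = np.eye(dim) - 0.5 * (np.eye(dim, k=1) + np.eye(dim, k=-1))
    H[N, N] -= 1.0                # the -|0><0| potential at the center site
    print(np.linalg.eigvalsh(H)[0])   # approaches the ansatz energy for large N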
6.2.4 Discussion

Here we considered the Projected Cooling algorithm for preparing ground states using the dispersion of unbound states away from the interaction region. While we motivated our approach using nuclear systems, the method can be applied to any system with attractive, localized potential interactions and a translation-invariant kinetic energy. Following our initial work, the method was used to investigate the transverse Ising model [58], with success for models which exhibited dispersion rather than localization.

Like other measurement-based state preparation algorithms, success is contingent upon having sufficient overlap with the bound state of interest. For localized interactions, an effective ansatz may correspond to, for example, a Gaussian packet of width on the order of the interaction length. Determining the required T, Ο΅ and N analytically to ensure good fidelity with the ground state is beyond our scope. In our paper, we mainly employ classical numerical simulations that suggest good convergence to the desired state, and refer the reader to these results [83]. However, a more careful theoretical treatment is still to be desired. This might be done using a scattering theory treatment, which would likely elucidate the role of resonances; these should propagate slowly and thus hinder the method's effectiveness.

6.3 Rodeo Algorithm

This final section of the chapter concerns a new addition to the suite of iterative Quantum Phase Estimation (IQPE) protocols known as the Rodeo Algorithm. The original algorithm was introduced in [29], where it was tested on the Heisenberg model using classical simulations. Subsequently, an actual quantum computation on the IBM quantum computer Casablanca was performed for a simple single-qubit Hamiltonian [106]. In more recent work [11], the Rodeo Algorithm was tested on two-qubit Hamiltonians and used to benchmark a protocol for efficiently compiling complex sequences of controlled operations. Rather than cover all of these works in detail, here I will describe the principles of the algorithm, its relation to iterative phase estimation, and my contributions to the theoretical characterization of the method.

6.3.1 Phase Kickback and Phase Estimation

Needless to say, unitary operations play a vital role in quantum mechanics, particularly quantum computing. A fundamental computational task one might be interested in is estimating the eigenvalues of a unitary U. Given a quantum state |ψ⟩ on the Hilbert space of U, one way to learn the eigenvalues of U is through a projective measurement in the eigenbasis. This has the added benefit of approximately preparing a corresponding eigenstate.

Phase estimation algorithms accomplish precisely this goal. For readers familiar with basic optics, a satisfying analogy exists between phase estimation and Mach-Zehnder interferometers [67]. In fact, the analogy is so close that it is more accurate to say they share the same working principle: measuring a phase shift via interference.

Phase estimation algorithms have been around for as long as quantum computing has garnered significant attention. Shor's famous algorithm for factoring [118], and the more general problem of finding discrete logarithms, rests on phase estimation techniques, as does the HHL algorithm for solving linear systems [64]. Standard QPE, based on the Quantum Fourier Transform, is described in detail in Nielsen and Chuang's well-known text [101].
Kitaev supplied perhaps the first iterative QPE protocol [77], and several improvements to the method have been made since [125, 67, 102]. Adaptive protocols allow for improved phase measurement schemes based on prior measurements [136]. Often, iterative QPE refers simply to an "iteratization" of the standard QFT circuit obtained by pushing all controls past the measurements [59].

6.3.2 Basic Circuit

Figure 6.1 exhibits the fundamental iteration of the Rodeo Algorithm, which we term a "cycle." The upper wire represents a single auxiliary qubit, on which act single qubit gates such as the Hadamard H and the parametrized phase gate

    S(Ξ±) := ( 1   0
              0   e^{iΞ±} ).    (6.26)

The lower register is the main register of interest, where U(t) = e^{βˆ’iOt} is the time evolution operator for some observable O of interest. The parameter E is called the target energy, and is set by the user.

Figure 6.1 Elementary iteration ("cycle") of the Rodeo Algorithm: a Hadamard on the auxiliary qubit, followed by a controlled U(t) on the main register |ψ_0⟩ and the phase gate S(Et), then a final Hadamard before measurement. Times t are randomly sampled from a normal distribution of center 0 and width Ξ“^{βˆ’1}. Performing M cycles acts as a band-pass filter, only allowing eigenvalues within a range centered around E with width Ξ“. Eigenvalues outside this interval are exponentially suppressed in the number of cycles M. "Success" is conditioned on all M measurement outcomes being 0.

Meanwhile, t is a random variable sampled from some distribution ρ centered about zero. We take t to be normally distributed with variance 1/Γ²:

    ρ(t) = (Ξ“/√(2Ο€)) e^{βˆ’(Ξ“t)Β²/2}.    (6.27)

Other reasonable choices exist, but the normal distribution is simple to analyze and performs well enough. Some choices, such as the uniform distribution over [βˆ’Ξ“^{βˆ’1}, Ξ“^{βˆ’1}], are less favorable, as they have poorer filtering properties resulting from the distribution having long tails in Fourier (energy) space.

Roughly speaking, the parameters E, Ξ“ define an interval [E βˆ’ Ξ“, E + Ξ“] for which the Rodeo measurement protocol asks the question: Is there an eigenvalue of O located in [E βˆ’ Ξ“, E + Ξ“]? Like all quantum measurements, this will depend on the state |ψ_0⟩ prepared, and a successful detection will occur with frequency given by the Born rule. As Ξ“ shrinks, longer time evolutions U(t) will occur, and this is expensive. Taking the cost to increase linearly in t, which saturates lower bounds from the no-fast-forwarding theorem [14], the cost scales inversely with the accuracy. This is in accord with the Heisenberg limit for quantum parameter estimation.

The role of the phase gate S(Et) is to shift the spectrum of O by βˆ’E. This is because a phase gate on the control qubit acts equivalently to a controlled multiplication of the system by the global phase e^{iΞ±}. Applying this identity to the circuit of Figure 6.1 and combining the controlled gates, the cycle is equivalent to one whose controlled operation is

    U_E(t) = e^{iEt} U(t) = e^{βˆ’i(O βˆ’ EI)t}.    (6.28)

For eigenvalues Ξ» of O within O(Ξ“) of E, e^{βˆ’i(Ξ»βˆ’E)t} is relatively unaffected by variations in t of order O(Ξ“^{βˆ’1}). Meanwhile, other eigenvalues are shifted dramatically as t varies randomly. This we analogize as the "bucking" in the Rodeo Algorithm, whereby far-away eigenvalues are likely to be kicked off. This will be elucidated more concretely in the subsequent analysis. As far as the author is aware, our algorithm is the first of the phase estimation family to employ random parameters and shift the Hamiltonian in this fashion.
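The shifted propagator (6.28) and its filtering action are simple to check with a direct statevector calculation. In this sketch (a toy diagonal observable; all names are ours), the ancilla-0 branch of one Hadamard-test cycle weights each eigencomponent by cosΒ²((Ξ± βˆ’ E)t/2), anticipating the analysis of the next subsection:

    import numpy as np
    from scipy.linalg import expm

    O = np.diag([-3.0, -1.0, 1.0, 3.0])        # toy diagonal observable
    rng = np.random.default_rng(7)
    psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
    psi0 /= np.linalg.norm(psi0)

    E, t = 1.0, 0.7                            # target energy, one sampled time
    UE = expm(-1j * (O - E * np.eye(4)) * t)   # shifted propagator, Eq. (6.28)

    # ancilla-0 branch of the Hadamard test: (I + U_E)/2 applied to |psi0>
    branch0 = 0.5 * (np.eye(4) + UE) @ psi0
    pred = np.abs(psi0)**2 * np.cos((np.diag(O) - E) * t / 2)**2
    print(np.abs(branch0)**2)                  # equals pred: the cos^2 filter
    print(pred)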
Even if our initial state |ψ_0⟩ has small overlap with the eigenstates we are interested in, successive iterations of this circuit, with E and Ξ“ trained on the energy range of interest, allow us to amplify these states and determine whether our operator O has some eigenvalue in the range set by these parameters. We do require |ψ_0⟩ to have some overlap with these eigenstates. However, the threshold for detection can be made increasingly small with repeated cycles.

6.3.3 A Single Buck of the Bull

Let |ψ_0⟩ be the initial state of the main register, as in Figure 6.1. We decompose |ψ_0⟩ into its spectral components in the following way:

    |ψ_0⟩ = \sum_{α∈σ(O)} c_Ξ± |α⟩.    (6.29)

Here, Οƒ(O) is the spectrum of O, i.e., the set of (real) eigenvalues, c_Ξ± ∈ C is the component of |ψ_0⟩ along the eigenspace of Ξ±, and |α⟩ is the normalized projection onto this subspace:

    |α⟩ := P_Ξ± |ψ⟩ / √(⟨ψ| P_Ξ± |ψ⟩),    c_Ξ± := βŸ¨Ξ±| P_Ξ± |ψ⟩ = √(⟨ψ| P_Ξ± |ψ⟩).    (6.30)

We will suppress the notation Οƒ(O) in the sum (6.29) from now on. The output of the rodeo cycle, before measurement, is given by the well-known result for the Hadamard test:

    |0⟩|ψ_0⟩ β†’ |0⟩ ( (I + U_E(t))/2 ) |ψ_0⟩ + |1⟩ ( (I βˆ’ U_E(t))/2 ) |ψ_0⟩.    (6.31)

Expressing this in terms of the eigenbasis |α⟩ gives

    |0⟩|ψ_0⟩ β†’ |0⟩ βŠ— \sum_Ξ± c_Ξ± e^{βˆ’iΔ_Ξ± t/2} cos(Δ_Ξ± t/2) |α⟩ + |1⟩ βŠ— \sum_Ξ± c_Ξ± i e^{βˆ’iΔ_Ξ± t/2} sin(Δ_Ξ± t/2) |α⟩,    (6.32)

where we have defined Δ_Ξ± = Ξ± βˆ’ E. If the measurement succeeds (outcome 0), we continue on with the iteration of the circuit and preserve the state of the main register, while upon failure we either halt or discard the result at the end of the computation. Assuming success, the new state vector |Οˆβ€²βŸ© is given by

    |Οˆβ€²βŸ© = \sum_Ξ± cβ€²_Ξ± |α⟩,    cβ€²_Ξ± = c_Ξ± e^{βˆ’iΔ_Ξ± t/2} cos(Δ_Ξ± t/2),    (6.33)

up to normalization. The probability pβ€²_Ξ± = |cβ€²_Ξ±|Β² is slightly more illuminating:

    pβ€²_Ξ± = p_Ξ± cosΒ²(Δ_Ξ± t/2).    (6.34)

Observe that a relative enhancement of the probability amplitudes p_Ξ± occurs for cosΒ²(Δ_Ξ± t/2) β‰ˆ 1. This certainly occurs for Δ_Ξ± t β‰ˆ 0, i.e., for Ξ± near the target energy, but also for Δ_Ξ± t = 2Ο€k with k ∈ Z. These peak locations for k β‰  0 fluctuate with t. With many cycles, it is unlikely that any eigenvalue far from E will stay on a maximum across multiple trials, as we will see shortly. The probabilities of success, P_0, and failure, P_1, are computed from the squared norm of each term in equation (6.32):

    P_0 = \sum_Ξ± p_Ξ± cosΒ²(Δ_Ξ± t/2),    P_1 = \sum_Ξ± p_Ξ± sinΒ²(Δ_Ξ± t/2) = 1 βˆ’ P_0.    (6.35)

6.3.4 Multiple Bucks: the Full Rodeo

The extension of the previous analysis from a single run to M runs through the basic circuit of Figure 6.1 is relatively straightforward. Let (t_i)_{i=1}^{M} be the time samples for each cycle. Then, with repeated application of equation (6.34), the probability amplitude p^{(M)}_Ξ± of Ξ± after M runs, conditioned on the circuit succeeding, is given by

    p^{(M)}_Ξ± = p_Ξ± \prod_{i=1}^{M} cosΒ²(Δ_Ξ± t_i/2).    (6.36)

The success probability P^{(M)}_0 may be factorized as

    P^{(M)}_0 = \prod_{k=1}^{M} p_k,    (6.37)

where p_k is the probability of measuring zero on the k-th measurement conditioned on measuring zeros in every prior measurement. Let |Ξ¨_{kβˆ’1}⟩ = |0⟩ βŠ— \sum_Ξ± c^{(kβˆ’1)}_Ξ± |α⟩ be the state of the entire register after k βˆ’ 1 successful measurements and directly before the k-th measurement.
From Born's rule,

    p_k = βˆ₯⟨0|Ξ¨_{kβˆ’1}⟩βˆ₯Β² / ⟨Ψ_{kβˆ’1}|Ξ¨_{kβˆ’1}⟩,    (6.38)

where ⟨0| ≑ ⟨0| βŠ— I. Adapting equation (6.32) to the present situation,

    p_k = \sum_Ξ± p^{(kβˆ’1)}_Ξ± cosΒ²(Δ_Ξ± t_k/2) / \sum_Ξ² p^{(kβˆ’1)}_Ξ² = \sum_Ξ± p^{(k)}_Ξ± / \sum_Ξ² p^{(kβˆ’1)}_Ξ².    (6.39)

Returning to expression (6.37), we see that each p_k in the product telescopes, giving a simple formula:

    P^{(M)}_0 = \sum_Ξ± p^{(M)}_Ξ± = \sum_Ξ± p_Ξ± \prod_{k=1}^{M} cosΒ²(Δ_Ξ± t_k/2).    (6.40)

That is, the success probability is simply the sum of the unnormalized probability amplitudes. In hindsight, we could have anticipated that, for each Ξ±, the measurement probabilities over each iteration behave independently, as exhibited in equation (6.40). Performing the analysis for an eigenvector input state, one finds that the state is unaffected by the measurement outcomes, thus the measurements are independent. The full result follows by linearity.

6.3.5 Statistics

Now that we've determined the behavior for a particular choice of times (t_i)_{i=1}^{M}, we must recall that these times were randomly chosen according to the normal distribution (6.27). We hope, though have yet to fully justify, that the measurement statistics correlate strongly, and predictably, with the presence of an eigenvalue within E Β± Ξ“ that significantly overlaps the initial state. To do this, it makes sense to compute some basic statistics about the measurement results, particularly the expected behavior over the distribution of the t_i. Thus, we compute the expectation values of p^{(M)}_Ξ±(t) and P^{(M)}_0(t). Using equation (6.36),

    ⟨p^{(M)}_α⟩ = ∫_{R^M} ρ(t_1)β‹―Ο(t_M) p^{(M)}_Ξ± dt_1β‹―dt_M = p_Ξ± ( (Ξ“/√(2Ο€)) ∫_R e^{βˆ’(Ξ“t)Β²/2} cosΒ²(Δ_Ξ± t/2) dt )^M.    (6.41)

In the last step, we used the fact that the M-dimensional integral factorizes. This expression is easy to evaluate:

    ⟨p^{(M)}_α⟩ = p_Ξ± ( (1 + e^{βˆ’Ξ”_Ξ±Β²/2Ξ“Β²})/2 )^M.    (6.42)

We see that, for Ξ± βˆ’ E on the order of Ξ“ or greater, the probability decays in the number of cycles as 2^{βˆ’M}, whereas for √M Δ_Ξ±/Ξ“ β‰ͺ 1 the amplitudes are approximately preserved. The expected success probability ⟨P^{(M)}_0⟩ can be easily obtained from equations (6.40) and (6.42), using the linearity of the expectation value:

    ⟨P^{(M)}_0⟩ = \sum_Ξ± p_Ξ± ( (1 + e^{βˆ’Ξ”_Ξ±Β²/2Ξ“Β²})/2 )^M.    (6.43)

Observe how this serves as an indicator function for the existence of eigenvalues. If all Ξ± are farther than O(Ξ“) from E, the amplitudes decay exponentially in M. For any Ξ± within this range, however, there will be a success probability which goes roughly as the initial overlap with those states. As expected from measurement-based procedures, we cannot overcome the inherent O(1/overlapΒ²) cost scaling. A quadratic improvement to a Heisenberg limit may be feasible with amplitude amplification, but only with additional quantum overhead and operations.

Our indicator ⟨P^{(M)}_0⟩ might not be of practical use if the behavior of a typical run deviates wildly from the average. We thus investigate the variance.
From equation (6.40),

    Var(P^{(M)}_0) = ⟨(P^{(M)}_0)²⟩ βˆ’ ⟨P^{(M)}_0⟩² = \sum_{Ξ±Ξ²} ⟨p^{(M)}_Ξ± p^{(M)}_β⟩ βˆ’ ( \sum_Ξ± ⟨p^{(M)}_α⟩ )Β² = \sum_{Ξ±Ξ²} Cov( p^{(M)}_Ξ±, p^{(M)}_Ξ² ),    (6.44)

where

    Cov(X, Y) := ⟨XY⟩ βˆ’ ⟨X⟩⟨Y⟩    (6.45)

is the covariance. To clean up the math below, define the dimensionless parameters a := Δ_Ξ±/Ξ“ and b := Δ_Ξ²/Ξ“. Then,

    ⟨p^{(M)}_Ξ± p^{(M)}_β⟩ = p_Ξ± p_Ξ² ( (2 + e^{βˆ’(a+b)Β²/2} + e^{βˆ’(aβˆ’b)Β²/2} + 2e^{βˆ’aΒ²/2} + 2e^{βˆ’bΒ²/2}) / 8 )^M,    (6.46)

    ⟨p^{(M)}_Ξ±βŸ©βŸ¨p^{(M)}_β⟩ = p_Ξ± p_Ξ² ( (1 + e^{βˆ’aΒ²/2} + e^{βˆ’bΒ²/2} + e^{βˆ’(aΒ²+bΒ²)/2}) / 4 )^M,    (6.47)

so that Cov( p^{(M)}_Ξ±, p^{(M)}_Ξ² ) = p_Ξ± C^{(M)}_{Ξ±Ξ²} p_Ξ², where

    C^{(M)}_{Ξ±Ξ²} = ( (1 + e^{βˆ’aΒ²/2} + e^{βˆ’bΒ²/2} + e^{βˆ’(aΒ²+bΒ²)/2} cosh(ab)) / 4 )^M βˆ’ ( (1 + e^{βˆ’aΒ²/2} + e^{βˆ’bΒ²/2} + e^{βˆ’(aΒ²+bΒ²)/2}) / 4 )^M    (6.48)

is positive definite. Hence, the variance Var(P^{(M)}_0) is a contraction of the matrix C^{(M)} with the vector of initial probability amplitudes p for each eigenvalue.

It is fruitful to consider C^{(M)}_{Ξ±Ξ²} as a function of two real parameters a, b ∈ R for each M ∈ Z+. First, we observe that C is an even function in both a and b, so that only the positive quadrant need be considered. We also observe that C is symmetric under a ↔ b. It is more or less clear that C should approach 0 for a, b large and for a, b near zero, but showing this analytically is rather awkward (though straightforward in principle). Figure 6.2 provides plots of C^{(M)}_{Ξ±Ξ²} for various values of M. We observe the correlations are peaked for a = b = a_peak, where a_peak appears to follow an inverse square root power law. Even for M = 4, the correlations are never larger than 0.06, and they only decrease with M. The peak location corresponds to eigenvalues on the order of Θ(Ξ“) away from the target energy.

Figure 6.2 (Top) Density plot of the covariance function C^{(M)}_{Ξ±Ξ²} with respect to a = Δ_Ξ±/Ξ“ and b = Δ_Ξ²/Ξ“ for M = 4, 10, and 20, going left to right. We observe the function diminishing with M, with maximum along the a = b line which moves slightly inward with M. (Bottom) Location and values of the maximum of C^{(M)}_{Ξ±Ξ²} for M from 4 to 200. A line of best fit indicates an inverse square root power law for the location, whereas the value c_max appears to follow an inverse power law with M.

Larger values of C^{(M)}_{Ξ±Ξ²} contribute to a larger variance via (6.44). We interpret this as follows: the Rodeo Algorithm struggles to properly classify eigenvalues close to, but not entirely within, the rough interval [E βˆ’ Ξ“, E + Ξ“]. When eigenvalues are clearly within Ξ“, the success probability is 1 for those eigenstates, while for far eigenvalues the success probability is a coin toss. This "resonant peak," despite being a nuisance, can be handled by varying the target energy, and in any case the resonance decreases in amplitude and width with M. We will make use of some properties of C^{(M)}_{Ξ±Ξ²} in our subsequent analysis of the Rodeo Algorithm's performance.
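The closed forms (6.42) and (6.48) are easy to verify by Monte Carlo sampling of the cycle times; the following sketch (our own check, with an arbitrary detuning) compares the sampled mean and variance of p^{(M)}_Ξ± against the analytic expressions:

    import numpy as np

    rng = np.random.default_rng(0)
    Gamma, M, samples = 1.0, 10, 200_000
    a = 1.5                                    # dimensionless detuning Delta_alpha / Gamma

    t = rng.normal(0.0, 1.0 / Gamma, size=(samples, M))
    pM = np.prod(np.cos(a * Gamma * t / 2.0)**2, axis=1)    # Eq. (6.36), with p_alpha = 1

    mean_exact = ((1.0 + np.exp(-a**2 / 2.0)) / 2.0) ** M   # Eq. (6.42)
    print(pM.mean(), mean_exact)

    # diagonal entry of Eq. (6.48) (b = a) gives the variance of p^{(M)}_alpha
    first = (1 + 2 * np.exp(-a**2 / 2) + np.exp(-a**2) * np.cosh(a**2)) / 4.0
    print(pM.var(), first**M - mean_exact**2)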
6.3.6 Performance on Eigenvalue Detection

Having characterized the statistical properties of our randomized circuit, we now turn to the question of performance. To make progress analytically, we make some simplifying assumptions. Suppose we choose some target energy E and width Ξ“, and wish to determine whether there is a nearby eigenvalue Ξ±*, by which we mean an eigenvalue such that |Ξ±* βˆ’ E|/Ξ“ < d for d on the order of 1. We are promised that, if such an eigenstate exists, the input state |ψ⟩ has overlap p_{Ξ±*} = |c_{Ξ±*}|Β² β‰₯ Ξ΄, where Ξ΄ > 0 is some given threshold. Any other populated eigenstates are assumed to have eigenvalues at least gΞ“ away from E for some g > d.

We analyze the question of eigenvalue existence in the language of hypothesis testing, with null hypothesis H_null of no eigenvalue present. The alternative hypothesis H_alt is the presence of Ξ±* within Ξ“d of E having overlap at least Ξ΄ with the initial state. Practically speaking, without the promises given above, our algorithm will simply fail to detect eigenstates whose overlap is too low (without the Ξ΄ promise), or fail to resolve multiple eigenvalues within the detection range (without the g promise).

Under H_null, the expected success probability P_null of the M-cycle Rodeo Algorithm is upper bounded as

    P_null ≀ ( (1 + e^{βˆ’gΒ²/2})/2 )^M.    (6.49)

On the other hand, under the alternative hypothesis H_alt, the expected success probability P_alt is lower bounded as

    P_alt β‰₯ Ξ΄ ( (1 + e^{βˆ’dΒ²/2})/2 )^M + (1 βˆ’ Ξ΄)/2^M > Ξ΄ ( (1 + e^{βˆ’dΒ²/2})/2 )^M.    (6.50)

To distinguish the two cases, we estimate the success probability through a normalized count over N samples of the M-cycle Rodeo Algorithm. A detection corresponds to determining P_alt > P_null with confidence determined by the uncertainty of the method. We denote the estimated probability by ¯P. Ignoring other reasonable sources of error, such as imperfect gates, decoherence, and imperfect implementation of U(t), the error in the estimate is O(Οƒ_sample), where Οƒ_sample is the standard deviation in the estimate of the expected success probability. To analyze this uncertainty, we use a heuristic error propagation approach. Assuming an exact, nonrandom success probability P_0, our uncertainty comes from the binomial variance

    Οƒ_binom = √( P_0(1 βˆ’ P_0)/N ) < √( P_0/N ).    (6.51)

To characterize deviations from the binomial distribution due to random fluctuations of P_0 caused by the times t_j, we use an error propagation formula:

    Οƒ_fluc β‰ˆ | βˆ‚/βˆ‚P_0 √( P_0(1 βˆ’ P_0)/N ) | Οƒ_{P_0} < (1/2) Οƒ_{P_0} / √(P_0 N).    (6.52)

Let's now consider the two cases separately. In the case of H_null, there is no eigenvalue within gΞ“, and the upper bound

    Οƒ_fluc < 2^{M/2} Οƒ_{P_0} / √N    (6.53)

is valid. In this case, Οƒ_{P_0} will shrink exponentially, and for some g ∈ O(1) this will be enough to bound Οƒ_fluc by O(√(P_0/N)).

Consider now the case H_alt. From now on, we make the assumption

    dΒ²M β‰ͺ 1,    (6.54)

so that P_alt > Ξ΄(1 βˆ’ O(dΒ²M)). In this case, we require Οƒ_{P_0} < √δ + O(Ξ΄dΒ²M). A careful analysis of C^{(M)}_{Ξ±Ξ²}, particularly the Ξ± = Ξ² = dΞ“ + E term, reveals that C^{(M)}_{Ξ±Ξ²} ∈ O(d⁴M). This is sufficient to guarantee an O(√(P_0/N)) scaling, as desired.

Overall, we've found, through a semi-heuristic derivation, that the error in our Rodeo estimation protocol goes as O(1/√N). To ensure the two hypotheses can be distinguished by the Rodeo Algorithm, we thus require

    Ξ΄ ( (1 + e^{βˆ’dΒ²/2})/2 )^M βˆ’ ( (1 + e^{βˆ’gΒ²/2})/2 )^M ∈ Ξ©(√(Ξ΄/N)).    (6.55)
Taking as the worst case scenario g = 1 in the above, we have the requirement

    ( (1 + e^{βˆ’dΒ²/2})/2 )^M ∈ Ξ©( 1/√(Ξ΄N) + (1/Ξ΄) e^{βˆ’Ξ˜(M)} ).    (6.56)

We utilize our requirement that dΒ²M β‰ͺ 1, so that the left-hand side is Ξ©(1). This implicitly places an upper bound on M, meaning that not all d and Ξ΄ will allow for solutions using our approach. However, we will see presently that the size of M need not be too large. Looking at the second term on the right-hand side of (6.56), we find that

    M ∈ Ξ©(log 1/Ξ΄)    (6.57)

suffices to ensure O(1) error. Finally, we choose N ∈ Ξ©(1/Ξ΄). We see that there are choices of parameters for which the algorithm can succeed in the setting we've constructed. The only real requirement is that

    log 1/Ξ΄ β‰ͺ 1/dΒ²,    (6.58)

which is not a great restriction in practice, provided the overlap is not exceedingly small. This restriction comes about, more or less, from the fact that, as M increases, the effective width of the search window shrinks as O(1/√M). A rough cost estimate Cost can be assigned as

    Cost := N M / Ξ“ ∈ O( (1/Ξ΄) log(1/Ξ΄) / Ξ“ ),    (6.59)

which attributes a cost of Ξ“^{βˆ’1} to each time evolution, assuming it dominates the full rodeo cycle. The per-sample cost Cost/N = M/Ξ“ is also the maximum circuit depth, and has the same Ξ“ scaling but a more favorable Ξ΄ dependence. Interpreting Ξ“ as our accuracy parameter, we see that the Rodeo Algorithm achieves Heisenberg scaling. In practice, given a range [E_min, E_max] of possible eigenvalues for the operator O of interest, Ξ“ can be varied using a procedure akin to binary search. This has little effect on the total cost, which will still be dominated by the smallest Ξ“ used, assuming Ξ“ is reduced exponentially with each refinement.

6.3.7 Noise

Noise and decoherence are significant sources of error for near-term quantum computers without error correction. Significant advances in error mitigation techniques have increased what can be achieved with limited, noisy hardware [76], though there may be fundamental limitations to these approaches for large systems [108]. In the search for applications of near-term quantum devices, there is a desire for methods which are inherently robust to noise.

We investigated the noise robustness of the Rodeo Algorithm under the simplest possible model of decoherence: depolarizing noise. At each gate, we assume there is some probability p_depol of the current state being replaced with the maximally mixed state, thus ruining the computation. Computations were performed using Qiskit Runtime with the "ibmq_qasm_simulator." We considered the simple 3-qubit Hamiltonian Z_1 + Z_2 + Z_3, whose eigenvalues are easily seen to be the odd integers from βˆ’3 to 3. We took as initial state (H|0⟩)^{βŠ—3}, where H is the Hadamard. We varied p_depol from 0 to 0.05 and investigated the effect on performance. We take Ξ“ = 1 and M = 6, and perform a scan of target energies over [βˆ’5, 5] in increments of 0.1.

Figure 6.3 shows the results of the numerical simulation. We observe that, with increased noise, the peaks decrease in height but remain in the same locations. By p_depol = 0.05, the peaks become hard to distinguish from one another, but not hard to distinguish from the background. Importantly, the locations of the peaks are relatively unaffected, which is the most important part. We expect that, for many reasonable models of noise, not just symmetric depolarization, the peaks will not shift greatly as a result. A noiseless sketch of this scan appears below.
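The sketch samples Eq. (6.40) directly rather than simulating circuits (the depolarizing-noise study above used Qiskit; this noiseless version, with all names ours, just reproduces the scan profile):

    import numpy as np

    rng = np.random.default_rng(1)
    evals = np.array([-3.0, -1.0, 1.0, 3.0])    # spectrum of Z1 + Z2 + Z3
    p = np.array([1.0, 3.0, 3.0, 1.0]) / 8.0    # overlaps of (H|0>)^{tensor 3}
    Gamma, M, shots = 1.0, 6, 2000

    scan, signal = np.arange(-5.0, 5.01, 0.1), []
    for E in scan:
        t = rng.normal(0.0, 1.0 / Gamma, size=(shots, M))
        # Eq. (6.40): P0 = sum_alpha p_alpha prod_k cos^2(Delta_alpha t_k / 2)
        P0 = (p * np.prod(np.cos(t[:, :, None] * (evals - E) / 2)**2, axis=1)).sum(axis=1)
        signal.append(P0.mean())

    # coarse profile: the E = +/-1 maxima are clear; the +/-3 peaks are weaker
    # for these values of M and Gamma, and sharpen as M increases
    for E0 in range(-5, 6):
        i = int(np.argmin(np.abs(scan - E0)))
        print(f"E = {E0:+d}   <P0> = {signal[i]:.3f}")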
Imperfect gates, however, will change the effective operator Λœπ‘‚ being probed, thus shifting the eigenvalues. Understanding how decoherence affects the effective Hamiltonian being evolved would greatly advance our understanding of the impact of noise, not only on the Rodeo Algorithm, but on other phase-estimation protocols as well.

6.3.8 Discussion

Here we presented the Rodeo Algorithm, a simple new addition to the collection of phase estimation protocols that allows for targeted searches for eigenvalues and preparation of eigenstates. To complement previous work on the method [29, 106, 11], we focused on theoretical aspects not previously covered: we provided a quasi-rigorous cost analysis of the method under a binary hypothesis model, and concluded with numerics showing the behavior of the method under symmetric depolarizing noise. Following the initial publication of the method, subsequent research removed the need for randomness in the algorithm while providing a more rigorous characterization of performance [96].

Figure 6.3: Simulation of the Rodeo Algorithm for three non-interacting qubits for various amounts of depolarizing noise. Eigenvalues are located at βˆ’3, βˆ’1, 1, and 3, and the initial state has overlap 1/8, 3/8, 3/8, 1/8 with each eigenspace. The algorithm reproduces the correct qualitative behavior in spite of noise-induced reductions in signal. The algorithm benefits from the symmetry of depolarization, whereas other noise models may affect the phase rotation being applied.

The authors of [96] were interested particularly in state preparation, and their findings on performance are consistent with our randomized approach. For current noisy devices, the Rodeo Algorithm can serve as a simple and practical alternative to standard quantum phase estimation (QPE) for eigenvalue estimation and state preparation. Thus, it may eventually serve as a useful subroutine for algorithms aimed at achieving quantum advantage, that is, practical advantage of quantum computers over classical computers for interesting tasks. For example, an end-to-end simulation algorithm for nuclear effective field theories was recently proposed which used standard QPE in its analysis for estimating total resource costs [134]. Replacing this final measurement step with the Rodeo Algorithm in actual applications should increase the feasibility of the full algorithm on near-term hardware.

CHAPTER 7
CONCLUSION AND OUTLOOK

Quantum computing today shares similarities with the thermodynamics of the 19th century. From its birth, thermodynamics was intimately linked to technology. Practitioners sought to create more efficient engines, refrigerators, and other cyclical systems to perform work using the resource of heat gradients. These pragmatic goals led to foundational scientific progress, such as the discovery of laws governing optimal efficiencies and allowed physical transformations. Gradually, the powerful and subtle notion of entropy developed, so fruitful as to remain at the forefront of modern physics and information theory.ΒΉ

Quantum computers, unlike engines and refrigerators, have yet to yield any practical benefit. Interest in the field is driven by the expectation that they eventually will, and that effective, error-corrected quantum computers can be made that will simulate quantum dynamics, solve optimization problems, and perform hitherto unrecognized yet valuable computational tasks. In the background, deeper questions are very much present.
How different are the quantum and classical worlds from a computational viewpoint? What makes quantum probabilities special compared to standard probability? Broadly, what is quantum mechanics even about? Because the building blocks of quantum information, qubits, are so simple, they provide an excellent playground for tackling such questions with clarity. It is imaginable that large-scale manipulation of entanglement will aid our "gut" understanding of the theory and elucidate connections between the microscopic world and our macroscopic reality.

For what purposes will we manipulate this enormous entanglement? Besides Shor's groundbreaking factoring algorithm, Grover's unstructured search algorithm, and a handful of others, not many compelling use cases are known. The truth is, if a scalable, fault-tolerant quantum computer existed today, we wouldn't know many uses for it. The idea of quantum-based quantum simulation has been around about as long as quantum computing itself, and it remains one of the few concrete and useful tasks we know of. Thankfully, its importance to scientific computing is enough to command attention, and moreover, Hamiltonian simulation finds application in at least a handful of general tasks, such as solving linear systems of equations. Even when not part of an algorithm, Hamiltonian simulation concepts can assist in the design of quantum algorithms, especially given the absence of many other conceptual guideposts.

ΒΉWhile the "entropies" of physics and information theory are technically distinct, Landauer's principle [81] and other ideas suggest a deep connection that, to my knowledge, is still not fully understood.

This dissertation offers additional tools for Hamiltonian simulation and, importantly, rigorous characterizations of the performance of each. We demonstrate the increased effectiveness of product formulas for accurately estimating observable dynamics using polynomial interpolation. We provide several new approaches to time dependent Hamiltonian simulation, as well as a computational reduction of time dependent to time independent dynamics. Finally, we survey novel tools for near-term quantum simulation, including trapped-ion simulation of anomalous symmetry breaking, state preparation via Projected Cooling, and resource-effective eigenvalue estimation and eigenvector preparation using the Rodeo Algorithm.

Despite enormous progress, fresh ideas are needed in quantum algorithms research. We offer tools which may eventually be implemented on the quantum computers of the future, and we hope that that future is not so far away.

BIBLIOGRAPHY

[1] Scott Aaronson. Quantum computing and hidden variables. Physical Review A, 71(3):032325, 2005.
[2] Scott Aaronson. How big are quantum states?, pages 200–216. Cambridge University Press, 2013.
[3] Andrew Adamatzky. A brief history of liquid computers. Philosophical Transactions of the Royal Society B, 374(1774):20180372, 2019.
[4] Dorit Aharonov, Wim Van Dam, Julia Kempe, Zeph Landau, Seth Lloyd, and Oded Regev. Adiabatic quantum computation is equivalent to standard quantum computation. SIAM Review, 50(4):755–787, 2008.
[5] Thomas D Ahle. Sharp and simple bounds for the raw moments of the binomial and poisson distributions. Statistics & Probability Letters, 182:109306, 2022.
[6] Rizwanul Alam, George Siopsis, Rebekah Herrman, James Ostrowski, Phillip C Lotshaw, and Travis S Humble. Solving maxcut with quantum imaginary time evolution. Quantum Information Processing, 22(7):281, 2023.
[7] Alain Aspect, Jean Dalibard, and GΓ©rard Roger. Experimental test of bell's inequalities using time-varying analyzers. Physical Review Letters, 49(25):1804, 1982.
[8] Brian M Austin, Dmitry Yu Zubarev, and William A Lester Jr. Quantum monte carlo and related approaches. Chemical Reviews, 112(1):263–288, 2012.
[9] Mohsen Bagherimehrab, Yuval R Sanders, Dominic W Berry, Gavin K Brennen, and Barry C Sanders. Nearly optimal quantum algorithm for generating the ground state of a free quantum field theory. PRX Quantum, 3(2):020364, 2022.
[10] Christian W. Bauer, Plato Deliyannis, Marat Freytsis, and Benjamin Nachman. Practical considerations for the preparation of multivariate gaussian states on quantum computers, 2021.
[11] Max Bee-Lindgren, Zhengrong Qian, Matthew DeCross, Natalie C. Brown, Christopher N. Gilbreth, Jacob Watkins, Xilin Zhang, and Dean Lee. Controlled gate networks applied to eigenvalue estimation, 2024.
[12] John S Bell. On the einstein podolsky rosen paradox. Physics Physique Fizika, 1(3):195, 1964.
[13] Daniel Berend and Tamir Tassa. Improved bounds on bell numbers and on moments of sums of random variables. Probability and Mathematical Statistics, 30(2):185–205, 2010.
[14] Dominic W Berry, Graeme Ahokas, Richard Cleve, and Barry C Sanders. Efficient quantum algorithms for simulating sparse hamiltonians. Communications in Mathematical Physics, 270:359–371, 2007.
[15] Dominic W Berry, Andrew M Childs, and Robin Kothari. Hamiltonian simulation with nearly optimal dependence on all parameters. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, pages 792–809. IEEE, 2015.
[16] Dominic W Berry, Andrew M Childs, Yuan Su, Xin Wang, and Nathan Wiebe. Time-dependent hamiltonian simulation with l1-norm scaling. Quantum, 4:254, 2020.
[17] Alexandre Blais, Arne L Grimsmo, Steven M Girvin, and Andreas Wallraff. Circuit quantum electrodynamics. Reviews of Modern Physics, 93(2):025005, 2021.
[18] S Blanes, F Casas, and J Ros. Extrapolation of symplectic integrators. Celestial Mechanics and Dynamical Astronomy, 75:149–161, 1999.
[19] Colin D Bruzewicz, John Chiaverini, Robert McConnell, and Jeremy M Sage. Trapped-ion quantum computing: Progress and challenges. Applied Physics Reviews, 6(2), 2019.
[20] Horacio E Camblong, Luis N Epele, Huner Fanchiotti, and Carlos A Garcia Canal. Quantum anomaly in molecular physics. Physical Review Letters, 87(22):220402, 2001.
[21] Giulio Casati and Luca Molinari. "quantum chaos" with time-periodic hamiltonians. Progress of Theoretical Physics Supplement, 98:287–322, 1989.
[22] Marco Cerezo, Andrew Arrasmith, Ryan Babbush, Simon C Benjamin, Suguru Endo, Keisuke Fujii, Jarrod R McClean, Kosuke Mitarai, Xiao Yuan, Lukasz Cincio, et al. Variational quantum algorithms. Nature Reviews Physics, 3(9):625–644, 2021.
[23] Chi-Fang Chen and Fernando G. S. L. BrandΓ£o. Average-case speedup for product formulas. Communications in Mathematical Physics, 405(2):32, 2024. doi: 10.1007/s00220-023-04912-5.
[24] Andrew M Childs and Nathan Wiebe. Hamiltonian simulation using linear combinations of unitary operations. Quantum Information & Computation, 12(11-12):901–924, 2012.
[25] Andrew M Childs, Dmitri Maslov, Yunseong Nam, Neil J Ross, and Yuan Su. Toward the first quantum simulation with quantum speedup. Proceedings of the National Academy of Sciences, 115(38):9456–9461, 2018.
[26] Andrew M Childs, Aaron Ostrander, and Yuan Su. Faster quantum simulation by randomization. Quantum, 3:182, 2019.
[27] Andrew M Childs, Yuan Su, Minh C Tran, Nathan Wiebe, and Shuchen Zhu. Theory of trotter error with commutator scaling. Physical Review X, 11(1):011020, 2021.
[28] Siu A Chin. Multi-product splitting and runge-kutta-nystrΓΆm integrators. Celestial Mechanics and Dynamical Astronomy, 106(4):391–406, 2010.
[29] Kenneth Choi, Dean Lee, Joey Bonitati, Zhengrong Qian, and Jacob Watkins. Rodeo algorithm for quantum computing. Physical Review Letters, 127(4):040505, 2021.
[30] Juan I Cirac and Peter Zoller. Quantum computations with cold trapped ions. Physical Review Letters, 74(20):4091, 1995.
[31] Louis Comtet. Advanced Combinatorics: The Art of Finite and Infinite Expansions. Springer Science & Business Media, 2012.
[32] B. Jack Copeland. The Church-Turing Thesis. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, summer 2020 edition, 2020.
[33] Christopher J Cramer. Essentials of Computational Chemistry: Theories and Models. John Wiley & Sons, 2013.
[34] Peter Cromwell, Elisabetta Beltrami, and Marta Rampichini. The borromean rings. Mathematical Intelligencer, 20(1):53–62, 1998.
[35] Toby S Cubitt, David Perez-Garcia, and Michael M Wolf. Undecidability of the spectral gap. Nature, 528(7581):207–211, 2015.
[36] AndrΓ© Pierro de Camargo. On the numerical stability of newton's formula for lagrange interpolation. Journal of Computational and Applied Mathematics, 365:112369, 2020. ISSN 0377-0427. doi: 10.1016/j.cam.2019.112369.
[37] David Deutsch. Quantum theory, the church–turing principle and the universal quantum computer. Proceedings of the Royal Society of London. A. Mathematical and Physical Sciences, 400(1818):97–117, 1985.
[38] Bryce Seligman Dewitt and Neill Graham. The Many-Worlds Interpretation of Quantum Mechanics, volume 63. Princeton University Press, 2015.
[39] Paul Adrien Maurice Dirac. Quantum mechanics of many-electron systems. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 123(792):714–733, 1929.
[40] Yulong Dong, Lin Lin, and Yu Tong. Ground-state preparation and energy estimation on early fault-tolerant quantum computers via quantum eigenvalue transformation of unitary matrices. PRX Quantum, 3(4):040305, 2022.
[41] Vitaly Efimov. Energy levels arising from resonant two-body forces in a three-body system. Physics Letters B, 33(8):563–564, 1970.
[42] Albert Einstein, Boris Podolsky, and Nathan Rosen. Can quantum-mechanical description of physical reality be considered complete? Physical Review, 47(10):777, 1935.
[43] Jens Eisert, Marcus Cramer, and Martin B Plenio. Colloquium: Area laws for the entanglement entropy. Reviews of Modern Physics, 82(1):277, 2010.
[44] Suguru Endo, Qi Zhao, Ying Li, Simon Benjamin, and Xiao Yuan. Mitigating algorithmic errors in a hamiltonian simulation. Physical Review A, 99:012334, 2019. doi: 10.1103/PhysRevA.99.012334.
[45] Paul K Faehrmann, Mark Steudtner, Richard Kueng, Maria Kieferova, and Jens Eisert. Randomizing multi-product formulas for hamiltonian simulation. Quantum, 6:806, 2022.
[46] Richard P Feynman. Simulating physics with computers. In Feynman and Computation, pages 133–153. CRC Press, 2018.
[47] Fulvio Flamini, Nicolo Spagnolo, and Fabio Sciarrino. Photonic quantum information processing: a review. Reports on Progress in Physics, 82(1):016001, 2018.
[48] Edward Fredkin and Tommaso Toffoli. Conservative logic. International Journal of Theoretical Physics, 21(3):219–253, 1982. doi: 10.1007/BF01857727.
[49] Stuart J Freedman and John F Clauser. Experimental test of local hidden-variable theories. Physical Review Letters, 28(14):938, 1972.
[50] Nicolai Friis, Oliver Marty, Christine Maier, Cornelius Hempel, Milan HolzΓ€pfel, Petar Jurcevic, Martin B Plenio, Marcus Huber, Christian Roos, Rainer Blatt, et al. Observation of entangled states of a fully controlled 20-qubit system. Physical Review X, 8(2):021012, 2018.
[51] W Gautschi. How (un)stable are vandermonde systems? asymptotic and computational analysis. In Lecture Notes in Pure and Applied Mathematics, pages 193–210. Marcel Dekker, Inc, 1990.
[52] Iulia M Georgescu, Sahel Ashhab, and Franco Nori. Quantum simulation. Reviews of Modern Physics, 86(1):153, 2014.
[53] Elizabeth Gibney. The quantum gold rush. Nature, 574(7776):22–24, 2019.
[54] AndrΓ‘s GilyΓ©n, Yuan Su, Guang Hao Low, and Nathan Wiebe. Quantum singular value transformation and beyond: exponential improvements for quantum matrix arithmetics. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 193–204, New York, NY, USA, 2019. Association for Computing Machinery. ISBN 9781450367059.
[55] Sheldon Goldstein. Bohmian Mechanics. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, fall 2021 edition, 2021.
[56] Dmitry Grinko, Julien Gacon, Christa Zoufal, and Stefan Woerner. Iterative quantum amplitude estimation. npj Quantum Information, 7:52, 2021. doi: 10.1038/s41534-021-00379-1.
[57] Lov Grover and Terry Rudolph. Creating superpositions that correspond to efficiently integrable probability distributions, 2002.
[58] Erik J Gustafson. Projective cooling for the transverse ising model. Physical Review D, 101(7):071504, 2020.
[59] Dipanjali Halder, V Srinivasa Prasannaa, Valay Agarawal, and Rahul Maitra. Iterative quantum phase estimation with variationally prepared reference state. International Journal of Quantum Chemistry, 123(3):e27021, 2023.
[60] Jack K. Hale and HΓΌseyin KoΓ§ak. Scalar Nonautonomous Equations, pages 107–132. Springer New York, New York, NY, 1991. ISBN 978-1-4612-4426-4. doi: 10.1007/978-1-4612-4426-4_4.
[61] Brian Hall. Quantum Theory for Mathematicians, volume 267 of Graduate Texts in Mathematics. Springer-Verlag New York, 2013.
[62] H-W Hammer and Brian G Swingle. On the limit cycle for the 1/r2 potential in momentum space. Annals of Physics, 321(2):306–317, 2006.
[63] Dylan Harley, Ishaun Datta, Frederik Ravn Klausen, Andreas Bluhm, Daniel Stilck FranΓ§a, Albert Werner, and Matthias Christandl. Going beyond gadgets: The importance of scalability for analogue quantum simulators. arXiv:2306.13739, 2023.
[64] Aram W Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for linear systems of equations. Physical Review Letters, 103(15):150502, 2009.
[65] LoΓ―c Henriet, Lucas Beguin, Adrien Signoles, Thierry Lahaye, Antoine Browaeys, Georges-Olivier Reymond, and Christophe Jurczak. Quantum computing with neutral atoms. Quantum, 4:327, 2020.
[66] Markus Heyl, Philipp Hauke, and Peter Zoller. Quantum localization bounds trotter errors in digital quantum simulation. Science Advances, 5(4):eaau8342, 2019. doi: 10.1126/sciadv.aau8342.
[67] Brendon L Higgins, Dominic W Berry, Stephen D Bartlett, Howard M Wiseman, and Geoff J Pryde. Entanglement-free heisenberg-limited phase estimation. Nature, 450(7168):393–396, 2007.
[68] Nicholas J Higham. The numerical stability of barycentric lagrange interpolation. IMA Journal of Numerical Analysis, 24(4):547–556, 2004.
[69] Mark Howard, Joel Wallman, Victor Veitch, and Joseph Emerson. Contextuality supplies the 'magic' for quantum computation. Nature, 510(7505):351–355, 2014.
[70] Jacky Huyghebaert and Hans De Raedt. Product formula methods for time-dependent schrodinger problems. Journal of Physics A: Mathematical and General, 23(24):5777, 1990.
[71] Jason Iaconis, Sonika Johri, and Elton Yechao Zhu. Quantum state preparation of normal distributions using matrix product states. npj Quantum Information, 10(1):15, 2024.
[72] R Islam, Crystal Senko, Wes C Campbell, S Korenblit, J Smith, A Lee, EE Edwards, C-CJ Wang, JK Freericks, and C Monroe. Emergence and frustration of magnetism with variable-range interactions in a quantum simulator. Science, 340(6132):583–587, 2013.
[73] John Day Dollard and Charles N. Friedman. Product Integration with Application to Differential Equations, volume 10 of Encyclopedia of Mathematics and Its Applications (Section: Analysis). Cambridge University Press, 1984. ISBN 9780521302302.
[74] John Jumper, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, Russ Bates, Augustin Ε½Γ­dek, Anna Potapenko, et al. Highly accurate protein structure prediction with alphafold. Nature, 596(7873):583–589, 2021.
[75] MΓ‘ria KieferovΓ‘, Artur Scherer, and Dominic W Berry. Simulating the dynamics of time-dependent hamiltonians with a truncated dyson series. Physical Review A, 99(4):042314, 2019.
[76] Youngseok Kim, Andrew Eddins, Sajant Anand, Ken Xuan Wei, Ewout Van Den Berg, Sami Rosenblatt, Hasan Nayfeh, Yantao Wu, Michael Zaletel, Kristan Temme, et al. Evidence for the utility of quantum computing before fault tolerance. Nature, 618(7965):500–505, 2023.
[77] A Yu Kitaev. Quantum measurements and the abelian stabilizer problem. arXiv preprint quant-ph/9511026, 1995.
[78] A. Yu. Kitaev, A. H. Shen, and M. N. Vyalyi. Classical and Quantum Computation. American Mathematical Society, USA, 2002. ISBN 0821832298.
[79] Alexei Kitaev and William A. Webb. Wavefunction preparation and resampling using a quantum computer, 2009.
[80] Natalie Klco and Martin J. Savage. Minimally entangled state preparation of localized wave functions on quantum computers. Physical Review A, 102:012612, 2020. doi: 10.1103/PhysRevA.102.012612.
[81] Rolf Landauer. Irreversibility and heat generation in the computing process. IBM Journal of Research and Development, 5(3):183–191, 1961.
[82] Dean Lee, Jacob Watkins, Dillon Frame, Gabriel Given, Rongzheng He, Ning Li, Bing-Nan Lu, and Avik Sarkar. Time fractals and discrete scale invariance with trapped ions. Physical Review A, 100(1):011403, 2019.
[83] Dean Lee, Joey Bonitati, Gabriel Given, Caleb Hicks, Ning Li, Bing-Nan Lu, Abudit Rai, Avik Sarkar, and Jacob Watkins. Projected cooling algorithm for quantum computation. Physics Letters B, 807:135536, 2020.
[84] Joonho Lee, Dominic W Berry, Craig Gidney, William J Huggins, Jarrod R McClean, Nathan Wiebe, and Ryan Babbush. Even more efficient quantum computations of chemistry through tensor hypercontraction. PRX Quantum, 2(3):030305, 2021.
[85] Xiaoyu Li, Guowu Yang, Carlos Manuel Torres Jr, Desheng Zheng, and Kang L Wang. A class of efficient quantum incrementer gates for quantum circuit synthesis. International Journal of Modern Physics B, 28(01):1350191, 2014.
[86] HQ Lin. Exact diagonalization of quantum-spin models. Physical Review B, 42(10):6561, 1990.
[87] Seth Lloyd. Universal quantum simulators. Science, 273(5278):1073–1078, 1996.
[88] Guang Hao Low and Isaac L Chuang. Hamiltonian simulation by qubitization. Quantum, 3:163, 2019.
[89] Guang Hao Low and Nathan Wiebe. Hamiltonian simulation in the interaction picture. arXiv:1805.00675, 2018.
[90] Guang Hao Low, V. Kliuchnikov, and N. Wiebe. Well-conditioned multiproduct hamiltonian simulation. arXiv: Quantum Physics, 2019.
[91] Shang-keng Ma. Introduction to the renormalization group. Reviews of Modern Physics, 45(4):589, 1973.
[92] Pesi R Masani. Norbert Wiener 1894–1964, volume 5. BirkhΓ€user, 2012.
[93] John C Mason and David C Handscomb. Chebyshev Polynomials. CRC Press, 2002.
[94] Sam McArdle, Suguru Endo, AlΓ‘n Aspuru-Guzik, Simon C Benjamin, and Xiao Yuan. Quantum computational chemistry. Reviews of Modern Physics, 92(1):015003, 2020.
[95] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven. Barren plateaus in quantum neural network training landscapes. Nature Communications, 9(1):4812, 2018.
[96] Richard Meister and Simon C Benjamin. Resource-frugal hamiltonian eigenstate preparation via repeated quantum phase estimation measurements. arXiv:2212.00846, 2022.
[97] Michael Reed and Barry Simon. Methods of Modern Mathematical Physics, volume 1. Academic Press, 1980.
[98] Christopher Monroe and Jungsang Kim. Scaling the ion trap quantum processor. Science, 339(6124):1164–1169, 2013.
[99] Chandra Sekhar Mukherjee, Subhamoy Maitra, Vineet Gaurav, and Dibyendu Roy. Preparing dicke states on a quantum computer. IEEE Transactions on Quantum Engineering, 1:1–17, 2020.
[100] Pascal Naidon and Shimpei Endo. Efimov physics: a review. Reports on Progress in Physics, 80(5):056001, 2017.
[101] Michael A Nielsen and Isaac Chuang. Quantum computation and quantum information, 2002.
[102] Caleb J O'Loan. Iterative phase estimation. Journal of Physics A: Mathematical and Theoretical, 43(1):015301, 2009.
[103] Uri Peskin and Nimrod Moiseyev. The solution of the time-dependent schrΓΆdinger equation by the (t, t') method: Theory, computational algorithm and applications. The Journal of Chemical Physics, 99(6):4590–4596, 1993.
[104] Diego Porras and J Ignacio Cirac. Effective quantum spin systems with trapped ions. Physical Review Letters, 92(20):207901, 2004.
[105] David Poulin, Angie Qarry, Rolando Somma, and Frank Verstraete. Quantum simulation of time-dependent hamiltonians and the convenient illusion of hilbert space. Physical Review Letters, 106(17):170501, 2011.
[106] Zhengrong Qian, Jacob Watkins, Gabriel Given, Joey Bonitati, Kenneth Choi, and Dean Lee. Demonstration of the rodeo algorithm on a quantum computer. arXiv:2110.07747, 2021.
[107] Alfio Quarteroni, Riccardo Sacco, and Fausto Saleri. Numerical Mathematics, volume 37. Springer Science & Business Media, 2010.
[108] Yihui Quek, Daniel Stilck FranΓ§a, Sumeet Khatri, Johannes Jakob Meyer, and Jens Eisert. Exponentially tighter bounds on limitations of quantum error mitigation. arXiv:2210.11505, 2022.
[109] Arthur G. Rattew and BΓ‘lint Koczor. Preparing arbitrary continuous functions in quantum registers with logarithmic complexity, 2022.
[110] Arthur G Rattew, Yue Sun, Pierre Minssen, and Marco Pistoia. The efficient preparation of normal distributions in quantum registers. Quantum, 5:609, 2021.
[111] Robert Raussendorf, Daniel E Browne, and Hans J Briegel. Measurement-based quantum computation on cluster states. Physical Review A, 68(2):022312, 2003.
[112] Markus Reiher, Nathan Wiebe, Krysta M Svore, Dave Wecker, and Matthias Troyer. Elucidating reaction mechanisms on quantum computers. Proceedings of the National Academy of Sciences, 114(29):7555–7560, 2017.
[113] Gumaro Rendon, Jacob Watkins, and Nathan Wiebe. Improved accuracy for trotter simulations using chebyshev interpolation. Quantum, 8:1266, 2024.
[114] T.J. Rivlin. Chebyshev Polynomials. Dover Books on Mathematics. Dover Publications, 2020. ISBN 9780486842332. URL https://books.google.com/books?id=3s0mygEACAAJ.
[115] Ankur Sarker, M Shamiul Amin, Avishek Bose, and Nafisah Islam. An optimized design of binary comparator circuit in quantum computing. In 2014 International Conference on Informatics, Electronics & Vision (ICIEV), pages 1–5. IEEE, 2014.
[116] Nicolas PD Sawaya, Tim Menke, Thi Ha Kyaw, Sonika Johri, AlΓ‘n Aspuru-Guzik, and Gian Giacomo Guerreschi. Resource-efficient digital quantum simulation of d-level systems for photonic, vibrational, and spin-s hamiltonians. npj Quantum Information, 6(1):49, 2020.
[117] M.L. Shetterly and L. Freeman. Hidden Figures. HarperCollins, 2018. ISBN 9780062881885.
[118] Peter W Shor. Algorithms for quantum computation: discrete logarithms and factoring. In Proceedings 35th Annual Symposium on Foundations of Computer Science, pages 124–134. IEEE, 1994.
[119] Avram Sidi. The Richardson Extrapolation Process, pages 21–41. Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, 2003. doi: 10.1017/CBO9780511546815.003.
[120] Daniel R Simon. On the power of quantum computation. SIAM Journal on Computing, 26(5):1474–1483, 1997.
[121] Ionel Stetcu, Alessandro Baroni, and Joseph Carlson. Projection algorithm for state preparation on quantum computers. Physical Review C, 108(3):L031306, 2023.
[122] Masuo Suzuki. General theory of fractal path integrals with applications to many-body theories and statistical physics. Journal of Mathematical Physics, 32(2):400–407, 1991.
[123] Masuo Suzuki. General theory of higher-order decomposition of exponential operators and symplectic integrators. Physics Letters A, 165(5-6):387–395, 1992.
[124] Masuo Suzuki. Methodology of analytic and computational studies on quantum systems. Journal of Statistical Physics, 110(3):945–956, 2003.
[125] Krysta M. Svore, Matthew B. Hastings, and Michael H. Freedman. Faster phase estimation. Quantum Information & Computation, 14(3-4):306–328, 2014. doi: 10.26421/QIC14.3-4-7.
[126] Lloyd N. Trefethen. Appendix A. Six Myths of Polynomial Interpolation and Quadrature, pages 263–271. SIAM, 2019. doi: 10.1137/1.9781611975949.appa. URL https://epubs.siam.org/doi/abs/10.1137/1.9781611975949.appa.
[127] Jean-Philippe Uzan. The fundamental constants and their variation: observational and theoretical status. Reviews of Modern Physics, 75(2):403, 2003.
[128] Dyon van Vreumingen and Kareljan Schoutens. Adiabatic ground-state preparation of fermionic many-body systems from a two-body perspective. Physical Review A, 108(6):062603, 2023.
[129] Farrokh Vatan and Colin Williams. Optimal quantum circuits for general two-qubit gates. Physical Review A, 69:032315, 2004. doi: 10.1103/PhysRevA.69.032315.
[130] Almudena Carrera Vazquez, Daniel J Egger, David Ochsner, and Stefan Woerner. Well-conditioned multi-product formulas for hardware-friendly hamiltonian simulation. Quantum, 7:1067, 2023.
[131] Libor Veis and JiΕ™Γ­ Pittner. Adiabatic state preparation study of methylene. The Journal of Chemical Physics, 140(21), 2014.
[132] Marco Vianello and Federico Piazzon. Stability inequalities for lebesgue constants via markov-like inequalities. Dolomites Research Notes on Approximation, 11(1):1–9, 2018.
[133] Jacob Watkins, Nathan Wiebe, Alessandro Roggero, and Dean Lee. Time-dependent hamiltonian simulation using discrete clock constructions. arXiv:2203.11353, 2022.
[134] James D Watson, Jacob Bringewatt, Alexander F Shaw, Andrew M Childs, Alexey V Gorshkov, and Zohreh Davoudi. Quantum algorithms for simulating nuclear effective field theories. arXiv:2312.05344, 2023.
[135] Nathan Wiebe and Nathan S Babcock. Improved error-scaling for adiabatic quantum evolutions. New Journal of Physics, 14(1):013024, 2012.
[136] Nathan Wiebe and Chris Granade. Efficient bayesian phase estimation. Physical Review Letters, 117(1):010503, 2016.
[137] Nathan Wiebe, Dominic Berry, Peter HΓΈyer, and Barry C Sanders. Higher order decompositions of ordered operator exponentials. Journal of Physics A: Mathematical and Theoretical, 43(6):065203, 2010. doi: 10.1088/1751-8113/43/6/065203.
[138] Nathan Wiebe, Dominic W Berry, Peter HΓΈyer, and Barry C Sanders. Simulating quantum dynamics on a quantum computer. Journal of Physics A: Mathematical and Theoretical, 44(44):445308, 2011.
[139] Changhao Yi and Elizabeth Crosson. Spectral analysis of product formulas for quantum simulation. npj Quantum Information, 8(1):37, 2022. ISSN 2056-6387. doi: 10.1038/s41534-022-00548-w.
[140] Jiehang Zhang, Guido Pagano, Paul W Hess, Antonis Kyprianidis, Patrick Becker, Harvey Kaplan, Alexey V Gorshkov, Z-X Gong, and Christopher Monroe. Observation of a many-body dynamical phase transition with a 53-qubit quantum simulator. Nature, 551(7682):601–604, 2017.
[141] Sergiy Zhuk, Niall Robertson, and Sergey Bravyi. Trotter error bounds and dynamic multi-product formulas for hamiltonian simulation, 2023.
[142] Christa Zoufal, Ryan V Mishmash, Nitin Sharma, Niraj Kumar, Aashish Sheshadri, Amol Deshmukh, Noelle Ibrahim, Julien Gacon, and Stefan Woerner. Variational quantum algorithm for unconstrained black box binary optimization: Application to feature selection. Quantum, 7:909, 2023.
[143] Wojciech Hubert Zurek. Decoherence, einselection, and the quantum origins of the classical. Reviews of Modern Physics, 75(3):715, 2003.
[144] Karol Ε»yczkowski, PaweΕ‚ Horodecki, Anna Sanpera, and Maciej Lewenstein. Volume of the set of separable states. Physical Review A, 58(2):883, 1998.