THREE VARIATIONS ON JOHNSON-LINDENSTRAUSS MAPS FOR SUBMANIFOLDS OF EUCLIDEAN SPACE VIA REACH

By Arman Tavakoli

A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Mathematics – Doctor of Philosophy, 2021

ABSTRACT

In this thesis we investigate three variations of the classical Johnson-Lindenstrauss (JL) maps. In one direction we build on the earlier work of Wakin and Eftekhari (2015) by considering generalizations to manifolds with boundary. In a second direction we extend the work of Noga Alon (2003) on lower bounds for the final embedding dimension in JL maps. In the third direction, we consider matrices with fast matrix-vector multiplication and improve the run-time in the earlier work of Oymak, Recht and Soltanolkotabi (2018), and Ailon and Liberty (2009).

This thesis is organized into six chapters. The three variations are discussed in chapters 4, 5 and 6. The variation for manifolds with boundary is presented in chapter 4, the lower bound problem is discussed in chapter 5, and chapter 6 concerns the run-time improvements. The first chapter is an introduction to Johnson-Lindenstrauss maps. The second chapter is about a regularity parameter called reach and geometric estimates for manifolds. The third chapter addresses two geometry questions about reach that arise from the discussions in chapter 2.

ACKNOWLEDGEMENTS

I would like to thank my advisers Profs. Mark Iwen and Ben Schmidt for their kind support and generous spirit. They listened to me express my own ideas, and helped me make progress with my research. They shared their experience and knowledge with me and I am grateful for that. I thank Profs. Rongrong Wang and Willie Wong for being on my dissertation committee and providing feedback about my work.

I am indebted to many people at MSU for providing a supportive community that improved my experience. Sami Mehri generously hosted me on my first visit to MSU; Emmet Harrington and Gorapada Bera were my amazing office mates who shared many mathematical discussions with me. Keshav Sutrave was there for me on numerous occasions. I am thankful to Tingting Jian, Lijie Han, Xiangyu (Orange) Wang, and Xiaoying Yuan for many cakes; to pastor John Bell; and to the Greek squad (Michalis Paparizos, Georgios Psaromiligkos, Christos Grigoriadis, Ioannis Zachos, Dimitris Vardakis), Zhe Zhang, Dan Normand, Timothy Szocinski, Minh Hoang Le, Yesfun Yeh, Anand Chandrasekhar, Henry Manoj D’Souza, Nopphachai Sirinakbumrung, Liwei Deng, Farzan Masrour, Hossein Mazaheri, Vojtěch Kejzlar, and Nayeli Rodriguez Briones for hosting me on multiple occasions.

My family always supported me and helped me recharge during the holidays. To my parents Fara and Mohammad Hossein, my sisters Aylin and Iren, and little (at the time of writing) Ethan, I love you very much.

During the first year of the Covid-19 pandemic, I could stay home and work remotely. Yet many front-line workers, especially in health care, provided services to the public every day while risking their lives. This was at a time when limited information was available about Covid-19 and there were no vaccines. I am grateful to the numerous people on the front lines who made life safer for others.

TABLE OF CONTENTS

LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 Johnson-Lindenstrauss Maps
1.2 Fast Matrices
CHAPTER 2 GEOMETRIC PROPERTIES OF REACH AND COVERING ESTIMATES
2.1 Reach and its Properties
2.2 Covering Estimate from Above via Gunther’s Theorem
2.3 Covering Estimate from Above Without Radius Restriction via Tubes
2.4 Covering Estimate from Below via Bishop’s Theorem
2.5 Covering Estimate from Above for the Unit Secants of a Submanifold
2.6 Gaussian Width Estimate for Unit Secants From Above
CHAPTER 3 MORE ESTIMATES WITH REACH
3.1 Closed Submanifolds with Reach 1 and Minimum Volume
3.1.1 Codimension 1
3.2 Torus in R^3 with Reach 1 and Minimum Area
3.3 Doing Better than Reach in Covering Estimates
CHAPTER 4 SUFFICIENT CONDITIONS FOR JL MAPS OF MANIFOLDS
4.1 With sub-Gaussian Matrices (Variation 1)
4.2 With SORS Matrices
CHAPTER 5 NECESSARY CONDITIONS FOR JL MAPS OF MANIFOLDS (VARIATION 2)
CHAPTER 6 A JL ALGORITHM WITH N log(log(N)) RUN-TIME (VARIATION 3)
6.1 Case of Finite Points
6.1.1 JL Maps for Finite Points via sub-Gaussian Matrices
6.1.2 JL Maps for Finite Points via SORS Matrices
6.1.3 Combining the Two Matrix Types
6.2 Upgrade for Subspaces
6.2.1 SORS Matrices for Subspaces
6.2.2 Sub-Gaussian Matrices for Subspaces
6.2.3 Mixing the Matrix Types
BIBLIOGRAPHY

LIST OF FIGURES

Figure 1.1: Two parallel hyper-planes containing a set determine its width in that direction. From [35].

Figure 2.1: Two points with the geodesic and secant line between them. We bound the angle between d and V₀ using τ.

Figure 3.1: Reach is a global parameter. A bump on the surface of a manifold can change its reach, and therefore reach is sensitive to local perturbations.

Figure 3.2: Figure (a) shows a packing with varying density. Figure (b) shows the Voronoi cells associated with a packing. Figure (c) shows a Voronoi cell contained in a doubled packing ball.

Figure 6.1: A vector is split into pieces, each part is processed with a fast JL map, and the outputs are combined and fed into a Gaussian JL map. The Gaussian maps have optimal dimension reduction while fast JL maps have better run-time.
In this mixed approach, we combine the benefits of both methods.

CHAPTER 1 INTRODUCTION

In this thesis we study different variations of Johnson-Lindenstrauss (JL) maps. A great source for learning about such maps is the textbook of Roman Vershynin, High Dimensional Probability [35, Chapter 5]. The variations we consider are based on the theme of improving existing bounds for JL maps, but they come in different flavors such as sufficient conditions, necessary conditions, or run-time complexity bounds. In this chapter we give an introduction to JL maps and discuss some of the tools that are used in their construction.

1.1 Johnson-Lindenstrauss Maps

JL maps belong to the general field of dimension reduction (also known as dimensionality reduction). Informally, given T a subset of R^N, a JL map associated to T is a map that reduces N while preserving the “geometry of T”, i.e. L : R^N → R^m with m < N such that T and L(T) have the “same geometry”. There are many possible rigorous interpretations of preserving the geometry. Perhaps the strictest requirement would be to ask for preservation of all pairwise Euclidean distances; that would mean for x, y ∈ T,

    ||L(x − y)||₂ = ||x − y||₂   (1.1)

where ||·||₂ is the Euclidean norm. As an example one can consider L to be the projection onto the span of T. If T is a finite set, then its span is at most a |T|-dimensional subspace. If |T| < N, then L would satisfy the above requirement and it would be the simplest example of a map that “preserves the geometry” of T while reducing its ambient dimension. In this example the final embedding dimension is at most the number of points. With condition (1.1), this number is also necessary: consider n points with all pairwise distances equal to 1. Such an arrangement can be embedded into R^N for N large enough, but it cannot be realized in dimension strictly less than n − 1. For n = 3, 4 this is the familiar fact that an equilateral triangle doesn’t fit in R^1 and a regular tetrahedron doesn’t fit in R^2.

Relaxing (1.1) allows us to reduce the ambient dimension further. One could be less strict and allow the map to distort the pairwise distances, but in a controlled manner. In this way, given 0 < ε < 1 and x, y ∈ T we require

    (1 − ε)||x − y|| ≤ ||L(x − y)|| ≤ (1 + ε)||x − y||.   (1.2)

Inequality (1.2) is known as the JL condition. In this sense, if we consider T as a metric space with the chordal metric, the JL map is a metric space embedding. With condition (1.2), we have a celebrated theorem known as the Johnson-Lindenstrauss lemma.

Theorem 1.1.1. [35, Theorem 5.3.1] Let T be a set of n points in R^N and ε > 0. Let P be the orthogonal projection onto a random m-dimensional subspace of R^N selected uniformly in the Grassmannian G_{N,m}. Then there are universal constants c₁, c₂ such that if

    m ≥ c₁ log(n)/ε²   (1.3)

then with probability at least 1 − 2 exp(−c₂ ε² m), the scaled projection Q = √(N/m) P satisfies, for all x, y ∈ T,

    (1 − ε)||x − y|| ≤ ||Q(x − y)|| ≤ (1 + ε)||x − y||.   (1.4)

The main tool for proving this theorem is the concentration of measure. In this proof technique one estimates the expected length of a fixed vector under a random projection by averaging its projected length over all possible directions. Then by the concentration of measure, most projections do almost the same as the average, with a small probability of deviation.
Having obtained a failure probability for a single vector, we apply it to all n(n − 1)/2 pairwise secants. We use the union bound to combine the failure probabilities over the different vectors and we get the claimed result.

The bound (1.3) is logarithmic in the number of points. This is an improvement compared to the discussion for (1.1), where we saw a linear bound. The price to pay for this logarithmic improvement is to tolerate the 1/ε² factor coming from the distortion ε. The break-even point would be ε = √(c₁ log(n)/n). As long as we allow ε to be larger than this threshold, the JL lemma offers a great improvement.

The bound (1.3) is also independent of N. This property makes JL maps robust for applications in dimension reduction since it allows them to handle models that encode their data in spaces with large N.

A remarkable property of the above theorem is that it is possible to design the JL map before knowing the data (the n points). This property is known as being oblivious, and is essentially the result of introducing probability and allowing for a controllable chance of failure. Since the expressions for the probability depend only on the number of points, and not on where they are, one can pick a map from all the possible projections before seeing the data, and with high probability (1.2) would still be satisfied. The obliviousness makes this technique very useful and robust in applications since the dimension reduction technique becomes decoupled from the intricacies of the data.

The next step is to construct a JL map for infinitely many points. The theorem mentioned above depends explicitly on the number of points. However it is possible to give a different variation, known as the matrix deviation inequality [35, theorem 9.1.1], where the embedding dimension depends on how the vectors are spread out in different directions, through a quantity called Gaussian width (we will describe it below), or a variation of it called Gaussian complexity. The deviation inequality also differs from the JL lemma discussed above because it works with Gaussian random matrices instead of random projections. Informally, given T ⊂ R^N, the matrix deviation inequality says that with high probability

    ||Ax|| = E||Ax|| + O(γ(T)),  for all x ∈ T   (1.5)

where A is a Gaussian random matrix, E is the expectation (average) operator and γ(T) is the Gaussian complexity of T. Conceptually (1.5) tells us that the length of a vector in a set T, after it is mapped by a random matrix A, is with a high chance within O(γ(T)) of the average length. Compared to the earlier discussion about pairwise distances, this perspective is different because we only consider individual vectors. To transition from one to the other, we form the set of all pairwise secants and work with that set of vectors.

A quantitative version of the matrix deviation inequality, with tail bounds, is as follows.

Theorem 1.1.2. [35, theorems 9.1.1 and 9.1.8] Let A be an m × N matrix whose rows Aᵢ are independent, isotropic, sub-Gaussian random vectors in R^N. Let K = maxᵢ ||Aᵢ||_{ψ₂}, where ψ₂ is the sub-Gaussian norm. Let T ⊂ R^N with Gaussian width ω(T). Then there is a universal constant C such that for u ≥ 0, the probability of the event

    sup_{x∈T} | ||Ax|| − √m ||x|| | ≤ C K² [ω(T) + u rad(T)]   (1.6)

is at least 1 − 2 exp(−u²). Here rad(T) is the radius of the smallest ball that can contain T.

Again in this theorem we transition from the earlier pairwise distances x − y of theorem 1.1.1 to individual vectors x only. We note that all vectors of the form (x − y)/||x − y|| are put into a set T and theorem 1.1.2 is applied to this set.
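To make the JL condition concrete, the following is a minimal numerical sketch (my own illustration, not part of the original development) of the conclusion of theorem 1.1.1, using a scaled Gaussian matrix in place of a random projection; the constant 8 in the choice of m is an illustrative stand-in for c₁.

    import numpy as np

    rng = np.random.default_rng(0)
    N, n, eps = 1000, 50, 0.3
    m = int(np.ceil(8 * np.log(n) / eps**2))   # m ~ c1 log(n)/eps^2; c1 = 8 is illustrative

    X = rng.normal(size=(n, N))                # n data points in R^N
    A = rng.normal(size=(m, N)) / np.sqrt(m)   # scaled Gaussian map, so E||Ax||^2 = ||x||^2

    # Check the JL condition (1.2) over all n(n-1)/2 pairwise secants.
    worst = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            v = X[i] - X[j]
            ratio = np.linalg.norm(A @ v) / np.linalg.norm(v)
            worst = max(worst, abs(ratio - 1))
    print(f"m = {m}, worst distortion = {worst:.3f}, target eps = {eps}")

On typical runs the observed worst distortion stays below ε, and it degrades as m is decreased, matching the 1/ε² trade-off discussed above.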
The proof of this theorem is based on the generic chaining technique [35, chapter 9]. The term √m ||x|| acts as the average E||Ax||. The right-hand side is the deviation from the mean. To achieve a higher success probability, we must allow for a larger deviation.

If we have a choice over m, by dividing both sides of (1.6) by √m and choosing m large enough, we can bound the deviation by an error ε of our choosing. In this sense, a large enough m allows us to construct an oblivious JL map for any set in terms of its Gaussian width.

Now we discuss the Gaussian width. We present two ways to think about it. One way uses probability and Gaussian random variables. The second way is geometrical, and doesn’t use probability. Having the different viewpoints allows us to switch from one to the other depending on the application. The probabilistic definition is as follows.

Definition 1.1.3. [35, Definition 7.5.1] Let g be a standard Gaussian random vector, g ∼ N(0, I_N). The Gaussian width of a subset T ⊂ R^N is defined as

    ω(T) = E sup_{x∈T} ⟨g, x⟩.   (1.7)

With an absolute value, E sup_{x∈T} |⟨g, x⟩|, one gets the Gaussian complexity of T, denoted by γ(T). In general ω(T) ≤ γ(T), but if T is origin-symmetric then ω(T) = γ(T).

To get an interpretation of how the Gaussian width measures the width of a set, first we need the concept of spherical width. Then we show how it relates to the Gaussian width.

Definition 1.1.4. [35, Definition 7.5.5] The spherical width of a subset T ⊂ R^N is defined as follows, where θ is a direction uniformly distributed on the sphere, θ ∼ Unif(S^{N−1}):

    ω_s(T) = E sup_{x∈T} ⟨θ, x⟩.   (1.8)

Figure 1.1 shows how the width of a set in a particular direction is measured. When this directional width is averaged over all possible directions, we get the spherical width.

Figure 1.1: Two parallel hyper-planes containing a set determine its width in that direction. From [35].

For a Gaussian random vector, one can consider its direction θ and length r separately. The direction θ is uniformly distributed and decouples from r. Next we recall that the expected length of a standard Gaussian vector differs from √N by at most a universal constant, |E||g|| − √N| ≤ C. This lets us relate the Gaussian width to the spherical width as follows:

    (√N − C) ω_s(T) ≤ ω(T) ≤ (√N + C) ω_s(T).   (1.9)

In this sense the Gaussian width is the spherical width of a set scaled by √N, and the spherical width has a clear geometric interpretation.

We give 4 examples of Gaussian widths that are often encountered in high dimensional probability, see [35, 7.5.8-7.5.10]. They are the L₁ ball B₁^N, the L₂ ball B₂^N, the L∞ ball B∞^N, and a set T of n points. We use C for any universal constant.

    C₁ √log(N) ≤ ω(B₁^N) ≤ C₂ √log(N)   (1.10)

    |ω(B₂^N) − √N| ≤ C   (1.11)

    ω(B∞^N) = √(2/π) · N   (1.12)

    ω(T) ≤ C √log(n) · diam(T)   (1.13)

Dividing the first three expressions by √N gives spherical width estimates for the corresponding L_p balls. One sees that the width of the L₂ ball is a constant as expected, while the width of the L₁ ball decreases to zero as √(log(N)/N) and the width of the L∞ ball grows as √N.

With the Gaussian width, theorem 1.1.2 lets us get JL maps for infinite sets. However one still needs techniques for estimating the Gaussian width. In this thesis, we restrict our discussion to compact smooth manifolds. To get a JL map for a compact manifold we need estimates for the Gaussian width of its set of unit secants. The regularity provided by being a manifold allows us to find explicit estimates and thus close the argument for JL maps of this class of objects. More about this estimate is in section 2.6.
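The widths above are easy to probe numerically. The snippet below is a small Monte Carlo check (an illustration, not part of the original text) of formula (1.12): for T = B∞^N the supremum sup_{x∈T} ⟨g, x⟩ is attained at x = sign(g), so it equals ||g||₁.

    import numpy as np

    rng = np.random.default_rng(1)
    N, trials = 200, 2000

    # For T = B_inf^N, sup over T of <g, x> equals ||g||_1 (take x = sign(g)).
    widths = [np.abs(rng.normal(size=N)).sum() for _ in range(trials)]
    print(np.mean(widths), np.sqrt(2 / np.pi) * N)   # both close to sqrt(2/pi) * N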
The key inequality that allows us to relate geometric properties of a manifold to its Gaussian width is Dudley’s inequality [18, page 226]. For a universal constant c, one can relate the covering numbers of a set T to its Gaussian width as follows:

    ω(T) ≤ c ∫₀^∞ √(log C(T, ε)) dε.   (1.14)

Motivated by this inequality we calculate various covering numbers in chapter 2.

1.2 Fast Matrices

Having discussed random projections and Gaussian matrices for dimension reduction, the next step is to speed up the computation. For an m × N matrix, general matrix-vector multiplication takes O(mN) operations. Since Gaussian matrices are densely populated with random variables, this is difficult to improve. If instead we specialize to discrete Fourier matrices, one can use the Fast Fourier Transform (FFT) algorithm. The FFT allows for O(N log(N)) matrix-vector multiplication for square matrices. Therefore the Discrete Fourier Transform (DFT) matrices are ideal candidates for fast JL maps. Other examples in this class are the Walsh-Hadamard matrices. However, for JL maps we need short and wide matrices with m ≤ N. This will come from sub-sampling the Fourier matrix. Also, since Fourier matrices have a fast matrix-vector multiplication, one can do the full multiplication and throw away the unwanted rows.

Constructing JL maps from Fourier matrices requires a process that we briefly describe below. Such an approach has been used before in earlier works. One example is the work in [29], where the authors give a JL construction via Fourier matrices for an arbitrary object in terms of its Gaussian width. We use an indirect approach from the field of Compressive Sensing (CS). This topic was first discussed by Terence Tao and Emmanuel Candès in their seminal papers [11], [12] and [13] ([11] is the highest cited paper of Tao according to Google Scholar at the time of writing). CS is a large field; for our purposes we need three tools from it. First is the concept of the Restricted Isometry Property (RIP), second is an RIP estimate for discrete Fourier transform matrices (one can think of RIP estimates as the fundamental theorems of CS), and third is a theorem called Krahmer-Ward that transforms an RIP estimate into a JL theorem. Below we describe each step separately. If F is a DFT matrix, then the construction scheme is as follows:

    √(N/m) · (sample m rows from F_{N×N}) · diag(±1, ..., ±1)

The first two factors, the rescaled row-subsampled Fourier matrix, carry the Restricted Isometry Property; the random diagonal matrix of signs is used in the Krahmer-Ward theorem. We also note that there are multiple conventions for normalizing a DFT matrix. Here we adopt the convention that the matrix is unitary. If ω is an N-th root of unity, ω^N = 1, then the matrix F would be

    F = (1/√N) ·
    [ 1   1         1            ...   1
      1   ω         ω²           ...   ω^{N−1}
      ⋮
      1   ω^{N−1}   ω^{2(N−1)}   ...   ω^{(N−1)²} ].   (1.15)

The first step is the Restricted Isometry Property. RIP is about how much a matrix distorts sparse vectors. It is similar to the JL condition, but it is restricted to sparse vectors. This property is reported through pairs of numbers. In each pair, there is a sparsity level s and a deviation value δ. In short it is sometimes written as δ_s.
As a reminder, the sparsity of a vector is the number of non-zero components of that vector, denoted by ||x||₀, where the zero comes from the limit lim_{p→0} ||x||_p^p. So if a matrix A possesses the δ_s RIP, then for all vectors x that are s-sparse or sparser we simultaneously have

    (1 − δ_s)||x||₂ ≤ ||Ax||₂ ≤ (1 + δ_s)||x||₂.   (1.16)

Some authors put squares in the RIP condition, as in (1 − δ_s)||x||₂² ≤ ||Ax||₂² ≤ (1 + δ_s)||x||₂². These definitions are equivalent up to scaling δ_s by an absolute constant.

The second step is an RIP estimate for the discrete Fourier transform matrices. The estimate we provide below is for the general family of orthonormal matrices with m rows sampled uniformly in an i.i.d. fashion.

Lemma 1.2.1. [29, Lemma 4.2], [18, theorem 12.32] Let F ∈ R^{N×N} be an orthonormal matrix obeying

    F*F = I,  max_{i,j} |F_{ij}| ≤ K/√N.   (1.17)

Define the random subsampled matrix H ∈ R^{m×N} with i.i.d. rows chosen uniformly at random from the rows of F. Then RIP(s, δ) holds for √(N/m) H with probability at least 1 − e^{−η} for all δ > 0 as long as

    m ≥ C K² (1 + η) s log⁴(N)/δ².   (1.18)

Some remarkable characteristics of the bound (1.18) are that it is linear in s and logarithmic (to the 4th power) in N. It has an explicit dependence on the ambient dimension N; since N could be prohibitively large, improving the dependence on N in this bound has been an active area of research in CS. For a recent improvement see [10].

The third step is a theorem that allows us to get a JL theorem for finite points from an RIP estimate. This theorem is known as the Krahmer-Ward theorem; it takes a matrix with an RIP estimate and n points as data, and applies a sequence of random reflections through the coordinate axes. The result is a probabilistic theorem for a JL map with a controlled failure probability.

Theorem 1.2.2. (Krahmer-Ward) [18, theorem 9.36] Let x_i be n points in R^N. Let ρ, ε ∈ (0, 1). Let A ∈ R^{m×N} be a matrix whose restricted isometry constant for sparsity 2s, i.e. δ_{2s}, satisfies δ_{2s} ≤ ε/4 for s ≥ 16 log(4n/ρ). Let D be a diagonal matrix with Rademacher random variables, i.e. with uniform ±1 random variables, on the diagonal. Then with probability exceeding 1 − ρ the following holds simultaneously for all x_i:

    (1 − ε)||x_i||₂² ≤ ||A D x_i||₂² ≤ (1 + ε)||x_i||₂².   (1.19)

Combining the above three steps, we get a JL map for finite points using DFT matrices. The benefit of this approach is the computational speed. The cost to pay is a higher final dimension compared to Gaussian or random projection matrices. In chapter 6, we show how one can combine the two approaches and get the best of both worlds.
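As an illustration of the three-step construction, here is a minimal sketch (my own, under the unitary normalization adopted above) that applies the random sign flip, the unitary DFT, and the row subsampling in O(N log N) time via the FFT; rows are sampled i.i.d. with replacement as in lemma 1.2.1.

    import numpy as np

    rng = np.random.default_rng(2)
    N, m = 4096, 256

    signs = rng.choice([-1.0, 1.0], size=N)   # diagonal D of Rademacher signs (Krahmer-Ward step)
    rows = rng.integers(0, N, size=m)         # m rows sampled i.i.d. uniformly (RIP step)

    def fast_jl(x):
        # sqrt(N/m) * (m sampled rows of the unitary DFT) * D, applied in O(N log N) time
        y = np.fft.fft(signs * x) / np.sqrt(N)   # unitary DFT of the sign-flipped vector
        return np.sqrt(N / m) * y[rows]          # keep only the sampled rows, rescaled

    x = rng.normal(size=N)
    print(np.linalg.norm(fast_jl(x)) / np.linalg.norm(x))   # typically close to 1

Since the full product is computed by one FFT and the unwanted rows are then discarded, the run-time is O(N log N) regardless of m, which is exactly the trade-off discussed above.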
CHAPTER 2 GEOMETRIC PROPERTIES OF REACH AND COVERING ESTIMATES

In this chapter we discuss geometric estimates based on a regularity parameter known as reach. The reader interested only in the JL applications can skip to the later chapters and refer back as necessary.

We provide 4 main theorems. These theorems provide us with the necessary geometric tools to construct our desired JL maps for manifolds. In theorem 2.1.7 we give a comprehensive list of properties of reach. In theorem 2.2.6 we give an upper bound for the covering numbers of a compact manifold with boundary. We do so by first covering the boundary as an independent manifold. This covers a collar of the boundary, after which we cover the interior. The method of proof is based on Gunther’s volume comparison theorem and is restricted by the injectivity radius of the exponential map in the interior of M away from the boundary. In theorem 2.4.1 we provide a lower bound for the covering numbers of a compact manifold. In theorem 2.5.2 we give a covering number bound from above for the unit secants of a submanifold.

2.1 Reach and its Properties

Here we review the concept of reach for a submanifold of Euclidean space and in theorem 2.1.7 provide a comprehensive list of its properties. Theorem 2.1.7 presents the main properties of reach in a convenient and ready-to-use manner for application to geometric problems, and we hope it is helpful for other researchers. We also discuss the case when the submanifold possibly possesses a boundary, as is expected to be the case for manifold models in applications.

Reach is an extrinsic parameter that is defined based on how far one can move away from an embedded submanifold while maintaining a unique closest point property. This parameter also controls the extrinsic acceleration of unit speed geodesics. It has been used extensively as a regularity parameter since 1959, when it was defined by Federer in [16]. A historical viewpoint of its development can be found in [34]. Some earlier results regarding reach can be found in [1], [8] and [15]. In [1], the authors offer a probabilistic method for estimating the reach of a manifold. In [8] the authors consider the intersection of a ball of small radius with a set of positive reach, and for manifolds they estimate the angle between tangent spaces at different points in terms of reach. In [15, theorem 2], the authors give a JL theorem for a closed manifold via reach. Here, we begin by recalling the definition of reach and then we summarize its properties.

Definition 2.1.1. (Reach [16, definition 4.1]) For a closed subset of Euclidean space A ⊂ R^n, the reach τ is defined as

    τ(A) = sup{t ≥ 0 | every x ∈ R^n with d(x, A) < t has a unique closest point in A}.

The above definition is for closed subsets of R^n. When restricted to submanifolds, reach depends in part on the second fundamental form of the embedding. However, control of the second fundamental form does not fully control reach, as two sheets of the submanifold may come close to each other. One can obtain an equivalent characterization of reach in terms of the injectivity radius of the normal exponential map, defined below following [22, Section 8.1]. First we need the concept of cut points for the normal exponential map.

Definition 2.1.2. (Cut point) Let ξ be a line segment in R^n meeting M orthogonally at m ∈ M. We say x ∈ ξ is a cut point (cut-focal point) along ξ provided the distance from x to M is no longer minimized along ξ past x.

Definition 2.1.3. (Cut distance) Define the function e_c : {(p, u) | p ∈ M, u ∈ N_pM, ||u|| = 1} → R such that

    e_c(p, u) = sup{t > 0 | distance(p + tu, M) = t}.

In words, e_c(p, u) is the distance along direction u, starting orthogonally at p, past which the line segment along u stops being distance minimizing to M.

Definition 2.1.4. ([22, Section 8.1]) The injectivity radius of the normal exponential map, r_inj(exp_N) (also known as the minimal focal-cut distance), is

    inf{e_c(p, u) | p ∈ M, u ∈ N_pM, ||u|| = 1}.

When a boundary is present, one must specify the criteria for normal vectors at the boundary. They are defined as the vectors that make an obtuse angle with inward pointing tangent vectors, following [16, definition 4.4]. These vectors are normal to the boundary, when the boundary is viewed as an independent manifold, but only make obtuse angles with the manifold itself. Let p ∈ ∂M ⊂ R^n and let v ∈ T_pM be inward pointing.
Then u ∈ R^n is normal to M at p if u · v ≤ 0 for all such v.

We also need the concept of focal points; this helps us describe the behavior where the distance to the submanifold is equal to the reach.

Definition 2.1.5. (Focal point, [22, section 8.1]) Let M be a submanifold of R^n. A focal point of M is a point x ∈ R^n such that the exponential map exp_N of the normal bundle of M is singular somewhere on exp_N^{−1}(x).

Definition 2.1.6. [22, section 8.1] Let e_f(p, u) be the distance from p ∈ M to its first focal point along the geodesic t → exp_N(p, tu). In other words, let e_f : {(p, u) | p ∈ M, u ∈ M_p^⊥, ||u|| = 1} → R be the function defined by

    e_f(p, u) = inf{t > 0 | kernel(((exp_N)_*)_{(p,tu)}) ≠ {0}}.

We are now ready to state the key properties of reach. In this approach we are in part following the work of [1, proposition A.1]. We provide a survey of results that are scattered in the literature and provide proofs where they cannot be easily located elsewhere.

Theorem 2.1.7 (Properties of Reach). Let M be a compact, smooth submanifold of R^n, possibly with boundary. Let τ be the reach of M. Let p, q ∈ M and x ∈ R^n. Then the following properties hold.

1. All geodesics in M are C¹. In the interior of M, they are smooth. At all points, unit speed geodesics have one-sided second derivatives. When switching from interior to boundary, geodesics may bifurcate.

2. The one-sided second derivatives of unit speed geodesics are bounded above in norm by 1/τ. In particular, in the interior of M the norm of the second fundamental form of M is bounded by 1/τ.

3. In the interior of M, any sectional curvature k satisfies −2/τ² ≤ k ≤ 1/τ².

4. (Federer Tubular Neighborhood Theorem) If the line segment px is normal to M at p, and ||p − x|| < τ, then p is the closest point to x in M.

5. If the line segment pq is normal to M at p, then ||p − q|| ≥ 2τ.

6. At distance exactly τ away from M, either there is a focal point; or there are two points p₁, p₂ ∈ M and y ∈ R^n such that ||p₁ − y|| = ||p₂ − y|| = τ and the line segments p₁y and p₂y meet at y at an angle of π; or both.

7. τ = r_inj(exp_N).

8. τ > 0.

9. For p ∈ M, the injectivity radius at p is at least min{d_M(p, ∂M), πτ}, where d_M(p, ∂M) is the geodesic distance of p to the boundary of M. If there is no boundary, the injectivity radius is at least πτ.

For the next two properties, let d and l be the Euclidean and geodesic distances between p and q respectively.

10. l − l²/(2τ) ≤ d.

11. When restricted to d ≤ τ/2, we further have l ≤ d + 2d²/τ.

For the last two properties, let γ(t) be a unit speed geodesic connecting p to q. Let v ∈ T_pM be γ̇(0). Let φ be the angle between v and the secant line connecting p to q. Let w ∈ T_pM be any unit vector and w* ∈ T_qM its parallel transport via the connection on M along γ. Let θ be the angle between w and w* after they are parallel transported to the origin in R^n.

12. θ ≤ l/τ.

13. When restricted to d ≤ τ/2, sin(φ) ≤ (d/(2τ))(1 + 2d/τ)².

Proof. 1. See [3, theorem 1].

2. See [8, lemma 4] or [27, proposition 6.1]. In the case where a unit speed geodesic γ(t) is part of the boundary, the only modification is that γ̈(t) is normal to the boundary, and the same argument applies.

3. This follows from item 2 above and Gauss’s equation [28, page 100, theorem 5].

4. See [16, theorem 4.8(12)] or [8, theorem 2].

5. By Federer’s tubular neighborhood theorem [8, section 3.1], for points on the line segment pq with distance to p less than τ, p is the closest point to them.
Therefore q is at least τ away from the point on the segment that is τ away from p. This gives the claimed ||p − q|| ≥ 2τ.

6. See [33].

7. Let r = r_inj(exp_N). First we show τ ≤ r. For the sake of contradiction, assume r < τ. Then by the definition of r, there is a normal line segment to M that stops being distance minimizing past distance r away from M. But this contradicts Federer’s tubular neighborhood theorem, as such normal line segments must remain distance minimizing up to distance τ.

Next we show τ < r is impossible, and that is enough to obtain τ = r. For contradiction, assume τ < r. By item 6, at distance exactly τ away from M there must be either a focal point or two equal length line segments normal to M intersecting each other at an angle of π. If there is a focal point, it is known that past a focal point geodesics stop being distance minimizing, see [31, lemma 2.11]. Hence past distance τ there is a normal line segment to M that stops being distance minimizing; but this contradicts the definition of r, because it guarantees such segments remain distance minimizing up to distance r away from M. In the case of two equal line segments, call them p₁y and p₂y with p₁, p₂ ∈ M. Since they meet at an angle of π, the points p₁, y and p₂ are collinear; and since ||p₁ − y|| = ||p₂ − y||, the segment p₁y stops being distance minimizing past y. Again this contradicts the guarantee of being distance minimizing up to distance r, showing τ < r is impossible.

8. See [7, page 151, theorem 4].

9. By [4, theorem 3], in a manifold with boundary and reach τ, two geodesics with the same starting point must travel a distance of πτ before they meet. Separately, all sectional curvatures are bounded above by 1/τ². Then by Klingenberg’s theorem [6, theorem 89], we get that the injectivity radius is at least πτ. Since we take a minimum with the distance to the boundary, we are guaranteed not to intersect it.

10. See [27, lemma 6.3].

11. For partial and complementary proofs see [27, lemma 6.3] and [15, lemma 7]. Here we streamline the above arguments. Inverting l − l²/(2τ) ≤ d, one of

    l ≤ τ − τ√(1 − 2d/τ)  or  l ≥ τ + τ√(1 − 2d/τ)

must hold. If the strict inequality d < τ/2 holds, then l = τ is impossible. We use this fact below to show that when d ≤ τ/2 we must have l ≤ τ. Given l ≤ τ, only l ≤ τ − τ√(1 − 2d/τ) remains, and using 1 − √(1 − x) ≤ (x + x²)/2 we get the claimed l ≤ d + 2d²/τ.

It remains to rule out τ < l. For the sake of contradiction, assume d ≤ τ/2 and τ < l. Then S₁ = {a ∈ M | d_M(p, a) = τ} is not empty. Let q* be a minimizer of the Euclidean distance to p in S₁:

    q* = arg inf_{a ∈ S₁} ||p − a||.

Then q* is also a minimizer of the Euclidean distance to p over S₂ = {a ∈ M | d_M(p, a) ≥ τ}. For if there were a minimizer a in the interior of S₂, then the line segment pa must be normal to M at a, so ||p − a|| ≥ 2τ by item 5. But since a is a minimizer, it holds that ||p − a|| ≤ ||p − q|| ≤ τ/2, and that contradicts ||p − a|| ≥ 2τ. Similarly, q cannot be the minimizer because it is in the interior of S₂. So there is a strict inequality ||p − q*|| < ||p − q|| ≤ τ/2. But for ||p − q*|| < τ/2, it is impossible to have d_M(p, q*) = τ (by item 10, d_M(p, q*) = τ forces ||p − q*|| ≥ τ/2), and this contradicts how q* was constructed.

12. See [8, lemma 6].

13. See figure 2.1. We have sin(φ) = h/d, where h is the distance of q to the line through p with direction v. Since we are restricted to d ≤ τ/2, using item 11 we have l ≤ τ. Now we bound h. Let θ(s) be the angle between v and the tangent vector to the geodesic at distance s along the geodesic, when viewed as vectors in R^n. From item 12 we get that θ(s) ≤ s/τ.
The incremental height gain in the h direction is at most sin(θ(s)) ds. Therefore we obtain

    h ≤ ∫₀^l sin(θ(s)) ds ≤ ∫₀^l sin(s/τ) ds = τ − τ cos(l/τ) ≤ l²/(2τ) ≤ (d²/(2τ))(1 + 2d/τ)².

Dividing by d gives the claimed result. ■

Figure 2.1: Two points with the geodesic and secant line between them. We bound the angle between d and V₀ using τ.
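The metric inequalities in items 10 and 11 are easy to test on the simplest example: on the unit circle (τ = 1) the geodesic distance l and the chordal distance d = 2 sin(l/2) are explicit. The following check is an illustration I am adding, not part of the original proof; it samples l over (0, π].

    import math

    # Unit circle: tau = 1; geodesic distance l, Euclidean (chordal) distance d = 2 sin(l/2).
    for l in (x / 1000 for x in range(1, 3142)):
        d = 2 * math.sin(l / 2)
        assert l - l**2 / 2 <= d + 1e-12           # item 10: l - l^2/(2 tau) <= d
        if d <= 0.5:                               # item 11 requires d <= tau/2
            assert l <= d + 2 * d**2 + 1e-12       # item 11: l <= d + 2 d^2 / tau
    print("items 10 and 11 verified on the sampled unit circle")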
2.2 Covering Estimate from Above via Gunther’s Theorem

We first present a covering bound for closed manifolds. Since the boundary of a manifold is itself a closed manifold, this argument is useful in the case where the manifold has boundary. First we need a lemma relating covering and packing numbers. For definitions of packing and covering see [35, lemma 4.2.8].

Lemma 2.2.1. Let X be a metric space with A a subset of it. Then for any 0 < ε, the ε-covering number of A is not greater than the (ε/2)-packing number of A:

    C(ε) ≤ P(ε/2).   (2.1)

Proof. Start with a maximal packing with balls of radius ε/2. Then enlarge the balls to radius ε. We claim this is a covering. If not, there is a point y ∈ A that is more than ε away from all the centers of the balls. Then a ball of radius ε/2 centered at y would be disjoint from the original packing balls, and that contradicts the maximality of the packing. ■

Theorem 2.2.2. (Covering a Closed Manifold) Let M ↪ R^n be a closed (compact, no boundary) submanifold of R^n of dimension d ≥ 0. Let τ > 0 be the reach of M, and let V denote the volume of M, as induced by the embedding. When d = 0, we use the counting measure for the volume. Then if d = 0, V balls are sufficient to cover M. If d > 0, let ω_d denote the volume of the d-dimensional Euclidean ball of radius 1. Then for 0 < ε < 2√6 τ, one can cover M with at most

    V / (ω_d (1 − ε²/(24τ²))^{d−1} (ε/2)^d)

n-dimensional Euclidean balls of radius ε with centers on M.

Proof. The case d = 0 corresponds to isolated points and 1 ball per point is sufficient. So assume d > 0. The covering number with balls of radius ε is bounded above by the packing number with radius ε/2, [35, lemma 4.2.8]. We claim each such packing ball contains a geodesic ball of radius ε/2, and that the volume of the geodesic ball is at least ω_d (1 − ε²/(24τ²))^{d−1} (ε/2)^d. Since the packing balls are disjoint, the claim implies the theorem immediately.

For x ∈ M, B_{R^n}(x, ε/2) contains the geodesic ball B_M(x, ε/2) because the geodesic distance between two points of M is not smaller than the Euclidean distance between them. We will work with this geodesic ball. To bound the volume of the geodesic ball from below, we use Günther’s volume comparison theorem, see [19, page 169, theorem 3.101, part ii]. To use it, we need to bound the sectional curvatures of M from above, and ensure that the geodesic balls don’t touch the cut locus of their center. From the properties of reach, theorem 2.1.7, all sectional curvatures of M are bounded above by 1/τ² and the injectivity radius of M is at least πτ. To avoid intersecting the cut locus, we need ε/2 < πτ; this is satisfied by the assumption that ε < 2√6 τ (one could allow for a larger range of ε, see remark 2.2.3). By Günther’s theorem, the volume of the geodesic ball is at least the volume of a geodesic ball of radius ε/2 in the simply connected model space of constant sectional curvature 1/τ², which is a sphere of radius τ. Let V_{1/τ²}(ε/2) denote this volume. It is explicitly given by the formula (see [25]):

    V_{1/τ²}(ε/2) = (2π^{d/2}/Γ(d/2)) ∫₀^{ε/2} S_{1/τ²}(x)^{d−1} dx   (2.2)

where for k > 0, S_k(x) = (1/√k) sin(√k x). It follows that

    (2π^{d/2}/Γ(d/2)) ∫₀^{ε/2} S_{1/τ²}(x)^{d−1} dx = (2π^{d/2}/Γ(d/2)) ∫₀^{ε/2} (τ sin(x/τ))^{d−1} dx.   (2.3)

Since sin(x)/x is decreasing on [0, π], and ε < 2√6 τ < 2πτ,

    V_{1/τ²}(ε/2) ≥ (2π^{d/2}/Γ(d/2)) (sin(ε/(2τ)) / (ε/(2τ)))^{d−1} (1/d) (ε/2)^d.   (2.4)

Using 0 < 1 − x²/6 < sin(x)/x for 0 < x < √6, ω_d = π^{d/2}/Γ(d/2 + 1) and Γ(d/2 + 1) = (d/2) Γ(d/2), we get

    V_{1/τ²}(ε/2) ≥ ω_d (1 − ε²/(24τ²))^{d−1} (ε/2)^d.   (2.5)

■

Remark 2.2.3. In the proof of theorem 2.2.2, we restricted the range of ε to 0 < ε < 2√6 τ. The reason is that 1 − x²/6, which is the second order Taylor expansion of sin(x)/x, remains positive in this range. One could work with the sinc(x) function itself to have the larger range 0 < ε < 2πτ, or one could work with the 6th order Taylor expansion of sin(x)/x, which remains positive slightly beyond 3 and satisfies 0 ≤ T₆(x) ≤ sinc(x) on [0, 3]. However, for the purposes of JL applications such improvements only offer marginal benefits, and hence for simplicity we state our bounds with the second order expansion.

Next, as an example, we apply our theorem to S^d. A standard estimate for covering S^d with balls centered on the sphere is (3/ε)^d, see [35, corollary 4.2.13]. Here we get 3.4√d (2.1/ε)^d. In the special case of S^d there are more economical bounds of order O(d^{1.5} log(d)/ε^d) ([9, theorem 6.8.1]), but they apply only to S^d, and not to arbitrary manifolds.

Corollary 2.2.4. For 0 < ε < 1, the standard S^d ↪ R^n can be covered with at most 3.4√d (2.1/ε)^d Euclidean n-dimensional balls of radius ε.

Proof. By theorem 2.2.2, we need at most V_M / (ω_d (1 − ε²/(24τ_M²))^{d−1} (ε/2)^d) balls. We have τ = 1, V_M = 2π^{(d+1)/2}/Γ((d+1)/2), ω_d = π^{d/2}/Γ(d/2 + 1), and V_M/ω_d = 2√π Γ(d/2 + 1)/Γ((d+1)/2) < 2√(πd). In the last step we used Gautschi’s inequality [20]. This leads to an upper bound of 2√(πd) (24/23)^{d−1} (2/ε)^d ≤ 3.4√d (2.1/ε)^d. ■

Next we focus on a covering estimate for a compact manifold with boundary. First we present a lemma showing that the number of connected components of the boundary must be finite.

Lemma 2.2.5. Let M be a compact connected embedded submanifold of R^n with boundary ∂M. Then ∂M has a finite number of connected components.

Proof. Assume not. Let p_i ∈ M be a sequence of points where each p_i belongs to a different component of ∂M. Since M is compact, the p_i converge to a point p ∈ M after passing to a subsequence. Either p is in the interior of M or it belongs to the boundary of M. If p is in the interior, there is an open neighborhood of p in R^n intersecting M only in interior points. Since the p_i are boundary points, this contradicts their convergence to p. If p is in ∂M, then it is in one of the connected components, say the j-th. Again there is an open neighborhood of p in R^n intersecting M only in points of the j-th component of the boundary or in interior points; this contradicts the p_i converging to p. ■

Using theorem 2.2.2 and lemma 2.2.5, we present a covering estimate for a manifold with boundary.

Theorem 2.2.6. (Covering a Manifold with Boundary) Let M ↪ R^n be a compact d-dimensional, d ≥ 1, submanifold of R^n with boundary. Let τ_M > 0 and V_M be the reach and volume of M. Let ∂M denote the boundary of M as an independent submanifold of R^n, with V_{∂M} the volume of ∂M. Let τ_{∂ᵢM} > 0 be the reach of the i-th component of the boundary of M, and let τ_{∂M} = inf_i {τ_{∂ᵢM}}. Fix 0 < ε ≤ min{4√6 τ_{∂M}, 2√6 τ_M}. Then for d = 1, M can be covered with V_M/ε + V_{∂M} n-dimensional Euclidean balls of radius ε, where V_{∂M} corresponds to the counting measure.
For d ≥ 2, M can be covered with

    V_M / (ω_d (1 − ε²/(24τ_M²))^{d−1} (ε/2)^d) + V_{∂M} / (ω_{d−1} (1 − ε²/(96τ_{∂M}²))^{d−2} (ε/4)^{d−1})

many balls.

Proof. Consider two subsets of M, S₁ = {x ∈ M | d_M(x, ∂M) < ε/2} and S₂ = M \ S₁. We will cover each region separately.

To cover S₁ we cover ∂M with balls of radius ε/2, and then enlarge them to balls of radius ε. We claim such balls cover S₁. For each point x ∈ S₁, there is a point y ∈ ∂M such that d_M(x, y) < ε/2. Also d_{R^n}(x, y) ≤ d_M(x, y) < ε/2. Since we have an (ε/2)-covering of ∂M, the center o_i of one of the balls satisfies d_{R^n}(o_i, y) < ε/2. Therefore d_{R^n}(o_i, x) ≤ d_{R^n}(o_i, y) + d_{R^n}(y, x) < ε, as claimed.

When d = 1, the boundary is a collection of points, so V_{∂M} balls are enough to cover the region S₁. For the region S₂, we pack it with balls of radius ε/2. Each Euclidean ball will contain the geodesic ball of the same radius; for d = 1 that is a curve of length ε. Since the volume of S₂ is at most V_M and the packing balls are disjoint, we can pack at most V_M/ε balls. This gives our covering claim of V_M/ε + V_{∂M}.

Now assume d ≥ 2. We claim V_{∂M} / (ω_{d−1} (1 − ε²/(96τ_{∂M}²))^{d−2} (ε/4)^{d−1}) balls of radius ε/2 are enough to cover ∂M. To show this, we use theorem 2.2.2, which requires ∂M to be compact and without boundary. The boundary of M is compact since it is a closed subset of M; furthermore, the boundary of the boundary of a manifold is empty. Therefore by theorem 2.2.2, for 0 < ε/2 ≤ 2√6 τ_{∂M}, ∂M can be covered with V_{∂M} / (ω_{d−1} (1 − ε²/(96τ_{∂M}²))^{d−2} (ε/4)^{d−1}) n-dimensional balls of radius ε/2.

To cover the region S₂, we instead pack it with balls of radius ε/2 and centers in S₂ to get an upper bound. Since the centers are in S₂, each ball will contain the geodesic ball of the same radius, B_M(o_i, ε/2). The open geodesic ball B_M(o_i, ε/2) cannot touch the boundary ∂M since its center is at least ε/2 away in geodesic distance. The cut locus of o_i is either at least πτ_M away in geodesic distance or at least as far as the geodesic distance of o_i to the boundary; and for 0 < ε/2 ≤ √6 τ_M < πτ_M the geodesic ball B_M(o_i, ε/2) does not touch the cut locus of o_i. By the volume comparison in (2.5), for 0 < ε ≤ 2√6 τ_M the volume of the geodesic ball B_M(o_i, ε/2) is at least ω_d (1 − ε²/(24τ_M²))^{d−1} (ε/2)^d. Since in a packing such balls are disjoint, their total volume cannot exceed V_M and their number is at most V_M / (ω_d (1 − ε²/(24τ_M²))^{d−1} (ε/2)^d), as claimed. ■

We apply our estimate to the standard d-dimensional closed ball B^d as a manifold with boundary.

Corollary 2.2.7. Let B^d ↪ R^n for n ≥ 2 be the standard d-dimensional closed ball of radius 1. Then to cover it with n-dimensional balls of radius ε for 0 < ε < 1, (2/ε)^d + 2π (4.05/ε)^{d−1} balls are sufficient.

Proof. From theorem 2.2.6, we need at most V_M / (ω_d (1 − ε²/(24τ_M²))^{d−1} (ε/2)^d) + V_{∂M} / (ω_{d−1} (1 − ε²/(96τ_{∂M}²))^{d−2} (ε/4)^{d−1}) balls. The parameters are:

    M = B^d:  V_{B^d} = π^{d/2}/Γ(d/2 + 1),  τ_{B^d} = ∞;
    ∂B^d = S^{d−1}:  V_{S^{d−1}} = 2π^{d/2}/Γ(d/2),  τ_{S^{d−1}} = 1;
    ω_d = V_{B^d},  ω_{d−1} = V_{B^{d−1}}.

Using 0 < ε < 1 we get (2/ε)^d + 2π (96/95)^{d−1} (4/ε)^{d−1} < (2/ε)^d + 2π (4.05/ε)^{d−1}. ■
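For a sense of scale, the following snippet (an illustration I am adding, not part of the text) evaluates the bound of corollary 2.2.7 and compares corollary 2.2.4 against the standard (3/ε)^d sphere estimate mentioned above, here at ε = 1/2.

    import math

    def sphere_cover_bound(d, eps):
        # Corollary 2.2.4: at most 3.4 sqrt(d) (2.1/eps)^d balls for S^d
        return 3.4 * math.sqrt(d) * (2.1 / eps) ** d

    def ball_cover_bound(d, eps):
        # Corollary 2.2.7: at most (2/eps)^d + 2 pi (4.05/eps)^(d-1) balls for B^d
        return (2 / eps) ** d + 2 * math.pi * (4.05 / eps) ** (d - 1)

    for d in (2, 5, 10):
        std = (3 / 0.5) ** d   # the standard sphere estimate (3/eps)^d at eps = 1/2
        print(d, f"{std:.3g}", f"{sphere_cover_bound(d, 0.5):.3g}", f"{ball_cover_bound(d, 0.5):.3g}")

The smaller base (2.1 versus 3) makes the improvement grow exponentially in d.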
2.3 Covering Estimate from Above Without Radius Restriction via Tubes

In this section we show it is possible to remove the restriction imposed on the covering numbers by the injectivity radius in theorem 2.2.2. This comes at the cost of an explicit dependence on the ambient dimension, which could be prohibitive when one is in spaces with extremely high dimensions. For a covering radius larger than the diameter of a manifold, one expects the covering number to be 1. Therefore one encounters 3 regimes for the covering radius: less than the injectivity radius, larger than the injectivity radius but up to the diameter, and larger than the diameter. Interpolating between these regimes is an interesting direction for further research. After we derive our formula in this section, we will discuss this issue again in remark 2.3.3.

Proposition 2.3.1. Let M ↪ R^n be a closed (compact, no boundary) submanifold of R^n of dimension d. Let τ > 0 be the reach of M, and V denote the volume of M. Let ω_d denote the volume of the d-dimensional Euclidean ball of radius 1. Then for any 0 < ε, one can cover M with

    V (ω_{n−d}/ω_n) (1 + ε/(2τ))^d / (ε/2)^d

n-dimensional Euclidean balls of radius ε with centers on M.

Remark. Compared to theorem 2.2.2, there is no upper bound on ε.

Proof. Instead of covering M, we pack it with n-dimensional Euclidean balls of radius ε/2 with centers on M. All such balls are within the tube of radius ε/2 around M. Since the balls are disjoint, the number of balls is bounded above by

    (Volume of the tube of radius ε/2 around M) / (ω_n (ε/2)^n).   (2.6)

To establish the proposition, we need to show that

    Volume of the tube of radius ε/2 around M ≤ V_M ω_{n−d} (ε/2)^{n−d} (1 + ε/(2τ))^d   (2.7)
    = V_M Vol(B^{n−d}(ε/2)) (1 + ε/(2τ))^d.   (2.8)

For ε < τ, the exact volume of this tube was found by Hermann Weyl in [36] (see the textbook of Gray [22] for a more detailed explanation). Lotz [26, theorem 3.1] gives an overestimate for the volume of the tube for all values of ε, as follows. Let S be the shape operator of M, defined via a local orthonormal frame E₁, ..., E_d on M and a unit normal vector v to M:

    S(v)_{ij} := S_v(E_i, E_j) = ⟨∇_{E_i} E_j, v⟩.   (2.9)

Let ψ_i(v) be defined using the characteristic polynomial of the shape operator S as

    det(Id − tS(v)) = Σ_{i=0}^{d} t^i ψ_i(v).   (2.10)

Let S(NM) be the unit sphere bundle of the normal bundle of M and define the volume form ω_{S(NM)} via the Sasaki metric [32]. Define K_i as

    K_i := ∫_{S(NM)} |ψ_i(v)| ω_{S(NM)}.   (2.11)

Then Lotz’s overestimate is given by

    (Volume of the tube of radius ε/2 around M) ≤ (ε/2)^{n−d} Σ_{i=0}^{d} (K_i / (n − d + i)) (ε/2)^i.   (2.12)

We have that 1/τ is an upper bound for all eigenvalues of the shape operator. Since the ψ_i(v) are the coefficients of the characteristic polynomial of S(v), we have

    |ψ_i(v)| ≤ (d choose i) (1/τ)^i,   (2.13)

    K_i := ∫_{S(NM)} |ψ_i(v)| ω_{S(NM)}   (2.14)
    ≤ vol(S(NM)) (d choose i) (1/τ)^i,   (2.15)

where vol(S(NM)) = vol(M) vol(S^{n−d−1}). Therefore using (2.12) we get

    (Volume of the tube of radius ε/2 around M) ≤ vol(M) vol(S^{n−d−1}) (ε/2)^{n−d} Σ_{i=0}^{d} ((d choose i) (1/τ)^i / (n − d + i)) (ε/2)^i.   (2.16)

Using vol(S^{n−d−1}) = (n − d) vol(B^{n−d}),

    (Volume of the tube of radius ε/2 around M) ≤ V_M Vol(B^{n−d}(ε/2)) Σ_{i=0}^{d} ((n − d)/(n − d + i)) (d choose i) (ε/(2τ))^i   (2.17)
    ≤ V_M Vol(B^{n−d}(ε/2)) (1 + ε/(2τ))^d   (2.18)

as claimed. ■

Now we present a similar argument for when a boundary is present.

Proposition 2.3.2. (Via Tube Formula) Let M ↪ R^n be a compact d-dimensional submanifold of R^n with boundary. Let τ_M > 0 be the reach of M \ ∂M, and let V_M denote the volume of M. Let ∂M denote the boundary of M as an independent submanifold of R^n. Let τ_{∂M} > 0 be the injectivity radius of the normal exponential map of ∂M and let V_{∂M} be the volume of ∂M. Let ω_n be the volume of the unit Euclidean n-dimensional ball. Then for all 0 < ε, M can be covered with

    V_M (ω_{n−d}/ω_n) (1/τ_M + 2/ε)^d + V_{∂M} (ω_{n−d+1}/ω_n) (1/τ_{∂M} + 2/ε)^{d−1}

n-dimensional Euclidean balls of radius ε.

Proof.
Instead of covering M with Euclidean balls of radius ε, we pack it with balls of radius ε/2. We restrict the centers to be on M. Any such ball is contained in the set of points of distance at most ε/2 from M. We will bound this volume from above. Let A₁ be the image of the normal exponential map of M \ ∂M up to radius ε/2. Let A₂ be the image of the normal exponential map of ∂M up to radius ε/2. The volume of A₁ is bounded above by V_M Vol(B^{n−d}(ε/2)) (1 + ε/(2τ_M))^d. The volume of A₂ is bounded above by V_{∂M} Vol(B^{n−(d−1)}(ε/2)) (1 + ε/(2τ_{∂M}))^{d−1}. Therefore the number of disjoint packing balls can be bounded by dividing by the volume of a single ball:

    number of balls ≤ (V_M Vol(B^{n−d}(ε/2)) (1 + ε/(2τ_M))^d + V_{∂M} Vol(B^{n−(d−1)}(ε/2)) (1 + ε/(2τ_{∂M}))^{d−1}) / (ω_n (ε/2)^n)   (2.19)
    ≤ V_M (ω_{n−d}/ω_n) (1 + ε/(2τ_M))^d / (ε/2)^d + V_{∂M} (ω_{n−d+1}/ω_n) (1 + ε/(2τ_{∂M}))^{d−1} / (ε/2)^{d−1}   (2.20)
    = V_M (ω_{n−d}/ω_n) (1/τ_M + 2/ε)^d + V_{∂M} (ω_{n−d+1}/ω_n) (1/τ_{∂M} + 2/ε)^{d−1}.   (2.21)

■

Remark 2.3.3. The bound in proposition 2.3.2, which is based on overestimating the volume of a tube around the manifold, is valid for all radii of covering balls. This is in contrast to the approach using Gunther’s theorem, where one is restricted by the injectivity radius of the manifold. However, the tube bounds depend on the ambient dimension n, while Gunther’s bound only depends on the dimension of the manifold. This is because for small enough radii one stays close to the surface of the manifold, while for large radii one must account for the dimension of the ambient space. This dependence on the ambient dimension is prohibitive especially when one is in spaces with extremely large dimension.

We show the dependence on the ambient dimension with a calculation. Using the formula for the volume of a Euclidean ball in dimension d, ω_d = π^{d/2}/Γ(d/2 + 1), one gets ω_{n−d}/ω_n = (1/π^{d/2}) Γ(n/2 + 1)/Γ((n−d)/2 + 1). In particular, if both n and d are even and d < n/2,

    ω_{n−d}/ω_n = (1/π^{d/2}) ((n−d)/2 + 1) ⋯ (n/2) ≥ (1/π^{d/2}) ((n−d)/2 + 1)^{d/2} ≥ (n/(4π))^{d/2}.

One can consider the limit ε → ∞ in the formula V (ω_{n−d}/ω_n) (1 + ε/(2τ))^d / (ε/2)^d from proposition 2.3.1. The limit scales as (ω_{n−d}/ω_n)(V/τ^d). For a compact manifold, the covering number is 1 once ε > diameter(M). This difference arises in part because the overestimate ignores the overlaps as the tube gets large. Incorporating the overlaps in the covering bound is an interesting direction for future research.
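The following sketch (illustrative only, using the unit S² of corollary 2.2.4 with V = 4π and τ = 1) evaluates the tube bound of proposition 2.3.1 against the Gunther-based bound of theorem 2.2.2, making the growth in the ambient dimension n explicit.

    import math

    def omega(k):
        # Volume of the unit k-dimensional Euclidean ball
        return math.pi ** (k / 2) / math.gamma(k / 2 + 1)

    def tube_bound(V, tau, d, n, eps):
        # Proposition 2.3.1: V (omega_{n-d}/omega_n) (1 + eps/(2 tau))^d / (eps/2)^d
        return V * omega(n - d) / omega(n) * (1 + eps / (2 * tau)) ** d / (eps / 2) ** d

    def gunther_bound(V, tau, d, eps):
        # Theorem 2.2.2: V / (omega_d (1 - eps^2/(24 tau^2))^{d-1} (eps/2)^d)
        return V / (omega(d) * (1 - eps**2 / (24 * tau**2)) ** (d - 1) * (eps / 2) ** d)

    V, tau, d, eps = 4 * math.pi, 1.0, 2, 0.5
    for n in (3, 10, 50):
        print("tube, n =", n, ":", f"{tube_bound(V, tau, d, n, eps):.4g}")
    print("Gunther (any n):", f"{gunther_bound(V, tau, d, eps):.4g}")

The tube bound grows with n while the Gunther bound does not, as discussed in remark 2.3.3.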
2.4 Covering Estimate from Below via Bishop’s Theorem

In this section we give lower bounds for the covering numbers of a manifold. These bounds are used in chapter 5 for discussing necessary conditions for JL maps.

Theorem 2.4.1. Let M be a d-dimensional smooth submanifold of R^n, possibly with boundary, with volume V and reach τ. Let 0 < ε ≤ τ/2, and r = ε(1 + 2ε/τ). Let ω_d = π^{d/2}/Γ(d/2 + 1) be the volume of the standard unit d-ball. Then the covering number of M with balls centered on M and radius ε satisfies

    V / (ω_d (1 + √2 r/(3τ))^{d−1} r^d) ≤ C(M, ε).

Proof. We claim a single ball can cover at most a subset of M with volume ω_d (1 + √2 r/(3τ))^{d−1} r^d. This claim immediately implies the theorem.

For two points p, q ∈ M with Euclidean distance δ, geodesic distance l, and δ ≤ τ/2, one has l ≤ δ(1 + 2δ/τ). Hence the intersection of M and a Euclidean ball of radius ε centered on M is contained in a geodesic ball of radius r = ε(1 + 2ε/τ) with the same center, provided ε ≤ τ/2. Therefore it is sufficient to bound the volume of such a geodesic ball from above by ω_d (1 + √2 r/(3τ))^{d−1} r^d.

We use Bishop’s volume comparison theorem [19, page 169, theorem 3.101, part i]. By the properties of reach, theorem 2.1.7, the injectivity radius at each point of M is at least πτ, and all sectional curvatures of M are bounded below by −2/τ². Hence by Bishop’s theorem the volume of a geodesic ball of radius r in M is bounded above by the volume of the geodesic ball of radius r in the simply connected space with constant sectional curvature −2/τ². Denote this volume by V_{−2/τ²}(r); it is given by the formula

    V_{−2/τ²}(r) = (2π^{d/2}/Γ(d/2)) ∫₀^r S_{−2/τ²}(x)^{d−1} dx

where for k < 0, S_k(x) = (1/√(−k)) sinh(√(−k) x). Since sinh(x)/x is increasing for 0 < x,

    V_{−2/τ²}(r) = (2π^{d/2}/Γ(d/2)) ∫₀^r S_{−2/τ²}(x)^{d−1} dx
    ≤ (2π^{d/2}/Γ(d/2)) ∫₀^r (sinh(√2 x/τ) / (√2 x/τ))^{d−1} x^{d−1} dx
    ≤ (2π^{d/2}/Γ(d/2)) (sinh(√2 r/τ) / (√2 r/τ))^{d−1} (r^d/d).

Here we use an approximation for sinh(x)/x. Since 0 < ε ≤ τ/2, we have r = ε(1 + 2ε/τ) ≤ τ, and √2 r/τ ≤ √2. Using sinh(x)/x ≤ 1 + x/3 for 0 < x ≤ √2,

    V_{−2/τ²}(r) ≤ (2π^{d/2}/Γ(d/2)) (1 + √2 r/(3τ))^{d−1} (r^d/d) = ω_d (1 + √2 r/(3τ))^{d−1} r^d.

■

Corollary 2.4.2. Let M be a d-dimensional smooth submanifold of R^n, possibly with boundary, with volume V and reach τ. Let 0 < ε ≤ τ/2. Let ω_d = π^{d/2}/Γ(d/2 + 1) be the volume of the standard unit d-ball. Then the covering number of M with balls centered on M and radius ε satisfies

    V / (ω_d (3ε)^d) ≤ C(M, ε).

Proof. We use the bound V / (ω_d (1 + √2 r/(3τ))^{d−1} r^d) ≤ C(M, ε). From 0 < ε ≤ τ/2, we simplify r = ε(1 + 2ε/τ) to r ≤ 2ε and r ≤ τ. We get the factor of 3 from (1 + √2/3) · 2 < 3. ■

2.5 Covering Estimate from Above for the Unit Secants of a Submanifold

In this section we calculate a covering number for the unit secants generated by a compact submanifold of Euclidean space with boundary, in terms of the covering numbers of the manifold. The secants of a manifold have been studied previously in [24, section 3], [30, page 1323], [37, section 1] and [38, section 3]. In our approach here, we use the covering numbers to estimate the Gaussian width of the set of unit secants.

We give a list of secants that form an ε-net for the set of unit secants. The distance between two unit secants is their distance on the unit sphere. The secants of M are separated into long and short secants; the long secants are compared to the secants generated by a net on the manifold, and the short secants are compared to tangent vectors anchored at the net points. The technique of separating the chords into long and short ones has been used previously in [14, page 5], [15, lemma 9], and [23, section 3]. For long secants, we have the following lemma.

Lemma 2.5.1. [14, lemma 4.1] Let p, p*, q and q* be 4 points in R^n. Let 0 < l = ||p − q|| and ||p − p*||, ||q − q*|| < d. Let 0 < ε < 1 and assume 4d/l ≤ ε. Then

    || (p − q)/||p − q|| − (p* − q*)/||p* − q*|| || ≤ ε.   (2.22)
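Corollaries 2.4.2 and 2.2.4 together sandwich the covering number of the unit sphere. The following quick computation (an illustration I am adding; here τ = 1, so the lower bound requires ε ≤ 1/2) displays the two sides.

    import math

    def lower_bound(d, eps):
        # Corollary 2.4.2: C(S^d, eps) >= V / (omega_d (3 eps)^d), valid for eps <= tau/2 = 1/2
        V = 2 * math.pi ** ((d + 1) / 2) / math.gamma((d + 1) / 2)   # Vol(S^d)
        omega_d = math.pi ** (d / 2) / math.gamma(d / 2 + 1)         # Vol(B^d)
        return V / (omega_d * (3 * eps) ** d)

    def upper_bound(d, eps):
        # Corollary 2.2.4: C(S^d, eps) <= 3.4 sqrt(d) (2.1/eps)^d, valid for eps < 1
        return 3.4 * math.sqrt(d) * (2.1 / eps) ** d

    for d in (2, 4, 8):
        print(d, f"{lower_bound(d, 0.25):.4g}", f"{upper_bound(d, 0.25):.4g}")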
We now present the covering argument for the unit secants. In the proof, we develop a bound for short secants similar to the one above for long secants.

Theorem 2.5.2. (Covering the unit secants) Let M ↪ R^n be a compact d-dimensional submanifold of R^n with boundary ∂M. Let τ_M be the reach of M. Let τ_i be the reach of the i-th connected component of ∂M as a submanifold of R^n. Let τ = inf_{M,i} {τ_M, τ_i}. Let V_M be the volume of M and V_{∂M} the volume of ∂M. Let U(M − M) = {(p − q)/||p − q|| | p ≠ q, p, q ∈ M} be the set of unit secants of M, and let U(M − M)‾ be its closure. Fix 0 < ε < 1. Define α = (V_M/ω_d)(41/τ)^d + (V_{∂M}/ω_{d−1})(81/τ)^{d−1}. Then U(M − M)‾ can be covered with (α²/2 + 3^d α)(1/ε^{4d}) n-dimensional Euclidean balls of radius ε.

Proof. We give a list of unit vectors in R^n, count its elements, and prove that every element of U(M − M)‾ is within distance ε of one of them. In this approach we follow [15]. The list is as follows.

1. Consider a (τε²/20)-net of points in M. Include in the list all the unit secant lines generated by pairs of these points. If τ = ∞ then 1 point is sufficient for the net. (That is because infinite reach corresponds to affine spaces, and all secants can be compared to tangent vectors at one point.)

2. For each point in the above net, consider an (ε/3)-net for its unit tangent sphere. Include all such unit vectors in the list. If an anchor point is in the boundary, there is a half unit sphere of inward pointing vectors and we consider the unit S^{d−1} as its extension.

We must show that the list above is an ε-net for U(M − M)‾. To do so, divide the secant lines into a long and a short set based on their length. The cut-off between long and short secants is τε/5. If τ = ∞ then all secants count as short.

Long secants: For a long secant between p, q ∈ M, let p* and q* be the closest points to them in the net on M. Since 2(τε²/20) < τε/5 ≤ ||p − q||, we are guaranteed that p* and q* are distinct points. By lemma 2.5.1, the distance between the unit secants of p − q and p* − q* is bounded by 4(τε²/20)/||p − q|| ≤ 4(τε²/20)/(τε/5) = ε. Therefore all long secants are covered by the secants in item 1 above.

Short secants: For a short secant, we first exchange it for the initial tangent direction of the distance minimizing geodesic connecting its base points. Second, we exchange that tangent vector for its parallel transport along the distance minimizing geodesic to the tangent space of the closest point to the base point in the (τε²/20)-net of M. Third, we exchange the parallel transported vector for one of the vectors in item 2 above. In this process, we incur 3 errors, which we denote by ε₁, ε₂ and ε₃. We must show that ε₁ + ε₂ + ε₃ ≤ ε.

Exchanging a short secant for a tangent vector. Here we bound ε₁ by ε/3. Using theorem 2.1.7, the angle between the initial tangent vector and the secant satisfies sin(φ) ≤ (d/(2τ))(1 + 2d/τ)². Because the cut-off for short secants is τε/5, substituting for d gives sin(φ) ≤ 49ε/250. Using 2 sin(φ/2) as the distance on the unit sphere between two vectors with angle φ,

    ε₁ = 2 sin(φ/2) < (3/2) sin(φ) ≤ 147ε/500 < ε/3.

Parallel transport to an anchor point. Here we bound ε₂ by ε/3. Here we parallel transport the initial geodesic tangent vector from the base point of the secant to an anchor point. Using theorem 2.1.7, θ ≤ l/τ. The Euclidean distance to the closest net point is at most d ≤ τε²/20. We translate to the geodesic distance using theorem 2.1.7, l ≤ d + 2d²/τ. Using ε ≤ 1, it follows that

    ε₂ = 2 sin(θ/2) ≤ θ ≤ l/τ ≤ (d/τ)(1 + 2d/τ) ≤ (ε²/20)(1 + ε²/10) < ε/3.

Here we had room to spare in obtaining the bound ε/3. However, the cut-off τε/5 and the net radius τε²/20 are needed for exchanging a short secant with a tangent vector and for handling the long secants.

Exchange with a preselected tangent vector: At the anchor point, we exchange the parallel transported tangent vector with a vector in the net for the unit sphere at the anchor point. By choice, an (ε/3)-net is considered at the anchor point, hence ε₃ ≤ ε/3.

Therefore together ε₁ + ε₂ + ε₃ ≤ ε. It remains to count the number of unit vectors in the above list.
For item 1, by theorem 2.2.6, 2 2 one can create a ( 𝜏𝜖 𝜏𝜖 20 )-net for 𝑀 with 𝐶 ( 20 ) when 𝐶 (𝜖) is defined as 𝑉𝑀 𝑉𝜕 𝑀 𝐶 (𝜖) = + . 𝜖2 𝜖2 𝜔 𝑑 (1 − 2 24𝜏𝑀 ) 𝑑−1 ( 2𝜖 ) 𝑑 𝜔 𝑑−1 (1 − 2 ) 96𝜏𝜕𝑀 𝑑−2 ( 𝜖 ) 𝑑−1 4 2 The number of unit secants generated by such points is bounded above by 12 𝐶 ( 𝜏𝜖 2 20 ) . For directions in the list 2, the anchor points could be in the interior or in the boundary of 𝑀. We consider the tangent 𝑆 𝑑−1 for the unit vectors for all of them; if an anchor point is in the boundary, there is a half unit sphere for inward pointing vectors and we consider the unit 𝑆 𝑑−1 as its extension. By [35, corollary 4.2.13], each 𝑆 𝑑−1 can be covered by ( 3𝜖 ) 𝑑 unit vectors. Since there is a unit sphere for 30 2 3 𝑑 each point in the net from item 1, there are 𝐶 ( 𝜏𝜖 20 )( 𝜖 ) . Adding the two estimates together there 2 2 are at most 21 𝐶 ( 𝜏𝜖 2 𝜏𝜖 3 𝑑 20 ) + 𝐶 ( 20 )( 𝜖 ) unit vectors. This number is rather inconvenient and we will 2 simplify it. One has 𝐶 ( 𝜏𝜖20 ) 𝜏𝜖 2 𝑉𝑀 𝑉𝜕 𝑀 𝐶( )= 𝜖2 2 𝜏𝜖2 + 2 2 . 20 ( 𝜏20 ) ( 20 ) 𝜖 2 ( 𝜏20 ) 𝑑−2 ( 𝜏20𝜖 ) 𝑑−1 𝜔 𝑑 (1 − 24𝜏𝑀 2 ) 𝑑−1 ( 2 )𝑑 𝜔 𝑑−1 (1 − 96𝜏𝜕𝑀 2 ) ( 4 ) We note 0 < 𝜖 ≤ 1, 𝜏 = min{𝜏𝑀 , 𝜏𝜕 𝑀 }, and (2 × 20) 𝑑 4 ≤ 40.005𝑑 < 41𝑑 (1 − (24)𝜖(202 ) ) 𝑑−1 (4 × 20) 𝑑 ≤ 80.003𝑑 < 81𝑑 . 𝜖4 (1 − (96) (202 ) ) 𝑑−1 𝑉𝑀 41 𝑑 𝑉𝜕𝑀 81 𝑑−1 Define 𝛼 = 𝜔𝑑 ( 𝜏 ) + 𝜔 𝑑−1 ( 𝜏 ) . Then 𝜏𝜖 2 𝛼 𝐶( ) ≤ 2𝑑 20 𝜖 Therefore we get the overall upper bound as 𝛼2 1  2  𝛼 3 𝑑 𝛼 𝑑 1 4𝑑 + ( 2𝑑 )( ) ≤ + 𝛼3 . 2 𝜖 𝜖 𝜖 2 𝜖 4𝑑 ■ For an example we apply our covering estimate to the standard 𝑆 𝑑 as a corollary. We note that the set of the unit secants of 𝑆 𝑑 is equal to 𝑆 𝑑 itself, hence there is redundancy as many pairs of points achieve the same unit secant. It is an interesting geometry question to find submanifolds that their secants avoid being parallel. In this direction there is the work on totally skew embeddings [21]. Such submanifolds would be great candidates for benchmarking JL maps as their unit secants is expected to be large. √ 2𝑑 Corollary 2.5.3. For covering the unit secants of 𝑆 𝑑 ↩→ R𝑛 , 14𝑑 𝜖 4𝑑 𝑑41 balls of radius 𝜖 are sufficient. 31   𝛼2 1 𝑉𝑀 41 𝑑 Proof. We calculate using theorem 2.5.2. We need 2 + 𝛼3𝑑 𝜖 4𝑑 balls for 𝛼 = 𝜔𝑑 ( 𝜏 ) + 𝑑+1 𝑉𝜕𝑀 81 𝑑−1 𝜋 2 𝜔 𝑑−1 ( 𝜏 ) . We drop the boundary term since 𝑆 𝑑 has no boundary. We have 𝜏 = 1, 𝑉𝑆 𝑑 = 2 Γ( 𝑑+1 2 ) 𝑑 𝑉 𝑑 √ Γ( 𝑑 +1) √ √ and 𝑉𝐵 𝑑 = 𝜔 𝑑 = Γ(𝜋𝑑 2+1) . Then 𝜔𝑆𝑑 = 2 𝜋 Γ( 2𝑑+1 ) ≤ 2 𝜋𝑑. Therefore 𝛼 ≤ 2 𝜋𝑑41𝑑 , and 2 2 𝛼2 √ 𝑑 ≤ 14𝑑 𝑑412𝑑 . 2 + 𝛼3 ■ 32 2.6 Gaussian Width Estimate for Unit Secants From Above Definition 2.6.1. Let 𝑔 be a standard Gaussian random variable in R𝑛 . Define the Gaussian Width of 𝑆 ⊂ R𝑛 , 𝜔(𝑆), as follows. 𝜔(𝑆) = E sup ⟨𝑔, 𝑥⟩ (2.23) 𝑥∈𝑆 Theorem 2.6.2. (Via Gunther) Let 𝑀 ↩→ R𝑛 be a compact 𝑑-dimensional submanifold of R𝑛 with boundary 𝜕 𝑀. Let 𝜏𝑀 be the reach of 𝑀. Let 𝜏𝑖 be the reach of the 𝑖-th connected component of 𝜕 𝑀 as a submanifold of R𝑛 . Let 𝜏 = inf 𝑀,𝑖 {𝜏𝑀 , 𝜏𝑖 }. Let 𝑉𝑀 be the volume of 𝑀 and 𝑉𝜕 𝑀 be the 𝑝−𝑞 volume of 𝜕 𝑀. Let 𝑈 (𝑀 − 𝑀) = { || 𝑝−𝑞|| | 𝑝 ≠ 𝑞, 𝑝, 𝑞 ∈ 𝑀 } be the set of unit secants of 𝑀, and let 𝑈 (𝑀 − 𝑀) be its closure. Let Gaussian Width of 𝑈 (𝑀 − 𝑀) be 𝜔(𝑈 (𝑀 − 𝑀)). Define 𝑉𝑀 41 𝑑 𝑉𝜕 𝑀 81 𝑑−1 𝛼= ( ) + ( ) (2.24) 𝜔𝑑 𝜏 𝜔 𝑑−1 𝜏 𝛼2 𝑐 = ( + 3𝑑 𝛼) (2.25) 2 Then √ √︁ 𝜔(𝑈 (𝑀 − 𝑀)) ≤ 4 2 log(𝑐) + 4𝑑 (2.26) Proof. We use the covering number bounds in theorem 2.5.2 and Dudley’s inequality [18, page 226]. Let 𝐶 (𝜖) denote the covering number of 𝑈 (𝑀 − 𝑀) with n-dimensional Euclidean balls of 2 radius 𝜖. 
We have that $C(\epsilon)\le\left(\frac{\alpha^2}{2}+3^d\alpha\right)\frac{1}{\epsilon^{4d}}$ for $0<\epsilon\le1$, and $C(\epsilon)=1$ for $\epsilon>1$, because $\overline{U(M-M)}$ is a subset of $S^{n-1}$. Dudley's inequality gives us
$$\omega(\overline{U(M-M)})\le4\sqrt2\int_0^\infty\sqrt{\log(C(\epsilon))}\,d\epsilon. \quad(2.27)$$
Therefore
$$\omega(\overline{U(M-M)})\le4\sqrt2\int_0^1\sqrt{\log\left(\left(\tfrac{\alpha^2}{2}+3^d\alpha\right)\tfrac{1}{\epsilon^{4d}}\right)}\,d\epsilon. \quad(2.28)$$
Let $c=\frac{\alpha^2}{2}+3^d\alpha$; then by Cauchy-Schwarz
$$\omega(\overline{U(M-M)})\le4\sqrt2\int_0^1\sqrt{\log(c)+4d\log\left(\tfrac{1}{\epsilon}\right)}\,d\epsilon \quad(2.29)$$
$$=4\sqrt2\sqrt{\log(c)}\int_0^1\sqrt{1+\frac{4d}{\log(c)}\log\left(\tfrac{1}{\epsilon}\right)}\,d\epsilon \quad(2.30)$$
$$\le4\sqrt2\sqrt{\log(c)}\left(\int_0^1 1+\frac{4d}{\log(c)}\log\left(\tfrac{1}{\epsilon}\right)d\epsilon\right)^{1/2}. \quad(2.31)$$
With the identity
$$\int_0^1 1+k\log\left(\tfrac{1}{x}\right)dx = 1-k\,(x\log(x)-x)\Big|_0^1 = 1+k, \quad(2.32)$$
we continue
$$\omega(\overline{U(M-M)})\le4\sqrt2\sqrt{\log(c)}\sqrt{1+\frac{4d}{\log(c)}} \quad(2.33)$$
$$\le4\sqrt2\sqrt{\log(c)+4d} \quad(2.34)$$
as required. ■
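The Gaussian width of definition 2.6.1 is easy to estimate by Monte Carlo. The sketch below is our own illustration (the circle and the net size are arbitrary choices): for $M=S^1\subset\mathbb{R}^2$ the closure of $U(M-M)$ is all of $S^1$, and $\omega(S^1)=\mathbb{E}\|g\|_2=\sqrt{\pi/2}\approx1.2533$, consistent with the $O(\sqrt d)$ scale of the bound (2.34).

```python
import numpy as np

# Monte Carlo estimate of the Gaussian width (definition 2.6.1); our
# illustration with the simplest example M = S^1 in R^2, where the closure of
# U(M - M) is all of S^1 and omega(S^1) = E||g||_2 = sqrt(pi/2) ~ 1.2533.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 2 * np.pi, 512, endpoint=False)
net = np.column_stack([np.cos(t), np.sin(t)])     # a dense net of the circle
g = rng.normal(size=(20_000, 2))
width_mc = (g @ net.T).max(axis=1).mean()         # E sup_{x in net} <g, x>
print(width_mc, np.sqrt(np.pi / 2))               # both ~1.2533
```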
CHAPTER 3
MORE ESTIMATES WITH REACH

3.1 Closed Submanifolds with Reach 1 and Minimum Volume

In the earlier covering estimates, such as theorem 2.6.2, we encounter the ratio $\frac{V}{\tau^d}$ for $d$-dimensional manifolds. This ratio is scaling invariant, and hence it is natural to ask if there is a minimum value for it over closed manifolds. We will also encounter this ratio again in theorem 5.0.5, in a necessary condition for JL maps. Therefore here we investigate whether it has a lower bound. In particular, if we consider the case where $d=1$, the manifold is topologically equivalent to $\mathbb{S}^1$, and $\tau=1$, then the minimum of the ratio is just the minimum length of a closed curve with reach one. Here we can show that for a closed $d$-dimensional manifold the minimum value of this ratio is given by the volume of the standard $\mathbb{S}^d$. In section 3.2 we also consider the minimum value of this ratio for a torus embedded in 3 dimensions, which carries a topological constraint.

Proposition 3.1.1. Let $M\subset\mathbb{R}^N$ be a closed smooth $d$-dimensional manifold of reach 1. Then the volume of $M$ is at least $\mathrm{Vol}(\mathbb{S}^d)$, the volume of the standard unit $\mathbb{S}^d$.

Proof. We present the proof in 4 steps.
1. The injectivity radius of a closed manifold of reach $\tau$ is at least $\pi\tau$. So for $\tau=1$, the injectivity radius is at least $\pi$; see theorem 2.1.7.
2. The sectional curvatures of $M$ are all bounded above by $\frac{1}{\tau^2}$. For $\tau=1$, they are all bounded by 1.
3. Consider the geodesic ball of radius $\pi$ around an arbitrary point. Because the injectivity radius is at least $\pi$, we can apply Gunther's volume comparison theorem: the volume of this geodesic ball is at least the volume of the geodesic ball of radius $\pi$ in the space of constant curvature one. The general formula is
$$V_{\frac{1}{\tau^2}}(r)=\frac{2\pi^{d/2}}{\Gamma(\frac{d}{2})}\int_0^r S_{\frac{1}{\tau^2}}(x)^{d-1}\,dx$$
where $S_k(x)=\frac{1}{\sqrt k}\sin(\sqrt k\,x)$ for $k>0$. So we can rewrite
$$\frac{2\pi^{d/2}}{\Gamma(\frac{d}{2})}\int_0^r S_{\frac{1}{\tau^2}}(x)^{d-1}\,dx=\frac{2\pi^{d/2}}{\Gamma(\frac{d}{2})}\,\tau^{d-1}\int_0^r\sin\left(\frac{x}{\tau}\right)^{d-1}dx.$$
Here we want $r=\pi$ and $\tau=1$, so we get
$$V=\frac{2\pi^{d/2}}{\Gamma(\frac{d}{2})}\int_0^\pi(\sin(x))^{d-1}\,dx.$$
This is the formula for the volume of the $d$-sphere, as
$$\mathrm{Vol}(\mathbb{S}^d)=\frac{2\pi^{(d+1)/2}}{\Gamma(\frac{d+1}{2})}. \quad(3.1)$$
4. The volume of $M$ is at least the volume of the geodesic ball found in the previous item. ■

By scaling we get the following more general bound.

Corollary 3.1.2. Let $M\subset\mathbb{R}^N$ be a closed smooth $d$-dimensional manifold of reach $\tau$. Then we have
$$\frac{V}{\tau^d}\ge\mathrm{Vol}(\mathbb{S}^d). \quad(3.2)$$
Proof. We scale by a factor of $\frac{1}{\tau}$ so that $\tau$ becomes 1. Under this scaling the ratio $\frac{V}{\tau^d}$ does not change. After the scaling we have $\frac{V_{\mathrm{new}}}{1}\ge\mathrm{Vol}(\mathbb{S}^d)$, and therefore the same is true before scaling. ■

3.1.1 Codimension 1

In this subsection we give an alternative proof, via the isoperimetric inequality, of the lower bound on the ratio $\frac{V}{\tau^d}$ in the special case $d=n-1$ (codimension 1) for closed manifolds. In the case of equality in our bound, we get a rigidity statement coming from the isoperimetric inequality. The difference from the previous discussion lies in the method of proof and in the rigidity result in the case of equality.

Theorem 3.1.3. Let $M\subset\mathbb{R}^n$ be a codimension-1 closed submanifold of $\mathbb{R}^n$. Assume the reach of $M$ is 1. Then the $(n-1)$-dimensional volume of $M$ is at least $\mathrm{vol}(\mathbb{S}^{n-1})$, where $\mathbb{S}^{n-1}$ is the unit $(n-1)$-dimensional sphere. In the case of volume equality, $M$ is a rigid transformation of $\mathbb{S}^{n-1}$.

Proof. Since $M$ is codimension 1 and closed, its complement has two connected components. One of them is the inward component; call it $A$. We give a lower bound for the volume of $A$. Since the reach of $M$ is 1, the principal curvatures of $M$ in all normal directions are bounded above by 1 [27, Proposition 6.1]. Consider the parallel surface at distance $0<t<1$ in the inward normal direction. Let the shape operator of $M$ be $S$; then the change of volume formula is given by $\det(I-tS)$, and
$$\det(I-tS)\ge(1-t)^{n-1}. \quad(3.3)$$
Therefore if $V$ is the $(n-1)$-dimensional volume of $M$, the $n$-dimensional volume of $A$ is at least
$$\int_{t=0}^1 V(1-t)^{n-1}\,dt=\frac{V}{n}. \quad(3.4)$$
Now we apply the isoperimetric inequality to $A$ with $\partial A=M$. Let $\omega_n$ be the volume of the unit $n$-ball:
$$n(\omega_n)^{\frac{1}{n}}\,\mathrm{vol}_n(A)^{\frac{n-1}{n}}\le\mathrm{vol}_{n-1}(M) \quad(3.5)$$
$$n(\omega_n)^{\frac{1}{n}}\left(\frac{V}{n}\right)^{\frac{n-1}{n}}\le V \quad(3.6)$$
$$\mathrm{vol}(\mathbb{S}^{n-1})=n\,\omega_n\le V. \quad(3.7)$$
In the last line we used the fact that $\mathrm{vol}(\mathbb{S}^{n-1})=\frac{d}{dr}\big|_{r=1}\mathrm{vol}(rB^n)$. The equality case of the isoperimetric inequality gives that $M$ is a rigid motion of $\mathbb{S}^{n-1}$. ■

3.2 Torus in $\mathbb{R}^3$ with Reach 1 and Minimum Area

Here we consider the problem of minimizing the ratio $\frac{V}{\tau^d}$ when $d=2$, $\tau=1$ for a smooth torus in $\mathbb{R}^3$. In this special case there is the topological constraint of having genus 1, and therefore we expect the minimum to be more than the value $4\pi\approx13$ given in section 3.1. Our candidate torus for minimizing the ratio is the torus of revolution with larger radius 2 and smaller radius 1. The reach of this torus is 1 and its surface area is $8\pi^2=(2\pi(2))(2\pi(1))\approx79$. Here we offer two proofs of smaller lower bounds. The first proof is based on Loewner's systolic inequality [6, page 325] and gives a lower bound of $2\sqrt3\pi^2\approx34$. The second proof is based on the isoperimetric inequality and gives a lower bound of $2^{7/3}3^{2/3}\pi^{5/3}\approx70$.
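The constants just quoted are easy to evaluate numerically; the following is a sanity check only, not new mathematics.

```python
import numpy as np

# Evaluating the constants quoted above: candidate torus area 8*pi^2, the
# systolic bound 2*sqrt(3)*pi^2, and the isoperimetric bound
# 2^(7/3) * 3^(2/3) * pi^(5/3).
area_candidate = 8 * np.pi**2                                          # ~78.96
bound_systolic = 2 * np.sqrt(3) * np.pi**2                             # ~34.19
bound_isoperimetric = 2 ** (7 / 3) * 3 ** (2 / 3) * np.pi ** (5 / 3)   # ~70.7
print(bound_systolic, bound_isoperimetric, area_candidate)
assert bound_systolic < bound_isoperimetric < area_candidate
```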
Theorem 3.2.1. (Via systolic inequality) Let $T\subset\mathbb{R}^3$ be a smooth embedded torus with reach 1. Then the surface area of $T$ is at least $2\sqrt3\pi^2$.

Proof. Consider the systole of $T$ (the shortest non-contractible loop). It is necessarily a periodic geodesic [6, page 326]. Since the reach of $T$ is 1, the external acceleration of the systole as a unit speed geodesic is at most 1 in norm. Any closed curve with acceleration bounded by 1 has length at least $2\pi$ [17]. Therefore our theorem follows from Loewner's systolic inequality [6, page 325]:
$$\frac{\sqrt3}{2}\,\mathrm{sys}^2\le A \quad(3.8)$$
$$\frac{\sqrt3}{2}(2\pi)^2\le A \quad(3.9)$$
$$2\sqrt3\pi^2\le A. \quad(3.10)$$ ■

Theorem 3.2.2. (Via isoperimetric inequality) Let $T\subset\mathbb{R}^3$ be a smooth embedded torus with reach 1. Then the surface area of $T$ is at least $2^{7/3}3^{2/3}\pi^{5/3}\approx70$.

Proof. Since a torus is an orientable 2-dimensional surface, there are well-defined outward and inward normal directions. Consider the parallel surface $T_{out}$ defined by moving in the outward direction by $1-\epsilon$. $T_{out}$ is another smooth torus because the reach of $T$ is 1. We get our bound by applying the isoperimetric inequality to this surface. First we show the volume contained inside $T_{out}$ (i.e. the volume of the bounded component of the complement of $T_{out}$) is at least $16\pi^2$. Second we show that the surface area of $T_{out}$ is at most twice the surface area of $T$. Then we apply the isoperimetric inequality.

First we discuss the volume of $T_{out}$. Let $\epsilon\in(0,1)$. Consider the surface $\Sigma_\epsilon$ obtained from $T$ by moving in the inward normal direction by a distance of $1-\epsilon$. Then $\Sigma_\epsilon$ is a smooth surface because the reach of $T$ is 1. Consider the solid torus given by filling the bounded component of the complement of $\Sigma_\epsilon$. This solid torus is a 3-manifold with boundary, and its systole is a periodic geodesic [6, page 326]; call this curve $l_\epsilon$. We claim the reach of $l_\epsilon$ is at least $2-\epsilon$. Consider $l_\epsilon(t)$ as a unit speed geodesic. For a given $t_0$, if $l_\epsilon(t_0)$ is in the interior of the solid bounded by $\Sigma_\epsilon$, then the acceleration of $l_\epsilon(t_0)$ is zero because $\mathbb{R}^3$ is flat. If $l_\epsilon(t_0)$ is on $\Sigma_\epsilon$, the boundary of the solid torus, then the acceleration of $l_\epsilon(t_0)$ is outward pointing, and since the reach of $T$ is 1 and we moved another $1-\epsilon$ inside, we get that the reach of $l_\epsilon$ is at least $2-\epsilon$. By [17, page 50], the length of $l_\epsilon$ is at least $2\pi(2-\epsilon)$. The tube of radius $2-2\epsilon$ around $l_\epsilon$ is contained in the inside of $T_{out}$. The volume of this tube by [22, page 7] is at least $2(2-\epsilon)(2-2\epsilon)^2\pi^2$. Therefore the volume inside of $T_{out}$ is at least $2(2-2\epsilon)^3\pi^2$.

Now we discuss the surface area of $T_{out}$. Let $H$ and $K$ denote the mean and Gaussian curvature of $T$ respectively. By the parallel surface area formula [22, page 8], we have
$$\mathrm{Area}(T_{out})=\mathrm{Area}(T)+(1-\epsilon)\int_T 2H+(1-\epsilon)^2\int_T K. \quad(3.11)$$
By the Gauss-Bonnet theorem $\int_T K=0$. We claim $\int_T 2H\le\mathrm{Area}(T)$. This is because the surface area of the parallel surface in the inward direction, obtained by moving from $T$ inward by distance $t$, is given by $\mathrm{Area}(T)-t\int_T 2H$. Since the reach of $T$ is 1, this area must remain positive up to $t=1$, which implies $\int_T 2H\le\mathrm{Area}(T)$. Returning to (3.11), we get
$$\mathrm{Area}(T_{out})\le(2-\epsilon)\,\mathrm{Area}(T). \quad(3.12)$$
Now we apply the isoperimetric inequality to $T_{out}$:
$$3\,\mathrm{Vol}(T_{out})^{2/3}\left(\frac{4}{3}\pi\right)^{1/3}\le\mathrm{Area}(T_{out}) \quad(3.13)$$
$$3\left(2(2-2\epsilon)^3\pi^2\right)^{2/3}\left(\frac{4}{3}\pi\right)^{1/3}\le(2-\epsilon)\,\mathrm{Area}(T). \quad(3.14)$$
Letting $\epsilon\to0$ and simplifying, we get
$$70\approx2^{7/3}3^{2/3}\pi^{5/3}\le\mathrm{Area}(T). \quad(3.15)$$ ■

3.3 Doing Better than Reach in Covering Estimates

Our earlier covering estimate for a compact manifold was based on reach. We used Gunther's theorem for volume comparison, and the main ingredient in that theorem is an upper bound on sectional curvatures. For a compact manifold of reach $\tau$, $\frac{1}{\tau^2}$ is a global bound on all sectional curvatures. Since this bound is based on reach, it is sensitive to perturbations: a small bump on the manifold could change the value of $\tau$ substantially; see figure 3.1. However, for a covering argument one should be able to find a global estimate based on averaging a local estimate.

Figure 3.1: Reach is a global parameter. A bump on the surface of a manifold can change its reach, and therefore reach is sensitive to local perturbations.

Consider the function $k:M\to\mathbb{R}$,
$$k(p)=\text{max sectional curvature of } M \text{ at } p. \quad(3.16)$$
Since the sectional curvatures at a point are defined on the space of all planes in $T_pM$, the Grassmannian $Gr(2,d)$, and this space is compact, the max exists. Let $\tilde k$, $\underline k$, and $\fint k$ be the max, min and average of $k$ over $M$. In this notation $\fint$ takes precedence, hence "$\fint k+\tilde k$" means "$(\fint k)+\tilde k$". Again since $M$ is compact, all three exist.
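As a concrete illustration of these quantities (our own example, not from the text), take the standard torus of revolution with radii $R=2$ and $r=1$. Its Gaussian curvature is $K(v)=\frac{\cos v}{r(R+r\cos v)}$, so $\tilde k=\frac13$, $\underline k=-1$, and $\fint k=0$ by Gauss-Bonnet.

```python
import numpy as np

# Our example: for the torus of revolution with radii R = 2, r = 1 (reach 1),
# the Gaussian curvature is K(v) = cos(v) / (r*(R + r*cos(v))), so the maximum
# is 1/(r*(R+r)) = 1/3, the minimum is -1/(r*(R-r)) = -1, and the average of
# K over the surface vanishes by Gauss-Bonnet.
R, r = 2.0, 1.0
v = np.linspace(0.0, 2 * np.pi, 100_000, endpoint=False)
K = np.cos(v) / (r * (R + r * np.cos(v)))
weight = r * (R + r * np.cos(v))                  # area element in v
print(K.max(), K.min(), np.sum(K * weight) / np.sum(weight))  # 1/3, -1, ~0
```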
With respect to the covering estimates, the approach with reach is based on $\tilde k\le\frac{1}{\tau^2}$. Here we can show that in the asymptotic regime, as the radius of the covering goes to zero, one can use
$$\frac{2^d\,\tilde k\,\fint k-(\fint k)\,\underline k-(2^d-1)\,\tilde k\,\underline k}{\tilde k+(2^d-1)\fint k-2^d\,\underline k}.$$
When $k$ is not a constant function, this quantity is strictly between $\fint k$ and $\tilde k$, and therefore it is an improvement over $\tilde k$ and $\frac{1}{\tau^2}$. We show this with the calculation below:
$$0<(\tilde k-\fint k)(\tilde k-\underline k) \quad(3.17)$$
$$2^d\tilde k\fint k-(\fint k)\underline k-(2^d-1)\underline k\,\tilde k<(2^d-1)\tilde k\fint k+\tilde k^2-2^d\tilde k\,\underline k \quad(3.18)$$
$$\frac{2^d\tilde k\fint k-(\fint k)\underline k-(2^d-1)\tilde k\,\underline k}{\tilde k+(2^d-1)\fint k-2^d\underline k}<\tilde k, \quad(3.19)$$
and
$$(\fint k-\underline k)(\fint k-\tilde k)<0 \quad(3.20)$$
$$(2^d-1)(\fint k)^2+\tilde k\fint k-2^d\underline k\fint k<2^d\tilde k\fint k-(\fint k)\underline k-(2^d-1)\underline k\,\tilde k \quad(3.21)$$
$$\fint k<\frac{2^d\tilde k\fint k-(\fint k)\underline k-(2^d-1)\tilde k\,\underline k}{\tilde k+(2^d-1)\fint k-2^d\underline k}, \quad(3.22)$$
and therefore we can conclude
$$\fint k<\frac{2^d\tilde k\fint k-(\fint k)\underline k-(2^d-1)\tilde k\,\underline k}{\tilde k+(2^d-1)\fint k-2^d\underline k}<\tilde k. \quad(3.23)$$

The averaged quantity comes from treating each packing ball separately. As usual we use the packing-covering relationship to switch to a packing estimate. For each packing ball, we showed it isolates a geodesic ball of some radius with a minimum volume; this minimum volume gives the desired bound. Consider the supremum of all sectional curvatures only in that geodesic ball; then Gunther's theorem still holds with this sectional curvature. We can average the values of the max sectional curvatures, one value per packing ball, to get an averaged value. Now we show this calculation. First we need a volume estimate lemma.

Lemma 3.3.1. Let $M$ be a simply connected $d$-dimensional manifold of constant sectional curvature $k\in\mathbb{R}$. Consider a geodesic ball of radius $r$ and volume $V_k(r)$. Then for $r$ small enough we have
$$\frac{2\pi^{d/2}}{\Gamma(\frac{d}{2})}\left(\frac{r^d}{d}-\frac{k(d-1)}{6(d+2)}r^{d+2}\right)\le V_k(r). \quad(3.24)$$

Proof. The volume of a ball of radius $r$ in a space form is
$$V_k(r)=\frac{2\pi^{d/2}}{\Gamma(\frac{d}{2})}\int_0^r S_k(x)^{d-1}\,dx$$
where for $k\in\mathbb{R}$
$$S_k(r)=\begin{cases}\frac{1}{\sqrt k}\sin(\sqrt k\,r)&k>0\\ r&k=0\\ \frac{1}{\sqrt{-k}}\sinh(\sqrt{-k}\,r)&k<0.\end{cases} \quad(3.25)$$
For $k>0$
$$\frac{1}{\sqrt k}\sin(\sqrt k\,x)=x-\frac{kx^3}{6}+\frac{k^2x^5}{5!}-\dots \quad(3.26)$$
and for $k<0$
$$\frac{1}{\sqrt{-k}}\sinh(\sqrt{-k}\,x)=x+\frac{(-k)x^3}{6}+\frac{k^2x^5}{5!}+\dots \quad(3.27)$$
Therefore
$$\int_0^r S_k(x)^{d-1}\,dx=\int_0^r\left(x-\frac{kx^3}{6}+\dots\right)^{d-1}dx \quad(3.28)$$
$$=\int_0^r x^{d-1}+\binom{d-1}{1}x^{d-2}\left(\frac{-kx^3}{6}\right)+\dots\,dx \quad(3.29)$$
$$=\frac{r^d}{d}-\frac{k(d-1)}{6(d+2)}r^{d+2}+\dots \quad(3.30)$$
The next term in the expansion is always positive, so for a small enough $r$ we have
$$\frac{r^d}{d}-\frac{k(d-1)}{6(d+2)}r^{d+2}\le\int_0^r S_k(x)^{d-1}\,dx. \quad(3.31)$$ ■

Next we give a lemma about averaging.

Lemma 3.3.2. Let $M$ be a closed orientable smooth Riemannian manifold of dimension $d$ with volume $V$. Let $k:M\to\mathbb{R}$ be the map that gives the maximum sectional curvature at each point. Let $B_i(x_i,r)$, $1\le i\le N$, be disjoint closed geodesic balls of radius $r$ and centers $x_i$. Let $k_i$ be the supremum of all sectional curvatures in $B_i$. Let $k^*=\frac{\sum k_i}{N}$ be the average of the $k_i$. Then for $r$ small enough we have
$$N\le V\bigg/\left[\frac{2\pi^{d/2}}{\Gamma(\frac{d}{2})}\left(\frac{r^d}{d}-\frac{k^*(d-1)}{6(d+2)}r^{d+2}\right)\right]. \quad(3.32)$$

Proof. Let $V_i(r)$ be the volume of ball $i$. Then $\sum_i V_i(r)\le V$. By Gunther's theorem and lemma 3.3.1 we have
$$\frac{2\pi^{d/2}}{\Gamma(\frac{d}{2})}\left(\frac{r^d}{d}-\frac{k_i(d-1)}{6(d+2)}r^{d+2}\right)\le V_i(r). \quad(3.33)$$
Summing over the balls we get
$$\sum_{i=1}^N\frac{2\pi^{d/2}}{\Gamma(\frac{d}{2})}\left(\frac{r^d}{d}-\frac{k_i(d-1)}{6(d+2)}r^{d+2}\right)\le V \quad(3.34)$$
$$N\left[\frac{2\pi^{d/2}}{\Gamma(\frac{d}{2})}\left(\frac{r^d}{d}-\frac{k^*(d-1)}{6(d+2)}r^{d+2}\right)\right]\le V, \quad(3.35)$$
which is our claim. ■

So based on the above lemma, if we find an estimate for $k^*=\frac{\sum_i k_i}{N}$, we can use a sharper bound in the packing estimates compared to the ones given by reach.
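The sandwich (3.23) can also be confirmed numerically. The sketch below is our own check: it draws random triples with $\underline k<\fint k<\tilde k$ and verifies that the averaged quantity stays strictly between $\fint k$ and $\tilde k$.

```python
import numpy as np

# Numerical confirmation (ours) of the sandwich (3.23): for random triples
# with k_min < k_avg < k_max, the averaged quantity stays strictly between
# k_avg and k_max.
rng = np.random.default_rng(2)
d = 2
for _ in range(10_000):
    k_min, k_avg, k_max = np.sort(rng.normal(size=3))
    num = 2**d * k_max * k_avg - k_avg * k_min - (2**d - 1) * k_max * k_min
    den = k_max + (2**d - 1) * k_avg - 2**d * k_min
    q = num / den
    assert k_avg < q < k_max
```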
In the asymptotic regime, as the radius ∫ of packing balls goes to zero, a natural candidates for 𝑘 ∗ is 𝑉1 𝑀 𝑘. But because of varying packing density this is not necessary the case. 1 ∫ 𝑘 deviates from 𝑘 ∗ = Í 𝑘𝑖 Because of varying density as in figure 3.2, it is possible that 𝑉 𝑀 𝑁. However, if 𝐵𝑖 (𝑥𝑖 , 𝑟) is a packing ball with center 𝑥𝑖 and 𝑣 𝑖 is the Voronoi cell corresponding to the 𝐵𝑖 , then 𝐵𝑖 (𝑥𝑖 , 𝑟) ⊂ 𝑣 𝑖 ⊂ 𝐵𝑖 (𝑥𝑖 , 2𝑟) (3.36) This property restricts the deviation of our average ⨏ ⨏ from the continuous average over the manifold. 2𝑑 𝑘˜ 𝑘−( 𝑘)𝑘 −(2𝑑 −1) 𝑘˜ 𝑘 We use this relation to obtain our bound ⨏∼ ∼ . ˜ 𝑘+(2 𝑑 −1) 𝑘−2𝑑 𝑘 ∼ 43 (a) A packing with varying density (b) Veronoi cells associated to a packing (c) Voronoi cell contained in a doubled ball Figure 3.2: Figure-a shows a packing with varying density. Figure-b shows the Veronoi cells associated with a packing. Figure-c shows a Veronoi cell contained by a doubled packing ball. Theorem 3.3.3. Let 𝑀 be a closed smooth Riemannian manifold with volume 𝑉. Let 𝐵𝑖𝑗 (𝑥 𝑖𝑗 , 𝑟), 1 ≤ 𝑗 𝑗 ≤ 𝑁𝑖 be a packing of 𝑀 with geodesic balls of radius 𝑟𝑖 and centers 𝑥𝑖 . Let 𝑝𝑖𝑗 ∈ 𝐵𝑖 be points that are arbitrarily selected from each ball. Let 𝜎𝑖 be a probability measure obtained by placing a mass of 1/𝑁𝑖 weight at 𝑝𝑖𝑗 . Assume 𝑟𝑖 → 0 and 𝜎𝑖 converges weakly to 𝜎. Let 𝑘 : 𝑀 → [𝑘 , 𝑘] ˜ be the map that gives the maximum sectional curvature at each point. ∼ ⨏ Assume 𝑘 is a Lipschitz function. Let 𝑘, ˜ 𝑘 and 𝑘 be the max, min and the average of 𝑘 over 𝑀. ∼ Let 𝑘 𝑖𝑗 := 𝑘 ( 𝑝𝑖𝑗 ). Let 𝜇 and 𝜇𝜎 be the push forward of measure induced by the volume form and 𝜎 on [𝑘 , 𝑘] ˜ via ∼ 𝑘. Assume 𝜇 and 𝜇𝜎 are absolutely continuous with respect to the Lebesgue measure. ⨏ If 𝑘 is not a constant function, then 𝑘 < 𝑘 < 𝑘˜ and ∼ ⨏ ⨏ 2𝑑 𝑘˜ 𝑘−( 𝑘)𝑘 − (2𝑑 − 1) 𝑘˜ 𝑘 Í 𝑁𝑖 𝑖 𝑗=1 𝑘 𝑗 ∼ ∼ lim sup ≤ ⨏ . (3.37) 𝑟 𝑖 →0 𝑁𝑖 𝑘˜ + (2𝑑 − 1) 𝑘 − 2𝑑 𝑘 ∼ 44 Proof. We first fix 𝑖 and work with a single packing. Later we consider 𝑖 → ∞. We partition 𝑀 into Voronoi cells based on their distance to the center of the balls 𝑥 𝑖𝑗 . Define 𝑣 𝑖𝑗 = {𝑝 ∈ 𝑀 |𝑑𝑔 ( 𝑝, 𝑥 𝑖𝑗 ) < 𝑑𝑔 ( 𝑝, 𝑥 𝑖𝑘 ) 𝑗 ≠ 𝑘 } (3.38) where 𝑑𝑔 is the geodesic distance. We assume the measure of the points that are equ-distant to two of 𝑥 𝑖𝑗 is zero and we have 𝑁𝑖 ∑︁ 𝑉= |𝑣 𝑖𝑗 | (3.39) 𝑗=1 Since balls 𝐵𝑖𝑗 are a packing of 𝑀 we have 𝐵𝑖𝑗 (𝑥 𝑖𝑗 , 𝑟𝑖 ) ⊂ 𝑣 𝑖𝑗 ⊂ 𝐵𝑖𝑗 (𝑥 𝑖𝑗 , 2𝑟𝑖 ) (3.40) ˜ we count the number 𝑘 𝑖 ∈ [𝑎, 𝑏] from above and below using the 𝑣 𝑖 and 𝜇. For 𝑎, 𝑏 ∈ [𝑘 , 𝑘], ∼ 𝑗 𝑗 Let 𝑐 be the Lipschitz constant of 𝑘. For an upper bound, if 𝑘 𝑖𝑗 ∈ [𝑎, 𝑏], then ∀𝑥 ∈ 𝑣 𝑖𝑗 , 𝑎 − 4𝑟𝑖 𝑐 ≤ 𝑘 (𝑥) ≤ 𝑏 + 4𝑟𝑖 𝑐. Then ∪ 𝑣 𝑖𝑗 ⊂ {𝑝|𝑎 − 4𝑟𝑖 𝑐 ≤ 𝑘 ( 𝑝) ≤ 𝑏 + 4𝑟𝑖 𝑐} (3.41) 𝑖:𝑘 𝑖𝑗 ∈[𝑎,𝑏] Let 𝑉𝛼 (𝑟) be the volume of geodesic ball of radius 𝑟 in simply connected space of constant sectional curvature 𝛼. By Gunther’s volume comparison theorem and relation ((3.40)), for 𝑟 small enough we have 𝑉𝑏+4𝑟𝑐 (𝑟𝑖 ) ≤ |𝑣 𝑖𝑗 | (3.42) Since the Voronoi cells are disjoint, from equation ((3.41)) 𝜇(𝑎 − 4𝑟𝑖 𝑐, 𝑏 + 4𝑟𝑖 𝑐) (#𝑘 𝑖𝑗 ∈ [𝑎, 𝑏]) ≤ (3.43) 𝑉𝑏+4𝑟𝑖 𝑐 (𝑟) where 𝜇(𝑎 − 4𝑟𝑖 𝑐, 𝑏 + 4𝑟𝑖 𝑐) = |{𝑝 : 𝑎 − 4𝑟𝑖 𝑐 ≤ 𝑘 ( 𝑝) ≤ 𝑏 + 4𝑟𝑖 𝑐}|. For a lower bound, consider all Voronoi cells that intersect the set {𝑎 + 4𝑟𝑖 𝑐 ≤ 𝑘 ( 𝑝) ≤ 𝑏 − 4𝑟𝑖 𝑐}. 𝑏−𝑎 If 𝑟𝑖 < 8𝑐 this set is non-empty. For such Voronoi cells we have ∀𝑥 ∈ 𝑣 𝑖𝑗 , 𝑎 ≤ 𝑘 (𝑥) ≤ 𝑏 since 𝑣 𝑖𝑗 is contained in a ball of radius 2𝑟𝑖 and the 𝑘 𝑖𝑗 from 𝑣 𝑖𝑗 satisfies 𝑘 𝑖𝑗 ∈ [𝑎, 𝑏]. 
Thus {𝑝|𝑎 + 4𝑟𝑖 𝑐 ≤ 𝑘 ( 𝑝) ≤ 𝑏 − 4𝑟𝑖 𝑐} ⊂ ∪ 𝑣 𝑖𝑗 (3.44) 𝑗:𝑘 𝑖𝑗 ∈[𝑎,𝑏] 45 Again let 𝑉𝛼 (𝑟) be the volume of geodesic ball of radius 𝑟 in simply connected space of constant sectional curvature 𝛼. Using the Bishop-Gromov’s volume comparison theorem |𝑣 𝑖𝑗 | ≤ 𝑉𝑎 (2𝑟𝑖 ) (3.45) Since the Voronoi cells are disjoint, from equation ((3.44)) 𝜇(𝑎 + 4𝑟𝑖 𝑐, 𝑏 − 4𝑟𝑖 𝑐) ≤ (#𝑘 𝑖𝑗 ∈ [𝑎, 𝑏]) (3.46) 𝑉𝑎 (2𝑟𝑖 ) Combining ((3.43)) and ((3.46)), we have 𝜇(𝑎 + 4𝑟𝑖 𝑐, 𝑏 − 4𝑟𝑖 𝑐) 𝜇(𝑎 − 4𝑟𝑖 𝑐, 𝑏 + 4𝑟𝑖 𝑐) ≤ (#𝑘 𝑖𝑗 ∈ [𝑎, 𝑏]) ≤ (3.47) 𝑉𝑎 (2𝑟𝑖 ) 𝑉𝑏+4𝑟𝑖 𝑐 (𝑟𝑖 ) To get the proportion of 𝑘 𝑖 ∈ [𝑎, 𝑏] we must divide by 𝑁𝑖 . We divide the ((3.47)) for [𝑎, 𝑏] and [𝑑, 𝑒] ⊂ [𝑘 , 𝑘]˜ to cancel the 𝑁𝑖 and get ∼ 𝑉𝑒+4𝑟𝑖 𝑐 (𝑟𝑖 ) 𝜇(𝑎 + 4𝑟𝑖 𝑐, 𝑏 − 4𝑟𝑖 𝑐) (#𝑘 𝑖𝑗 ∈ [𝑎, 𝑏]) 𝑉𝑑 (2𝑟𝑖 ) 𝜇(𝑎 − 4𝑟𝑖 𝑐, 𝑏 + 4𝑟𝑖 𝑐) ≤ ≤ (3.48) 𝑉𝑎 (2𝑟𝑖 ) 𝜇(𝑑 − 4𝑟𝑖 𝑐, 𝑒 + 4𝑟𝑖 𝑐) (#𝑘 𝑗 ∈ [𝑑, 𝑒]) 𝑉𝑏+4𝑟𝑖 𝑐 (𝑟𝑖 ) 𝜇(𝑑 + 4𝑟𝑖 𝑐, 𝑒 − 4𝑟𝑖 𝑐) 𝑖 Note that since curvature terms are second order in the volume expansion 𝑉𝛼 (2𝑟) lim = 2𝑑 (3.49) 𝑟→0 𝑉𝛽 (𝑟) Letting 𝑟𝑖 → 0 and using ((3.49)) in ((3.48)) we get 1 𝜇(𝑎, 𝑏) (#𝑘 𝑖𝑗 ∈ [𝑎, 𝑏]) 𝑑 𝜇(𝑎, 𝑏) ≤ ≤ 2 (3.50) 2𝑑 𝜇(𝑑, 𝑒) (#𝑘 𝑗 ∈ [𝑑, 𝑒]) 𝑖 𝜇(𝑑, 𝑒) 𝜇′ (𝑘) ˜ Letting Let 𝑓 (𝑎) = 𝜎′ (𝑎) and 𝑔(𝑘) = 𝑉 . Note that 𝑔(𝑘) is a probability density on [𝑘 , 𝑘]. ∼ 𝑏 → 𝑎 and 𝑒 → 𝑑 and using convergence of 𝜎𝑖 → 𝜎, absolute continuity of 𝜎 and 𝜇, 1 𝑔(𝑎) 𝑓 (𝑎) 𝑔(𝑎) ≤ ≤ 2𝑑 (3.51) 2 𝑔(𝑑) 𝑑 𝑓 (𝑑) 𝑔(𝑑) By assumption both 𝑓 (𝑘) and 𝑔(𝑘) are positive and continuous on [𝑘 , 𝑘]. ˜ Therefore there are ∼ constants 𝑐 1 < 1 < 𝑐 2 such that 𝑐 1 𝑔(𝑘) ≤ 𝑓 (𝑘) ≤ 𝑐 2 𝑔(𝑘) (3.52) 46 Suppose 𝑓 (𝑘 1 ) = 𝑐 1 𝑔(𝑘 1 ) and 𝑓 (𝑘 2 ) = 𝑐 2 𝑔(𝑘 2 ) for some 𝑘 1 , 𝑘 2 . Using (3.51) with 𝑎 = 𝑘 2 and 𝑑 = 𝑘 1 we get 1 𝑔(𝑘 2 ) 𝑐 2 𝑔(𝑘 2 ) 𝑑 𝑔(𝑘 2 ) ≤ ≤ 2 (3.53) 2𝑑 𝑔(𝑘 1 ) 𝑐 1 𝑔(𝑘 1 ) 𝑔(𝑘 1 ) which gives 𝑐1 ≤ 𝑐 2 ≤ 2𝑑 𝑐 1 (3.54) 2𝑑 Since 𝑐 1 < 1 < 𝑐 2 𝑐2 𝑔(𝑘) ≤ 𝑓 (𝑘) ≤ 𝑐 2 𝑔(𝑘) (3.55) 2𝑑 Now consider that both 𝑓 (𝑘) and 𝑔(𝑘) are probability densities on [𝑘 , 𝑘]. ˜ Given the linear ∼ ∫ ∫ constraints on 𝑓 (𝑘) given by condition (3.55), and 𝑓 (𝑘) = 𝑔(𝑘) = 1, from the simplex ∫ algorithm, we can conclude that 𝑘 𝑘 𝑓 (𝑘) is maximized when the inequalities in (3.55) are saturated except possibility at one point such as 𝑘 0 to transition between the two sides of (3.55). Therefore consider   𝑐  2𝑑2 𝑔(𝑘) if 𝑘 ≤ 𝑘 0    𝑓 (𝑘) = (3.56)    𝑐 2 𝑔(𝑘)  if 𝑘 0 < 𝑘  ∫ 𝑘0 ∫ 𝑘˜ ∫ 𝑘˜ Let 𝑘 𝑔(𝑘) = 𝑎 and 𝑘0 𝑔(𝑘) = 𝑏 with 𝑎, 𝑏 > 0, 𝑎 + 𝑏 = 1. Then from 𝑘 𝑓 (𝑘) = 1 we get ∼ ∫ ∼ 𝑐2 2𝑑 2 𝑑 𝑎 + 𝑐 2 𝑏 = 1, and 𝑐 2 = 𝑑 𝑎+2 𝑏 . Now we bound 𝑘 𝑘 𝑓 (𝑘) from above in two ways. ∫ ∫ 𝑘0 ∫ 𝑘˜ 𝑐2 𝑘 𝑓 (𝑘) = 𝑑 𝑘𝑔(𝑘) + 𝑐 2 𝑘𝑔(𝑘) (3.57) 𝑘 2 𝑘 𝑘0 ∼ ∫ 𝑘˜ ∫ 𝑘˜ 𝑐2 1 = 𝑑 𝑘𝑔(𝑘) + 𝑐 2 (1 − 𝑑 ) 𝑘𝑔(𝑘) (3.58) 2 𝑘 2 𝑘0 ∼ 𝑐2 ⨏ ≤ 𝑑 𝑘 + 𝑐 2 (1 − 21𝑑 )𝑏 𝑘˜ (3.59) 2 𝑐2 ⨏ = 𝑑 𝑘 + (1 − 2𝑐𝑑2 ) 𝑘˜ A (3.60) 2 47 1 𝑐2 where we use (1 − )𝑐 𝑏 2𝑑 2 = (1 − 2𝑑 ). In another way, ∫ ∫ 𝑘0 ∫ 𝑘˜ 𝑐2 𝑘 𝑓 (𝑘) = 𝑑 𝑘𝑔(𝑘) + 𝑐 2 𝑘𝑔(𝑘) (3.61) 𝑘 2 𝑘 𝑘0 ∼ ∫ 𝑘0 ∫ 𝑘˜ 1 = −𝑐 2 (1 − 𝑑 ) 𝑘𝑔(𝑘) + 𝑐 2 𝑘𝑔(𝑘) (3.62) 2 𝑘 𝑘 ∼ ∼ 1 ⨏ ≤ −𝑐 2 (1 − )𝑎𝑘 + 𝑐 2 𝑘 (3.63) 2𝑑 ∼ ⨏ = −(𝑐 2 − 1)𝑘 + 𝑐 2 𝑘 B (3.64) ∼ ∫ where we used 𝑐 2 (1 − 21𝑑 )𝑎 = 𝑐 2 − 1. Hence 𝑘 𝑘 𝑓 (𝑘) ≤ min{𝐴, 𝐵}. If we allow 𝑐 2 to be a variable then 𝐴 and 𝐵 are lines with positive and negative slopes respectively. Hence their intersection is ∫ an upper bound for 𝑘 𝑘 𝑓 (𝑘). 𝑐 ⨏ 𝑐 ˜ ⨏ 𝑘 + (1 − 2𝑑 )𝑘 = −(𝑐 − 1)𝑘 + 𝑐 𝑘 (3.65) 2𝑑 ∼ 𝑘˜ − 𝑘 𝑐= 1 ⨏ ∼ 1 ˜ (3.66) (1 − 2𝑑 ) 𝑘 + 2𝑑 𝑘 −𝑘 ∼ Substituting back into A or B, we get ⨏ ⨏ ∫ 2𝑑 𝑘˜ 𝑘−( 𝑘)𝑘 − (2𝑑 − 1) 𝑘˜ 𝑘 ∼ ∼ 𝑘 𝑓 (𝑘) ≤ ⨏ (3.67) 𝑘 𝑘˜ + (2𝑑 − 1) 𝑘 − 2𝑑 𝑘 ∼ ■ Corollary 3.3.4. Let 𝑇 ⊂ R3 be a smooth torus. 
Let $B_i$ be packing balls of radius $r$, let $p_i$ be arbitrary points selected from the $B_i$, and let $k(p_i)$ be the sectional curvature at $p_i$. Then
$$\limsup_{r\to0}\frac{\sum_i k(p_i)}{\#\{k_i\}}\le\frac{-3\,\tilde k\,\underline k}{\tilde k-4\underline k}. \quad(3.68)$$
Proof. We use the previous theorem with $d=2$. By the Gauss-Bonnet theorem $\fint k=2-2g=0$, and the claim follows. Note that for a torus in $\mathbb{R}^3$, the max and min of the sectional curvatures are respectively positive and negative; hence our bound is a positive quantity. ■

CHAPTER 4
SUFFICIENT CONDITIONS FOR JL MAPS OF MANIFOLDS

4.1 With sub-Gaussian Matrices (Variation 1)

In this section we improve on the work in [15, theorem 2]. In that work, a JL map is constructed for a compact manifold via Gaussian matrices. The method is based on first finding an over-estimate of the covering numbers of the unit secants of a compact manifold based on its reach and volume, and second using a chaining argument to get a JL map. The chaining method uses a sequence of progressively denser $\epsilon$-nets and represents each point of the manifold as a branch in the tree of $\epsilon$-nets; [35, Chapter 8] provides a detailed explanation of this technique. Here we improve on the previous work in two ways. First, we allow the manifold to have boundary. This generalization allows the JL map to apply to data models where the data terminates at a boundary before it can wrap around and form a closed manifold. Second, we use the covering numbers to calculate the Gaussian width of the unit secants of the manifold via Dudley's inequality, and then we apply the matrix deviation inequality to get JL maps for sub-Gaussian matrices. This calculation streamlines and simplifies the generic chaining calculation of the earlier work. Our main theorem in this section is theorem 4.1.7. First we state the result in [15, theorem 2].

Theorem 4.1.1. [15, theorem 2] Let $M$ be a compact $d$-dimensional Riemannian submanifold of $\mathbb{R}^N$ with reach $\tau$ (footnote: in [15] the concept of condition number is used, which is just the reciprocal of reach) and volume $V_M$. Conveniently assume that
$$\frac{V_M}{\tau^d}\ge\left(\frac{21}{2\sqrt d}\right)^d. \quad(4.1)$$
Let $0<\epsilon\le\frac13$ and $0<\rho<1$. Let $\Phi$ be a random $m\times N$ matrix populated with i.i.d. zero-mean Gaussian random variables of variance $1/m$, with
$$m\ge18\epsilon^{-2}\max\left(24d+2d\log\left(\frac{\sqrt d}{\tau\epsilon^2}\right)+\log(2V_M),\ \log\left(\frac{8}{\rho}\right)\right). \quad(4.2)$$
Then with probability at least $1-\rho$, for every pair of points $x_1,x_2\in M$,
$$(1-\epsilon)\|x_1-x_2\|\le\|\Phi x_1-\Phi x_2\|\le(1+\epsilon)\|x_1-x_2\|. \quad(4.3)$$

Next we state the definitions of sub-Gaussian random variables, vectors and matrices, and then we state the matrix deviation inequality.

Definition 4.1.2. [35, section 2.5] (sub-Gaussian Random Variable) A random variable $X$ is sub-Gaussian if it satisfies
$$\mathbb{E}\exp(X^2/a^2)\le2 \quad(4.4)$$
for some positive constant $a$. The sub-Gaussian norm of $X$, denoted by $\|X\|_{\psi_2}$, is defined as
$$\|X\|_{\psi_2}=\inf\{t>0:\mathbb{E}\exp(X^2/t^2)\le2\}. \quad(4.5)$$
Alternatively, a random variable is called sub-Gaussian if it satisfies
$$\mathbb{P}\{|X|\ge t\}\le2\exp(-ct^2/\|X\|_{\psi_2}^2) \quad(4.6)$$
for all $t\ge0$, where $c>0$ is an absolute constant.

Definition 4.1.3. [35, definition 3.4.1] (sub-Gaussian Random Vector) A random vector $X\in\mathbb{R}^n$ is called sub-Gaussian if the one-dimensional marginals $\langle X,x\rangle$ are sub-Gaussian random variables for all $x\in\mathbb{R}^n$. The sub-Gaussian norm of $X$ is defined as
$$\|X\|_{\psi_2}=\sup_{x\in S^{n-1}}\|\langle X,x\rangle\|_{\psi_2}. \quad(4.7)$$

Definition 4.1.4. [35, definition 3.4.1] (sub-Gaussian Random Matrix) A random $m\times n$ matrix $A$ such that its rows $A_i$ are independent, isotropic and sub-Gaussian random vectors in $\mathbb{R}^n$ is called a sub-Gaussian random matrix.
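Definition 4.1.2 can be made concrete with a small Monte Carlo computation (our illustration; the sample size and bisection depth are arbitrary). For $X\sim N(0,1)$ the exact value is $\sqrt{8/3}\approx1.633$, since $\mathbb{E}\exp(X^2/t^2)=(1-2/t^2)^{-1/2}$ for $t^2>2$.

```python
import numpy as np

# Rough Monte Carlo computation (ours) of the sub-Gaussian norm in definition
# 4.1.2 via bisection on t in E exp(X^2/t^2) <= 2. For X ~ N(0,1) the exact
# value is sqrt(8/3) ~ 1.633.
rng = np.random.default_rng(3)
x = rng.normal(size=500_000)

def mgf2(t):                                     # estimate of E exp(X^2/t^2)
    return np.exp(np.minimum(x**2 / t**2, 50.0)).mean()

lo, hi = 0.5, 10.0                               # mgf2 is decreasing in t
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if mgf2(mid) > 2.0 else (lo, mid)
print(hi, np.sqrt(8.0 / 3.0))                    # estimate vs exact
```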
Theorem 4.1.5. [35, theorem 9.1.1 and 9.1.8] Let $A$ be an $m\times n$ sub-Gaussian random matrix. Let $A_i$ be the rows of $A$ with $K=\max_i\|A_i\|_{\psi_2}$. Let $T\subset\mathbb{R}^n$ with Gaussian width $\omega(T)$. Then there exists an absolute constant $C$ such that for any $u\ge0$ the following statement holds with probability at least $1-2\exp(-u^2)$:
$$\sup_{x\in T}\left|\|Ax\|_2-\sqrt m\,\|x\|_2\right|\le CK^2\left[\omega(T)+u\cdot\mathrm{rad}(T)\right]. \qquad\blacksquare$$

We rewrite the matrix deviation inequality as a lower bound on the number of rows of a sub-Gaussian matrix.

Theorem 4.1.6. Let $T\subset\mathbb{S}^{n-1}\subset\mathbb{R}^n$ with Gaussian width $\omega(T)$. Let $0<\epsilon,\rho<1$. Let $A$ be an $m\times n$ sub-Gaussian random matrix with $K=\max_i\|A_i\|_{\psi_2}$. Then there exists a universal constant $C$ such that if
$$\frac{C^2K^4\left(\omega(T)+\sqrt{\log(\frac{2}{1-\rho})}\right)^2}{\epsilon^2}\le m \quad(4.8)$$
then with probability at least $\rho$ the following expressions hold simultaneously for all $x\in T$:
$$(1-\epsilon)\|x\|\le\left\|\frac{1}{\sqrt m}Ax\right\|\le(1+\epsilon)\|x\|.$$
Proof. We use theorem 4.1.5. Since $T\subset\mathbb{S}^{n-1}$, $\mathrm{rad}(T)=1$. Inverting the probability expression $\rho=1-2\exp(-u^2)$, we obtain $u=\sqrt{\log(\frac{2}{1-\rho})}$. To control the error, we need
$$\frac{CK^2}{\sqrt m}\left[\omega(T)+u\right]\le\epsilon \iff \frac{C^2K^4\left(\omega(T)+\sqrt{\log(\frac{2}{1-\rho})}\right)^2}{\epsilon^2}\le m.$$
This completes the proof. ■

Now we put our matrix deviation inequality and the Gaussian width upper bound together to get a JL map for a compact manifold with boundary.

Theorem 4.1.7. (JL Map of a Compact Submanifold of $\mathbb{R}^n$ with Boundary via sub-Gaussian Random Matrices) Let $M\hookrightarrow\mathbb{R}^n$ be a compact $d$-dimensional submanifold of $\mathbb{R}^n$ with boundary $\partial M$. Let $\tau_M$ be the reach of $M$. Let $\tau_i$ be the reach of the $i$-th connected component of $\partial M$ as a submanifold of $\mathbb{R}^n$. Let $\tau=\inf_{M,i}\{\tau_M,\tau_i\}$. Let $V_M$ be the volume of $M$ and $V_{\partial M}$ be the volume of $\partial M$. Define
$$\alpha=\frac{V_M}{\omega_d}\left(\frac{41}{\tau}\right)^d+\frac{V_{\partial M}}{\omega_{d-1}}\left(\frac{81}{\tau}\right)^{d-1} \quad(4.9)$$
$$c=\left(\frac{\alpha^2}{2}+3^d\alpha\right) \quad(4.10)$$
$$\omega^*=4\sqrt2\sqrt{\log(c)+4d}. \quad(4.11)$$
Let $A$ be an $m\times n$ sub-Gaussian random matrix such that $K=\max_i\|A_i\|_{\psi_2}$. Let $0<\epsilon,\rho<1$. There exists a universal constant $C$ such that if $m$ satisfies
$$\frac{C^2K^4\left(\omega^*+\sqrt{\log(\frac{2}{1-\rho})}\right)^2}{\epsilon^2}\le m \quad(4.12)$$
then with probability at least $\rho$ the following bounds hold simultaneously for all $x_1,x_2\in M$:
$$(1-\epsilon)\|x_1-x_2\|\le\|A(x_1-x_2)\|\le(1+\epsilon)\|x_1-x_2\|. \quad(4.13)$$
Proof. We use theorem 4.1.6. We know that $\omega^*$ is an upper bound for the Gaussian width of $\overline{U(M-M)}$. Since $\overline{U(M-M)}$ is a subset of the unit sphere, we can apply theorem 4.1.6, and this immediately gives the result. ■

Remark 4.1.8. We analyze the dependence of $m$ on $d$ while keeping the other variables fixed. If one puts $m=\frac{C^2K^4(\omega^*+\sqrt{\log(\frac{2}{1-\rho})})^2}{\epsilon^2}$ as the least sufficient value of $m$, then $m$ depends on $d$ with order $O(d\log(d))$. We have $\frac{1}{\omega_d}=\frac{\Gamma(\frac{d}{2}+1)}{\pi^{d/2}}=O(d^d)$, $\alpha=\frac{V_M}{\omega_d}(\frac{41}{\tau})^d+\frac{V_{\partial M}}{\omega_{d-1}}(\frac{81}{\tau})^{d-1}=O(d^d)$, $c=(\frac{\alpha^2}{2}+3^d\alpha)=O(d^{2d})$, and $\omega^*=4\sqrt2\sqrt{\log(c)+4d}=O(\sqrt{d\log(d)})$. Since $m=\frac{C^2K^4(\omega^*+\sqrt{\log(\frac{2}{1-\rho})})^2}{\epsilon^2}$, then $m=O(d\log(d))$.

Remark 4.1.9. Comparing our result to the work in [15, theorem 2], we have removed the mild geometric condition on reach, $\frac{V}{\tau^d}\ge(\frac{21}{2\sqrt d})^d$, and we can accommodate the presence of a boundary. However the dependence of the final dimension, $m$, on the dimension of the manifold, $d$, remains $O(d\log(d))$.

4.2 With SORS Matrices

In this section we improve the work in [29], where a sufficient condition is given for finding JL maps via SORS matrices in terms of the Gaussian width of the target set. The method is based on the Multi-Resolution Restricted Isometry Property (MRIP). Separately, in the work of [10], an improvement in RIP estimates for SORS matrices is presented.
Here we combine the improved RIP estimate with the MRIP framework and obtain an improved JL map for compact manifolds based on the Gaussian width of their unit secants. Theorem 4.2.16 is our main theorem in this section. We start with the definition of SORS matrices and some properties. Definition 4.2.1. (SORS Matrix) Let 𝑀 denote a R𝑛×𝑛 orthogonal matrix (or C𝑛×𝑛 where some authors use unitary matrices) obeying 𝐾 𝑀∗𝑀 = 𝐼 and max |𝑀𝑖, 𝑗 | ≤ √ (4.14) 𝑖, 𝑗 𝑛 Let 𝐻 ∈ R𝑚×𝑛 be a random matrix created from selecting rows of 𝑀 in an i.i.d fashion. Let 𝐷 ∈ R𝑛×𝑛 be a random diagonal matrix with diagonal entries of ±1 of equal probability. Then √ 𝐴 = 𝑛𝐻𝐷 is a Subsampled Orthogonal with Random Sign (SORS) matrix with constant 𝐾. Note that 𝐾 ≥ 1 since 1 = 𝑖 𝑀𝑖2𝑗 ≤ 𝑛( √𝐾𝑛 ) 2 = 𝐾 2 . Í Definition 4.2.2. (RIP)[18, Prop 6.1] The sth restricted isometry constant 𝜖 𝑠 of a matrix 𝐴 ∈ R𝑚×𝑁 is the smallest 𝜖 ≥ 0 such that for all 𝑠-sparse 𝑥 ∈ R𝑁 (1 − 𝜖)||𝑥|| 2 ≤ || 𝐴𝑥|| 2 ≤ (1 + 𝜖)||𝑥|| 2 . Proposition 4.2.3. [18, Prop 6.6] For matrix 𝐴, let 𝜖 𝑠 be the 𝑠th restricted isometry constant of 𝐴. Then for integers 1 ≤ 𝑠 ≤ 𝑡 𝑡−𝑑 𝑑 𝜖𝑡 ≤ 𝜖2𝑠 + 𝜖 𝑠 , 𝑑 = gcd(𝑠, 𝑡). 𝑠 𝑠 53 In particular since 𝜖 𝑠 ≤ 𝜖2𝑠 𝑡 𝜖𝑡 ≤ 𝜖2𝑠 . 𝑠 Notation 4.2.4. For a matrix 𝐴, if 𝜖 𝑠 ≤ 𝜖, then we say 𝐴 satisfies the RIP of order (𝑠, 𝜖). Proposition 4.2.5. Let 𝑠 ∈ N, and 1 ≤ 𝑘 be a real number. Then 𝜖 𝑠 ≤ 𝑘𝜖 (2 ⌈ 𝑠 ⌉) . 𝑘 Proof. From proposition 4.2.3, or [18, Prop 6.6], for 1 ≤ 𝑠 ≤ 𝑡, we have 𝜖𝑡 ≤ 𝑠𝑡 𝜖2𝑠 . 𝑠 Since 1 ≤ 𝑘 and 𝑠 is an integer, 1 ≤ ⌈ 𝑘𝑠 ⌉ ≤ 𝑠, and hence 𝜖 𝑠 ≤ 𝜖 (2 ⌈ 𝑠 ⌉) ≤ 𝑘𝜖 (2 ⌈ 𝑠 ⌉) . ⌈ 𝑘𝑠 ⌉ 𝑘 𝑘 Definition 4.2.6. [29, definition 2.1](Extended Restricted Isometry Property (ERIP)) A matrix 𝐴 ∈ R𝑚×𝑁 satisfies the ERIP of order (𝑠, 𝜖) if every 𝑥 ∈ R𝑁 with ||𝑥|| 0 = 𝑠 satisfies || 𝐴𝑥|| 2 − ||𝑥|| 2 ≤ max{𝜖, 𝜖 2 }||𝑥|| 2 Remark 4.2.7. The above definition differs from the RIP when 1 < 𝜖. Definition 4.2.8. [29, definition 2.2] (Multiresolution Restricted Isometry Property (MRIP)) A matrix 𝐴 ∈ R𝑚×𝑁 satisfies the MRIP with (𝑠, 𝜖) if it possesses the extended RIP for (2𝑙 𝑠, 2𝑙/2 𝜖) for 0 ≤ 𝑙 ≤ ⌊log2 ( 𝑁𝑠 )⌋. Next we state an existing RIP estimate for SORS matrices from [18], followed by an improvement from [10]. The improvement has one fewer log factor. Theorem 4.2.9. (RIP for SORS matrices) [18, theorem 12.31]Let 𝐴 ∈ C𝑚×𝑁 be a SORS matrix with constant 𝐾 ≥ 1. For 𝜖 ∈ (0, 1) if 𝑚 ≥ 𝐶𝐾 2 𝜖 −2 𝑠 log4 (𝑁) (4.15) 3 then with probability at least 1 − 𝑁 − log (𝑁) the restricted isometry constant of √1 𝐴 for sparsity 𝑠, 𝑚 i.e 𝜖 𝑠 , satisfies 𝜖 𝑠 ≤ 𝜖. The constant 𝐶 > 0 is universal. Now we present the improvement from [10]. 54 Theorem 4.2.10. [10, Thm 1.1] There exist absolute constants 𝜅, 𝑐 0 , 𝑐 1 > 0 such that the following holds. Let 𝑋1 , ..., 𝑋𝑚 be independent copies of a random vector 𝑋 ∈ C𝑁 with bounded coordinates, i.e. max1≤𝑖≤𝑁 |⟨𝑋, e𝑖 ⟩| ≤ 𝐾 for some 𝐾 > 0, where {𝑒𝑖 }𝑖=1 𝑁 is the standard basis of C𝑁 . Let √ 𝑇 ⊆ {𝑥 ∈ C𝑁 : ∥𝑥∥ 1 ≤ 𝑠}, 𝜖 ∈ (0, 𝜅), and assume that   𝑚 ≥ 𝑐 0 𝐾 2 𝜖 −2 𝑠 log(𝑒𝑁) log2 𝑠𝐾 2 /𝜖 . Then with probability exceeding 1 − 2 exp −𝜖 2 𝑚/(𝑠𝐾 2 ) ,  𝑚 ! 1 ∑︁ sup |⟨𝑦, 𝑋𝑖 ⟩| 2 − E|⟨𝑦, 𝑋⟩| 2 ≤ 𝑐 1 𝜖 1 + sup E|⟨𝑦, 𝑋⟩| 2 . 𝑦∈𝑇 𝑚 𝑖=1 𝑦∈𝑇 Approximate values of constants are 𝜅 ≈ 0.306, 𝑐 0 ≈ 316792 and for 𝑐 1 we have 𝑐 1 = 492. Corollary 4.2.11. (Adapted from [10, Thm 1.1]) There exists absolute constants, 𝑎 0 , 𝑎 1 , 𝑎 2 > 0 such that the following holds for 𝜖 ∈ (0, 𝑎 2 ]. 
Assume 𝐴 is a 𝑚 × 𝑁 SORS matrix with constant 𝐾 such that 2   2 𝑠 2 𝑎 1 𝑠𝐾 𝑚 ≥ 𝑎 0 𝐾 2 log(𝑒𝑁) log , 𝜖 𝜖 then, with probability at least 1 − 2 exp(−𝜖 2 𝑚/(𝑎 21 𝑠𝐾 2 )), for all 𝑠-sparse vectors 𝑥 ∈ R𝑛 we have 2 1 2 (1 − 𝜖)∥𝑥∥ ≤ √ 𝐴𝑥 ≤ (1 + 𝜖)∥𝑥∥ 2 . 𝑚 C Proof. Using theorem 4.2.10, we consider unit length vectors 𝑇 := {𝑥 ∈ 𝑁 with ∥𝑥∥ 2 = 1 and √ C ∥𝑥∥ 1 ≤ 𝑠}. Since 𝐴 is a SORS matrix, there is a unitary 𝑁×𝑁 matrix 𝑈 where 𝐴 is sampled from √ √ 𝑁𝑈. Let 𝑋 be a random vector uniformly distributed in the rows of 𝑁𝑈. Then we can apply 2 theorem 4.2.10 to 𝑋. We have E|⟨𝑥, 𝑋⟩| 2 = ∥𝑥∥ 22 = 1 and √1𝑚 𝐴𝑥 = 𝑚1 𝑖=1 |⟨𝑥, 𝑋𝑖 ⟩| 2 . Thus Í𝑚 2 2 1 sup √ 𝐴𝑥 − ∥𝑥∥ 22 ≤ 2𝑐 1 𝜖 . 𝑥∈𝑇 𝑚 2 Changing constants to account for the extra 2𝑐 1 factor accompanying the 𝜖 above gives the stated bounds on the probability and 𝑚. The new constants and their approximate values are 𝑎 0 = 𝑐 0 (2𝑐 1 ) 2 ≈ 3 × 1011 , 𝑎 1 = 2𝑐 1 = 984, 𝑎 2 = 2𝑐 1 𝜅 ≈ 301. ■ 55 In the next corollary, we remove the restriction on the size of 𝜖 in corollary 4.2.11. We also introduce a parameter 𝜌 for more explicit control on the probability of success. Using 𝜌 we relate the number of rows, 𝑚, to the success probability. Corollary 4.2.12. Let 𝐴 be a 𝑚 × 𝑁 SORS matrix with constant 𝐾. Let 0 < 𝑠, 𝜖, 𝜌, with 𝑠 an integer and 𝜖, 𝜌 real numbers. There exists absolute constants, 𝑎 0 , 𝑎 1 , 𝑎 2 > 0 such that the following holds. If 𝜖 ≤ 𝑎 2 , and 2 𝑠 2 𝑎 1 𝑠𝐾 𝑚 ≥ 𝑎0 𝐾 2 log(𝑒𝑁) log ( )(1 + 𝜌) (4.16) 𝜖2 𝜖   2 𝑎 1 𝑠𝐾 2 then with probability at least 1 − 2 exp −𝑎 𝑎2 0 log(𝑒𝑁) log ( 𝜖 )(1 + 𝜌) , for all 𝑠-sparse vectors 1 𝑥 ∈ R𝑛 we have 1 (1 − 𝜖)||𝑥|| 2 ≤ || √ 𝐴𝑥|| 2 ≤ (1 + 𝜖)||𝑥|| 2 . (4.17) 𝑚 If 𝜖 ≥ 𝑎 2 and 2 2⌈ 𝑎𝜖2 𝑠 ⌉ 2 𝑎 1 𝐾 2 2⌈ 𝑎𝜖2 𝑠 ⌉ 𝑚 ≥ 𝑎0 𝐾 log(𝑒𝑁) log ( )(1 + 𝜌) (4.18) 𝑎 22 𝑎2  𝑎2 𝑠  2 −𝑎 0 2 𝑎 1 2⌈ 𝜖 ⌉𝐾 then with probability at least 1 − 2 exp 𝑎2 log(𝑒𝑁) log ( 𝑎2 )(1 + 𝜌) , for all 𝑠-sparse 1 vectors 𝑥 ∈ R𝑛 we have 1 (1 − 𝜖)||𝑥|| 2 ≤ || √ 𝐴𝑥|| 2 ≤ (1 + 𝜖)||𝑥|| 2 . 𝑚 Proof. The case of 𝜖 ≤ 𝑎 2 is the same as before, with a simplified expression for probability. In the simplification we plug in the expression for 𝑚 into the probability bound and simplify. For the case of 𝜖 > 𝑎 2 , in the formula for 𝜖 ≤ 𝑎 2 we put 𝜖 = 𝑎 2 and 𝑠 = 2⌈ 𝑎𝜖2 𝑠 ⌉. By proposition 4.2.5, from RIP of order (2⌈ 𝑎𝜖2 𝑠 ⌉, 𝑎 2 ), putting 𝑘 = 𝜖 𝑎2 , we can get RIP of order (𝑠, 𝜖). In this process we need 2 2⌈ 𝑎𝜖2 𝑠 ⌉ 2 𝑎 1 𝐾 2 2⌈ 𝑎𝜖2 𝑠 ⌉ 𝑚 ≥ 𝑎0 𝐾 log(𝑒𝑁) log ( )(1 + 𝜌). 𝑎 22 𝑎2 Substituting the value of 𝑚 in the probability bound, we get the claimed expression. ■ In the next corollary we combine the two regimes of 𝜖 into one expression. 56 Corollary 4.2.13. There exist constants 𝑎 0 , 𝑎 1 , 𝑎 2 such that for a SORS matrix 𝐴 of size 𝑚 × 𝑁 with constant 𝐾, and positive integer 𝑠 and positive reals 𝜖 and 𝜌 the following holds. If   2 1 2 2 1 2 2 𝑚 ≥ 𝑎 0 𝐾 ( 2 + 2 )𝑠 log(𝑒𝑁) log ( + )𝑎 1 𝑠𝐾 (1 + 𝜌) 𝜖 𝑎2 𝜖 𝑎2 then the matrix √1 𝐴 satisfies the RIP condition for (𝑠, 𝜖) with probability at least 𝑚 ! −𝑎 0 𝑎 1 1 − 2 exp log(𝑒𝑁) log2 ( )(1 + 𝜌) . 𝑎 21 𝑎2 Proof. We combine the two bounds (4.16) and (4.18) for 𝑚 from the cases of 𝜖 ≤ 𝑎 2 and 𝜖 > 𝑎 2 into a single bound. Similarly we combine the probability bounds into one expres- sions. This final expression is independent of 𝑠 and 𝜖 and it relates to 𝑚 through the variable 𝜌. 𝑎2 𝑆 2⌈ 𝜖 ⌉ 2𝑆 For the 𝑚 bound, when 𝜖 ≥ 𝑎 2 we used 𝑎 22 ≤ 𝑎 22 . For the probability bounds first note that 𝑎 1 𝑠𝐾 2 𝑎1 1 − exp(−𝑥) is an increasing function. Then when 𝜖 ≤ 𝑎 2 we have 𝜖 ≥ 𝑎2 , since 𝑠, 𝐾 ≥ 1. 𝑎 𝑠 𝑎 1 𝐾 2 2⌈ 𝜖2 ⌉ 𝑎1 When 𝜖 ≥ 𝑎 2 , then 𝑎2 ≥ 𝑎2 . 
■ The focus of the next lemma is transitioning from RIP bounds to their multi-resolution (MRIP) counterparts. Proposition 4.2.14. Let 𝐴 be a 𝑚 × 𝑁 SORS matrix with constant 𝐾. Let 𝑠 be a positive integer, 𝜖 < 1 and 𝜌 be positive reals. There exists absolute constants, 𝑎 0 , 𝑎 1 , 𝑎 2 > 0 such that if ! 1 1 1 1 1 1 𝑚 ≥ 2𝑎 0 𝐾 2 ( 2 + 2 + 3 )𝑠 log(𝑒𝑁) log2 2𝑎 1 𝑠𝐾 2 ( 2 + + 2 ) (1 + 𝜌) (4.19) 𝜖 𝑎2 𝑎2 𝜖 𝑎2 𝑎2 then √1 𝐴 satisfies the MRIP (𝑠, 𝜖) criteria with probability at least 𝑚 ! 𝑎0 𝑎 1 1 − 2 exp log(log(𝑁)) − 2 log(𝑒𝑁) log2 ( )(1 + 𝜌) . 𝑎1 𝑎2 Proof. The probability expression is simpler to obtain so we start with it. We take the expression of probability from corollary 4.2.13, and apply it log 𝑁 times. In the definition of MRIP, given a 𝑖 sparsity level 𝑠 and distortion 𝜖, we need to get ERIP conditions of the form (2𝑖 𝑠, 2 2 𝜖) or (2𝑖 𝑠, 2𝑖 𝜖 2 ) for 𝑖 starting from 0 and continuing until 2𝑖 𝑠 is larger than 𝑁. Therefore there are at most log 𝑁 steps. Since the probability expression in corollary 4.2.13 is independent of 𝑠 and 𝜖, we can use 57 the same expression for different steps. Combining the failure probabilities by the union bound we arrive at the expression ! 𝑎0 𝑎 1 1 − 2 exp log(log(𝑁)) − 2 log(𝑒𝑁) log2 ( )(1 + 𝜌) . (4.20) 𝑎1 𝑎 2 Now with respect to 𝑚, we need to show that the given bound for 𝑚, (4.19), is large enough for the different ERIP conditions required in the MRIP. Here we find it more convenient to return to corollary 4.2.12. First we check the condition for the regime 𝜖 ≤ 𝑎 2 , over the different 𝑖 values 𝑖 𝑖 (𝑠, 𝜖) → (2𝑖 𝑠, 2 2 𝜖) when 2 2 𝜖 ≤ 1. 𝑠 2 2 𝑎 1 𝑠𝐾 𝑚 ≥ 𝑎0 𝐾 2 log(𝑒𝑁) log ( )(1 + 𝜌) → (4.21) 𝜖2 𝜖 2𝑖 𝑠 𝑎 1 2𝑖 𝑠𝐾 2 𝑚 ≥ 𝑎0 𝐾 2 𝑖 log(𝑒𝑁) log2 ( 𝑖 )(1 + 𝜌) (4.22) (2 2 𝜖) 2 (2 2 𝜖) 𝑠 𝑖 𝑎 1 𝑠𝐾 2 𝑚 ≥ 𝑎 0 𝐾 2 2 log(𝑒𝑁) log2 (2 2 )(1 + 𝜌) (4.23) 𝜖 𝜖 𝑖 The transition from (2𝑖 𝑠, 2 2 𝜖) to (2𝑖 𝑠, 2𝑖 𝜖 2 ) happens when for 𝜖 < 1, 𝑖 𝑖 1 22 𝜖 = 1 → 22 = (4.24) 𝜖 So we get the bound for 𝑚 as 𝑠 2 2 𝑎 1 𝑠𝐾 𝑚 ≥ 𝑎0 𝐾 2 log(𝑒𝑁) log ( )(1 + 𝜌). (4.25) 𝜖2 𝜖2 Now we plug-in (2𝑖 𝑠, 2𝑖 𝜖 2 ) for the regime when 2𝑖 𝜖 2 > 1 to get 2𝑖 𝑠 2 2 𝑎 1 2 𝑠𝐾 𝑖 𝑚 ≥ 𝑎0 𝐾 2 log(𝑒𝑁) log ( )(1 + 𝜌) (4.26) (2𝑖 𝜖 2 ) 2 (2𝑖 𝜖) 𝑠 𝑎 1 𝑠𝐾 2 ≥ 𝑎 0 𝐾 2 𝑖 4 log(𝑒𝑁) log2 ( )(1 + 𝜌) (4.27) 2𝜖 𝜖 and this bound for 𝑚 get smaller as 𝑖 grows so we can just work with the earlier one in (4.25). 𝑖 Now we work with the case for 𝜖 > 𝑎 2 . Again we plug in (2𝑖 𝑠, 2 2 𝜖) or (2𝑖 𝑠, 2𝑖 𝜖 2 ), and look for a 58 𝑖 combined upper bound. First substitute (2𝑖 𝑠, 2 2 𝜖) into (4.18) 2 2⌈ 𝑎𝜖2 𝑠 ⌉ 2 𝑎 1 𝐾 2 2⌈ 𝑎𝜖2 𝑠 ⌉ 𝑚 ≥ 𝑎0 𝐾 log(𝑒𝑁) log ( )(1 + 𝜌) → (4.28) 𝑎 22 𝑎2 𝑖 𝑖 2⌈ 𝑎2 𝑖2 𝑠 ⌉ 𝑎 1 𝐾 2 2⌈ 𝑎2 𝑖2 𝑠 ⌉ 22 𝜖 22 𝜖 𝑚 ≥ 𝑎0 𝐾 2 log(𝑒𝑁) log2 ( )(1 + 𝜌) (4.29) 𝑎 22 𝑎2 𝑖 𝑖 2⌈ 𝑎2 2𝜖 𝑠 ⌉ 𝑎 1 𝐾 2 2⌈ 𝑎2 2𝜖 𝑠 ⌉ 2 2 2 𝑚 ≥ 𝑎0 𝐾 log(𝑒𝑁) log2 ( )(1 + 𝜌) (4.30) 𝑎 22 𝑎2 𝑖 1 𝑖 Similar to above 2 2 is at most 𝜖 when 2 2 𝜖 ≤ 1 so we arrive at 2 2⌈ 𝑎𝜖22𝑠 ⌉ 2 𝑎 1 𝐾 2 2⌈ 𝑎𝜖22𝑠 ⌉ 𝑚 ≥ 𝑎0 𝐾 log(𝑒𝑁) log ( )(1 + 𝜌) 𝑎 22 𝑎2 Next we try plugging in (2𝑖 𝑠, 2𝑖 𝜖 2 ) and we arrive at 𝑖 𝑖 2⌈ (2𝑎2𝑖 𝜖22 )𝑠2 ⌉ 𝑎 1 𝐾 2 2⌈ (2𝑎2𝑖 𝜖22 )𝑠2 ⌉ 𝑚 ≥ 𝑎0 𝐾 2 log(𝑒𝑁) log2 ( )(1 + 𝜌) (4.31) 𝑎 22 𝑎2 𝑎2 𝑠 2⌈ 𝜖 2 (2 𝑖 𝜖 2) ⌉ 𝑎 1 𝐾 2 2⌈ 𝜖 2 (2 𝑎2 𝑠 𝑖 𝜖 2) ⌉ 2 2 𝑚≥ 𝑎0 𝐾 log(𝑒𝑁) log ( )(1 + 𝜌). (4.32) 𝑎 22 𝑎2 We note that in this case by assumption 2𝑖 𝜖 2 ≥ 1 so we can drop it and get a stricter lower bound for 𝑚 as 2 2⌈ 𝑎𝜖22𝑠 ⌉ 2 𝑎 1 𝐾 2 2⌈ 𝑎𝜖22𝑠 ⌉ 𝑚 ≥ 𝑎0 𝐾 log(𝑒𝑁) log ( )(1 + 𝜌). (4.33) 𝑎 22 𝑎2 So overall for 𝜖 ≤ 𝑎 2 and 𝜖 > 𝑎 2 we respectively need 𝑠 2 2 𝑎 1 𝑠𝐾 𝑚 ≥ 𝑎0 𝐾 2 log(𝑒𝑁) log ( )(1 + 𝜌) 𝜖2 𝜖2 and 2 2⌈ 𝑎𝜖22𝑠 ⌉ 2 𝑎 1 𝐾 2 2⌈ 𝑎𝜖22𝑠 ⌉ 𝑚 ≥ 𝑎0 𝐾 log(𝑒𝑁) log ( )(1 + 𝜌). 
(4.34) 𝑎 22 𝑎2 We can combine them both into one expression with as claimed in the proposition. ! 2 1 1 1 2 2 1 1 1 𝑚 ≥ 2𝑎 0 𝐾 ( 2 + 2 + 3 )𝑠 log(𝑒𝑁) log 2𝑎 1 𝑠𝐾 ( 2 + + ) (1 + 𝜌). (4.35) 𝜖 𝑎2 𝑎2 𝜖 𝑎 2 𝑎 22 59 𝑎2 𝑠 1 ⌈ ⌉ where for 𝜖 > 𝑎 2 , we used ( 𝜖12 + 𝑎 22 + 1 𝑎 32 )𝑠 ≥ 𝑠 𝑎 32 + 1 𝑎 22 ≥ 𝜖2 𝑎22 . ■ Having established the MRIP estimate for SORS matrices, we use the results of [29] to get a JL map for compact manifolds. First we state the result in [29]. Theorem 4.2.15. Let 𝑇 ⊂ R𝑁 and suppose the matrix 𝐻 ∈ R𝑚×𝑁 obeys the multiresolution RIP with sparsity and distortion levels 𝜖 rad(𝑇) 𝑠 = 150(1 + 𝜂), 𝛿= (4.36) 𝐶 max(rad(𝑇), 𝜔(𝑇)) with 𝐶 > 0 an absolute constant. Then, for a diagonal matrix 𝐷 with an i.i.d random sign pattern on the diagonal, the matrix 𝐴 = 𝐻𝐷 obeys sup || 𝐴𝑥|| 2 − ||𝑥|| 2 ≤ max(𝜖, 𝜖 2 ).rad(𝑇) 2 (4.37) 𝑥∈𝑇 with probability at least 1 − exp(−𝜂). Here rad(𝑇) = sup𝑣∈𝑇 ||𝑣|| is the maximum of the Euclidean norm of points inside 𝑇. ■ Now we use our proposition 4.2.14 with theorem 4.2.15. Theorem 4.2.16. Let 𝑀 ↩→ R𝑁 be a compact 𝑑-dimensional submanifold of R𝑁 with boundary 𝜕 𝑀. Let 𝜏𝑀 be the reach of 𝑀. Let 𝜏𝑖 be the reach of the 𝑖-th connected component of 𝜕 𝑀 as a submanifold of R𝑁 . Let 𝜏 = inf 𝑀,𝑖 {𝜏𝑀 , 𝜏𝑖 }. Let 𝑉𝑀 be the volume of 𝑀 and 𝑉𝜕 𝑀 be the volume of 𝜕 𝑀. Let 𝑈 = 𝑈 (𝑀 − 𝑀) = { || 𝑝−𝑞 𝑝−𝑞|| | 𝑝 ≠ 𝑞, 𝑝, 𝑞 ∈ 𝑀 } be the set of unit secants of 𝑀, and let 𝑈 be its closure. Let 𝜔(𝑈) be the Gaussian width of 𝑈. Define 𝑉𝑀 41 𝑑 𝑉𝜕 𝑀 81 𝑑−1 𝛼= ( ) + ( ) (4.38) 𝜔𝑑 𝜏 𝜔 𝑑−1 𝜏 𝛼2 𝑐 = ( + 3𝑑 𝛼) (4.39) 2 √ √︁ 𝑤 ∗ = 4 2 log(𝑐) + 4𝑑 (4.40) Let 𝜌, 𝜂 and 𝜖 be positive reals. Let 0 < 𝜖 < 1. Let 𝐴 be a SORS matrix of size 𝑚 × 𝑁 with constant 𝐾. 60 There exist absolute constants 𝑎 0 , 𝑎 1 , 𝑎 2 and 𝐶 such that if ! (𝐶𝑤 ∗2) 1 1 (𝐶𝑤 ∗2) 1 1 𝑚 ≥ 𝑎0 𝐾 2 2 + 2 + 3 )(1 + 𝜂) log(𝑒𝑁) log2 (1 + 𝜂)𝑎 1 𝐾 2 + + ) (1 + 𝜌) 𝜖 𝑎2 𝑎2 𝜖2 𝑎 2 𝑎 22 (4.41) then for all 𝑥, 𝑦 ∈ 𝑀 we have ∥ 𝐴(𝑥 − 𝑦)∥ 2 − ∥𝑥 − 𝑦∥ 2 ≤ 𝜖 ||𝑥 − 𝑦|| 2 (4.42) with probability ! 𝑎0 𝑎1 1 − exp(−𝜂) − 2 exp log(log(𝑁)) − 2 log(𝑒𝑁) log2 ( )(1 + 𝜌) . (4.43) 𝑎1 𝑎2 Proof. We use proposition 4.2.14 with theorem 4.2.15. We consider 𝑈 (𝑀 − 𝑀) ⊂ 𝑆 𝑁−1 . We have 𝑤 ∗ as an upper bound for the Gaussian width of 𝑈 (𝑀 − 𝑀). Therefore to get an 𝜖 JL map for 𝑈 (𝑀 − 𝑀) we need to establish √1 𝐴 has the MRIP property for (𝑠, 𝛿) where 𝑠 determines the 𝑚 𝜖 rad(𝑈 (𝑀−𝑀)) probability and 𝛿 = 𝐶 max(rad(𝑈 (𝑀−𝑀)),𝜔(𝑈 (𝑀−𝑀))) . We have rad(𝑈 (𝑀 − 𝑀)) = 1, and 𝑤 ∗ > 1, and 𝜖 we get 𝛿 = 𝐶𝑤 ∗ . To get such an MRIP we need ! (𝐶𝑤∗) 2 1 1 (𝐶𝑤∗) 2 1 1 𝑚 ≥ 2𝑎 0 𝐾 2 ( + 2 + 3 )150(1 + 𝜂) log(𝑒𝑁) log2 2𝑎 1 (150)(1 + 𝜂)𝐾 2 ( + + ) (1 + 𝜌). 𝜖2 𝑎2 𝑎2 𝜖2 𝑎 2 𝑎 22 (4.44) and the success probability is ! 𝑎0 𝑎 1 1 − 2 exp log(log(𝑁)) − 2 log(𝑒𝑁) log2 ( )(1 + 𝜌) . (4.45) 𝑎1 𝑎2 Going from MRIP to an 𝜖-JL map has a failure probability of its own as exp(−𝜂). Combining the failure probabilities we arrive at ! 𝑎0 𝑎 1 1 − exp(−𝜂) − 2 exp log(log(𝑁)) − 2 log(𝑒𝑁) log2 ( )(1 + 𝜌) . (4.46) 𝑎1 𝑎2 ■ 61 CHAPTER 5 NECESSARY CONDITIONS FOR JL MAPS OF MANIFOLDS (VARIATION 2) In this section we give a necessary condition for existence of JL maps for compact manifolds. Earlier work of Alon in [5, Theorem 9.3.] provides a necessary condition for a JL map of finite points. Our work is a partial generalization of that bound to manifolds. Our method is based on the Sudokov inequality [35, Theorem 7.4.1] which bounds the Gaussian width from below using covering numbers. 
Complementing with the Sudakov-Fernique comparison theorem [35, Def 7.2.11], we control the reduction of the Gaussian width under the JL map. The image of the manifold under the map must fit in the target space, and comparing the Gaussian width of the unit sphere of the target space with the Gaussian width of the manifold gives the desired necessary condition. Our main theorem in this section is 5.0.5. After that we consider a reduction from manifolds to finite points, and compare our work to the lower bound in [5, Theorem 9.3.]. We start by reviewing the definitions and giving estimates for the Gaussian width.

Definition 5.0.1. [35, Def 7.5.1] Let $g$ be a standard Gaussian random variable in $\mathbb{R}^n$. Define the Gaussian width of $T\subset\mathbb{R}^n$, $\omega(T)$, as follows:
$$\omega(T)=\mathbb{E}\sup_{x\in T}\langle g,x\rangle.$$

Proposition 5.0.2. Let $L:\mathbb{R}^n\to\mathbb{R}^m$ be a linear map and $T\subset\mathbb{R}^n$, with $a$ and $b$ positive reals such that for all $x,y\in T$,
$$a\|x-y\|\le\|L(x-y)\|\le b\|x-y\|.$$
Then $a\,\omega(T)\le\omega(L(T))\le b\,\omega(T)$.

Proof. We use the Sudakov-Fernique comparison theorem [35, Def 7.2.11]. For $t\in T$, define the Gaussian processes $A_t=a\langle g_n,t\rangle$, $B_t=\langle g_m,Lt\rangle$, $C_t=b\langle g_n,t\rangle$. For their increments we have the estimates
$$\mathbb{E}(A_{t_1}-A_{t_2})^2=a^2\,\mathbb{E}(\langle g_n,t_1-t_2\rangle)^2=a^2\|t_1-t_2\|^2\ \le\ \mathbb{E}(B_{t_1}-B_{t_2})^2=\mathbb{E}(\langle g_m,L(t_1-t_2)\rangle)^2=\|L(t_1-t_2)\|^2\ \le\ \mathbb{E}(C_{t_1}-C_{t_2})^2=b^2\|t_1-t_2\|^2.$$
By the comparison theorem $\mathbb{E}\sup_t A_t\le\mathbb{E}\sup_t B_t\le\mathbb{E}\sup_t C_t$. This implies the result, as $\mathbb{E}\sup_t A_t=a\,\mathbb{E}\sup_t\langle g_n,t\rangle=a\,\omega(T)$, with similar expressions for $B_t$ and $C_t$. ■

The next proposition gives the necessary condition on the target dimension coming from the requirement on the Gaussian widths.

Proposition 5.0.3. Let $M\subset\mathbb{R}^n$ have Gaussian width $\omega(M)$, let $L$ be a linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$, and let $0<\epsilon<1$. Assume for all $x,y\in M$
$$(1-\epsilon)\|x-y\|\le\|L(x-y)\|\le(1+\epsilon)\|x-y\|.$$
Then
$$\frac12\left(\frac{1-\epsilon}{1+\epsilon}\,\frac{\omega(M)}{\mathrm{diam}(M)}\right)^2\le m.$$

Proof. Consider the set $M-M$ (defined as $M+(-M)$). By proposition 5.0.2,
$$(1-\epsilon)\,\omega(M-M)\le\omega(L(M-M)).$$
We have $\omega(M-M)=2\,\omega(M)$ [35, proposition 7.5.2]. The image $L(M-M)$ is contained in a ball in $\mathbb{R}^m$ of radius $(1+\epsilon)\,\mathrm{diam}(M)$. Since $\omega(B^m(1))\le\sqrt m$ [35, 7.5.7], by monotonicity of the Gaussian width and scaling, $\omega(L(M-M))\le(1+\epsilon)\,\mathrm{diam}(M)\sqrt m$. Then
$$2(1-\epsilon)\,\omega(M)=(1-\epsilon)\,\omega(M-M)\le\omega(L(M-M))\le(1+\epsilon)\,\mathrm{diam}(M)\sqrt m,$$
which implies the claimed bound. ■
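Proposition 5.0.2 is easy to probe numerically. The sketch below is our own illustration (the dimension and distortion window are arbitrary): it builds $L$ with singular values in $[a,b]$, so the hypothesis holds with these constants, and compares Monte Carlo widths for $T=S^{n-1}$.

```python
import numpy as np

# Monte Carlo probe (ours) of proposition 5.0.2 with T = S^{n-1}: build L with
# singular values in [a, b], so the hypothesis holds with these constants;
# omega(T) = E||g_n|| and omega(L(T)) = E||L^T g_n||.
rng = np.random.default_rng(4)
n, a, b = 6, 0.8, 1.2
u, _ = np.linalg.qr(rng.normal(size=(n, n)))
v, _ = np.linalg.qr(rng.normal(size=(n, n)))
L = u @ np.diag(rng.uniform(a, b, size=n)) @ v.T
g = rng.normal(size=(200_000, n))
w_T = np.linalg.norm(g, axis=1).mean()           # omega(S^{n-1})
w_LT = np.linalg.norm(g @ L, axis=1).mean()      # omega(L(S^{n-1}))
assert a * w_T - 0.02 <= w_LT <= b * w_T + 0.02
```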
In the next proposition we combine the Sudakov inequality with the lower covering estimates of corollary 2.4.2. When we optimize the Sudakov inequality over the choice of covering radius, we arrive at two possible regimes.

Proposition 5.0.4. Let $M$ be a $d$-dimensional smooth submanifold of $\mathbb{R}^n$, possibly with boundary, with volume $V$ and reach $\tau$. Let $0<\epsilon\le\frac{\tau}{2}$. Let $\omega_d=\frac{\pi^{d/2}}{\Gamma(\frac{d}{2}+1)}$ be the volume of the unit $d$-ball. There is a universal constant $c$ such that if $\frac{2}{3\sqrt e}\left(\frac{V}{\omega_d}\right)^{1/d}\le\tau$ then
$$\frac{c}{3}\sqrt{\frac{d}{2e}}\left(\frac{V}{\omega_d}\right)^{1/d}\le\omega(M).$$
If $\tau\le\frac{2}{3\sqrt e}\left(\frac{V}{\omega_d}\right)^{1/d}$ then
$$c\,\frac{\tau}{2}\sqrt{\log\left(\frac{V}{\omega_d(\frac32\tau)^d}\right)}\le\omega(M).$$

Proof. We use Sudakov's inequality [35, theorem 7.4.1]:
$$c\,\epsilon\sqrt{\log(C(M,\epsilon))}\le\omega(M).$$
Sudakov's inequality is valid for any radius $\epsilon$. We further restrict to $\epsilon\le\frac{\tau}{2}$ so we can use $\frac{V}{\omega_d(3\epsilon)^d}\le C(M,\epsilon)$, corollary 2.4.2. We obtain
$$c\,\epsilon\sqrt{\log\left(\frac{V}{\omega_d(3\epsilon)^d}\right)}\le\omega(M). \quad(5.1)$$
Maximizing the left hand side with respect to $\epsilon$, the optimal $\epsilon=\epsilon^*$ satisfies
$$\sqrt{\log\left(\frac{V}{\omega_d(3\epsilon)^d}\right)}-\frac{d}{2\sqrt{\log\left(\frac{V}{\omega_d(3\epsilon)^d}\right)}}=0$$
$$\log(3\epsilon^*)=\frac{1}{d}\log\left(\frac{V}{\omega_d}\right)-\frac12$$
$$\epsilon^*=\frac{1}{3\sqrt e}\left(\frac{V}{\omega_d}\right)^{1/d}.$$
Using this optimal choice in equation (5.1), we get
$$\frac{c}{3}\sqrt{\frac{d}{2e}}\left(\frac{V}{\omega_d}\right)^{1/d}\le\omega(M).$$
If the optimal $\epsilon^*$ is forbidden because of $\epsilon\le\frac{\tau}{2}$, then we just use $\epsilon=\frac{\tau}{2}$. It gives the inequality
$$c\,\frac{\tau}{2}\sqrt{\log\left(\frac{V}{\omega_d(\frac32\tau)^d}\right)}\le\omega(M).$$
For a valid formula we must have
$$0\le\log\left(\frac{V}{\omega_d(\frac32\tau)^d}\right),\qquad 1\le\frac{V}{\omega_d(\frac32\tau)^d},\qquad \tau\le\frac23\left(\frac{V}{\omega_d}\right)^{1/d}. \quad(5.2)$$
Because we are in the case where the optimal $\epsilon^*$ is forbidden, we have
$$\frac{\tau}{2}<\epsilon^*=\frac{1}{3\sqrt e}\left(\frac{V}{\omega_d}\right)^{1/d},\qquad \tau<\frac{2}{3\sqrt e}\left(\frac{V}{\omega_d}\right)^{1/d}.$$
This is enough for satisfying (5.2). ■

Now we put together the requirement on the target dimension of a JL map with the geometric properties of the manifold and its Gaussian width.

Theorem 5.0.5. Let $M\subset\mathbb{R}^n$ with Gaussian width $\omega(M)$, let $L$ be a linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$, let $0<\epsilon<1$, and let $c$ represent multiple universal constants. Assume for all $x,y\in M$
$$(1-\epsilon)\|x-y\|\le\|L(x-y)\|\le(1+\epsilon)\|x-y\|.$$
If $\frac{2}{3\sqrt e}\left(\frac{V}{\omega_d}\right)^{1/d}\le\tau$ then
$$c\left(\frac{1-\epsilon}{1+\epsilon}\right)^2\frac{d}{\mathrm{diam}^2(M)}\left(\frac{V}{\omega_d}\right)^{2/d}\le m.$$
If $\tau\le\frac{2}{3\sqrt e}\left(\frac{V}{\omega_d}\right)^{1/d}$ then
$$c\,\tau^2\left(\frac{1-\epsilon}{1+\epsilon}\right)^2\frac{1}{\mathrm{diam}^2(M)}\log\left(\frac{V}{\omega_d(\frac32\tau)^d}\right)\le m.$$

Proof. We use proposition 5.0.3 and proposition 5.0.4. We have a lower bound on the embedding dimension $m$ in terms of the distortion $\epsilon$ and the Gaussian width $\omega(M)$, and a lower bound on $\omega(M)$ in terms of the covering numbers of $M$. Putting the two together, if $\frac{2}{3\sqrt e}(\frac{V}{\omega_d})^{1/d}\le\tau$:
$$\frac{c}{3}\sqrt{\frac{d}{2e}}\left(\frac{V}{\omega_d}\right)^{1/d}\le\omega(M),\quad \frac12\left(\frac{1-\epsilon}{1+\epsilon}\frac{\omega(M)}{\mathrm{diam}(M)}\right)^2\le m \implies c\left(\frac{1-\epsilon}{1+\epsilon}\right)^2\frac{d}{\mathrm{diam}^2(M)}\left(\frac{V}{\omega_d}\right)^{2/d}\le m.$$
If $\tau\le\frac{2}{3\sqrt e}(\frac{V}{\omega_d})^{1/d}$:
$$c\,\frac{\tau}{2}\sqrt{\log\left(\frac{V}{\omega_d(\frac32\tau)^d}\right)}\le\omega(M),\quad \frac12\left(\frac{1-\epsilon}{1+\epsilon}\frac{\omega(M)}{\mathrm{diam}(M)}\right)^2\le m \implies c\,\tau^2\left(\frac{1-\epsilon}{1+\epsilon}\right)^2\frac{1}{\mathrm{diam}^2(M)}\log\left(\frac{V}{\omega_d(\frac32\tau)^d}\right)\le m. \ ■$$

Standard examples: We apply theorem 5.0.5 to the standard examples of the unit sphere $S^d$ and the unit ball $B^d$ as submanifolds of $\mathbb{R}^N$.

Corollary 5.0.6. For $B^d$, the data is as follows: $V_{B^d}=\frac{\pi^{d/2}}{\Gamma(\frac{d}{2}+1)}$, $\tau_{B^d}=\infty$, $\omega_d=V_{B^d}$, $\mathrm{diam}(M)=2$. Since $\tau$ is infinite, we have
$$c\left(\frac{1-\epsilon}{1+\epsilon}\right)^2\frac{d}{\mathrm{diam}^2(M)}\left(\frac{V}{\omega_d}\right)^{2/d}\le m,\qquad c\left(\frac{1-\epsilon}{1+\epsilon}\right)^2 d\le m,$$
which agrees with our intuition that for a perfect, no-distortion embedding of the unit $B^d$, the target dimension grows at least linearly in $d$. ■

Corollary 5.0.7. For the unit $\mathbb{S}^d$, the data is as follows: $V_{S^d}=\frac{2\pi^{\frac{d+1}{2}}}{\Gamma(\frac{d+1}{2})}$, $\tau_{S^d}=1$, $\omega_d=\frac{\pi^{d/2}}{\Gamma(\frac{d}{2}+1)}$, $\mathrm{diam}(M)=2$. We check the relation between the reach and the other parameters to find that $\tau$ is big enough:
$$\frac{2}{3\sqrt e}\left(\frac{V_{S^d}}{\omega_d}\right)^{1/d}=\frac{2}{3\sqrt e}\left(2\sqrt\pi\,\frac{\Gamma(\frac{d}{2}+1)}{\Gamma(\frac{d+1}{2})}\right)^{1/d}.$$
Using $\lim_{d\to\infty}\left(2\sqrt\pi\,\frac{\Gamma(\frac{d}{2}+1)}{\Gamma(\frac{d+1}{2})}\right)^{1/d}=1$, for large enough $d$ we have
$$\frac{2}{3\sqrt e}\left(\frac{V_{S^d}}{\omega_d}\right)^{1/d}<\tau=1.$$
So we get
$$c\left(\frac{1-\epsilon}{1+\epsilon}\right)^2\frac{d}{\mathrm{diam}^2(M)}\left(\frac{V}{\omega_d}\right)^{2/d}\le m,\qquad c\left(\frac{1-\epsilon}{1+\epsilon}\right)^2 d\le m.$$
This gives a linear dependence on $d$, which agrees with our intuition that to embed $S^d$ without distortion at least $d$ dimensions are needed. The reason is that the standard $S^d$ circumscribes $d+2$ points that are equidistant from each other, and such points need at least $\mathbb{R}^{d+1}$ for an embedding. ■
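The optimizer $\epsilon^*$ in proposition 5.0.4 can be confirmed with a direct grid search; the values $d=2$ and $V/\omega_d=100$ below are illustrative choices of ours.

```python
import numpy as np

# Grid-search confirmation (ours) of the optimizer in proposition 5.0.4:
# maximize f(eps) = eps * sqrt(log(V / (omega_d * (3*eps)^d))) and compare to
# eps* = (1/(3*sqrt(e))) * (V/omega_d)^(1/d).
d, ratio = 2, 100.0                               # ratio stands for V/omega_d
eps = np.linspace(1e-4, ratio ** (1 / d) / 3, 200_000)
f = eps * np.sqrt(np.log(ratio / (3 * eps) ** d))
eps_star = ratio ** (1 / d) / (3 * np.sqrt(np.e))
print(eps[np.argmax(f)], eps_star)                # should agree closely
```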
Reduction to the case of finite points: Now we show that when we reduce to the zero-dimensional case we recover the finite-point case. Consider $n$ points in $\mathbb{R}^N$. There are at most $\binom{n}{2}$ pairwise secants. Normalizing them we get points on the unit $S^{N-1}$; call this set $M$. We apply Sudakov's inequality to the set $M$. The covering number with balls of radius $\epsilon$, for $\epsilon<\frac{\tau}{2}$, is exactly $|M|$. Using the counting measure we put $V=|M|$. Then by Sudakov's inequality with $\epsilon=\frac{\tau}{2}$ we get
$$c\,\frac{\tau}{2}\sqrt{\log(V)}\le\omega(M),\qquad c\,\frac{\tau}{2}\sqrt{\log\left(\frac{n(n-1)}{2}\right)}\le\omega(M),\qquad c\,\frac{\tau}{4}\sqrt{\log(n)}\le\omega(M).$$
Using proposition 5.0.3,
$$\frac12\left(\frac{1-\epsilon}{1+\epsilon}\,\frac{c\frac{\tau}{4}\sqrt{\log(n)}}{\mathrm{diam}(M)}\right)^2\le\frac12\left(\frac{1-\epsilon}{1+\epsilon}\,\frac{\omega(M)}{\mathrm{diam}(M)}\right)^2\le m. \quad(5.3)$$
For finite points the reach $\tau$ equals half of the smallest distance between the points, and the diameter is the largest distance, hence $\frac{\tau}{\mathrm{diam}}\le\frac12$. Therefore this term won't push the lower bound up enough to contradict the known upper bounds. Simplifying equation (5.3) we arrive at
$$c\left(\frac{\tau}{\mathrm{diam}}\right)^2\left(\frac{1-\epsilon}{1+\epsilon}\right)^2\log(n)\le m.$$
Comparing to the well-known JL upper bound $O(\frac{1}{\epsilon^2})\log(n)$ [35, Theorem 5.3.1], we see that we recover the $\log(n)$. A known lower bound for finite points from [5, Theorem 9.3.] is $O\left(\frac{1}{\epsilon^2\log(\frac{1}{\epsilon})}\log(n)\right)$. Compared to our bound, we have the same dependence on $\log(n)$; however our $\epsilon$-dependence of $\frac{1-\epsilon}{1+\epsilon}$ is weaker than $\frac{1}{\epsilon^2\log(\frac{1}{\epsilon})}$. ■

CHAPTER 6
A JL ALGORITHM WITH $N\log(\log(N))$ RUN-TIME (VARIATION 3)

In this chapter we give an algorithm for computing a JL map with an $N\log(\log(N))$ run-time. This approach improves on the work of Ailon and Liberty in [2] for fast JL maps. First we present the algorithm for finite point sets, and afterwards we upgrade it to the case of subspaces. Since we restrict the discussion to finite points and subspaces, our bounds don't directly involve the reach as a parameter. Our algorithm is based on a divide-and-conquer approach: we split a vector into pieces, apply a common fast JL map to each piece, combine the outputs, and apply a sub-Gaussian JL map at the end. A schematic diagram of the algorithm is presented in figure 6.1. In this way we maintain the best of two worlds: the run-time speed of fast JL maps and the optimal embedding dimension of sub-Gaussian matrices. In lemma 6.0.1 we show how the general framework combining fast and sub-Gaussian JL maps works. Our main theorems are 6.1.5 for finite points and 6.2.5 for the case of subspaces. Although our theorems are stated for general matrices with a fast matrix-vector multiply, Discrete Fourier Transform (DFT) and Walsh-Hadamard matrices are prime examples of matrices with the fast runtimes.

Figure 6.1: A vector is split into pieces, each part is processed with a fast JL map, and the outputs are combined and fed into a Gaussian JL map. The Gaussian maps have optimal dimensional reduction while fast JL maps have better run-time. In this mixed approach, we combine the benefits of both methods.

Lemma 6.0.1. Let $\epsilon\in(0,\frac13)$ and let $S\subset\mathbb{C}^N$ be a set with $n$ points. Assume $N>m_1^2$ for integers $m_1\ge m_2$ where $m_1^2$ divides $N$. Furthermore:
1. Split every vector in $S$ into $N/m_1^2$ pieces, each in $\mathbb{C}^{m_1^2}$, and obtain the set $S'\subset\mathbb{C}^{m_1^2}$ of cardinality $nN/m_1^2$.
2. Let $A\in\mathbb{C}^{m_1\times m_1^2}$ be an $\epsilon$-JL map of $S'$ into $\mathbb{C}^{m_1}$.
3. Suppose $A$ has a fast matrix-vector multiply so that $Ay$ can be computed in $m_1^2\cdot f(m_1)=o(m_1^3)$ time for all $y\in\mathbb{C}^{m_1^2}$.
4. Let $S''\subset\mathbb{C}^{N/m_1}$ be the set of $n$ points obtained by reshaping the $N/m_1^2$ pieces of each vector in $S$ back into a single vector after they have been mapped into $\mathbb{C}^{m_1}$ by $A$, and
5. Let $B\in\mathbb{C}^{m_2\times N/m_1}$ be any $\epsilon$-JL map of $S''$ into $\mathbb{C}^{m_2}$.
Then $L(x):=B\,\mathrm{diag}(A,\dots,A)\,x$ will be a $3\epsilon$-JL map of $S\subset\mathbb{C}^N$ into $\mathbb{C}^{m_2}$. Furthermore, $L$ can be applied to any vector in just $N\cdot f(m_1)$ time.

Proof. For $x\in S$, let $x_i$, $1\le i\le\frac{N}{m_1^2}$, be its pieces in $S'$. Let $x'\in S''$ be the concatenated vector $(Ax_1,\dots,Ax_{N/m_1^2})$. Then $\|x\|^2=\sum_i\|x_i\|^2$, $\|x'\|^2=\sum_i\|Ax_i\|^2$, and
$$(1-\epsilon)\|x_i\|^2\le\|Ax_i\|^2\le(1+\epsilon)\|x_i\|^2,\qquad(1-\epsilon)\|x'\|^2\le\|Bx'\|^2\le(1+\epsilon)\|x'\|^2.$$
Therefore we get the claimed $3\epsilon$-JL map as follows:
$$(1-3\epsilon)\|x\|^2\le(1-\epsilon)^2\|x\|^2\le\|Bx'\|^2\le(1+\epsilon)^2\|x\|^2\le(1+3\epsilon)\|x\|^2.$$
For the time complexity, the number of required operations is
$$\frac{N}{m_1^2}\left(m_1^2 f(m_1)\right)+m_2\frac{N}{m_1}=O(N f(m_1)).$$
The first term is $\frac{N}{m_1^2}$ applications of the matrix $A$, and the second term is a single application of the matrix $B$. ■
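A minimal sketch of the scheme of lemma 6.0.1 follows; it is our own illustration, not the constants of the later theorems. The fast map $A$ is taken to be a subsampled unitary FFT with random signs, in the spirit of the SORS matrices of section 4.2, and $B$ is Gaussian. With this choice, applying $A$ costs $O(m_1^2\log m_1)$ per piece, so the whole map runs in $O(N\log m_1)$ time, which becomes $O(N\log\log N)$ once $m_1$ is polylogarithmic in $N$.

```python
import numpy as np

rng = np.random.default_rng(5)
N, m1, m2 = 4096, 16, 64        # m1^2 = 256 divides N; illustrative sizes only
P = N // m1**2                  # number of pieces per vector

# Fast map A: random signs, a unitary FFT, then m1 sampled rows, rescaled so
# that A is an isometry in expectation. Cost per piece: O(m1^2 log m1).
signs = rng.choice([-1.0, 1.0], size=m1**2)
rows = rng.choice(m1**2, size=m1, replace=False)

def apply_A(piece):
    return np.fft.fft(signs * piece, norm="ortho")[rows] * np.sqrt(m1**2 / m1)

# Slow-but-small map B: a Gaussian JL matrix applied to the recombined vector.
B = rng.normal(size=(m2, P * m1)) / np.sqrt(m2)

def L(x):                       # the composite map of lemma 6.0.1
    x_prime = np.concatenate([apply_A(p) for p in x.reshape(P, m1**2)])
    return B @ x_prime

x = rng.normal(size=N)
print(np.linalg.norm(L(x)) / np.linalg.norm(x))   # should be close to 1
```

The design point is exactly the trade-off described above: the fast map does the heavy dimension reduction cheaply, and the small Gaussian map finishes with the optimal embedding dimension.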
6.1 Case of Finite Points

In this section we review the necessary theorems for JL maps of finite point sets via sub-Gaussian matrices and SORS matrices. For sub-Gaussian matrices, the main tools are the matrix deviation inequality and an upper bound on the Gaussian width of $n$ points. For SORS matrices we do not have an equivalent matrix deviation inequality, so we proceed indirectly: first an RIP estimate is derived for the SORS matrices, and then one considers a sequence of random reflections by the coordinate axes, via a theorem known as Krahmer-Ward, to get a JL map.

6.1.1 JL Maps for Finite Points via sub-Gaussian Matrices

We specialize the matrix deviation inequality to a finite point set and use the Gaussian width bound for finite points.

Lemma 6.1.1. Let $x_i$ be $n$ points in $\mathbb{R}^N$. Let $0<\epsilon,\rho<1$. Let $A$ be an $m\times N$ sub-Gaussian random matrix with $K=\max_i\|A_i\|_{\psi_2}$. Then there exists a universal constant $C$ such that if
$$m\ge C^2K^4\epsilon^{-2}\log(n)\left(1+\log\left(\frac{2}{\rho}\right)\right) \quad(6.1)$$
then with probability at least $1-\rho$ the following holds simultaneously for all the points:
$$(1-\epsilon)\|x\|^2\le\left\|\frac{1}{\sqrt m}Ax\right\|^2\le(1+\epsilon)\|x\|^2. \quad(6.2)$$

Proof. By normalizing the $x_i$ we can assume they are on the unit sphere, since (6.2) is scaling invariant. Next we use theorem 4.1.6. If we let $T$ be the set of normalized $x_i$, then by [35, lemma 7.5.10] we have
$$\omega(T)\le C\sqrt{\log(n)}. \quad(6.3)$$
Using (6.1) and adjusting the constants we get the desired bound. To get the squares in (6.2), we note that for $0<\epsilon<1$, $1-2\epsilon\le(1-\epsilon)^2\le(1+\epsilon)^2\le1+3\epsilon$. ■

6.1.2 JL Maps for Finite Points via SORS Matrices

We first review the Krahmer-Ward theorem, which allows one to get a JL map for finite points from an RIP estimate.

Theorem 6.1.2. (Krahmer-Ward) [18, theorem 9.36] Let $x_i$ be $n$ points in $\mathbb{R}^N$. Let $\rho,\epsilon\in(0,1)$. Let $A\in\mathbb{R}^{m\times N}$ be a matrix whose restricted isometry constant for sparsity $2s$, i.e. $\delta_{2s}$, satisfies $\delta_{2s}\le\frac{\epsilon}{4}$ for some $s\ge16\log(\frac{4n}{\rho})$. Let $D$ be a diagonal matrix with random $\pm1$ variables on the diagonal. Then with probability exceeding $1-\rho$ the following holds simultaneously for all $x_i$:
$$(1-\epsilon)\|x_i\|^2\le\|ADx_i\|^2\le(1+\epsilon)\|x_i\|^2. \quad(6.4)$$

Here we present an RIP estimate for SORS matrices that we can pair with the Krahmer-Ward theorem.

Theorem 6.1.3. (RIP for SORS Matrices) [18, theorem 12.31] Let $A\in\mathbb{C}^{m\times N}$ be a SORS matrix with constant $K\ge1$. Let $\epsilon\in(0,1)$. There is a universal constant $C$ such that if
$$m\ge CK^2\epsilon^{-2}s\log^4(N) \quad(6.5)$$
then with probability at least $1-N^{-\log^3(N)}$ the restricted isometry constant of $\frac{1}{\sqrt m}A$ for sparsity $s$, $\delta_s$, satisfies $\delta_s\le\epsilon$.

Finally we present a finite-point JL map for the SORS matrices.

Proposition 6.1.4. Let $x_i$ be $n$ points in $\mathbb{C}^N$. Let $A\in\mathbb{C}^{m\times N}$ be a SORS matrix with constant $K$. There is a universal constant $C$ such that if
$$m\ge CK^2\epsilon^{-2}\log\left(\frac{n}{\rho}\right)\log^4(N), \quad(6.6)$$
then with probability at least $1-\rho-N^{-\log^3(N)}$
$$(1-\epsilon)\|x_i\|^2\le\left\|\frac{1}{\sqrt m}ADx_i\right\|^2\le(1+\epsilon)\|x_i\|^2, \quad(6.7)$$
where $D$ is a random diagonal matrix with uniform $\pm1$ entries on its diagonal.

Proof.
6.1.3 Combining the Two Matrix Types

We combine the SORS and sub-Gaussian JL maps via the scheme of lemma 6.0.1 to obtain our main theorem for finite points.

Theorem 6.1.5. Let x₁, …, xₙ be n points in R^N and let ε, ρ₁, ρ₂ ∈ (0, 1). Then with probability at least 1 − ρ₁ − ρ₂ − N^(−log³(N)) one can find a linear map L : R^N → R^m such that

(1 − ε)‖xᵢ‖² ≤ ‖Lxᵢ‖² ≤ (1 + ε)‖xᵢ‖²   (6.8)

with

m = O(ε⁻² log(n)(1 + log(2/ρ₂))),   (6.9)

and a single Lxᵢ can be computed with time complexity O(N log(ε⁻² log(nN/ρ₁) log⁴(N))).

Proof. We follow the scheme of lemma 6.0.1, so we have matrices A ∈ C^(m₁ × m₁²) and B ∈ C^(m₂ × N/m₁). The task is to find sufficient conditions on m₁ and m₂, where m₁ will determine the run-time and m₂ will determine the final embedding dimension.

First we determine m₁. We need an ε/3-JL map for the nN/m₁² pieces from the SORS matrix A. By proposition 6.1.4 it suffices that

m₁ ≥ C K² (9/ε²) log(nN/(m₁² ρ₁)) log⁴(N).   (6.10)

Removing the m₁² from the right-hand side only makes the required condition on m₁ stricter, hence it is sufficient to have

m₁ ≥ C K² (9/ε²) log(nN/ρ₁) log⁴(N).   (6.11)

The success probability of this step is 1 − ρ₁ − N^(−log³(N)).

For m₂, we need B to be an ε/3-JL map of n points. By lemma 6.1.1 it suffices that

m₂ ≥ C² K⁴ (9/ε²) log(n)(1 + log(2/ρ₂)),   (6.12)

with success probability at least 1 − ρ₂.

Combining the two steps and specializing to the DFT matrix, which has a d log(d) run-time in dimension d, our algorithm has

Final embedding dimension: m₂ = O(ε⁻² log(n)(1 + log(2/ρ₂)))   (6.13)
Run-time: O(N log(m₁)) = O(N log(ε⁻² log(nN/ρ₁) log⁴(N)))   (6.14)
Success probability: 1 − ρ₁ − ρ₂ − N^(−log³(N)).   (6.15)

This finishes the proof. ■
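For a feel of the parameter sizes in theorem 6.1.5, the snippet below simply evaluates the sufficient conditions (6.11) and (6.12). The universal constant C and the SORS/sub-Gaussian constant K are not specified by the theorems, so we set them to 1 purely for illustration; actual values would scale both outputs.

```python
import numpy as np

def pick_dims(n, N, eps, rho1, rho2, C=1.0, K=1.0):
    """Sufficient m1 (eq. 6.11) and m2 (eq. 6.12); C and K are the
    unspecified universal constants, set to 1 here for illustration."""
    m1 = C * K**2 * (9 / eps**2) * np.log(n * N / rho1) * np.log(N) ** 4
    m2 = C**2 * K**4 * (9 / eps**2) * np.log(n) * (1 + np.log(2 / rho2))
    return int(np.ceil(m1)), int(np.ceil(m2))

m1, m2 = pick_dims(n=10**4, N=2**20, eps=0.1, rho1=0.01, rho2=0.01)
print(m1, m2)   # m1 drives the N log(m1) run-time, m2 the output dimension
```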
6.2 Upgrade for Subspaces

In this section we repeat the process of the previous case, but we modify our theorems so that we obtain a JL embedding of a subspace. By the scaling invariance of the JL condition (1 − ε)‖x‖ ≤ ‖Lx‖ ≤ (1 + ε)‖x‖, we can reduce the subspace problem to the corresponding unit sphere. A theorem from [29, Theorem 3.3] provides a JL map for an arbitrary set in terms of its Gaussian width; with it we can get a JL embedding of the unit sphere of a subspace. This theorem uses SORS matrices.

6.2.1 SORS Matrices for Subspaces

Theorem 6.2.1. [29, Theorem 3.3] Let T ⊂ R^N and suppose A ∈ R^(m×N) is selected from a SORS distribution with constant K. Then

sup_{x ∈ T} | ‖Ax‖² − ‖x‖² | ≤ max{ε, ε²} rad(T)²   (6.16)

holds with probability at least 1 − ρ as long as

m ≥ C K² (1 + log(2/ρ))² log⁴(N) ω(T)² max(1, rad(T)²) / ε².   (6.17) ■

Now we specialize to the unit sphere of a subspace, using an estimate for the Gaussian width of a unit sphere.

Corollary 6.2.2. Let T be a d-dimensional affine subspace of R^N and assume A ∈ R^(m×N) is selected from a SORS distribution with constant K. Suppose 0 < ε, ρ < 1 and

m ≥ C K² (1 + log(2/ρ))² log⁴(N) d/ε².   (6.18)

Then with probability at least 1 − ρ the following holds for all x ∈ T:

(1 − ε)‖x‖² ≤ ‖Ax‖² ≤ (1 + ε)‖x‖².   (6.19)

Proof. We use theorem 6.2.1. Since (6.19) is scaling invariant, we may scale T to the unit sphere S^(d−1). The Gaussian width ω(S^(d−1)) is bounded above by √d + C for a universal constant C [35, Example 7.5.7]. Using this fact and adjusting the constants, the result follows. ■

6.2.2 Sub-Gaussian Matrices for Subspaces

Now we state a JL theorem for subspace embeddings using sub-Gaussian matrices. The main tool is the matrix deviation inequality [35, Chapter 9].

Corollary 6.2.3. Let T be a d-dimensional affine subspace of R^N and let A be an m × N sub-Gaussian random matrix with K = maxᵢ ‖Aᵢ‖ψ₂. Suppose

m ≥ C K⁴ (1 + log(2/ρ))² d/ε².   (6.20)

Then with probability at least 1 − ρ the following holds for all x ∈ T:

(1 − ε)‖x‖² ≤ ‖Ax‖² ≤ (1 + ε)‖x‖².   (6.21)

Proof. We use theorem 4.1.6. Again, since (6.21) is scaling invariant, we reduce to the unit sphere. Since ω(S^(d−1)) ≤ √d + C for a universal constant C, it suffices that

m ≥ C² K⁴ (√d + C + √(log(2/ρ)))² / ε².   (6.22)

Adjusting the constants and using 3(a² + b² + c²) ≥ (a + b + c)² gives the desired result. ■

6.2.3 Mixing the Matrix Types

We use the scheme of lemma 6.0.1 to combine the SORS and Gaussian matrices: we break the vectors into smaller pieces, apply a fast JL map, recombine the pieces, and then apply a sub-Gaussian JL map. To prove the desired properties of our algorithm we also need to recall proposition 5.0.2, which lets us control the Gaussian width of a set after a JL map is applied to it.

Proposition 6.2.4. Let L : R^n → R^m be a linear map and let T ⊂ R^n, with positive reals a and b such that a‖x − y‖ ≤ ‖L(x − y)‖ ≤ b‖x − y‖ for all x, y ∈ T. Then a ω(T) ≤ ω(L(T)) ≤ b ω(T).
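The width bound ω(S^(d−1)) ≤ √d + C used in corollaries 6.2.2 and 6.2.3, and again in the proof below, is easy to check numerically: for the unit sphere U(T) of a d-dimensional subspace T, sup_{x ∈ U(T)} ⟨g, x⟩ equals the norm of the projection of g onto T, so a Monte Carlo estimate of the width only needs an orthonormal basis. A small sketch, with variable names of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(2)
N, d, trials = 500, 20, 4000

# Orthonormal basis Q of a random d-dimensional subspace T of R^N.
Q, _ = np.linalg.qr(rng.normal(size=(N, d)))

# omega(U(T)) = E sup_{x in U(T)} <g, x> = E ||Q^T g||_2 for g ~ N(0, I_N).
width = np.mean([np.linalg.norm(Q.T @ rng.normal(size=N)) for _ in range(trials)])
print(width, np.sqrt(d))   # the estimate sits just below sqrt(d)
```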
Now we are ready to prove the subspace version of the algorithm. The proof strategy first uses the properties of SORS matrices on the pieces of an input vector; then we use proposition 6.2.4 to control the Gaussian width of the image of the set after the SORS matrix has been applied. This in turn allows us to apply the matrix deviation inequality as the final step.

Theorem 6.2.5. Let T be a d-dimensional affine subspace of R^N. Let 0 < ε, ρ₁, ρ₂ < 1 with ρ₁ + ρ₂ ≤ 1, and let K ≥ 1. Let m₁ and m₂ be integers such that

m₁ ≥ C K² (1 + log(2N/(ρ₁ m₁²)))² log⁴(N) d/ε²   (6.23)

m₂ ≥ C K⁴ (1 + ε)² d (1 + log(2/ρ₂)) / ε².   (6.24)

Then with probability at least 1 − ρ₁ − ρ₂ one can find a linear map L : R^N → R^(m₂) such that for all x, y ∈ T

(1 − ε)‖x − y‖² ≤ ‖L(x − y)‖² ≤ (1 + ε)‖x − y‖².   (6.25)

Furthermore, for x ∈ T, Lx can be computed in O(N log(K² (1 + log(2N/(ρ₁ m₁²)))² log⁴(N) d/ε²)) time.

Proof. Following the scheme of lemma 6.0.1, we have matrices A ∈ C^(m₁ × m₁²) and B ∈ C^(m₂ × N/m₁). Our goal is to find sufficient conditions on m₁ and m₂ that guarantee the required JL properties.

The projection of T onto an m₁²-dimensional coordinate subspace is another subspace, of dimension at most d, and the Gaussian width of the unit sphere in such a subspace is bounded above by √d + C for a universal constant C. We now apply corollary 6.2.2 with distortion ε/3, once for each of the N/m₁² different projections, each with success probability 1 − ρ₁m₁²/N; combining the failure probabilities with the union bound gives an overall success probability of 1 − ρ₁. Therefore we need

m₁ ≥ C K² (1 + log(2N/(ρ₁ m₁²)))² log⁴(N) d/ε²   (6.26)

for a universal constant C, where K is the constant associated with the SORS matrix A.

Now consider the block diagonal matrix C = diag(A, …, A) ∈ C^((N/m₁) × N). For any x ∈ T, since the matrix A distorts each individual projection of x by at most a factor of 1 ± ε, the same is true for x. Applying proposition 6.2.4 to the unit sphere of T, which we denote U(T), we can estimate the Gaussian width of its image from above by ω(C(U(T))) ≤ (1 + ε)(√d + C′).

Now we can apply the matrix deviation inequality, theorem 4.1.5. With probability at least 1 − ρ₂ we have

sup_{x ∈ C(U(T))} | ‖Bx‖₂ − √m₂ ‖x‖₂ | ≤ C′K² [ ω(C(U(T))) + √(log(2/ρ₂)) · sup_{x ∈ C(U(T))} ‖x‖₂ ]   (6.27)

≤ C′K² (1 + ε) [ √d + C′ + √(log(2/ρ₂)) ].   (6.28)

Rewriting, we get that for x ∈ C(U(T))

‖x‖₂ − C′K²(1 + ε)[√d + C′ + √(log(2/ρ₂))]/√m₂ ≤ ‖(1/√m₂) Bx‖₂ ≤ ‖x‖₂ + C′K²(1 + ε)[√d + C′ + √(log(2/ρ₂))]/√m₂.

Now since C is an ε-JL map of U(T), we have 1 − ε ≤ ‖x‖₂ ≤ 1 + ε. Therefore

1 − ε − C′K²(1 + ε)[√d + C′ + √(log(2/ρ₂))]/√m₂ ≤ ‖(1/√m₂) Bx‖₂ ≤ 1 + ε + C′K²(1 + ε)[√d + C′ + √(log(2/ρ₂))]/√m₂.

Finally, as long as

m₂ ≥ C′K⁴ (1 + ε)² d (1 + log(2/ρ₂)) / ε²   (6.29)

we get

1 − 2ε ≤ ‖(1/√m₂) Bx‖₂ ≤ 1 + 2ε.

Adjusting the constants gives the desired bound. To summarize, we have three estimates:

Final embedding dimension: m₂ = O(K⁴ (1 + ε)² d (1 + log(2/ρ₂)) / ε²).   (6.30)
Run-time: O(N log(m₁)) = O(N log(K² (1 + log(2N/(ρ₁ m₁²)))² log⁴(N) d/ε²)).   (6.31)
Success probability: 1 − ρ₁ − ρ₂.   (6.32) ■
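In practice, the subspace pipeline is the same map L = B · diag(A, …, A) as in the finite-point case (for instance the jl_pipeline sketch after lemma 6.0.1); only the choice of m₁ and m₂ changes. A hedged helper evaluating (6.23) and (6.24), again with the unknown constants C and K set to 1 for illustration, and with the m₁² inside the logarithm of (6.23) dropped — which only strengthens the condition, exactly as in the finite-point proof:

```python
import numpy as np

def pick_dims_subspace(d, N, eps, rho1, rho2, C=1.0, K=1.0):
    """Sufficient m1 (eq. 6.23, with m1^2 dropped from the log) and
    m2 (eq. 6.24); C and K are set to 1 purely for illustration."""
    m1 = C * K**2 * (1 + np.log(2 * N / rho1)) ** 2 * np.log(N) ** 4 * d / eps**2
    m2 = C * K**4 * (1 + eps) ** 2 * d * (1 + np.log(2 / rho2)) / eps**2
    return int(np.ceil(m1)), int(np.ceil(m2))

print(pick_dims_subspace(d=10, N=2**20, eps=0.1, rho1=0.01, rho2=0.01))
```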
BIBLIOGRAPHY

[1] Eddie Aamari, Jisu Kim, Frédéric Chazal, Bertrand Michel, Alessandro Rinaldo, and Larry Wasserman. Estimating the reach of a manifold. Electronic Journal of Statistics, 13(1):1359–1399, 2019.

[2] Nir Ailon and Edo Liberty. Fast dimension reduction using Rademacher series on dual BCH codes. Discrete & Computational Geometry, 42(4):615–630, December 2009.

[3] Ralph Alexander and Stephanie Alexander. Geodesics in Riemannian manifolds-with-boundary. Indiana University Mathematics Journal, 30(4):481–488, 1981.

[4] Stephanie B. Alexander, I. David Berg, and Richard L. Bishop. The Riemannian obstacle problem. Illinois Journal of Mathematics, 31(1):167–184, March 1987.

[5] Noga Alon. Problems and results in extremal combinatorics—I. Discrete Mathematics, 273(1-3):31–53, 2003.

[6] Marcel Berger. A Panoramic View of Riemannian Geometry. Springer Berlin Heidelberg, Berlin, Heidelberg, 2003.

[7] R.L. Bishop and R.J. Crittenden. Geometry of Manifolds. Elsevier Science, 2011.

[8] Jean-Daniel Boissonnat, André Lieutier, and Mathijs Wintraecken. The reach, metric distortion, geodesic convexity and the variation of tangent spaces. Journal of Applied and Computational Topology, 3(1-2):29–58, June 2019.

[9] Károly Böröczky, Jr. Finite Packing and Covering. Cambridge Tracts in Mathematics. Cambridge University Press, 2004.

[10] Simone Brugiapaglia, Sjoerd Dirksen, Hans Christian Jung, and Holger Rauhut. Sparse recovery in bounded Riesz systems with applications to numerical methods for PDEs. arXiv preprint arXiv:2005.06994, 2020.

[11] E.J. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2):489–509, 2006.

[12] E.J. Candes and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12):4203–4215, 2005.

[13] Emmanuel J. Candes and Terence Tao. Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Transactions on Information Theory, 52(12):5406–5425, 2006.

[14] Kenneth L. Clarkson. Tighter bounds for random projections of manifolds. In Proceedings of the Twenty-Fourth Annual Symposium on Computational Geometry (SCG '08), page 39, College Park, MD, USA, 2008. ACM Press.

[15] Armin Eftekhari and Michael B. Wakin. New analysis of manifold embeddings and signal recovery from compressive measurements. Applied and Computational Harmonic Analysis, 39(1):67–109, July 2015.

[16] Herbert Federer. Curvature measures. Transactions of the American Mathematical Society, 93(3):418–491, March 1959.

[17] Werner Fenchel. On the differential geometry of closed space curves. Bulletin of the American Mathematical Society, 57(1):44–54, 1951.

[18] Simon Foucart and Holger Rauhut. A Mathematical Introduction to Compressive Sensing. Applied and Numerical Harmonic Analysis. Springer New York, New York, NY, 2013.

[19] Sylvestre Gallot, Dominique Hulin, and Jacques Lafontaine. Riemannian Geometry, volume 2. Springer, 1990.

[20] Walter Gautschi. Some elementary inequalities relating to the gamma and incomplete gamma function. Journal of Mathematics and Physics, 38(1):77–81, 1959.

[21] Mohammad Ghomi and Serge Tabachnikov. Totally skew embeddings of manifolds. Mathematische Zeitschrift, 258(3):499–512, 2008.

[22] Alfred Gray. Tubes, volume 221. Birkhäuser, 2012.

[23] Subhaneil Lahiri, Peiran Gao, and Surya Ganguli. Random projections of random manifolds. arXiv preprint arXiv:1607.04331, 2016.

[24] R. Lashof and S. Smale. On the immersion of manifolds in Euclidean space. The Annals of Mathematics, 68(3):562, November 1958.

[25] S. Li. Concise formulas for the area and volume of a hyperspherical cap. Asian Journal of Mathematics & Statistics, 4(1):66–70, December 2010.

[26] Martin Lotz. On the volume of tubular neighborhoods of real algebraic varieties. Proceedings of the American Mathematical Society, 143(5):1875–1889, December 2014.

[27] Partha Niyogi, Stephen Smale, and Shmuel Weinberger. Finding the homology of submanifolds with high confidence from random samples. Discrete & Computational Geometry, 39(1-3):419–441, March 2008.

[28] Barrett O'Neill. Semi-Riemannian Geometry: with Applications to Relativity. Number 103 in Pure and Applied Mathematics. Academic Press, New York, 1983.

[29] Samet Oymak, Benjamin Recht, and Mahdi Soltanolkotabi. Isometric sketching of any set via the Restricted Isometry Property. Information and Inference: A Journal of the IMA, 7(4):707–726, December 2018.

[30] William F. Pohl. Some integral formulas for space curves and their generalization. American Journal of Mathematics, 90(4):1321, October 1968.

[31] Takashi Sakai. Riemannian Geometry, volume 149 of Translations of Mathematical Monographs. American Mathematical Society, Providence, RI, 1996.

[32] Shigeo Sasaki. On the differential geometry of tangent bundles of Riemannian manifolds. Tohoku Mathematical Journal, Second Series, 10(3):338–354, 1958.

[33] Hukum Singh. On the cut locus and the focal locus of a submanifold in a Riemannian manifold. II. Publ. Inst. Math. (Beograd) (N.S.), 41(55):119–124, 1987.

[34] Christoph Thäle. 50 years sets with positive reach—a survey. Surveys in Mathematics and its Applications, 3:123–165, 2008.
[35] Roman Vershynin. High-Dimensional Probability: An Introduction with Applications in Data Science. Number 47 in Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge; New York, NY, 2018.

[36] Hermann Weyl. On the volume of tubes. American Journal of Mathematics, 61(2):461, April 1939.

[37] James H. White. Self-linking and the Gauss integral in higher dimensions. American Journal of Mathematics, 91(3):693, July 1969.

[38] James H. White. Self-linking and the directed secant span of a differentiable manifold. Journal of Differential Geometry, 5(3-4):357–369, 1971.