COMBINATORIAL AND FOURIER ANALYTIC L2 METHODS FOR BUFFON’S
NEEDLE PROBLEM
By
Matthew Robert Bond

A DISSERTATION
Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
Mathematics
2011

ABSTRACT
COMBINATORIAL AND FOURIER ANALYTIC L2 METHODS FOR
BUFFON’S NEEDLE PROBLEM
By
Matthew Robert Bond
In recent years, progress has been made on Buffon’s needle problem, in which one considers
a subset of the plane and asks how likely “Buffon’s needle” - a long, straight needle with
independent, uniform distributions on its position and orientation - is to intersect said set.
The case in which the set is a small neighborhood of a one-dimensional unrectifiable
Cantor-like set has been considered in recent years, and progress has been made, motivated
in part by connections to analytic capacity [25].
Call the set $E$, the radius of the neighborhood $\varepsilon$, and the neighborhood
$E_\varepsilon$. Then in some special cases [5][13][18], it has been confirmed that Buffon’s
needle intersects $E_\varepsilon$ with probability at most $C|\log\varepsilon|^{-p}$, for
$p > 0$ small enough, $C > 0$ large enough. In the special case of the so-called “four corner”
Cantor set and Sierpinski’s gasket, the lower bound
$C\frac{\log|\log\varepsilon|}{|\log\varepsilon|}$ is known [3], replacing the previously-known
lower bound $\frac{C}{|\log\varepsilon|}$ which is good for more general one-dimensional
self-similar sets.
In addition, the stronger lower bounds are still good if one “bends the needle” into the
shape of a long circular arc, or “Buffon’s noodle.” The radius one uses can be as small as
$|\log\varepsilon|^{\varepsilon_0}$, for any $\varepsilon_0 > 0$, with the constant $C$
depending on $\varepsilon_0$ [6]. It is unknown whether this condition or anything like it is
necessary.
Work continues on generalizing the upper bound results.

For Rachel and Erica, my favorite couple ever. They don’t have to read this document. They
do have to visit me in Vancouver sometime, though.


ACKNOWLEDGMENT

I have received a lot of help from a lot of sources, and it would be a shame to cut this
section short.
As early as my ﬁrst year at MSU - this was 2005 - fellow graduate student Mike Dabkowski
already knew that I was an analyst and never believed anything else. Around this time, there
was not a large or active group of graduate students interested in analysis at MSU - there
was only Alberto Condori, a few years ahead of us all and eager to share what he knew.
Though Nick Boros and I were a few years behind and probably learned much more from
Alberto than vice versa, I hope he got something out of it, even if nothing more than a few
chances to practice some talks.
I do not know all of the details, but undoubtedly Alberto was instrumental in helping
Nick and me find our advisor, Alexander Volberg. While it is quite common for graduate
students to have to do a bit of begging, searching, and proving themselves before getting an
advisor, we were lucky enough to have Dr. Volberg show up to substitute for our analysis
class one day and start a habit of meeting with each of us once a week to describe interesting
problems to us, give us papers to read, etc. We weren’t even done with our qualifying exams
yet, but already we had a distinguished and very helpful advisor. Alberto must have been
conﬁdent enough in our skills from having graded our homework in our ﬁrst year.
Dr. Volberg is also a vigorous proponent of his students, always looking for colloquia,
workshops, etc. for his students to participate in, sharing with the organizers his - I trust
well-deserved - high opinion of them. I might not have met half as many mathematicians as
I have if it hadn’t been for his level of advocacy.

The writers of my letters of recommendation for employment are my advisor, Yang
Wang, Michael Lacey, and Ignacio Uriarte-Tuero, and the sponsoring scientist at University
of British Columbia, where I begin my postdoctoral appointment next year, is Izabella
Laba. I have not read these letters of course, but they must have been very good letters.
Though my papers summarized in this thesis are co-authored with only my advisor, having
an appointment lined up - and quite importantly, at such a place and to work with such a
person - is a relief and a motivation, making the circumstances around the writing of this
thesis much brighter than they may have been otherwise.
I’d like to thank Nikos Pattakos and Alexander Reznikov for coming here and giving Nick
and me more peers to talk to about analysis these last couple of years. Though they are a
couple of years behind us as naively measured by time spent in graduate school, they are very
quick to pick up analysis, came knowing a lot already, and have contributed more than their
fair share to the student analysis seminar, which now runs again at MSU. Thanks also to
Clark Mussellman for recently making what Mike Dabkowski would undoubtedly call “the
right choice” - that is, deciding to become an analyst. He has also contributed a couple of
nice talks to our seminar.
Thanks also to Ignacio Uriarte-Tuero for co-organizing the student analysis seminar with
me the last two years. Though he is not anyone’s advisor, he has been consistently helpful
to all of us analysis graduate students here at MSU in the few years he’s been around. I am
sure that he’ll be well-suited to advising graduate students when the time comes.


TABLE OF CONTENTS

List of Figures

1 Definitions, notations, results, and background
    1.1 Buffon needle probability and Favard length
    1.2 Homogeneous Cantor-like sets
    1.3 Results for Buffon’s needle problem
    1.4 Counting function
    1.5 Heuristics and napkin sketches

2 The lower bound in Buffon’s noodle problem - circular noodle case

3 The lower bound in Buffon’s noodle problem - general noodles
    3.1 General Buffon noodle probabilities and some preliminary reductions
    3.2 Some useful facts about shear groups
    3.3 Proof of the νP Lemma for general noodles

4 The upper bound in Buffon’s needle problem - Sierpinski’s gasket
    4.1 Reductions and main Fourier-analytic argument
    4.2 Controlling SSV(t)

5 The upper bound in Buffon’s needle problem - general case
    5.1 The Fourier-analytic part
        5.1.1 The setup
        5.1.2 Initial reductions
        5.1.3 The proof of Proposition 24
    5.2 Two combinatorial lemmas
        5.2.1 $|A^*_{K,N,\theta}|$ vs. $|L_{N K^{\beta},\theta}|$
        5.2.2 $\sup_{n\le N}\|f_{n,\theta}\|_2^2$ vs. $|A^*_{K,N,\theta}|$
    5.3 Controlling SSV(t)
        5.3.1 A Blaschke estimate
    5.4 A localized upper bound on ||P1||2
    5.5 Discussion

6 Epilogue

Bibliography

LIST OF FIGURES

1.1 One of Count Buffon’s beasts. For interpretation of the references to color in this and
    all other figures, the reader is referred to the electronic version of this thesis.
1.2 G1 and G2, stages 1 and 2 of the construction of Sierpinski’s gasket.
1.3 K3, stage 3 of the construction of the square Cantor set.
1.4 Several translations of the triangles cover the discs, so the lengths of the orthogonal
    projections are comparable.
1.5 Discs turn green when the stack is tall for the first time; averages of $f_{n,\theta}$ on
    the illustrated interval will remain bounded below as n increases.
2.1 An illustration of the action of σθ.
2.2 The projection to the x-axis is the entire interval; the same interval is covered by
    projθ(Kn) for all n by self-similarity.
2.3 K2 in the adjusted coordinate system.
2.4 Only where the annuli intersect will we find centers of circles of radius r which
    intersect both Cantor squares. Approximation by a rectangle is sufficient to give the
    desired estimate.
5.1 It is quite difficult for a large number of factors of $P_{1,t}(x)$ to be close to 1
    simultaneously. In particular, $L^k x$ and $L^k t x$ must be close to Z for many values
    of k.
6.1 Minotaur.

Chapter 1
Definitions, notations, results, and background

1.1 Buffon needle probability and Favard length

All sections of this thesis will have a great deal in common. Throughout, we will consider the
Buffon needle probability, or Favard length, of a measurable(1) set E ⊆ C. This quantity
is defined as

$$\mathrm{Fav}(E) := \frac{1}{\pi}\int_0^{\pi} |\mathrm{proj}_\theta(E)|\,d\theta, \tag{1.1}$$

where $\mathrm{proj}_\theta$ denotes orthogonal projection onto the line forming the angle θ
with the positive real axis, and |F| denotes the Lebesgue measure of F regarded as a subset
of R. Pointwise, one defines $\mathrm{proj}_\theta(re^{i\theta'}) := r\cos(\theta' - \theta)$.
The reason this is sometimes also called Buffon needle probability: after a normalization
constant, it is the probability that “Buffon’s needle” will intersect E when thrown, where
“Buffon’s needle” is a straight line which lands with independent, uniformly distributed
location and angle with the positive real axis(2). Favard length is known to be related
to analytic capacity, a measure of how well E can “hide singularities of bounded analytic
functions” - see [19], [25]. The sets we study in this thesis are of interest both for the
analytic capacity problem and for Buffon’s needle problem.

(1) In fact, we will only consider compact sets.

Figure 1.1: One of Count Buffon’s beasts. For interpretation of the references to color in
this and all other figures, the reader is referred to the electronic version of this thesis.
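As a rough numerical illustration of (1.1), the sketch below approximates the Favard length of a finite union of discs. It is only a sketch under stated assumptions: the function names (`union_length`, `proj`, `favard_length_of_discs`) and the midpoint rule in θ are illustrative choices and do not appear in the thesis.

```python
import math

def union_length(intervals):
    """Lebesgue measure of a union of intervals given as (a, b) pairs."""
    total, cur_a, cur_b = 0.0, None, None
    for a, b in sorted(intervals):
        if cur_b is None or a > cur_b:
            # Close the current merged interval and start a new one.
            if cur_b is not None:
                total += cur_b - cur_a
            cur_a, cur_b = a, b
        else:
            cur_b = max(cur_b, b)
    if cur_b is not None:
        total += cur_b - cur_a
    return total

def proj(z, theta):
    """Orthogonal projection onto the line at angle theta: Re(z * e^{-i theta})."""
    return (z * complex(math.cos(theta), -math.sin(theta))).real

def favard_length_of_discs(centers, radius, num_angles=2000):
    """Midpoint-rule approximation of (1/pi) * int_0^pi |proj_theta(E)| dtheta,
    where E is the union of the discs with the given centers and common radius."""
    total = 0.0
    for i in range(num_angles):
        theta = math.pi * (i + 0.5) / num_angles
        shadows = [(proj(c, theta) - radius, proj(c, theta) + radius) for c in centers]
        total += union_length(shadows)
    # (1/pi) * (pi / num_angles) * sum  ==  sum / num_angles
    return total / num_angles
```

With this normalization a single disc has Favard length equal to its diameter, and two distant discs have Favard length slightly less than the sum of their diameters, because their shadows overlap for a small range of directions.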

1.2 Homogeneous Cantor-like sets

As we consider Buﬀon’s needle problem here, the sets which will play the role of E can
either be thought of as “partially constructed” self-similar sets, or small neighborhoods of a
self-similar set; they will be equivalent for our purposes, and we will freely conﬂate the two
without harm. In particular, the self-similar sets we study will be unrectiﬁable self-similar
sets of Hausdorﬀ dimension one.
Definition 1. For $s \ge 0$, we say that $H^s(E) < \infty$ if there is a constant $M$ such
that for all $\varepsilon > 0$, $E$ can be covered by countably many balls $B_k$ of radii
$r_k$ smaller than $\varepsilon$ such that $\sum_k r_k^s \le M$. The Hausdorff measure
$H^s(E)$ is defined to be the infimum over all such possible values of $M$. The Hausdorff
dimension, dim(E), is given by
$\dim(E) := \inf\{s : H^s(E) = 0\} = \sup\{s : H^s(E) = \infty\}$. When
$0 < H^1(E) < \infty$, a set such that Fav(E) = 0 is called purely 1-unrectifiable, referred
to in this thesis simply as unrectifiable or 1-unrectifiable.
The opposite of an unrectifiable set is a rectifiable set (properly speaking, an
m-rectifiable set), such as an m-dimensional smooth manifold in $\mathbb{R}^n$, where
$s = m \in \mathbb{N}$. $H^m$ agrees with the usual notions of length, area, volume, etc. for
m = 1, 2, 3, etc. when E is a smooth m-manifold. Therefore $H^s$ generalizes such notions, as
it is well known to be a Borel measure on $\mathbb{R}^n$, and s is allowed to be non-integer.
For $m \in \mathbb{N}$, an m-rectifiable set is any countable union of Lipschitz images of
$\mathbb{R}^m$ and $H^m$ null sets. For equivalent definitions of rectifiability, see [16];
this thesis is concerned with unrectifiable sets of Hausdorff dimension 1.

(2) In the 18th century, “Buffon’s needle” was a short, physical needle which was thrown
repeatedly at a grid of uniformly spaced lines. By counting the proportion of the time
the needle crossed a line, approximate values of π were found from a Monte Carlo type of
formula.

Figure 1.2: G1 and G2, stages 1 and 2 of the construction of Sierpinski’s gasket.

Figure 1.3: K3, stage 3 of the construction of the square Cantor set.
Because of work done by Besicovitch, it is known that 1-unrectiﬁable sets E ⊂ C are those
such that at least two orthogonal projections have zero Lebesgue measure, or equivalently,
every Lipschitz curve meets E in a set of zero H 1 -measure [16]. In fact, dim(E) = 1 is the
critical case for Buﬀon’s needle problem: if dim(E) > 1, F av(E) > 0, and if dim(E) < 1,
F av(E) = 0; hence the role played by H 1 (E) in the deﬁnition of rectiﬁability. In general, if
H m (E) < ∞, E decomposes into m-rectiﬁable and m-unrectiﬁable parts.
A standard example of a 1-unrectifiable set is K, the four-corner Cantor set. It is the
unique compact invariant set(3) of the function system $S_k(z) = \frac{1}{4}z + c_k$, where
c1 = (0, 0), c2 = (3/4, 0), c3 = (0, 3/4), c4 = (3/4, 3/4). Note also that

$$K = \bigcap_n K_n, \quad\text{where } K_0 = [0,1]\times[0,1] \text{ and } K_{n+1} = \bigcup_{k=1}^{4} S_k(K_n).$$

We can do this with other function systems, too. Consider also Sierpinski’s gasket, G, the
unique compact invariant set of the function system
$S_k(z) = \frac{1}{3}z + e^{2\pi i(\frac{1}{2} + \frac{k}{3})}$, k = −1, 0, 1.
The most general case we will consider here: $S_k(z) = \frac{1}{L}z + z_k$, k = 1, ..., L. In
such a case, the unique compact invariant set is called J; this case contains both cases
above. If the centers $z_k$ are not all collinear, then J is unrectifiable. By the word
homogeneous, it is meant that instead of $S_k(z) = \frac{1}{L}z + z_k$, one could have had
$S_k(z) = r_k z + z_k$ such that $\sum_{k=1}^{L} r_k = 1$, but in this thesis, we limit
ourselves to the homogeneous case $r_k = \frac{1}{L}$.

(3) E is an invariant set if $E = \bigcup_k S_k(E)$; such a compact set exists and is unique [16].
k=1 k
Above Kn was deﬁned as the union of all possible images of the convex hull of K under
n-fold compositions of the similarity maps Sk ; deﬁne Gn and Jn analogously, or see the
following formal deﬁnition:
Definition 2. Let k = 1, 2, ..., L. Let $\Sigma_n = \{1, 2, ..., L\}^n$. For any
$v = (v_1, v_2, ..., v_n) \in \Sigma_n$ and any $k \in \{1, 2, ..., L\}$, let
$(v, k) := (v_1, v_2, ..., v_n, k) \in \Sigma_{n+1}$. Let $S_{(k)} := S_k$, and for
$v \in \Sigma_n$, let $S_{(v,k)} : \mathbb{C} \to \mathbb{C}$ be given by
$S_{(v,k)} := S_k \circ S_v$. Let $J_0$ be the convex hull of J. Let $J_v := S_v(J_0)$, and
let $J_n := \bigcup_{v\in\Sigma_n} J_v$.

For example, K0 = [0, 1] × [0, 1], and earlier we saw a picture of K3 . G0 is a certain
closed triangle which is “ﬁlled in” rather than “empty.”
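To make Definition 2 concrete, here is a minimal sketch that enumerates the images $S_v(z_0)$, $v \in \Sigma_n$, for a homogeneous system $S_k(z) = z/L + z_k$, and uses it to list the squares of $K_n$ by their lower-left corners. The names `iterate_ifs`, `FOUR_CORNER` and `four_corner_squares` are hypothetical helpers introduced only for this illustration.

```python
def iterate_ifs(L, translations, n, z0=0j):
    """All points S_v(z0), v in Sigma_n, for the maps S_k(z) = z / L + translations[k].

    Each pass applies one more map on the outside, matching S_(v,k) = S_k o S_v.
    """
    points = [z0]
    for _ in range(n):
        points = [p / L + t for p in points for t in translations]
    return points

# Translations c_1, ..., c_4 of the four-corner Cantor set, written as complex numbers.
FOUR_CORNER = [0j, 0.75 + 0j, 0.75j, 0.75 + 0.75j]

def four_corner_squares(n):
    """Squares of K_n as (lower-left corner, side length) pairs."""
    side = 4.0 ** (-n)
    return [(corner, side) for corner in iterate_ifs(4, FOUR_CORNER, n)]
```

Since $K_0 = [0,1]\times[0,1]$ has lower-left corner 0, the point $S_v(0)$ is the lower-left corner of the square $K_v$, and there are $4^n$ such squares of side $4^{-n}$.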
Remark 1. For our intents and purposes, we more or less identify $J_n$ with an appropriate
neighborhood of J. Define $B_\varepsilon(E) := \{z : \mathrm{dist}(z, E) < \varepsilon\}$.
Temporarily define $\tilde{J}_n$ to be $B_{L^{-n}}(J)$. Then
$c\cdot\mathrm{Fav}(\tilde{J}_n) \le \mathrm{Fav}(J_n) \le C\cdot\mathrm{Fav}(\tilde{J}_n)$.
The reason for this is simply the fact that either of $J_n$ and $\tilde{J}_n$ can be covered
by several translations of the other(4). As such, we will no longer bother to distinguish
between the two. This is also the reason why Buffon’s needle problem for sets like $J_n$ is
often phrased for simplicity, “How likely is Buffon’s needle to land near J?” [20] rather
than “How likely is Buffon’s needle to intersect $J_n$?” (See Figure 1.4)
Remark 2. K, G, and J were chosen for the notation as follows: K is for “Cantor”(5);
G is for “Gasket”; since G is taken, J is for “General” - the phonics of the situation
don’t make it possible to reasonably misspell “gasket” as “jasket” or with any other first
letter. So it goes.

(4) The number of translates, and thus the constants c and C, depend on the eccentricity of
the convex hull of $\{z_k\}_{k=1}^{L}$.
(5) In [18], C has been used for the usual Cantor subset of [0, 1].

Figure 1.4: Several translations of the triangles cover the discs, so the lengths of the
orthogonal projections are comparable. [Panels: G1 approximated by discs centered at
$z_{-1}, z_0, z_1$; G2 approximated by discs centered at $z_{i,j}$, $i, j \in \{-1, 0, 1\}$.]

1.3 Results for Buffon’s needle problem

Let $A_n \lesssim B_n$ mean that there exists a constant C such that $A_n \le CB_n$, where C
must not depend on n; $A_n \gtrsim B_n$ means $B_n \lesssim A_n$.
Some known results:
Theorem 1. [3], 2008
$\mathrm{Fav}(K_n) \gtrsim \frac{\log n}{n}$. The same proof also shows that
$\mathrm{Fav}(G_n) \gtrsim \frac{\log n}{n}$.

Theorem 2. [18], 2008
$\mathrm{Fav}(K_n) \lesssim \frac{1}{n^p}$ for any fixed $0 < p < \frac{1}{6}$, where the
implied constant may depend on p.

Theorem 3. [13], 2010
Let J be as above. Additionally, suppose J is a product set, and let the coordinates of
the $z_k$ be rational. Suppose also that there exists a direction $\theta_0$ such that
$|\mathrm{proj}_{\theta_0}(J)| > 0$. Then $\mathrm{Fav}(J_n) \lesssim \frac{1}{n^p}$, where p
depends only on $\theta_0$.

This thesis contains a generalization of [18] and [13] to $G_n$. In addition, a weaker estimate
is proved for $J_n$. Current joint work with Volberg and Laba continues toward proving
the power estimate for $J_n$ (without one or more of the additional conditions in [13]). Volberg
and I have published work in this direction in [7], where the strong result was proved entirely
for $G_n$.

Theorem 14:
For some c > 0, $\mathrm{Fav}(G_n) \lesssim \frac{1}{n^c}$.

Theorem 13:
For some $\varepsilon_0 > 0$ depending only on the $z_k$ defining $J_0$,
$\mathrm{Fav}(J_n) \lesssim e^{-\varepsilon_0\sqrt{\log n}}$.

This thesis also contains a generalization of [3], not for more general sets, but for a
generalized notion of Favard length, called “Buffon noodle probability” or “circular Favard
length”. The results for this problem, which we will prove in Chapters 2 and 3, are also
found in [6]. We refrain from stating the results here since they employ specialized notation
for describing the “noodle”.

1.4 Counting function

All results concerning Buffon’s needle problem for $J_n$ employ the functions
$f_{n,\theta} : \mathbb{R} \to \mathbb{R}$, called either the “counting function” or the
“projection multiplicity function”, where θ ∈ [0, π] and n ∈ N. Recall Definition 2 and
$B_\varepsilon$ from Remark 1.

$$f_{n,\theta} := \sum_{v\in\Sigma_n} \chi_{\mathrm{proj}_\theta(J_v)} \tag{1.2}$$

In light of Remark 1, there exists an alternate form of $f_n$ that is equivalent for our
purposes. Recall: $S_k(z) = \frac{1}{L}z + z_k$. Then one simply redefines
$J_v := B_{L^{-n}}\big(\sum_{k=1}^{n} L^{-k} z_{v_k}\big)$; i.e., for simplicity, one rescales
a little and then approximates by discs. (See Figure 1.4)
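The counting function (1.2) can be evaluated exactly for small n by a sweep over interval endpoints. The sketch below does this for the four-corner set $K_n$, using the squares $K_v$ rather than the disc approximation; under that choice $\|f_{n,\theta}\|_1 = |\cos\theta| + |\sin\theta|$, which is independent of n but not of θ. The function names are illustrative assumptions, not notation from the thesis.

```python
import math

FOUR_CORNER = (0j, 0.75 + 0j, 0.75j, 0.75 + 0.75j)

def square_corners(n):
    """Lower-left corners S_v(0) of the 4**n squares of K_n."""
    pts = [0j]
    for _ in range(n):
        pts = [p / 4 + t for p in pts for t in FOUR_CORNER]
    return pts

def counting_function_stats(n, theta):
    """Return (L1 norm, squared L2 norm, |support|, sup) of f_{n,theta} for K_n."""
    side = 4.0 ** (-n)
    rot = complex(math.cos(theta), -math.sin(theta))  # proj_theta(z) = Re(z e^{-i theta})
    # Projections of the four corners of a square with lower-left corner 0.
    offsets = [(d * rot).real for d in (0j, side + 0j, side * 1j, side + side * 1j)]
    lo, hi = min(offsets), max(offsets)
    events = []
    for c in square_corners(n):
        p = (c * rot).real
        events.append((p + lo, 1))    # a projected square starts here
        events.append((p + hi, -1))   # a projected square ends here
    events.sort()                     # at ties, ends are processed before starts
    l1 = l2_sq = support = 0.0
    height, sup, prev = 0, 0, events[0][0]
    for x, step in events:
        width = x - prev
        if width > 0 and height > 0:
            # f_{n,theta} equals `height` on the interval (prev, x).
            l1 += height * width
            l2_sq += height ** 2 * width
            support += width
            sup = max(sup, height)
        height += step
        prev = x
    return l1, l2_sq, support, sup
```

For example, `counting_function_stats(3, 0.0)` reports stacks of height $2^3$ over a support of total length $2^{-3}$, while at θ = arctan(1/2) the support has the same length as the L1 norm, up to rounding, because the projected squares tile an interval.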
We will not make use of the following fact, but it is interesting to notice the role of a
function like $f_{n,\theta}$ in defining the integralgeometric measure of a set. Define
$g_{n,\theta} = \sum_{p\in E} \chi_{\mathrm{proj}_\theta(p)}$; i.e., $g_{n,\theta}$ counts the
number of points in E “Buffon’s needle” intersects if the “needle” goes through
$xe^{i\theta}$ and is perpendicular to the line $re^{i\theta}$, r ∈ R. Then the
integralgeometric measure $I^1_1(E)$ is given by
$I^1_1(E) = \int_0^{\pi}\int_{\mathbb{R}} g_{n,\theta}(x)\,dx\,d\theta$. With proper
normalization constant, $I^1_1$ gives the length of any smooth curve E ⊂ C, just like
$H^1$. However, $H^1$ is positive on J (if the maps $S_k$ satisfy the open set condition,
[16]), whereas $I^1_1$ vanishes. Generalizations of $H^s$ and $I^1_1$ appear in [16] and
others.

1.5 Heuristics and napkin sketches

In Chapters 2 and 3, we will prove a lower bound in Buﬀon’s noodle problem, for circular
arcs and more general “noodles,” respectively. In chapters 4 and 5, we will prove upper
bounds in Buﬀon’s needle problem. The lower bounds have a much simpler proof, remaining
relatively painless even in the presence of an additional complication, the “bend” in the
needle. The small bend in the needle is an unwelcome distraction for now, so forget it as
we brieﬂy discuss heuristics; in Chapters 2 and 3, we’ll bend the needle as much as possible
without damaging our argument.
Note that $\|f_{n,\theta}\|_1 = C$ for all n and θ; we can rescale and say C = 1. As n
increases, however, $\|f_{n,\theta}\|_p$ is, in fact, unbounded for almost all θ when p > 1.
This growth occurs because the $L^1$ mass concentrates on smaller sets as n increases; the
effect is quite dramatic for the case θ = 0 and $J_n = K_n$, where the squares stack up
perfectly, the number of squares in each stack being $2^n$. However, $K_n$ also has
tan(θ) = 1/2 as a clear counterexample (see Figure 2.2), and for $G_n$, the (perhaps
surprising) truth is that the exceptional θ such that $|\mathrm{proj}_\theta(G)| > 0$ form a
dense subset of [0, π] [12], see also [14] (the “obvious” examples for $G_n$ are
θ = 0, 2π/3, 4π/3). As such, the quantitative Buffon’s needle problem is inherently a
bit finicky.
Micro-theorem: If $|\mathrm{proj}_\theta(J)| = 0$, then $\|f_{n,\theta}\|_p \to \infty$ as
n → ∞.
Proof: Since the $J_n$ are compact and nested,
$|\mathrm{proj}_\theta(J_n)| \to |\mathrm{proj}_\theta(J)| = 0$. The result follows from
Hölder’s inequality (with $\frac{1}{p} + \frac{1}{q} = 1$):

$$1 = \|f_{n,\theta}\|_1 \le \|\chi_{\mathrm{supp}(f_{n,\theta})}\|_q\,\|f_{n,\theta}\|_p = |\mathrm{proj}_\theta(J_n)|^{1/q}\,\|f_{n,\theta}\|_p. \tag{1.3}$$

If $J_n$ had no self-similar structure, it would not be possible to state much in the way of
a converse to the micro-theorem(6). However, there is a converse.
Micro-theorem converse: If $\|f_{N_0,\theta}\|_\infty > K$ for some $N_0$, then
$|\mathrm{proj}_\theta(J_n)| < \frac{C}{K}$ for some n large enough.
Sketch of proof: (See Figure 1.5) Fix θ. $J_{N_0}$ has a stack of K discs above θ. So say
that these K out of $L^{N_0}$ discs are green, and label also their descendants green.
Consider $J_{j\cdot N_0}$. Each disc of $J_{jN_0}$ is replaced by a rescaled copy of
$J_{N_0}$ when forming the set $J_{(j+1)N_0}$. In particular, each white disc gives birth to
a stack of K discs we label green, and $L^{N_0} - K$ white discs, and green discs give birth
to only green discs. In this way, the total proportion of white discs is
$\left(1 - \frac{K}{L^{N_0}}\right)^j$, and the measure of their unified projection does not
exceed this proportion. In particular, the union of projected white discs has measure that
approaches zero as j → ∞.
On the other hand, the green discs do not unify to any more than C/K in the projection
at any stage n, either. This ultimately follows from the Hardy-Littlewood theorem: If we
sit at x, directly below a green disc at some stage $J_{N_0}$, then find the smallest j such
that in generation $jN_0$, this ancestor has turned green for the first time; taking an
interval of width $2\cdot L^{-jN_0}$ centered at x, we obtain an average value of
$f_{jN_0,\theta}$ of size at least K/2. As this interval contains all projections of all
children, and the union of the children equals the parent in $L^1$ mass, this estimate on
the average remains valid; that is, all green discs live above places where
$Mf_{n,\theta} \ge K/2$. (M is the usual Hardy-Littlewood maximal operator.) Thus

$$|\text{union of projections of green discs}| \le |\{x : Mf_{n,\theta}(x) \ge K/2\}| \le \frac{C}{K}\|f_{n,\theta}\|_1 \le \frac{C}{K}.$$

(6) Suppose no self-similar structure were available, and suppose still that
$\|f_{n,\theta}\|_1 = 1$. One can use Chebyshev’s inequality to split into two level sets and
show that $1\cdot|\mathrm{supp}(f_{n,\theta})| + (K-1)|\{x : f_{n,\theta} \ge K\}| \le 1$. The
resulting bound on $|\mathrm{supp}(f_{n,\theta})|$ is not that strong if the height varies a
lot. One could expand this to include many more level sets and try again, but then the
problem seems more difficult than the one we started with.

Figure 1.5: Discs turn green when the stack is tall for the first time; averages of
$f_{n,\theta}$ on the illustrated interval will remain bounded below as n increases.
[Labels: Buffon’s needle through x; the appropriate Hardy-Littlewood interval.]

Note that if ||fn,θ ||∞ → ∞, the above theorem implies that the measure of projθ (Jn ) →
0 as n → ∞ by monotonicity. [18] uses a sharpened form of this micro-theorem converse.
The L∞ condition is replaced with an L2 condition, so that one ﬁnds many stacks of size K
at various diﬀerent generations, rather than just one stack in a single generation. By doing
this, one can start out with a much larger proportion of green discs, leading to a much more
rapid exponential rate of conversion of discs from white to green. Go green, indeed.

In all cases, we will use p = 2. Note that the micro-theorem (lower bound) was easier; as
such, the bound it proves is easier to obtain. One needs only set the problem up with the
aid of just one additional insight, and the rest is counting. The insight is simply partitioning
the Favard length integral into well-chosen θ-intervals $I_1, I_2, ..., I_{\log n}$ and
integrating the inequality $|\mathrm{proj}_\theta(J_n)| \ge \|f_{n,\theta}\|_2^{-2}$ in θ (this
comes from (1.3)); a single integral with no θ partitioning leads exactly to the inferior
estimate $\mathrm{Fav}(K_n) \gtrsim \frac{1}{n}$.
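For reference, the inequality being integrated is the case p = q = 2 of (1.3). A short worked version, assuming the normalization $\|f_{n,\theta}\|_1 = 1$ used above and assuming the intervals $I_j$ are disjoint subintervals of [0, π]:

```latex
\begin{align*}
1 = \|f_{n,\theta}\|_1
  &\le |\mathrm{proj}_\theta(J_n)|^{1/2}\,\|f_{n,\theta}\|_2
  && \text{(this is (1.3) with } p = q = 2\text{)} \\
\Longrightarrow\quad |\mathrm{proj}_\theta(J_n)|
  &\ge \|f_{n,\theta}\|_2^{-2}, \\
\Longrightarrow\quad \mathrm{Fav}(J_n)
  = \frac{1}{\pi}\int_0^{\pi} |\mathrm{proj}_\theta(J_n)|\,d\theta
  &\ge \frac{1}{\pi}\sum_{j}\int_{I_j} \|f_{n,\theta}\|_2^{-2}\,d\theta .
\end{align*}
```

The lower bound therefore reduces to controlling $\|f_{n,\theta}\|_2^2$ from above on each $I_j$, which is the counting step mentioned in the text.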

The upper bound is more finicky, relying on some somewhat delicate Fourier analysis.
The Fourier transform $\hat{f}_{n,\theta}$ is (away from ∞ equivalent to) a self-similar
exponential polynomial, and ultimately, it is the bad behavior of its zeroes when L > 4 that
delays us from proving more general results for now.

Chapter 2
The lower bound in Buffon’s noodle problem - circular noodle case
In this chapter, we will state and prove Theorems 4 and 5.
In [6], a related circular Favard length, or Buffon noodle probability, was studied.
To get circular Favard length $\mathrm{Fav}_\sigma$ instead of usual Favard length Fav,
orthogonal projection along the line is replaced by projection along a circular arc tangent
to the line. Specifically, define the noodles

$$F_r(y) := r - \sqrt{r^2 - y^2} \tag{2.1}$$

Also define $\sigma_0(x, y) := (x - F_r(y), y)$, and
$\sigma_\theta := R_{-\theta} \circ \sigma_0 \circ R_\theta$, where $R_\theta$ is clockwise
rotation by the angle θ.(1) (See also Figure 2.1. $\sigma_\theta$ depends on r, but r will be
stated in each context and always refers to this implicit parameter wherever it appears.)
1 Note that if we replace σ with the identity map, we are in the setting of [3]. We will
often appeal to the σ = Id case for intuition, while noting that the content of [6] is that the
arguments of [3] carry over into [6] when c n ≤ r < ∞ with the only diﬀerence being a
change in the universal constants.

By deﬁnition, any g : R → R is a noodle, but we will use this language only for functions
playing a role like that played by Fr in the deﬁnition of σθ .
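The maps just defined are easy to experiment with numerically. The following is a minimal sketch of $F_r$, $\sigma_0$ and $\sigma_\theta$ with points of the plane written as complex numbers; the function names `noodle`, `sigma0` and `sigma_theta` are assumptions made for this illustration, and clockwise rotation by θ is taken to be multiplication by $e^{-i\theta}$.

```python
import cmath
import math

def noodle(r, y):
    """F_r(y) = r - sqrt(r^2 - y^2); only defined for |y| <= r."""
    return r - math.sqrt(r * r - y * y)

def sigma0(z, r):
    """sigma_0(x, y) = (x - F_r(y), y), with z = x + i y."""
    return complex(z.real - noodle(r, z.imag), z.imag)

def sigma_theta(z, theta, r):
    """sigma_theta = R_{-theta} o sigma_0 o R_theta, R_theta = clockwise rotation."""
    rotated = z * cmath.exp(-1j * theta)            # apply R_theta
    return sigma0(rotated, r) * cmath.exp(1j * theta)  # apply R_{-theta}
```

Two sanity checks follow from the definitions: `sigma0` sends the arc $x = \rho + F_r(y)$ of the circle of radius r centered at $(\rho + r, 0)$ to the vertical line $x = \rho$, and for very large r the map `sigma_theta` is close to the identity on bounded sets.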

Figure 2.1: An illustration of the action of σθ. [Labels: axes X and Y; points z, z′,
σθ(z), σθ(z′), $\rho e^{i\theta}$, $(\rho + r)e^{i\theta}$.]
Finally, let

$$\mathrm{Fav}_\sigma(K_n) := \frac{1}{\pi}\int_0^{\pi} |\mathrm{proj}_\theta(\sigma_\theta(K_n))|\,d\theta.$$

Remark 3. Note that $\mathrm{Fav}_\sigma(K_n)$ is the dρ dθ measure of the set of centers of
circles of radius r that intersect $K_n$, where such centers are parameterized by
$z = (\rho + r)e^{i\theta}$. In addition to considering the dρ dθ measure of this set, we may
also naturally be interested in the (r + ρ) dρ dθ measure of this set - that is, its area.
Indeed, since r is much larger than the diameter of $K_n$, ρ + r ≈ r. This is the key
convenience that makes our estimate for the circular noodle much easier and sharper by the
arguments given here.
Specifically, if A ⊆ {z ∈ C : |z| ∈ (cr, Cr)} is measurable and |A| denotes its area, then

$$|A| \approx r\int_0^{2\pi}\int_{\mathbb{R}} \chi_{\{\rho\,:\,(\rho + r)e^{i\theta}\in A\}}(\rho)\,d\rho\,d\theta.$$

If we let $A = \{z : z + re^{i\theta} \in K_n \text{ for some } \theta \in \mathbb{R}\}$, then
this says, “The area of all points distance r away from $K_n$ ≈ r · the noodle probability
of $K_n$.” Our main application, however, will be to a setting in which A is a set of circle
centers like in Figure 2.1 - that is, the circle centered at z ∈ A intersects two or more
squares of $K_n$.
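We will modify $f_{n,\theta}$ according to this problem. For any Cantor square
$Q \subset K_n$, let $\chi_{Q,\theta} := \chi_{\mathrm{proj}_\theta(\sigma_\theta(Q))}$, and let

$$f_{n,\theta,\sigma} := \sum_{\text{Cantor squares } Q\subset K_n} \chi_{Q,\theta}.$$

Then $\mathrm{proj}_\theta(\sigma_\theta(K_n)) = \mathrm{supp}(f_{n,\theta,\sigma})$, which we
will also call $E_{n,\theta,\sigma}$. Note that

$$\int_I |E_{n,\theta,\sigma}|\,d\theta \ge \frac{\left(\int_I\int_{\mathbb{R}} f_{n,\theta,\sigma}\,dx\,d\theta\right)^2}{\int_I\int_{\mathbb{R}} f_{n,\theta,\sigma}^2\,dx\,d\theta}. \tag{2.2}$$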
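One way to check (2.2) is the Cauchy-Schwarz inequality on the product space $I \times \mathbb{R}$, using only that $f_{n,\theta,\sigma}$ is supported on $E_{n,\theta,\sigma}$ for each θ. A sketch of that computation:

```latex
\begin{align*}
\int_I\!\int_{\mathbb{R}} f_{n,\theta,\sigma}\,dx\,d\theta
 &= \int_I\!\int_{\mathbb{R}} f_{n,\theta,\sigma}\,\chi_{E_{n,\theta,\sigma}}\,dx\,d\theta \\
 &\le \Big(\int_I\!\int_{\mathbb{R}} \chi_{E_{n,\theta,\sigma}}\,dx\,d\theta\Big)^{1/2}
      \Big(\int_I\!\int_{\mathbb{R}} f_{n,\theta,\sigma}^2\,dx\,d\theta\Big)^{1/2} \\
 &= \Big(\int_I |E_{n,\theta,\sigma}|\,d\theta\Big)^{1/2}
      \Big(\int_I\!\int_{\mathbb{R}} f_{n,\theta,\sigma}^2\,dx\,d\theta\Big)^{1/2}.
\end{align*}
```

Squaring both sides and dividing by the double integral of $f_{n,\theta,\sigma}^2$ gives the stated inequality.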

(This is (1.3) with a bend in the needle.) The idea is to pick ≈ log n many disjoint
intervals $I_j$ such that each such estimate gives

$$\int_{I_j} |E_{n,\theta,\sigma}|\,d\theta \ge \frac{C}{n}. \tag{2.3}$$

Summing over j = 1, 2, ..., C log n, the result will be
Theorem 4. For each c > 0, there exists C > 0 such that whenever r ≥ cn,
$\mathrm{Fav}_\sigma(K_n) \ge C\frac{\log n}{n}$. Further, we may interpret
$\mathrm{Fav}(K_n)$ to be $\mathrm{Fav}_\sigma(K_n)$ in the case r = ∞.
If r << n , then we can still say something. We will prove the above theorem, but
the following generalized theorem is proved by carefully examining for which values of j the
estimate (2.5) holds in this general case. The lower bound on r is enough to ensure that
Lemma 2.6 holds.
Theorem 5. For all n ∈ N and for all r

n,

F avσ (Kn )

log(r)
n

whenever 10 ≤ r ≤ n.
Good intervals $I_j$ can be found near θ = arctan(1/2), because in this direction, $K_n$
orthogonally projects onto a single connected interval, and the projected squares intersect
only at their endpoints. These almost-disjoint projected intervals induce a 4-adic structure
on the interval. Let us rotate the axes and redefine the old arctan(1/2) direction to be our
new θ = 0 direction. (See Figures 2.2 and 2.3.)

Figure 2.2: The projection to the x-axis is the entire interval; the same interval is covered
by projθ(Kn) for all n by self-similarity.

Definition 3. Let $I_j := [\arctan(4^{-j-1}), \arctan(4^{-j})]$, 3 < j < C log n.

Figure 2.3: K2 in the adjusted coordinate system.
Then $I_{C\log n}$ will be the closest direction to 0, and it’s reasonable to think that on
average, each time j decreases by 1, $I_j$ will grow by the factor 4, and for $\theta \in I_j$,
$|E_{n,\theta,\sigma}|$ will decay no more than by a factor of 1/4, resulting in the
persistence of (2.3). For individual θ, this reasoning is completely invalid, but in the
“average” sense as formulated by the integral dθ in (2.3), the reasoning is sound. (2.3) is,
indeed, a theorem, which we will now prove:
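Proposition 6. For 3 < j < C log n,
$\int_{I_j} |E_{n,\theta,\sigma}|\,d\theta \gtrsim \frac{1}{n}$.
Recall (2.2). Trivially,
$\left[\int_{I_j}\int_{\mathbb{R}} f_{n,\theta,\sigma}\,dx\,d\theta\right]^2 \le |I_j|^2\cdot 1 \le C4^{-2j}$,
while

$$f_{n,\theta,\sigma}^2 = \sum_{Q,Q'} \chi_{Q,\theta}\chi_{Q',\theta} = \sum_{Q\ne Q'} \chi_{Q,\theta}\chi_{Q',\theta} + \sum_{Q} \chi_{Q,\theta}^2.$$

Integrating over $I_j \times \mathbb{R}$, the latter diagonal sum becomes
$C4^{-j} \lesssim n4^{-2j}$ (the inequality uses the upper bound on j, which gives
$4^j \lesssim n$). When estimating the other integral, things become combinatorial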

- most of these terms are identically 0 in $I_j \times \mathbb{R}$. It remains only to show
Proposition 7. For 3 < j < C log n,

$$\int_{I_j\times\mathbb{R}} \sum_{Q\ne Q'} \chi_{Q,\theta}\chi_{Q',\theta}\,dx\,d\theta \lesssim n4^{-2j}.$$

Definition 4. $A_{j,k}$ is the set of pairs P = (Q, Q′) of Cantor squares such that there
exists θ ∈ [0, π] such that the $\sigma_\theta$ images of the centers z = x + iy and
z′ = x′ + iy′ of Q and Q′ have distance
$4^{-k-1} \le |y_{\sigma_\theta(z)} - y_{\sigma_\theta(z')}| \le 4^{-k}$ and satisfy the
condition on horizontal spacing

$$4^{-j-1} \le \frac{x_{\sigma_\theta(z)} - x_{\sigma_\theta(z')}}{y_{\sigma_\theta(z)} - y_{\sigma_\theta(z')}} \le 4^{-j}. \tag{2.4}$$

We can think of $4^{-j}$ as being tan(θ) for θ as in Figure 2.1. The terms in the sum of
Proposition 7 are supported on the integration region only when
$(Q, Q') \in A_{j-1,k}$, $A_{j,k}$, or $A_{j+1,k}$.
In [3], it was proved(2) that

$$|A_{j,k}| \le 4^{2n-k-2j} \tag{2.5}$$

when r = ∞. The proof is a very direct counting argument; roughly, if $(Q, Q') \in A_{j,k}$,
then the most recent common ancestor of Q and Q′ must have been of generation k, and Q and
Q′ must have been as close as possible in the x direction for the next j generations. That
is, of the 4n bits of information needed to specify a pair $P \in A_{j,k}$, all but
k + 2j + c of them are free to vary, where c is an absolute constant.

(2) Actually, the bound on $|A_{j,k}|$ and its proof are entirely two-sided, but we do not
need this fact.

To get the same |Aj,k | estimate for n

r < ∞ as shown in [6], it suﬃces to compare

the two cases with an application of the following lemma:

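Lemma 8. Let ε > 0 be small enough. Let $T : \mathbb{C} \to \mathbb{C}$ be such that
Lip(T − Id) < ε. Then for all z, w ∈ C,

$$|\arg(z - w) - \arg(T(z) - T(w))| < 2\varepsilon \pmod{2\pi}.$$

Proof. Write $z - w = \rho e^{i\theta}$, and let
$\alpha := \arg(z - w) - \arg(T(z) - T(w))$. Then

$$\arg(T(z) - T(w)) = \arg\big((T - \mathrm{Id})(z) - (T - \mathrm{Id})(w) + (z - w)\big) = \arg(\lambda\rho e^{i\beta} + \rho e^{i\theta})$$

for some λ < ε, β ∈ [0, 2π]. So
$\arg(T(z) - T(w)) = \arg(\lambda e^{i\beta} + e^{i\theta})$. Then
$|\alpha| \le \hat{\alpha}$, where
$\tan(\hat{\alpha}) = \frac{\varepsilon}{1 - \varepsilon}$, which implies
$|\alpha| < 2\varepsilon$.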
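Lemma 8 can be sanity-checked numerically on a concrete perturbation of the identity. The sketch below uses the hypothetical test map $T(z) = z + 0.9\,\varepsilon\sin(\mathrm{Re}\,z)$, chosen only because $T - \mathrm{Id}$ visibly has Lipschitz constant at most $0.9\,\varepsilon < \varepsilon$; it is not the map $\sigma_\theta$ of this chapter, and the function names are illustrative.

```python
import cmath
import math
import random

def angle_gap(a, b):
    """Distance between two angles modulo 2*pi, as a number in [0, pi]."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def perturbed_map(eps):
    """T(z) = z + 0.9*eps*sin(Re z), so that Lip(T - Id) <= 0.9*eps < eps."""
    return lambda z: z + 0.9 * eps * math.sin(z.real)

def max_angle_change(eps, trials=10000, seed=0):
    """Largest observed |arg(z - w) - arg(T(z) - T(w))| over random pairs in [-2, 2]^2."""
    rng = random.Random(seed)
    T = perturbed_map(eps)
    worst = 0.0
    for _ in range(trials):
        z = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        w = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        if z == w:
            continue  # arg(0) is undefined
        gap = angle_gap(cmath.phase(z - w), cmath.phase(T(z) - T(w)))
        worst = max(worst, gap)
    return worst
```

In this experiment the observed change of direction stays below 2ε, as the lemma predicts; a random search of this kind is of course only a consistency check, not a proof.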
This is where the condition r

n is used: to make Lemma 8 suﬃcient for the purposes

of relation 2.4. Since σθ is just σ0 conjugated by an isometry, the Lipschitz constant for σθ
(restricted to Kn ) is uniformly bounded by the size of the derivative of Frn on [−2, 2].
Definition 5. For any $P = (Q, Q') \in A_{j,k}$, let

$$\nu_P := \int_0^{\pi}\int_{\mathbb{R}} \chi_{Q,\theta}\chi_{Q',\theta}\,dx\,d\theta.$$

We need the estimate

$$\nu_P \le C4^{k-2n}, \tag{2.6}$$

since the integrand is supported only for angles belonging to $I_{j-1}$, $I_j$, and
$I_{j+1}$. So we fix j and sum over k to get

Ij ×R
Q=Q

χQ,θ χQ ,θ dθdx ≤

n−j+1
max{νP : P ∈ Aj ,k for some j = j − 1, j, j + 1}(|Aj−1,k | + |Aj,k | + |Aj+1,k |)
k=1
≤ Cn4−2j .
Here we used (2.5) and (2.6). The estimate (2.6) is elementary when $r = \infty$. It is true more generally than that, though.
Lemma 9 ($\nu_P$ lemma for circles). For any $j, k$ pair $P$ and $r \gtrsim n^{\varepsilon}$, $\nu_P \lesssim C4^{k-2n}$.

Proof. It may be useful to consult Figure 2.1 and Remark 3 now. If an arc of radius $r$ intersects two Cantor squares, then the arc must be centered inside the intersection of two annuli whose radii are $r \pm 4^{-n}$, and whose centers are the centers of the two Cantor squares. So we want to prove that the area $A$ of this intersection of annuli satisfies $A \le Cr4^{k-2n}$.
Without loss of generality, the squares are centered on the x-axis at $0$ and at $rx_0$. We have $rx_0 \approx 4^{-k}$ and we define $\eta$ by $r\eta = 4^{-n}$. So we need to show that $A \le Cr^2\eta^2/x_0$. We can scale the problem by $r$. Thus if we let $r = 1$, then we need only show that if $x_0 < 1/2$, then $A \le C\eta^2/x_0$. It will not hurt to let the inner radius be $1$ rather than $1 - \eta$.³ Let $R = 1 + \eta$.
The area $A$ is taken from the region bounded by $y = y_1^+ = \sqrt{R^2 - x^2}$, $y = y_2 = \sqrt{1 - (x - x_0)^2}$, $y = y_1 = \sqrt{1 - x^2}$, and $y = y_2^+ = \sqrt{R^2 - (x - x_0)^2}$.

³ One may divide the annulus along the circle with radius one. The inner annulus can be rescaled to have inner radius 1, and the constants change negligibly.

[Figure 2.4 (image not reproduced). Labels in the figure: $A < C(r\eta)^2/x_0$; $rx_0 \approx 4^{-k}$; $r\eta = 4^{-n}$; $r(1-\eta)$; $rx^*$.]

Figure 2.4: Only where the annuli intersect will we find centers of circles of radius $r$ which intersect both Cantor squares. Approximation by a rectangle is sufficient to give the desired estimate.
$y_1^+ = y_2$ at a point we will call $x^* = \frac{1}{2}x_0 + \frac{1}{2x_0}\eta(2 + \eta)$. So a rectangle which contains the area $A$ has width $2(x^* - \frac{x_0}{2}) = \frac{1}{x_0}\eta(2 + \eta)$, and height $y_1^+(\frac{x_0}{2}) - y_1(\frac{x_0}{2})$. So we need only show that $y_1^+(\frac{x_0}{2}) - y_1(\frac{x_0}{2}) \le C\eta$. To do this, we use the Mean Value Theorem on the function $s(x) = \sqrt{x}$:
\[ y_1^+\Big(\frac{x_0}{2}\Big) - y_1\Big(\frac{x_0}{2}\Big) = \sqrt{R^2 - \Big(\frac{x_0}{2}\Big)^2} - \sqrt{1 - \Big(\frac{x_0}{2}\Big)^2} \le s'\Big(1 - \Big(\frac{x_0}{2}\Big)^2\Big)(2\eta + \eta^2) \le C\frac{\eta}{\sqrt{1 - (\frac{x_0}{2})^2}} \le C\eta. \]
Thus $A \le C\eta^2/x_0$, as desired, so that $\nu_P \le C4^{k-2n}$.
This completes all proofs for this chapter.
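The estimate $A \le C\eta^2/x_0$ from the proof of Lemma 9 can be illustrated numerically in the rescaled setting $r = 1$ with inner radius $1$ and outer radius $R = 1 + \eta$. The sketch below is a plain midpoint-rule area computation; the function name `annuli_intersection_area`, the grid step, and the sample parameters are assumptions chosen for illustration, and the constant $C$ is not specified by the text.

```python
import math

def annuli_intersection_area(x0, eta, h=2e-4):
    """Midpoint-rule area of {1 <= |p| <= R} intersected with {1 <= |p - (x0, 0)| <= R}.

    R = 1 + eta. Only the component with y > 0 is integrated; the result is
    doubled using the symmetry of the configuration about the x-axis.
    """
    R = 1.0 + eta
    half_width = (R * R - 1.0) / (2.0 * x0)  # equals x* - x0/2 from the proof
    x_min = x0 / 2.0 - half_width
    x_max = x0 / 2.0 + half_width
    # Every point of the region has |p| >= 1, hence y >= sqrt(1 - x^2).
    y_min = math.sqrt(max(0.0, 1.0 - max(abs(x_min), abs(x_max)) ** 2))
    y_max = R
    nx = int(math.ceil((x_max - x_min) / h))
    ny = int(math.ceil((y_max - y_min) / h))
    count = 0
    for i in range(nx):
        x = x_min + (i + 0.5) * h
        for j in range(ny):
            y = y_min + (j + 0.5) * h
            r1 = math.hypot(x, y)
            r2 = math.hypot(x - x0, y)
            if 1.0 <= r1 <= R and 1.0 <= r2 <= R:
                count += 1
    return 2.0 * count * h * h

x0, eta = 0.2, 0.01
area = annuli_intersection_area(x0, eta)
ratio = area * x0 / eta ** 2  # should stay bounded if A <= C eta^2 / x0
print(area, ratio)
```

For these sample values the ratio $A x_0/\eta^2$ is of order one, consistent with the rectangle bound; other parameter ranges (for example $\eta$ comparable to $x_0^2$) are not explored here.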

Chapter 3
The lower bound in Buffon’s noodle problem - general noodles

3.1 General Buffon noodle probabilities and some preliminary reductions.

A notation for this chapter: $\mathrm{Proj}_\theta(E)(x) := \chi_{\mathrm{proj}_\theta(E)}(x)$. Aside from mathematical grammar and context, one can also tell what is referred to by proj and Proj by paying attention to capitalization.
In the previous chapter, our noodles were the functions Frn , playing a certain role in the
expression σθ .
Let us deﬁne general noodle probabilities now. Because an arbitrary noodle does not have
as many symmetries as a circular arc, a general noodle probability will need to integrate over
three independent parameters: two real variables for where the noodle lands, and one for
the orientation of the noodle. This serves two purposes: first, it better conforms to our
intuition about what it means to randomly toss a possibly asymmetric noodle. Second, an
extra variable of integration allows us to more readily partition regions of integration into
ones possessing symmetry. Our parameterization will have three real variables, ρ and θ like
before, and a third parameter τ for translation orthogonal to the axis in the θ direction. In
the case where the noodle is a circular arc, the two-parameter deﬁnition is equivalent for
our purposes. It is clear that such a translation by τ of a circle is again a circle, and the
information about whether this circle intersects a set can be transformed into an equivalent
question in the two-parameter setting of the previous chapter. This is clearly not possible
for noodles with less symmetry.
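Let $g_\tau(y) := g(y - \tau)$. (If we have a family $g_n$ of noodles, then we can write $g_{n,\tau}(y) := (g_n)_\tau(y) = g_n(y - \tau)$.) For a probability distribution $P$ on $\mathbb{R}^2 \times S^1$, a set $E \subset \mathbb{C}$, and noodle $g$, we can define
\[ \mathrm{Bu}_g(E) = \int \mathrm{Proj}_\theta\big(\sigma_\theta^{g_\tau}(E)\big)(x)\,dP(x, \tau, \theta). \]
We can choose an $L > 10$, say, and let $P$ be normalized Lebesgue measure on $(-2, 2) \times (-L, L) \times (0, 2\pi)$, under which
\[ \mathrm{Bu}_g(E) = \frac{1}{16\pi L}\int_0^{2\pi}\int_{-L}^{L} \big|\mathrm{proj}_\theta(\sigma_\theta^{g_\tau}(E))\big|\,d\tau\,d\theta = \frac{1}{16\pi L}\int_{-L}^{L} \mathrm{Fav}_{\sigma^{g_\tau}}(E)\,d\tau. \]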
(Note: proj was lower-case, so |proj(...)| denoted Lebesgue measure rather than pointwise
absolute value of a function.)
Having done this, we will say that noodles $g_n$ are undercooked if $\mathrm{Bu}_{g_n}(K_n) \gtrsim \frac{\log n}{n}$.
We call such a family of noodles undercooked because they are sufficiently close to being straight lines. It is not clear whether the “undercooking” condition is necessary or an artifact of the proof; on the other hand, it is clear that nearly-linear noodles are undercooked by the definition specified for some appropriate notion of “nearly linear”. We will prove one such result in this chapter:
Theorem 10. If $\|g_n'(y)\|_\infty^4 \cdot \|g_n''\|_\infty \le 4^{-n}$ and $\|g_n'(y)\|_\infty \le \frac{1}{100n}$, then the noodles $g_n$ are undercooked.
Remark 4. In particular, this theorem implies that the $F_{r_n}$ are undercooked if $r_n \ge 4^{n/5}$, which is a much stronger condition than that required by Theorem 4. Another example is $g_n(y) = 4^{-n/2}\sin(4^{n/4}y)$.
Remark 5. Using methods like those of Lemma 9 combined with the methods of this chapter,
it may be possible to weaken the ﬁrst condition of Theorem 10 in favor of conditions that
require convexity and/or a condition on ||g ||∞ . One would estimate noodle probability by
estimating the distortion caused by thinking of noodle segments as segments of circular arcs
rather than thinking of these segments as being “nearly linear”.
Theorem 4 of Chapter 2 could be stated as follows:
Theorem 11. The functions $F_{r_n}$, where $F_{r_n}(y) := r_n - \sqrt{r_n^2 - y^2}$, define undercooked noodles if $r_n \gtrsim n^{\varepsilon}$ for some $\varepsilon > 0$.

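The proof will be essentially the same, with the difference being that the corresponding $\nu_P$ lemma will be more tortuous. Define
\[ \nu_{P,\sigma^g} = \int_{-L}^{L}\int_0^{2\pi} \big|\mathrm{proj}_\theta(\sigma_\theta^{g_\tau}(Q)) \cap \mathrm{proj}_\theta(\sigma_\theta^{g_\tau}(Q'))\big|\,d\theta\,d\tau. \tag{3.1} \]
Lemma 12. $\nu_{P,\sigma^g} \lesssim 4^{k-2n}$.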

There will be two main parts of the proof of the above $\nu_P$ lemma. If one of the two squares, $Q$, were centered at the origin and $\tau$ were fixed, the computation would merely amount to finding how often a needle close to the origin intersected the other square, $Q'$. We claim that this assumption can be justified if one foliates the domain appropriately and then changes variables. In fact, one can further assume that $Q'$ lies on the negative y-axis and that $\tau = 0$. Having done this, we will linearly approximate $g_n$ and use the structure of the shear group (Section 3.2) to get our desired estimate. The idea is that we pick one of the two squares $Q$ and partition the integration domain according to which point along the noodle punctures the center of $Q$. One can imagine dropping the noodle so it intersects $Q$, gluing this point of intersection in place, and then rotating the noodle around this point, asking how often the noodle hits the other square, $Q'$. Each of these positionings of the noodle can be expressed uniquely by a triple $(\tau, \theta, x)$. If the center of $Q$ is crossed at a particular point along the noodle, then under this restriction, one thinks of $\theta$ as free and of $x$ and $\tau$ as functions of $\theta$.
Let us state the formulas. Fix a $j, k$ pair $Q, Q'$. We will describe the portion of the domain of integration in which the noodle hits the center of a square $Q$ at the same point $-\tau_0$ of the noodle. That is, if $Q$ has center $z = \rho e^{i\theta_0}$, consider $\tilde g := g - g(-\tau_0)$ and $\sigma_\theta^{\tilde g_{\tau_0}}$. For each $\theta$, we need to find the unique $x_\theta$ and $\tau_\theta$ such that the line centered at $x_\theta e^{i\theta}$ and with positive axis in the $\theta + \pi/2$ direction intersects $z$ at $y = \tau_\theta - \tau_0$. In fact,
\[ x_\theta = |z|\cos(\theta - \theta_0), \quad\text{and}\quad \tau_\theta = \tau_0 - |z|\sin(\theta - \theta_0). \]

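Then when computing
\[ \int_0^{2\pi}\int_{x_\theta - a}^{x_\theta + a} \mathrm{Proj}_\theta\big(\sigma_\theta^{\tilde g_{\tau_\theta}}(Q)\big)(x)\,\mathrm{Proj}_\theta\big(\sigma_\theta^{\tilde g_{\tau_\theta}}(Q')\big)(x)\,dx\,d\theta, \]
without loss of generality $z = 0$. That is,
\[ \int_0^{2\pi}\int_{x_\theta - a}^{x_\theta + a} \mathrm{Proj}_\theta\big(\sigma_\theta^{\tilde g_{\tau_\theta}}(Q)\big)(x)\,\mathrm{Proj}_\theta\big(\sigma_\theta^{\tilde g_{\tau_\theta}}(Q')\big)(x)\,dx\,d\theta = \int_0^{2\pi}\int_{-a}^{a} \mathrm{Proj}_\theta\big(\sigma_\theta^{\tilde g}(Q - z)\big)(x)\,\mathrm{Proj}_\theta\big(\sigma_\theta^{\tilde g}(Q' - z)\big)(x)\,dx\,d\theta. \]
For $z$ = center of $Q$, and for fixed $\tau_0$, define
\[ D = \{\tau = \tau_0 - |z|\sin(\theta - \theta_0),\ |x - |z|\cos(\theta - \theta_0)| \le C4^{-n},\ \theta \in (0, 2\pi)\}. \]
Then if
\[ I_D(\tau_0) := \iint_D \mathrm{Proj}_\theta\big(\sigma_\theta^{\tilde g_{\tau_\theta}}(Q')\big)(x)\,dx\,d\theta, \]
we have $\nu_{P,\sigma^g} \le \int_{-L}^{L} I_D(\tau_0)\,d\tau_0$.
Putting this all together, we are seeking to prove that if in addition to the hypotheses of Theorem 10, $g(0) = 0$ and $Q'$ is approximately at distance $4^{-k}$ from the origin, then
\[ \int_0^{2\pi}\int_{-4^{-n}}^{4^{-n}} \mathrm{Proj}_\theta\big(\sigma_\theta^{g}(Q')\big)(x)\,dx\,d\theta \le C4^{k-2n}. \]
Here we use that the $\sigma$-projection of a small square centered at the origin is essentially an interval around the origin regardless of $\theta$.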

3.2 Some useful facts about shear groups

A few facts about the shear groups $\Sigma_\theta := \{\sigma_\theta^g : g : \mathbb{R} \to \mathbb{R} \text{ measurable}\}$ need to be stated (the operation is composition of the maps $\sigma_\theta^g$). Below, $g$ and $h$ will be arbitrary noodles.
Recall:
\[ \sigma_0^g(x, y) := (x - g(y), y), \qquad \sigma_\theta^g := R_{-\theta} \circ \sigma_0^g \circ R_\theta. \]
First, there is this simple fact for arbitrary functions $g$ and $h$:
\[ \sigma_\theta^g \circ \sigma_\theta^h = \sigma_\theta^{g+h} \tag{3.2} \]
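Identity (3.2) is easy to confirm on sample data. The Python sketch below implements $\sigma_\theta^g = R_{-\theta} \circ \sigma_0^g \circ R_\theta$ directly from the definitions recalled above and compares both sides of (3.2) at one point. The helper names `rotate` and `shear`, the two sample noodles, and the convention that $R_\theta$ is counterclockwise rotation are assumptions made for the demonstration; the identity does not depend on the rotation convention.

```python
import math

def rotate(p, theta):
    """Rotate the point p = (x, y) counterclockwise by the angle theta."""
    x, y = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

def shear(g, theta):
    """Return sigma_theta^g = R_{-theta} o sigma_0^g o R_theta as a map on R^2."""
    def sigma(p):
        x, y = rotate(p, theta)                # apply R_theta
        return rotate((x - g(y), y), -theta)   # apply sigma_0^g, then R_{-theta}
    return sigma

# Two arbitrary sample noodles (illustrative choices only).
g = lambda y: 0.3 * math.sin(2.0 * y)
h = lambda y: 0.1 * y * y - 0.05

theta = 0.7
p = (0.4, -1.2)
lhs = shear(g, theta)(shear(h, theta)(p))          # sigma^g o sigma^h
rhs = shear(lambda y: g(y) + h(y), theta)(p)       # sigma^{g+h}
print(lhs, rhs)
```

The two printed points agree up to floating-point rounding, reflecting that the inner rotations $R_\theta \circ R_{-\theta}$ cancel and the two horizontal shifts add.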

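Next, we show how shears by linear noodles behave. We can let $E_\theta$ be a family of subsets of $\mathbb{C}$, but for our application, we will fix $E_\theta = K_n$. For $g(y) = b$, we get
\[ \mathrm{proj}_\theta(\sigma_\theta^g(E_\theta)) = \mathrm{proj}_\theta(E_\theta) - b \tag{3.3} \]
For $g(y) = my$, $\alpha := \arctan m$, we get
\[ \mathrm{proj}_\theta(\sigma_\theta^g(E_\theta)) = \frac{\mathrm{proj}_{\theta-\alpha}(E_\theta)}{\cos(\alpha)} = \sqrt{1 + m^2}\,\mathrm{proj}_{\theta-\alpha}(E_\theta) \tag{3.4} \]
Remember that the lower-case proj denotes a set, not a characteristic function. That is, keep in mind that we defined $\mathrm{Proj}_\theta(E) := \chi_{\mathrm{proj}_\theta(E)}$.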

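For $g(y) = my + b$ and given $x \in \mathbb{R}$, we can see that $x \in \mathrm{proj}_\theta(\sigma_\theta^g(E_\theta))$ if and only if $\frac{x + b}{\sqrt{1 + m^2}} \in \mathrm{proj}_{\theta-\alpha}(E_\theta)$. Thus for any measurable $A \subset \mathbb{R}$,
\[ \int_0^{2\pi}\int_A \mathrm{Proj}_\theta(\sigma_\theta^g(E_\theta))(x)\,dx\,d\theta = \sqrt{1 + m^2}\int_0^{2\pi}\int_{\frac{1}{\sqrt{1+m^2}}(A + b)} \mathrm{Proj}_\theta(E_{\theta+\alpha})(x)\,dx\,d\theta \tag{3.5} \]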

3.3 Proof of the $\nu_P$ Lemma for general noodles

Now to prove Lemma 12. Lemma 8 will be used several times without explicit mention. Lipschitz constants are clearly gotten from Taylor estimates on $g$.
The rough idea of this proof: the set of parameters for which two squares are simultaneously punctured by the needle may be translated considerably in parameter space by the shears, but it cannot be dilated by too much. Since a shear with small curvature is well-approximated by a linear shear, the result will follow. Let $\lambda' = \|g_n'(y)\|_\infty$ and $\lambda'' = \|g_n''(y)\|_\infty$. Let $Q$ and $Q'$ be centered at $(0, 0)$ and $(0, -L)$, respectively, where $L \approx 4^{-k}$. Note that
\[ \nu_{P,\sigma^g} \le C4^{-n}\,\big|\{\theta : \sigma_\theta^g Q \cap \sigma_\theta^g Q'\}\big| \le C4^{-n}(4^{k-n} + \lambda'). \]
We need this quantity to be $< C4^{k-2n}$. This task is already done if $\lambda' \le 4^{k-n}$, so assume the opposite. Now for such $\theta$ we have $|\theta| < C(4^{k-n} + \lambda') < C\lambda'$.
For these $\theta$, the rotation $R_\theta(Q')$ is in the band $L - \delta \le y \le L + \delta$, for $\delta = 4^{-n} + L(1 - \cos(C\lambda'))$, giving $\delta \le C\max\{4^{-n}, L\lambda'^2\}$. Now transform the integral using the shear group. Let $l(y)$ linearly approximate $g(y)$ at $y = L - \delta$, with $l(y) = my + b$. Note that $|b| \le CL\lambda'$. Let

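$\epsilon(y) := g(y) - l(y)$ on $[L - \delta, L + \delta]$ and extend $\epsilon$ continuously to be constant elsewhere.
Then, with $b' := b/\sqrt{1 + m^2}$:
\[ \nu_{P,\sigma^g} = \int_0^{2\pi} \big|\mathrm{proj}_\theta(\sigma_\theta^g(Q')) \cap \mathrm{proj}_\theta(\sigma_\theta^g(Q))\big|\,d\theta \le \int_0^{2\pi}\int_{-4^{-n}}^{4^{-n}} \mathrm{Proj}_\theta(\sigma_\theta^g(Q'))(x)\,dx\,d\theta \]
\[ = \int_0^{2\pi}\int_{[-4^{-n},\,4^{-n}]} \mathrm{Proj}_\theta\big(\sigma_\theta^{l}(\sigma_\theta^{\epsilon}(Q'))\big)\,dx\,d\theta \le C\int_0^{2\pi}\int_{[b' - 2\cdot 4^{-n},\,b' + 2\cdot 4^{-n}]} \mathrm{Proj}_{\theta-\alpha}\big(\sigma_\theta^{\epsilon}(Q')\big)\,dx\,d\theta. \]
Changing variable, we see that this is at most
\[ C\int_0^{2\pi}\int_{[b' - 2\cdot 4^{-n},\,b' + 2\cdot 4^{-n}]} \mathrm{Proj}_\theta\big(\sigma_{\theta+\alpha}^{\epsilon}(Q')\big)\,dx\,d\theta. \]
Let $\Gamma := \{\theta : \mathrm{proj}_\theta(\sigma_{\theta+\alpha}^{\epsilon}(Q')) \cap [b' - 2\cdot 4^{-n}, b' + 2\cdot 4^{-n}] \ne \emptyset\}$, and let $z := (0, -L)$. If $\theta \in \Gamma$, then $\mathrm{proj}_\theta(\sigma_{\theta+\alpha}^{\epsilon}(z)) \in [b' - 3\cdot 4^{-n}, b' + 3\cdot 4^{-n}]$.
Since $|\epsilon'(y)| < C\delta\lambda''$, it follows that $|\epsilon(y)| < C\delta^2\lambda'' < CL^2\lambda'^4\lambda'' < C4^{-n}$. So $|\sigma_{\theta+\alpha}^{\epsilon}(z) - z| < c4^{-n}$, and hence $|\mathrm{proj}_\theta(\sigma_{\theta+\alpha}^{\epsilon}(z)) - \mathrm{proj}_\theta(z)| \le C4^{-n}$ for all $\theta \in \Gamma$. So
\[ \Gamma \subseteq \{\theta : \mathrm{proj}_\theta(z) \in [b' - C4^{-n}, b' + C4^{-n}]\} = \{\theta : L\sin\theta \in [b' - C4^{-n}, b' + C4^{-n}]\}, \]
which implies:
\[ |\Gamma| \le C\,\big|\{\theta : \sin\theta \in [b'/L - C4^{k-n}, b'/L + C4^{k-n}]\}\big|. \tag{3.6} \]
Since $b' < CL\lambda'$ and $4^{k-n} \le \lambda' \le \frac{C}{n}$, $\sin\theta \approx \theta$, and we get $|\Gamma| \le C4^{k-n}$, completing the proof of the $\nu_P$ Lemma.
Theorem 3.1 follows as well.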

Chapter 4
The upper bound in Buffon’s needle problem - Sierpinski’s gasket

Here we will prove Theorem 14. The argument is elaborated in slightly more detail in [5].
It may be instructive to compare the general case Jn with the special case Gn we consider
here. When a theorem for Gn is treated as a special case to later be proved for Jn , the
correspondence will be noted by theorem number and then omitted, both to minimize repetition and to prevent readers from missing the forest for the trees. The following preamble
serves equally well for the gasket and for the general case.
In Chapter 1, we saw that the growth of ||fn,θ ||p → ∞ was equivalent to the decay
of |projθ (Jn )| → 0. We stated and proved a micro-theorem and its converse. Chapters
2 and 3 used the idea of the micro-theorem, and here we employ a stronger form of the
micro-theorem converse, Theorem 27.
$\int_0^\pi |\mathrm{proj}_\theta(J_n)|\,d\theta \to 0$, as guaranteed by the Besicovitch theorem, is only an average; we also noted that [12] and [14] show that the exceptional angles $\theta$ where $|\mathrm{proj}_\theta(J)| > 0$ can be dense in $[0, \pi]$. The “set of bad angles at stage n”, $E_n$ (or just $E$), necessarily has small measure when $n$ is large. Since $E$ is small and $|\mathrm{proj}_\theta(J_{n^2})|$ is small for $\theta \in E^c$, the integral can be split according to the cases $\theta \in E$ and $\theta \in E^c$:
\[ \pi \cdot \mathrm{Fav}(J_{n^2}) = \int_E |\mathrm{proj}_\theta(J_{n^2})|\,d\theta + \int_{E^c} |\mathrm{proj}_\theta(J_{n^2})|\,d\theta \le |E| + (\pi - |E|)\cdot\sup_{\theta \in E^c} |\mathrm{proj}_\theta(J_{n^2})| \ll 1 \]
(The exponent 2 is chosen somewhat arbitrarily.)
Quantitative control over $E$ has been accomplished to some extent:
Theorem 13. For all $n \in \mathbb{N}$,
\[ \mathrm{Fav}(J_n) \le e^{-\varepsilon_0\sqrt{\log n}}. \]

Theorem 14. There is a $p_0 > 0$ such that for all $p < p_0$, there exists $C_p > 0$ such that for all $n \in \mathbb{N}$,
\[ \mathrm{Fav}(G_n) \le C_p n^{-p}. \]
Further, one may take $p_0 = \frac{1}{[2\log_3(169)]^{-1}+1} \approx \frac{1}{10.262}$, so $p = \frac{1}{11}$ is sufficient.(1)
A method for controlling $|E|$ originates with [18]. One takes the Fourier transform of $f_{n,\theta}$ in the length variable and takes a sample integral of $|\hat f_{n,\theta}(x)|^2$ over a chosen small interval $I$ where $\int_{E \times I} |\hat f_{n,\theta}(x)|^2\,d\theta\,dx$ is small. One then shows that there is a $\theta \in E$ such that $\int_I |\hat f_{n,\theta}(x)|^2\,dx$ is large relative to $|E|$, and so $|E|$ must be small.

(1) It is not suspected that this value of $p_0$ is sharp; on the other hand, $p = 1$ is impossible because of the argument of [3], $\mathrm{Fav}(G_n) \gtrsim \frac{\log n}{n}$.

$\hat f_{n,\theta}$ is a decay factor times a finite self-similar product $\prod_k \varphi_\theta(L^{-k}y)$ of trigonometric polynomials $\varphi_\theta$. The most direct methods don’t accomplish the estimate all at once; the high-frequency terms form a product $P_{1,\theta}$ such that $\int_I |P_{1,\theta}|^2\,dx$ is large, and the danger is that perhaps the zeroes of the low-frequency terms $P_{2,\theta}$ might be located such that $\int_I |P_1P_2|^2\,dx$ is small. In [18], the four frequencies of $\varphi_\theta$ were symmetric around 0, allowing the terms to simplify to two cosines, and trigonometric identities allowed the whole product to be estimated by a single sine term. In [13], an analogous role was played by tilings of the line on the non-Fourier side by $\mathrm{proj}_{\theta_0}(J_n)$ in the special direction $\theta_0$, and the product structure of $J_n$ allowed for a change and separation of variables. Separating variables is more difficult when there is no product structure. The simplest case without the product structure is the Sierpinski gasket $G$ considered in this chapter. We give a sketch of the power estimate (proven in detail in [5]), which is based on the fact that zeroes of $\varphi(3^k\,\cdot)$ are separated away from each other for different values of $k$. This special structure of zeros (we call it “analytic tiling” after [13]) is not always available for all angles. We have not yet found an adequate substitute for it in the general case, and this is why for the general case we still only have $\mathrm{Fav}(J_n) \le e^{-\varepsilon_0\sqrt{\log n}}$. Rather strangely, a claim in the spirit of the Carleson Embedding Theorem, in the form of Lemma 40, plays an important part in our reasoning in the general case of Chapter 5. Because the Fourier transform turns stacks of discs (i.e., sums of overlapping characteristic functions) into clusters of frequencies, this lemma provides important upper bounds when $\theta$ belongs to $E$.


4.1 Reductions and main Fourier-analytic argument

$B(z_0, r) := \{z \in \mathbb{C} : |z - z_0| < r\}$. For $\alpha \in \{-1, 0, 1\}^n$ let
\[ z_\alpha := \sum_{k=1}^{n} \Big(\frac{1}{3}\Big)^k e^{i\pi[\frac{1}{2} + \frac{2}{3}\alpha_k]}, \qquad G_n := \bigcup_{\alpha \in \{-1,0,1\}^n} B(z_\alpha, 3^{-n}). \]
This set is our approximation of a $G_n$; recall Remark 1. We may still speak of the discs $B(z_\alpha, 3^{-n})$ as “Sierpinski triangles.” The result for the Sierpinski gasket is the following:

Theorem 15. For some $p > 0$, $\mathrm{Fav}(G_n) \lesssim \frac{1}{n^p}$.

We will simplify the proof by picking specific values for constants; at the end of this paper, a short remark shows how to recover the full range $p < p_0$ as in Theorem 14. As in Chapter 1, let
\[ f_{n,\theta} := \sum_{\text{Discs } D \text{ of } G_n} \chi_{\mathrm{proj}_\theta(D)}. \]
Self-similarity allows us to write $f_{n,\theta}$ in a form well-suited to Fourier analysis:
\[ f_{n,\theta} = \nu_n * \tfrac{1}{2}\,3^n\chi_{[-3^{-n},\,3^{-n}]}, \]
where
\[ \nu_n := *_{k=1}^{n}\,\nu_k, \qquad \nu_k := \frac{1}{3}\Big[\delta_{3^{-k}\cos(\pi/2-\theta)} + \delta_{3^{-k}\cos(-\pi/6-\theta)} + \delta_{3^{-k}\cos(7\pi/6-\theta)}\Big]. \]
For $K > 0$, let $A_K := A_{K,n,\theta} := \{x : f_{n,\theta} \ge K\}$. $L_{\theta,n} := \mathrm{proj}_\theta(J_n) = A_{1,n,\theta}$.

For our result, some maximal versions of these are needed.(2):
\[ f^*_{N,\theta} := \max_{n \le N} f_{n,\theta}, \qquad A^*_K := A^*_{K,N,\theta} := \{x : f^*_{N,\theta} \ge K\} = \bigcup_{n=1}^{N} A_{K,n,\theta}. \]
Also, let $E := E_N := \{\theta : |A^*_K| \le K^{-3}\}$ for $K = N^{\varepsilon_0}$, where $\varepsilon_0 > 0$ is a small enough absolute constant.(3)
Later, we will jump to the Fourier side, where the function
\[ \varphi_\theta(x) := \frac{1}{3}\Big[e^{-i\cos(\pi/2-\theta)x} + e^{-i\cos(-\pi/6-\theta)x} + e^{-i\cos(7\pi/6-\theta)x}\Big] \]
plays the central role: $\nu_n(x) = \prod_{k=1}^{n} \varphi_\theta(3^{-k}x)$.
Let $L_{n,\theta} := \mathrm{proj}_\theta(G_n)$. The following constitutes the content of Theorem 27: If $\theta \notin E_N$, then $|L_{NK^3,\theta}| \le \frac{C}{K}$. (The same is true of $J_n$ when everything is again defined analogously.)
Now Theorem 15 follows from the following:
Theorem 16. Let $\varepsilon_0 < 1/\log_3(169)$, sufficiently, $\varepsilon_0 \le 1/9.262$. Then for $N \gg 1$, $|E| < N^{-\varepsilon_0} = \frac{1}{K}$.
This is better than what has currently been done for $J_n$, Theorem 21. It turns out that $L^2$ theory on the Fourier side is of great use here. The following is later proved as Theorem 31 in Section 5.2.2: For all $\theta \in E_N$ and for all $n \le N$, $\|f_{n,\theta}\|_2^2 \lesssim K$. (The implied constant depends only on the set of self-similarities.)

(2) See the micro-theorem converse of 1.5 for the rough idea why this is useful, and then Theorem 27 for the formal statement of what one can then say.
(3) To get the sharpest exponent in Theorem 14, $K^{-3}$ should be replaced by $K^{-\tau}$ for $\tau > 2$ arbitrary.

One can then take small sample integrals on the Fourier side and look for lower bounds as well. Let $K = N^{\varepsilon_0}$, and let $m = 2\varepsilon_0\log_3 N$. Theorem 31 easily implies the existence of $\tilde E \subset E$ such that $|\tilde E| > |E|/2$ and number $n$, $N/4 < n < N/2$, such that for all $\theta \in \tilde E$,
\[ \int_{3^{n-m}}^{3^n} \prod_{k=0}^{n} |\varphi_\theta(3^{-k}x)|^2\,dx \lesssim \frac{Km}{N} \lesssim N^{\varepsilon_0 - 1}\log N. \]
The number $n$ does not depend on $\theta$; $n$ can be chosen to satisfy the estimate in the average over $\theta \in E$, and then one chooses $\tilde E$. Let $I := [3^{n-m}, 3^n]$.
Now the main result amounts to this (with absolute constant $\alpha$ large enough):
Theorem 17.
\[ \exists\theta \in \tilde E : \int_I \prod_{k=0}^{n} |\varphi_\theta(3^{-k}x)|^2\,dx \gtrsim 3^{m - 2\alpha m} = N^{-2\varepsilon_0(2\alpha - 1)}. \]
The result: $\log N \gtrsim N^{1 - \varepsilon_0(4\alpha - 1)} = N^{\delta}$, where $\delta > 0$. Then it follows that $N \le N^*$.

Now we sketch the proof of Theorem 17. We split up the product into two parts, high and low-frequency:
\[ P_{1,\theta}(z) = \prod_{k=0}^{n-m-1} \varphi_\theta(3^{-k}z), \qquad P_{2,\theta}(z) = \prod_{k=n-m}^{n} \varphi_\theta(3^{-k}z). \]
The following is Proposition 23:
Proposition 18. For all $\theta \in E$, $\int_I |P_{1,\theta}|^2\,dx \ge C3^m$.
Low frequency terms do not have as much regularity, so we must control the damage caused by the set of small values, $SSV(\theta) := \{x \in I : |P_2(x)| \le 3^{-\ell}\}$, $\ell = \alpha m$. In the next result we claim the existence of $\tilde E \subset E$, $|\tilde E| > |E|/2$ with the following property:
The next proposition is like Proposition 24, except that the following proposition holds for a larger set $SSV(\theta)$ than the corresponding set $SSV(t)$ defined there ($t$ is a reparameterization of $\theta$):
Proposition 19.
\[ \int_{\tilde E}\int_{SSV(\theta)} |P_{1,\theta}(x)|^2\,dx\,d\theta \le 3^{2m - \ell/2}. \]
Therefore, $\exists\tilde E \subset E$ such that:
\[ \forall\theta \in \tilde E, \quad \int_{SSV(\theta)} |P_{1,\theta}(x)|^2\,dx \lesssim K3^{2m - \ell/2}. \]
Then Proposition 23 and 19 give Theorem 17; since $\ell = \alpha m$ and $K^2 = 3^m$, we see that any $\alpha > 2$ may be used for this estimate; however, we will need $\alpha$ to be larger soon.

4.2 Controlling SSV(t)

Up until now, the proof has not differed from the general case other than some choices of $m$, $K$, $|E|$ etc., for reasons soon to be established. In this section, we depart dramatically from the general case considered in Chapter 5. Remark 6, as we will see, is indispensable for the proof we consider for the gasket and unavailable in the general case. In particular, a large set of angles lacking properties like those in Remark 6 sometimes implies that $SSV(t)$ is large for a set of angles having size $L^{-\sqrt{m}}$, invalidating the approach we will use here, or at the very least contributing another type of case we don’t yet know how to deal with. The general case is handled by much less elementary methods in Section 5.3, which must take into account the possibility of “repeated zeroes”.

Remark 6. Consider $\Phi(x, y) = 1 + e^{ix} + e^{iy}$; note that $\varphi_\theta(z) = \Phi(x_\theta(z), y_\theta(z))$. To understand the small values of $\Phi$, the key observation is the fact that if $\Phi(x, y) = 0$ and $x, y \in \mathbb{R}$, then $\Phi(3x, 3y) = 3$, and further, that $x = \pm 2\pi/3 \bmod 2\pi$ and $y = \mp 2\pi/3 \bmod 2\pi$.
See also the Section 5.5.
These lead to the following estimates:

\[ |\Phi(x, y)|^2 \ge a\big(|4\cos^2 x - 1|^2 + |4\cos^2 y - 1|^2\big) \tag{4.1} \]
\[ \frac{\sin 3x}{\sin x} = 4\cos^2 x - 1. \tag{4.2} \]
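The key observation of Remark 6 and the identity (4.2) can be confirmed numerically. The Python sketch below evaluates $\Phi$ at its two real zeros modulo $2\pi$, checks that $\Phi(3x, 3y) = 3$ there, and compares both sides of (4.2) on sample points away from the zeros of $\sin x$. The names `Phi` and `ratio_gap` and the sample grid are illustrative assumptions; inequality (4.1) and its constant $a$ are not tested here.

```python
import cmath
import math

def Phi(x, y):
    """Phi(x, y) = 1 + e^{ix} + e^{iy}, as in Remark 6."""
    return 1 + cmath.exp(1j * x) + cmath.exp(1j * y)

# The two real zeros of Phi modulo 2*pi: the exponentials must be the two
# primitive cube roots of unity.
zeros = [(2 * math.pi / 3, -2 * math.pi / 3), (-2 * math.pi / 3, 2 * math.pi / 3)]
values_at_zeros = [abs(Phi(x, y)) for x, y in zeros]
values_at_triple = [Phi(3 * x, 3 * y) for x, y in zeros]
print(values_at_zeros, values_at_triple)

def ratio_gap(x):
    """sin(3x)/sin(x) minus (4 cos^2 x - 1); identically zero by (4.2)."""
    return math.sin(3 * x) / math.sin(x) - (4 * math.cos(x) ** 2 - 1)

# Sample points 0.01 + 0.37 k, which avoid the multiples of pi.
max_gap = max(abs(ratio_gap(0.01 + 0.37 * k)) for k in range(1, 50))
print(max_gap)
```

Since $\sin 3x$ vanishes exactly where $4\cos^2 x - 1$ or $\sin x$ does, this is the mechanism by which small values of $\Phi$ at one scale are tied to the factor at the next scale.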

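Actually, we will set $\alpha = a^{-1}$ in the end. Changing variable we can replace $3\varphi_\theta(x)$ by $\phi_t(x) = \Phi(x, tx)$.
Consider
\[ P_{2,t}(x) := \prod_{k=n-m}^{n} \tfrac{1}{3}\phi_t(3^{-k}x), \qquad P_{1,t}(x) := \prod_{k=0}^{n-m} \tfrac{1}{3}\phi_t(3^{-k}x). \]
We need control over the set $SSV(t) := \{x \in I : |P_{2,t}(x)| \le 3^{-\ell}\}$. One can easily imagine $SSV(t)$ if one considers
\[ \Omega := \Big\{(x, y) \in [0, 2\pi]^2 : |\mathcal{P}(x, y)| := \Big|\prod_{k=0}^{m} \Phi(3^k x, 3^k y)\Big| \le 3^{m-\ell}\Big\}. \]
Moreover, (using that if $x \in SSV(t)$ then $3^{-n}x \ge 3^{-m}$, and using $x\,dx\,dt = dx\,dy$) we change variable in the next integral:
\[ \int_{\tilde E}\int_{SSV(t)} |P_{1,t}(x)|^2\,dx\,dt = 3^{-2n+2m}\cdot 3^n \int_{\tilde E}\int_{3^{-n}SSV(t)} \Big|\prod_{k=m}^{n} \Phi(3^k x, 3^k tx)\Big|^2\,dx\,dt \le 3^{-n+3m}\int_{\Omega} \Big|\prod_{k=m}^{n} \Phi(3^k x, 3^k y)\Big|^2\,dx\,dy. \]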

Now notice that by our key observations
\[ \Omega \subset \{(x, y) \in [0, 2\pi]^2 : |\sin 3^{m+1}x|^2 + |\sin 3^{m+1}y|^2 \le a^{-m}3^{2m-2\ell} \le 3^{-\ell}\}. \tag{4.3} \]

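The latter set $\mathcal{Q}$ is the union of $4 \cdot 3^{2m+2}$ squares $Q$ of size $3^{-m-\ell/2} \times 3^{-m-\ell/2}$. Fix such a $Q$ and estimate
\[ \int_Q \Big|\prod_{k=m}^{n} \Phi(3^k x, 3^k y)\Big|^2\,dx\,dy \le 3^{\ell}\int_Q \Big|\prod_{k=m+\ell/2}^{n} \Phi(3^k x, 3^k y)\Big|^2\,dx\,dy \]
\[ \le 3^{\ell}\cdot(3^{-m-\ell/2})^2 \int_{[0,2\pi]^2} \Big|\prod_{k=0}^{n-m-\ell/2} \Phi(3^k x, 3^k y)\Big|^2\,dx\,dy \le 3^{\ell}\cdot(3^{-m-\ell/2})^2\cdot 3^{n-m-\ell/2} = 3^{-2m}\cdot 3^{n-m-\ell/2}. \]
Therefore, taking into account the number of squares $Q$ in $\mathcal{Q}$ and the previous estimates we get
\[ \int_E\int_{SSV(t)} |P_{1,t}(x)|^2\,dx\,dt \le 3^{2m-\ell/2}. \]
Proposition 19 is proved.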

Remark 7. It is true that $\alpha$ depends on the constant $a$ in (4.1), since it appears in (4.3). One can use $a = \frac{1}{18}$, attained at $(x, y) = (0, \pi)$. Then from (4.3), we get $\alpha = \ell/m \ge \log_3(162) \approx 4.631$ as our last condition on $\alpha$. We need this to compute the best exponent $p$.
Note that in our argument, we cut a couple corners. To get the best exponent currently available, let $\gamma > 1$. Let $m = \gamma\varepsilon_0\log_3 N$. Then the argument works as long as $\varepsilon_0 < [2\gamma\alpha + 1 - \gamma]^{-1}$, i.e., $\varepsilon_0 < \frac{1}{2\log_3(169)}$. Using the sharper exponent $\beta > 1$ in Theorem 27, one can get any $p = \frac{1}{\varepsilon_0^{-1}+\beta} < \frac{1}{[2\log_3(169)]^{-1}+1}$ in the estimate $\mathrm{Fav}(G_n) \le C_p n^{-p}$. In particular, $p = \frac{1}{10.262}$ is small enough.
This argument can be improved, but not so much that one should expect to get the sharp exponent without significant, totally new ideas.


Chapter 5
The upper bound in Buffon’s needle problem - general case

See the beginning of the previous chapter for a summary of the main ideas.

5.1 The Fourier-analytic part

5.1.1 The setup

The goal of this section is to prove Theorem 21, which shows that for most directions $\theta$, a considerable amount of stacking occurs orthogonal to $\theta$. The constants $c$ and $C$ will vary from line to line, but will be absolute constants not depending on anything except perhaps $L$ in some cases. The symbols $c$ and $C$ will typically denote constants that are sufficiently small or large, respectively. Everywhere we use the definition $B(z_0, \varepsilon) := \{z \in \mathbb{C} : |z - z_0| < \varepsilon\}$.

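Recall Remark 1, which allows us to say
\[ J_1 = \bigcup_{j=1}^{L} B\Big(r_j e^{i\theta_j}, \frac{1}{L}\Big). \]
Also,
\[ f_{n,\theta} := \sum_{\text{Discs } D \text{ of } J_n} \chi_{\mathrm{proj}_\theta(D)}. \]
Observe that $f_{n,\theta} = \nu_n * L^n\chi_{[-L^{-n},\,L^{-n}]}$, where $\nu_n := *_{k=1}^{n}\,\nu_k$ and
\[ \nu_k = \frac{1}{L}\sum_{l=1}^{L} \delta_{L^{-k}r_l\cos(\theta - \theta_l)}. \]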

We will now slightly modify $f$ for convenience. Note that
\[ \hat f_{n,\theta}(x) = L^n\,\hat\chi_{[-L^{-n},\,L^{-n}]}(x)\cdot\prod_{k=1}^{n} \phi_\theta(L^{-k}x), \]
where $\phi_\theta(x) = \frac{1}{L}\sum_{l=1}^{L} e^{-ir_l\cos(\theta_l - \theta)x}$. We are interested in $L^2$ norms, so the argument of $\phi$ is of no consequence. By factoring out the first term, discarding this factor, and changing the variable, we may instead write in place of $\phi_\theta$ the function
\[ \varphi_t(x) = \frac{1}{L}\Big[1 + e^{ix} + e^{itx} + \sum_{l=4}^{L} e^{i(a_l x + b_l tx)}\Big], \quad t \in [0, 1]. \tag{5.1} \]
We assumed here that $r_1 = 0$, $r_2 = r_3 = 1$, $\theta_2 = 0$, $\theta_3 = \pi/2$. We can do this by affine change of variable.

For numbers $K, N > 0$, define the following(1):
\[ f^*_N(s) := f^*_{N,t}(s) := \sup_{n \le N} f_{n,t}(s) \tag{5.2} \]
\[ A^*_K := A^*_{K,N,t} := \{s : f^*_N(s) \ge K\} \tag{5.3} \]
\[ E := \Big\{t : |A^*_K| \le \frac{1}{K^3}\Big\}. \tag{5.4} \]
$E$ is essentially the set of pathological $t$ such that $\|f_{n,t}\|_{L^2(s)}$ is small for all $n \le N$, as in [18]. In fact, we have this result, proved in Section 5.2.2:
Theorem 20. Let $t \in E$. Then
\[ \max_{0 \le n \le N} \|f_{n,t}\|^2_{L^2(s)} \le cK. \]
The aim of Section 5.1 is to prove the following:
Theorem 21. Let $\varepsilon_0$ be a fixed small enough constant. Then for $N \gg 1$, $|E| < e^{-\varepsilon_0\sqrt{\log N}}$.
So let $K \approx e^{\varepsilon_0\sqrt{\log N}}$, and suppose $|E| > \frac{1}{K}$. We will show that $N < N^*$, for some finite constant $N^* \gg 1$.

5.1.2 Initial reductions

Because of Theorem 20, we have for all $t \in E$,
\[ K \ge \|f_{N,t}\|^2_{L^2(s)} \approx \|\hat f_{N,t}\|^2_{L^2(x)} \ge C\int_1^{L^{N/2}} |\nu_N(x)|^2\,dx \tag{5.5} \]

(1) Note that our result could be sharper if $K^3$ were replaced by $K^\tau$, $\tau > 2$. The constant $\varepsilon_0$ could be computed explicitly, and it depends on $\tau$. We will not do this, though.

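Let $m \approx (\frac{\varepsilon_0}{2}\log N)^{1/2}$. Split $[1, L^{N/2}]$ into $N/2$ pieces $[L^k, L^{k+1}]$ and take a sample integral of $|\nu_N|^2$ on a small block $I := [L^{n-m}, L^n]$, with $n \in [N/4, N/2]$ chosen so that
\[ \frac{1}{|E|}\int_E\int_{L^{n-m}}^{L^n} |\nu_N(x)|^2\,dx\,dt \le CKm/N. \]
This choice is possible by (5.5). Define
\[ \tilde E := \Big\{t \in E : \int_{L^{n-m}}^{L^n} |\nu_N(x)|^2\,dx \le 2CKm/N\Big\}. \]
It then follows that $|\tilde E| \ge \frac{1}{2K}$.
Note that $\nu_N(x) = \prod_{k=1}^{N} \varphi(L^{-k}x) \approx \prod_{k=1}^{n} \varphi(L^{-k}x)$ for $x \in [L^{n-m}, L^n]$.
So for $t \in E$,
\[ \int_{L^{n-m}}^{L^n} \prod_{k=1}^{n} |\varphi_t(L^{-k}x)|^2\,dx \le \frac{CKm}{N} \le 2\varepsilon_0 N^{\varepsilon_0 - 1}\log N. \]
Recall that $m \approx (\frac{\varepsilon_0}{2}\log N)^{1/2}$. Later, we will show that $\exists t \in E$ and absolute constant $\alpha$ such that
\[ \int_{L^{n-m}}^{L^n} \prod_{k=1}^{n} |\varphi_t(L^{-k}x)|^2\,dx \ge cL^{m - 2\alpha m^2} \ge cN^{-\alpha\varepsilon_0}. \tag{5.6} \]
The result: $2\varepsilon_0\log N \ge N^{1 - 4\alpha\varepsilon_0 - \varepsilon_0}$, i.e., $N \le N^*$ if $\varepsilon_0$ is small enough. In other words:

Proposition 22. Inequality (5.6) is sufficient to prove Theorem 21. Further, inequality (5.6) can be deduced from Propositions 23 and 24, as will be seen shortly.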

So let us prove inequality (5.6).
First, let us write $\prod_{k=1}^{n} \varphi_t(L^{-k}x) = P_t(x) = P_{1,t}(x)P_{2,t}(x)$, where $P_2$ is the low frequency part, and $P_1$ has medium and high frequencies:
\[ P_{1,t}(x) := \prod_{k=1}^{n-m} \varphi_t(L^{-k}x) = \nu_{n-m}(x), \]
\[ P_{2,t}(x) = \prod_{k=n-m}^{n} \varphi_t(L^{-k}x) = \nu_m(L^{m-n}x). \]
We want the following:

Proposition 23. Let $t \in E$ be fixed. Then
\[ \int_{L^{n-m}}^{L^n} |P_{1,t}(x)|^2\,dx \ge CL^m. \]
Recall that we defined the set $\tilde E$, $|\tilde E| > |E|/2$, and we assume that
\[ |E| > 1/K. \tag{5.7} \]
Recall that we denoted $I = [L^{n-m}, L^n]$.
We also want a proportion of the contribution to the integral separated away from the complex zeroes of $P_{2,t}$:
Proposition 24. Let $SSV(t) := \{x \in I : |P_{2,t}(x)| \le L^{-\alpha m^2}\}$. Suppose also that $E$ is unable to hide, that is (5.7) is valid. Then there exists a subset $\tilde E \subset E$, $|\tilde E| \ge 1/4K$, such that for every $\theta \in \tilde E$ one has
\[ \int_{SSV(t)} |P_{1,t}(x)|^2\,dx\,dt \le 2cL^m, \]
where $2c$ is less than the $C$ from Proposition 23. In particular,
\[ \frac{1}{|\tilde E|}\int_{\tilde E}\int_{SSV(t)} |P_{1,t}(x)|^2\,dx\,dt \le cL^m. \]

Remark 8. The set $SSV(t)$ is so named because it is the set of small values of $P_2$ on $I$. Combining this with Proposition 23,
\[ \int_{L^{n-m}}^{L^n} |P_{1,t}(x)|^2|P_{2,t}(x)|^2\,dx \ge \int_{I \setminus SSV(t)} |P_{1,t}(x)|^2\cdot L^{-2\alpha m^2}\,dx \ge cL^{m - 2\alpha m^2}, \]
which gives (5.6), exactly what we promised to obtain from Propositions 24, 23. Thus Propositions 23 and 24 suffice to prove Theorem 21, and Proposition 22 has been demonstrated.

Remark 9. We want to show that for $N \gg 1$, (5.7) fails. After showing this, we will have:
\[ |E| \le 1/K = L^{-\frac{m}{2}} = e^{-C(L)\varepsilon_0(\log N)^{1/2}}, \tag{5.8} \]
proving the main result, since the projections decay quickly enough on $E^c$.

First, let us fix $t \in E$ and prove Proposition 23.

Proof. We first use Salem’s trick on $\int_0^{L^n} |P_1(x)|^2\,dx$. Let $h(x) := (1 - |x|)\chi_{[-1,1]}(x)$, and note that $\hat h(\alpha) = C\frac{1 - \cos\alpha}{\alpha^2} \ge 0$. Then if we write $P_1 = L^{m-n}\sum_{j=0}^{L^{n-m}-1} e^{i\alpha_j x}$, we get
\[ \int_0^{L^n} |P_1(x)|^2\,dx \ge \frac{1}{2}\int_{-L^n}^{L^n} h(L^{-n}x)|P_1(x)|^2\,dx \ge C(L^{m-n})^2\Big[L^n\cdot L^{n-m} + \sum_{j \ne k} L^n\hat h(L^n(\alpha_j - \alpha_k))\Big] \ge CL^m. \]
To show that this is not concentrated on $[0, L^{n-m}]$, we will use Theorem 20 and Lemma 40. We get
\[ \int_0^{L^{n-m}} |P_1(x)|^2\,dx = \int_0^{L^{n-m}} |\nu_{n-m}(x)|^2\,dx = L^{2(m-n)}\int_0^{L^{n-m}} \Big|\sum_{j=0}^{L^{n-m}-1} e^{i\alpha_j x}\Big|^2\,dx \le CK \le CL^{\frac{m}{2}}. \tag{5.9} \]
So now we have Proposition 23. The greater challenge will be Proposition 24.
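The positivity used in Salem’s trick rests on the closed form of the Fourier transform of the triangle function $h$. The Python sketch below compares a midpoint-rule approximation of $\int_{-1}^{1} h(x)e^{-i\alpha x}\,dx$ with $2(1 - \cos\alpha)/\alpha^2$. The normalization of the Fourier transform (no $2\pi$ factors, which gives $C = 2$), the function names, and the sample frequencies are assumptions made for this illustration.

```python
import math

def h(x):
    """Triangle function h(x) = 1 - |x| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(x))

def h_hat_numeric(alpha, n=20000):
    """Midpoint-rule approximation of the integral of h(x) e^{-i alpha x} over [-1, 1].

    h is even, so the transform is real and only the cosine part contributes.
    """
    step = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * step
        total += h(x) * math.cos(alpha * x)
    return total * step

def h_hat_closed(alpha):
    """Closed form 2 (1 - cos alpha) / alpha^2 under the normalization above."""
    return 2.0 * (1.0 - math.cos(alpha)) / alpha ** 2

samples = (0.5, 1.0, 3.0, 7.5)
pairs = [(h_hat_numeric(a), h_hat_closed(a)) for a in samples]
for a, (num, exact) in zip(samples, pairs):
    print(a, num, exact)
```

Because the transform is nonnegative, every off-diagonal term $L^n\hat h(L^n(\alpha_j - \alpha_k))$ in the expansion of $\int h(L^{-n}x)|P_1(x)|^2\,dx$ can be discarded, leaving the diagonal contribution as a lower bound.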

5.1.3 The proof of Proposition 24

Recall that $SSV(t) := \{x \in I = [L^{n-m}, L^n] : |P_{2,t}(x)| \le L^{-\alpha m^2}\}$.
To get Proposition 24, we will split $P_{1,t}$ into two parts, $P^{\mathrm{med}}_{1,t}(x)$ and $P^{\mathrm{high}}_{1,t}(x)$, corresponding to medium and high frequencies.
A straightforward application of Lemma 40 to high frequency part $P^{\mathrm{high}}_{1,t}(x)$ will get us part of the way there (see Proposition 26), and the claim 25 applied to medium frequency term $P^{\mathrm{med}}_{1,t}(x)$ will further sharpen the final estimate to what we need. This latter refinement will be a “for most t...” statement about $P^{\mathrm{med}}_{1,t}(x)$ that contributes a small amount to the possible size of $E$.
Naturally, $P^{\mathrm{med}}_{1,t}(x)$ and $P^{\mathrm{high}}_{1,t}(x)$ are defined as the medium and high frequency parts of $P_{1,t}(x)$. Below, $\ell := \alpha m$:
\[ P^{\mathrm{med}}_{1,t}(x) := \prod_{k=n-m-\ell}^{n-m-1} \varphi_t(L^{-k}x) = \hat\nu_{\ell-1}(L^{m+\ell-n}x), \]
\[ P^{\mathrm{high}}_{1,t}(x) := \prod_{k=1}^{n-m-\ell-1} \varphi_t(L^{-k}x) = \hat\nu_{n-m-\ell-1}(x). \]
What follows is the first claim of this subsection. The idea is simply that $|\Phi_t| \le 1$, with equality only when the exponents all belong to $2\pi\mathbb{Z}$. As it is quite difficult for this to happen simultaneously for even just two exponential terms, one gains a lot of information about the decimal expansion of $t$ whenever $|\Phi_t|$ is close to 1.
Proposition 25. For all sufficiently small positive numbers $\tau \le \tau_0$ and for all sufficiently large $m$ and $\ell = \alpha m$ there exists an exceptional set $H$ of directions $t$ such that
\[ |H| \le L^{-\ell/2}, \tag{5.10} \]
\[ \forall t \notin H\ \ \forall x \in [L^{n-m}, L^n], \quad |P^{\mathrm{med}}_{1,t}(x)| \le e^{-\tau\ell}. \tag{5.11} \]

Proof. Notice that
\[ \phi_\theta(r) = \Phi(r\cos\theta, r\sin\theta), \]
where for $x = (x_1, x_2)$,
\[ \Phi(x) := \Phi(x_1, x_2) = \frac{1}{L}\sum_{l=1}^{L} e^{2\pi i\langle a_l, x\rangle}. \]
As some pair of vectors $a_l - a_1$, $l \in [1, L]$ must span a two-dimensional space, we can assume without loss of generality (make an affine change of variable) that
\[ a_1 = (0, 0), \quad a_2 = (1, 0), \quad a_3 = (0, 1). \]
Then
\[ \Phi(x_1, x_2) = \frac{1}{L}\Big(1 + e^{2\pi ix_1} + e^{2\pi ix_2} + \sum_{l=4}^{L} e^{2\pi i\langle a_l, x\rangle}\Big). \tag{5.12} \]
We make the change of variable $y = (y_1, y_2) = L^{-(n-m)}x$. Let $R_t$ denote the ray $y_2 = ty_1$. Then we need to prove that there exists a small set $H$ of $t$’s such that if $y \in R_t \cap \{y : |y| \in [1, L^m]\}$, $t \notin H$ then
\[ |\Phi(y)\cdot\,\cdots\,\cdot\Phi(L^{\ell}y)| \le e^{-\tau\ell}. \tag{5.13} \]
We consider only the case $t \in [0, 1]$, all our $y$’s will be such that $0 < y_2 \le y_1$, and as $|y| \ge 1$ we have $y_1 \ge \frac{1}{\sqrt{2}}$.
It is very difficult if at all possible for function $\Phi$ to satisfy $|\Phi(y)| = 1$. In fact, looking at (5.12) we can see that
\[ |\Phi(y)| \le 1 - b\,\mathrm{dist}(y, \mathbb{Z}^2) \le e^{-b\,\mathrm{dist}(y, \mathbb{Z}^2)}. \tag{5.14} \]
Therefore, we are left to understand that there are few $t$’s such that
\[ \exists y \in R_t,\ y_1 \in \Big[\frac{1}{\sqrt{2}}, L^m\Big] : \quad b\cdot\sum_{k=0}^{\ell} \mathrm{dist}(L^k y, \mathbb{Z}^2) \le \tau\ell. \tag{5.15} \]
Now may be a good time to consult Figure 5.1.

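[Figure 5.1 (image not reproduced). Axes $y_1$, $y_2$; the ray $y_2 = ty_1$ with marked points $(x, tx)$, $(Lx, Ltx)$, $(L^2x, L^2tx)$, $(L^3x, L^3tx)$; label $\phi_t(x) = \varphi(y) = \frac{1}{L}(1 + e^{iy_1} + e^{iy_2} + e^{i(ta_4y_1 + b_4y_1)} + \dots)$.]

Figure 5.1: It is quite difficult for a large number of factors of $P_{1,t}(x)$ to be close to 1 simultaneously. In particular, $L^k x$ and $L^k tx$ must be close to $\mathbb{Z}$ for many values of $k$.
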

Fix $y \in R_t$ as above. If (5.15) holds then for 90 per cent of $k$'s one has
\[ \mathrm{dist}(L^k y, \mathbb{Z}^2) \le 10\tau. \tag{5.16} \]
Denote $Z_y := \{k \in [0, \ell] : \mathrm{dist}(L^k y, \mathbb{Z}^2) \le 10\tau\}$. We know that
\[ |Z_y| \ge 0.9\,\ell. \]

Let us call a scenario the collection $s := \{m_1; k_1, \dots, k_{0.1\ell}\}$, where $m_1 = 0, \dots, m$ and $0 \le k_1 < \dots < k_{0.1\ell}$.
Every $t$ such that there exists $y$ such that (5.15) holds generates several scenarios according to
\[ y_1 \in [L^{m_1 - 1}, L^{m_1}) \]
and according to what the set $[0, \ell] \setminus Z_y$ is; this is the set $k_1, \dots, k_{0.1\ell}$ of the scenario.
We will calculate the number of scenarios later. Now let us fix a scenario $s = \{m_1; k_1, \dots, k_{0.1\ell}\}$, and let us estimate the measure of the set $T(s)$,
\[ T(s) := \{t \in (0,1) : \exists y,\ y_2 = t y_1,\ y_1 \in [L^{m_1-1}, L^{m_1}) \text{ such that } [0, \ell] \setminus Z_y = \{k_1, \dots, k_{0.1\ell}\}\}. \]
To do that for this fixed scenario we fix a net. To explain what a net is we fix
\[ a := \frac{\log\frac{100}{\eta}}{\log L} + 1, \]
where $\eta = C\tau$ and $C$ is an absolute constant to be chosen soon.
A net is a collection $N(s) := \{n_1, \dots, n_j\}$, $n_1 < n_2 < \dots$, where every $n_i$ is not among the $k_j$ included in the scenario, $j \ge \frac{3\ell}{4a} + 1$, and
\[ n_{i+1} - n_i \ge 2a. \]

Given a scenario it is always possible to build a net. In fact, we just delete from $[0, \ell]$ the numbers $k_1, \dots, k_{0.1\ell}$ belonging to the scenario, and we are left with at least $0.9\ell$ numbers. We choose an arithmetic progression with step $a$ (enumerating them anew first). This arithmetic progression will be long enough: its length is $j \ge \frac{3\ell}{4a}$, because after eliminating $k_1, \dots, k_{0.1\ell}$ we still have at least $0.9\ell$ numbers left. We mark the numbers of this progression. Then we put back $k_1, \dots, k_{0.1\ell}$. The marked numbers will form our net.
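The construction just described can be sketched in a few lines. This is a minimal illustration rather than the author's procedure verbatim: the function name `build_net` and the sample values of the number of factors `ell`, the step `a`, and the excluded indices are all hypothetical.

```python
def build_net(ell, excluded, a):
    """Mark every a-th element of {0, ..., ell} after deleting the excluded indices."""
    excluded = set(excluded)
    # Delete the scenario indices and enumerate the survivors anew.
    remaining = [k for k in range(ell + 1) if k not in excluded]
    # Arithmetic progression with step a in the new enumeration.
    return remaining[::a]

ell, a = 1000, 7
excluded = range(0, 1000, 10)   # a hypothetical scenario with 0.1 * ell indices
net = build_net(ell, excluded, a)
gaps = [n2 - n1 for n1, n2 in zip(net, net[1:])]
```

In this example the net has 129 elements, which exceeds $3\ell/(4a) \approx 107$, and consecutive marked indices are at least `a` apart in the original enumeration, which is the separation a step-`a` progression guarantees.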

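If $t \in T(s)$ then there exists $y = (y_1, t y_1)$ as above; in particular,
\[ \mathrm{dist}(L^{n_i} y, \mathbb{Z}^2) \le 10\tau \quad \forall n_i \in N(s). \]
Then there exist integers $p_1 \le q_1$ with $|L^{n_1}y_1 - q_1| < 10\tau$ and $|L^{n_1}y_2 - p_1| < 10\tau$, so
\[ \begin{aligned}
\Big|t - \frac{p_1}{q_1}\Big| &= \Big|\frac{L^{n_1}y_2}{L^{n_1}y_1} - \frac{p_1}{q_1}\Big| = \Big|\frac{L^{n_1}y_2 - p_1 + p_1}{L^{n_1}y_1 - q_1 + q_1} - \frac{p_1}{q_1}\Big| \\
&= \Big|\frac{(L^{n_1}y_2 - p_1 + p_1)q_1 - (L^{n_1}y_1 - q_1 + q_1)p_1}{(L^{n_1}y_1 - q_1 + q_1)q_1}\Big| \\
&\le \frac{|L^{n_1}y_2 - p_1||q_1| + |L^{n_1}y_1 - q_1||p_1|}{(q_1 - 10\tau)q_1} \le 40\tau\cdot\frac{1}{q_1}.
\end{aligned} \]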

As promised we choose $C$: $C = 40$, $\eta := 40\tau$, and we get
\[ \exists p_1 \le q_1:\quad \Big|t - \frac{p_1}{q_1}\Big| \le \eta\,\frac{1}{q_1}. \tag{5.17} \]
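The bound $|t - p_1/q_1| \le 40\tau/q_1$ can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it draws hypothetical integers $p \le q$ and perturbations smaller than $10\tau$ (standing in for $L^{n_1}y_2 - p_1$ and $L^{n_1}y_1 - q_1$), with the sample value $\tau = 0.01$.

```python
import random

def approximation_error(p, q, dy1, dy2):
    """|t - p/q| for t = Y2 / Y1, where Y1 = q + dy1 and Y2 = p + dy2."""
    t = (p + dy2) / (q + dy1)
    return abs(t - p / q)

tau = 0.01                      # hypothetical small tau
rng = random.Random(0)
worst_ratio = 0.0               # largest observed value of error / (40 * tau / q)
for _ in range(10_000):
    q = rng.randint(1, 10**6)
    p = rng.randint(0, q)
    dy1 = rng.uniform(-10 * tau, 10 * tau)
    dy2 = rng.uniform(-10 * tau, 10 * tau)
    err = approximation_error(p, q, dy1, dy2)
    worst_ratio = max(worst_ratio, err * q / (40 * tau))
```

A value of `worst_ratio` below 1 means that every sampled configuration respected the bound.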

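Next we choose integers $p_2 \le q_2$ with $|L^{n_2}y_1 - q_2| < 10\tau$ and $|L^{n_2}y_2 - p_2| < 10\tau$ and obtain
\[ \exists p_2 \le q_2:\quad \Big|t - \frac{p_2}{q_2}\Big| \le \eta\,\frac{1}{q_2}. \tag{5.18} \]
Notice also that because of $|L^{n_1}y_1 - q_1| < 10\eta$, $|L^{n_2}y_1 - q_2| < 10\eta$, $y_1 \ge 1/\sqrt{2}$, the smallness of $\tau$, and the fact that $n_2 - n_1 \ge 2a$, we get
\[ \frac{q_2}{q_1} \ge L^a \ge \frac{100}{\eta}. \tag{5.19} \]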

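We continue in the same vein for $i = 2, \dots, j - 1 \ge \frac{3\ell}{4a}$:
\[ \exists p_i \le q_i:\quad \Big|t - \frac{p_i}{q_i}\Big| \le \eta\,\frac{1}{q_i}. \tag{5.20} \]
Similarly, because of $|L^{n_i}y_1 - q_i| < 10\eta$, $|L^{n_{i+1}}y_1 - q_{i+1}| < 10\eta$, $y_1 \ge 1/\sqrt{2}$, the smallness of $\tau$, and the fact that $n_{i+1} - n_i \ge 2a$, we get
\[ \frac{q_{i+1}}{q_i} \ge L^a \ge \frac{100}{\eta}. \tag{5.21} \]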

Inequality (5.17) gives that $|T(s)| \le \eta$; inequalities (5.17) and (5.18) in conjunction with (5.19) give $|T(s)| \le \big(1 + \frac{1}{100}\big)\eta^2$; similarly, all inequalities (5.20), (5.21) together give
\[ |T(s)| \le (1.01\,\eta)^{\frac{3\ell}{4a}} \le e^{0.1\ell}\, L^{-\frac{3}{4}\ell(1 - \varepsilon(\eta))}. \]
Here we used of course that $a := \frac{\log\frac{100}{\eta}}{\log L} + 1$. Finally, if $\eta$ is sufficiently small we have
\[ |T(s)| \le L^{-\frac{2}{3}\ell}. \tag{5.22} \]

Let $S$ denote the set of all scenarios. Now we want to calculate the number of scenarios. This is easy:
\[ \#S \le m\cdot\binom{\ell}{0.1\ell} \le \ell\cdot\Big(\frac{10}{9}\Big)^{0.9\ell}\cdot 10^{0.1\ell}. \]
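The binomial estimate behind this count, $\binom{\ell}{0.1\ell} \le (10/9)^{0.9\ell}\, 10^{0.1\ell}$, can be checked directly for a few values of $\ell$. The sketch below compares logarithms to avoid overflow; the function names and the sample values of $\ell$ (multiples of 10) are hypothetical.

```python
import math

def log_index_choices(ell):
    """log of binom(ell, ell/10): ways to choose the indices k_1 < ... < k_{0.1 ell}."""
    return math.log(math.comb(ell, ell // 10))

def log_entropy_bound(ell):
    """log of (10/9)^(0.9 ell) * 10^(0.1 ell)."""
    return 0.9 * ell * math.log(10 / 9) + 0.1 * ell * math.log(10)

checks = {ell: (log_index_choices(ell), log_entropy_bound(ell))
          for ell in (10, 100, 1000, 10_000)}
```

For each sampled $\ell$ the first entry of the pair is no larger than the second, as the binomial theorem with $p = 0.1$ predicts.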

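We just proved that the measure of the set of all $t \in (0,1)$ such that one has (5.15),
\[ \exists y \in R_t,\ y_1 \in \Big[\tfrac{1}{\sqrt{2}}, L^m\Big]:\quad \sum_{k=0}^{\ell}\mathrm{dist}(L^k y, \mathbb{Z}^2) \le \tau\ell, \]
can be estimated as
\[ \le \ell\cdot\Big(\frac{10}{9}\Big)^{0.9\ell}\cdot 10^{0.1\ell}\cdot L^{-\frac{2}{3}\ell} \le L^{-\ell/2}. \]
Proposition 25 is proved. Except for a small set of exceptional directions, the uniform bound $|P_{1,t}(x)| < e^{-\tau\ell}$ holds.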

Here is the second claim of the subsection:

Proposition 26.
\[ t \in E \ \Rightarrow\ \int_{SSV(t)} |P_{1,t}(x)|^2\,dx \le C\,K\,L^m. \]
We will see in Section 5.3 that for each $t$, $SSV(t)$ is contained in $C\cdot L^m$ neighborhoods of size $L^{n-m-\ell}$ around the complex zeroes $\lambda_j$ of $P_2$.

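Fix $t$. Let
\[ I_j = [\lambda_j - L^{n-m-\ell}, \lambda_j + L^{n-m-\ell}], \tag{5.23} \]
where
\[ SSV(t) \subseteq \bigcup_j I_j. \tag{5.24} \]
Choose $j$ for which $\int_{I_j} |P_{1,t}(x)|^2\,dx$ is maximized. Then
\[ \int_{SSV(t)} |P_{1,t}(x)|^2\,dx \le CL^m \int_{I_j} |P_{1,t}(x)|^2\,dx \le CL^m (L^{\ell+m-n})^2 \int_{I_j} \Big|\sum_{k=0}^{n-m-\ell} e^{i\alpha_j x}\Big|^2 dx. \]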

As $|I_j| \le 2\cdot L^{n-m-\ell}$, Lemma 40 and the definition of $E$ give us Proposition 26.
The estimate for $t \in \tilde{E} \setminus H$ follows. If $|E| \ge 1/K$, $K = L^{m/2}$, $|\tilde{E}| > 1/2K$, and we also just proved that $|H| \le L^{-\ell/2}$, $\ell = \alpha m$ with large $\alpha$, we have a set $E' \subset \tilde{E} \setminus H$, $|E'| > 1/4K$, such that for every $t \in E'$
\[ \int_{SSV(t)} |P_1(r)|^2\,dr \le L^{-\ell}\int_{SSV(t)} |P_{1,t}(r)|^2\,dr \le C\,K\,L^m\cdot L^{-\alpha m}. \]
So we proved
\[ \int_{SSV(t)} |P_1(r)|^2\,dr \le c\,L^m \tag{5.25} \]
with $c$ as small as we wish. In particular, Proposition 24 is completely proved.

5.2 Two combinatorial lemmas

In this section, we will prove two combinatorial lemmas. The objective in each case is
to rigorously estimate one quantity by another, clearly related, quantity. The two, taken
together, reduce the problem of ﬁnding an upper bound in Buﬀon’s needle problem to the
problem of ﬁnding a bound on |E|.
For this section, regard the set E from Section 5.1 as parameterized by θ, and use the
variable x instead of s on the non-Fourier side, since we will not work on the Fourier side at
all during this section.

5.2.1 $|A^*_{K,N,\theta}|$ vs. $|L_{NK^\beta,\theta}|$

In this section, we show how Theorem 13 follows from Theorem 21. The theorem we prove here is the big brother of the micro-theorem converse.
First, let us define
\[ L_{N,\theta} := \mathrm{proj}_\theta\, J_N. \tag{5.26} \]

Theorem 27. Let $\beta > 1$ (we used $\beta = 3$ in the previous section). Let $K$ and $N$ be large enough, possibly depending on $L$. If $t \notin E$ (see definition (5.4) and use $\tau > 2$ as suggested), then $|L_{NK^\beta,\theta}| \le \frac{C}{K}$.
Proof. Let us use $\theta$ instead of $t$ and use $x$ for the space variable on the non-Fourier side, since we do not use Fourier analysis in this proof. Fix $\theta$, and for $j \in \mathbb{N}$, let $F_j := A^*_{K,jN,\theta} = \{x : f^*_{jN}(x) \ge K\}$. Let $F := F_1$. $\theta \notin E$ means $|F| \ge K^{-\tau}$, where $\tau > 2$ is fixed.
Note that this theorem is the sophisticated analog of the micro-theorem converse of Chapter 1.
Consider the discs of $J_N$. All discs are white initially. Now color green each disc lying above any $x \in F$. We will now consider the sets $J_{jN}$, for $j = 1, 2, \dots$, and label these discs as green or white according to these rules:
1) If a disc in $J_{jN}$ is green, its offspring in $J_{(j+1)N}$ are all green.
2) If a disc in $J_{jN}$ is white, its offspring in $J_{(j+1)N}$ are white except for those discs which are self-similar copies of the discs which were green in $J_N$.
Let $G_j$ denote the set of green discs in $J_{jN}$. Note that $\theta \notin E$ tells us that $|G_1|$ is fairly large; let us prove a statement to this effect. Consider $\phi_j(x) := \sum_{D \in G_j} \chi_{\mathrm{proj}_\theta(D)}$, and let $\phi(x) := \phi_1(x)$.
Proposition 28.
\[ \bigcup_{D \in G_j} D \subseteq \{x : M\phi_j > K/4\}. \]
Proof. When a disc $D$ in $G_j$ with projected center at $x_0$ has a white ancestor in $J_{(j-1)N}$ (that is, it is "green for the first time"), it is clear that $M\phi_j > K/4$ by taking the average of $\phi_j$ on $[x_0 - 2L^{-jN}, x_0 + 2L^{-jN}]$. In fact, the $L^1$ mass of the green discs above such an interval cannot decrease below this bound, simply because the offspring have $L^1$ mass summing to that of their parent, and the interval contains all of these $K$ discs entirely.

Proposition 29. F ⊆ {x : Mφ ≥ K/2}, where M is the (uncentered or centered; we will
take it to be centered) Hardy-Littlewood maximal operator.

Proof. Fix $x \in F$. By definition, $\exists n < N$ such that $f_n(x) \ge K$. Thus the interval $[x - 2L^{-n}, x + 2L^{-n}]$ contains the projections of $K$ green discs of $J_n$, i.e., $\phi(x) \ge K$. In fact, the total $L^1$ mass of the sum of characteristic functions of the children of these projected green discs remains constant as $n$ increases. So clearly
\[ M\phi(x) \ge \frac{L^n}{4}\int_{x-2L^{-n}}^{x+2L^{-n}} f_{N,\theta}(x)\,dx \ge K\cdot 2L^{-n}\cdot\frac{L^n}{4} \ge K/2. \]

Of course one sees where this is headed:
\[ |F| \le |\{x : M\phi(x) > K/2\}| \lesssim \frac{1}{K}\|\phi\|_1 = \frac{2}{K}L^{-N}|G_1|. \tag{5.27} \]
Since $\theta \notin E$, this immediately proves:

Proposition 30.
\[ |G_1| \gtrsim K^{1-\tau}L^N. \]

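Let $P_j$ denote $|G_j|\cdot L^{-jN}$, that is, the proportion of discs of $J_{jN}$ which are green. Note that
\[ Q_j := 1 - P_j = \Big(1 - \frac{|G_1|}{L^N}\Big)^j \le \big(1 - cK^{1-\tau}\big)^j = \Big(1 - \frac{cjK^{1-\tau}}{j}\Big)^j \approx e^{-cjK^{1-\tau}}. \]
Note that
\[ \Big|\bigcup_{W \text{ a white disc of } J_{jN}} \mathrm{proj}_\theta(W)\Big| \le 2Q_j \lesssim e^{-cjK^{1-\tau}}. \]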
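The decay of the white proportion under the two coloring rules can be illustrated with a small count. This is a sketch under hypothetical numbers ($L^N = 64$ discs in $J_N$, of which 8 are green); the function name `white_fraction` does not come from the text.

```python
import math

def white_fraction(green_first_gen, total_first_gen, j):
    """Proportion of discs of J_{jN} that are still white after j rounds."""
    white_offspring = total_first_gen - green_first_gen
    white_count, disc_count = 1, 1      # start from a single white root disc
    for _ in range(j):
        white_count *= white_offspring  # a white disc keeps this many white offspring
        disc_count *= total_first_gen   # every disc has this many offspring
    return white_count / disc_count

# Hypothetical numbers: 64 discs in J_N, 8 of them green.
simulated = [white_fraction(8, 64, j) for j in range(1, 6)]
closed_form = [(1 - 8 / 64) ** j for j in range(1, 6)]
exp_bound = [math.exp(-j * 8 / 64) for j in range(1, 6)]
```

The simulated proportions agree with $(1 - |G_1|/L^N)^j$ and stay below $e^{-j|G_1|/L^N}$, which is the exponential decay used in the estimate for the white discs.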
Also, we saw already that the remaining discs $D$ of $J_{jN}$ are exactly the green discs, i.e., $D \in G_j$. Using Proposition 28, we see that
\[ \Big|\bigcup_{D \in G_j} \mathrm{proj}_\theta(D)\Big| \le |\{x : M\phi_j(x) > K/4\}| \lesssim \frac{1}{K}\|\phi_j\|_1 \le \frac{2}{K}. \]
In particular, if $j > K^{\tau-1+\varepsilon} = K^\beta$, there are few enough white discs, and all is well.
This completes the proof of Theorem 27.


5.2.2 $\sup_{n \le N} \|f_{n,\theta}\|_2^2$ vs. $|A^*_{K,N,\theta}|$

Theorem 31. Let $\theta \in E$. Then
\[ \max_{n:\,0 \le n \le N} \|f_{n,\theta}\|^2_{L^2(\mathbb{R})} \le C\,K. \]
To prove this we ﬁrst need the following claim, which is the main combinatorial assertion
of this subsection. It repeats the one in [18] but we give a slightly diﬀerent proof.
We fix a direction $\theta$ and identify the line on which we project with $\mathbb{R}$. If $x \in \mathbb{R}$, then by $N_x$ we denote the line orthogonal to $\mathbb{R}$ and passing through the point $x$; we call $N_x$ a needle.
Recall that $A^*_{K,N,\theta} := \{x \in \mathbb{R} : f^*_{N,\theta}(x) > K\}$. When $N$ and $\theta$ are understood from context, we can write $F_K := A^*_{K,N,\theta}$.
Theorem 32. There exists an absolute constant $C$ such that for any large enough $K$, $M$, and $N$,
\[ |F_{2LKM}| \le CLK\,|F_K|\cdot|F_M|. \tag{5.28} \]

Proof. One can see this by considering maximal discs above F2LK . Suppose x ∈ F2LK .
Then there are at least 2LK “light green” (relative to x and n) discs of some generation
n ≤ N above x; call these Lx,n . In generation n − 1, there are still at least 2K discs above
x - namely, the fathers of the 2LK discs of generation n. Keep going back one generation
until you reach j0 = j0 (x), the largest j < n such that the generation j ancestors of the
light green discs of Lx,n are fewer than 2LK in number. Call these discs of generation
j0 (x) the green discs (relative to x and n), or Gx,n . Then |Gx,n | ≥ 2K. Form the union
G = ∪x,n Gx,n of green discs. The discs of this union are just called green.
Each green disc is maximal for some (x, n), but it may be the case that a green disc
above (x1 , n1 ) is properly contained in a green disc above (x2 , n2 ). We want our maximal
discs to be truly maximal, so mark as dark green all green discs which are not sub-discs of
a larger green disc. Call the family of dark green discs D.
The largest dark green disc has some radius L−n0 . Call one such dark green disc Q0 .
Q0 ∈ Gx,n for some (x, n), so it belongs to a stack of K or more green discs. In fact, they
are all dark green by the maximality of Q0 .
Let $I_0 = 20\,\mathrm{proj}_\theta(Q_0)$, where the rescaling is concentric. Consider all $Q \in D$ whose projection intersects $I_0$, and call the set of such $Q$ by the name $F(Q_0)$. For all $x \in \mathbb{R}$, the needle at $x$ intersects fewer than $2LK$ discs from the set $F(Q_0)$ (otherwise, larger green discs could be found by taking ancestors, a contradiction). Since $F(Q_0)$ lives above $I_0 + [-2L^{-n_0}, 2L^{-n_0}]$, $|F(Q_0)| < 100LK$.
Let $x_0$ be the projected center of $Q_0$. Let $J_0 := [x_0, x_0 + L^{-n_0}]$ or $J_0 := [x_0 - L^{-n_0}, x_0]$, whichever contains at least $K$ projected centers of dark green discs. Thus $J_0 \subseteq F_K$ and
\[ |J_0| \ge L^{-n_0} \gtrsim |I_0|. \]

Lemma 33. $|F_{2LKM} \cap I_0| \lesssim KL\,|I_0||F_M| \lesssim KL\,|J_0||F_M|$.


Proof. Let x ∈ F2LKM ∩ I0 . Note that F2LKM ∩ I0 ⊆ F2LK ∩ I0 ⊆ F(Q0 ). So in
generation n0 , x has fewer than 2LK discs above it, whose projected lengths sum to at most
cKL|I0 |. For some n ≤ N , the stack must reach height 2LKM , which means that one of
the discs of F(Q0 ) must give birth to a stack of M discs. That is, x must belong to one of
≤ 2KL self-similar copies of FM living inside of F(Q0 ). The lemma follows.
To finish the proof, one needs to induct. That is, one needs intervals $I_1, I_2, \dots$ covering $F_{2LKM}$ such that comparable subintervals $J_1, J_2, \dots$ can be substituted for $I_0$ and $J_0$ in the statement of this last lemma. This, in fact, can be done; one deletes $\bigcup_{r=1}^{s} I_r$ from $F_{2KL}$ and starts the maximality argument over again to get $I_{s+1}$ and $J_{s+1}$. Note that by maximality, it is impossible for the sets $I_r$ to overlap too much; each is centered outside of the previous, and they only shrink. The problem is finite, so in fact all of $F_{2LKM}$ is exhausted in this way.
This completes the proof of Theorem 32.

Now we can prove Theorem 31.

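Proof. Let $E_j := \{x : f_{n,\theta}(x) > (2LK)^{j+1}\}$, $j = 0, 1, \dots$. We know by Theorem 32 that
\[ |E_j| \le (CLK)^j\,|E_0|^{j+1}. \]
Hence,
\[ \begin{aligned}
\int f_{n,\theta}(x)^2\,dx &\le 2LK\int f_{n,\theta}(x)\,dx + \sum_{j=0}^{\infty}\int_{E_j \setminus E_{j+1}} f_{n,\theta}(x)^2\,dx \\
&\le 2LK\int f_{n,\theta}(x)\,dx + \sum_{j=0}^{\infty}(2LK)^{j+2}\int_{E_j \setminus E_{j+1}} f_{n,\theta}(x)\,dx \\
&\le 2CLK + \sum_{j=0}^{\infty}(2LK)^{j+2}(CLK)^j\,|E_0|^{j+1}.
\end{aligned} \]
If $|\{x : f^*_N(x) > K\}| \le 1/K^\tau$, $\tau > 2$, then for all $n \le N$ we can immediately read the previous inequality as
\[ \int f_{n,\theta}(x)^2\,dx \le C(\tau)\,K. \]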
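For readers who want the last step spelled out, here is a short worked computation. It assumes, as in the hypothesis above, that $|E_0| \le K^{-\tau}$ (note that $E_0 \subseteq \{x : f^*_N(x) > K\}$), and that $K$ is large enough that $2CL^2K^{2-\tau} \le 1/2$; the explicit constants are illustrative and are not claimed by the text.

```latex
\sum_{j=0}^{\infty} (2LK)^{j+2}(CLK)^{j}\,K^{-\tau(j+1)}
  = 4L^{2}K^{2-\tau}\sum_{j=0}^{\infty}\bigl(2CL^{2}K^{2-\tau}\bigr)^{j}
  = \frac{4L^{2}K^{2-\tau}}{1-2CL^{2}K^{2-\tau}}
  \le 8L^{2}K^{2-\tau} \le 8L^{2}.
```

Since $\tau > 2$, the ratio $2CL^2K^{2-\tau}$ tends to 0 as $K$ grows, so the series is bounded and the term $2CLK$ dominates; in this sketch the resulting constant also depends on $L$.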


5.3 Controlling SSV(t)

Now we have to consider $P_{2,t}(r) = \phi_t(r)\,\phi_t(L^{-1}r)\cdots\phi_t(L^{-m}r)$. We are interested in the set
\[ SSV(t) := \{r \in [1, L^m] : |P_{2,t}(r)| \le L^{-Am^2}\}. \]
We will be using the so-called Turán's lemma:
Lemma 34. Let $f(x) = \sum_{l=1}^{L} c_l e^{\lambda_l x}$, let $E \subset I$, $I$ being any interval. Then
\[ \sup_I |f(x)| \le e^{\max|\Re\lambda_n|\,|I|}\Big(\frac{A|I|}{|E|}\Big)^{L}\sup_E |f(x)|. \]
Here $A$ is an absolute constant.
In this form it is proved by F. Nazarov [21].
Now let us consider any square $Q = [x-1, x+1] \times [-1, 1]$. We call $\frac{1}{2}Q$ the concentric square of half the size.
Lemma 35. With uniform constant $C$ depending only on $L$ one has
\[ \sup_Q |\phi_t(z)| \le C \sup_{\frac{1}{2}Q} |\phi_t(z)|. \]
Proof. Let $z_0 = x_0 + iy_0$ be a point of maximum in the closure of $Q$. We first want to compare $|\phi_t(z_0)|$ and $|\phi_t(x_0)|$. Consider $f_{x_0}(y) := \phi_t(x_0 + iy)$. Notice that uniformly in $Q$ and $x_0$
\[ |f_{x_0}(y)| \le C(L). \]
This means that $|f_{x_0}(y)| \ge \frac{1}{2}|f_{x_0}(0)|$ on an interval of uniform length $c(L)$.

Notice also that the exponents $\lambda_l(t)$, $l = 1, \dots, L$, encountered in $\phi_t$ are all uniformly bounded. Then applying Lemma 34 we get
\[ |\phi_t(z_0)| = |f_{x_0}(y_0)| \le C'(L)\,|f_{x_0}(0)|. \]
Now consider $F(x) = \phi_t(x)$. We want to compare $F(x_0) = f_{x_0}(0) = \phi_t(x_0)$ with
\[ \max_{[x - \frac{1}{2},\, x + \frac{1}{2}]} |F(x)|. \]
By Lemma 34 we get again
\[ |f_{x_0}(0)| = |F(x_0)| \le \sup_{[x-1,\,x+1]} |F(x)| \le C'(L)\sup_{[x-1/2,\,x+1/2]} |F(x)| \le C'(L)\sup_{\frac{1}{2}Q} |\phi_t(z)|. \]
Combining the last two displayed inequalities we get Lemma 35 completely proved.

Lemma 36. With uniform constant $C$ depending only on $L$ (and not on $m$) one has
\[ \sup_Q |\phi_t(L^{-k}z)| \le C\sup_{\frac{1}{2}Q} |\phi_t(L^{-k}z)|, \quad k = 0, \dots, m. \]
The proof is exactly the same. We just use that the exponents $L^{-k}\lambda_l(t)$, $l = 1, \dots, L$, encountered in $\phi_t(L^{-k}\,\cdot)$ are all uniformly bounded.
By the complex analysis lemmas from Section 5.3.1 we know that Lemma 36 implies that every $\frac{1}{2}Q$ has at most $M$ (depending only on $L$) zeros of $\phi_t(z)$. And if we denote them by $\mu_1, \dots, \mu_M$ then
\[ \{x \in \tfrac{1}{2}Q \cap \mathbb{R} : |\phi_t(x)| \le L^{-M\ell}\} \subseteq \bigcup_{i=1}^{M} B(\mu_i, L^{-\ell}). \tag{5.29} \]

Consider $\mu_1, \dots, \mu_S$ being all zeros of $P_{2,t}$ in $[1/2, L^m + 1] \times [-1/2, 1/2]$. By the above-mentioned lemmas from Section 5.3.1 and by Lemma 36 we get that
\[ S \le M(L)\,L^m. \]
From (5.29) it is immediate that
\[ \{x \in [1, L^m] : |P_{2,t}(L^{-(n-m)}x)| \le L^{-M\ell m}\} \subset \bigcup_{i=1}^{ML^m} B(\mu_i, L^{-\ell}). \tag{5.30} \]
Changing the variable $y = L^{n-m}x$ we get the structure of the set of small values used above during the proof of Proposition 26:
\[ SSV(t) \subset \bigcup_{i=1}^{CL^m} I_i, \tag{5.31} \]
where each interval $I_i$ has length $2L^{n-m-\ell}$.
In this section, we also include Lemmas 37 and 38. Given a bounded holomorphic function on the disc, its supremum, and an interior non-zero value, these lemmas bound the number of zeroes and contain the set of small values within certain neighborhoods of these zeroes. They are somewhat standard, but are included for completeness.

5.3.1 A Blaschke estimate

Lemma 37. Let $D$ be the closed unit disc in $\mathbb{C}$. Suppose $\phi$ is holomorphic in an open neighborhood of $D$, $|\phi(0)| \ge 1$, and the zeroes of $\phi$ in $\frac{1}{2}D$ are given by $\lambda_1, \lambda_2, \dots, \lambda_M$. Let $C = \|\phi\|_{L^\infty(D)}$. Then $M \le \log_2(C)$.
Proof. Let
\[ B(z) = \prod_{k=1}^{M}\frac{z - \lambda_k}{1 - \bar{\lambda}_k z}. \]
Then $|B| \le 1$ on $D$, with equality on the boundary. If we let $g := \frac{\phi}{B}$, then $g$ is holomorphic and nonzero on $\frac{1}{2}D$, and $|g(e^{i\theta})| \le C$ $\forall\theta \in [0, 2\pi]$. Thus $|g(0)| \le C$ by the maximum modulus principle. So we have
\[ C \ge |g(0)| = \frac{|\phi(0)|}{|B(0)|} \ge \prod_{k=1}^{M}\frac{1}{|\lambda_k|} \ge 2^M. \]
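A small numerical experiment illustrates the conclusion $M \le \log_2(C)$. The sketch below is an illustration only: it uses the hypothetical test function $\phi(z) = \prod_k (1 - z/\lambda_k)$, which satisfies $\phi(0) = 1$, with three hypothetical zeros inside $\frac{1}{2}D$, and approximates the supremum over the disc by sampling the unit circle (where the maximum is attained).

```python
import cmath
import math

def sup_on_unit_circle(zeros, samples=3600):
    """Approximate the max over |z| = 1 of |phi(z)|, phi(z) = prod_k (1 - z / lambda_k)."""
    best = 0.0
    for n in range(samples):
        z = cmath.exp(2j * math.pi * n / samples)
        value = 1.0
        for lam in zeros:
            value *= abs(1 - z / lam)
        best = max(best, value)
    return best

zeros = [0.4, -0.4, 0.4j]        # hypothetical zeros, all inside (1/2)D
C = sup_on_unit_circle(zeros)    # sampled lower bound for the sup norm on D
M = len(zeros)
```

Because sampling can only underestimate the true supremum, finding `M <= math.log2(C)` for the sampled value is consistent with the lemma.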

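Lemma 38. In the same setting as Lemma 37, the following is also true for all $\delta \in (0, 1/3)$:
\[ \{z \in \tfrac{1}{4}D : |\phi| < \delta\} \subseteq \bigcup_{1 \le k \le M} B(\lambda_k, \varepsilon), \quad\text{where}\quad \varepsilon := \frac{9}{16}(3\delta)^{1/M} \le \frac{9}{16}(3\delta)^{1/\log_2(C)}. \]
Proof. Let $\delta \in (0, 1/3)$, and let $z \in \frac{1}{4}D$ be such that $|z - \lambda_k| > \varepsilon$ $\forall k$. Note that $g$ is harmonic and nonzero on $\frac{1}{2}D$ with $|g(0)| \ge 2^M$. Thus Harnack's inequality ensures that $|g| \ge \frac{1}{3}2^M$ on $\frac{1}{4}D$, so there
\[ |\phi(z)| \ge |g(z)B(z)| \ge \frac{1}{3}2^M\prod_{k=1}^{M}\Big|\frac{z - \lambda_k}{1 - \bar{\lambda}_k z}\Big| \ge \Big(\frac{16\varepsilon}{9}\Big)^{M}\frac{1}{3} = \delta. \]
We can conclude the proof by the contrapositive.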
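The last inequality in the proof compresses two elementary steps. Writing $\varepsilon$ for the radius of the balls $B(\lambda_k, \cdot)$ in the statement of Lemma 38, and using only $|z| \le \frac{1}{4}$ and $|\lambda_k| \le \frac{1}{2}$, they can be sketched as follows.

```latex
|1-\bar\lambda_k z| \le 1 + |\lambda_k|\,|z| \le 1 + \tfrac12\cdot\tfrac14 = \tfrac98
\quad\Longrightarrow\quad
\Bigl|\frac{z-\lambda_k}{1-\bar\lambda_k z}\Bigr| \ge \frac{8}{9}\,|z-\lambda_k| > \frac{8\varepsilon}{9},
\qquad
\frac13\,2^{M}\Bigl(\frac{8\varepsilon}{9}\Bigr)^{M}
  = \frac13\Bigl(\frac{16\varepsilon}{9}\Bigr)^{M}
  = \frac13\cdot 3\delta = \delta .
```

The final equality is just the definition $\varepsilon = \frac{9}{16}(3\delta)^{1/M}$ rearranged.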

5.4 A localized upper bound on $\|P_1\|_2$

By manipulating some estimates with Poisson kernels, it is possible to localize information about $\|f_n\|_2$ to say something about $\|P_1\cdot\chi_I\|_2$ for an arbitrary interval $I$. We used this to show that $P_1$ doesn't "live too much on small intervals," in particular, near the origin, $[0, L^{n-m}]$; this lemma is used (in the form of Corollary 41) to get (5.9).
The first claim, Lemma 39, uses the Carleson imbedding theorem. It can be skipped, though, as a stronger version, Lemma 40, is proved using general $H^2$ theory on the upper half-plane $\mathbb{C}_+$. The Carleson imbedding theorem and some $H^p$ theory can be found in [10] and its references.
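Lemma 39. Let $j = 1, 2, \dots, k$, $c_j \in \mathbb{C}$, $|c_j| = 1$, and $\alpha_j \in \mathbb{R}$. Let $A := \{\alpha_j\}_{j=1}^{k}$. Then
\[ \int_0^1 \Big|\sum_{j=1}^{k} c_j e^{i\alpha_j y}\Big|^2 dy \le C\,k\cdot\sup_{I \text{ a unit interval}} \#\{A \cap I\}. \]
Proof. Let $A_1 := \{\mu = \alpha + i : \alpha \in A\}$. Let $\nu := \sum_{\mu \in A_1}\delta_\mu$. This is a measure in $\mathbb{C}_+$. Obviously its Carleson constant
\[ \|\nu\|_C := \sup_{J \subset \mathbb{R},\ J \text{ is an interval}} \frac{\nu(J \times [0, |J|])}{|J|} \]
can be estimated as follows:
\[ \|\nu\|_C \le 2\sup_{I \text{ a unit interval}} \#\{A \cap I\}. \tag{5.32} \]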

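Recall that
\[ \forall f \in H^2(\mathbb{C}_+):\quad \int_{\mathbb{C}_+} |f(z)|^2\,d\nu(z) \le C_0\,\|\nu\|_C\,\|f\|^2_{H^2}, \tag{5.33} \]
where $C_0$ is an absolute constant. Now we compute
\[ \int_0^1 \Big|\sum_{j=1}^{k} c_j e^{i\alpha_j y}\Big|^2 dy \le e^2\int_0^1 \Big|\sum_{j=1}^{k} c_j e^{i(\alpha_j + i)y}\Big|^2 dy \le e^2\int_0^\infty \Big|\sum_{j=1}^{k} c_j e^{i(\alpha_j + i)y}\Big|^2 dy = e^2\int_{\mathbb{R}} \Big|\sum_{\mu \in A_1}\frac{c_\mu}{x - \mu}\Big|^2 dx, \]
where $c_\mu := c_j$ for $\mu = \alpha_j + i$. The last equality is by Plancherel's theorem.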
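We continue
\[ \begin{aligned}
\int_{\mathbb{R}} \Big|\sum_{\mu \in A_1}\frac{c_\mu}{x - \mu}\Big|^2 dx &= \sup_{f \in H^2(\mathbb{C}_+),\ \|f\|_2 \le 1} \Big|\Big\langle f, \sum_{\mu \in A_1}\frac{c_\mu}{x - \mu}\Big\rangle\Big|^2 \\
&= 4\pi^2 \sup_{f \in H^2(\mathbb{C}_+),\ \|f\|_2 \le 1} \Big|\sum_{\mu \in A_1} c_\mu f(\mu)\Big|^2 \le C\,\#\{A_1\} \sup_{f \in H^2(\mathbb{C}_+),\ \|f\|_2 \le 1} \sum_{\mu \in A_1} |f(\mu)|^2 \\
&\le C\,\#\{A\} \sup_{f \in H^2(\mathbb{C}_+),\ \|f\|_2 \le 1} \int_{\mathbb{C}_+} |f(z)|^2\,d\nu(z) \le 2C_0 C\,\#\{A\} \sup_{I \text{ a unit interval}} \#\{A \cap I\}.
\end{aligned} \]
This is by (5.33) and (5.32). The lemma is proved.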

Now we are going to prove a stronger assertion by a simpler approach. This stronger
assertion is what is used in the main part of the article.

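Lemma 40. Let $j = 1, 2, \dots, k$, $c_j \in \mathbb{C}$, $|c_j| = 1$, and $\alpha_j \in \mathbb{R}$. Let $A := \{\alpha_j\}_{j=1}^{k}$. Suppose
\[ \int_{\mathbb{R}} \Big(\sum_{\alpha \in A}\chi_{[\alpha-1,\alpha+1]}(x)\Big)^2 dx \le S. \tag{5.34} \]
Then there exists an absolute constant $C$ such that
\[ \int_0^1 \Big|\sum_{\alpha \in A} c_\alpha e^{i\alpha y}\Big|^2 dy \le C\,S. \tag{5.35} \]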

Of course, one can change variables and get:

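Corollary 41. Let $j = 1, 2, \dots, k$, $c_j \in \mathbb{C}$, $|c_j| = 1$, and $\alpha_j \in \mathbb{R}$. Let $A := \{\alpha_j\}_{j=1}^{k}$, and let $\delta > 0$. Suppose
\[ \int_{\mathbb{R}} \Big(\sum_{\alpha \in A}\chi_{[\alpha-\delta,\alpha+\delta]}(x)\Big)^2 dx \le S. \tag{5.36} \]
Then there exists an absolute constant $C$ such that for any $a \in \mathbb{R}$
\[ \int_a^{a+\delta^{-1}} \Big|\sum_{\alpha \in A} c_\alpha e^{i\alpha y}\Big|^2 dy \le C\,S/\delta^2. \tag{5.37} \]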
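The change of variables behind Corollary 41 can be sketched as follows. The rescaled frequencies $\alpha/\delta$ and the coefficients $c'_\alpha := c_\alpha e^{i\alpha a}$ (which still have modulus 1) are notation introduced only for this sketch. The first line checks the hypothesis of Lemma 40 for the rescaled frequencies via $x = u/\delta$; the second applies Lemma 40 after substituting $y = a + u/\delta$.

```latex
\begin{gathered}
\int_{\mathbb R}\Bigl(\sum_{\alpha\in A}\chi_{[\alpha/\delta-1,\,\alpha/\delta+1]}(x)\Bigr)^{2}dx
  = \frac{1}{\delta}\int_{\mathbb R}\Bigl(\sum_{\alpha\in A}\chi_{[\alpha-\delta,\,\alpha+\delta]}(u)\Bigr)^{2}du
  \le \frac{S}{\delta},
\\
\int_{a}^{a+\delta^{-1}}\Bigl|\sum_{\alpha\in A}c_\alpha e^{i\alpha y}\Bigr|^{2}dy
  = \frac{1}{\delta}\int_{0}^{1}\Bigl|\sum_{\alpha\in A}c'_\alpha e^{i(\alpha/\delta)u}\Bigr|^{2}du
  \le \frac{1}{\delta}\cdot C\,\frac{S}{\delta} = \frac{C\,S}{\delta^{2}} .
\end{gathered}
```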

Remark. Lemma 40 is obviously stronger than Lemma 39. In fact, let $S_0$ be the maximal number of points of $A$ in any unit interval. Then
\[ f(x) := \sum_{\alpha \in A}\chi_{[\alpha-1,\alpha+1]}(x) \le 2S_0. \]
Now $\int_{\mathbb{R}} f^2(x)\,dx \le 4kS_0$, where $k$ as above is the cardinality of $A$. We can now put $S := 4kS_0$, apply Lemma 40 and get the conclusion of Lemma 39. The proof of Lemma 40 does not require the Carleson imbedding theorem. Here it is.

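Proof. Using Plancherel's theorem we write
\[ \int_0^1 \Big|\sum_{\alpha \in A} c_\alpha e^{i\alpha y}\Big|^2 dy \le e^2\int_0^1 \Big|\sum_{\alpha \in A} c_\alpha e^{i(\alpha+i)y}\Big|^2 dy \le e^2\int_0^\infty \Big|\sum_{\alpha \in A} c_\alpha e^{i(\alpha+i)y}\Big|^2 dy = e^2\int_{\mathbb{R}} \Big|\sum_{\alpha \in A}\frac{c_\alpha}{\alpha + i - x}\Big|^2 dx. \]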
Identify $z = x + iy$ in the usual way ($x, y \in \mathbb{R}$). Let $\mathbb{C}_+ = \{x + iy : y > 0\}$. Let $H_0^2$ be the space of measurable functions $f : \mathbb{C}_+ \to \mathbb{C}$ such that $\sup_{y>0}\int_{\mathbb{R}} |f(x+iy)|^2\,dx < \infty$. Let $H^2$ be the subspace of $H_0^2$ consisting of analytic functions. $H_0^2$ is a Hilbert space:
\[ \langle f_1, f_2\rangle := \lim_{y \to 0^+}\int_{\mathbb{R}} f_1(x+iy)\,\overline{f_2(x+iy)}\,dx. \tag{5.38} \]
It is a standard fact (see footnote 2) that $H^2$ is orthogonal to $\overline{H^2}$, implying in particular that if $f_1, f_2$ are analytic in $\mathbb{C}_+$ with $\int_{\mathbb{R}} |f_j(x+iy)|^2\,dx < M$ for all $y > 0$ and for $j = 1, 2$, then
\[ 0 = \langle f_1, \bar{f}_2\rangle = \int_{\mathbb{R}} f_1(x+iy)\,f_2(x+iy)\,dx \quad \forall y > 0. \tag{5.39} \]
In our application, we can use
\[ f_1(z) = \sum_{\alpha \in A}\frac{c_\alpha}{\alpha - i - z}, \qquad f_2(z) = \sum_{\alpha \in A}\frac{\bar{c}_\alpha}{\alpha - i - z}. \]
We can just evaluate at $y = 0$ directly. Note that (5.39) says that $\langle f_1 - \bar{f}_2, f_1 - \bar{f}_2\rangle = \|f_1\|^2 + \|f_2\|^2 \ge \|f_1\|^2$.

Footnote 2: $f_1 f_2$ is analytic in this case, and by conformal identification of $\mathbb{C}_+ \cup \infty$ with the unit disc, one sees that the complex integral along a circle is equal to 0.

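Then we get
\[ \begin{aligned}
\int_{\mathbb{R}} \Big|\sum_{\alpha \in A}\frac{c_\alpha}{\alpha + i - x}\Big|^2 dx &\le \int_{\mathbb{R}} \Big|\sum_{\alpha \in A}\frac{c_\alpha}{\alpha - i - x} - \sum_{\alpha \in A}\frac{c_\alpha}{\alpha + i - x}\Big|^2 dx \\
&= \int_{\mathbb{R}} \Big|\sum_{\alpha \in A}\frac{2i\,c_\alpha}{1 + (\alpha - x)^2}\Big|^2 dx = 4\pi^2\int_{\mathbb{R}} \Big|\sum_{\alpha \in A} c_\alpha P_1(\alpha - x)\Big|^2 dx,
\end{aligned} \]
where $P_1$ is the Poisson kernel in the upper half-plane $\mathbb{C}_+$ at height $h = 1$:
\[ P_h(x) := \frac{1}{\pi}\,\frac{h}{h^2 + x^2}. \]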
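We continue by noticing that $P_1 * \chi_{[\lambda-1,\lambda+1]}(x) \ge c\,P_1(\lambda - x)$ with an absolute positive $c$. This is an elementary calculation, or, if one wishes, Harnack's inequality. Now we can continue
\[ \int_0^1 \Big|\sum_{\alpha \in A} c_\alpha e^{i\alpha y}\Big|^2 dy \lesssim \int_{\mathbb{R}} \Big(\Big(P_1 * \sum_{\alpha \in A} c_\alpha\chi_{[\alpha-1,\alpha+1]}\Big)(x)\Big)^2 dx. \]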
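The elementary calculation behind $P_1 * \chi_{[\lambda-1,\lambda+1]} \ge c\,P_1$ mentioned above can be sketched numerically. The code below takes $\lambda = 0$ without loss of generality, uses the closed form of the convolution via the arctangent, and records the smallest ratio on a sample grid; the function names and the grid are assumptions of this sketch.

```python
import math

def poisson(u):
    """Poisson kernel P_1(u) = 1 / (pi * (1 + u^2)) at height h = 1."""
    return 1.0 / (math.pi * (1.0 + u * u))

def poisson_conv_indicator(u):
    """(P_1 * chi_[-1,1])(u), in closed form via the arctangent."""
    return (math.atan(u + 1.0) - math.atan(u - 1.0)) / math.pi

grid = [k / 100.0 for k in range(-100_000, 100_001)]   # u in [-1000, 1000]
ratios = [poisson_conv_indicator(u) / poisson(u) for u in grid]
c = min(ratios)   # smallest observed value of (P_1 * chi)(u) / P_1(u)
```

On this grid the smallest ratio occurs at $u = 0$, where it equals $\pi/2$, so any constant $c \le \pi/2$ works for the sampled points.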

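Now we use the fact that $f \to P_1 * f$ is a contraction in $L^2(\mathbb{R})$. So
\[ \int_0^1 \Big|\sum_{\alpha \in A} c_\alpha e^{i\alpha y}\Big|^2 dy \lesssim \int_{\mathbb{R}} \Big|\sum_{\alpha \in A} c_\alpha\chi_{[\alpha-1,\alpha+1]}(x)\Big|^2 dx \lesssim S. \]
The lemma is proved.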

5.5 Discussion

The reason we were able to prove the stronger estimate for the Sierpinski gasket is exactly given by (4.1) and (4.2). They are a quantified version of the fact that the three-term sum $\varphi(z) = 1 + e^{iz} + e^{itz}$ is zero if and only if the summands are $e^{\frac{2j\pi i}{3}}$, $j = 0, 1, 2$, and that for such $z$, $\varphi(3^k z) = 3$ for all integers $k \ge 1$. An alternate argument using this fact in this form is employed in [7]. Both versions of this fact we call by the general term "analytic tiling". It is not a tiling of the interval by projected Cantor squares as in [13], but there is a certain tiling pattern to the zeroes of the Fourier transform.
However, there cannot be such a thing in the general case. Suppose we had 5 self-similarities, and that for some direction $\theta$, we had $\phi_\theta(x_0) = 1 + (-i) + i + e^{2\pi i/3} + e^{4\pi i/3} = 0$. Then clearly, taking fifth powers of the summands results in another zero with exactly the same summands, in complete and utter contrast to the three-point case. Similar examples using partitions into relatively prime roots of unity exist for numbers other than 5. In fact, there are examples where $L = 5$ and for $\theta$ in a pathological set of size $\gg L^{-m}$,
\[ \Big|\Big\{x \in [L^{n-m}, L^n] : \prod_{k=n-\log n}^{n} |\varphi_\theta(x)| < e^{-cm}\Big\}\Big| > L^{n - \sqrt{\log n}}. \]
That is, $SSV(t)$ takes up a proportion of $I$ much larger than one that is exponentially small in the number of terms in the product. By taking $\frac{1}{m}\log|P_2|$, one gets a certain ergodic sum which one may hope has nice properties, but for some sets $J$, such nice properties fail for a set of directions far too large to ignore.
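The five-term example above is easy to verify by direct computation. The following minimal sketch checks that the five summands add up to zero and that taking fifth powers merely permutes them; the helper `same_multiset` is a hypothetical name introduced for this illustration.

```python
import cmath
import math

# The five summands from the example: 1, -i, i, e^{2 pi i / 3}, e^{4 pi i / 3}.
summands = [complex(1, 0), -1j, 1j,
            cmath.exp(2j * math.pi / 3), cmath.exp(4j * math.pi / 3)]
fifth_powers = [w ** 5 for w in summands]

def same_multiset(u, v):
    """True if the complex lists u and v agree up to reordering and rounding."""
    def key(w):
        return (round(w.real, 6), round(w.imag, 6))
    return sorted(map(key, u)) == sorted(map(key, v))

total = sum(summands)               # vanishes up to rounding
total_of_powers = sum(fifth_powers) # vanishes as well
```

Both sums vanish up to rounding, and `same_multiset(summands, fifth_powers)` holds: the two cube roots of unity swap places while $1$, $i$ and $-i$ are fixed.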
It is not yet known whether some separate argument is valid for this new set of “bad
directions.” One thought is that perhaps there are “structured” and “pseudo-random” directions, and that a separate argument works for each. In the latter case, a pseudo-random
analog of the large deviations theory for i.i.d. random variables may hold. But much remains
to be seen.
For example, if one considers $K_n$ as in [18], one gets $\varphi_\theta(z) = 1 + e^{i\pi z} + e^{i\lambda z} + e^{i(\lambda+\pi)z}$, which has the zero $z = 1$. Then $\varphi_\theta(4^k) = 2(1 + \cos(4^k\lambda))$ for $k > 0$. $\lambda$ depends continuously on $\theta$, and for fixed $\lambda$ such an ergodic sampling results in a sequence $a_k := \varphi(4^k)$, and either:
1: $a_k$ is eventually periodic and non-zero,
2: $a_k$ takes values other than 4 only finitely often,
or 3 (the case for almost every $\lambda$): $4^k\lambda \bmod 2\pi$ evenly samples $[0, 2\pi]$ over the long term, with long-term average $\frac{1}{N}\sum_{k=1}^{N}\log a_k \to \log 2$ as $N \to \infty$.

This regularity agrees with the result [18], which already proved a result without using ergodic theory or large deviation theory. There was a $\theta$ and $x$ separation of variables, and the zeroes obeyed an "analytic tiling" property like the one for the gasket.
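The limiting value $\log 2$ in case 3 can be checked by replacing the orbit $4^k\lambda \bmod 2\pi$ with a uniform sample of the angle, which is what even sampling amounts to. The sketch below evaluates the four-term sum directly from its definition and takes $a_k$ to be the modulus of its value (an assumption of this sketch, made so that $\log a_k$ is real); the midpoint rule and the sample size are likewise choices made only for illustration.

```python
import cmath
import math

def phi(z, lam):
    """Four-term sum 1 + e^{i pi z} + e^{i lam z} + e^{i (lam + pi) z}."""
    return (1 + cmath.exp(1j * math.pi * z) + cmath.exp(1j * lam * z)
            + cmath.exp(1j * (lam + math.pi) * z))

def mean_log_modulus(samples=100_000):
    """Midpoint-rule average of log|phi| over one period of the sampled angle u."""
    total = 0.0
    for n in range(samples):
        u = 2 * math.pi * (n + 0.5) / samples   # stands in for 4^k * lam mod 2 pi
        total += math.log(abs(phi(4, u / 4)))   # z = 4, so lam * z = u
    return total / samples

average = mean_log_modulus()
```

The computed `average` agrees with $\log 2$ to within the accuracy of the midpoint rule.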

Chapter 6
Epilogue
This thesis was written by a minotaur in a manner compliant with federal policies on the
use of human and/or animal subjects in research projects. No harm was sustained by the
minotaur except for maybe some loss of sleep and the formation of a coﬀee dependence.

Figure 6.1: Minotaur.

BIBLIOGRAPHY


[1] M. Bateman, N. Katz, Kakeya sets in Cantor directions, arXiv:math/0609187v1, 2006, pp. 1–10.
[2] M. Bateman, Kakeya sets and the directional maximal operators in the plane,
arXiv:math.CA/0703559v1, 2007, pp. 1–20.
[3] M. Bateman, A. Volberg An estimate from below for the Buﬀon needle probability of the four-corner Cantor set, arXiv:0807.2953v1 [math.CA], 2008.
[4] A. S. Besicovitch, Tangential properties of sets and arcs of inﬁnite linear
measure, Bull. Amer. Math. Soc. 66 (1960), 353–359.
[5] M. Bond, A. Volberg, Buffon needle lands in ε-neighborhood of a 1-dimensional Sierpinski Gasket with probability at most |log ε|^{−c}, Comptes Rendus Mathematique, Volume 348, Issues 11-12, June 2010, 653-656.
[6] M. Bond, A. Volberg: Circular Favard Length of the Four-Corner Cantor Set,
J. of Geometric Analysis, online July 2010, DOI: 10.1007/s12220-010-9141-4.
[7] M. Bond, A. Volberg: The power law for Buﬀon’s needle landing near the
Sierpinski gasket, arXiv: 0911.0233v2, 2009.
[8] J. Bourgain, Averages in the plane over convex curves and maximal operators,
J. Analyse Math. 47 (1986), 69–85.
[9] G. David, Analytic capacity, Calderón-Zygmund operators, and rectifiability, Publ. Mat. 43 (1999), 3–25.
[10] J. Garnett, Bounded Analytic Functions, Springer Graduate Texts in Mathematics 236, 2007.
[11] P. W. Jones and T. Murai, Positive analytic capacity but zero Buﬀon needle
probability, Paciﬁc J. Math. 133 (1988), 99–114.
[12] R. Kenyon, Projecting the one-dimensional Sierpinski gasket, Israel J. Math.
97 (1997), 221–238.
[13] I. Laba, K. Zhai, The Favard length of product Cantor sets, Bulletin of the
London Mathematical Society, doi: 10.1112/blms/bdq059, 2010.
[14] J. C. Lagarias and Y. Wang, Tiling the line with translates of one tile, Invent.
Math.124 (1996), 341–365.
[15] J. Mateu, X. Tolsa and J. Verdera, The planar Cantor sets of zero analytic
capacity and the local T (b)-theorem. J. Amer. Math. Soc. 16 (2003), 19–28.
[16] P. Mattila, Geometry of Sets and Measures in Euclidean Spaces, Cambridge
University Press, 1995.
[17] P. Mattila, On the analytic capacity and curvature of some Cantor sets with
non-σ-ﬁnite length, Publ. Mat. 40 (1996),no. 1, 195–204.
[18] F. Nazarov, Y. Peres, A. Volberg, The power law for the Buﬀon needle probability of the four-corner Cantor set, arXiv:0801.2942, 2008.
[19] H. Pajot. Analytic Capacity, Rectiﬁability, Menger Curvature and the Cauchy
Integral, Lecture Notes in Mathematics, vol. 1799, Springer, Berlin, 2002.
[20] Y. Peres and B. Solomyak, How likely is Buﬀon’s needle to fall near a planar
Cantor set? Paciﬁc J. Math. 204, 2 (2002), 473–496.
[21] F. Nazarov, Local estimates of exponential polynomials and their applications
to inequalities of uncertainty principle type , St Petersburg Math. J., v. 5
(1994), No. 4, pp. 3–66.
[22] A. Seeger, T. Tao, J. Wright, Singular Maximal Functions and Radon Transforms near L1 , Amer. J. Math, 126 (2002), 607–647.
[23] E.M. Stein, Maximal functions: Spherical means, Proc. Nat. Acad. Sci.
U.S.A., 73 (1976), 2174–2175.
[24] T. Tao, A quantitative version of the Besicovitch projection theorem via multiscale analysis, pp. 1–28, arXiv:0706.2446v1 [math.CA] 18 Jun 2007.
[25] X. Tolsa, Analytic capacity, rectiﬁability, and the Cauchy integral, Proceedings of the ICM, 2006, Madrid.
