CONTINUITY OF WEIGHTED ESTIMATES IN HARMONIC ANALYSIS WITH
RESPECT TO THE WEIGHT
By
Nikolaos Pattakos

A DISSERTATION
Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
Mathematics
2012

ABSTRACT
CONTINUITY OF WEIGHTED ESTIMATES IN HARMONIC ANALYSIS
WITH RESPECT TO THE WEIGHT
By
Nikolaos Pattakos
Given the class of $A_p$ weights, $1 < p < \infty$, we are able to define a metric $d_*$ on this set
such that the operator norm of any Calderón-Zygmund operator $T$ on $L^p(w)$, $w \in A_p$, is a
continuous function with respect to $w$. Moreover, we find the “rate” of this continuity with
respect to the weight and prove that it is sharp. This is done by finding the exact “rate”
for the Hilbert transform $H$ on the unit disk. We also study many properties of this new
metric space $(A_p, d_*)$ and identify its completion as a subset of $BMO(\mathbb{R}^d)$. In addition, we
extend the continuity result to the case of matrix-valued $A_2$ weights $W$, for the Martingale
transform $M_\sigma^W$, and we show that it does not hold for the classical Martingale transform.

The problem of continuity of weighted estimates with respect to the weight appears naturally
in problems of PDE (Partial Diﬀerential Equations) with random coeﬃcients, and can also
be important to multivariate stationary processes.

Copyright by
NIKOLAOS PATTAKOS
2012

To my mother Antigoni, my father Georgios and my brother Evangelos.


ACKNOWLEDGMENTS

It would be a shame to cut this section short, since so many people have helped me.
I would like to thank my dissertation advisor, Dr. Alexander Volberg for his support
and guidance through these years. He has always been there answering my questions and
helping me with many of the problems that I had to face during my Ph.D. study. I was not
even done with my qualifying exams yet, but I already had a distinguished and very helpful
advisor. Because of him and his research I started to get involved in Harmonic Analysis and
I should confess that it is my favorite subject. I remember stepping into his oﬃce for the
ﬁrst time in the beginning of Fall semester 2008 and asking him to be my advisor for my
Ph.D. study. He accepted right away without any hesitation, even though he did not know
me, and I am grateful for that.
My deepest thanks also go to my defense committee members Dr. Vladimir Peller, Dr.
Ignacio Uriarte-Tuero, Dr. Jeﬀrey Schenker and Dr. Shen, Chun-Yen for their expertise
and precious time. Spending time with them and attending their classes has been very
important for me. Every time I needed them they were there for me. I am also grateful to
Ms. Barbara Miller, Graduate Secretary in the Department of Mathematics, for her generous
help during my graduate study. Here is a good time to say thank you to my advisors from
the Math department of the University of Crete in Greece. People such as Dr. Michalis
Papadimitrakis, Dr. Themistoklis Mitsis, Dr. Souzana Papadopoulou, Dr. Konstantinos
Skandalis, Dr. George Kostakis and Dr. Athanasios Feidas helped me enormously during
my undergraduate years in Greece.
Moreover, I would like to express my gratitude to Dr. Nicholas Boros for his friendship
and precious help throughout my studies here at MSU. We spent many hours discussing
problems of Harmonic Analysis together. I would also like to thank my friends from the
math department Mr. Manousos Maridakis, Mr. Ambar Rao, Mr. Alexander Reznikov,
Mr. Michalis Orfanoudakis, Dr. Matthew Bond and Dr. Diogo Oliveira e Silva for giving
me more chances to discuss mathematics with them. In addition, friends like Mr. George
Koutsimanis, Dr. George Perdikakis, Dr. Zacharias Fthenakis, Dr. Eleni Beli, Dr. Artemis
Spyrou, Dr. Evangelos Milliordos and Dr. Georgia Mavrommati made East
Lansing a better place to be.
Finally, I would like to thank my mother Antigoni, my father George and my brother
Evangelos for supporting and believing in me. I can not come up with the words to appropriately express my gratitude. I could not ﬁnish writing this part of my thesis without mentioning the encouragement and support that I received all these years from Father Methodios
and Father Onisimos from Greece. They have always been there for me.


TABLE OF CONTENTS

Chapter 1 Introduction
Chapter 2 The Muckenhoupt $A_\infty$ class as a metric space
Chapter 3 Continuity of weighted estimates and sharpness of result
3.1 The continuity on the weight
3.2 The sharp rate of convergence
Chapter 4 Bellman functions and an application to Littlewood-Paley estimates
Chapter 5 Matrix weights and the $A_2$ condition
5.1 The not well behaved dyadic operators
5.2 Some results about “flatness” and a Riesz basis
5.3 The Martingale transform revisited: $M_\sigma$ and $M_\sigma^W$
5.4 Open problems about matrix weights
Bibliography

Chapter 1
Introduction
Weighted inequalities have been studied extensively during the last thirty years and have
found many applications in PDE, geometric measure theory and multivariate stationary
processes. The ﬁrst characterization of the famous Muckenhoupt Ap classes in dimension 1,
was done in [18] by B. Muckenhoupt. The deﬁnition, which we will state in dimension d, is
the following. For a positive $L^1_{loc}(\mathbb{R}^d)$ function $w$ and $p \in (1, +\infty)$ we write that $w \in A_p$ if
the quantity
\[ [w]_{A_p} := \sup_Q \left( \frac{1}{|Q|}\int_Q w(x)\,dx \right) \left( \frac{1}{|Q|}\int_Q w(x)^{-\frac{1}{p-1}}\,dx \right)^{p-1} \]
is finite, where we consider the supremum over all cubes $Q$ inside $\mathbb{R}^d$. The number $[w]_{A_p}$ is
called the $A_p$ characteristic of the weight $w$. The $A_\infty$ class of weights is defined to be the
union of all the $A_p$ classes. That is
\[ A_\infty = \bigcup_{p>1} A_p. \]

There is also the following useful characterization of A∞ . For a weight w we have that
w ∈ A∞ if and only if the quantity

\[ [w]_{A_\infty} := \sup_Q \frac{\frac{1}{|Q|}\int_Q w(x)\,dx}{\exp\left( \frac{1}{|Q|}\int_Q \log w(x)\,dx \right)} \]

is ﬁnite, where we consider the supremum over all cubes Q inside Rd . This is called the A∞
characteristic of the weight.
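As a small worked remark (not part of the original text), one can check that both characteristics are always at least 1, which is why later statements speak of characteristics "close to 1". The sketch below writes $\langle g \rangle_Q$ for the average $\frac{1}{|Q|}\int_Q g$; this shorthand and the exponent $p' = p/(p-1)$ are notational assumptions made here for the illustration.

```latex
% Lower bound for the A_p characteristic via Hoelder with exponents p and p':
\[
1 = \langle w^{1/p}\, w^{-1/p} \rangle_Q
  \le \langle w \rangle_Q^{1/p}\, \langle w^{-p'/p} \rangle_Q^{1/p'}
  = \langle w \rangle_Q^{1/p}\, \langle w^{-\frac{1}{p-1}} \rangle_Q^{\frac{p-1}{p}} .
\]
% Raising to the power p and taking the supremum over cubes Q:
\[
1 \le \langle w \rangle_Q \, \langle w^{-\frac{1}{p-1}} \rangle_Q^{\,p-1} \le [w]_{A_p} .
\]
% Lower bound for the A_infty characteristic via Jensen's inequality:
\[
\exp\big( \langle \log w \rangle_Q \big) \le \langle w \rangle_Q
\quad\Longrightarrow\quad [w]_{A_\infty} \ge 1 .
\]
% For a constant weight both inequalities are equalities, so the characteristic equals 1.
```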
In [18], Muckenhoupt characterized all weights, w, in dimension 1, and in [4] Coifman and
Feﬀerman all weights in dimension d, with the property that the Hardy-Littlewood maximal
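operator $M$ defined as
\[ M f(x) := \sup_Q \frac{1}{|Q|}\int_Q |f(y)|\,dy, \]
where we consider the supremum over all cubes $Q$ in $\mathbb{R}^d$ such that $x \in Q$, is bounded from
$L^p(w)$ to $L^p(w)$, for $p \in (1, +\infty)$. That is, under what conditions on the weight $w$ the
inequality
\[ \left( \int_{\mathbb{R}^d} |M f(x)|^p w(x)\,dx \right)^{\frac{1}{p}} \le C \left( \int_{\mathbb{R}^d} |f(x)|^p w(x)\,dx \right)^{\frac{1}{p}} \]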

holds for all functions $f \in L^p(w)$ and a constant $C > 0$ independent of $f$. As it turns
out, this is true if and only if $w \in A_p$. The classical proof of this fact starts by showing
that the $A_p$ condition on $w$ is necessary and sufficient for the Hardy-Littlewood maximal
operator to be of weak type $(p,p)$ with respect to $w$. This means that $M$ sends $L^p(w)$ to
$L^{p,\infty}(w)$. Then, assuming that the $A_p$ condition is true, one shows using the Calderón-Zygmund
decomposition that $w$ satisfies the Reverse Hölder inequality, which implies that there is an
$\epsilon > 0$ such that $w \in A_{p-\epsilon}$. Finally, by the use of the Marcinkiewicz interpolation theorem we
are done. Observe that the $A_p$ condition is equivalent to the requirement that all averaging
operators
\[ f \mapsto \left( \frac{1}{|I|}\int_I f(t)\,dt \right) \cdot \chi_I \]
are uniformly bounded on $L^p(w)$ with respect to the finite interval $I$. The Hardy-Littlewood
maximal operator is exactly an averaging type operator.

It is a very natural question to ask what happens when we choose $p$ to be 1. In this
case it does not make sense to require that the Maximal operator sends $L^1(w)$ to $L^1(w)$,
since this already fails in the unweighted case. It is very well known that $M$ sends
$L^1(\mathbb{R}^d)$ to $L^{1,\infty}(\mathbb{R}^d)$. Therefore, the correct question is under what conditions on the weight
$w$, $M$ is bounded from $L^1(w)$ to $L^{1,\infty}(w)$. The answer is exactly when $w \in A_1$. The precise
deﬁnition of this class of weights is the following. If there is a positive constant c such that

M w(x) ≤ cw(x),

for almost every x ∈ Rd , we say that w ∈ A1 . The smallest such c is called the A1
characteristic of the weight and is denoted by [w]A1 . The relation among the characteristics
of a weight w, for p ∈ [1, +∞], is

[w]A∞ ≤ [w]Ap ≤ [w]A1 .

This implies that the Ap classes are nested. That is A1 ⊂ Ap ⊂ A∞ .
A very interesting fact is that the Ap condition is necessary and suﬃcient for many
of the classical operators of Harmonic Analysis to be bounded from Lp (w) to Lp (w), for
p ∈ (1, +∞). For instance, the Hilbert transform

\[ Hf(x) := \frac{1}{\pi}\,\mathrm{p.v.}\int_{\mathbb{R}} \frac{f(y)}{x-y}\,dy, \]
and the Riesz transforms
\[ R_j f(x) := \frac{\Gamma\left(\frac{d+1}{2}\right)}{\pi^{\frac{d+1}{2}}}\,\mathrm{p.v.}\int_{\mathbb{R}^d} \frac{x_j - y_j}{|x-y|^{d+1}}\, f(y)\,dy, \]
$1 \le j \le d$, are examples of such operators. Additionally, in [12] it was proven that the
Hilbert transform is of weak type $(p, p)$ with respect to the weight $w$ if and only if $w \in A_p$.
Their argument can be adapted to the case of Riesz transforms. Furthermore, Stein showed
in [28] that if any of the Riesz transforms is bounded on $L^p(w)$ then $w \in A_p$.
By the use of good-λ inequalities and the strong Maximal operator, one is able to prove
that all Calderón-Zygmund operators are bounded on $L^p(w)$, provided that $w \in A_p$. Recently
though, the focus has been on a problem, now known as the $A_2$ conjecture for
Calderón-Zygmund operators, of finding the sharp dependence of the operator norm
on the $A_2$ characteristic of the weight. It states that for any singular integral operator $T$
of Calderón-Zygmund type we have the estimate
\[ \|T\|_{L^2(w)\to L^2(w)} \le c[w]_{A_2}, \tag{1.1} \]
for all $A_2$ weights $w$, where $c$ is a positive constant independent of the weight. This result
turns out to be correct and was first proven in [13]. This linear estimate with respect to
the $A_2$ characteristic of the weight is sharp for many of the classical operators, such as the
Hilbert and Riesz transforms. Such estimates play a very important role in PDE. In fact,
Calderón-Zygmund operators naturally arise as fraction derivatives of solutions of PDE.
For example, in [26] the authors proved that the Ahlfors-Beurling operator in the
complex plane $\mathbb{C}$ defined as
\[ Tf(z) = \frac{1}{\pi}\,\mathrm{p.v.}\int_{\mathbb{C}} \frac{f(\zeta)}{(z-\zeta)^2}\,d\zeta \]
satisfies (1.1), and as a consequence they obtained borderline regularity properties for
solutions of the Beltrami equation on $\mathbb{C}$ ($u_{\bar z} = \mu u_z$, where $\mu$ is a given function with $\|\mu\|_{L^\infty} < 1$).
The main tool in their proof is the Bellman function technique, which is very powerful for
this kind of estimate. We are also going to use this technique to estimate the norm of a
Riesz matrix operator.

Chapter 2
The Muckenhoupt A∞ class as a
metric space
The main purpose of this chapter is to define a natural metric structure on the classical
Muckenhoupt $A_p$ classes. As far as we know, this is the first time that such a metric has been
studied in the context of continuity of norms of Calderón-Zygmund operators. Classically,
the $A_p$ spaces have only been treated as sets with no additional structure on them.
Before we deﬁne the metric structure we need to state some useful and well known results
about the $A_p$ classes and their relation with the $BMO(\mathbb{R}^d)$ space. First of all, the space of
$BMO$ functions in $\mathbb{R}^d$ consists of locally integrable functions $f$ such that the norm
\[ \|f\|_* = \sup_Q \frac{1}{|Q|}\int_Q |f(x) - f_Q|\,dx \]
is finite, where $f_Q$ denotes the average of $f$ over the cube $Q$. Notice that this quantity becomes
a norm if we identify all functions inside $BMO(\mathbb{R}^d)$ that differ by a constant. If $f$ is a $BMO$
function then for any number $\lambda \in (0, \frac{c}{\|f\|_*}]$, the function $e^{\lambda f}$ is an $A_p$ weight,
$1 < p < \infty$, where the constant $c$ depends on $p$
and the dimension $d$. Secondly, for small $BMO$ norm, the $A_p$ characteristic of the weight $e^{\lambda f}$
is bounded by the number 2 for example (see e.g. [8]). A subset of $BMO(\mathbb{R}^d)$ that appears
in many applications is $BLO(\mathbb{R}^d)$. It stands for the functions of bounded lower oscillation.
A function $f \in L^1_{loc}(\mathbb{R}^d)$ is said to belong in $BLO(\mathbb{R}^d)$ if there is a positive constant $c$ such
that:
\[ \frac{1}{|Q|}\int_Q f(y)\,dy - \inf_{x\in Q} f(x) \le c \]
for all cubes $Q$, where the infimum is to be understood as the essential infimum. It can
be proved that for any $w \in A_1$, the function $\log w$ is in $BLO(\mathbb{R}^d)$. Also if a function
$f \in BLO(\mathbb{R}^d)$ then for sufficiently small $\lambda > 0$ the function $e^{\lambda f} \in A_1$. The reference for all
these results is [8].
Let us observe that if we have any weight $w$, any positive constant $c > 0$ and any $1 \le p \le \infty$,
then $[w]_{A_p} = [cw]_{A_p}$. We define an equivalence relation in $A_\infty$ in the following way: for
$u, v \in A_\infty$ we will write $u \sim v$ if and only if there is a positive constant $c$ such that $u = cv$
almost everywhere in $\mathbb{R}^d$. This allows us to define the quotient space:
\[ \mathcal{A}_\infty = A_\infty / \sim. \]
In the same way we define for $1 \le p < \infty$:
\[ \mathcal{A}_p = A_p / \sim. \]
For two elements $u, v \in \mathcal{A}_\infty$ we define the distance function $d_*$ as:
\[ d_*(u, v) = \|\log u - \log v\|_*. \]

It is obvious that all the requirements of a metric are satisfied, and the reason for defining
the equivalence relation is exactly that we need to have:
\[ d_*(u, v) = 0 \Leftrightarrow u \sim v. \]
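For readers who want the "obvious" verification spelled out, here is a short sketch of why $d_*$ is a metric on the quotient. It only uses the standard facts that $\|\cdot\|_*$ is a seminorm on $BMO(\mathbb{R}^d)$ and that it vanishes exactly on functions that are constant almost everywhere; nothing else is assumed.

```latex
% Well defined on equivalence classes: constants have zero BMO seminorm.
\[
d_*(cu, v) = \|\log c + \log u - \log v\|_* = \|\log u - \log v\|_* = d_*(u, v),
\qquad c > 0 .
\]
% Symmetry and the triangle inequality are inherited from the seminorm:
\[
d_*(u, v) = d_*(v, u), \qquad
d_*(u, v) \le \|\log u - \log z\|_* + \|\log z - \log v\|_* = d_*(u, z) + d_*(z, v) .
\]
% Definiteness: the seminorm vanishes exactly on a.e. constant functions.
\[
d_*(u, v) = 0
\iff \log u - \log v = \log c \ \text{ a.e. for some } c > 0
\iff u = c\,v \ \text{ a.e.}
\iff u \sim v .
\]
```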

So we define a metric in $\mathcal{A}_\infty$, going through the $BMO(\mathbb{R}^d)$ space. Notice that the restriction
of the $d_*$ metric to $\mathcal{A}_p$ makes the class a metric space. The drawback of these “new” metric
spaces is that none of them is complete. However, the following is an obvious remark that
gives more information about these “new” spaces. It states that small balls around the
constant weight 1 are complete in the $d_*$ metric.
Theorem 1. Consider a closed ball $\bar{B}(1, r)$ of sufficiently small radius $r > 0$ and center at
the weight 1, in the metric space $(\mathcal{A}_\infty, d_*)$, i.e. $\bar{B}(1, r) = \{w \in \mathcal{A}_\infty : d_*(w, 1) \le r\}$. Then
$\bar{B}(1, r)$ is a complete metric space with respect to the metric $d_*$.
Proof. Consider a Cauchy sequence $\{w_n\}_{n\in\mathbb{N}}$ in $(\bar{B}(1, r), d_*)$. This means that the sequence
$\{\log w_n\}_{n\in\mathbb{N}}$ is Cauchy in the $BMO(\mathbb{R}^d)$ space. But $BMO(\mathbb{R}^d)$ is a Banach space and so
there is a function $f \in BMO(\mathbb{R}^d)$ such that $\log w_n \to f$ in $BMO(\mathbb{R}^d)$ as $n \to \infty$. By the
John-Nirenberg inequality we know that there is a dimensional constant $c > 0$ such that for all
$\lambda \in (0, \frac{c}{\|f\|_*}]$ the function $e^{\lambda f} \in A_2$. But $|\,\|\log w_n\|_* - \|f\|_*\,| \le \|\log w_n - f\|_* \to 0$ as $n \to \infty$.
Here we use the fact that $w_n \in \bar{B}(1, r)$. This means that $\|\log w_n\|_* = \|\log w_n - \log 1\|_* \le r$
and $r$ is sufficiently small. Therefore, the number $\|f\|_*$ is small and so the number $\frac{c}{\|f\|_*}$ is
really big. We are now allowed to choose $\lambda = 1$ and we get that $e^f \in A_2$ or equivalently
there is a weight $w \in A_2 \subset A_\infty$ with $f = \log w$. It is trivial now to see that $d_*(w_n, w) \to 0$
as $n \to \infty$. Moreover, $d_*(w, 1) = \|f\|_* = \lim_{n\to\infty} \|\log w_n\|_* \le r$, so the limit $w$ lies in $\bar{B}(1, r)$.

Of course in the previous Theorem, we can replace the $\mathcal{A}_\infty$ space by any of the other
$\mathcal{A}_p$ spaces. We already mentioned that none of the $\mathcal{A}_p$ spaces is complete. The proof of this
fact is very simple. Let us prove that $\mathcal{A}_1$ is not complete by finding a Cauchy sequence in
the space that has no limit inside $\mathcal{A}_1$. It will follow that this example works for any one of
the $\mathcal{A}_p$ spaces. We take $d = 1$ here, so that $|x|^{-1}$ fails to be locally integrable. Consider a
decreasing sequence $-1 < r_n < 0$ with $\lim_{n\to\infty} r_n = -1$. Define
the $A_1$ weights $w_n = |x|^{r_n}$. Then:
\[ d_*(w_n, w_m) = \|r_n \log|x| - r_m \log|x|\|_* = |r_n - r_m|\,\|\log|x|\|_* \]
and since $r_n \to -1$ we see that $\{w_n\}_{n\in\mathbb{N}}$ is Cauchy in $\mathcal{A}_1$, or equivalently the sequence
$\{\log w_n\}_{n\in\mathbb{N}}$ is Cauchy in $BMO$. Its limit in the $BMO$ space is obviously the
function $f(x) = -\log|x|$. This means that for $w(x) = \frac{1}{|x|}$ we have $d_*(w_n, w) \to 0$ as
$n \to \infty$, but since $w$ is not in $L^1_{loc}$ it cannot be an $A_1$ weight. So the space $(\mathcal{A}_1, d_*)$ is
not complete.
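The two facts used in this example can be checked in a couple of lines. The following is a sketch in dimension one (an assumption made for the illustration, since it is in dimension one that $|x|^{-1}$ is not locally integrable); it also uses the standard fact that $\log|x|$ belongs to $BMO$.

```latex
% Convergence in the metric d_* towards w(x) = |x|^{-1}:
\[
d_*(w_n, w) = \big\| r_n \log|x| - (-1)\log|x| \big\|_*
            = |r_n + 1| \, \big\| \log|x| \big\|_* \longrightarrow 0
\quad \text{as } n \to \infty .
\]
% The candidate limit is not locally integrable near the origin (d = 1):
\[
\int_{-1}^{1} \frac{dx}{|x|} = 2 \int_{0}^{1} \frac{dx}{x} = +\infty ,
\]
% so w is not a weight in the sense used here, and in particular not in A_1.
```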
Let us also mention the following result in [9], by Garnett and Jones, that helps to
understand better when a ball in $(\mathcal{A}_p, d_*)$ is complete. It states that for a function $f \in BMO(\mathbb{R}^d)$,
\[ \mathrm{dist}_{BMO}(f, L^\infty) := \inf\{\|f - g\|_* : g \in L^\infty\} \sim \frac{1}{\sup\{\lambda > 0 : e^{\lambda f} \in A_2\}}. \]
This means that if we have a Cauchy sequence in $\mathcal{A}_p$, the closer the sequence is to the
$L^\infty(\mathbb{R}^d)$ space, the more chances it has to have a limit in $\mathcal{A}_p$.
So now we can try and find the completion of these spaces under the metric $d_*$. By
definition the completion of $(\mathcal{A}_p, d_*)$ is the space $\bar{\mathcal{A}}_p$ that consists of the equivalence classes
of all Cauchy sequences of $\mathcal{A}_p$. We can identify this space as a subspace of $BMO(\mathbb{R}^d)$.
Indeed:
\[ \bar{\mathcal{A}}_p = \{f \in BMO(\mathbb{R}^d) : \exists \{w_n\}_{n\in\mathbb{N}} \subset A_p : \lim_{n\to\infty} \|\log w_n - f\|_* = 0\}, \]
and we can think of the $\mathcal{A}_p$ class as a subset of $\bar{\mathcal{A}}_p$, by identifying every weight $w$ with its
logarithm, $\log w$, in $BMO(\mathbb{R}^d)$. Since the classical $A_p$ spaces form an increasing “sequence”
of the variable $p$ (and of course the same is true for the $\mathcal{A}_p$ spaces), the same is true for these
new subspaces of $BMO(\mathbb{R}^d)$: $\bar{\mathcal{A}}_1 \subset \bar{\mathcal{A}}_p \subset \bar{\mathcal{A}}_q \subset \bar{\mathcal{A}}_\infty \subset BMO(\mathbb{R}^d)$, for $1 \le p \le q \le \infty$.
They are also convex subsets of $BMO(\mathbb{R}^d)$. Indeed, consider $1 < p < \infty$, and $f, g \in \bar{\mathcal{A}}_p$.
This means that there are sequences $\{w_n\}_{n\in\mathbb{N}}, \{v_n\}_{n\in\mathbb{N}} \subset A_p$ such that $f = \lim_{n\to\infty} \log w_n$,
$g = \lim_{n\to\infty} \log v_n$, in $BMO(\mathbb{R}^d)$. Let $0 < t < 1$ be fixed. We will show that $tf + (1 - t)g \in \bar{\mathcal{A}}_p$.
For this, we only need to see that $tf + (1 - t)g = \lim_{n\to\infty} \log(w_n^t v_n^{1-t})$, in $BMO(\mathbb{R}^d)$, and
check using Hölder that the weight $w_n^t v_n^{1-t} \in A_p$, for all $n$, since:
\[ [w^t v^{1-t}]_{A_p} \le [w]_{A_p}^t [v]_{A_p}^{1-t}, \]
for all $w, v \in A_p$. Thus, $tf + (1 - t)g \in \bar{\mathcal{A}}_p$. It is trivial to see now that $\bar{\mathcal{A}}_\infty$ is also a convex
subset of $BMO(\mathbb{R}^d)$. For $\bar{\mathcal{A}}_1$ the same holds, since if we have two $A_1$ weights, $w, v$, it is
trivial to see that $w^t v^{1-t} \in A_1$ and actually that $[w^t v^{1-t}]_{A_1} \le [w]_{A_1}^t [v]_{A_1}^{1-t}$.
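The Hölder step behind the log-convexity of the $A_p$ characteristic can be sketched as follows. The notation $\langle g \rangle_Q$ for the average of $g$ over $Q$ and the abbreviation $\sigma_w = w^{-1/(p-1)}$ are introduced here only for this illustration.

```latex
% Hoelder with exponents 1/t and 1/(1-t) on the cube Q:
\[
\langle w^t v^{1-t} \rangle_Q \le \langle w \rangle_Q^{\,t}\, \langle v \rangle_Q^{\,1-t},
\qquad
\big\langle (w^t v^{1-t})^{-\frac{1}{p-1}} \big\rangle_Q
  = \langle \sigma_w^{\,t}\, \sigma_v^{\,1-t} \rangle_Q
  \le \langle \sigma_w \rangle_Q^{\,t}\, \langle \sigma_v \rangle_Q^{\,1-t} .
\]
% Raise the second estimate to the power p-1 and multiply with the first one:
\[
\langle w^t v^{1-t} \rangle_Q \,
\big\langle (w^t v^{1-t})^{-\frac{1}{p-1}} \big\rangle_Q^{\,p-1}
\le \Big( \langle w \rangle_Q \langle \sigma_w \rangle_Q^{\,p-1} \Big)^{t}
    \Big( \langle v \rangle_Q \langle \sigma_v \rangle_Q^{\,p-1} \Big)^{1-t}
\le [w]_{A_p}^{\,t}\, [v]_{A_p}^{\,1-t} .
\]
% Taking the supremum over all cubes Q gives the stated inequality.
```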
Here, let us observe that for any $1 < p < \infty$, we have that $L^\infty(\mathbb{R}^d) \subset \bar{\mathcal{A}}_p$. There is a nice
result of weighted theory (see [8]) that states the following (we will present the statement
only for $A_2$): There are dimensional constants $c_1, c_2 > 0$, such that for a function $\phi$ in $\mathbb{R}^d$
we have:
a) $e^{\phi} \in A_2$ provided $\inf\{\|\phi - g\|_* : g \in L^\infty(\mathbb{R}^d)\} \le c_1$ and
b) $\inf\{\|\phi - g\|_* : g \in L^\infty(\mathbb{R}^d)\} \le c_2$ provided $e^{\phi} \in A_2$.
This means that all functions
$f \in BMO(\mathbb{R}^d)$ that satisfy the assumption a) belong to the $\bar{\mathcal{A}}_2$ space. Equivalently, there
is a small neighborhood of $L^\infty(\mathbb{R}^d)$ inside $BMO(\mathbb{R}^d)$ that lies inside the $\bar{\mathcal{A}}_2$ space.
We should also mention that since:
\[ BLO(\mathbb{R}^d) = \{\alpha \log w : \alpha \ge 0, w \in A_1\}, \]
we can ask the question if the spaces $\bar{\mathcal{A}}_1$, $BLO(\mathbb{R}^d)$ are equal. Let us assume that they
are. A classical result of weighted theory is that $BMO(\mathbb{R}^d) = BLO(\mathbb{R}^d) - BLO(\mathbb{R}^d)$.
By our assumption we have that $BMO(\mathbb{R}^d) = \bar{\mathcal{A}}_1 - \bar{\mathcal{A}}_1$. Now consider a function
$f \in BMO(\mathbb{R}^d)$. There are functions $\phi, \psi \in \bar{\mathcal{A}}_1$ such that $f = \phi - \psi$. We know that there are
sequences of $A_1$ weights $\{\phi_n\}_{n\in\mathbb{N}}$, $\{\psi_n\}_{n\in\mathbb{N}}$ such that
$f = \lim_{n\to\infty} \log \phi_n - \lim_{n\to\infty} \log \psi_n = \lim_{n\to\infty} \log(\phi_n \psi_n^{-1})$,
where the limit is in $BMO(\mathbb{R}^d)$. But $\phi_n \psi_n^{-1}$ is an $A_2$ weight for all $n$.
So we get that $\bar{\mathcal{A}}_2 = BMO(\mathbb{R}^d)$. But this is obviously false.
Notice that from the argument follows the inclusion $\bar{\mathcal{A}}_1 - \bar{\mathcal{A}}_1 \subset \bar{\mathcal{A}}_2$. Trivially, we have
the more general fact that for any $1 < p < \infty$, $\bar{\mathcal{A}}_1 + (1 - p)\bar{\mathcal{A}}_1 \subset \bar{\mathcal{A}}_p$. Also, since we have
that $w \in A_p \Leftrightarrow w^{1-p'} \in A_{p'}$, where $p'$ is the conjugate exponent of $p$, we get the equivalence
$f \in \bar{\mathcal{A}}_p \Leftrightarrow (1 - p')f \in \bar{\mathcal{A}}_{p'}$. For $p = 2$
we have $f \in \bar{\mathcal{A}}_2 \Leftrightarrow -f \in \bar{\mathcal{A}}_2$, which means that the $\bar{\mathcal{A}}_2$ class is symmetric with respect to
the origin in the $BMO(\mathbb{R}^d)$ space. No other $\bar{\mathcal{A}}_p$ class has this property. Here we should
remember the following about power weights. A function of the form $|x|^\alpha$ is an $A_p$ weight
in $\mathbb{R}^d$, if and only if $-d < \alpha < d(p - 1)$. The interval for $\alpha$ is symmetric with respect to the
origin, if and only if $p = 2$. Now we can see that there is a “correspondence” between the
$\bar{\mathcal{A}}_2$ space and the interval $(-d, d)$.

Chapter 3
Continuity of weighted estimates and
sharpness of result

3.1 The continuity on the weight

In this chapter we are going to study the behavior of the operator norm of a sub-linear
operator $T$ on $L^p(w)$ with respect to the weight $w$. Our goal is to show that if two weights $w$
and $w_0$ are close in the metric $d_*$, defined in the previous chapter, then the numbers $\|T\|_{p,w}$
and $\|T\|_{p,w_0}$ are also close. The main Theorem of this chapter is the following.
Theorem 2. Consider $1 < p < +\infty$ and $w_0 \in A_p$. Suppose that the sub-linear operator $T$
on $\mathbb{R}^d$ satisfies the weighted estimate
\[ \|T\|_{L^p(w)\to L^p(w)} \le F([w]_{A_p}), \]
for all $w \in A_p$, where $F$ is a positive increasing function. Then there is a positive constant
$c$ that depends on $p$, the dimension $d$, $[w_0]_{A_p}$ and the function $F$ such that for all weights $w$
that are sufficiently close to $w_0$ in the metric $d_*$,
\[ \|T\|_{L^p(w)\to L^p(w)} \le \|T\|_{L^p(w_0)\to L^p(w_0)} (1 + c\,d_*(w, w_0)). \]
Moreover, we have
\[ \lim_{d_*(w,w_0)\to 0} \|T\|_{L^p(w)\to L^p(w)} = \|T\|_{L^p(w_0)\to L^p(w_0)}. \]

Remark 3. In [2] Buckley showed that the Hardy-Littlewood maximal operator satisfies the
estimate
\[ \|M\|_{L^p(w)\to L^p(w)} \le c\,[w]_{A_p}^{\frac{1}{p-1}}, \]
for $1 < p < +\infty$, and all weights $w \in A_p$, where the constant $c > 0$ is independent of the
weight $w$. This means that the assumptions of Theorem 2 hold for $M$.
Remark 4. Consider any Calderón-Zygmund operator $T$. By [13] we know that:
\[ \|T\|_{L^p(w)\to L^p(w)} \le c\,[w]_{A_p}^{\max(1, \frac{1}{p-1})}, \]
for any $A_p$ weight $w$, where $c > 0$ is independent of the weight. This means that we can
apply Theorem 2, for $1 < p < \infty$ and $F(x) = c\,x^{\max(1, \frac{1}{p-1})}$.

Before moving on to the proof of Theorem 2 we need to see some preliminary results. For
the proof of our theorem interpolation with change of measure is going to play an important
role. In the following (X, M, µ) and (Y, N , ν) will denote measure spaces. Suppose T is an
operator of a class of functions on X into a class of functions on Y . T is called a sub-linear
operator, if it satisﬁes the following properties:
i) If $f(x) = f_1(x) + f_2(x)$ and $Tf_1(x)$, $Tf_2(x)$ are defined then $Tf(x)$ is defined,
ii) $|T(f_1 + f_2)(x)| \le |Tf_1(x)| + |Tf_2(x)|$, $\mu$ almost everywhere,
iii) For any scalar $k$, we have $|T(kf)(x)| = |k||Tf(x)|$, $\mu$ almost everywhere.
Let $\mu_0, \mu_1$ be two measures for $(X, \mathcal{M})$. If we define the measure $\mu = \mu_0 + \mu_1$, then $\mu_0, \mu_1$
are each absolutely continuous with respect to $\mu$. Thus, by the Radon-Nikodym theorem,
there exist two functions, $\alpha_0, \alpha_1$, such that for any $E \in \mathcal{M}$,
\[ \mu_j(E) = \int_E \alpha_j(x)\,d\mu(x) \]
where $j = 0, 1$. In the following we will assume that $\alpha_0, \alpha_1$ are never zero. This is equivalent
to asserting that the sets of measure zero with respect to $\mu_j$, $j = 0, 1$, are the same as the
sets of measure zero with respect to $\mu$. Thus, in the various measure spaces that we will
consider, the equivalence classes of functions will be the same. Let $0 \le s \le 1$, and define
the measure $\mu_s$ on $X$ by
\[ \mu_s(E) = \int_E \alpha_0^{1-s}(x)\,\alpha_1^{s}(x)\,d\mu(x), \]
for each $E \in \mathcal{M}$. Also assume that we have two measures $\nu_0, \nu_1$ on $\mathcal{N}$, and define the
measures $\nu_r$, for $0 \le r \le 1$, just as we did for $\mu_s$ above.
Given any real numbers $1 \le p_0, p_1, q_0, q_1$ and any $0 \le t \le 1$, we define $p_t, q_t, s(t), r(t)$ as
follows:
\[ \frac{(1-t)p_t}{p_0} + \frac{t p_t}{p_1} = 1, \qquad \frac{(1-t)q_t}{q_0} + \frac{t q_t}{q_1} = 1, \]
\[ s(t) = \frac{t p_t}{p_1}, \qquad r(t) = \frac{t q_t}{q_1}. \]
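To make these definitions concrete, here is the special case in which all four exponents coincide and both measures are given by weights on $\mathbb{R}^d$. The names $u_0$, $u_1$ for the two weights are hypothetical and chosen only for this illustration; the computation is a sketch, not part of the original argument.

```latex
% All exponents equal: p_0 = p_1 = q_0 = q_1 = p.
\[
\frac{(1-t)p_t}{p} + \frac{t\,p_t}{p} = \frac{p_t}{p} = 1
\ \Longrightarrow\ p_t = q_t = p,
\qquad
s(t) = \frac{t\,p_t}{p_1} = t, \quad r(t) = \frac{t\,q_t}{q_1} = t .
\]
% Measures given by weights: mu_0 = u_0 dx, mu_1 = u_1 dx, so mu = (u_0 + u_1) dx and
\[
\alpha_0 = \frac{u_0}{u_0 + u_1}, \qquad \alpha_1 = \frac{u_1}{u_0 + u_1}, \qquad
d\mu_t = \alpha_0^{1-t}\alpha_1^{t}\,(u_0 + u_1)\,dx = u_0^{1-t}\,u_1^{t}\,dx .
\]
% So the intermediate space is L^p with respect to the geometric-mean weight u_0^{1-t} u_1^t.
```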

We have the following Theorem by [29]:
Theorem 5. Suppose that $T$ is a sub-linear operator satisfying
\[ \|Tf\|_{q_j,\nu_j} \le K_j \|f\|_{p_j,\mu_j} \]
for all $f \in L^{p_j}(X, \mathcal{M}, \mu_j)$, $j = 0, 1$. Then, for $0 \le t \le 1$, we have
\[ \|Tf\|_{q_t,\nu_{r(t)}} \le K_0^{1-t} K_1^{t} \|f\|_{p_t,\mu_{s(t)}} \]
for all $f \in L^{p_t}(X, \mathcal{M}, \mu_{s(t)})$.
In addition to the previous theorem we also need the following, proved in [15]:
Theorem 6. If the $A_\infty$ characteristic of a weight $w$ is small, i.e. $[w]_{A_\infty} \le 1 + \delta < 2$, then
the function $f = \log w$, and any cube $Q$ satisfy
\[ \frac{1}{|Q|}\int_Q |f(x) - f_Q|\,dx \le 32\sqrt{\delta}. \]
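We will give a rough idea for $A_2$, since the argument for $A_\infty$ is similar. We will show that for any $A_2$
weight $w$:
\[ \|\log w\|_* \le 2\sqrt{[w]_{A_2} - 1}. \]
Indeed, for any real number $x$ we have that $2 + x^2 \le e^x + e^{-x}$. Now apply it with
$x = \log\left(\frac{w}{w_Q}\right)$, where $w_Q$ denotes the average of $w$ over $Q$, and get:
\[ \frac{1}{|Q|}\int_Q \left( 2 + \log^2\frac{w}{w_Q} \right) \le \frac{1}{|Q|}\int_Q \frac{w}{w_Q} + \frac{1}{|Q|}\int_Q \frac{w_Q}{w}. \]
Hence,
\[ \frac{1}{|Q|}\int_Q |\log w - \log(w_Q)|^2 \le 1 + w_Q (w^{-1})_Q - 2 \le [w]_{A_2} - 1. \]
By Hölder's inequality, we have $\frac{1}{|Q|}\int_Q f \le \left( \frac{1}{|Q|}\int_Q f^2 \right)^{\frac{1}{2}}$ for any positive function $f$. Thus,
\[ \frac{1}{|Q|}\int_Q |\log w - \log(w_Q)| \le \sqrt{[w]_{A_2} - 1}. \]
Now using the well known inequality
\[ \frac{1}{|Q|}\int_Q |\log w - (\log w)_Q| \le 2 \inf_{r\in\mathbb{R}} \frac{1}{|Q|}\int_Q |\log w - r|, \]
we get exactly what we want. Now for a general $A_\infty$ weight the proof follows the same lines.
See [15] for more details and for the proof that the square root is sharp.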
Now we are ready to present the proof of Theorem 2.
Proof. First we will show that for any sub-linear operator $T$ that satisfies the assumptions
of our theorem we have:
\[ \|T\|_{L^p(w)\to L^p(w)} \le \|T\|_{L^p(w_0)\to L^p(w_0)} (1 + c\delta), \]
for all weights $w \in A_p$ with $d_*(w, w_0) \le \delta$. Let $\delta > 0$ be a small number that we consider
to be fixed. Fix also an $A_p$ weight $w$, with $d_*(w, w_0) < \delta$. This means that
\[ \left\| \log\frac{w}{w_0} \right\|_* \le \delta. \]
We would like to write our weight $w$ as $w = w_0^{1-t} W^t$, for some small and positive number $t$
(which is going to be about $\delta$), and some weight $W \in A_p$. From the expression we can see
that
\[ W = \frac{w^{\frac{1}{t}}}{w_0^{\frac{1}{t}}}\, w_0. \]
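As a quick consistency check of this formula (a sketch added for the reader, using only the definitions of $W$ and $d_*$ given in the text), one can verify the factorization directly and see why $t$ must be comparable to $\delta$:

```latex
% The factorization w = w_0^{1-t} W^t holds identically:
\[
w_0^{1-t}\, W^{t}
  = w_0^{1-t} \left( w_0 \left( \frac{w}{w_0} \right)^{1/t} \right)^{t}
  = w_0^{1-t}\, w_0^{t}\, \frac{w}{w_0}
  = w .
\]
% Distance of W from w_0 in the metric d_*:
\[
\log W = \log w_0 + \frac{1}{t} \log \frac{w}{w_0}
\quad\Longrightarrow\quad
d_*(W, w_0) = \frac{1}{t}\, d_*(w, w_0) \le \frac{\delta}{t} .
\]
% Choosing t proportional to delta keeps W within a fixed d_*-distance of w_0,
% independently of delta.
```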

For this, let us consider only the case $p = 2$, but the general case is identical to this one.
Since $w_0 \in A_2$ we know that there is a small $\epsilon > 0$ such that $w_1 := w_0^{1+\epsilon} \in A_2$. Then
obviously $w_0 = w_1^{1-s}$ for small $s > 0$ (namely $s = \frac{\epsilon}{1+\epsilon}$). To continue, consider the function
$f = \log\left(\frac{w}{w_0}\right)^{\frac{1}{s}}$.
The $BMO$ norm of $f$ is really small since:
\[ \|f\|_* = \frac{1}{s}\, d_*(w, w_0) \le \frac{1}{s}\,\delta, \]
and so by the John-Nirenberg inequality we have that for all $\lambda \in (0, \frac{c}{\|f\|_*}]$ the function
$e^{\lambda f} = \left(\frac{w}{w_0}\right)^{\frac{\lambda}{s}} \in A_2$, where $c$ is a positive constant that depends only on the dimension.
If we choose $\lambda = \frac{c_0}{\delta}$, where $c_0 > 0$ is any constant less than or equal to $sc$, we see that
$w_2 := \left(\frac{w}{w_0}\right)^{\frac{c_0}{\delta s}} \in A_2$, which implies that the function $w_1^{1-s} w_2^{s} \in A_2$. Then:
\[ W := \frac{w^{\frac{1}{t}}}{w_0^{\frac{1}{t}}}\, w_0 = w_1^{1-s} w_2^{s} \in A_2, \]
where we put $t = \frac{\delta}{c_0}$. Here we should mention that the $A_2$ norm of $W$ can be chosen to
be bounded above by a constant that depends only on the $A_2$ norm of $w_1$. On the other
hand, $[w_1]_{A_2}$ depends only on the $A_2$ norm of $w_0$, and this is fixed. With this in mind, let
us assume that the $A_2$ characteristic of $W$ is bounded above by $c$. The important thing here
is that it does not depend on $\delta$. Write $\gamma = \|T\|_{L^p(w_0)\to L^p(w_0)}$. By the interpolation result
of Stein and Weiss, Theorem 5, for $X = Y = \mathbb{R}^d$, $\mathcal{M} = \mathcal{N} = \mathcal{L}$, where by $\mathcal{L}$ we denote the
σ-algebra of Lebesgue measurable sets in $\mathbb{R}^d$, and $\mu_0 = \nu_0 = w_0\,dx$, $\mu_1 = \nu_1 = W\,dx$, we get
\[ \|T\|_{L^p(w)\to L^p(w)} \le \gamma^{1-t}\, \|T\|^{t}_{L^p(W)\to L^p(W)} \le \gamma^{1-t} c^{t} F([W]_{A_p})^{t} \le \gamma^{1-t} c^{t} F(c)^{t} \]

and the right-hand side goes to $\gamma$ as $t \to 0^+$ or equivalently as $\delta \to 0^+$. In other words:
\[ \limsup_{d_*(w,w_0)\to 0} \|T\|_{L^p(w)\to L^p(w)} \le \|T\|_{L^p(w_0)\to L^p(w_0)} \]
and in addition we have the desired estimate:
\[ \|T\|_{L^p(w)\to L^p(w)} \le \|T\|_{L^p(w_0)\to L^p(w_0)} (1 + c\delta), \]
where $c$ is a constant depending on $d$, $p$, $[w_0]_{A_p}$ and the function $F$, for all weights $w$ in $A_p$
that are $\delta$ close to $w_0$ in the metric $d_*$.
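The passage from the interpolation bound to the linear estimate $(1 + c\delta)$ is not spelled out; one possible way to see it is sketched below. The names $K := c\,F(c)$ for the combined constant and $C$ for the final constant are introduced here for the illustration, and the sketch assumes $\gamma > 0$ and $0 < t \le 1$.

```latex
% Rewrite the interpolation bound with K := c F(c):
\[
\gamma^{1-t} K^{t} = \gamma \left( \frac{K}{\gamma} \right)^{t}
                   = \gamma\, e^{\,t \log (K/\gamma)} .
\]
% If K <= gamma the right-hand side is at most gamma. Otherwise use
% e^x <= 1 + x e^x for x >= 0, together with (K/gamma)^t <= K/gamma for 0 < t <= 1:
\[
\gamma\, e^{\,t \log (K/\gamma)}
  \le \gamma \left( 1 + t\, \frac{K}{\gamma} \log \frac{K}{\gamma} \right) .
\]
% With t = delta / c_0 this reads
\[
\gamma^{1-t} K^{t} \le \gamma\,(1 + C\,\delta),
\qquad C := \frac{1}{c_0}\, \frac{K}{\gamma} \log \frac{K}{\gamma} .
\]
```

In this sketch the constant $C$ is expressed through $K$, $\gamma$ and $c_0$; how these are controlled by $d$, $p$, $[w_0]_{A_p}$ and $F$ is as described in the surrounding proof.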
We can also conclude the following new result:
Proposition 7. The set
\[ \{\log w : w \in A_p\} \]
is open in $BMO(\mathbb{R}^d)$ for all $1 < p < +\infty$.
Proof. To see this fix $w_0 \in A_p$ and choose sufficiently small $\delta > 0$. For $f \in BMO(\mathbb{R}^d)$ with
$\|f - \log w_0\|_* \le \delta$, write $f = \log u$, where $u$ is a positive function. Then follow the previous
reasoning in the beginning of the proof, with $w = u$, and write $u = w_0^{1-t} W^t$, for $0 < t < 1$.
It follows that $W \in A_p$, if $\delta > 0$ is small depending only on the $A_p$ norm of $w_0$, and so
$u = w_0^{1-t} W^t$ is an $A_p$ weight, by Hölder's inequality. As we can see, this is exactly the same
argument as before.
There is only one thing remaining to finish the proof of Theorem 2. We need to show
that
\[ \|T\|_{L^p(w_0)\to L^p(w_0)} \le \liminf_{d_*(w,w_0)\to 0} \|T\|_{L^p(w)\to L^p(w)}. \]
We are going to present two different proofs. The first appeared in [24] and the second in [22].
The two approaches give different information about the weights involved in the calculations.

For the first proof we assume that our operator $T$ is linear. Let us also assume for
simplicity that $p = 2$ and that $\|T\|_{L^2(w_0)\to L^2(w_0)} = 1$. Note that other values of $p$ can be treated
similarly. Let $M_\phi$ denote the operation of multiplication by $\phi$. To finish the proof of the
continuity at $w = w_0$ we are going to assume that the quantity
\[ \liminf_{d_*(w,w_0)\to 0} \|T\|_{L^2(w)\to L^2(w)}, \]
which is equal to
\[ \liminf_{d_*(w,w_0)\to 0} \left\| M_{w_0^{-\frac{1}{2}} w^{\frac{1}{2}}}\; T\; M_{w_0^{\frac{1}{2}} w^{-\frac{1}{2}}} \right\|_{L^2(w_0)\to L^2(w_0)}, \]
is strictly less than 1 and get a contradiction. This means that there is $\tau > 0$ small, and a
sequence of $A_2$ weights $w_n$ such that $d_*(w_n, w_0) \to 0$ as $n \to \infty$ and in addition:
\[ \left\| w_0^{-\frac{1}{2}} w_n^{\frac{1}{2}}\, T\!\left( w_0^{\frac{1}{2}} w_n^{-\frac{1}{2}} g \right) \right\|_{L^2(w_0)} \le (1 - \tau) \|g\|_{L^2(w_0)} \tag{3.1} \]
for all functions $g \in L^2(w_0)$. Fix now any cube $Q$ in $\mathbb{R}^d$. Here we can make the normalization
assumption $\frac{1}{|Q|}\int_Q \frac{w_n}{w_0}\,dx = 1$ for all $n \in \mathbb{N}$. We claim two things:
1∗) $\left\| w_n^{-\frac{1}{2}} - w_0^{-\frac{1}{2}} \right\|_{L^2(w_0,Q)} \to 0$ as $n \to \infty$, where by $L^2(w_0, Q)$ we mean the $L^2(w_0)$ norm
over $Q$, and
2∗) there exists a subsequence $k_n$ such that $w_{k_n} \to w_0$ almost everywhere in the cube $Q$.
Obviously 2∗ follows from 1∗. For a proof of 1∗, see the Lemma after the end of this proof.

Now without loss of generality we can assume that the subsequence is the original sequence
$w_n$. Notice that 1∗ implies $\left\| w_n^{-\frac{1}{2}} f - w_0^{-\frac{1}{2}} f \right\|_{L^2(w_0,Q)} \to 0$ as $n \to \infty$ for all bounded
$f$, and so for $g = f w_0^{-\frac{1}{2}}$, we get $\left\| T\!\left( w_0^{\frac{1}{2}} w_n^{-\frac{1}{2}} g \right) - Tg \right\|_{L^2(w_0,Q)} \to 0$ as $n \to \infty$ and this
implies that for a subsequence of $w_n$ (which again we assume is the whole sequence),
$w_0^{-\frac{1}{2}} w_n^{\frac{1}{2}}\, T\!\left( w_0^{\frac{1}{2}} w_n^{-\frac{1}{2}} g \right) \to Tg$ almost everywhere in the cube $Q$. It is time to apply Fatou's
Lemma in inequality (3.1) and get:
\[ \left\| \liminf_{n\to\infty} w_0^{-\frac{1}{2}} w_n^{\frac{1}{2}}\, T\!\left( w_0^{\frac{1}{2}} w_n^{-\frac{1}{2}} g \right) \right\|_{L^2(w_0,Q)} \le \liminf_{n\to\infty} \left\| w_0^{-\frac{1}{2}} w_n^{\frac{1}{2}}\, T\!\left( w_0^{\frac{1}{2}} w_n^{-\frac{1}{2}} g \right) \right\|_{L^2(w_0,Q)} \le (1 - \tau) \|g\|_{L^2(w_0,Q)}. \]
Here the functions $g = f w_0^{-\frac{1}{2}}$ with bounded $f$ form a dense family in $L^2(w_0, Q)$. For $g$ from this dense
family it follows:
\[ \|Tg\|_{L^2(w_0)} \le (1 - \tau) \|g\|_{L^2(w_0)} \]
by letting the cube $Q$ expand to infinity, for $g$ in some dense subclass of $L^2(w_0)$. By
assumption $\|T\|_{L^2(w_0)\to L^2(w_0)} = 1$ and this is how we have our contradiction. All that
remains is the following Lemma:
Lemma 8. Let $w_0, w \in A_2$ such that $d_*(w, w_0) \le \epsilon$, where $\epsilon$ is sufficiently small. Let us have
a normalization assumption $\frac{1}{|Q|}\int_Q \frac{w}{w_0}\,dx = 1$. Then $\left\| w^{-\frac{1}{2}} - w_0^{-\frac{1}{2}} \right\|_{L^2(w_0,Q)} \le |Q|^{\frac{1}{2}} c(\epsilon)^{\frac{1}{2}}$,
where $c(\epsilon)$ is a positive constant that goes to 0 as $\epsilon$ goes to 0.
Notice that this Lemma states that the weight $w^{-\frac{1}{2}}$ is close to $w_0^{-\frac{1}{2}}$ in the $L^2(w_0)$ norm
of the cube $Q$.

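Proof. We want to estimate the expression:
\[ \frac{1}{|Q|} \left\| w^{-\frac{1}{2}} - w_0^{-\frac{1}{2}} \right\|^2_{L^2(w_0,Q)} = \frac{1}{|Q|}\int_Q \frac{w_0}{w} + 1 - \frac{2}{|Q|}\int_Q \left( \frac{w_0}{w} \right)^{\frac{1}{2}}. \]
The last integral is easy to handle, since by our normalization assumption and
Cauchy-Schwarz we get the following:
\[ \frac{1}{|Q|}\int_Q \left( \frac{w_0}{w} \right)^{\frac{1}{2}} = \frac{1}{|Q|}\int_Q \left( \frac{w}{w_0} \right)^{-\frac{1}{2}} \ge \left( \frac{1}{|Q|}\int_Q \left( \frac{w}{w_0} \right)^{\frac{1}{2}} \right)^{-1} \ge \left( \frac{1}{|Q|}\int_Q \frac{w}{w_0} \right)^{-\frac{1}{2}} = 1. \]
Therefore, the quantity that we need to estimate is bounded above by:
\[ \frac{1}{|Q|} \left\| w^{-\frac{1}{2}} - w_0^{-\frac{1}{2}} \right\|^2_{L^2(w_0,Q)} \le \frac{1}{|Q|}\int_Q \frac{w_0}{w} - 1. \]
It is time to use the fact that $d_*(w, w_0) \le \epsilon$. We get that the weight $\frac{w}{w_0}$ is in the $A_2$ class
and actually, because the $BMO$ norm of $\log\frac{w}{w_0}$ is really small, the $A_2$ characteristic is
bounded by $1 + c(\epsilon)$, where $c(\epsilon)$ is a positive constant that goes to 0 as $\epsilon$ goes to 0. So we
have the desired inequality:
\[ \frac{1}{|Q|} \left\| w^{-\frac{1}{2}} - w_0^{-\frac{1}{2}} \right\|^2_{L^2(w_0,Q)} \le \left[ \frac{w}{w_0} \right]_{A_2} - 1 \le c(\epsilon). \]

Observe that the proof just presented cannot be applied when the operator $T$
is only sub-linear. The linearity assumption is essential for this proof.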
The second proof covers all cases. Here we need not make any linearity assumptions for
$T$. Our operator is going to be sub-linear. The main tool for the proof is the inequality
(proved earlier in this chapter)
\[ \|T\|_{L^p(u)\to L^p(u)} \le \|T\|_{L^p(v)\to L^p(v)} \left( 1 + c_{[v]_{A_p}}\, d_*(u, v) \right), \tag{3.2} \]
that holds for all $A_p$ weights $u, v$ that are sufficiently close in the $d_*$ metric, and for
sub-linear operators $T$ that satisfy the assumptions of our Theorem. The positive constant
$c_{[v]_{A_p}}$ that appears in the inequality depends on the dimension $d$, $p$, the function $F$ and the
$A_p$ characteristic of the weight $v$. Since the quantities $d, p, F$ are fixed we only write the
subscript in $c_{[v]_{A_p}}$ to emphasize this dependence on the characteristic.
Use inequality (3.2) with $u = w_0$ and $v = w$:
\[ \|T\|_{L^p(w_0)\to L^p(w_0)} \le \|T\|_{L^p(w)\to L^p(w)} \left( 1 + c_{[w]_{A_p}}\, d_*(w, w_0) \right). \]
At this point if we know that the constant $c_{[w]_{A_p}}$ remains bounded as the distance $d_*(w, w_0)$
goes to 0 we are done.
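For this reason we assume that $d_*(w, w_0) = \delta$ is very close to 0. Then the function $\frac{w}{w_0}$
is an $A_p$ weight with $A_p$ characteristic very close to 1 (see [8]). How close depends only on
$\delta$, not on $w$. Thus, if $R$ is large enough, the weight $\left(\frac{w}{w_0}\right)^R \in A_p$, with $A_p$ characteristic
independent of $w$ (again see [8]). Note that from the classical $A_p$ theory, for sufficiently
small $\varepsilon > 0$, we have $w_0^{1+\varepsilon} \in A_p$. Choose the numbers $R, \varepsilon$ such that we have the relation
$\frac{1}{R} + \frac{1}{1+\varepsilon} = 1$, i.e. such that $R$ and $R' = 1 + \varepsilon$ are conjugate numbers. Then
\[ \langle w \rangle_Q \left\langle w^{-\frac{1}{p-1}} \right\rangle_Q^{p-1} = \left\langle \frac{w}{w_0}\, w_0 \right\rangle_Q \left\langle \left(\frac{w}{w_0}\right)^{-\frac{1}{p-1}} w_0^{-\frac{1}{p-1}} \right\rangle_Q^{p-1} \]
and by Hölder's inequality it is less than or equal to
\[ \left\langle \left(\frac{w}{w_0}\right)^{R} \right\rangle_Q^{\frac{1}{R}} \left\langle w_0^{R'} \right\rangle_Q^{\frac{1}{R'}} \left\langle \left(\frac{w}{w_0}\right)^{-\frac{R}{p-1}} \right\rangle_Q^{\frac{p-1}{R}} \left\langle w_0^{-\frac{R'}{p-1}} \right\rangle_Q^{\frac{p-1}{R'}}. \]
Separating the $R$-terms from the $R'$-terms and applying Hölder's inequality one more time
we obtain that this is at most
\[ \left[\left(\frac{w}{w_0}\right)^{R}\right]_{A_p}^{\frac{1}{R}} \left[w_0^{1+\varepsilon}\right]_{A_p}^{\frac{1}{R'}} \le C, \]
where $C$ is a constant independent of the weight $w$. Therefore, $[w]_{A_p} \le C$.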
The last step is to remember how we obtained the constant $c_{[w]_{A_p}}$ that appears in
inequality (3.2). We used the Riesz-Thorin interpolation theorem with change in measure
and then expressed one of the terms that appears in our calculations as a Taylor series. The
constant $c_{[w]_{A_p}}$ appears at exactly this point and it is not difficult to see that it depends
continuously on $[w]_{A_p}$. Since this characteristic is bounded for $w$ close to $w_0$ in the metric
$d_*$, we have that $c_{[w]_{A_p}}$ is bounded as well. This completes the proof.
A consequence of the proof is the following remark.
Remark 9. Fix a weight $w_0 \in A_p$ and a sufficiently small positive number $\delta$. There is a
positive constant $C$ that depends on $[w_0]_{A_p}$ and $\delta$ such that for all weights $w$ with $d_*(w, w_0) < \delta$
we have $[w]_{A_p} \le C$. In addition, from the inequality (see the proof of Theorem 2)
\[ [w]_{A_p} \le \left[\left(\frac{w}{w_0}\right)^{R}\right]_{A_p}^{\frac{1}{R}} \left[w_0^{1+\varepsilon}\right]_{A_p}^{\frac{1}{R'}}, \qquad (3.3) \]
and the Lebesgue dominated convergence theorem (by letting $R \to +\infty$ and remembering that
the $A_p$ constant of the weight $\left(\frac{w}{w_0}\right)^R$ is independent of $R$) we obtain
\[ \limsup_{d_*(w,w_0)\to 0} [w]_{A_p} \le [w_0]_{A_p}. \]
In order to get the remaining inequality
\[ [w_0]_{A_p} \le \liminf_{d_*(w,w_0)\to 0} [w]_{A_p}, \]
we rewrite (3.3) as
\[ [w_0]_{A_p} \le \left[\left(\frac{w_0}{w}\right)^{R}\right]_{A_p}^{\frac{1}{R}} \left[w^{1+\varepsilon}\right]_{A_p}^{\frac{1}{R'}}, \]
and we proceed in the same way as before. In this case the number $\varepsilon$ depends on $[w]_{A_p}$.
But we already know that for $w$ close to $w_0$ in the $d_*$ metric the $A_p$ characteristic of $w$ is
bounded from above. This means that we are allowed to choose the same number $\varepsilon$ for all
weights $w$ that are sufficiently close to $w_0$ and we are done. Therefore, the $A_p$ characteristic
of a weight $w \in A_p$ is a continuous function of the weight with respect to the metric $d_*$, i.e.
the following equality is true
\[ \lim_{d_*(w,w_0)\to 0} [w]_{A_p} = [w_0]_{A_p}. \]

3.2 The sharp rate of convergence

In the following we are going to consider the Hilbert transform, $H$, the Riesz projection,
$P_+$, and weights in $A_2$ on the circle. We are going to show that Theorem 2 is sharp for the
Hilbert transform and that it is not sharp for the Riesz projection. This result is interesting
because these two operators, $H$ and $P_+$, are very closely related, i.e.
\[ H = -iP_+ + i(I - P_+). \]
But as we shall see they do not behave in the same way.
We start with a weight $w \in A_2$ such that $[w]_{A_2} = 1 + \delta$, where $\delta > 0$ is really close
to 0. We know that there exists an outer function $h$ such that $w = |h|^2$. Outer means
that $h = e^{u + i\tilde{u}}$, where $\tilde{u}$ denotes the harmonic conjugate of the function $u$. As we already
have mentioned, $\log w = 2u$ is in $BMO(\mathbb{T})$ with norm $\|\log w\|_* \le c\sqrt{\delta}$. This means that
the conjugate function of $u$ has small $BMO$ norm, i.e. $\|\tilde{u}\|_* \le c\sqrt{\delta}$. From [15] the square
root of $\delta$ is sharp. So we can choose our function $u$ such that $c_1\sqrt{\delta} \le \|\tilde{u}\|_*$. Observe also
that $\frac{\bar{h}}{h} = e^{-2i\tilde{u}}$. Let us now look at the operator $f \to e^{if}$, that maps the space $BMO(\mathbb{T})$
continuously into itself (this is clear since if the oscillation of $f$ is bounded then the same
should be true for the function $e^{if}$). Of course, this is not a linear operator but it has some
nice properties. For example, for $\varepsilon > 0$ small, it maps the ball $B(0, \varepsilon) = \{f \in BMO(\mathbb{T}) :
\|f\|_* \le \varepsilon\}$ into another ball of center 0 and radius, say, $c\varepsilon$. We claim that the ball $B(0, \varepsilon)$ is
mapped homeomorphically onto its image, and that the image contains a ball $B(0, c'\varepsilon)$, for
some $c'$. For this it suffices to see that the derivative of this map at the point 0 is exactly
the linear map $f \to if$, which is a continuous surjection from $BMO(\mathbb{T})$ onto itself, and then
to make use of the inverse function theorem for Banach spaces. We did all this in order to be
able to claim that we can choose our function $h$ satisfying:
\[ c_2\sqrt{\delta} \le \left\| \frac{\bar{h}}{h} \right\|_* \le c_1\sqrt{\delta}. \]

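Let $f_\pm$ denote the analytic and anti-analytic parts of a bounded function $f$ on the circle. Now
the space $BMO(\mathbb{T})$ can be written as the direct sum of the $BMOA$ and $\overline{BMOA}$ spaces, the
$BMO$ analytic and the $BMO$ anti-analytic spaces respectively. Without loss of generality
we can assume that $c_1\sqrt{\delta} \ge \left\| \left(\frac{\bar{h}}{h}\right)_- \right\|_{\overline{BMOA}} \ge \frac{c_2\sqrt{\delta}}{2}$. But,
\[ \left\| \left(\frac{\bar{h}}{h}\right)_- \right\|_{\overline{BMOA}} = \mathrm{dist}\left(\frac{\bar{h}}{h}, H^\infty\right) = \sup_{\|\varphi\|_1 \le 1,\ \varphi \in H^1_0} \left| \int_{\mathbb{T}} \frac{\bar{h}}{h}\,\varphi \right| = \sup_{\|\varphi_1\|_2, \|\varphi_2\|_2 \le 1,\ \varphi_2(0)=0} \left| \int_{\mathbb{T}} \frac{\bar{h}}{h}\,\varphi_1\varphi_2 \right|, \]
where $\|\cdot\|_2$ is the norm in the Hardy space $H^2$. This last supremum is exactly equal to
\[ \sup_{\|\varphi_1\|_2, \|\varphi_2\|_2 \le 1,\ \varphi_2(0)=0} \left| \left( H_{\bar{h}/h}\,\varphi_1, \overline{\varphi_2} \right) \right| = \left\| H_{\bar{h}/h} \right\|, \]
where $H_{\bar{h}/h} : H^2 \to H^2_-$ is the Hankel operator with symbol $\frac{\bar{h}}{h}$. Now consider the spaces:
\[ H_+ = \mathrm{clos}_{L^2(w)}\{1, z, z^2, ...\}, \qquad H_- = \mathrm{clos}_{L^2(w)}\{\bar{z}, \bar{z}^2, ...\}. \]
These spaces are called the future and the past spaces (the terminology comes from
probability theory, where $w$ plays the role of the spectral density of a stationary stochastic process,
see e.g. [31] and the literature cited therein).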
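The next step is to find the angle $\theta$ between these two spaces in $L^2(w)$. Its cosine is exactly
\[ \sup_{\|\varphi_-\|_{L^2(w)} = 1,\ \|\varphi_+\|_{L^2(w)} = 1} \left| (\varphi_+, \varphi_-)_{L^2(w)} \right|, \]
where $\varphi_+ \in H_+$ and $\varphi_- \in H_-$. If we write down just one of these inner products we see the following
\[ \int \varphi_+ \overline{\varphi_-}\, |h|^2 = \int (\varphi_+ h)(\overline{\varphi_-}\, h)\, \frac{\bar{h}}{h}. \]
The first two functions that appear in the integrand are analytic since they are products
of analytic functions. Note that since the function $\varphi_-$ is anti-analytic, the function $\overline{\varphi_-}$ is
analytic. Also their $H^2$ norm is $\le 1$. This means that the supremum is exactly equal to the norm of the Hankel operator:
\[ \frac{c_2\sqrt{\delta}}{2} \le \left\| H_{\bar{h}/h} \right\| = \sup_{\|\varphi_-\|_{L^2(w)} = 1,\ \|\varphi_+\|_{L^2(w)} = 1} \left| (\varphi_+, \varphi_-)_{L^2(w)} \right| \le c_1\sqrt{\delta}. \]
Therefore, $\cos\theta$ is exactly of the order $\sqrt{\delta}$. This means that $1 - \sin\theta$ is of the order $\delta$.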

Now, all that remains is an easy problem. We are given that the cosine of the angle between two
directions is of the order $\sqrt{\delta}$ and we would like to find the order of $\sup \frac{\|u+v\|}{\|u-v\|}$ over all vectors
$u$ that have the first direction and $v$ that have the second direction. Using the law of
cosines we can see that the order of this supremum must be $1 + c\sqrt{\delta}$. Thus
\[ \|H\|_{L^2(w)\to L^2(w)} \ge \sup \frac{\|u+v\|}{\|u-v\|} \approx 1 + c\sqrt{\delta} \]
and
\[ \|P_+\|_{L^2(w)\to L^2(w)} = \frac{1}{\sin\theta} \approx 1 + c\delta. \]
This means that the norm of $P_+$ converges to its unweighted $L^2$ norm faster, as $[w]_{A_2} \to 1$, than the norm of the Hilbert transform does. This should not be a surprise, since the multiplier that corresponds to $P_+$ takes only
the values $\{0, 1\}$ and the multiplier for the Hilbert transform $H = -iP_+ + i(I - P_+)$ attains the values $\{-i, i\}$. So
the jump for $P_+$ is only 1 and for $H$ is 2.
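
The two rates can be compared numerically. The following is only a minimal sketch under simplifying assumptions: the vectors are real and planar, the two directions are unit vectors with a prescribed cosine, and the supremum is taken by brute force over positive multiples. The names `sup_ratio` and `inverse_sine` are hypothetical helpers introduced for this illustration.

```python
import math

def sup_ratio(cos_theta, steps=2000):
    """Brute-force sup of |u+v|/|u-v| over u = a*e1, v = b*e2 with a, b > 0,
    where e1, e2 are unit vectors with inner product cos_theta."""
    best = 0.0
    for k in range(1, steps):
        t = (math.pi / 2) * k / steps                  # (a, b) = (cos t, sin t)
        a, b = math.cos(t), math.sin(t)
        num = a * a + b * b + 2 * a * b * cos_theta    # |u+v|^2 by the law of cosines
        den = a * a + b * b - 2 * a * b * cos_theta    # |u-v|^2
        best = max(best, math.sqrt(num / den))
    return best

def inverse_sine(cos_theta):
    """1/sin(theta), the quantity that the text identifies with the norm of P_+."""
    return 1.0 / math.sqrt(1.0 - cos_theta ** 2)

delta = 1e-4
c = math.sqrt(delta)                      # cos(theta) of order sqrt(delta)
hilbert_excess = sup_ratio(c) - 1.0       # of order sqrt(delta)
riesz_excess = inverse_sine(c) - 1.0      # of order delta
print(hilbert_excess, riesz_excess)
```

In this model the supremum is attained when the two vectors have equal length, which gives the value $\sqrt{(1+\cos\theta)/(1-\cos\theta)}$.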


Chapter 4
Bellman functions and an application
to Littlewood-Paley estimates
In this chapter we construct a new Bellman function based on the results of the previous
chapter. It can be used to estimate the norms of second order Riesz transforms and to give
a better understanding of some Littlewood-Paley estimates that first appeared in [26]. For
instance, using our main Theorem 2 and techniques from [7], [20], [26], we can prove that
for any $f, g \in C_c^\infty(\mathbb{R}^2)$ the quantity
\[ 2\int_0^{+\infty}\!\!\int_{\mathbb{R}^2} \left( \left|\frac{\partial f^h}{\partial x_1}\right|^2 + \left|\frac{\partial f^h}{\partial x_2}\right|^2 \right)^{\frac{1}{2}} \left( \left|\frac{\partial g^h}{\partial x_1}\right|^2 + \left|\frac{\partial g^h}{\partial x_2}\right|^2 \right)^{\frac{1}{2}} dx\,dt \]
is bounded from above by
\[ (p^* - 1)(1 + c\sqrt{\delta})\, \|f\|_{L^p(w)}\, \|g\|_{L^{p'}(w^{1-p'})}, \]
for any $A_p$ weight $w$ on $\mathbb{R}^2$ with $[w]_{A_p} \le 1 + \delta < 2$. The functions $f^h, g^h$ on the left hand side are
the heat extensions of $f, g$ respectively, $c$ is a constant that depends on $p \in (1, +\infty)$, and
$p^* - 1 = \max\{p - 1, \frac{1}{p-1}\}$. For example, for an $f$ on $\mathbb{R}^d$, say, the heat extension to $\mathbb{R}^{d+1}_+$ is

the convolution of $f$ with the heat kernel, that is
\[ f^h(x, t) = \frac{c_d}{t^{d/2}} \int_{\mathbb{R}^d} f(y) \exp\left( -\frac{|x - y|^2}{4t} \right) dy. \]
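
As an illustration of the heat extension, here is a small numerical sketch in dimension $d = 1$. It assumes the standard normalization $(4\pi t)^{-1/2}$ of the heat kernel and uses a plain trapezoid rule; the function names are hypothetical. The closed form used for comparison is the Gaussian convolution identity for $f(y) = e^{-y^2}$.

```python
import math

def heat_extension_1d(f, x, t, half_width=12.0, n=4000):
    """Approximate f^h(x, t) = (4*pi*t)^(-1/2) * int f(y) exp(-|x-y|^2/(4t)) dy
    with the trapezoid rule on [x - half_width, x + half_width]."""
    a = x - half_width
    step = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        y = a + k * step
        weight = 0.5 if k in (0, n) else 1.0           # trapezoid end-point weights
        total += weight * f(y) * math.exp(-(x - y) ** 2 / (4.0 * t))
    return total * step / math.sqrt(4.0 * math.pi * t)

def gaussian(y):
    return math.exp(-y * y)

def exact_gaussian_extension(x, t):
    """Closed form of the heat extension of exp(-y^2)."""
    return math.exp(-x * x / (1.0 + 4.0 * t)) / math.sqrt(1.0 + 4.0 * t)

approx = heat_extension_1d(gaussian, 0.3, 0.5)
exact = exact_gaussian_extension(0.3, 0.5)
print(approx, exact)
```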

The main Theorem of this chapter is the following.
Theorem 10. For any $1 < Q < 2$, $1 < p < +\infty$ define the domain $D_Q^p = \{0 <
(X, Y, x, y, r, s) \in \mathbb{R} \times \mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R} \times \mathbb{R} : |x|^p < X s^{p-1},\ |y|^{p'} < Y r^{p'-1},\ 1 < r s^{p-1} < Q\}$.
Let $K$ be any compact subset of $D_Q^p$. Then there exists a function $B = B^{(p)}_{Q,K}(X, Y, x, y, r, s)$
infinitely differentiable in a small neighborhood of $K$, and at the same time for any $\varepsilon > 0$,
$B_{Q,K}$ can be chosen in such a way that
(1) $0 \le B \le (p^* - 1)(1 + \varepsilon)(1 + c\sqrt{\delta})\, X^{1/p} Y^{1/p'}$,
(2) $-d^2 B \ge 2|dx||dy|$,
where $Q = 1 + \delta$ and $c$ is a constant that depends on $p$ and the dimension $d$.
Proof. By Theorem 2 we know that for the martingale transform $T_r$ and an $A_p$ weight $w$
on $\mathbb{R}$ of characteristic $[w]_{A_p} < 1 + \delta < 2$,
\[ \|T_r\|_{L^p(w)\to L^p(w)} \le \|T_r\|_{L^p\to L^p} (1 + c\sqrt{\delta}), \]
where $c$ is a constant that depends on $p$. It is easy to see that the interpolation in
[29] works for the vectorized martingale transform $T_r$. This means that using the techniques
from the previous chapter, the above inequality is also true for the vectorized martingale
transform (acting on functions with values in a separable Hilbert space). But a famous
result by Burkholder (see [3], and an extension of it in [7]) states that $\|T_r\|_{L^p\to L^p} = p^* - 1$.
Therefore,
\[ \|T_r\|_{L^p(w)\to L^p(w)} \le (p^* - 1)(1 + c\sqrt{\delta}). \]
Now, by using duality we arrive at the point (we denote by $|\cdot|$ the norm in our Hilbert
space) that the expression
\[ \frac{1}{4|J|} \sum_{I \in \mathcal{D}(J)} \left| \langle f \rangle_{I_+} - \langle f \rangle_{I_-} \right| \left| \langle g \rangle_{I_+} - \langle g \rangle_{I_-} \right| |I| \]
is bounded from above by
\[ (p^* - 1)(1 + c\sqrt{\delta})\, \langle |f|^p w \rangle_J^{1/p}\, \langle |g|^{p'} w^{1-p'} \rangle_J^{1/p'} \]
for any $J \in \mathcal{D}$ and any vector functions $f \in L^p(w)$ and $g \in L^{p'}(w^{1-p'})$. The definition of the
Bellman function is the following.
\[ B(X, Y, x, y, r, s) = \sup \Big\{ \frac{1}{4|J|} \sum_{I \in \mathcal{D}(J)} \left| \langle f \rangle_{I_+} - \langle f \rangle_{I_-} \right| \left| \langle g \rangle_{I_+} - \langle g \rangle_{I_-} \right| |I| : \]
\[ \langle f \rangle_J = x,\ \langle g \rangle_J = y,\ \langle w \rangle_J = r,\ \langle w^{1-p'} \rangle_J = s,\ \langle |f|^p w \rangle_J = X,\ \langle |g|^{p'} w^{1-p'} \rangle_J = Y \Big\}. \]
Obviously, this function satisfies inequality (1) in the statement of our Theorem and it
does not depend on the choice of the interval $J$, since averages of functions are translation invariant.

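We claim that for all 6-tuples $a^+ = (X^+, Y^+, x^+, y^+, r^+, s^+)$, $a^- =
(X^-, Y^-, x^-, y^-, r^-, s^-) \in D_Q^p$ such that $\frac{a^+ + a^-}{2} \in D_Q^p$, the following inequality is true:
\[ B\left( \frac{a^+ + a^-}{2} \right) - \frac{B(a^+) + B(a^-)}{2} \ge \frac{1}{4} |x^+ - x^-||y^+ - y^-|. \]
To prove this let us consider a positive $\varepsilon$. Find functions $f^+, g^+, w^+$ on $J_+$ such that they
satisfy the conditions in the supremum of the function $B$ for the vector $a^+$ and
\[ B(a^+) - \varepsilon \le \frac{1}{4|J_+|} \sum_{I \in \mathcal{D}(J_+)} \left| \langle f^+ \rangle_{I_+} - \langle f^+ \rangle_{I_-} \right| \left| \langle g^+ \rangle_{I_+} - \langle g^+ \rangle_{I_-} \right| |I|. \]
Do the same for the vector $a^-$ in the interval $J_-$. Define the functions $F, G, W$ on the
interval $J$ as:
\[ F = \begin{cases} f^+ & \text{on } J_+ \\ f^- & \text{on } J_- \end{cases}, \qquad G = \begin{cases} g^+ & \text{on } J_+ \\ g^- & \text{on } J_- \end{cases} \qquad \text{and} \qquad W = \begin{cases} w^+ & \text{on } J_+ \\ w^- & \text{on } J_- \end{cases}. \]
Observe that they satisfy the required equalities in order to be acceptable for the supremum
that defines the Bellman function for the vector $\frac{a^+ + a^-}{2}$. Writing $F_I = \langle F \rangle_I$, and similarly for the other functions, and using $|J| = 2|J_\pm|$, we get
\[ B\left( \frac{a^+ + a^-}{2} \right) \ge \frac{1}{4|J|} \sum_{I \in \mathcal{D}(J)} |F_{I_+} - F_{I_-}||G_{I_+} - G_{I_-}||I| \]
\[ = \frac{1}{2} \cdot \frac{1}{4|J_+|} \sum_{I \in \mathcal{D}(J_+)} |f^+_{I_+} - f^+_{I_-}||g^+_{I_+} - g^+_{I_-}||I| + \frac{1}{2} \cdot \frac{1}{4|J_-|} \sum_{I \in \mathcal{D}(J_-)} |f^-_{I_+} - f^-_{I_-}||g^-_{I_+} - g^-_{I_-}||I| + \frac{1}{4|J|}\,\mathrm{term}(I = J) \]
\[ \ge \frac{1}{2}(B(a^+) - \varepsilon) + \frac{1}{2}(B(a^-) - \varepsilon) + \frac{1}{4}|x^+ - x^-||y^+ - y^-|. \]
Since $\varepsilon$ is arbitrary, the claim follows. Now we need to mollify this function $B$ in order to obtain a smooth version of it. This can
be done in exactly the same way as in [20]. The concavity inequality remains the same after
the mollification and the size condition can become $1 + C_K\varepsilon$ times worse, where $C_K$ is just
a constant that depends on the compact set $K$.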
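
The splitting of the dyadic sum over $\mathcal{D}(J)$ into the sums over $\mathcal{D}(J_+)$ and $\mathcal{D}(J_-)$ plus the top term $I = J$ can be checked on step functions. The sketch below assumes scalar functions that are constant on $2^n$ equal cells of $J$ (so that smaller dyadic intervals contribute nothing) and uses the normalization $\frac{1}{4|J|}$ from the definition of $B$; the names `average` and `dyadic_sum` are hypothetical.

```python
import random

def average(values):
    return sum(values) / len(values)

def dyadic_sum(f, g):
    """(1/(4|J|)) * sum over dyadic I in D(J) of
    |<f>_{I+} - <f>_{I-}| * |<g>_{I+} - <g>_{I-}| * |I|,
    for f, g given by their values on 2^n equal cells of J."""
    n = len(f)
    total = 0.0
    size = n                                   # number of cells in the current I
    while size >= 2:
        half = size // 2
        for start in range(0, n, size):
            df = average(f[start + half:start + size]) - average(f[start:start + half])
            dg = average(g[start + half:start + size]) - average(g[start:start + half])
            total += abs(df) * abs(dg) * (size / n)    # size / n equals |I| / |J|
        size = half
    return total / 4.0

rng = random.Random(0)
cells = 16
F = [rng.uniform(-1.0, 1.0) for _ in range(cells)]
G = [rng.uniform(-1.0, 1.0) for _ in range(cells)]
half = cells // 2

lhs = dyadic_sum(F, G)
top_term = 0.25 * abs(average(F[half:]) - average(F[:half])) * abs(average(G[half:]) - average(G[:half]))
rhs = 0.5 * dyadic_sum(F[:half], G[:half]) + 0.5 * dyadic_sum(F[half:], G[half:]) + top_term
print(lhs, rhs)
```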
For a nice application of Theorem 10, we can formulate the following result.
Theorem 11. Let $1 < p < +\infty$, and let $w$ be any scalar $A_p$ weight on $\mathbb{R}^d$ with $[w]_{A_p} < 1 + \delta < 2$.
Then
\[ \|R\|_{L^p(\mathbb{R}^d, \mathbb{R}^d, w\,dx)\to L^p(\mathbb{R}^d, \mathbb{R}^d, w\,dx)} \le (p^* - 1)(1 + c\sqrt{\delta}), \]
where $R = (R_i R_j)_{i,j=1}^d$ is a matrix with each entry a product of two Riesz transforms.
Observe that if we let $\delta$ go to 0, which means $w$ becomes a constant weight, we obtain
\[ \|R\|_{L^p(\mathbb{R}^d, \mathbb{R}^d)\to L^p(\mathbb{R}^d, \mathbb{R}^d)} \le p^* - 1. \]

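Proof. We can show that for any $A_p$ weight $w$ on $\mathbb{R}^d$ with $[w]_{A_p} \le 1 + \delta < 2$, and any vector
functions $\Phi = (\varphi_1, ..., \varphi_d)$, $\Psi = (\psi_1, ..., \psi_d) \in C_c^\infty(\mathbb{R}^d)$ the quantity
\[ 2\int_{\mathbb{R}^{d+1}_+} \left( \sum_{i,j=1}^d \left| \frac{\partial \varphi_j^h(x,t)}{\partial x_i} \right|^2 \right)^{\frac{1}{2}} \left( \sum_{i,j=1}^d \left| \frac{\partial \psi_j^h(x,t)}{\partial x_i} \right|^2 \right)^{\frac{1}{2}} dx\,dt \]
is bounded from above by
\[ (p^* - 1)(1 + c\sqrt{\delta})\, \|\Phi\|_{L^p(w)}\, \|\Psi\|_{L^{p'}(w^{1-p'})}. \]
The proof of this inequality follows the standard techniques appearing in [7], [20], [26], in
which the existence of the Bellman function implies a Littlewood-Paley type estimate which,
in its turn, implies the desired estimate.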

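In addition, expressing the norm of $R$ by duality we obtain (here $\Phi = (\varphi_j)_{j=1}^d$, $\Psi =
(\psi_i)_{i=1}^d$ are vector functions on $\mathbb{R}^d$):
\[ \langle R\Phi, \Psi \rangle = 2\int_{\mathbb{R}^{d+1}_+} \sum_{i,j=1}^d \frac{\partial^2 \varphi_j^h(x,t)}{\partial x_i \partial x_j}\, \psi_i^h(x,t)\,dx\,dt = 2\int_{\mathbb{R}^{d+1}_+} \sum_{i,j=1}^d \frac{\partial \varphi_j^h(x,t)}{\partial x_j} \frac{\partial \psi_i^h(x,t)}{\partial x_i}\,dx\,dt, \]
where we get the second equality because $\varphi_j, \psi_i$ are smooth with compact support, and
hence $\varphi_j^h, \psi_i^h$ are Schwartz functions. Now, we only need to observe that:
\[ \int_{\mathbb{R}^{d+1}_+} \sum_{i,j=1}^d \frac{\partial \varphi_j^h(x,t)}{\partial x_j} \frac{\partial \psi_i^h(x,t)}{\partial x_i}\,dx\,dt = \int_{\mathbb{R}^{d+1}_+} \sum_{i=1}^d \sum_{j=1}^d \frac{\partial \varphi_j^h(x,t)}{\partial x_i} \frac{\partial \psi_i^h(x,t)}{\partial x_j}\,dx\,dt \]
which in its turn is equal to
\[ \int_{\mathbb{R}^{d+1}_+} \mathrm{trace}\left( \left( \frac{\partial \varphi_j^h(x,t)}{\partial x_i} \right)_{i,j=1}^d \left( \frac{\partial \psi_i^h(x,t)}{\partial x_j} \right)_{i,j=1}^d \right) dx\,dt, \]
and that on the other hand, pointwise,
\[ \left| \mathrm{trace}\left( \left( \frac{\partial \varphi_j^h(x,t)}{\partial x_i} \right)_{i,j=1}^d \left( \frac{\partial \psi_i^h(x,t)}{\partial x_j} \right)_{i,j=1}^d \right) \right| \le \left\| \left( \frac{\partial \varphi_j^h(x,t)}{\partial x_i} \right)_{i,j=1}^d \right\|_2 \left\| \left( \frac{\partial \psi_i^h(x,t)}{\partial x_j} \right)_{i,j=1}^d \right\|_2. \]
This means we are done.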
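
The pointwise bound used in the last step is the Cauchy-Schwarz inequality for the trace pairing, $|\mathrm{trace}(AB)| \le \|A\|_2 \|B\|_2$ with $\|\cdot\|_2$ the Hilbert-Schmidt norm. A quick randomized sanity check, assuming real $3 \times 3$ matrices stored as nested lists (the helper names are hypothetical):

```python
import math
import random

def trace_of_product(a, b):
    """trace(A B) = sum over i, j of A[i][j] * B[j][i]."""
    d = len(a)
    return sum(a[i][j] * b[j][i] for i in range(d) for j in range(d))

def hilbert_schmidt_norm(a):
    """Square root of the sum of the squares of all entries."""
    return math.sqrt(sum(x * x for row in a for x in row))

rng = random.Random(1)
d = 3
worst_ratio = 0.0
for _ in range(1000):
    A = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(d)]
    B = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(d)]
    ratio = abs(trace_of_product(A, B)) / (hilbert_schmidt_norm(A) * hilbert_schmidt_norm(B))
    worst_ratio = max(worst_ratio, ratio)
print(worst_ratio)    # never exceeds 1 (up to rounding)
```

Equality holds when $B$ is a multiple of the transpose of $A$, which is why the constant 1 in the bound cannot be improved.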

Chapter 5
Matrix weights and the A2 condition
In the previous chapters we considered only scalar weights $w$ and studied the behavior of the
operator norm of an operator $T$ on $L^p(w)$, $1 < p < \infty$, with respect to the weight $w$. We
treated all operators at the same time, meaning that the exact same proof works for all of
them. The only property that we required from the operator was that it is strongly bounded
on $L^p(w)$ and that its operator norm depends only on the $A_p$ characteristic of the weight.
Weighted estimates for matrix valued weights $W$ have also been studied in the literature and
one of the main references is the paper [30] by Dr. Treil and Dr. Volberg. They considered
$L^1_{\mathrm{loc}}$ matrices $W \in \mathbb{C}^{d\times d}$ that are invertible, self-adjoint and positive almost everywhere
with respect to Lebesgue measure. One of their main results is the characterization of all
matrices $W$ with the property that the inequality
\[ \left( \int_{\mathbb{R}} (W(x)Hf(x), Hf(x))_{\mathbb{C}^d}\,dx \right)^{\frac{1}{2}} \le C \left( \int_{\mathbb{R}} (W(x)f(x), f(x))_{\mathbb{C}^d}\,dx \right)^{\frac{1}{2}} \]
holds for all $f \in L^2(W)$, for some positive constant $C$ that depends on the dimension $d$
and the weight $W$, where $H$ is the Hilbert transform that acts coordinate-wise on the vector
function $f$. The space $L^2(W)$ consists of all measurable functions $f : \mathbb{R} \to \mathbb{C}^d$ such that
\[ \|f\|^2_{L^2(W)} = \|f\|^2_{2,W} = \int_{\mathbb{R}} (W(x)f(x), f(x))_{\mathbb{C}^d}\,dx < \infty. \]
Their Theorem states that the class of such weights is the matrix $A_2$ class that consists of
all $W$ that satisfy
\[ [W]_{A_2} = \sup_I \left\| \langle W \rangle_I^{\frac{1}{2}} \langle W^{-1} \rangle_I^{\frac{1}{2}} \right\| < \infty, \]
where the supremum is taken over all finite intervals $I$ of the real line $\mathbb{R}$, and the quantities
$\langle W \rangle_I$, $\langle W^{-1} \rangle_I$ are used to denote the averages of $W$ and $W^{-1}$ over the interval $I$
respectively. Notice that this characteristic for the matrix weight $W$ is a generalization of
the scalar one and that the former is the square root of the latter. From now on in this paper,
wherever we write the symbol for the $A_2$ characteristic we mean the one given for matrix
weights. Throughout the paper we always assume that the weight $W$ is non-degenerate in
the sense that there is no vector $e \in \mathbb{C}^d$ such that $W(t)e = 0$ almost everywhere, because
otherwise we can always restrict ourselves to the orthogonal complement of such $e$. We have
to point out that the $A_2$ condition just stated is equivalent to the requirement that
\[ \langle W^{-1} \rangle_I \le [W]^2_{A_2} \langle W \rangle_I^{-1}, \]
in the sense of quadratic forms. In addition, it is equivalent to the statement that all
averaging operators
\[ f \to \left( \frac{1}{|I|} \int_I f(x)\,dx \right) \chi_I \]
are uniformly bounded in $L^2(W)$ with respect to the finite interval $I$. For this reason, if we
consider any direction $e \in \mathbb{C}^d$, $\|e\| = 1$, we see that the scalar weight $w_e(x) = (W(x)e, e)_{\mathbb{C}^d}$
is an $A_2$ weight of characteristic at most $[W]_{A_2}$. Immediately we obtain that the diagonal
elements of $W$ are scalar $A_2$ weights and that the weight $\mathrm{trace}(W)$ is also an $A_2$ weight of
characteristic at most $d \cdot [W]_{A_2}$.


The motivation for studying estimates of this type comes from stochastic processes and
operator theory. Let us consider a multivariate random stationary process. For simplicity
we consider the case of discrete time, i.e. a sequence of $d$-tuples $x(n) = (x_1(n), ..., x_d(n))$,
$n \in \mathbb{Z}$, of scalar random variables such that $E|x_j(n)|^2 < +\infty$ and the correlation matrix
\[ Q(n, k) = \{Q(n, k)_{i,j}\}_{1\le i,j\le d} := \{E\, x_i(n)\overline{x_j(k)}\}_{1\le i,j\le d} \]
depends only on the difference $n - k$ (we use the symbol $E$ to denote the expectation).
Without loss of generality we can assume that the process is complex valued. It is well
known (see [27]) that there exists a matrix valued non-negative measure $M$ on the unit
circle $\mathbb{T}$ whose Fourier coefficients coincide with the entries of the correlation matrix,
\[ Q(n, k) = \widehat{M}(n - k), \]
$n, k \in \mathbb{Z}$, and that if the process is completely regular then its spectral measure $M$ is
absolutely continuous with respect to the normalized Lebesgue measure $m$ on the unit circle,
i.e. $dM = W\,dm$. The past of the process is defined as
\[ X_n = \mathrm{span}\{x_j(k) : 1 \le j \le d,\ k < n\} \]
and the future as
\[ X^n = \mathrm{span}\{x_j(k) : 1 \le j \le d,\ k \ge n\}. \]
By writing span we mean the closed linear span in the complex Hilbert space $L^2(\Omega, dP)$. If
we consider the mapping
\[ x_j(k) \to z^k e_j, \]
where $\{e_j\}_{1\le j\le d}$ is the standard orthonormal basis of $\mathbb{C}^d$, then we obtain an isometric
isomorphism between $\mathrm{span}\{x_j(k) : 1 \le j \le d,\ k \in \mathbb{Z}\}$ and $L^2(W)$. The past and the future
of the process are mapped to the subspaces of $L^2(W)$
\[ X_n = \mathrm{span}\{z^k \mathbb{C}^d : k < n\} \]
and
\[ X^n = \mathrm{span}\{z^k \mathbb{C}^d : k \ge n\}, \]
respectively. In this representation the angle between past and future is nonzero if and only
if the Riesz projection $P_+$ is bounded in the weighted space $L^2(W)$. All these applications
are thoroughly discussed in the introduction of [30] and the references therein.
In this chapter we are going to study the behavior of the operator norm of some dyadic
operators on $L^2(W)$ with respect to a matrix weight $W$. As we shall see, there are many
important differences between the scalar and the matrix cases. We will prove that for a
higher-dimensional analogue of the Martingale transform the operator norm on $L^2(W)$ does not
approach the unweighted norm as the matrix weight $W$ "approaches" the identity matrix
$Id$. This is already in contrast with the scalar case, where such a thing cannot happen, as we
showed in the previous chapters. It seems that in more than one dimension
the flatness (meaning closeness to 1) of the $A_2$ characteristic does not suffice for continuity
results of the kind of Theorem 2. It is also interesting and surprising that trivial dyadic
operators are examples of such badly behaved operators. Here we should mention that
none of the techniques we used in the scalar case work for the matrix case. Firstly, the
Riesz-Thorin interpolation theorem with change in measure does not work and secondly,
there is no useful $BMO$ theory for the case of matrices, useful in the sense that there is a
nice interplay between $BMO$ and the $A_2$ class.

5.1 The not well behaved dyadic operators

Before we study such examples we need to establish some notation. For a function $f$ (scalar
or matrix valued) and a finite interval $I$ we denote by $\langle f \rangle_I$ the average of $f$ over the
interval $I$, that is the number $\frac{1}{|I|}\int_I f(x)\,dx$. For a given interval $I$ we denote the right half
as $I_+$ and the left half as $I_-$. For such an interval there is a Haar function associated to it,
which we call $h_I$, defined in the following way
\[ h_I(x) = \frac{1}{|I|^{1/2}} \left( \chi_{I_+}(x) - \chi_{I_-}(x) \right), \]
where $\chi_A(x)$ represents the characteristic function of the set $A$. It is obvious that if we
restrict ourselves to dyadic subintervals $I, J \in \mathcal{D}$ of the real line we have
\[ (h_I, h_J)_{L^2} := \int_{\mathbb{R}} h_I(x)h_J(x)\,dx = \delta_{IJ}, \]
where $\delta_{IJ}$ is equal to 1 if $I = J$ and equal to 0 if $I \ne J$. By $\mathcal{D}$ we denote the set $\mathcal{D} =
\cup_{k=-\infty}^{+\infty} \mathcal{D}_k$, where $\mathcal{D}_k = \{[\frac{j}{2^k}, \frac{j+1}{2^k}) : j \in \mathbb{Z}\}$. In addition, for an interval $J \in \mathcal{D}$ we are going
to denote by $\mathcal{D}(J)$ the set of all dyadic subintervals of $J$, including $J$ itself.
Given a sequence of signs enumerated by the dyadic intervals, $\sigma = \{\sigma_I\}_{I\in\mathcal{D}}$, $\sigma_I \in
\{-1, +1\}$, we define the martingale transform $T_\sigma$ of a function $f$ to be
\[ T_\sigma f(x) = \sum_{I\in\mathcal{D}} \sigma_I (f, h_I)_{L^2}\, h_I(x). \]
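
To make the definition concrete, here is a minimal discrete sketch of $T_\sigma$. It assumes that $f$ is supported on $[0, 1)$, has mean zero there, and is constant on $2^n$ equal cells, so that only the intervals in $\mathcal{D}([0,1))$ larger than a cell contribute. The function names and the dictionary encoding of $\sigma$ are hypothetical choices made for this illustration.

```python
import math
import random

def martingale_transform(f, sigma):
    """T_sigma f = sum over dyadic I in D([0,1)) of sigma_I (f, h_I) h_I,
    for f given by its values on N = 2^n equal cells of [0,1).
    sigma maps (first_cell, cells_in_I) to +1 or -1."""
    n_cells = len(f)
    out = [0.0] * n_cells
    size = n_cells
    while size >= 2:
        half = size // 2
        length = size / n_cells                                  # |I|
        for start in range(0, n_cells, size):
            right = sum(f[start + half:start + size]) / n_cells  # integral of f over I+
            left = sum(f[start:start + half]) / n_cells          # integral of f over I-
            coeff = (right - left) / math.sqrt(length)           # (f, h_I)
            value = sigma[(start, size)] * coeff / math.sqrt(length)
            for i in range(start, start + half):
                out[i] -= value                                  # h_I < 0 on the left half I-
            for i in range(start + half, start + size):
                out[i] += value                                  # h_I > 0 on the right half I+
        size = half
    return out

def l2_norm(f):
    return math.sqrt(sum(x * x for x in f) / len(f))

rng = random.Random(2)
n_cells = 32
f = [rng.uniform(-1.0, 1.0) for _ in range(n_cells)]
mean = sum(f) / n_cells
f = [x - mean for x in f]                    # make f mean zero on [0,1)

sigma, size = {}, n_cells
while size >= 2:
    for start in range(0, n_cells, size):
        sigma[(start, size)] = rng.choice((-1, 1))
    size //= 2

Tf = martingale_transform(f, sigma)
print(l2_norm(f), l2_norm(Tf))               # equal: T_sigma is an isometry in unweighted L^2
```

With all signs equal to $+1$ the transform returns $f$ itself, which reflects the orthonormality of the Haar system.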

The boundedness properties of this operator on $L^2(w)$ for a scalar $A_2$ weight $w$ were studied
in [33], where the sharp dependence of the operator norm on the $A_2$ characteristic was
found for the first time. For a function $f : \mathbb{R} \to \mathbb{C}^d$ we can define the Martingale transform
$T_\sigma f$ to be the vector in $\mathbb{C}^d$ with coordinates the numbers $(T_\sigma f_1, T_\sigma f_2, ..., T_\sigma f_d)$. For our
purposes we will define a more general operator than this, which we still call the Martingale
transform of the function $f$, defined as
\[ M_\sigma f(x) = \sum_{I\in\mathcal{D},\ 1\le j\le d} \sigma_I^j\, (f, h_I^j)_{L^2}\, h_I^j(x), \]
where the function $h_I^j(x)$ is the vector $h_I e_j$ and $\{e_j\}_{1\le j\le d}$ is a fixed orthonormal basis of
$\mathbb{C}^d$. Here the sequence $\sigma = \{\sigma_I^j\}$, $I \in \mathcal{D}$ and $1 \le j \le d$, is again a sequence of signs.
Let us also define the projections $P_{I,j}$ (that are orthogonal in the unweighted $L^2$ space)
in $L^2(W)$ by the formula
\[ P_{I,j} f = h_I \left( \int_I f(t)h_I(t)\,dt,\ e_j \right)_{\mathbb{C}^d} e_j. \]
Notice that $P_{I,j} f = (f, W^{-1}h_I^j)_{2,W}\, h_I^j$, where we denote by $(\cdot, \cdot)_{2,W}$ the inner product in
$L^2(W)$, from which it follows
\[ \|P_{I,j}\|^2_{2,W} = \|W^{-1}h_I^j\|^2_{2,W}\, \|h_I^j\|^2_{2,W}. \]
Observe that after some easy calculations we obtain
\[ \|P_{I,j}\|^2_{2,W} = (\langle W \rangle_I e_j, e_j)_{\mathbb{C}^d}\, (\langle W^{-1} \rangle_I e_j, e_j)_{\mathbb{C}^d}. \]

The claim is that such a quantity cannot be controlled by the $A_2$ characteristic. Let us choose
a matrix weight $W \in \mathbb{C}^{2\times 2}$ and an orthonormal basis $\{e_j\}_{1\le j\le 2}$ in $\mathbb{C}^2$ such that the operator
norm of $P_{I,j}$ is not close to its unweighted norm, which is 1, no matter how "close" $W$ is to
the identity matrix $Id$. Assume that one of the basis vectors is $e_1 = \frac{1}{\sqrt{2}}(1, 1)$ and that $W$
is a diagonal $2 \times 2$ matrix with two scalar $A_2$ weights $w$ and $v$ for diagonal elements, with
$w$ in the $(1, 1)$ spot and $v$ in the $(2, 2)$ spot. Then
\[ \|P_{I,1}\|^2_{2,W} = \frac{1}{4} \left( \langle w \rangle_I + \langle v \rangle_I \right) \left( \langle w^{-1} \rangle_I + \langle v^{-1} \rangle_I \right) \]
\[ \ge \frac{1}{4} \left( 2 + \langle w \rangle_I \langle v^{-1} \rangle_I + \langle w^{-1} \rangle_I \langle v \rangle_I \right) \ge \frac{1}{4} \left( 2 + \frac{\langle w \rangle_I}{\langle v \rangle_I} + \frac{\langle v \rangle_I}{\langle w \rangle_I} \right), \]
and the weights $w, v$ have no relation with each other. Both of them can have $A_2$ characteristic
close to 1, as close as we like, but the quotients $\frac{\langle w \rangle_I}{\langle v \rangle_I}$ and $\frac{\langle v \rangle_I}{\langle w \rangle_I}$ cannot be controlled
in general. This means that even though the matrix weight $W$ has $A_2$ characteristic close
to 1, the norm $\|P_{I,1}\|_{2,W}$ is not close to $\|P_{I,1}\|_{2,Id} = 1$. Here notice that the Martingale
transform $M_\sigma$ can be written in the form
\[ M_\sigma f = \sum_{I\in\mathcal{D},\ 1\le j\le d} \sigma_I^j\, (f, W^{-1}h_I^j)_{L^2(W)}\, h_I^j(x) = \sum_{I\in\mathcal{D},\ 1\le j\le d} \sigma_I^j\, P_{I,j} f. \]

Since the operator norms of the projections $P_{I,j}$ in $L^2(W)$ are not continuous with respect
to the weight $W$, we cannot expect this Martingale transform to have an operator norm in
$L^2(W)$ that is continuous with respect to $W$. Now we define the projection
\[ P_I f = \sum_{1\le j\le d} P_{I,j} f = h_I \left( \int_I f(t)h_I(t)\,dt \right). \]
In [30] it has been proved that the operator norm in $L^2(W)$ is exactly equal to
\[ \|P_I\|_{2,W} = \left\| \langle W \rangle_I^{\frac{1}{2}} \langle W^{-1} \rangle_I^{\frac{1}{2}} \right\|. \]
This expression is obviously less than or equal to $[W]_{A_2}$, which immediately shows that $P_I$
behaves nicely, unlike its component operators $P_{I,j}$. Here we
have a collection of projections $\{P_{I,j}\}_{1\le j\le d}$ such that some of them do not become "flat" as
the weight $W$ becomes "flat" but their sum is "flat". Therefore, we already see that strange
things can occur in more than one dimension.
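
The diagonal example can be made fully explicit with constant weights. The sketch below assumes $w \equiv 1$ and $v \equiv t$ on $I$, so that $W = \mathrm{diag}(1, t)$ is constant; the function names are hypothetical. For a diagonal weight the matrix $\langle W \rangle_I^{1/2} \langle W^{-1} \rangle_I^{1/2}$ is diagonal, so its norm is the larger diagonal entry.

```python
def projection_norm_squared(w_avg, v_avg, w_inv_avg, v_inv_avg):
    """||P_{I,1}||^2 in L^2(W) for W = diag(w, v) and e_1 = (1, 1)/sqrt(2):
    (<W>_I e_1, e_1) * (<W^{-1}>_I e_1, e_1)."""
    return 0.25 * (w_avg + v_avg) * (w_inv_avg + v_inv_avg)

def full_projection_norm(w_avg, v_avg, w_inv_avg, v_inv_avg):
    """||P_I|| = ||<W>_I^{1/2} <W^{-1}>_I^{1/2}|| for a diagonal weight W."""
    return max((w_avg * w_inv_avg) ** 0.5, (v_avg * v_inv_avg) ** 0.5)

# Constant weights w = 1 and v = t: the norm of P_I stays equal to 1,
# while the norm of the component P_{I,1} grows with t.
for t in (1.0, 10.0, 100.0):
    print(t,
          projection_norm_squared(1.0, t, 1.0, 1.0 / t),
          full_projection_norm(1.0, t, 1.0, 1.0 / t))
```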

5.2 Some results about "flatness" and a Riesz basis

In this section we will discuss some results which give us hope that there are important
quantities of dyadic harmonic analysis that obey the same rules as their scalar
analogues. We also present an important example of a Riesz basis for $L^2(W)$. All of them
were proved in [30], but we present them here to show that the dependence on the $A_2$
characteristic is the "correct" one and because we need them for our calculations.
We start with a Lemma.


Lemma 12. Let $A$ and $B$ be nonsingular positive $d \times d$ matrices. Then:
\[ \sqrt{\det A \det B} \le \det\left( \frac{A + B}{2} \right). \]
Proof. It suffices to prove the Lemma in the special case when $\frac{A+B}{2} = I$, since we can always
consider the matrices $C^*AC$, $C^*BC$ where the matrix $C = \left( \frac{A+B}{2} \right)^{-\frac{1}{2}}$. Write $A = I + D$,
$B = I - D$, $D = D^*$, and let $\lambda_1, ..., \lambda_d$ be the eigenvalues of $D$. Then the eigenvalues of $A$, $B$
are $1 + \lambda_1, ..., 1 + \lambda_d$ and $1 - \lambda_1, ..., 1 - \lambda_d$, respectively. It follows that:
\[ \det(AB) = \det A \det B = \prod_{i=1}^d (1 + \lambda_i)(1 - \lambda_i) \le 1 = \left( \det \frac{A+B}{2} \right)^2, \]
which is exactly what we need.
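
Lemma 12 is easy to test numerically. The following randomized check is a sketch restricted to real symmetric positive definite $2 \times 2$ matrices, each stored as a triple `(a, b, c)` for $\begin{pmatrix} a & b \\ b & c \end{pmatrix}$; this encoding and the helper names are assumptions made for the illustration.

```python
import math
import random

def random_positive_2x2(rng):
    """Return M^T M + 0.1*I as (a, b, c), i.e. the matrix [[a, b], [b, c]]."""
    m11, m12, m21, m22 = (rng.gauss(0, 1) for _ in range(4))
    a = m11 * m11 + m21 * m21 + 0.1
    b = m11 * m12 + m21 * m22
    c = m12 * m12 + m22 * m22 + 0.1
    return a, b, c

def det(sym):
    a, b, c = sym
    return a * c - b * b

rng = random.Random(3)
smallest_gap = float("inf")
for _ in range(10000):
    A = random_positive_2x2(rng)
    B = random_positive_2x2(rng)
    mid = tuple((x + y) / 2.0 for x, y in zip(A, B))     # entries of (A + B) / 2
    gap = det(mid) - math.sqrt(det(A) * det(B))
    smallest_gap = min(smallest_gap, gap)
print(smallest_gap)    # nonnegative up to rounding
```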
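Lemma 13. Let $W$ be a matrix weight such that $W$ and $W^{-1}$ are summable on a measurable
set $I$. Then for any vector $e \in \mathbb{C}^d$
\[ \frac{(\langle W \rangle_I e, e)_{\mathbb{C}^d}}{([\langle W^{-1} \rangle_I]^{-1} e, e)_{\mathbb{C}^d}} \ge 1. \]
Moreover, the operators $\langle W \rangle_I^{\frac{1}{2}} \langle W^{-1} \rangle_I^{\frac{1}{2}}$ are expanding in the sense that they satisfy
\[ \left\| \langle W \rangle_I^{\frac{1}{2}} \langle W^{-1} \rangle_I^{\frac{1}{2}} e \right\| \ge \|e\| \quad \text{for all vectors } e \in \mathbb{C}^d. \]
Proof. Write $W(I) = \int_I W(t)\,dt$ and $W^{-1}(I) = \int_I W^{-1}(t)\,dt$. Fix a vector $e$ and define $f = [W^{-1}(I)]^{-1} e$. Then
\[ \begin{aligned} |I| ([W^{-1}(I)]^{-1} e, e)_{\mathbb{C}^d} = |I| (e, f)_{\mathbb{C}^d} &= \int_I (W^{\frac{1}{2}}(t)e, W^{-\frac{1}{2}}(t)f)_{\mathbb{C}^d}\,dt \\ &\le \left( \int_I (W(t)e, e)_{\mathbb{C}^d}\,dt \right)^{\frac{1}{2}} \left( \int_I (W^{-1}(t)f, f)_{\mathbb{C}^d}\,dt \right)^{\frac{1}{2}} \\ &= (W(I)e, e)^{\frac{1}{2}}_{\mathbb{C}^d}\, (W^{-1}(I)f, f)^{\frac{1}{2}}_{\mathbb{C}^d} \\ &= (W(I)e, e)^{\frac{1}{2}}_{\mathbb{C}^d}\, ([W^{-1}(I)]^{-1} e, e)^{\frac{1}{2}}_{\mathbb{C}^d}. \end{aligned} \]
Thus,
\[ |I|^2 \le \frac{(W(I)e, e)_{\mathbb{C}^d}}{([W^{-1}(I)]^{-1} e, e)_{\mathbb{C}^d}}, \]
which is exactly what we wanted to prove.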
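With the help of these two Lemmas we are able to show the following.
Lemma 14. Let us consider a matrix weight $W \in A_2$ and write $W_I = \langle W \rangle_I$. There is a constant $c$ independent of
the weight $W$ such that for all $J \in \mathcal{D}$:
\[ \frac{1}{|J|} \sum_{I\in\mathcal{D}(J)} \mathrm{trace}\left( \left[ (W_I)^{-\frac{1}{2}} (W_{I_+} - W_{I_-}) (W_I)^{-\frac{1}{2}} \right]^2 \right) |I| \le c \log[W]_{A_2}, \]
and
\[ \frac{1}{|J|} \sum_{I\in\mathcal{D}(J)} \left\| (W_I)^{-\frac{1}{2}} (W_{I_+} - W_{I_-}) (W_I)^{-\frac{1}{2}} \right\|^2 |I| \le c \log[W]_{A_2}. \]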

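Proof. For a dyadic interval $I$ let us denote by $\mu(I) = \det W_I$, $\nu(I) = \det (W^{-1})_I$ and
$m(I) = \mu(I)\nu(I)$. Since $W_I = \frac{W_{I_+} + W_{I_-}}{2}$, Lemma 12 implies that $\mu(I)^2 \ge \mu(I_+)\mu(I_-)$ and
similarly $\nu(I)^2 \ge \nu(I_+)\nu(I_-)$. Also, we define the matrices:
\[ A = (W_I)^{-\frac{1}{2}} W_{I_+} (W_I)^{-\frac{1}{2}}, \qquad B = (W_I)^{-\frac{1}{2}} W_{I_-} (W_I)^{-\frac{1}{2}}. \]
Observe that $\frac{A+B}{2} = I$ and, as it was done in the proof of Lemma 12, we write
$A = I + D$, $B = I - D$ and let $\lambda_1, ..., \lambda_d$ be the eigenvalues of $D$. Then:
\[ \begin{aligned} \det A \det B = \prod_{i=1}^d (1 - \lambda_i^2) &= \exp\left( \sum_{i=1}^d \log(1 - \lambda_i^2) \right) \\ &\le \exp\left( -\sum_{i=1}^d \lambda_i^2 \right) = \exp(-\mathrm{trace}(D^2)) = \exp\left( -\frac{1}{4} \mathrm{trace}((A - B)^2) \right). \end{aligned} \]
Since $\det A \det B = \mu(I_+)\mu(I_-)/\mu(I)^2$, we have proved that
\[ \mu(I) \ge (\mu(I_+)\mu(I_-))^{\frac{1}{2}} \exp\left( \frac{1}{2} \cdot \frac{1}{4} \cdot \mathrm{trace}\left( \left[ (W_I)^{-\frac{1}{2}} (W_{I_+} - W_{I_-}) (W_I)^{-\frac{1}{2}} \right]^2 \right) \right). \]
But $\nu(I)^2 \ge \nu(I_+)\nu(I_-)$ and this implies
\[ m(I) \ge (m(I_+)m(I_-))^{\frac{1}{2}} \exp\left( \frac{1}{2} \cdot \frac{1}{4} \cdot \mathrm{trace}\left( \left[ (W_I)^{-\frac{1}{2}} (W_{I_+} - W_{I_-}) (W_I)^{-\frac{1}{2}} \right]^2 \right) \right). \]
Starting from $I = J$, applying this last inequality to $J_+, J_-$ and then to the halves of these intervals, we get on
the $n$th step
\[ m(J) \ge \left( \prod m(I) \right)^{\frac{1}{2^n}} \exp\left( \frac{1}{8} \sum \frac{|I|}{|J|} \cdot \mathrm{trace}\left( \left[ (W_I)^{-\frac{1}{2}} (W_{I_+} - W_{I_-}) (W_I)^{-\frac{1}{2}} \right]^2 \right) \right), \]
where the product is over all subintervals $I$ of $J$ of length $|I| = |J|2^{-n}$ and the summation
is over all subintervals $I$ of $J$ of length $|I| > |J|2^{-n}$. We know that the operators
$(W_I)^{\frac{1}{2}} ((W^{-1})_I)^{\frac{1}{2}}$ are expanding and this implies that $m(I) \ge 1$. We take logarithms, let $n$
go to infinity and obtain the inequality
\[ \frac{1}{|J|} \sum_{I\in\mathcal{D}(J)} \mathrm{trace}\left( \left[ (W_I)^{-\frac{1}{2}} (W_{I_+} - W_{I_-}) (W_I)^{-\frac{1}{2}} \right]^2 \right) |I| \le 8 \log \sup_{I_0} \left[ \det(W_{I_0}) \det((W^{-1})_{I_0}) \right]. \]
The right hand side of this inequality is less than or equal to the quantity
\[ 8 \log([W]^{2d}_{A_2}) = 16d \log[W]_{A_2}. \]
Therefore, we have shown that
\[ \frac{1}{|J|} \sum_{I\in\mathcal{D}(J)} \mathrm{trace}\left( \left[ (W_I)^{-\frac{1}{2}} (W_{I_+} - W_{I_-}) (W_I)^{-\frac{1}{2}} \right]^2 \right) |I| \le 16d \log[W]_{A_2}. \]
Notice that the matrix $(W_I)^{-\frac{1}{2}} (W_{I_+} - W_{I_-}) (W_I)^{-\frac{1}{2}}$ is self-adjoint, which implies that we
have the same estimate with the square of the operator norm of the matrix in the place of the
trace.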
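
In the scalar case $d = 1$ the argument above gives, for a positive weight $w$ that is constant on $2^n$ equal cells of $J$, the bound $\frac{1}{|J|}\sum_{I} \left( \frac{w_{I_+} - w_{I_-}}{w_I} \right)^2 |I| \le 8 \log\left( \langle w \rangle_J \langle w^{-1} \rangle_J \right)$, because $m(I) = 1$ on every cell. The sketch below checks this on a random step weight; the helper names are hypothetical and the weight is an arbitrary test case, not one taken from the text.

```python
import math
import random

def average(values):
    return sum(values) / len(values)

def carleson_sum(w):
    """(1/|J|) * sum over dyadic I in D(J) of ((w_{I+} - w_{I-}) / w_I)^2 * |I|,
    for a scalar weight w given by positive values on 2^n equal cells of J."""
    n = len(w)
    total = 0.0
    size = n
    while size >= 2:
        half = size // 2
        for start in range(0, n, size):
            w_minus = average(w[start:start + half])         # average over the left half
            w_plus = average(w[start + half:start + size])   # average over the right half
            w_i = 0.5 * (w_minus + w_plus)                   # average over I
            total += ((w_plus - w_minus) / w_i) ** 2 * (size / n)
        size = half
    return total

rng = random.Random(4)
w = [math.exp(rng.uniform(-1.0, 1.0)) for _ in range(64)]
lhs = carleson_sum(w)
m_J = average(w) * average([1.0 / x for x in w])             # <w>_J <w^{-1}>_J
rhs = 8.0 * math.log(m_J)
print(lhs, rhs)
```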
The following Lemma is a type of weighted Carleson embedding theorem.
Lemma 15. Let $W \in \mathbb{C}^{d\times d}$ be an $A_2$ matrix weight and consider the quantity
\[ \mu_I = |I| \left\| \langle W \rangle_I^{-\frac{1}{2}} \left( \langle W \rangle_{I_+} - \langle W \rangle_{I_-} \right) \langle W \rangle_I^{-\frac{1}{2}} \right\|^2. \]
Then there is a positive dimensional constant $c$ such that
\[ \sum_{I\in\mathcal{D}} \mu_I \left\| \langle W \rangle_I^{-\frac{1}{2}} \langle W^{\frac{1}{2}} f \rangle_I \right\|^2 \le c\,[W]^2_{A_2} \log[W]_{A_2}\, \|f\|^2_2 \]
holds for all $f \in L^2(\mathbb{R} \to \mathbb{C}^d)$.
This was proved in [30], but the authors were not interested in the dependence of the
inequality on the $A_2$ characteristic of the weight. If we simply follow their
proof we are able to obtain the square of $[W]_{A_2}$, and the logarithm with the use of Lemma
14.
Suppose now that we have a collection of subspaces $E_n$ of a Hilbert space $H$. We assume
that the only vector perpendicular to every $E_n$ is the zero vector. We will call such collections
complete.
The collection is called minimal if there is a family of bounded projections (not necessarily
orthogonal) $\mathcal{E}_n$ such that
\[ \mathcal{E}_n = \mathrm{Id} \cdot \chi_{E_n}, \]
that is, $\mathcal{E}_n$ acts as the identity on $E_n$ and vanishes on $E_k$ for $k \ne n$, and is called uniformly minimal if
\[ \sup_{n\in\mathbb{N}} \|\mathcal{E}_n\|_{H\to H} < \infty. \]
For a minimal collection of subspaces $E_n$ we can define the bi-orthogonal or dual system by
\[ E_n' = (\mathcal{E}_n)^*(H) = \mathrm{span}\{E_k : k \ne n\}^{\perp}, \]
where $(\mathcal{E}_n)^*$ denotes the dual operator of $\mathcal{E}_n$.
A complete system of subspaces $E_n$ is called an unconditional basis if there exists an
isomorphism $U$ from $H$ onto another Hilbert space $H'$ that maps the collection $E_n$ into an
orthogonal system. Such an isomorphism is called the orthogonalizer of the collection. An
equivalent statement is that there exists a constant $C > 0$ such that for any finite collection
of vectors $f_k \in E_k$
\[ \frac{1}{C} \sum \|f_k\|^2_H \le \left\| \sum f_k \right\|^2_H \le C \sum \|f_k\|^2_H. \]
Notice that if the subspaces $E_n$ were orthogonal then we would have that these quantities
are equal by the Pythagorean theorem. Instead of that we have that they are comparable.
A collection of vectors $f_n$ is called an unconditional basis if the corresponding system of
one dimensional spaces is an unconditional basis, and we call the collection of vectors a Riesz
basis if it is almost normalized, that is
\[ 0 < \inf_{n\in\mathbb{N}} \|f_n\|_H \le \sup_{n\in\mathbb{N}} \|f_n\|_H < \infty. \]
Notice that we do not allow the vectors $f_n$ to be arbitrarily large or arbitrarily small inside
the Hilbert space $H$.
The following is a very important result of [21].
Theorem 16. A complete collection of subspaces $E_n$ of a Hilbert space $H$ is an unconditional
basis if and only if it is uniformly minimal and the following two conditions hold for some
positive constant $C$ independent of $f$:
\[ \sum_n \|P_{E_n} f\|^2_H \le C \|f\|^2_H \qquad \text{and} \qquad \sum_n \|P_{E_n'} f\|^2_H \le C \|f\|^2_H, \]
for all $f \in H$, where by $P_{E_n}$ and $P_{E_n'}$ we denote the orthogonal projections onto $E_n$ and $E_n'$
respectively.
Our Hilbert space now is going to be $L^2(W)$ where $W \in \mathbb{C}^{d\times d}$ is an $A_2$ matrix weight.
A construction of a Riesz basis in $L^2(W)$ was done in [30]. We need this basis to define a
Martingale transform whose operator norm on $L^2(W)$ is going to be a continuous function
of the weight $W$. For this reason let us see how this construction was done.
Denote by $e_I^k$, $1 \le k \le d$, an orthonormal basis of $\mathbb{C}^d$ consisting of eigenvectors of the
positive self-adjoint matrix $\langle W \rangle_I$ and let
\[ w_I^k = \left( \frac{1}{|I|} \int_I (W(t)e_I^k, e_I^k)_{\mathbb{C}^d}\,dt \right)^{-\frac{1}{2}} = (\langle W \rangle_I e_I^k, e_I^k)^{-\frac{1}{2}}_{\mathbb{C}^d} = ([W_I]^{-1} e_I^k, e_I^k)^{\frac{1}{2}}_{\mathbb{C}^d} = \left\| \langle W \rangle_I^{-\frac{1}{2}} e_I^k \right\|. \]
Define the vectors
\[ f_I^k(x) = w_I^k\, h_I(x)\, e_I^k. \]
Observe that
\[ (f_I^k, f_J^{k'})_{L^2} = w_I^k w_J^{k'}\, (e_I^k, e_J^{k'})_{\mathbb{C}^d} \int_{\mathbb{R}} h_I(x)h_J(x)\,dx, \]
which is equal to zero if $I \ne J$ since $h_I \perp h_J$ in $L^2$, and if $k \ne k'$ and $I = J$ it is again equal
to zero since the vectors $e_I^k, e_I^{k'}$ are orthogonal in $\mathbb{C}^d$. This means that the vectors $\{f_I^k\}$ are
orthogonal in the unweighted $L^2$ space.
In the $L^2(W)$ space similar things happen but the situation is slightly different. For instance,

\[ (f_I^k, f_I^k)_{L^2(W)} = (w_I^k)^2 \int_{\mathbb{R}} h_I^2(x) (W(x) e_I^k, e_I^k)_{\mathbb{C}^d}\, dx = (w_I^k)^2 (\langle W \rangle_I e_I^k, e_I^k)_{\mathbb{C}^d} = 1 \]

and for $k \ne k'$ we have that $(f_I^k, f_I^{k'})_{L^2(W)} = 0$ since the vectors $e_I^k$, $e_I^{k'}$ are orthogonal in $\mathbb{C}^d$. But for $I \ne J$ we do not have orthogonality in general. The reason for that is that the vectors $e_I^k$ and $e_J^{k'}$ for $I \ne J$ have no relation for an arbitrary matrix weight $W$. This is a difficulty that we are able to overcome easily since we can prove that for $W$ with “flat” $A_2$ characteristic these vectors are almost orthogonal. We will make this more precise later.
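The vanishing of $(f_I^k, f_I^{k'})_{L^2(W)}$ for $k \ne k'$ uses that the $e_I^k$ are eigenvectors of $\langle W \rangle_I$, not only that they are orthogonal. A short sketch of this step, assuming the Haar function is normalized in $L^2$ so that $h_I^2 = |I|^{-1} \chi_I$, and writing $\lambda_I^k$ for the eigenvalue attached to $e_I^k$:

```latex
% Assumptions: h_I^2 = |I|^{-1} \chi_I and \langle W \rangle_I e_I^k = \lambda_I^k e_I^k.
\begin{align*}
(f_I^k, f_I^{k'})_{L^2(W)}
  &= w_I^k w_I^{k'} \int_{\mathbb{R}} h_I^2(x)\, (W(x) e_I^k, e_I^{k'})_{\mathbb{C}^d}\, dx \\
  &= w_I^k w_I^{k'}\, (\langle W \rangle_I e_I^k, e_I^{k'})_{\mathbb{C}^d} \\
  &= w_I^k w_I^{k'}\, \lambda_I^k\, (e_I^k, e_I^{k'})_{\mathbb{C}^d} = 0, \qquad k \ne k'.
\end{align*}
```

For $I \ne J$ the same computation leads to $\int h_I h_J\, (W e_I^k, e_J^{k'})_{\mathbb{C}^d}$, where the weight can no longer be replaced by a single average, so no cancellation of this kind is available.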
Now let us define a collection of spaces $E_I$ as

\[ E_I = \operatorname{span}\{f_I^k : 1 \le k \le d\} = h_I \mathbb{C}^d, \qquad I \in \mathcal{D}. \]

The vectors $\{f_I^k\}_{1 \le k \le d}$ constitute an orthonormal basis of $E_I$ inside $L^2(W)$. Notice that from our previous considerations it follows that the subspaces $E_I$ and $E_J$ are orthogonal in the unweighted $L^2$ space for $I \ne J$. The $E_I$'s are $d$-dimensional subspaces of $L^2(W) \cap L^2$. It is easy to prove that if a vector function $f$ is orthogonal to all $E_I$'s then $f = 0$ almost everywhere. This means that our collection $\{E_I\}_{I \in \mathcal{D}}$ is a complete system of subspaces.
Let us define the projections

\[ P_I f(x) = h_I(x) \int_I f(t) h_I(t)\, dt. \]

Notice that we considered these projections before in the example of the not well behaved Martingale transform. Also, $\|P_I\|_2 = 1$ and

\[ \|P_I\|_{2,W} = \big\| \langle W \rangle_I^{1/2} \langle W^{-1} \rangle_I^{1/2} \big\|. \]
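In the scalar case $d = 1$ the formula for $\|P_I\|_{2,W}$ can be verified by hand, since $P_I$ is then a rank-one operator. The following sketch writes $w$ for the scalar weight (our notation) and uses the standard fact that the operator $f \mapsto (f, u)\, v$ on a Hilbert space has norm $\|u\| \|v\|$, together with the assumed normalization $h_I^2 = |I|^{-1} \chi_I$.

```latex
% Scalar case d = 1 with weight w; (f, g)_{L^2(w)} = \int f \bar{g}\, w.
\begin{align*}
P_I f &= (f, h_I)_{L^2}\, h_I = (f, w^{-1} h_I)_{L^2(w)}\, h_I, \\
\|P_I\|_{2,w} &= \|w^{-1} h_I\|_{L^2(w)}\, \|h_I\|_{L^2(w)}
              = \|h_I\|_{L^2(w^{-1})}\, \|h_I\|_{L^2(w)} \\
             &= \Big( \frac{1}{|I|} \int_I w^{-1} \Big)^{1/2} \Big( \frac{1}{|I|} \int_I w \Big)^{1/2}
              = \langle w^{-1} \rangle_I^{1/2}\, \langle w \rangle_I^{1/2}.
\end{align*}
```

Taking $w \equiv 1$ recovers $\|P_I\|_2 = 1$, and the right-hand side stays close to 1 exactly when the product of averages $\langle w \rangle_I \langle w^{-1} \rangle_I$ does.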

That is, these projections are orthogonal in $L^2$ but they are almost orthogonal in $L^2(W)$ for “flat” $W \in A_2$. In addition, inside $L^2(W)$ (and $L^2$) we have the equality

\[ P_I = \mathrm{Id} \cdot \chi_{E_I}, \]

which implies that our collection $\{E_I\}_{I \in \mathcal{D}}$ is minimal. Actually,

\[ \sup_{I \in \mathcal{D}} \|P_I\|_{2,W} \le [W]_{A_2} < \infty, \]

and so the collection is uniformly minimal. Let us denote by $E_I'$ the bi-orthogonal system, by $P_{E_I}$ the orthogonal projection onto $E_I$ and by $P_{E_I'}$ the orthogonal projection onto $E_I'$. Using techniques from [30] and Lemma 15 we can show that the following is true.
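Theorem 17. There is a positive dimensional constant $C$ with the property

\[ \sum_{I \in \mathcal{D}} \|P_{E_I} f\|_{2,W}^2 \le \Big( 1 + C \sqrt{\log [W]_{A_2}} \Big) \|f\|_{2,W}^2, \]

for all $f \in L^2(W)$ and

\[ \sum_{I \in \mathcal{D}} \|P_{E_I'} g\|_{2,W}^2 \le \Big( 1 + C \sqrt{\log [W]_{A_2}} \Big) \|g\|_{2,W}^2, \]

for all $g \in L^2(W)$.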
Notice that this proves that the uniformly minimal collection $\{E_I\}_{I \in \mathcal{D}}$ is actually an unconditional basis in $L^2(W)$ and that the vectors $f_I^k$ are a Riesz basis in $L^2(W)$. Here is a short outline of the proof of Theorem 17. What we want to show is that

\[ \sum_{I \in \mathcal{D},\, 1 \le k \le d} |(f, f_I^k)_{2,W}|^2 \le \Big( 1 + C \sqrt{\log [W]_{A_2}} \Big) \|f\|_{2,W}^2. \]
For this reason we define the vectors $g_I^k = f_I^k + \chi_I A_I e_I^k$, where

\[ A_I = \tfrac{1}{2} |I|^{-1/2} \langle W \rangle_I^{-1} \big( \langle W \rangle_{I_+} - \langle W \rangle_{I_-} \big) \langle W \rangle_I^{-1/2}. \]

Since the collection $\{g_I^k\}$ is orthogonal in $L^2(W)$ and since by Bessel's inequality (the norms $\|g_I^k\|_{2,W}$ are uniformly bounded because $\sup_{I,k} \|g_I^k\|_{L^2(W)} \le 1 + c \log [W]_{A_2}$)

\[ \sum_{I \in \mathcal{D},\, 1 \le k \le d} \frac{1}{\|g_I^k\|_{2,W}^2} |(f, g_I^k)_{2,W}|^2 \le \|f\|_{2,W}^2, \]

it suffices to prove

\[ \sum_{I \in \mathcal{D},\, 1 \le k \le d} |(f, \chi_I A_I e_I^k)_{2,W}|^2 \le \Big( 1 + C \sqrt{\log [W]_{A_2}} \Big) \|f\|_{2,W}^2, \]

for all $f \in L^2(W)$. This statement is equivalent to

\[ \sum_{I \in \mathcal{D},\, 1 \le k \le d} |I|^2 \big\| A_I^* (W^{1/2} f)_I \big\|^2 \le \Big( 1 + C \sqrt{\log [W]_{A_2}} \Big) \|f\|_2^2, \]

for all $f \in L^2$. But $\|A_I^* e\| \le \frac{1}{2} (\mu_I)^{1/2} |I|^{-1} \|(W_I)^{-1/2} e\|$ for all vectors $e$, where $\mu_I$ is the quantity that appears in Lemma 15. This means we are done.

Before we go further and study the Martingale transform in the next section, note that the system of subspaces $E_I$ in $L^2(W)$ has the same geometry as the system $W^{1/2} E_I$ in the unweighted $L^2$ space. The bi-orthogonal to the latter system is $W^{-1/2} E_I$.
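To see why the outline of the proof of Theorem 17 above reduces to the sum involving $\chi_I A_I e_I^k$, one can combine the identity $f_I^k = g_I^k - \chi_I A_I e_I^k$ with the triangle inequality in $\ell^2$. In the sketch below $B$ and $K$ are placeholder names, not taken from the text, for a bound on $\sup_{I,k} \|g_I^k\|_{2,W}$ and for the constant in the remaining estimate.

```latex
% Placeholders (ours): B \ge \sup_{I,k} \|g_I^k\|_{2,W}, and K such that
%   \sum_{I,k} |(f, \chi_I A_I e_I^k)_{2,W}|^2 \le K^2 \|f\|_{2,W}^2 .
\begin{align*}
\Big( \sum_{I,k} |(f, f_I^k)_{2,W}|^2 \Big)^{1/2}
  &\le \Big( \sum_{I,k} |(f, g_I^k)_{2,W}|^2 \Big)^{1/2}
     + \Big( \sum_{I,k} |(f, \chi_I A_I e_I^k)_{2,W}|^2 \Big)^{1/2} \\
  &\le B\, \|f\|_{2,W} + K\, \|f\|_{2,W}.
\end{align*}
% The first term uses Bessel's inequality for the orthogonal system \{g_I^k\}:
%   \sum_{I,k} |(f, g_I^k)_{2,W}|^2
%     \le B^2 \sum_{I,k} \|g_I^k\|_{2,W}^{-2} |(f, g_I^k)_{2,W}|^2 \le B^2 \|f\|_{2,W}^2 .
```

So the left-hand sum is at most $(B + K)^2 \|f\|_{2,W}^2$, and the whole question becomes how $B$ and $K$ depend on $[W]_{A_2}$.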

5.3 The Martingale transform revisited: $M_\sigma$ and $M_\sigma^W$

In this section we will try to use the construction of the Riesz basis which was presented before to prove that a new Martingale transform, $M_\sigma^W$, which depends on the matrix weight $W$, is “flat” for “flat” $A_2$ weight $W$. We already know that we cannot expect the usual Martingale transform $M_\sigma$ to behave nicely for a general $A_2$ weight. This is because the $L^2(W)$ operator norms of the projections $P_{I,j}$ that we considered in section 5.1 are not “flat” in general for “flat” $A_2$ matrix weight $W$. Let us fix such an $A_2$ weight $W$ and for each dyadic interval $I$ we consider the eigenvectors $e_I^k$ of the matrix $\langle W \rangle_I$ as we did in the previous section, their eigenvalues $\lambda_I^k$, and the vectors $h_I e_I^k$ (see section 5.2). For a vector function $f$ we define the operator

\[ M_\sigma^W f = \sum_{I \in \mathcal{D},\, 1 \le k \le d} \sigma_I^k\, (f, h_I e_I^k)_2 \cdot h_I e_I^k \]

and the projections

\[ P_{I,k}^W f = (f, W^{-1} h_I e_I^k)_{2,W}\, h_I e_I^k = h_I \Big( \int_I f(t) h_I(t)\, dt,\ e_I^k \Big)_{\mathbb{C}^d} e_I^k. \]

We write the super-index W because they depend on the matrix weight. The claim is the
following Theorem which is a substitute for the not well behaved Martingale transform Mσ
(see section 5.1).
Theorem 18. Let $W$ be a matrix weight with $[W]_{A_2} = 1 + \delta$ where $\delta > 0$. There is a dimensional constant $c > 0$ such that for all $\delta$ sufficiently close to 0 we have the estimate

\[ \|M_\sigma^W\|_{2,W} \le 1 + c \sqrt{[W]_{A_2} - 1}. \]

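Proof. We claim that the projections $P_{I,k}^W$ behave nicely in $L^2(W)$. The square of their norm $\|P_{I,k}^W\|_{2,W}^2$ is equal to $\|h_I e_I^k\|_{2,W}^2 \|h_I e_I^k\|_{2,W^{-1}}^2$ and in its turn this is equal to the product (the brackets $(\cdot, \cdot)$ mean inner product in $\mathbb{C}^d$)

\begin{align*}
(\langle W \rangle_I e_I^k, e_I^k)(\langle W^{-1} \rangle_I e_I^k, e_I^k)
  &= \lambda_I^k\, (\langle W^{-1} \rangle_I e_I^k, e_I^k) \\
  &= \big( \langle W^{-1} \rangle_I^{1/2} (\lambda_I^k)^{1/2} e_I^k,\ \langle W^{-1} \rangle_I^{1/2} (\lambda_I^k)^{1/2} e_I^k \big) \\
  &= \big( \langle W^{-1} \rangle_I^{1/2} \langle W \rangle_I^{1/2} e_I^k,\ \langle W^{-1} \rangle_I^{1/2} \langle W \rangle_I^{1/2} e_I^k \big) \\
  &\le [W]_{A_2}^2,
\end{align*}

which is equal to $1 + c\delta$ for $[W]_{A_2} = 1 + \delta$, $\delta \approx 0$. To continue we denote by $(\cdot, \cdot)_H$ the usual inner product in the Hilbert space $H = L^2(W)$ (we denote the dual space $L^2(W^{-1}) = H^*$) and we use the notation $x_{I,k} = h_I e_I^k$. In order to estimate the operator norm of $M_\sigma^W$ on $L^2(W)$ we only need to estimate the expression

\[ \sum_{I \in \mathcal{D},\, 1 \le k \le d} |(x_{I,k}, W^{-1} f)_H|\, |(x_{I,k}, W\psi)_{H^*}|, \]

for $f \in L^2(W^{-1})$ and $\psi \in L^2(W)$. This sum is equal to

\[ \sum_{I \in \mathcal{D},\, 1 \le k \le d} \|x_{I,k}\|_H \|x_{I,k}\|_{H^*} \Big| \Big( \frac{x_{I,k}}{\|x_{I,k}\|_H}, F \Big)_H \Big| \, \Big| \Big( \frac{x_{I,k}}{\|x_{I,k}\|_{H^*}}, \Psi \Big)_{H^*} \Big|, \tag{5.1} \]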

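where we denote by $F = W^{-1} f$ and $\Psi = W\psi$. We bound $\|x_{I,k}\|_H \|x_{I,k}\|_{H^*}$ by the norm of the projection $P_{I,k}^W$ on $L^2(W)$, which is less than or equal to $[W]_{A_2}^2$, and then by the use of the Cauchy-Schwarz inequality, expression (5.1) is bounded above by

\[ [W]_{A_2}^2 \Big( \sum_{I \in \mathcal{D},\, 1 \le k \le d} \Big| \Big( \frac{x_{I,k}}{\|x_{I,k}\|_H}, F \Big)_H \Big|^2 \Big)^{1/2} \Big( \sum_{I \in \mathcal{D},\, 1 \le k \le d} \Big| \Big( \frac{x_{I,k}}{\|x_{I,k}\|_{H^*}}, \Psi \Big)_{H^*} \Big|^2 \Big)^{1/2}. \tag{5.2} \]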
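The passage from (5.1) to (5.2) is the Cauchy-Schwarz inequality for sums after the largest weight has been pulled out. Written with the shorthand $a_{I,k}$, $b_{I,k}$, $c_{I,k}$ (names introduced here only for illustration), the step reads as follows.

```latex
% Shorthand (ours):
%   a_{I,k} = \|x_{I,k}\|_H \|x_{I,k}\|_{H^*},
%   b_{I,k} = |(x_{I,k} / \|x_{I,k}\|_H, F)_H|,
%   c_{I,k} = |(x_{I,k} / \|x_{I,k}\|_{H^*}, \Psi)_{H^*}|.
\begin{align*}
\sum_{I,k} a_{I,k}\, b_{I,k}\, c_{I,k}
  &\le \Big( \sup_{I,k} a_{I,k} \Big) \sum_{I,k} b_{I,k}\, c_{I,k} \\
  &\le \Big( \sup_{I,k} a_{I,k} \Big)
       \Big( \sum_{I,k} b_{I,k}^2 \Big)^{1/2}
       \Big( \sum_{I,k} c_{I,k}^2 \Big)^{1/2}.
\end{align*}
```

With $\sup_{I,k} a_{I,k} \le [W]_{A_2}^2$ this is exactly expression (5.2), so the remaining work is to control the two square sums.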

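We bound each one of the two sums separately. Notice that for the first sum we project the vector $F$ orthogonally onto the unit vectors $x_{I,k} / \|x_{I,k}\|_H$, which constitute a basis of the vector space $E_I$ introduced in section 5.2. This means that $\sum_{1 \le k \le d} \big| \big( \frac{x_{I,k}}{\|x_{I,k}\|_H}, F \big)_H \big|^2 = \|P_{E_I} F\|_{2,W}^2$ and therefore,

\[ \Big( \sum_{I \in \mathcal{D},\, 1 \le k \le d} \Big| \Big( \frac{x_{I,k}}{\|x_{I,k}\|_H}, F \Big)_H \Big|^2 \Big)^{1/2} \le \Big( \sum_{I \in \mathcal{D}} \|P_{E_I} F\|_{2,W}^2 \Big)^{1/2} \le \Big( \big( 1 + c \sqrt{\log [W]_{A_2}} \big) \|F\|_{2,W}^2 \Big)^{1/2} \]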
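by Theorem 17, which is exactly what we want. For the second sum of (5.2) similar considerations apply. Namely, we project the vector $\Psi$ onto the unit vectors $x_{I,k} / \|x_{I,k}\|_{H^*}$, which are almost orthogonal in $H^*$ (the cosine of the angle between the vectors $x_{I,k}$ and $x_{I,k'}$, for $k \ne k'$, is of the order $\sqrt{\delta}$) and they constitute a basis of $E_I$. So by the use of the law of cosines we can bound the sum $\sum_{1 \le k \le d} \big| \big( \frac{x_{I,k}}{\|x_{I,k}\|_{H^*}}, \Psi \big)_{H^*} \big|^2$ from above by $(1 + c\sqrt{\delta}) \|P_{E_I} \Psi\|_{2,W^{-1}}^2$. Since

\[ \sum_{I \in \mathcal{D}} \|P_{E_I} \Psi\|_{2,W^{-1}}^2 \le \Big( 1 + c \sqrt{\log [W]_{A_2}} \Big) \|\Psi\|_{2,W^{-1}}^2 \]

by Theorem 17, we are done. Indeed, inserting the two bounds into (5.2) and using $\log [W]_{A_2} = \log(1 + \delta) \le \delta$, expression (5.1) is at most $(1 + \delta)^2 (1 + c\sqrt{\delta})^{3/2} \|F\|_{2,W} \|\Psi\|_{2,W^{-1}}$, which is at most $(1 + c'\sqrt{\delta}) \|f\|_{2,W^{-1}} \|\psi\|_{2,W}$ for $\delta$ close to 0. In summary we have the desired estimate

\[ \|M_\sigma^W\|_{2,W} \le 1 + c \sqrt{[W]_{A_2} - 1}. \]

5.4 Open problems about matrix weights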

As we showed in the beginning of this chapter, for any $A_2$ weight $W \in \mathbb{C}^{d \times d}$ and any direction $e \in \mathbb{C}^d$ the scalar weight $w_e(x) = (W(x)e, e)_{\mathbb{C}^d}$ is $A_2$ with characteristic at most $[W]_{A_2}$. Since the $A_2$ characteristic does not change when we multiply the weight by a positive constant, it is not hard to see that for any vector $y \in \mathbb{C}^d$, $y \ne 0$, the weight $w_y(x) = (W(x)y, y)_{\mathbb{C}^d}$ is an $A_2$ scalar weight and actually,

\[ \sup_{y \in \mathbb{C}^d \setminus \{0\}} [w_y]_{A_2} \le [W]_{A_2}. \]
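The scaling remark can be made explicit. The sketch assumes the usual scalar characteristic $[w]_{A_2} = \sup_I \langle w \rangle_I \langle w^{-1} \rangle_I$ (the normalization used earlier in the text may differ by a power, which does not affect the argument): a constant factor cancels, and $w_y$ is a constant multiple of $w_e$ for the unit vector $e = y / \|y\|$.

```latex
% Assumption: [w]_{A_2} = \sup_I \langle w \rangle_I \langle w^{-1} \rangle_I for scalar weights.
\begin{align*}
\langle c\, w \rangle_I\, \langle (c\, w)^{-1} \rangle_I
  &= c\, \langle w \rangle_I \cdot c^{-1} \langle w^{-1} \rangle_I
   = \langle w \rangle_I\, \langle w^{-1} \rangle_I \qquad (c > 0), \\
w_y(x) = (W(x) y, y)_{\mathbb{C}^d}
  &= \|y\|^2\, (W(x) e, e)_{\mathbb{C}^d} = \|y\|^2\, w_e(x), \qquad e = y / \|y\| .
\end{align*}
% Hence [w_y]_{A_2} = [w_e]_{A_2} \le [W]_{A_2} for every y \ne 0.
```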

The question that arises is the following. Suppose that for a matrix weight $W$ and any nonzero vector $y$ the scalar weights $w_y := (W(x)y, y)_{\mathbb{C}^d}$ and $(w^{-1})_y = (W^{-1}(x)y, y)_{\mathbb{C}^d}$ are in the $A_2$ class with uniform bound for the $A_2$ characteristic. Do we necessarily have that $W \in A_2$?
In [16] Dr. Lauzon and Dr. Treil proved that for $W \in \mathbb{R}^{2 \times 2}$ the answer to this question is positive. That is, if the weights $w_y$, $(w^{-1})_y$ are uniformly $A_2$ over all unit vectors $y \in \mathbb{R}^2$, then $W$ satisfies the matrix $A_2$ condition. In addition, they proved that for $n \ge 6$ there are $W \in \mathbb{R}^{n \times n}$ such that for all directions $y \in \mathbb{R}^n$ the scalar weights $w_y$, $(w^{-1})_y$ are uniformly $A_2$ but $W \notin A_2$. For dimensions $n = 3, 4$ and 5 the answer is not known.
We should mention that the problem of characterizing matrix weights $W \in \mathbb{C}^{d \times d}$ that satisfy

\[ \Big( \int_{\mathbb{R}} \big( W(x)^{2/p} Hf(x), Hf(x) \big)_{\mathbb{C}^d}^{p/2}\, dx \Big)^{1/p} \le C \Big( \int_{\mathbb{R}} \big( W(x)^{2/p} f(x), f(x) \big)_{\mathbb{C}^d}^{p/2}\, dx \Big)^{1/p}, \]

for $1 < p < \infty$, where $C > 0$ is a constant, has been solved in [19] and [32] with different methods. The answer is that this holds if and only if $W \in A_{p,q}$, where $\frac{1}{p} + \frac{1}{q} = 1$. To introduce the $A_{p,q}$ condition we need some preliminary definitions. Let $t \mapsto \rho_t$, $t \in \mathbb{R}$, be a function whose values are norms (or even semi-norms) on $\mathbb{R}^n$ (or $\mathbb{C}^n$). We assume this function to be measurable in the sense that for any vector $x \in \mathbb{R}^n$ the function $t \mapsto \rho_t(x)$ is measurable.


The $L^p(\rho)$ space consists of measurable vector functions $f$ such that

\[ \|f\|_{L^p(\rho)}^p := \int_{\mathbb{R}} \rho_t(f(t))^p\, dt < \infty. \]

The weighted $L^p(W)$ space with a matrix weight $W$ is a special case of the space $L^p(\rho)$ where $\rho_t(x) = \|W(t)^{1/p} x\|$. For a norm $\nu$ on $\mathbb{R}^n$ (or $\mathbb{C}^n$) we denote by $\nu^*$ the dual norm defined as

\[ \nu^*(x) = \sup_{y \ne 0} \frac{|(x, y)|}{\nu(y)}. \]

The notation $(\cdot, \cdot)$ denotes the inner product on $\mathbb{R}^n$ (or $\mathbb{C}^n$, depending on which case we are considering). For a norm-valued function $\rho$ we define the dual function $\rho^*$ by $\rho_t^* = (\rho_t)^*$. We also denote by $\langle \rho \rangle_{I,p}$ the $p$-average

\[ \langle \rho \rangle_{I,p}(x) = \Big( \frac{1}{|I|} \int_I \rho_t(x)^p\, dt \Big)^{1/p}, \]

where $x \in \mathbb{R}^n$. We say that a norm-valued function $\rho$ satisfies the $A_{p,q}$ condition for $1 < p < \infty$, $\frac{1}{p} + \frac{1}{q} = 1$, if there is a positive constant $C$ such that

\[ \langle \rho^* \rangle_{I,q} \le C \langle \rho \rangle_{I,p}^*, \tag{5.3} \]

for all finite intervals $I$ in $\mathbb{R}$. Notice that the opposite inequality always holds with $C = 1$. This follows by Hölder's inequality and the fact that for a norm $\nu$ and all $x, y \in \mathbb{R}^n$ we have $|(x, y)| \le \nu(x) \nu^*(y)$. We can denote the smallest $C$ that satisfies inequality (5.3) by $[\rho]_{A_{p,q}}$ (notice that it is always greater than or equal to 1) and try to study how operator norms on $L^p(\rho)$ depend on the quantity $[\rho]_{A_{p,q}}$. A problem like this seems more difficult than the $p = 2$ case.
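For $p = q = 2$ and $\rho_t(x) = \|W(t)^{1/2} x\|$, condition (5.3) can be unwound into a familiar matrix form. The sketch below assumes that $W(t)$ is positive definite almost everywhere with $W$ and $W^{-1}$ locally integrable, and uses the fact that for a positive definite matrix $M$ the dual of the norm $x \mapsto \|Mx\|$ is $x \mapsto \|M^{-1} x\|$.

```latex
% Assumptions: p = q = 2, \rho_t(x) = \|W(t)^{1/2} x\|, W(t) positive definite a.e.
\begin{align*}
\rho_t^*(x) &= \|W(t)^{-1/2} x\|, \\
\langle \rho \rangle_{I,2}(x)
  &= \Big( \frac{1}{|I|} \int_I (W(t) x, x)\, dt \Big)^{1/2}
   = \|\langle W \rangle_I^{1/2} x\|,
  \qquad
  \langle \rho \rangle_{I,2}^*(x) = \|\langle W \rangle_I^{-1/2} x\|, \\
\langle \rho^* \rangle_{I,2}(x)
  &= \Big( \frac{1}{|I|} \int_I (W(t)^{-1} x, x)\, dt \Big)^{1/2}
   = \|\langle W^{-1} \rangle_I^{1/2} x\| .
\end{align*}
% Substituting x = \langle W \rangle_I^{1/2} z, inequality (5.3) becomes
%   \|\langle W^{-1} \rangle_I^{1/2} \langle W \rangle_I^{1/2} z\| \le C \|z\|  for all z,
% that is, \sup_I \|\langle W^{-1} \rangle_I^{1/2} \langle W \rangle_I^{1/2}\| \le C .
```

Since a matrix and its adjoint have the same norm, this is the same operator quantity $\|\langle W \rangle_I^{1/2} \langle W^{-1} \rangle_I^{1/2}\|$ that gives the norm of the projections $P_I$ in section 5.2.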
The $A_{p,q}$ classes do not share all the properties of their scalar analogues, namely the $A_p$ classes. In [1] the author showed that there is an $A_2$ matrix weight $W \in \mathbb{C}^{2 \times 2}$ such that $W \notin A_{p,q}$ for all $p < 2$, $\frac{1}{p} + \frac{1}{q} = 1$. This is a counterexample to the open-ended property of the $A_p$ classes. He also presented an example of a $2 \times 2$ $A_2$ weight $W$ such that $W^r \notin A_2$ for all $r > 1$.
The behavior of the dimensional constants $c$ that appear in our calculations when $d$ grows is unclear and it can be an interesting problem. It is quite natural to want to understand the infinite-dimensional stationary stochastic processes, and, thus, to understand operator-valued weights. However, this is an open question. In [10] it was shown that for the Martingale transform

\[ T_\sigma f = \sum_{I \in \mathcal{D}} \sigma_I (f, h_I) h_I, \]

the following is true. There are constants $0 < a, A < \infty$ such that for every $d$ there is an $A_2$ weight $W_d \in \mathbb{C}^{d \times d}$ with $[W_d]_{A_2} \le A$ and

\[ \sup_\sigma \|T_\sigma\|_{2,W_d} \ge a (\log d)^{1/2}. \]

In [11] the same authors proved that a completely analogous result holds for the Hilbert transform.


BIBLIOGRAPHY

[1] M. Bownik, Inverse volume inequalities for matrix weights, Indiana Univ. Math. J.
50 (2001), no. 1, 383-410.
[2] Stephen M. Buckley, Estimates for operator norms on weighted spaces and reverse
Jensen inequalities, Trans. Amer. Math. Soc., volume 340, number 1, November 1993.
[3] D. Burkholder, Boundary value problems and sharp estimates for the martingale
transforms, Ann. of Prob. 12 (1984), 647-702.
[4] R. Coifman and C. Fefferman, Weighted norm inequalities for maximal functions
and singular integrals, Studia Math. 51 (1974), 241-250.
[5] R. Coifman, P. Jones and J. Rubio de Francia, Constructive decomposition of
B.M.O. functions and factorization of Ap weights, Proc. Amer. Math. Soc. 87 (1983),
675-676.
[6] J.G. Conlon and T. Spencer, A strong central limit theorem for a class of random
surfaces, arXiv:1105.2814v2.
[7] O. Dragicevic and A. Volberg, Bellman function, Littlewood-Paley estimates
and asymptotics for the Ahlfors-Beurling operator in Lp (C), Indiana Univ. Math. J.
54 (2005), no. 4, 971-995.
[8] J. Garcia-Cuerva and J. Rubio De Francia, Weighted norm inequalities and
related topics, North Holland Math. Stud. 116, North Holland, Amsterdam 1985.
[9] J. Garnett and P.W. Jones, The distance in BM O to L∞ , Ann. of Math. 108
(1978), 373-393.
[10] A. Gillespie, S. Pott, S. Treil and A. Volberg, Logarithmic growth for matrix
Martingale transform, J. London Math. Soc. (2) 64 (2001), no. 3, 624-636.
[11] A. Gillespie, S. Pott, S. Treil and A. Volberg, Logarithmic growth for
weighted Hilbert transform and vector Hankel operators, J. Operator Theory 52 (2004),
no. 1, 103-112.


[12] R. Hunt, B. Muckenhoupt and R. Wheeden, Weighted norm inequalities for
the conjugate function and the Hilbert transform, Trans. Amer. Math. Soc. 176 (1973),
227-251.
[13] T. Hytönen, The sharp weighted bound for general Calderón-Zygmund operators,
arXiv:1007.4330.
[14] P. Jones, Factorization of Ap weights, Ann. of Math. 111 (1980), 511-530.
[15] M. B. Korey, Ideal weights: Asymptotically optimal versions of doubling, absolute
continuity, and bounded mean oscillation, The Journal of Fourier Analysis and Applications, Volume 4, Issues 4 and 5, 1998.
[16] M. Lauzon and S. Treil, Scalar and vector Muckenhoupt weights, Indiana Univ.
Math. J. 56 (4) (2007) 1989-2015.
[17] N. G. Meyers, An Lp estimate for the gradient of solutions of second order elliptic
divergence equations, Ann. Scuola Norm. Sup. Pisa (3) 17 189-206, (1963).
[18] B. Muckenhoupt, Weighted norm inequalities for the Hardy maximal function,
Trans. Amer. Math. Soc. 165 (1972), 207-226.
[19] F. Nazarov and S. Treil, The hunt for a Bellman function: applications to estimates for singular integral operators and to other classical problems of harmonic
analysis, Algebra i Analiz, 8:5 (1996), 32-162.
[20] F. Nazarov and A. Volberg, Heating of the Ahlfors-Beurling operator and estimates of its norms, St. Petersburg Math. J. Vol. 15 (2004), No. 4, Pages 563-573.
[21] N. K. Nikolskii, “Treatise on the Shift Operator”, Springer-Verlag, New York, 1985.
[22] M. Papadimitrakis and N. Pattakos, Continuity of weighted estimates for sublinear operators, arXiv:1206.4580v1.
[23] N. Pattakos and A. Volberg, Continuity of weighted estimates in Ap norm, Proc.
Amer. Math. Soc. 140 (2012), 2783-2790.
[24] N. Pattakos and A. Volberg, The Muckenhoupt A∞ class as a metric space and
continuity of weighted estimates, Math. Res. Lett. 19 (2012), no. 02, 499-510.


[25] N. Pattakos and A. Volberg, A new weighted Bellman function, C. R. Acad. Sci.
Paris, Ser. I 349 (2011) 1151-1154.
[26] S. Petermichl and A. Volberg, Heating of the Ahlfors-Beurling operator: Weakly
quasiregular maps on the plane are quasiregular, Duke Math. J. 112 (2002), no. 2,
281-305.
[27] Yu. A. Rozanov, Stationary stochastic processes, Holden-Day, SF, 1967.
[28] E. M. Stein, Harmonic Analysis: Real Variable Methods, Orthogonality, and Oscillatory Integrals, Princeton Univ. Press, Princeton, 1993.
[29] E. Stein and G. Weiss, Interpolation of operators with change of measures, Transactions of the AMS, Volume 87, Number 1, Pages 159-172 (1958).
[30] S. Treil and A. Volberg, Wavelets and the angle between the Past and the Future, Journal of Functional Analysis 143, 269-308 (1997).
[31] S. Treil and A. Volberg, Completely regular multivariate stationary processes and
the Muckenhoupt condition, Pacific J. Math. Vol. 190 (1999), No. 2, 361-382.
[32] A. Volberg, Matrix Ap weights via S-functions, Journal of the American Mathematical Society Volume 10, Number 2, April 1997, Pages 445-466.
[33] J. Wittwer, A sharp estimate on the norm of the Martingale transform, Math. Res.
Lett. 7 (2000), 1-12.
