Singular integrals with flag kernels on homogeneous groups, I

Similar documents
Algebras of singular integral operators with kernels controlled by multiple norms

Chapter One. The Calderón-Zygmund Theory I: Ellipticity

A new class of pseudodifferential operators with mixed homogenities

On a class of pseudodifferential operators with mixed homogeneities

2. Function spaces and approximation

Course 311: Michaelmas Term 2005 Part III: Topics in Commutative Algebra

Some Background Material

JUHA KINNUNEN. Harmonic Analysis

SCHWARTZ SPACES ASSOCIATED WITH SOME NON-DIFFERENTIAL CONVOLUTION OPERATORS ON HOMOGENEOUS GROUPS

MORE NOTES FOR MATH 823, FALL 2007

Metric Spaces and Topology

An introduction to some aspects of functional analysis

Lebesgue Measure on R n

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

An introduction to Birkhoff normal form

1 Differentiable manifolds and smooth maps

1 Topology Definition of a topology Basis (Base) of a topology The subspace topology & the product topology on X Y 3

L p -boundedness of the Hilbert transform

Here we used the multiindex notation:

We denote the space of distributions on Ω by D ( Ω) 2.

FILTERED RINGS AND MODULES. GRADINGS AND COMPLETIONS.

8 Singular Integral Operators and L p -Regularity Theory

CHAPTER 3 Further properties of splines and B-splines

Functional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability...

Representation theory and quantum mechanics tutorial Spin and the hydrogen atom

Part III. 10 Topological Space Basics. Topological Spaces

arxiv: v2 [math.ag] 24 Jun 2015

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions

Introduction to Real Analysis Alternative Chapter 1

A new proof of Gromov s theorem on groups of polynomial growth

Laplace s Equation. Chapter Mean Value Formulas

EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018

Locally convex spaces, the hyperplane separation theorem, and the Krein-Milman theorem

Singular Integrals. 1 Calderon-Zygmund decomposition

Analysis in weighted spaces : preliminary version

HARMONIC ANALYSIS. Date:

Cover Page. The handle holds various files of this Leiden University dissertation

Analysis-3 lecture schemes

1.5 Approximate Identities

n [ F (b j ) F (a j ) ], n j=1(a j, b j ] E (4.1)

Math212a1413 The Lebesgue integral.

MATH6081A Homework 8. In addition, when 1 < p 2 the above inequality can be refined using Lorentz spaces: f

7: FOURIER SERIES STEVEN HEILMAN

Some topics in analysis related to Banach algebras, 2

Groups of Prime Power Order with Derived Subgroup of Prime Order

Stabilization of Control-Affine Systems by Local Approximations of Trajectories

Notes on Distributions

6 Lecture 6: More constructions with Huber rings

Topological properties of Z p and Q p and Euclidean models

THE GROUP OF UNITS OF SOME FINITE LOCAL RINGS I

EECS 598: Statistical Learning Theory, Winter 2014 Topic 11. Kernels

The Heine-Borel and Arzela-Ascoli Theorems

Bounded point derivations on R p (X) and approximate derivatives

Quivers of Period 2. Mariya Sardarli Max Wimberley Heyi Zhu. November 26, 2014

10. Smooth Varieties. 82 Andreas Gathmann

ON THE MATRIX EQUATION XA AX = X P

STRONGLY SINGULAR INTEGRALS ALONG CURVES. 1. Introduction. It is standard and well known that the Hilbert transform along curves: f(x γ(t)) dt

Overview of normed linear spaces

RESULTS ON FOURIER MULTIPLIERS

A SHORT PROOF OF THE COIFMAN-MEYER MULTILINEAR THEOREM

NECESSARY CONDITIONS FOR WEIGHTED POINTWISE HARDY INEQUALITIES

L. Levaggi A. Tabacco WAVELETS ON THE INTERVAL AND RELATED TOPICS

New Negative Latin Square Type Partial Difference Sets in Nonelementary Abelian 2-groups and 3-groups

I. The space C(K) Let K be a compact metric space, with metric d K. Let B(K) be the space of real valued bounded functions on K with the sup-norm

ESTIMATES FOR MAXIMAL SINGULAR INTEGRALS

1 Directional Derivatives and Differentiability

DIEUDONNE AGBOR AND JAN BOMAN

Measure and integration

Real Analysis Chapter 4 Solutions Jonathan Conder

12. Hilbert Polynomials and Bézout s Theorem

Mathematical Analysis Outline. William G. Faris

Constrained Optimization and Lagrangian Duality

Math 249B. Nilpotence of connected solvable groups

Recent structure theorems of orders and results in abstract harmonic analysis

Introduction to Bases in Banach Spaces

REAL AND COMPLEX ANALYSIS

MAXIMAL AVERAGE ALONG VARIABLE LINES. 1. Introduction

Topological properties

Characterizations of some function spaces by the discrete Radon transform on Z n.

Classification of root systems

Harmonic Polynomials and Dirichlet-Type Problems. 1. Derivatives of x 2 n

Jordan Journal of Mathematics and Statistics (JJMS) 9(1), 2016, pp BOUNDEDNESS OF COMMUTATORS ON HERZ-TYPE HARDY SPACES WITH VARIABLE EXPONENT

Topology. Xiaolong Han. Department of Mathematics, California State University, Northridge, CA 91330, USA address:

Function spaces and multiplier operators

be the set of complex valued 2π-periodic functions f on R such that

Topics in Harmonic Analysis Lecture 6: Pseudodifferential calculus and almost orthogonality

08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms

Functional Analysis HW #3

Schubert Varieties. P. Littelmann. May 21, 2012

De Rham Cohomology. Smooth singular cochains. (Hatcher, 2.1)

Eventually reducible matrix, eventually nonnegative matrix, eventually r-cyclic

Derivatives of Harmonic Bergman and Bloch Functions on the Ball

arxiv: v1 [math.ca] 7 Aug 2015

ADVANCED TOPICS IN ALGEBRAIC GEOMETRY

Eilenberg-Steenrod properties. (Hatcher, 2.1, 2.3, 3.1; Conlon, 2.6, 8.1, )

Matroid intersection, base packing and base covering for infinite matroids

Sobolev spaces. May 18

2. Intersection Multiplicities

Transcription:

Rev. Mat. Iberoam. 28 (2012), no. 3, 631 722 doi 10.4171/rmi/688 c European Mathematical Society Singular integrals with flag kernels on homogeneous groups, I Alexander Nagel, Fulvio Ricci, Elias M. Stein and Stephen Wainger Abstract. Let K be a flag kernel on a homogeneous nilpotent Lie group G. We prove that operators T of the form T (f)=f K form an algebra under composition, and that such operators are bounded on L p (G)for1<p<. 1. Introduction This is the first of two papers dealing with singular integral operators with flag kernels on homogeneous nilpotent groups. Our goal is to show that these operators, along with appropriate sub-collections, form algebras under composition, and that the operators in question are bounded on L p. Operators of this kind arose initially when studying compositions of sub-elliptic operators on the Heisenberg group (such as the sub-laplacian L and b ) with elliptic-type operators. In particular in [7] one saw that operators of the form m(l,it)(wherem is a Marcinkiewicz multiplier ) are singular integrals with flag kernels and satisfy L p estimates. The theory was extended in [9] to encompass general flag kernels in the Euclidean space R N, and the resulting operators arising via abelian convolution. In addition, aspects of the CR theory for quadratic manifolds could be studied via such operators on various 2-step groups. More recently, flag kernels have been studied in [13] and[2]. In view of this, and because of their potential further application, it is desirable to extend the above results in [9] to the setting of homogeneous groups of higher step. To achieve this goal requires however that we substantially recast the approach and techniques used previously, since these were essentially limited to the 2-step case. Our main results are two-fold. Suppose G is a homogeneous nilpotent group and K denotes a distribution on G which is a flag kernel (the requisite definitions are given below in Definition 2.3). Theorem A. The operators T of the form T (f) =f Kform an algebra under composition. Mathematics Subject Classification (2010): Primary 42B20; Secondary 2010. Keywords: Flag kernel, homogeneous nilpotent Lie group, cancellation condition.

632 A. Nagel, F. Ricci, E. M. Stein and S. Wainger Theorem B. The above operators are bounded on L p (G) for 1 <p<. Given the complexity of the material, in this introductory section we provide the reader with an outline of the main ideas that enter in the proofs of the above theorems. Moreover, in order to simplify the presentation we will often not state matters in the most general setting and sometimes describe the situation at hand a little imprecisely. 1.1. Flag kernels We start with a direct sum decomposition R N = R a1 R an, with n j=1 a j =N, and we write x =(x 1, x 2,...,x n ), with x m R am. We also fix a one-parameter family of dilations δ r on R N,givenbyδ r (x) =(r d1 x 1,...,r dn x n ), with positive exponents d 1 <d 2 < <d n. 1 We denote by Q k = d k a k the homogeneous dimension of R a k. We also define the partial norms N k (x) = x k 1/d k e,where x k e is the standard Euclidean norm on R a k. In this setting, a flag kernel K is a distribution on R N which is given by integration against a C function K(x) awayfromx 1 = 0 and which satisfies two types of conditions. The first are the differential inequalities for x 1 0: n (1.1) x α K(x) C α (N 1 (x) +N 2 (x)+ + N k (x)) Q k d k α k k=1 with α =(α 1,...,α n ). The second are the cancellation conditions. These are most easily expressed recursively. Let K,ϕ denote the action of the distribution K on a test function ϕ. At the beginning of the recursion there is the following condition, in many ways typical of the others: (1.2) sup K,ϕ R < R where R = (R 1,R 2,...,R n ), ϕ R (x) = ϕ(r d1 1 x 1, R d2 2 x 2,...,Rn dnx n), and ϕ is an arbitrary C function which is supported in the unit ball. More generally, one requires that the action of K on a test function in some subset of variables {x m1,...,x mβ } produces a flag kernel in the remaining variables {x l1,...,x lα }. The precise formulation of these conditions is given in Section 2 and Definition 2.3 below. 1.2. Dyadic decomposition A main tool used in studying flag kernels is their dyadic decomposition into sums of bump functions. This proceeds as follows. Let I =(i 1,i 2,...,i n )denoteany indexing set of integers that satisfies. (1.3) i 1 i 2 i n 1 i n. 1 One can also allow non-isotropic dilations on each subspace R a l. See Section 2 below.

Singular integrals with flag kernels on homogeneous groups I 633 Also let {ϕ I } be a family of C functions supported in the unit ball that are uniformly bounded in the C (m) norm for each m. Set [ϕ I ] I (x) =2 i1q1 i2q2 inqn ϕ I (2 d1i1 x 1,...,2 dnin x n ), so that the [ϕ I ] I are L 1 -normalized. We say that the ϕ I satisfy the strong cancellation condition if for each k with 1 k n, (1.4) ϕ I (x 1,...,x k,...,x n ) dx k 0 R a k when all the inequalities (1.3) fori are strict. In the case that there are some equalities in (1.3), say i l 1 <i l = i l+1 = = i k <i k+1, then only cancellation in the collection of corresponding variables is required: (1.4 ) ϕ I (x 1,...,x l,...,x k,...,x n ) dx l dx k 0. R a l R a k The first result needed is that any sum (1.5) [ϕ I ] I I made up of such bump functions, with the cancellation conditions (1.4) and(1.4 ), converges in the sense of distributions to a flag kernel, and conversely, any flag kernel K can be written in this way (of course, not uniquely). There are two parts to this result (which in effect is stated but not proved completely in [9]). The first is that the sum in (1.5) is indeed a flag kernel. To see this, one can use the estimate in Proposition 10.1 given in Appendix II below; one also notes from this that even without the cancellation conditions (1.4) and(1.4 ), the sum (1.5) satisfies the differential inequalities (1.1). The converse part requires Theorem 6.1 below, and the observation that the parts of the sum (1.5) contributed by I s where there may be equality in (1.3) give flag distributions corresponding to various coarser flags. However, what will be key in what follows is that the strong cancellation conditions (1.4) or(1.4 ) can be weakened, and still lead to the same conclusion. While these weak cancellation conditions are somewhat complicated to state (see Definition 5.5 below), they are easily illustrated in the special 2-step case. Here we have the decomposition R n = R a1 R a2, x =(x 1, x 2 ). The cancellation condition for the second variable is as before: ϕ I (x 1, x 2 )dx 2 0. For x 1 the weak cancellation condition takes the form (1.6) ϕ I (x 1, x 2 ) dx 1 =2 ɛ(i2 i1) η I (x 2 ), R a 1 for some ɛ>0, with I =(i 1,i 2 )andη I an L 1 normalized bump in the x 2 variable. In this context, the main conclusion (Theorem 6.8) is that the sum (1.5) is still a flag kernel if the weak-cancellation conditions are assumed instead of (1.4)

634 A. Nagel, F. Ricci, E. M. Stein and S. Wainger and (1.4 ), and the functions {ϕ I } are allowed to belong to the Schwartz class instead of being compactly supported. In understanding Definition 5.5, one should keep in mind that conditions like (1.4), which involve vanishing of integrals, are equivalent with expressions of the ϕ I as the sums of appropriate derivatives. (This is established in Lemma 5.1). 1.3. Other properties of flag kernels Along with the results about decompositions of flag kernels, there are a number of other properties of these distributions that are worth mentioning and are discussed in Section 6. First, the class of flag kernels is invariant under changes of variables compatible with the structure of the flags. We have in mind transformations x y = F (x), with y k = x k + P k (x), and P k a homogeneous polynomial of x 1,...,x k 1, of the same homogeneity as x k.thefactthatk F satisfies the same differential inequalities (1.1) ask is nearly obvious, but the requisite cancellation conditions (such as (1.2)) are subtler and involve the weak cancellation of the bump functions. (See Theorem 6.15.) A second fact is that the cancellations required in the definition of a flag kernel can be relaxed. For example, assuming that the differential inequalities (1.1) hold, then the less restrictive version of (1.2) requires that the supremum is taken only over those R for which R 1 R 2 R n > 0. The formulation and proof of the sufficiency of these restricted conditions is in Theorem 6.13. Finally we should point out that at the basis of many of our arguments is an earlier characterization in [9] of flag kernels in terms of their Fourier transforms: these are bounded multipliers that satisfy the dual differential inequalities given in Definition 6.3. 1.4. Graded groups and compositions of flag kernels Up to this point our discussion of flag kernels has focused on their definition as distributions on the Euclidean space R N. We now consider convolutions with flag kernels on graded nilpotent Lie groups G whose underlying space is R N. The choice of an appropriate coordinate system on the group G, and its multiplication structure, induces a decomposition R N = R a1 R an and allows us to find exponents d 1 <d 2 < <d n as above so that the dilations δ r (x) =(r d1 x 1,...,r dn x n ), with r>0, are automorphisms of G. The proof of Theorem A reduces to the statement that if K 1 and K 2 are a pair of flag kernels, then K 1 K 2 is a sum of flag kernels, where the convolution is taken with respect to G. Note that when G is the abelian group R N, the result follows immediately from the characterization of flag kernels in terms of their Fourier transforms, cited earlier, and in fact the convolution of two flag kernels is a single flag kernel. In the non-commutative case the proof is not as simple and proceeds as follows. First write K 1 = I [ϕi ] I, K 2 = J [ψj ] J in terms of decompositions with bump functions with strong cancellation. Now, formally, (1.7) K 1 K 2 = [ϕ I ] I [ψ J ] J. I J

Singular integrals with flag kernels on homogeneous groups I 635 We look first at an individual term [ϕ I ] I [ψ J ] J in the above sum. It has three properties: (a) [ϕ I ] I [ψ J ] J =[θ I,J ] K with [θ I,J ] K a bump scaled according to K, where K = I J; thatisk =(k 1,k 2,...,k n ), and k m =max(i m,j m ), 1 m n. This conclusion holds even if we do not assume the cancellation conditions on [ϕ I ] I and [ψ J ] J. (b) Next, because we do have the cancellation conditions (1.4) and(1.4 ), we have a gain: There exists ɛ>0sothat[θ I,J ] K can be written as a finite sum of terms of the form 2 ɛ i l j l 2 ɛ[(im+1 im)+(jm+1 jm) [ θi,j ] K l A m B where [ θ I,J ] K is another bump function scaled according to K, anda and B are disjoint sets with A B = {1,...,n}, with n/ B. (c) Strong cancellation fails in general for [θ I,J ] K, but weak cancellation holds. Statements (a), (b), and (c) above are contained in Lemmas 6.17 and 6.18. With these assertions proved, one can proceed roughly 2 as follows. We define θ K = [ϕ I ] I [ϕ J ] J, I J=K where the sum is taken over all pairs (I,J) for which I J = K. Because of the exponential gain given in (b) this sum converges to a K-scaled bump function. Moreover, because of (c), θ K satisfies the weak cancellation property. As a result, the sum ψ K K converges to a flag kernel, and hence K 1 K 2 is a flag kernel as was to be shown. We comment briefly on the arguments needed to establish (b) and (c). Here we use the strong cancellation properties of [ϕ I ] I (or [ψ J ] J ). For (b) we express [ϕ I ] I as a sum of derivatives with respect to appropriate coordinates, then re-express these in terms of left-invariant vector fields, and finally pass these differentiation to [ψ J ] J. The reverse may be done starting with cancellation of [ψ J ] J. To obtain (c), the weak cancellation of [ϕ I ] I [ψ J ] J, we begin the same way, but express [ϕ I ] I in terms of right-invariant vector fields and then pass these differentiations onto the resulting convolution products. The mechanism underlying this technique is set out in the various lemmas of Section 3. The argument is a little more complex when we are in the case of equality for some of the indices that arise in I or J. This in effect involves convolutions with kernels belonging to coarser flags. The guiding principle for convolutions of such bump functions (or kernels) is that if K j are flag kernels corresponding to the flags F j, j =1, 2, then K 1 K 2 is a flag kernel for the coarsest flag F that is finer than F 1 and F 2. The combinatorics involved are illustrated by several examples given in Sections 7.3 and 7.5. 2 There are actually additional complications. We must first make a preliminary partition of the set of all pairs (I,J), and the result is that K 1 K 2 is actually a finite sum of flag kernels.

636 A. Nagel, F. Ricci, E. M. Stein and S. Wainger 1.5. L p estimates via square functions The proof of the L p estimates (Theorem 8.14) starts with the descending chain of subgroups G = G 1 G 2 G n,where G m = { x =(x 1, x 2,...,x n ):x 1 =0, x 2 =0,..., x m 1 =0} when m 2. We observe that the dilations δ r restrict to automorphisms of the G m. We then proceed as follows: (i) The standard (one-parameter) maximal functions and square functions on each group G m, as given in [3], are then lifted (or transferred ) to the group G. (ii) Compositions of these lifted objects lead to (n-parameter) maximal functions and square functions on G. Among these is the strong maximal function 1 M(f)(x) =sup f(xy 1 ) dy, m(r s ) R s for which one can prove vector-valued L p inequalities. Here R s = {x : x k s d k k },with(s 1,...,s n ) restricted to s 1 s 2 s n. There are also a pair of square functions, S and S, with the properties: (1.8) f L p A p S(f) L p and S(f) L p A p f Lp, for 1 <p<. (iii) The connection of these square functions with our operators T,given by Tf = f K with K a flag kernel, comes about because of the pointwise estimate: (1.9) S(Tf)(x) c S(f)(x), which is Lemma 8.13 below. Now, (1.8), together with (1.9), prove the L p boundedness of our operators. Among the ideas used to prove (1.9) is the notion of a truncated flag kernel: such a kernel is truncated at width a, a 0, if it satisfies the conditions such as (1.1), but with N 1 + + N k replaced by a + N 1 + + N k throughout (see Definition 6.20). A key fact that is exploited is that a convolution of a bump of width b with a truncated kernel of width a yields a truncated kernel of width a+b. For this, see Theorem 8.7, and its consequence, Theorem 8.9. 1.6. Final remarks The collection of operators with flag kernels contains both the automorphic (nonisotropic) Calderón Zygmund operators as well as the usual isotropic Calderón Zygmund operators with kernels of compact support (broadly speaking, the standard pseudo-differential operators of order 0). But flag kernels, by their definition, may have singularities away from the origin. Thus the algebra we are considering

Singular integrals with flag kernels on homogeneous groups I 637 consists of operators that are not necessarily pseudo-local. The study of a narrower algebra that arises naturally, which consists of pseudo-local operators and yet contains both types of Calderón Zygmund operators, will be the subject of the second paper [10] inthisseries. The authors are grateful to Brian Street for conversations and suggestions about the decomposition of flag kernels into sums of dilates of compactly supported functions. We would also like to thank the referee for a very careful reading of the paper. We note that the topic of this paper was the subject of several lectures given by one of us (EMS), in particular at a conference in honor of F. Treves at Rutgers, April, 2005, and at Washington University and UCLA in April and October 2008. During the preparation of this paper we learned of the work of G lowacki ([4], [5], and [6]) where overlapping results are obtained by different methods. We should also mention a forthcoming paper of Brian Street [12] that deals with the L 2 -theory in a more general context than is done in the present paper. 2. Dilations and flag kernels on R N Throughout the paper we shall use standard multi-index notation. Z denotes the set of integers, and N, the set of non-negative integers. If α =(α 1,...,α N ) N N, then α = α 1 + + α N and α! = α 1! α N!. If x = (x 1,...,x N ) R N, then x α = x α1 1 xαn N. For 1 j N, x j (or more simply j ) denotes the differential operator x j. If α N N, then α denotes the partial differential operator α1 1 αn N. The space of infinitely differentiable real-valued functions on R N with compact support is denoted by C0 (RN ) and the space of Schwartz functions is denoted by S(R N ). The basic semi-norms on these spaces are defined as follows: if ϕ C 0 (R N ), ϕ (m) =sup { α x ϕ(x) : α m, x R N } ; if ϕ S(R N ), ϕ [M] =sup { (1 + x e ) α β x ϕ(x) : α + β M, x RN }. Here x e denotes the usual Euclidean length of x R N. 2.1. The basic family of dilations Fix positive real numbers 0 <d 1 d 2 d N, and define a one-parameter family of dilations on R N by setting (2.1) δ r [x] =r x = ( r d1 x 1,...,r dn x N ). Also fix a smooth homogeneous norm x on R N so that r x = r x. The homogeneous ball of radius r is B(r) ={x R N : x <r}, and the homogeneous dimension of R N (relative to this family of dilations) is Q = d 1 + + d N. Recall that x e = x 2 1 + + x2 N denotes the ordinary Euclidean length of a vector x RN. If m(x) =c x α = cx α1 1 xαn N is a monomial, then m(r x) =rα1d1+ +αn dn m(x),

638 A. Nagel, F. Ricci, E. M. Stein and S. Wainger and the homogeneous degree of m is Δ(m) =α 1 d 1 + +α N d N. In particular, the homogeneous degree of a constant is zero. We shall agree that if the homogeneous degree of a monomial is negative, the monomial itself must be identically zero. With this convention, if m is any monomial, we have (2.2) Δ( j m)=δ(m) d j. We denote by H d the space of real-valued polynomials which are sums of monomials of homogeneous degree d. We have the following easy result: Proposition 2.1. If P is a polynomial, then P H d if and only if P (x) = α H d c α x α,whereh d = { α =(α 1,...,α N ) N n N j=1 α jd j = d }. Moreover, (1) if P H d,thenp(r x) =r d P (x); (2) if P H d1 and Q H d2,thenpq H d1+d 2 ; (3) if P H d,then k (P )(x) 0 if d k >d. 2.2. Standard flags and flag kernels in R N If X is an N-dimensional vector space, an n-step flag in X is a collection of subspaces X j X, 1 j n, such that (0) X 1 X 2 X n 1 X n = X. When X = R N we single out a special class of standard flags. These are parameterized by partitions N = a 1 + + a n (where each a j is a positive integer) as follows. We write (2.3) R N = R a1 R an, and we write x R N as x =(x 1,...,x n ) with x j R aj. With an abuse of notation, we identify R a k with vectors in R N of the form (0,...,0, x k, 0,...,0). Then the standard flag F associated to the partition N = a 1 + + a n and to the decomposition (2.3) is given by (2.4) (0) R an R an 1 R an R a2 R an R a1 R an = R N. In dealing with such decompositions and flags, it is important to make clear which variables in R N appear in which factor R a l. We can write x R N either as x =(x 1,...,x N ) with each x j R, orasx =(x 1,...,x n ) with x l = (x pl,...,x ql ) R a l so that q l = p l + a l 1. Denote by J l = {p l,p l +1,...,q l } the set of subscripts corresponding to the factor R a l so that {1,...,N} is the disjoint union J 1 J n. There is a mapping π : {1,...,N} {1,...,n} so that j J π(j) for 1 j N. Thus for example π(10) = 3 means that the variable x 10 belongs to the factor R a3. With the family of dilations defined in (2.1), the action on the subspace R a l is given by (2.5) r x l = ( r dp l xpl,...,r dq l xql ).

Singular integrals with flag kernels on homogeneous groups I 639 The homogeneous dimension of R a l is (2.6) Q l = d pl + + d ql = j J l d j. The function (2.7) N l (x l )= sup p l s q l x s 1/ds is a homogeneous norm on R a l so that N l (r x l )=rn l (x l ). If α=(α 1,...,α N ) N N, let ᾱ l =(α pl,...,α ql ), and set (2.8) [ᾱ l ] = α pl d pl + + α ql d ql = j J l α j d j. We introduce a partial order on the set of all standard flags on R N. Definition 2.2. Let A =(a 1,...,a r )andb =(b 1,...,b s ) be two partitions of N so that N = a 1 + + a r = b 1 + + b s. (1) The partition A is finer than the partition B (or B is coarser than A) ifthere are integers 1 = α 1 <α 2 < <α s+1 = r +1sothatb k = α k+1 1 j=α k a j. We write A Bor B A.IfA Bbut A B we write A Bor B A. (2) If F A and F B are the flags corresponding to the two partitions and if A B (or A B), we say that the flag F A is finer than F B (or F B is coarser than F A ) and we also write F A F B and F B F A (or F A F B and F B F A ). We recall from [9] the concept of a flag kernel on the vector space R N associated to the decomposition R a1 R an, equipped with the family of dilations given in equation (2.1). Let F be the standard flag given in (2.4). In order to formulate the cancellation conditions on the flag kernel, we need notation which allows us to split the variables {x 1,...,x n } into two disjoint sets. Thus if L = {l 1,...,l α } and M = {m 1,...,m β } are complementary subsets of {1,...,n} so that α + β = n, let N a = a l1 + +a lα and N b = a m1 + +a mβ.writex R N as x =(x, x )where x =(x l1,...,x lα )andx =(x m1,...,x mβ ). If f is a function on R Na and g is a function on R N b, define a function f g on R N by setting f g(x 1,...,x n )=f(x l1,...,x lα ) g(x m1,...,x mβ ). Definition 2.3. A flag kernel adapted to the flag F is a distribution K S (R N ) which satisfies the following differential inequalities (part (a)) and cancellation conditions (part (b)): (a) For test functions supported away from the subspace x 1 = 0, the distribution K is given by integration against a C -function K. Moreover, for every α = (α 1,...,α N ) Z N there is a constant C α so that if ᾱ k =(α pk,...,α qk ), then for x 1 0, α K(x) n [ C α N1 (x 1 )+ + N k (x k ) ] Q k [[ᾱ k ]]. k=1

640 A. Nagel, F. Ricci, E. M. Stein and S. Wainger (b) Let {1,...,n} = L M, with L = {l 1,...,l α }, M = {m 1,...,m β }, and L M = be any pair of complementary subsets. For any ψ C0 (R N b )and any positive real numbers R 1,...,R β, put ψ R (x m1,...,x mβ )=ψ(r 1 x m1,...,r β x mβ ). Define a distribution K # ψ,r S (R a l 1 + +a lr ) by setting K # ψ,r,ϕ = K,ψ R ϕ for any test function ϕ S(R a l 1 + +a lr ). Then the distribution K # ψ,r satisfies the differential inequalities of part (a) for the decomposition R a l 1 R a lr. Moreover, the corresponding constants that appear in these differential inequalities are independent of the parameters {R 1,...,R s }, and depend only on the constants {C α } from part (a), the semi-norms of ψ, and the support of ψ. The constants {C α } in part (a) and the implicit constants in part (b) are called the flag kernel constants for the flag kernel K. Remarks 2.4. (a) This definition proceeds by induction on the length n of the flag. The case n = 1 corresponds to Calderón Zygmund kernels, and the inductive definition is invoked in part (b). (b) With an abuse of notation, the distribution K # ψ,r is often written K # ψ,r (x l 1,...,x lr )= K(x) ψ R (x m1,...,x ms ) dx m1 dx ms. R am 1 R ams 3. Homogeneous vector fields In Section 6.6 below, where we consider a nilpotent Lie group G whose underlying space is R N, we will need to consider the families of left and right invariant vector fields on G. At this stage, before we introduce the group structure, we consider instead two spanning sets of vector fields {X 1,...,X N } and {Y 1,...Y N } on R N which are homogeneous with respect to the basic family of dilations given in (2.1); this means that if Z j is either X j or Y j for 1 j N, thenz j can be written 3 (3.1) Z j [ψ](x) = j [ψ](x)+ Pj(x) l l [ψ](x), d l >d j with Pj l H d l d j. It follows from part (3) ofproposition2.1 that k (Pj l) 0 if d k >d l. Thus we can commute the operators given by multiplication by the polynomial Pj l and differentiation with respect to x l and also write N (3.2) Z j [ψ](x) = j [ψ](x)+ l [Pj l ψ](x). d l >d j It follows from (3.1) or(3.2) thatz N = N. 3 Despite some risk of confusion, we do not introduce different notations for the coefficients of X j and Y j.

Singular integrals with flag kernels on homogeneous groups I 641 Proposition 3.1. If P H d and Z j is either X j or Y j,thenz j [P ] H d dj,and if d j >d, Z j [P ] 0. Proof. It follows from (2.2) thatifp H d,then l [P ] H d dl, and since P l j H dl d j, it follows from part (2) ofproposition2.1 that P l j x l [P ] H d dj. Thus Z j [P ] H d dj. The last conclusion then follows from part (3) ofproposition2.1. In equations (3.1) or(3.2), the vector fields {Z j } are written in terms of the Euclidean derivatives. Because these equations are in upper triangular form, it is easy to solve for the Euclidean derivatives in terms of the vector fields. Proposition 3.2. For each 1 j N, letz j denote either X j or Y j. Then there are polynomials Q l k H d l d k such that, for ψ S(R N ), k [ψ](x) =Z k [ψ](x)+ N d l >d k Q l k (x) Z l[ψ](x) =Z k [ψ](x)+ N d l >d k Z l [ Q l k ψ ] (x). Proof. We argue by reverse induction on the index k. Whenk = N it follows from equation (3.1) that N = Z N = X N = Y N. To establish the induction step, suppose that the conclusion of the proposition is true for all indices greater than k. From equation (3.1) and the induction hypothesis, for either choice of Z k we have N N [ N ] k [ψ] =Z k [ψ] Pk m m [ψ] =Z k [ψ] Pk m Z m [ψ]+ Q l mz l [ψ] d m>d k d m>d k d l >d m N [ ] = Z k [ψ] Pk m Z m [ψ] Pk m Q l m Z l [ψ]. d m>d k d k <d m<d l d l >d k But according to part (2) ofproposition2.1, Pk m the proof. Ql m H dl d k, and this completes For ψ S(R N )andt>0setψ t (x) =ψ(t 1 x). Then multiplication by a polynomial P H d is an operator homogeneous of degree d in the sense that P (x)ψ t (x) =t d (Pψ) t (x), and the vector fields X j and Y j are operators homogeneous of degree d j in the sense that X j [ϕ t ](x) =t dj (X j ϕ) t (x), Y j [ϕ t ](x) = t dj (Y j ϕ) t (x). In particular, the commutators [X j,x k ]and[y j,y k ] are vector fields which are homogeneous of degree (d j + d k ). It follows that we can write [X j,x k ]= Q m j,k (x) m = Rj,k m (x) Z m, with Q m j,k,rm j,k H d m d j d k, d m d j+d k d m d j+d k [Y j,y k ]= j,k (x) m = j,k (x) Z m, with Q m j,k, R j,k m H dm d j d k. d m d j+d k Qm d m d j+d k Rm If the operators {X j } and {Y j } are bases for a Lie algebra (as in the case of left or right invariant vector fields), the coefficients {Rj,k m } and { R j,k m } are constants.

642 A. Nagel, F. Ricci, E. M. Stein and S. Wainger Equations (3.1) or(3.2) express the vector fields {Z k } in terms of the standard derivatives { j }, and Proposition 3.2 expresses the standard derivatives in terms of the vector fields. We shall need analogous identities for products of r vector fields Z k1 Z kr or products of r Euclidean derivatives k1 kr. The formulas are somewhat complicated, since they involve products of operators of various lengths. To help with the formulation of the results, it will be convenient to introduce the following notation: Definition 3.3. Let k 1,...,k r {1,...,N} be a set of r integers, possibly with repetitions. (1) For any non-empty set U {1,...,r}, putd U = l U d k l and I(U) ={m {1,...,N} : d m d U }. Note that if U consists of two or more elements and m I(U), thend m > sup l U d kl. (2) For each integer 1 s r, let Us r denote the set of partitions of the set {1,...,r} into s non-empty disjoint subsets U = {U 1,...,U s }. The following proposition then shows how to write products of vector fields in terms of products of Euclidean derivatives. Proposition 3.4. Let k 1,...,k r {1,...,N} be a set of r integers, possibly with repetitions. For 1 l r, let Z kl denote either X kl or Y kl. Then there are polynomials P m l U l H dml d Ul such that Z k1 Z kr [ψ] = r s=1 (U 1,...,U s) U r s m 1 I(U 1) m s I(U s) m1 ms [ P m 1 U 1 P ms U s ψ ]. If s = r, so that U j = {k j }, the polynomial P kj U j (x) 1. Proof. We argue by induction on r. Thecaser =1iscontainedinequation(3.2), so suppose we are given vector fields {Z k1,...,z kr+1 } where each Z kl is either X kl or Y kl. Then, by induction, [ Z k1 Z kr+1 [ψ] =Z k1 Z kr Zkr+1 [ψ] ] r [ = m1 ms P m 1 [ (3.3) U 1 P ms U Zkr+1 s [ψ] ]]. s=1 (U 1,...,U s) Us r m 1 I(U 1) Since we can write Z kr+1 [ψ] = kr+1 [ψ]+ m s I(U s) {m : d m>d kr+1 } where P m k r+1 H dm d kr+1, the derivative m1 ms [ P m 1 U 1 m [ P m kr+1 ψ ] P ms U s [Z kr+1 [ψ]] ] in the

Singular integrals with flag kernels on homogeneous groups I 643 last line of (3.3) can be written [ m1 ms P m 1 [ U 1 P ms U Zkr+1 s [ψ] ]] [ = m1 ms P m 1 U 1 + d m>d kr+1 m1 ms = m1 ms kr+1 [ P m 1 U 1 [ P ms U kr+1 s [ψ] ]] [ P m 1 U 1 P ms U s ψ ] [ [ m1 ms kr+1 P m 1 ] ] U 1 P ms U s ψ + [ m1 ms m P m 1 U 1 d m>d kr+1 [ [ m1 ms m P m 1 U 1 d m>d kr+1 P ms U s [ m [P m k r+1 ψ] ]] P ms U s P m k r+1 ψ ] P ms U s ]P m k r+1 ψ ]. The terms in the first and third lines of the last expression [ have the right form for the case r +1. Thusin the term m1 ms kr+1 P m 1 U 1 P ms U s ψ ], r has been replaced by r +1, and the set{1,...,r,r +1} has been decomposed into s +1 subsets {U 1,...,U s,u s+1 } where U s+1 = {k r+1 },andp kr+1 U s+1 (x) 1. The same is [ true for each term m1 ms m P m 1 U 1 P ms U s Pk m r+1 ψ ], except that PU m s+1 = Pk m r+1. For the terms in the second and fourth lines, we use the product rule; we write [ kr+1 P m 1 ] [ U 1 P ms U s and m P m 1 U 1 P ms U s ]asasumofsterms. For example, [ [ m1 ms m P m 1 U 1 ]P m2 U 2 P ms U s Pk m r+1 ψ ] d m>d kr+1 [( [ ) = m1 ms P m 1 U 1 ]Pk m r+1 P m2 U 2 d m>d kr+1 m [ = m1 ms P m 1 P m2 Ũ 1 U 2 P ms U s ψ ] ] P ms U s ψ m1 where P = [ Ũ 1 d m>d kr+1 m P m 1 U 1 ]Pk m r+1 H dm1 d U1 d kr+1. These terms also have the right form for the case r + 1, since we now let Ũ1 = U 1 {k r+1 },sothat {k 1,...,k r+1 } = Ũ1 U 2 U s. This establishes the proposition. The next result shows how to write products of Euclidean derivatives in terms of products of vector fields. Since the proof is similar to that of Proposition 3.4, we omit it. Proposition 3.5. Let k 1,...,k r {1,...,N} be a set of r integers, possibly with repetitions, chosen from the set {1,...,N}. There are polynomials Q m l U l H dml d Ul such that r k1 kr [ψ] = Z m1 Z ms [Q m1 U 1 Q ms U s ψ]. s=1 (U 1,...,U s) U r s m 1 I(U 1) m s I(U s) Here each Z j is either X j or Y j. If s = r, so that U l = {k l }, the polynomial Q m l U l (x) 1.

644 A. Nagel, F. Ricci, E. M. Stein and S. Wainger 4. Normalized bump functions and their dilations 4.1. Families of dilations Fix the family of dilations on R N given in equation (2.1). We introduce an N-parameter family of dyadic 4 dilations. For f L 1 (R N )andi Z N set (4.1) 2 I x =(2 d1i1 x 1,...,2 dn in x N ), N [ d l i l f (x) =2 l=1 f(2 ]I I x). Then [f]i L1 (R N ) = f L 1 (R N ). The set of monotone increasing indices is denoted by (4.2) E N = { I =(i 1,...,i N ) Z N i1 i 2 i N }. When we consider flag kernels corresponding to the decomposition A given by N = a 1 + + a n and R N = R a1 R an, we consider the n-parameter family of dilations parameterized by n-tuples I =(i 1,...,i n ) Z n : (4.3) 2 I x =(2 i1 x 1,...,2 in x n ), where 2 il x l =(2 dp l i l x pl,...,2 dq l i l x ql ), and n [f] I (x) =2 Q l i l l=1 f(2 I x). The set of monotone increasing indices in this case is denoted by (4.4) E n = { I =(i 1,...,i n ) Z n i1 i 2 i n }. Given the decomposition A, there is a mapping p A : E n E N given by (4.5) p A (i 1,...,i n )= ( a1 {}}{ i 1,...,i 1, a 2 {}}{ i 2,...,i 2,..., a n {}}{ i n,...,i n ). We shall want to write flag kernels as sums of dilates [ϕ] I of normalized bump functions ϕ. Roughly speaking, a family of functions {ϕ α } in C 0 (RN )ors(r N ) is normalized if one has uniform control of the supports (in the case of C 0 (R N )) and of the semi-norms ϕ α (m) or ϕ α [M]. The following definition will simplify the precise statements of our results. Definition 4.1. (1) If ϕ, ϕ C0 (R n ), then ϕ is normalized in terms of ϕ if 5 there are constants C, C m > 0andintegersp m 0sothat: (a) If the support of ϕ is contained in the ball B(ρ), then the support of ϕ is contained in the ball B(Cρ). (b) For every non-negative integer m, ϕ (m) C m ϕ (m+pm). 4 In Section 8, we shall use a continuous version of this dyadic family. If t =(t 1,...,t N )with each t j > 0, we will set f t (x) =f(t d 1 1 x 1,...,t d N N x N ). 5 We shall sometimes use the expressions normalized with respect to or normalized relative to as a variant of normalized in terms of.

Singular integrals with flag kernels on homogeneous groups I 645 (2) If ψ, ψ S(R n ), then ψ is normalized in terms of ψ if there are constants C N > 0andintegersp N 0sothat ψ [N] C N ψ [N+pN ] for every nonnegative integer N. (3) If P, P are polynomials, then P is normalized in terms of P if P is obtained from P by multiplying each coefficient by a constant of modulus less than or equal to 1. If ϕ C 0 (RN ), it is sometimes convenient to write ϕ as a sum of products of functions of a single variable. That this is possible follows from the following fact: Proposition 4.2. Let ϕ C 0 (RN ). Then for each α N N and 1 k N there are functions ϕ α,k C 0 (R N ) so that ϕ(x 1,...,x N )= α N N c α ϕ α,1 (x 1 ) ϕ α,n (x n ), where for any M>0, there is a constant C M such that c α C M (1 + α ) M. Proposition 4.2 follows easily by appropriately periodizing ϕ, expanding ϕ in a rapidly converging Fourier series, and then multiplying by appropriate cutoff functions ψ(x 1 ) ψ(x N ). 4.2. Differentiation and multiplication of dilates of bump functions In this section we study the action of differentiation or multiplication by a homogeneous polynomial on dilates of bump functions. The key results are Proposition 4.7 and Corollary 4.8 below. We begin with the following result, which follows easily from the definitions and the chain rule: Proposition 4.3. Let ψ S(R N ).Then (4.6) 2 +d ki k k [ψ] I (x) =[ k ψ] I (x) and 2 d ki k x k [ψ] I (x) =[x k ψ] I (x). More generally, if I =(i 1,...,i N ) Z N,andα N N,then (4.7) 2 +[[α I]] α [ψ] I = [ α ψ ] I and 2 [[α I]] x α [ψ] I = [ x α ψ ] I, where [α I[] = N k=1 α kd k i k. We will frequently use the following generalization of the second identity in equation (4.7): Proposition 4.4. Let P H d where d<d l. If I E N, there is a polynomial P I H d, normalized in terms of P, so that for ψ S(R N ). P (x)[ψ] I (x) =2 di l[ PI ψ ] I (x)

646 A. Nagel, F. Ricci, E. M. Stein and S. Wainger Proof. Write P (x) = α H d c α x α with H d = {α =(α 1,...,α N ) N N : α 1 d 1 + + α N d N = d}. Since d<d l,ifα H d we have α j =0forj l. AccordingtoProposition4.3, it follows that P (x)[ψ] I (x) = α H d c α 2 [[α I]][ x α ψ ] (x), and I l 1 [α I ] = Thus m=1 l 1 α m d m i m =i l m=1 l 1 α m d m m=1 l 1 α m d m (i l i m )=di l m=1 P (x) [ [ ψ] I (x) =2 di l c α 2 l 1 ] m=1 αmdm(i l i m) x α ψ (x). I α H d α m d m (i l i m ). Since I E N, each exponent l 1 m=1 α md m (i l i m ) 0. The proof is complete if we set P I (x) = α H d c α 2 l 1 m=1 αmdm(i l i m) x α. Remark. Equation (4.6) shows that the operator 2 d ki k xk, applied to the I-dilate of a function ϕ, isthei-dilate of a function ϕ normalized in terms of ϕ, and multiplying the I-dilate of function ϕ by 2 d ki k x k is the I-dilate of a function ϕ normalized in terms of ϕ. Thus at scale I, the operators 2 d ki k xk and 2 d ki k x k are invariant ; they map the collection of I-dilates of normalized functions to itself. A key observation, which is used when we consider convolution on homogeneous nilpotent groups, is that we can replace the operator 2 d ki k xk with the operator 2 d ki k Z k, or conversely the operator 2 d ki k Z k by the operator 2 d ki k xk,atthecost of introducing an error involving terms 2 d li l Z l or 2 d li l l,wherel>k, multiplied by a gain 2 d k(i l i k ). The precise statement is given in Proposition 4.5 below. Let Pk l H d l d k be the homogeneous polynomials that are coefficients of a vector field Z k as in equations (3.1) or(3.2), and let Q l k H d l d k be the polynomials in Proposition 3.2. Sinced l d k <d l, we can use Proposition 4.4 to write (4.8) Pk l (x)[ψ] I(x) =2 i l(d l d k ) [ Pk,I l ψ] I (x), Q l k (x)[ψ] I(x) =2 i l(d l d k ) [ Q l k,i ψ] (x), I where P l k,i,ql k,i H d l d k are normalized relative to P l k and Ql k Proposition 4.5. Let I E N,andlet{Pk,I l } and {Ql k,i } be the homogeneous polynomials defined in equation (4.8). Letψ C0 (R N )(respectively ψ S(R N )). Then, N (2 d ki k Z k )[ψ] I =(2 d ki k k )[ψ] I + 2 dk(il ik) [ (2 dlil l ) P l k,i ψ ] I, d l >d k N (2 d ki k k )[ψ] I =(2 d ki k Z k )[ψ] I + 2 dk(il ik) [ (2 dlil Z l ) Q l k,i ψ ] I. d l >d k

Singular integrals with flag kernels on homogeneous groups I 647 The functions {P l k,i ψ} and {Ql k,i ψ} are normalized with respect to ψ in C 0 (RN ) (respectively in S(R N )). Proof. The first identity follows immediately from equations (3.2) and(4.8). To obtain the second, use Proposition 3.2 and equation (4.8); we have N 2 d ki k xk [ψ] I =2 d ki k Z k [ψ] I +2 d ki k Z l Q l k(x)[ψ] I which is the desired formula. l=k+1 l=k+1 N =2 d ki k Z k [ψ] I + 2 d k(i l i k ) (2 d li l Z l ) [ Q l k,i ψ], I Corollary 4.6. If ϕ C 0 (R N )(respectively ϕ S(R N )) and if I E N,then there is a function ϕ C 0 (RN )(respectively ϕ S(R N )), normalized with respect to ϕ, such that 2 d ki k Z k [ϕ] I =[ ϕ] I. We shall need an analogue of Proposition 4.5 for r-fold products of vector fields or Euclidean derivatives. If {P m l U l } and {Q m l U l } are the polynomials appearing in Propositions 3.4 and 3.5, we use Proposition 4.4 to define polynomials P m l U l,i and Q m l U l,i by the formulas P m l U l (x)[ψ] I (x) =2 i l(d ml d Ul ) [ P m l U l,i ψ] I (x), (4.9) Q m l U l (x)[ψ] I (x) =2 i l(d ml d Ul ) [ Q m l U l,i ψ] I (x). Proposition 4.7. Let k 1,...,k r {1,...,N} be a set of r integers, possibly with repetitions, and let I E N.LetZ l denote either X l or Y l.then (2 d k 1 i k1 Zk1 ) (2 d kr i kr Zkr )[ψ] I r = s=1 (U 1,...,U s) Us r m 1 I(U 1) m s I(U s) 2 r d kl i kl s i ml d Ul l=1 l=1 and (2 dm 1 im 1 m1 ) (2 dmsims ms ) [ P m1 ms U 1,I PU ψ] s,i I, (2 d k 1 i k1 k1 ) (2 d kr i kr kr )[ψ] I r = s=1 (U 1,...,U s) Us r m 1 I(U 1) m s I(U s) 2 r d kl i kl s i ml d Ul l=1 l=1 (2 dm 1 im 1 Zm1 ) (2 dms ims Z ms ) [ Q m1 U 1,I Qms U ψ] s,i I. In either identity, if s = r so that U l = {k l }, 1 l r, the polynomials P k l U l (x) =Q k l U l (x) 1.

648 A. Nagel, F. Ricci, E. M. Stein and S. Wainger Proof. Using Proposition 3.4, we have (2 d k 1 i k1 k1 ) (2 d kr i kr kr )[ψ] I r d kl i kl r s =2l=1 =2 = r d kl i kl l=1 r s=1 s=1 r s=1 s U Us r j=1 m j I(U j) s U Us r j=1 m j I(U j) U Us r j=1 m j I(U j) r d kl i kl s i ml d Ul l=1 l=1 2 [ Z m1 Z ms Q m 1 ] U 1 Q ms U s [ψ] I s i ml (d ml d Ul ) [ 2l=1 Zm1 Z ms Q m 1 U 1,I Qms (2 dm 1 im 1 Zm1 ) (2 dms ims Z ms ) [ Q m1 U 1,I Qms U s,i ψ] I. U s,i ψ] I This completes the proof of the first identity. The second identity is established in the same way. In Proposition 4.7, we can rewrite the exponent of the power of 2 as follows. Having chosen U = (U 1,...,U s ) U r s, there is a unique mapping σ = σ U : {1,...,r} {1,...,s} so that l U σ(l) for 1 l r. Then r d kl i kl l=1 s i ml d Ul = l=1 r d kl i kl l=1 r i mσ(l) d kl = l=1 r d kl (i mσ(l) i kl ). Since l U σ(l) and m σ(l) I(U σ(l) ), it follows that d mσ(l) d Uσ(l) d kl, with equality only possible if U σ(l) = {σ(l)}. ThusifI E N, it follows that i mσ(l) i kl, in which case l=1 2 r l=1 d k l i kl s l=1 im l du l 2 ɛ r l=1 (im σ(l) i k l ), wherewecantakeɛ = d 1 > 0. We can now recast the identities in Proposition 4.7 in a way which, although losing some information, makes them more useful and easier to work with when dealing with flag kernels. Corollary 4.8. Fix a decomposition R N = R a1 R an. Let k 1,...,k r {1,...,N} be a set of r integers, possibly with repetitions, with k l J π(l). 6 For 1 l r, letz kl denote either X kl or Y kl. Let I E n,andletψ S(R N ) or C0 (RN ). (1) The function (2 d k 1 i π(1)zk1 ) (2 d kr i π(r) Zkr )[ψ] I can be written as a finite sum of terms of the following form. Decompose the set {1,...,r} into two disjoint complementary subsets A and B with π(j) n for any j A and B. Let 6 Recall from page 638 that π : {1,...,N} {1,...,n}; for any coordinate x j,1 j N, then x j is a coordinate in the factor R a π(j),andj π(j) is the set of indices of all the coordinates in R a π(j).

Singular integrals with flag kernels on homogeneous groups I 649 B = {l 1,...,l s }, and choose integers m = {m 1,...,m s } so that each m t J lt. Then a typical term in the expansion of (2 d k 1 i π(1)zk1 ) (2 d kr i π(r) Zkr )[ψ] I is 2 ɛ j A (i π(j)+1 i π(j) ) (2 dm 1 i π(1) m1 ) (2 dms i π(s) ms ) [ ψ A,B, m ]I, where ψ A,B, m is normalized relative to ψ. (2) The function (2 d k 1 i π(1) k1 ) (2 d kr i π(r) kr )[ψ] I can be written as a finite sum of terms of the following form. Decompose the set {1,...,r} into two disjoint complementary subsets A and B with π(j) n for any j A and B. Let B = {l 1,...,l s }, and choose integers m = {m 1,...,m s } so that each m t J lt. Then a typical term in the expansion of (2 d k 1 i π(1) k1 ) (2 d kr i π(r) kr )[ψ] I is 2 ɛ j A (i π(j)+1 i π(j) ) (2 dm 1 i π(1) Z m1 ) (2 dms i π(s) Z ms ) [ ψ A,B, m ]I, where ψ A,B, m is normalized relative to ψ. Remarks 4.9. (a) The essential point of part (1) in the corollary is that, when replacing the operator with a sum of terms of the form (2 d k 1 i π(1) Z k1 ) (2 d kr i π(r) Z kr )[ψ] I 2 ɛ j A (i π(j)+1 i π(j) ) (2 dm 1 i π(1) m1 ) (2 dmsi π(s) ms ) [ ψ A,B, m ]I, either the factor (2 d k l i π(l) Z kl ) is replaced by a term (2 dm l i π(l) ml ), where the coordinate x ml belongs to the same subspace as x kl and hence has the same dilation i π(l), or it is replaced by the gain 2 ɛ(i π(l)+1 i π(l) ).Part(2) is the same assertion with the roles of the vector fields and Euclidean differentiation interchanged. (b) One term in the expansion of (2 d k 1 i π(1)zk1 ) (2 d kr i π(r) Zkr )[ψ] I in part (1) arises by letting A = so that B = {1,...,r}, and then choosing m r = k r. We have seen that in this case the function ψ A,B, m = ψ, sowegettheterm (2 d k 1 i π(1) k1 ) (2 d kr i π(r) kr )[ψ] I. Every other term in the expansion then involves either a gain 2 ɛ(i π(l)+1 i π(l) ) or the replacement of a variable x kl by a different variable x ml with d ml >d kl.we shall say that any term of the form 2 ɛ j A (i π(j)+1 i π(j) ) (2 dm 1 i π(1) m1 ) (2 dmsi π(s) ms ) [ ψ A,B, m ]I, where either A or some d ml >d kl is an allowable error. Thus the difference (2 d k 1 i π(1) Z k1 ) (2 d kr i π(r) Z kr )[ψ] I (2 d k 1 i π(1) k1 ) (2 d kr i π(r) kr )[ψ] I is a sum of allowable errors. (c) Since Euclidean derivatives commute, it follows that if σ is any permutation of the set {1,...,r}, then the difference (2 d k 1 i π(1) Z k1 ) (2 d kr i π(r) Z kr )[ψ] I (2 d k 1 i π(1) Z kσ(1) ) (2 d kr i π(r) Z kσ(r) )[ψ] I is a sum of allowable errors.

650 A. Nagel, F. Ricci, E. M. Stein and S. Wainger 5. Cancellation A function is often said to have cancellation if its average or integral is zero. For our purposes, we shall need a more refined notion involving integrals in some subset of variables. Let J = {j 1,...,j s } {1,...,N}. Ifψ S(R N ), write (5.1) ψ(x) dx J = ψ(x 1,...,x N ) dx j1 dx js. R s Note that ψ(x) dx J is then a function of the variables x k for which k/ J. Wesay that a function ψ has cancellation in the variables {x j1,...,x js } if R s ψ(x) dx J 0, where J = {j 1,...,j s }. Later, in Section 5.2, we shall give additional definitions of strong cancellation and weak cancellation for a function ψ. 5.1. Cancellation and the existence of primitives We begin by showing that cancellation in certain collections of variables is equivalent to the existence of appropriate primitives. Lemma 5.1. Let ψ S(R N ),andletj k {1,...,n} be non-empty subsets for 1 k r. Ifthesets{J k } are mutually disjoint, then the following two statements are equivalent: (a) For 1 k r, ψ(x) dx Jk =0. (b) There are functions ψ j1,...,j r S(R N ), normalized with respect to ψ, such that ψ(x) = j1 jr ψ j1,...,j r (x). j 1 J 1 j r J r Moreover, if the function ψ C0 (R N ), then we can choose the functions ψ j1,...,j r C0 (RN ) normalized with respect to ψ. It follows easily from the Fundamental Theorem of Calculus that (b) implies (a). The main content of the lemma is thus the opposite implication. This will follow by induction on r from the following assertion: Proposition 5.2. Let ψ C0 (RN ). Suppose that J 1,J 2 {1,...,N} are nonempty and disjoint. If ψ(x) dx J1 = ψ(x) dx J2 =0, then for each k J 1 there is a function ψ k C0 (RN ), normalized relative to ψ, so that: (i) we can write ψ = k J 1 k ψ k ; (ii) for each k J 1 we still have ψ k (x) dx J2 =0. If ψ S(R N ), the same conclusions hold except that the functions ψ k S(R N ) and are normalized relative to ψ.

Singular integrals with flag kernels on homogeneous groups I 651 Proof. By relabeling the coordinates, we can assume that J 1 = {1,...,k}, and that J 2 {k +1,...,N}. Suppose that ψ has compact support in the set B = {x R N : x j <a j }. Choose χ C0 (R) withsupportin[ 1, +1] such that R χ(s) ds = 1, and put χ j(t) =a 1 j χ(a 1 j t) sothatχ j issupportedin[ a j, +a j ] and still has integral equal to 1. Put ϕ 1 (x) =ψ(x) χ 1 (x 1 ) ψ(s, x 2,...,x N ) ds and for 2 j k, put ϕ j (x) = [ j 1 l=1 R ] χ l (x l ) ψ(s 1,...,s j 1,x j,...,x N ) ds 1 ds j 1 R j 1 [ j ] χ l (x l ) ψ(s 1,...,s j,x j+1,...,x N ) ds 1 ds j. R j l=1 Then the functions {ϕ j } have the following properties. First, since ψ(x) dx J1 =0, the second integral in the definition of the last function ϕ k is zero, and hence ψ(x) = k j=1 ϕ j(x). Next, it is clear that each ϕ j is supported in the set B. Finally, for 1 j k, ϕ j (x 1,...,x j 1,s,x j+1,...,x N ) ds =0, so if we put R (5.2) ψ j (x) = = xj x j ϕ j (x 1,...,x j 1,s,x j+1,...,x N ) ds ϕ j (x 1,...,x j 1,s,x j+1,...,x N ) ds, then ψ j is supported on the set B, andϕ j (x) = j ψ j (x). It is clear that one can estimate the size of the derivatives of the functions {ψ j } in terms of the derivatives of ψ, soψ j is normalized in terms of ψ. Moreover,since ψ(x) dx J2 = 0, it follows from their definitions that ϕ j (x) dx J2 = 0, and hence ψ j (x) dx J2 =0. This completes the proof if ψ C0 (RN ). If ψ S(R N ), the proof goes the same way. One only has to observe from equation (5.2) that the functions ψ j S(R N ). This completes the proof of Proposition 5.2, and hence Lemma 5.1 is also established. 5.2. Strong and weak cancellation In this section we introduce two kinds of cancellation conditions that can be imposedonfunctionsins(r N )orc0 (RN ). As discussed in Section 1.1, these concepts are used in the context of the decomposition R N = R a1 R an given in (2.3), where a 1 + + a n = N and each a j 1. Recall that if x R N,wewrite

652 A. Nagel, F. Ricci, E. M. Stein and S. Wainger x =(x 1,...,x n )wherex l =(x pl,x pl +1,...,x ql ) R a l.wethenletj l denote the set of integers {p l,p l +1,...,q l }.Ifϕ S(R N ), set (5.3) ϕ(x 1,...,x n ) dx l = ϕ(x 1,...,x n ) dx pl dx ql. R a l R a l Definition 5.3. Fix the decomposition (2.3), and let ϕ S(R N ). The function ϕ has strong cancellation if and only if ϕ(x 1,...,x n ) dx l 0, 1 l n. R a l That is, ϕ has strong cancellation if and only if it has cancellation in each collection of variables {x pl,...,x ql } for 1 l n. Remark 5.4. It follows from Proposition 5.2 that ϕ has strong cancellation if and only if there are functions ϕ j1,...,j n normalized with respect to ϕ so that ϕ = j1 jn ϕ j1,...,j n. j 1 J 1 j n J n We now introduce a weaker cancellation condition. Definition 5.5. Fix the decomposition (2.3) ofr N. Let ϕ S(R N )orϕ C0 (RN ), and let I =(i 1,...i n ) E n. The function ϕ has weak cancellation with parameter ɛ>0 relative to the multi-index I if and only if ( ϕ = 2 ɛ(is+1 is)) ( ) j1 jr [ϕa,b,j1,...,j r ] s B j 1 J α1 j r J αr A B={1,...,n} A={α 1,...,α r} n/ B where each ϕ A,B,j1,...,j r is normalized relative to ϕ. Here the outer sum is taken over all decompositions of the set {1,...,n} into two disjoint subsets A and B such that n A. According to Remark 5.4, ϕ has strong cancellation if it can be written as a sum of functions of the form j1 jn ϕ; i.e., asn th -derivatives of functions where there is one derivative in a variable from each of the n subspaces R a l. Definition 5.5 imposes a weaker condition; a function ϕ has weak cancellation if again it is a sum of terms, but the term j1 jn ϕ j1,...,j n is itself replaced by a new sum of terms. If a derivative jl does not appear, so that the term does not have cancellation in the subspace R a l, there is instead a gain given by 2 ɛ(i l+1 i l ). Remarks 5.6. There will be occasions when we will use the fact that a function ϕ has weak cancellation with respect to I E n to draw inferences about existence of primitives and smallness in only some of the subspaces {R a l }. In particular, the following assertions follow easily from Definition 5.5:

Singular integrals with flag kernels on homogeneous groups I 653 (1) Suppose that I E n and that ϕ has weak cancellation relative to I. Let M {1,...,n}. Then ( ϕ = 2 ɛ(is+1 is)) ( ) j1 js [ ϕa,b,j1,...,j r ] A B=M A={α 1,...,α r} n/ B s B j 1 J α1 j r J αr where each ϕ A,B,j1,...,j r is normalized relative to ϕ, and where the outer sum is over all subsets B M, with the understanding that if n M then n/ B. (2) In particular, if we take M = {1}, it follows that if ϕ has weak cancellation relative to I E n, we can write a 1 ϕ = r [ϕ r ]+2 ɛ(i2 i1) ϕ 0 r=1 where {ϕ 0,ϕ 1,...,ϕ a1 } are normalized relative to ϕ. Here, the derivatives are with respect to the variables in the first subspace R a1. We can also characterize weak cancellation in terms of the smallness of integrals, to be compared with the characterization of strong cancellation in terms of the vanishing of integrals given in Lemma 5.1. For any partition {1,...,n} = A B with A = {j 1,...,j a } and B = {k 1,...,k b }, write x R N as x = (x A, x B ) where x A =(x j1,...,x ja ), x B =(x k1,...,x kb ). Let dx A = dx j1 dx ja, and let dx B = dx k1 dx kb. Proposition 5.7. Let ɛ>0 and I E n.afunctionψ C0 (R N ) has weak cancellation with parameter ɛ relative to I if and only if for every partition {1,...,n} = A B into disjoint subsets we have 0 if n B, k B Ra k ψ(x A, x B ) dx B = [ k B 2 ɛ(i k+1 i k )] ψ A (x A ) if n/ B, where ψ A S( j A Raj ) is normalized relative to ψ. If ψ C0 (R N ),thefunctions ψ A C0 ( j A Raj ). Proof. It is clear that if ψ has weak cancellation, then it satisfies the condition of Proposition 5.7, so the main content is the opposite implication. Let χ l C0 (Ra l ) have support in the set where x l 1, with R χ(x a l l) dx l =1. For ψ S(R N )and1 l N, define L l [ψ](x) =ψ(x) χ l (x l ) ψ(x) dx l, R a l M l [ψ](x) =χ l (x l ) ψ(x) dx l. It is easy to check that the operators {L 1,...,L n,m 1,...,M n } all commute, and that M l [ψ](x) dx l = ψ(x) dx l and L l [ψ](x) dx l =0. R a l R a l R a l R a l