Sobolev’s inequality under a curvature-dimension condition



1 Introduction
Given d ∈ ℕ, d ≥ 2, and p ∈ [1, d), let p* ∈ [1, +∞) denote the Sobolev exponent, that is, p* = dp/(d − p). According to Sobolev's inequality, there exists a constant A > 0 such that

‖ϕ‖_{p*} ≤ A ‖∇ϕ‖_p

for every ϕ ∈ C_c^∞(ℝ^d); see [Sob38], as well as [Gag58, Nir59] for the case p = 1, [Rod66, Aub76b, Tal76] for the value of the sharp constant A and the expression of the extremals, [Lie83] for a more direct proof using rearrangements, and [CGS89] for the classification of all positive solutions to the associated Euler–Lagrange equation. In the special case p = 2, using the stereographic projection (see e.g. [LP87]), the sharp Sobolev inequality in ℝ^d is equivalent to

(‖v‖_q² − ‖v‖₂²)/(q − 2) ≤ (1/d) ‖∇v‖₂²,

where q = 2*, v ∈ C^∞(S^d), and S^d is the standard sphere equipped with its normalized measure. The inequality is again sharp and the extremals are known, see [Aub76a], as well as Theorem 5.1 p. 121 in [Heb00]. In fact, the inequality is true for every q ∈ [1, 2*], q ≠ 2, see [BV91, Bec93], as well as [Dem05], Section 3.11 for the case q ∈ [1, 2). Also note that, letting q → 2, one recovers the sharp log-Sobolev inequality. In [Ili83], Sobolev's inequality has been generalized as follows to any compact Riemannian manifold (M, g) with positive Ricci curvature.
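The q → 2 limit mentioned above can be made explicit; the following outline (our sketch, with σ the normalized measure on S^d and all norms taken with respect to σ) shows how the sharp log-Sobolev inequality arises:

```latex
% Sharp Sobolev inequality on (S^d,\sigma), q \in [1,2^*], q \neq 2:
\frac{\|v\|_q^2-\|v\|_2^2}{q-2} \;\le\; \frac{1}{d}\,\|\nabla v\|_2^2 .
% A direct differentiation of q \mapsto (\int_{S^d}|v|^q d\sigma)^{2/q} at q=2 gives
\lim_{q\to 2}\frac{\|v\|_q^2-\|v\|_2^2}{q-2}
 \;=\; \frac12\int_{S^d} v^2\,\log\frac{v^2}{\|v\|_2^2}\,d\sigma ,
% so letting q \to 2 yields the sharp log-Sobolev inequality on the sphere:
\int_{S^d} v^2\,\log\frac{v^2}{\|v\|_2^2}\,d\sigma \;\le\; \frac{2}{d}\,\|\nabla v\|_2^2 .
```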
Theorem A ([Ili83]) Let (M, g) be a smooth, connected, compact, d-dimensional Riemannian manifold, d ≥ 3. Assume that the Ricci curvature of M is uniformly bounded from below by a constant ρ > 0. Let q = 2* = 2d/(d−2). Then, for all v ∈ C^∞(M),

‖v‖_q² ≤ ‖v‖₂² + (4(d−1)/(d(d−2)ρ)) ‖∇v‖₂², (1)

where M is equipped with its normalized measure.
Remark 1.1 It is not necessary to assume that M is compact, as follows from Myers' theorem (see e.g. [Heb00] p. 100 for a geometric proof, or combine Theorems 3.2.7, 6.6.1 and 6.8.1 in [BGL14] for an analytic proof).
Many proofs of Theorem A are available. The approach in [Ili83] relies on symmetrization arguments and the Lévy–Gromov isoperimetric inequality [Gro07], the rigorous proof of which seems involved, see e.g. [Vil19]. The proof of [BV91] clarifies the computations of [GS81], but does not elucidate them. The latter paper presumably took inspiration from Obata's work [Oba62] (also described in [BGM71], pp. 179–185). In [BL96] (see Theorem 6.10 p. 107 in [Bak94] for the actual proof, as well as Chapter 6 in [BGL14] for a more recent and thorough account), the inequality is generalized to any Markov generator satisfying the curvature-dimension condition CD(ρ, n), ρ > 0, n > 2. Among other tools, their proof makes use of the Bakry–Émery method (or Γ-calculus) and a rather unintuitive change of unknown which was already present in the aforementioned literature. The proof of Fontenas [Fon97] provides a sharper version of the inequality in terms of the generator's best Poincaré constant in the case q ∈ [2, 2*). His computations, inspired by [Rot86], again use the Γ-formalism and recast the proof in a yet simpler form, but still fall short of making it transparent. In [DD02], Sobolev's inequality in ℝ^d appears as a limiting case of a family of optimal Gagliardo–Nirenberg inequalities. This paper puts forward two important tools for our purposes: the classification of solutions to the associated Euler–Lagrange equation, based there on the symmetry result of [GNN81], and, more importantly, the connection between Sobolev's inequality and the convergence to equilibrium of solutions to the fast-diffusion equation, or rather to a Fokker–Planck-type equation obtained by rescaling. The fast-diffusion and porous medium equations had just been reformulated in [Ott01] as gradient flows in Wasserstein space, leading the way to the reinterpretation of Sobolev's inequality (and more generally of the Gagliardo–Nirenberg inequalities studied by del Pino and Dolbeault) as a simple convexity inequality along a flow,
in other words as an entropy–entropy production inequality. This latter point of view was taken in [CT00], [CJM+01] and [CV03] to establish Sobolev-type inequalities in ℝ^d, and was more recently simplified and generalized to convex Euclidean domains in [Zug20]. Soon after, [CNV04] gave a short proof using optimal transport, valid in the Euclidean setting only. The Bakry–Émery method was further cleverly extended to nonlinear flows in the Riemannian setting in [Dem08], although without Otto's geometric insight, but with a twist: the use of two distinct entropy functionals, the evolutions of which can be related through a simple differential inequality. Other recent generalizations include the cases of RCD*(ρ, n) spaces [Pro15] and Riemannian manifolds with boundary [IS18]. Going back to the Euclidean setting, but with weights, [DEL16] extended the method to prove the sharp Caffarelli–Kohn–Nirenberg inequalities and the associated Liouville-type results.
We have probably forgotten to cite important contributions and, judging by the extent of the bibliography, one may wonder why we intend to give here yet another proof of Sobolev's inequality. From our point of view, the proof presented below, inspired by [DEL16], has the advantage of being short, transparent and, hopefully, robust. In particular, with no extra work, our proof yields the following generalization† of Theorem A, due to [BL96].
where M is equipped with the measure dν = (e^{−W}/Z) dVol_g, with Z ∈ ℝ*₊ chosen so that ν(M) = 1.
Remark 1.2 Again, it is not necessary to assume that M is compact, as follows from the generalized Myers theorem proved in [BL96].
where A > 0, f ∈ C^{1,α}(ℝ*₊; ℝ*₊) for some α ∈ (0, 1), and f is nonincreasing. Let A* = 4(n−1)/(n(n−2)ρ). Then A ≤ A*. In addition, if A = A*, then f is constant on [0, ‖v‖_∞].

Remark 1.4 If equality holds in Sobolev's inequality (1) for some nonconstant function v, then v solves the associated Euler–Lagrange equation (equation (2) with n = d, A = A*, L = ∆ and f constant). As follows from the proof of Theorem 1.3, the function Φ then satisfies ∇²Φ = (∆Φ/d) g. This in turn implies that (M, g) is conformally diffeomorphic to the round sphere, see e.g. Lemme 6.4.3 in [Heb97]. If we assume in addition that (M, g) is Einstein, letting d_g denote its Riemannian distance, we have in fact that (M, g) is isometric to the round sphere and that v(x) = (β − cos d_g(x₀, x))^{1−d/2}, for some β > 1 and x₀ ∈ M, see e.g. Theorem 5.1 in [Heb00] and its proof.
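As a consistency check (our own remark), on the round sphere S^n, where Ric_g = (n−1)g so that ρ = n−1, the constant A* reduces to the sharp spherical constant from the introduction:

```latex
A^*=\frac{4(n-1)}{n(n-2)\rho}
   =\frac{(q-2)(n-1)}{n\,\rho}\Big|_{q=2^*=\frac{2n}{n-2}},
\qquad
A^*\Big|_{\rho=n-1}=\frac{4}{n(n-2)}=\frac{q-2}{n}.
```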
2 Proofs of Theorem A, Theorem B and Theorem 1.3

2.1 Proof of Theorem A
Fix q ∈ [1, 2*). By the (non-sharp but tight) Sobolev inequality, there holds

‖v‖_q² ≤ A‖∇v‖₂² + ‖v‖₂² (3)

† For convenience of the reader, we recall in Section 2 the definition of the CD(ρ, n) condition used in Theorem B.
for some A ∈ ℝ*₊ and every v ∈ H¹(M); apply e.g. [Heb00], Corollary 2.1 and [BGL14], Proposition 6.2.2. Given A ∈ ℝ*₊, consider the minimization problem

I(A) := inf { A‖∇v‖₂² + ‖v‖₂² : v ∈ H¹(M), ‖v‖_q = 1 }.

Thanks to the Banach–Alaoglu–Bourbaki and Rellich–Kondrachov compactness theorems (see e.g. [Bre11], Theorem 3.16 and [Heb00], Theorem 2.9), there exists a minimizer v ∈ H¹(M) such that ‖v‖_q = 1. By Stampacchia's theorem [Sta66], |v| is also a minimizer, so we may assume that v ≥ 0 a.e. in M. In addition, a constant multiple of v (abusively denoted the same below) is a weak solution to the associated Euler–Lagrange equation (4). By standard elliptic regularity (see e.g. [Heb97], proof of Theorem 6.2.1, p. 248), v ∈ C³(M), and by the strong maximum principle (see e.g. [Heb97], Theorem 5.), v > 0 in M.

Multiply equation (5) by ∆Φ^{1−d′} and integrate. For the right-hand side, we find an expression involving the carré du champ operator Γ(Φ) = |∇Φ|², where c = 2λ(d − 1). For the left-hand side, we expressed the iterated carré du champ Γ₂(Φ) = ½ ∆|∇Φ|² − ∇Φ·∇∆Φ. Collecting the left and right-hand sides and dividing by d′, we obtain the key integral identity. The celebrated Bochner–Lichnerowicz formula states‡ that

Γ₂(Φ) = ‖∇²Φ‖²_{H.S.} + Ric_g(∇Φ, ∇Φ),

where ∇²Φ denotes the Hessian of Φ, ‖∇²Φ‖²_{H.S.} the square of its Hilbert–Schmidt norm (the sum of the squares of its components) and Ric_g the Ricci tensor of the Riemannian manifold (M, g). Using the Cauchy–Schwarz inequality on the one hand and the assumption Ric_g ≥ ρg on the other hand, we conclude.

‡ and motivates the definition of Γ₂

Let us quickly explain the definitions and notations used in the theorem. Clearly, a second order differential operator of the form§

LΦ = ∆Φ − ∇W·∇Φ

fails to satisfy the chain rule. The defect is measured by the carré du champ operator, defined for Φ ∈ C²(M) by

Γ(Φ) = ½ ( L(Φ²) − 2Φ LΦ ).

By a simple and direct computation, Γ(Φ) = |∇Φ|². Abusing notation slightly, we let Γ(Φ, Ψ) = ∇Φ·∇Ψ denote the polar form of Γ. Now, repeat the above consideration, replacing the product of real numbers, seen as a bilinear form, by the carré du champ operator Γ: again, L fails to satisfy the chain
rule, and the defect is measured by the iterated carré du champ operator, defined for smooth Φ by

Γ₂(Φ) = ½ ( LΓ(Φ) − 2Γ(Φ, LΦ) ). (7)

Thanks to the Bochner–Lichnerowicz formula, the Γ₂ operator can be computed as follows:

Γ₂(Φ) = ‖∇²Φ‖²_{H.S.} + (Ric_g + ∇²W)(∇Φ, ∇Φ).

The operator L is said to satisfy the curvature-dimension condition CD(ρ, n) if, for all smooth Φ,

Γ₂(Φ) ≥ ρ Γ(Φ) + (1/n)(LΦ)². (8)

Note that when W = 0, LΦ = ∆Φ. By the Cauchy–Schwarz inequality, ‖∇²Φ‖²_{H.S.} ≥ (1/d)(∆Φ)²¶, so that, in this case, the CD(ρ, d) condition is equivalent to the lower bound Ric_g ≥ ρg.

§ Here ∆ is the Laplace–Beltrami operator on (M, g), the dot product designates the Riemannian metric g and | · | the associated norm.
¶ with equality if and only if ∇²Φ = (∆Φ/d) g.
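The footnoted Cauchy–Schwarz step can be spelled out: writing λ₁, …, λ_d for the eigenvalues of ∇²Φ at a point,

```latex
(\Delta\Phi)^2=\Big(\sum_{i=1}^d\lambda_i\Big)^2
  \;\le\; d\sum_{i=1}^d\lambda_i^2
  \;=\; d\,\|\nabla^2\Phi\|_{\mathrm{H.S.}}^2 ,
```

with equality if and only if all the λᵢ are equal, that is, ∇²Φ = (∆Φ/d) g.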

2.2 Proof of Theorem B
Let us review the proof of Theorem A. We start similarly with the tight but non-sharp Sobolev inequality (3), the proof of which remains unchanged (e.g. adapt [Heb00], Theorem 4.1). Since M is compact and W is continuous, e^{−W} is bounded above and below by positive constants. So, the Riemannian volume and the measure dν = (e^{−W}/Z) dVol_g yield the same Sobolev space H¹(M, dν) = H¹(M, dVol_g). In particular, by the same proof, the quantity I(A) has a nonnegative minimizer u, which this time solves the Euler–Lagrange equation associated to L, where the definition of Φ is unchanged. Then, n′ ց n and the theorem follows.

2.3 Proof of Theorem 1.3
Repeating once again the above computation, we arrive at the analogous identity, where now c = 2λ(n − 1) = 4(n−1)/((n−2)A) and λ = (q−2)/(2A) = 2/((n−2)A). By the CD(ρ, n) condition, the first integral is nonnegative. Since f is nonincreasing, the last integral is also nonnegative. Finally, the coefficient in front of the second integral is strictly positive if A > A*, so that v must be constant in that case. If A = A*, then the first and third integrals vanish. In particular, f is constant on [0, ‖v‖_∞].

3 Sobolev's inequality is a convexity inequality for Rényi entropies in Wasserstein space
In this section, we explain the genesis of our short proof of Theorems A and 1.3.Our strategy consists in using a gradient flow defined on the set of probability measures over M, equipped with the Wasserstein distance.If one uses the appropriate functionals, the proof is rather simple.In the next paragraph, we explain first how a gradient flow in the usual Euclidean space R m can be used to derive sharp convexity inequalities.The extension of the method to the Wasserstein space is next presented in Section 3.2.The computations are not new, but this presentation and this point of view seem to be new and useful.
Many of our considerations will be formal: although this can be done, we do not try to make all arguments rigorous.Instead, we ask the reader to keep in mind that we only want to give a guideline to the rigorous proofs presented previously.

3.1 A review of gradient flows in Euclidean space
Let m ≥ 1 and let F : ℝ^m → ℝ be any C² function, which we call an entropy in what follows. Assume that F is strictly convex and coercive, i.e. lim_{|x|→+∞} F(x) = +∞. Then F has a unique critical point x*. In addition, F(x*) = inf_{x∈ℝ^m} F(x). In order to locate the minimum point x*, one can start from an arbitrary point x ∈ ℝ^m and follow the gradient flow associated to F. More precisely, let t ↦ S_t(x) denote the solution of the ODE

d/dt S_t(x) = −∇F(S_t(x)). (9)

Thanks to the Cauchy–Lipschitz theorem, t ↦ S_t(x) is well-defined on a maximal interval I containing t = 0. In fact, the solution is bounded, hence global, since F is coercive and nonincreasing along the flow:

d/dt F(S_t(x)) = −|∇F(S_t(x))|² ≤ 0. (10)

In addition, given any x ∈ ℝ^m,

lim_{t→+∞} F(S_t(x)) = F(x*). (11)

Indeed, since F is bounded below and (10) holds, there exists a sequence t_n → +∞ such that |∇F(S_{t_n}(x))| → 0. Since (S_t(x)) is bounded, up to extraction, (S_{t_n}(x)) also converges, and by continuity of |∇F|, its limit must be x*. Using (10) once more, F(S_t(x)) ≤ F(S_{t_n}(x)) for t ≥ t_n, and so F(S_t(x)) decreases to F(x*). (11) follows.
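As a sanity check, this construction can be simulated numerically. The sketch below (the specific entropy F is our own illustrative choice, not one from the text) discretizes the gradient flow (9) with an explicit Euler scheme and checks that F decreases along the flow and that S_t(x) approaches the unique minimizer x* = 0.

```python
import numpy as np

def F(x):
    # F(x) = |x|^4/4 + |x|^2/2 : strictly convex and coercive,
    # with unique critical point (and minimizer) x* = 0.
    r2 = np.dot(x, x)
    return 0.25 * r2 ** 2 + 0.5 * r2

def grad_F(x):
    # grad F(x) = (|x|^2 + 1) x
    return (np.dot(x, x) + 1.0) * x

def gradient_flow(x0, h=1e-3, steps=20000):
    """Explicit Euler discretization of dS_t/dt = -grad F(S_t)."""
    x = np.array(x0, dtype=float)
    values = [F(x)]
    for _ in range(steps):
        x = x - h * grad_F(x)
        values.append(F(x))
    return x, values

x_final, values = gradient_flow([1.0, -2.0])
assert all(a >= b for a, b in zip(values, values[1:]))  # F nonincreasing along the flow, cf. (10)
assert np.linalg.norm(x_final) < 1e-6                   # S_t(x) -> x* = 0, cf. (11)
```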
If we further assume that F is strongly convex, i.e. ∇²F ≥ ρ Id for some ρ > 0, then the rate of convergence of the entropy along its gradient flow can be quantified (as we shall prove shortly):

F(S_t(x)) − F(x*) ≤ e^{−2ρt} ( F(x) − F(x*) ).

Note that equality holds when t = 0, and so we can differentiate the inequality at t = 0. This yields the following equivalent convexity inequality:

F(x) − F(x*) ≤ (1/(2ρ)) |∇F(x)|².

Note that the inequality is sharp, in the sense that it is an equality for F(x) = ρ|x|²/2. In fact, one can be a bit more general and consider the convexity inequality

G(x) − G(x*) ≤ (1/(2ρ)) |∇F(x)|², (12)

which holds true whenever G ∈ C²(ℝ^m) and F satisfy the following convexity condition: there exists ρ > 0 such that, uniformly in ℝ^m,

∇²F ≥ ρ Id and ∇G·∇F ≤ |∇F|². (13)

We provide three proofs of this fact, ending with the most robust.
1. A direct proof based on the gradient flow. Differentiating (9) once more gives, for any t ≥ 0,

d/dt |∇F(S_t(x))|² = −2 ∇²F(∇F, ∇F)(S_t(x)) ≤ −2ρ |∇F(S_t(x))|²,

so that |∇F(S_t(x))|² ≤ e^{−2ρt} |∇F(x)|². By (13), d/dt G(S_t(x)) = −∇G·∇F(S_t(x)) ≥ −|∇F(S_t(x))|² ≥ −e^{−2ρt} |∇F(x)|². Integrating over [0, +∞) and using that

lim_{t→+∞} G(S_t(x)) = G(x*), (14)

we have G(x) − G(x*) ≤ (1/(2ρ)) |∇F(x)|², and we have proved inequality (12) under condition (13). As we can see, inequality (12) is just a clever convexity inequality under the convexity condition (13). As we shall see, when generalizing this proof to an infinite-dimensional setting, we are faced with two problems: proving rigorously the existence of the gradient flow (S_t)_{t≥0} and proving the two limits (11) and (14).
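For a strongly convex quadratic entropy, the exponential decay used in this proof can be checked exactly. In the sketch below (Q, x0 and the sampled times are our own illustrative choices), F(x) = xᵀQx/2 with Q symmetric positive definite, so the flow is S_t(x) = e^{−tQ}x, the strong convexity constant is ρ = λ_min(Q), and we verify |∇F(S_t(x))|² ≤ e^{−2ρt} |∇F(x)|².

```python
import numpy as np

Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])                 # symmetric positive definite
rho = np.linalg.eigvalsh(Q).min()          # strong convexity constant: Hess F = Q >= rho Id

def flow(x0, t):
    """Exact gradient flow of F(x) = x^T Q x / 2, i.e. S_t(x0) = expm(-t*Q) @ x0."""
    w, P = np.linalg.eigh(Q)               # Q = P diag(w) P^T
    return P @ (np.exp(-t * w) * (P.T @ x0))

x0 = np.array([1.0, 3.0])
g0 = float(np.dot(Q @ x0, Q @ x0))         # |grad F(x0)|^2
for t in [0.1, 0.5, 1.0, 2.0]:
    xt = flow(x0, t)
    gt = float(np.dot(Q @ xt, Q @ xt))     # |grad F(S_t(x0))|^2
    assert gt <= np.exp(-2 * rho * t) * g0 + 1e-12
```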

2. A proof based on a minimization problem and the gradient flow.
To prove (12), we fix a constant A > 0, consider the quantity

I(A) := inf_{x∈ℝ^m} { A|∇F(x)|² − G(x) }, (15)

and show that, for A > 1/(2ρ), −G(x*) ≤ I(A). Letting A ց 1/(2ρ), (12) will then follow. If G is coercive, which we assume in this approach, then there exists x ∈ ℝ^m attaining the infimum in (15). We now consider (S_t(x))_{t≥0}, the gradient flow starting from x. Then, since x is a minimizer, we have A|∇F(S_t(x))|² − G(S_t(x)) ≥ A|∇F(x)|² − G(x) for all t ≥ 0. In addition, differentiating at t = 0,

d/dt ( A|∇F(S_t(x))|² − G(S_t(x)) )|_{t=0} = −2A ∇²F(∇F, ∇F)(x) + ∇G·∇F(x) ≤ (1 − 2Aρ)|∇F(x)|². (16)

Since F is strictly convex and (13) holds, we see that if A > 1/(2ρ) and x ≠ x*, then ∇F(x) ≠ 0, and so the right-hand side of (16) is negative, which is impossible since x is a minimizer. Hence x = x*, and the inequality G(x) − G(x*) ≤ A|∇F(x)|² holds for any x ∈ ℝ^m and A > 1/(2ρ). This proves the desired inequality (12), by letting A → 1/(2ρ). Note that, in this approach, we no longer need to prove the asymptotic behavior of the gradient flow (S_t)_{t≥0}, but we still need to know its existence.
3. A proof based on the minimization problem only. As in the previous proof, let x be given by equation (15), with A > 1/(2ρ). Then x solves the Euler–Lagrange equation

2A ∇²F(x) ∇F(x) − ∇G(x) = 0.

Multiply the previous equality by ∇F(x) to conclude again, as in (16), that x = x*. Again, this implies inequality (12).
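A worked example (our own choice of F and G, for illustration): with F(x) = G(x) = ρ|x|²/2, the objective in (15) and its Euler–Lagrange equation can be followed by hand:

```latex
A|\nabla F(x)|^2 - G(x) \;=\; \rho\Big(A\rho-\tfrac12\Big)|x|^2
  \quad\text{(minimized at } x=0=x^* \text{ when } A>\tfrac{1}{2\rho}\text{)},
\qquad
2A\,\nabla^2F(x)\,\nabla F(x)-\nabla G(x) \;=\; \rho\,(2A\rho-1)\,x \;=\; 0
  \;\Longrightarrow\; x=x^* .
```

Letting A ց 1/(2ρ), the limiting inequality becomes an equality for every x in this example, which is the sharpness for F(x) = ρ|x|²/2 observed above.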
This last proof is quite interesting since we completely avoid using the gradient flow.Moreover, methods based on optimization problems are often robust.

3.2 Gradient flows in the space of probability measures
In this section, we reproduce the three methods of Section 3.1, this time in the space of probability measures over M. Before doing so, we need to introduce Otto's calculus, the key ingredient of our method. For simplicity, all computations are given on a d-dimensional smooth, connected and compact Riemannian manifold (M, g), but they can easily be generalized to the setting of weighted Riemannian manifolds under the CD(ρ, n) condition (8), as in Theorem 1.3 or Theorem B.

Otto's calculus
Otto's calculus, so named by C. Villani in his book [Vil09], is a very efficient tool to compute the second derivative of a functional along its gradient flow in the space of probability measures. This calculus was developed in the seminal papers [JKO98, Ott01, OV00]. It allows one to view the space of probability measures on a manifold, at least formally, as an infinite-dimensional Riemannian manifold. Our presentation is based on [GLR20], to which we refer for more details (see also [Gen20] for an informal presentation in French). The calculus can be viewed as a heuristic guideline, but all the results can be turned into rigorous statements, see the monograph [Gig12]. Let P₂(M) denote the space of probability measures on M admitting a second moment‖. Equip P₂(M) with the Wasserstein distance, defined as follows: for every μ, ν ∈ P₂(M),

W₂²(μ, ν) = inf_π ∫_{M×M} d(x, y)² dπ(x, y),

where the infimum is taken over all transportation plans π ∈ P(M × M) with marginals μ and ν, and where d is the Riemannian distance on M.
‖ Since we assumed for simplicity that M is compact, all probability measures admit a second moment and so P₂(M) = P(M) in this case.
Following the presentation of [AGS08, Chap. 1], a path [0, 1] ∋ t ↦ ν_t ∈ P₂(M) is absolutely continuous with respect to the Wasserstein distance if there exists m ∈ L¹(0, 1) such that W₂(ν_s, ν_t) ≤ ∫_s^t m(r) dr for all 0 ≤ s ≤ t ≤ 1. It turns out that, given any absolutely continuous path (ν_t)_{t∈[0,1]}, there exists a unique vector field V_t, defined for a.e. t in [0, 1], see [AGS08]. In addition, the vector field V_t is the limit in L²(ν_t) of gradients of compactly supported functions on M, and the continuity equation holds in the sense of distributions:

∂_t ν_t + ∇·(ν_t V_t) = 0. (17)

Conversely, given any such vector field V_t, there exists an absolutely continuous path (ν_t)_{t∈[0,1]} such that the continuity equation (17) holds. In other words, for almost every t ∈ [0, 1], we may see V_t as a tangent vector along the path (ν_t)_{t∈[0,1]}. So, we denote

ν̇_t := V_t (18)

and call ν̇_t the velocity of the path (ν_t)_{t∈[0,1]} at time t. The tangent space at a point μ ∈ P₂(M) can then be defined as the closure in L²(μ) of { ∇ϕ : ϕ ∈ C_c^∞(M) }, and a natural Riemannian metric can be defined via the scalar product in L²(μ):

⟨∇ϕ, ∇ψ⟩_μ = ∫ ∇ϕ·∇ψ dμ = ∫ Γ(ϕ, ψ) dμ, for ∇ϕ, ∇ψ ∈ T_μP₂(M).
We shall write |∇ϕ|²_μ = ∫ Γ(ϕ) dμ for the corresponding squared Riemannian norm. Such a metric is often referred to as the Otto metric. In addition, thanks to the Benamou–Brenier formulation, the Wasserstein distance is the Riemannian distance associated to the Otto metric.
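For reference, the Benamou–Brenier formulation invoked here can be written out; a standard form, in the notation of the continuity equation (17), is:

```latex
W_2^2(\mu_0,\mu_1)\;=\;\inf\Big\{\int_0^1\!\!\int_M |V_t|^2\,d\nu_t\,dt \;:\;
  \partial_t\nu_t+\nabla\!\cdot(\nu_t V_t)=0,\ \ \nu_0=\mu_0,\ \nu_1=\mu_1\Big\},
```

so the Wasserstein distance is exactly the geodesic distance for the Otto metric.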

Differentiating Rényi's entropy twice using Otto's calculus
To lighten notation and formulas, we henceforth identify measures and densities. Our presentation is heuristic, and all the measures considered in this section are supposed to be smooth and absolutely continuous with respect to the Riemannian measure on M. Unless specified, all integrals are taken with respect to the normalized Riemannian measure. Now, we consider our main flow (μ_t)_{t≥0}, started from a probability measure μ₀ = μ and solving the following nonlinear diffusion equation:

∂_t μ_t = ∆ μ_t^α, (19)

where α > 0, α ≠ 1. Then, according to the continuity equation (17), the velocity of this flow is given by

μ̇_t = −∇( (α/(α−1)) μ_t^{α−1} ).

Consider now the Rényi entropy (of order α > 0 with α ≠ 1),

R_α(μ) = (1/(α−1)) ∫ μ^α,

which is the main functional used in this article. Then the gradient of R_α is given by

grad_μ R_α = ∇( (α/(α−1)) μ^{α−1} ),

see for instance [GLR20, Sec. 3.2]. So, if (μ_t)_{t≥0} is a solution of (19), then μ̇_t = −grad_{μ_t} R_α.
In other words, (19) is the gradient flow of the Rényi entropy with respect to the Otto metric. This was proved rigorously in [Ott01]. Furthermore, the Riemannian structure given to P₂(M) allows us to define covariant derivatives and the Hessian of a functional. A remarkable fact is that the Hessian of Rényi's entropy in the sense of Otto's calculus has an explicit formulation: for any μ ∈ P₂(M) and ∇ϕ ∈ T_μP₂(M),

Hess_μ R_α(∇ϕ, ∇ϕ) = ∫ ( Γ₂(ϕ) + (α − 1)(∆ϕ)² ) μ^α, (23)

where the operator Γ₂ has been defined in (7) (see [Ott01] or [GLR20, Sec. 3.3]).
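The claim that (19) is the gradient flow of the Rényi entropy can be checked in one line. We assume here the normalization R_α(μ) = (1/(α−1)) ∫ μ^α with grad_μ R_α = ∇((α/(α−1)) μ^{α−1}), which we believe is consistent with [GLR20, Sec. 3.2]:

```latex
V_t=-\,\mathrm{grad}_{\mu_t}R_\alpha=-\nabla\Big(\tfrac{\alpha}{\alpha-1}\,\mu_t^{\alpha-1}\Big)
\quad\Longrightarrow\quad
\partial_t\mu_t=-\nabla\!\cdot(\mu_t V_t)
  =\nabla\!\cdot\big(\alpha\,\mu_t^{\alpha-1}\nabla\mu_t\big)
  =\Delta\big(\mu_t^{\alpha}\big),
```

which is exactly equation (19).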
Let us now turn to our three methods to prove inequality (1), under a lower bound of the Ricci curvature.

Method based on a convexity inequality for the Rényi entropy
We mimic the first proof proposed in Section 3.1,
and so we have the exact analogue of (13).
Since μ* = 1 is the unique critical point of R_α, using a rather delicate analysis that we do not develop here, one can prove that the following limits hold. Hence, by the very same proof as in Section 3.1, we arrive at the exact analogue of (12). Using the very definitions of R_α, R_β, α, β and Φ, we obtain, for any probability measure μ,
under the normalization ‖f‖_{2*} = 1 (so that μ is a probability measure). This is precisely Sobolev's inequality (1). This proof was first proposed by J. Demange in [Dem08]. This method is important since it shows that Sobolev's inequality under a lower bound on the Ricci tensor is just a convexity inequality applied to a functional (the Rényi entropy) along its gradient flow (the fast diffusion equation). The drawback of this method is that it is not so easy to prove the existence of a smooth global solution of the nonlinear diffusion equation (19) and the two limits (25).

Method based on a minimization problem associated with the fast diffusion equation
Now, let us mimic the second proof of Section 3.1. Given A > 0, we consider the minimization problem

I(A) := inf { A |grad_μ R_α|²_μ − R_β(μ) : μ ∈ P₂(M) }, (26)

and we prove that, for any A > α/(2ρ), −R_β(μ*) ≤ I(A), where μ* = 1. Then Sobolev's inequality follows, as discussed in the previous section. The first delicate point consists in proving that the infimum I(A) is attained by some measure μ, which we admit here. This being said, once we have a well-defined global smooth solution of the gradient flow (19), and once we have observed the strict convexity of R_α, which follows from (23) and the CD(ρ, d) condition, all the computations of Section 3.1 remain unchanged, leading to μ = μ* = 1, and the desired inequality is proved. The main advantage of this method, compared to the previous one, is that it is no longer necessary to prove the two delicate limits (25) for the fast diffusion equation. However, one needs to prove the existence of the minimizer μ, as well as the existence of a smooth solution of the fast diffusion equation (19). The method proposed in the proof of Theorem A avoids both problems by working in a subcritical setting and by using the limiting case, that is, the elliptic equation.

Method based only on the minimization problem
We mimic the third proof of Section 3.1. Consider again the minimization problem (26), and assume that there exists a probability measure μ minimizing I(A). Then μ satisfies the corresponding Euler–Lagrange equation, which, thanks to Otto's calculus, reads

2A Hess_μ R_α(grad_μ R_α, ·) − ⟨grad_μ R_β, ·⟩_μ = 0. (27)

Apply this equality to the test vector grad_μ R_α to get

2A Hess_μ R_α(grad_μ R_α, grad_μ R_α) − ⟨grad_μ R_β, grad_μ R_α⟩_μ = 0.

Using again the strict convexity of R_α and (24), we conclude that μ = 1.
The proof proposed in Section 2.1 is inspired by this one. The only difference is that here we work on the space of probability measures, whereas in Section 2.1, to prove the existence of a minimizer, we work on the space of functions v such that ‖v‖_q = 1, where q ∈ [1, 2*) is subcritical. The elliptic equation (4) is, up to a change of unknown, the equation (27), whereas multiplying by ∆Φ^{1−d′} and integrating, in the proof of Section 2.1, is exactly applying (27) to grad_μ R_α.