Geometric measure theory and differential inclusions

In this paper we consider Lipschitz graphs of functions which are stationary points of strictly polyconvex energies. Such graphs can be thought of as integral currents, resp. varifolds, which are stationary for some elliptic integrands. The regularity theory for the latter is a widely open problem; in particular, no counterpart of Allard's classical theorem is known. We address the issue from the point of view of differential inclusions and we show that the relevant ones do not contain the class of laminates which are used in [22] and [25] to construct nonregular solutions. Our result is thus an indication that an Allard-type result might be valid for general elliptic integrands. We conclude the paper by listing a series of open questions concerning the regularity of stationary points for elliptic integrands.


Introduction
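Let $\Omega \subset \mathbb{R}^m$ be open and $f \in C^1(\mathbb{R}^{n\times m}, \mathbb{R})$ be a (strictly) polyconvex function, i.e. such that there is a (strictly) convex $g \in C^1$ such that $f(X) = g(\Phi(X))$, where $\Phi(X)$ denotes the vector of subdeterminants of $X$ of all orders. We then consider the following energy $E : \mathrm{Lip}(\Omega, \mathbb{R}^n) \to \mathbb{R}$:
$$E(u) := \int_\Omega f(Du)\,dx. \tag{1}$$
Definition 1.1. Consider a map $\bar u \in \mathrm{Lip}(\Omega, \mathbb{R}^n)$. The one-parameter family of functions $\bar u + \varepsilon v$ will be called outer variations and $\bar u$ will be called critical for $E$ if
$$\frac{d}{d\varepsilon}\Big|_{\varepsilon=0} E(\bar u + \varepsilon v) = 0 \qquad \forall v \in C^1_c(\Omega, \mathbb{R}^n).$$
Given a vector field $\Phi \in C^1_c(\Omega, \mathbb{R}^m)$ we let $X_\varepsilon$ be its flow. The one-parameter family of functions $u_\varepsilon = \bar u \circ X_\varepsilon$ will be called an inner variation. A critical point $\bar u$ is stationary for $E$ if
$$\frac{d}{d\varepsilon}\Big|_{\varepsilon=0} E(u_\varepsilon) = 0 \qquad \forall \Phi \in C^1_c(\Omega, \mathbb{R}^m).$$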
Classical computations reduce the two conditions above to, respectively,
$$\int_\Omega \langle Df(D\bar u), Dv\rangle\,dx = 0 \qquad \forall v \in C^1_c(\Omega, \mathbb{R}^n) \tag{2}$$
and
$$\int_\Omega \langle Df(D\bar u), D\bar u\, D\Phi\rangle\,dx - \int_\Omega f(D\bar u)\,\mathrm{div}\,\Phi\,dx = 0 \qquad \forall \Phi \in C^1_c(\Omega, \mathbb{R}^m). \tag{3}$$
The graphs of Lipschitz functions can be naturally given the structure of integer rectifiable currents (without boundary in $\Omega \times \mathbb{R}^n$) and of integral varifolds, cf. [14,24,16]. In particular the graph of any stationary point $\bar u \in \mathrm{Lip}(\Omega, \mathbb{R}^n)$ for a polyconvex energy $E$ can be thought of as a stationary point for a corresponding elliptic energy, in the space of integer rectifiable currents and in that of integral varifolds, respectively, see [17, Chapter 1, Section 2]. Even though this fact is probably well known, it is not entirely trivial and we have not been able to find a reference in the literature: we therefore give a proof for the reader's convenience. Note that a particular example of polyconvex energy is given by the area integrand
$$\mathcal{A}(X) := \sqrt{\det(\mathrm{id}_{\mathbb{R}^m} + X^T X)}. \tag{4}$$
The latter is strongly polyconvex when restricted to any ball $B_R \subset \mathbb{R}^{n\times m}$, namely there is a positive constant $\varepsilon(R)$ such that $X \mapsto \mathcal{A}(X) - \varepsilon(R)|X|^2$ is still polyconvex on $B_R$.
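The polyconvexity of the area integrand can be made concrete with a small numerical sketch. Assuming the usual form $\mathcal{A}(X) = \sqrt{\det(\mathrm{id} + X^T X)}$, the Cauchy-Binet formula gives $\mathcal{A}(X)^2 = 1 + \sum_Z \det(X^Z)^2$, where the sum runs over all square submatrices, so $\mathcal{A} = g(\Phi(X))$ with the convex function $g(y) = \sqrt{1 + |y|^2}$. The helper names below (`det`, `minors`, `area_direct`, `area_from_minors`) are illustrative and not taken from the paper.

```python
from itertools import combinations
from math import sqrt, isclose

def det(M):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    n = len(M)
    if n == 0:
        return 1
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def minors(X):
    """All subdeterminants of X of all orders, i.e. the entries of Phi(X)."""
    n, m = len(X), len(X[0])
    out = []
    for r in range(1, min(n, m) + 1):
        for I in combinations(range(n), r):
            for J in combinations(range(m), r):
                out.append(det([[X[i][j] for j in J] for i in I]))
    return out

def area_direct(X):
    """sqrt(det(id + X^T X)), computed from the m x m Gram-type matrix."""
    n, m = len(X), len(X[0])
    G = [[(1 if a == b else 0) + sum(X[i][a] * X[i][b] for i in range(n))
          for b in range(m)] for a in range(m)]
    return sqrt(det(G))

def area_from_minors(X):
    """sqrt(1 + |Phi(X)|^2): the area integrand written as a convex function of the minors."""
    return sqrt(1 + sum(d * d for d in minors(X)))

X = [[1, 2, 0],
     [0, 1, 3]]  # a sample matrix with n = 2 rows and m = 3 columns
print(area_direct(X), area_from_minors(X))
```

Both expressions agree on the sample matrix, which is the content of the Cauchy-Binet identity in this case.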
When $n = 1$ strong polyconvexity reduces to locally uniform convexity and any Lipschitz critical point is therefore $C^{1,\alpha}$ by the De Giorgi-Nash theorem. The same regularity statement holds in the much simpler "dual case" $m = 1$, where criticality implies that the vector-valued map $\bar u$ satisfies an appropriate system of ODEs. Remarkably, L. Székelyhidi in [25] proved the existence of smooth strongly polyconvex integrands $f : \mathbb{R}^{2\times 2} \to \mathbb{R}$ for which the corresponding energy has Lipschitz critical points which are nowhere $C^1$. The paper [25] is indeed an extension of a previous groundbreaking result of S. Müller and V. Šverák [22], where the authors constructed a Lipschitz critical point to a smooth strongly quasiconvex energy (cf. [22] for the relevant definition) which is nowhere $C^1$. A precursor of such examples can be found in the pioneering PhD thesis of V. Scheffer [23].
Minimizers of strongly quasiconvex functions have instead been proved to be regular almost everywhere, see [12,20] and [23]. Note that the "geometric" counterpart of the latter statement is Almgren's celebrated regularity theorem for integral currents minimizing strongly elliptic integrands [5]. Stationary points need not be local minimizers for the energy, even though every minimizer for an energy is a stationary point. Moreover, combining the uniqueness result in [28] and [22, Theorem 4.1], it is easy to see that there exist critical points that are not stationary.
Other than the result in [28], not much is known about the properties of stationary points; in particular it is not known whether they must be $C^1$ on a set of full measure. Observe that Allard's $\varepsilon$-regularity theorem applies when $f$ is the area integrand and allows one to answer the latter question affirmatively for $f$ as in (4). The validity of an Allard-type $\varepsilon$-regularity theorem for general elliptic energies is however widely open. A first interesting question is whether one could extend the examples of Müller and Šverák and of Székelyhidi to provide counterexamples. Both in [22] and [25], the starting point of the construction of irregular solutions is rewriting the condition (2) as a differential inclusion, and then finding a so-called $T_N$ configuration ($N = 4$ in the first case, $N = 5$ in the latter) in the set defining the differential inclusion. The main result of the present paper shows that such a strategy fails in the case of stationary points. More precisely:
(a) We show that $\bar u$ solves (2), (3) if and only if there exists an $L^\infty$ matrix field $A$ that solves a certain system of linear, constant coefficient PDEs and takes almost everywhere values in a fixed set of matrices, which we denote by $K_f$ and call inclusion set, cf. Lemma 2.2. The latter system of PDEs will be called a div-curl differential inclusion, in order to distinguish it from classical differential inclusions, which are PDEs of type $Du \in K$ a.e., and from "divergence differential inclusions" as for instance considered in [8].
(b) We give the appropriate generalization of $T_N$ configurations for div-curl differential inclusions, which we will call $T'_N$ configurations, cf. Definition 2.6. As in the "classical" case, the latter are subsets of cardinality $N$ of the set $K_f$ which satisfy a particular set of conditions.
(c) We then prove the following nonexistence result.Theorem 1.2.If f ∈ C 1 (R n×m ) is strictly polyconvex, then K f does not contain any set {A 1 , . . ., A N } which induces a T ′ N configuration.Remark 1.3 (Székelyhidi's result).Theorem 1.2 can be directly compared with the results in [25], which concern the "classical" differential inclusions induced by (2) alone.In particular [25,Theorem 1] shows the existence of a smooth strongly polyconvex integrand f ∈ C ∞ (R 2×2 ) for which the corresponding "classical" differential inclusion contains a T 5 configuration (cf.Definition 2.5).In fact the careful reader will notice that the 5 matrices given in [25, Example 1] are incorrect.This is due to an innocuous sign error of the author in copying their entries.While other T 5 configurations can be however easily computed following the approach given in [25], according to [27], the correct original ones of [25, Example 1] are the following: These five matrices form a T 5 configuration with k i = 2, ∀1 ≤ i ≤ 5, P = 0, and rank-one "arms" given by Even though it seems still early to conjecture the validity of partial regularity for stationary points, our result leans toward a positive conclusion: Theorem 1.2 can be thought as a first step in that direction.
Another indication that an Allard-type $\varepsilon$-regularity theorem might be valid for at least some class of energies is provided by the recent paper [9] of A. De Rosa, the second named author and F. Ghiraldin, which generalizes Allard's rectifiability theorem to stationary varifolds of a wide class of energies. In fact the authors' theorem characterizes, in terms of an appropriate condition on the integrand (called "atomic condition", cf. [9, Definition 1.1]), those energies for which rectifiability of stationary points holds. Furthermore one can use the ideas in [9] to show that the atomic condition implies strong $W^{1,p}$ convergence of sequences of stationary equi-Lipschitz graphs, [11]. When transported to stationary Lipschitz graphs, the latter is yet another obstruction to applying the methods of [22] and [25]. In [10] it has been shown that the atomic condition implies Almgren's ellipticity. It is an intriguing issue to understand if this implication can be reversed and (if not) to understand whether this (hence stronger) assumption on the integrand can be helpful in establishing regularity of stationary points.
We believe that the connection between differential inclusions and geometric measure theory might be fruitful and poses a number of interesting and challenging questions. We therefore conclude this work with some related problems in Section 8.
The rest of the paper is organized as follows: in Section 2 we rewrite the Euler-Lagrange conditions (2) and (3) as a div-curl differential inclusion and we determine its wave cone. We then introduce the inclusion set $K_f$ and, after recalling the definition of $T_N$ configurations for classical differential inclusions, we define corresponding $T'_N$ configurations for div-curl differential inclusions. In Section 3 we give a small extension of a key result of [26] on classical $T_N$ configurations. In Section 4 we consider arbitrary sets of $N$ matrices and give an algebraic characterization of those sets which belong to an inclusion set $K_f$ for some strictly polyconvex $f$. In Section 5 we then prove the main theorem of the paper, Theorem 1.2. As already mentioned, Section 6 discusses the link between stationary graphs and stationary varifolds, whereas Section 8 is a collection of open questions.

Div-curl differential inclusions, wave cones and inclusion sets
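As written in the introduction, the Euler-Lagrange conditions for energies $E$ are given by:
$$\begin{cases} \displaystyle\int_\Omega \langle Df(D\bar u), Dv\rangle\,dx = 0 & \forall v \in C^1_c(\Omega, \mathbb{R}^n),\\[2mm] \displaystyle\int_\Omega \langle Df(D\bar u), D\bar u\, D\Phi\rangle\,dx - \int_\Omega f(D\bar u)\,\mathrm{div}\,\Phi\,dx = 0 & \forall \Phi \in C^1_c(\Omega, \mathbb{R}^m). \end{cases} \tag{5}$$
Here we rewrite the system (5) as a differential inclusion. To do so, it is sufficient to notice that the left hand side of the second equation can be rewritten as
$$\int_\Omega \langle D\bar u^T Df(D\bar u) - f(D\bar u)\,\mathrm{id}, D\Phi\rangle\,dx,$$
since $\langle A, BC\rangle = \langle B^T A, C\rangle$ and $\mathrm{div}\,\Phi = \langle \mathrm{id}, D\Phi\rangle$. Hence, the inner variation equation is the weak formulation of
$$\mathrm{div}\big(D\bar u^T Df(D\bar u) - f(D\bar u)\,\mathrm{id}\big) = 0.$$
Since also the outer variation is the weak formulation of a PDE in divergence form, namely $\mathrm{div}(Df(D\bar u)) = 0$, we introduce the following terminology:
Definition 2.1. A div-curl differential inclusion is the following system of partial differential equations for a triple of maps $X, Y \in L^\infty(\Omega, \mathbb{R}^{n\times m})$ and $Z \in L^\infty(\Omega, \mathbb{R}^{m\times m})$:
$$\mathrm{curl}\,X = 0, \qquad \mathrm{div}\,Y = 0, \qquad \mathrm{div}\,Z = 0, \tag{6}$$
$$W := \begin{pmatrix} X\\ Y\\ Z\end{pmatrix} \in K_f := \left\{ A \in \mathbb{R}^{(2n+m)\times m} : A = \begin{pmatrix} X\\ Df(X)\\ X^T Df(X) - f(X)\,\mathrm{id}\end{pmatrix} \right\}, \tag{7}$$
where $f \in C^1(\mathbb{R}^{n\times m})$ is a fixed function. The subset $K_f \subset \mathbb{R}^{(2n+m)\times m}$ will be called the inclusion set relative to $f$.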
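The following lemma is then an obvious consequence of the above discussion.
Lemma 2.2. Let $f \in C^1(\mathbb{R}^{n\times m})$. A map $u \in \mathrm{Lip}(\Omega, \mathbb{R}^n)$ is a stationary point of the energy (1) if and only if the matrix field $W = (Du;\, Df(Du);\, Du^T Df(Du) - f(Du)\,\mathrm{id})$ solves the div-curl differential inclusion (6)-(7).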
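The rewriting of the inner variation equation in divergence form appears to rest on two elementary matrix identities: $\langle A, BC\rangle = \langle B^T A, C\rangle$ for the Frobenius inner product, and $\mathrm{tr}(C) = \langle \mathrm{id}, C\rangle$. The sketch below checks the resulting pointwise identity on sample integer matrices, where `A`, `B`, `C` and `c` are stand-ins for $Df(Du)$, $Du$, $D\Phi$ and $f(Du)$; all names and values are illustrative assumptions.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inner(A, B):
    """Frobenius inner product <A, B> = sum_ij A_ij B_ij."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def identity(k):
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]

# Stand-ins with n = 2, m = 3 (values chosen arbitrarily).
A = [[1, 2, 0], [3, -1, 4]]              # role of Df(Du), n x m
B = [[2, 0, 1], [1, 1, -2]]              # role of Du, n x m
C = [[1, 0, 2], [0, 3, 1], [4, 1, -1]]   # role of DPhi, m x m
c = 5                                    # role of f(Du)

# Integrand of the inner variation: <Df, Du DPhi> - f div(Phi)
lhs = inner(A, matmul(B, C)) - c * trace(C)

# Same quantity as <Du^T Df - f id, DPhi>
BtA = matmul(transpose(B), A)
Zmat = [[BtA[i][j] - c * identity(3)[i][j] for j in range(3)] for i in range(3)]
rhs = inner(Zmat, C)
print(lhs, rhs)
```

The matrix `Zmat` plays the role of the third block $X^T Df(X) - f(X)\,\mathrm{id}$ of an element of the inclusion set.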

2.1. Wave cone for div-curl differential inclusions. We recall here the definition of wave cone for a system of linear constant coefficient first order PDEs. Given a system of linear constant coefficient PDEs
$$\sum_{i=1}^m A_i\, \partial_i z = 0 \tag{8}$$
in the unknown $z : \mathbb{R}^m \supset \Omega \to \mathbb{R}^d$, we consider plane wave solutions to (8), that is, solutions of the form
$$z(x) = a\, h(x\cdot \xi), \tag{9}$$
where $h : \mathbb{R} \to \mathbb{R}$. The wave cone $\Lambda$ is given by the states $a \in \mathbb{R}^d$ for which there is a $\xi \neq 0$ such that for any choice of the profile $h$ the function (9) solves (8), that is,
$$\sum_{i=1}^m \xi_i A_i\, a = 0. \tag{10}$$
The following lemma is then an obvious consequence of the definition and its proof is left to the reader.
Lemma 2.3. The wave cone of the system $\mathrm{curl}\,X = 0$ is given by rank one matrices, whereas the wave cone for the system (6) is given by triples of matrices $(X, Y, Z)$ for which there is a unit vector $\xi \in S^{m-1}$ and a vector $u \in \mathbb{R}^n$ such that $X = u \otimes \xi$, $Y\xi = 0$ and $Z\xi = 0$.
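The membership condition in Lemma 2.3 is easy to test once a candidate direction $\xi$ and vector $u$ are given. The following minimal sketch checks the three conditions $X = u \otimes \xi$, $Y\xi = 0$, $Z\xi = 0$ for supplied witnesses; it does not search for $\xi$ and $u$, and the function name `in_wave_cone` and the sample matrices are illustrative assumptions.

```python
def outer(u, xi):
    """Rank-one matrix u (x) xi with entries u_i * xi_j."""
    return [[ui * xj for xj in xi] for ui in u]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def in_wave_cone(X, Y, Z, u, xi):
    """Check X = u (x) xi, Y xi = 0 and Z xi = 0 for the given witnesses u, xi."""
    x_ok = X == outer(u, xi)
    y_ok = all(c == 0 for c in matvec(Y, xi))
    z_ok = all(c == 0 for c in matvec(Z, xi))
    return x_ok and y_ok and z_ok

# Example with n = m = 2 and the unit direction xi = e_1.
xi = [1, 0]
u = [2, -1]
X = [[2, 0], [-1, 0]]    # equals u (x) xi
Y = [[0, 5], [0, 7]]     # first column zero, so Y xi = 0
Z = [[0, 1], [0, -3]]    # first column zero, so Z xi = 0

good = in_wave_cone(X, Y, Z, u, xi)
bad = in_wave_cone(X, [[1, 0], [0, 1]], Z, u, xi)  # Y = id does not annihilate xi
print(good, bad)
```

The conditions reflect that a plane wave $(X, Y, Z)\,h(x\cdot\xi)$ is curl-free in its first block only if $X$ is rank one along $\xi$, and divergence-free in the other two blocks only if $Y$ and $Z$ annihilate $\xi$.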

Motivated by the above lemma we then define
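Definition 2.4. The cone $\Lambda_{dc} \subset \mathbb{R}^{(2n+m)\times m}$ consists of the matrices in block form $\begin{pmatrix} X\\ Y\\ Z\end{pmatrix}$ with the property that there is a direction $\xi \in S^{m-1}$ and a vector $u \in \mathbb{R}^n$ such that $X = u \otimes \xi$, $Y\xi = 0$ and $Z\xi = 0$.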

T N configurations.
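We start by defining $T_N$ configurations for "classical" differential inclusions.
Definition 2.5. A set of $N$ distinct matrices $X_1, \ldots, X_N \in \mathbb{R}^{n\times m}$ is said to induce a $T_N$ configuration if there are matrices $P, C_1, \ldots, C_N \in \mathbb{R}^{n\times m}$ and real numbers $k_1, \ldots, k_N > 1$ such that:
(a) each $C_i$ belongs to the wave cone of $\mathrm{curl}\,X = 0$, namely $\mathrm{rank}(C_i) \le 1$ for each $i$;
(b) $\sum_i C_i = 0$;
(c) $X_1, \ldots, X_N$, $P$ and $C_1, \ldots, C_N$ satisfy the following $N$ linear conditions
$$X_i = P + C_1 + \ldots + C_{i-1} + k_i C_i, \qquad i = 1, \ldots, N. \tag{11}$$
In the rest of the note we will use the word $T_N$ configuration for the data $P, C_1, \ldots, C_N, k_1, \ldots, k_N$. We will moreover say that the configuration is nondegenerate if $\mathrm{rank}(C_i) = 1$ for every $i$.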
Note that our definition is more general than the one usually given in the literature (cf. [22,25,26]) because we drop the requirement that there are no rank one connections between distinct $X_i$ and $X_j$. Moreover, rather than calling $\{X_1, \ldots, X_N\}$ a $T_N$ configuration, we prefer to say that it "induces" a $T_N$ configuration, namely we regard the whole data $X_1, \ldots, X_N, C_1, \ldots, C_N, k_1, \ldots, k_N$, since it is not at all clear that given an ordered set $\{X_1, \ldots, X_N\}$ of distinct matrices there is at most one choice of the matrices $C_1, \ldots, C_N$ and of the coefficients $k_1, \ldots, k_N$ satisfying the conditions above (if we drop the condition that the set is ordered, then it is known that there is more than one choice, see [15]).
We observe that the definition of $T_N$ configuration could be split into two parts. A "geometric part", namely the points (b) and (c), can be considered as characterizing a certain "arrangement of $2N$ points" in the space of matrices, consisting of:
• A closed piecewise linear loop, loosely speaking a polygon (not necessarily planar) with vertices $P_1 = P + C_1$, $P_2 = P + C_1 + C_2$, $\ldots$, $P_N = P + C_1 + \ldots + C_N = P$;
• $N$ additional "arms" which extend the sides of the polygon, ending in the points $X_1, \ldots, X_N$.
See Figure 1 for a graphical illustration of these facts in the case $N = 4$.
The closing condition in Definition 2.5(b) is a necessary and sufficient condition for the polygonal line to "close". Condition (c) determines that each $X_i$ is a point on the line containing the segment $P_{i-1}P_i$. Note that the inequality $k_i > 1$ ensures that $X_i$ is external to the segment, "on the side of $P_i$". The "nondegeneracy" condition is equivalent to the vertices of the polygon being all distinct. Note moreover that, in view of our definition, we include the possibility $N = 2$. In the latter case the $T_2$ configuration consists of a single rank one line and of 4 points $X_1, X_2, C_1, C_2$ lying on it. We have decided to follow this convention, even though this is an unusual choice compared to the literature.
The second part of the Definition, namely condition (a), is of algebraic nature and related to the fact that $T_N$ configurations are used to study "classical differential inclusions", namely PDEs of the form $\mathrm{curl}\,X = 0$. The condition prescribes simply that each vector $X_i - P_i$ belongs to the wave cone of $\mathrm{curl}\,X = 0$.
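As an illustration of the geometric description above, the following sketch assembles a $T_4$ configuration of diagonal $2\times 2$ matrices (the classical "Tartar square"), assuming the linear conditions take the usual form $X_i = P + C_1 + \ldots + C_{i-1} + k_i C_i$. The helper `build_tn` and the specific choice of $P$, $C_i$ and $k_i = 2$ are illustrative and not taken from the paper.

```python
def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(t, A):
    return [[t * a for a in row] for row in A]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def diag(a, b):
    return [[a, 0], [0, b]]

def build_tn(P, C, k):
    """Return the endpoints X_i = P + C_1 + ... + C_{i-1} + k_i C_i and the last vertex."""
    X, vertex = [], P
    for C_i, k_i in zip(C, k):
        X.append(add(vertex, scale(k_i, C_i)))  # arm extending the side starting at `vertex`
        vertex = add(vertex, C_i)               # move to the next vertex of the polygon
    return X, vertex

P = diag(-1, -1)
C = [diag(2, 0), diag(0, 2), diag(-2, 0), diag(0, -2)]  # sides of the inner square
k = [2, 2, 2, 2]
X, last_vertex = build_tn(P, C, k)

loop_closes = last_vertex == P                              # C_1 + ... + C_N = 0
arms_rank_one = all(det2(Ci) == 0 and Ci != diag(0, 0) for Ci in C)
no_rank_one_connections = all(det2(sub(X[i], X[j])) != 0
                              for i in range(4) for j in range(i))
print(X, loop_closes, arms_rank_one, no_rank_one_connections)
```

In this example the four endpoints have no rank one connections among them, so the set also satisfies the stricter definition commonly used in the literature.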

$T'_N$ configurations. In this section we generalize the notion of $T_N$ configuration to div-curl differential inclusions. The geometric arrangement remains the same, while the wave cone condition is replaced by the one dictated by the new PDE (6).
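Definition 2.6. A family $\{A_1, \ldots, A_N\} \subset \mathbb{R}^{(2n+m)\times m}$ of $N$ distinct matrices, written in block form as $A_i = (X_i; Y_i; Z_i)$, induces a $T'_N$ configuration if there are matrices $P, Q, C_i, D_i \in \mathbb{R}^{n\times m}$, $R, E_i \in \mathbb{R}^{m\times m}$ and coefficients $k_i > 1$ such that
$$X_i = P + C_1 + \ldots + C_{i-1} + k_i C_i, \qquad Y_i = Q + D_1 + \ldots + D_{i-1} + k_i D_i, \qquad Z_i = R + E_1 + \ldots + E_{i-1} + k_i E_i,$$
and the following properties hold:
(a) each element $(C_i, D_i, E_i)$ belongs to the wave cone $\Lambda_{dc}$ of (6);
(b) $\sum_\ell C_\ell = 0$, $\sum_\ell D_\ell = 0$ and $\sum_\ell E_\ell = 0$.
We say that the $T'_N$ configuration is nondegenerate if $\mathrm{rank}(C_i) = 1$ for every $i$.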
We collect here some simple consequences of the definition above and of the discussion on T N configurations.
Proposition 2.7.Assume A 1 , . . ., A N induce a T ′ N configuration with P, Q, R, C i , D i , E i and k i as in Definition 2.6.Then: (i) {X 1 , . . ., X N } induce a T N configuration of the form (11), if they are distinct; moreover the T ′ N configuration is nondegenerate if and only if the T N configuration induced by {X 1 , . . ., X N } is nondegenerate; (ii) For each i there is an n i ∈ S m−1 and a where (•, •) denotes the Euclidean scalar product.

Preliminaries on classical T N configurations
This section is devoted to a slight generalization of a powerful machinery introduced in [26] to study T N configurations.
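3.1. Székelyhidi's characterization of $T_N$ configurations in $\mathbb{R}^{2\times 2}$. We start with the following elegant characterization.
Proposition 3.1. ([26, Proposition 1]) Given a set $\{X_1, \ldots, X_N\} \subset \mathbb{R}^{2\times 2}$ and $\mu \in \mathbb{R}$, we let $A^\mu$ be the following $N \times N$ matrix:
$$A^\mu := \begin{pmatrix} 0 & \det(X_1 - X_2) & \det(X_1 - X_3) & \cdots & \det(X_1 - X_N)\\ \mu\det(X_1 - X_2) & 0 & \det(X_2 - X_3) & \cdots & \det(X_2 - X_N)\\ \vdots & \vdots & \ddots & & \vdots\\ \mu\det(X_1 - X_N) & \mu\det(X_2 - X_N) & \cdots & \mu\det(X_{N-1} - X_N) & 0 \end{pmatrix}.$$
Then, $\{X_1, \ldots, X_N\}$ induces a $T_N$ configuration if and only if there exists a vector $\lambda \in \mathbb{R}^N$ with positive components and $\mu > 1$ such that
$$A^\mu \lambda = 0.$$
Even though not explicitly stated in [26], the following Corollary is part of the proof of Proposition 3.1 and it is worth stating it here again, since we will make extensive use of it in the sequel.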
Corollary 3.2.Let {X 1 , . . ., X N } ⊂ R 2×2 and let µ > 1 and λ ∈ R N be a vector with positive entries such that A µ λ = 0. Define the vectors where ξ i > 0 is a normalizing constant so that t i 1 := j |t i j | = 1, ∀i.Define the matrices C j with j ∈ {1, . . ., N − 1} and P by solving recursively and set Finally, define Then P, C 1 , . . ., C N and k 1 , . . .k N give a T N configuration induced by {X 1 , . . ., X N } (i.e.(11) holds).Moreover, the following relation holds for every i: Remark 3.3.Observe that the relations ( 14) can be inverted in order to compute µ and λ (the latter up to scalar multiples) in terms of k 1 , . . ., k N .In fact, let us impose Then, regarding µ as a parameter, the equations ( 14) give a linear system in triangular form which can be explicitely solved recursively, giving the formula The following identity can easily be proved by induction: Hence, summing ( 16) and imposing j λ j = 1 we find the equation which determines uniquely µ as A second corollary of the computations in [26] is that (11) and let µ and λ be as in ( 16) and (17).Then A µ λ = 0.
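A hedged numerical check of the criterion $A^\mu\lambda = 0$ may help fix ideas. The sketch assumes that $A^\mu$ has entries $\det(X_i - X_j)$ above the diagonal, $\mu\det(X_i - X_j)$ below it and zeros on the diagonal, which is our reading of [26, Proposition 1]. The test set is the diagonal "Tartar square" with all $k_i = 2$, and the pair $\mu = 16$, $\lambda = (1, 2, 4, 8)$ was found by solving the linear system by hand for this example; it is not computed here from the formulas of Remark 3.3.

```python
def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def diag(a, b):
    return [[a, 0], [0, b]]

def a_mu(X, mu):
    """Assumed form of A^mu: det(X_i - X_j) for i < j, mu * det(X_i - X_j) for i > j."""
    N = len(X)
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if i < j:
                A[i][j] = det2(sub(X[i], X[j]))
            elif i > j:
                A[i][j] = mu * det2(sub(X[i], X[j]))
    return A

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Endpoints of a T_4 configuration with P = diag(-1,-1) and k_i = 2 (illustrative example).
X = [diag(3, -1), diag(1, 3), diag(-3, 1), diag(-1, -3)]
mu = 16
lam = [1, 2, 4, 8]

residual = matvec(a_mu(X, mu), lam)
print(residual)
```

Since the kernel condition is homogeneous in $\lambda$, any positive multiple of `lam` works equally well, for instance the normalization with $\sum_j \lambda_j = 1$.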

3.2. A characterization of $T_N$ configurations in $\mathbb{R}^{n\times m}$. We start with a straightforward consequence of the results above.
Let us first introduce some notation concerning multi-indexes. We will use $I$ for multi-indexes referring to ordered sets of rows of matrices and $J$ for multi-indexes referring to ordered sets of columns. In our specific case, where we deal with matrices in $\mathbb{R}^{n\times m}$, we will thus have $I = (i_1, \ldots, i_r)$ with $1 \le i_1 < \ldots < i_r \le n$ and $J = (j_1, \ldots, j_s)$ with $1 \le j_1 < \ldots < j_s \le m$, and we will use the notation $|I| := r$ and $|J| := s$. In the sequel we will always have $r = s$.
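Definition 3.5. We denote by $\mathcal{A}_r$ the set
$$\mathcal{A}_r := \{(I, J) : |I| = |J| = r\}, \qquad 1 \le r \le \min\{n, m\}.$$
For a matrix $M = (m_{ij}) \in \mathbb{R}^{n\times m}$ and for $Z \in \mathcal{A}_r$ of the form $Z = (I, J)$, we denote by $M^Z$ the square $r \times r$ matrix obtained from $M$ by considering just the entries $m_{ij}$ with $i \in I$, $j \in J$ (using the order induced by $I$ and $J$).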
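The multi-index notation can be sketched in a few lines. The functions below enumerate the pairs $Z = (I, J)$ with $|I| = |J| = r$ (using 1-based indices as in the text), extract the submatrix $M^Z$, and test the standard fact that a matrix has rank at most one exactly when all of its $2\times 2$ minors vanish, which is the mechanism that lets rank conditions be checked block by block. The function names are illustrative assumptions.

```python
from itertools import combinations

def index_pairs(n, m, r):
    """All Z = (I, J) with I, J increasing multi-indexes of length r (1-based)."""
    return [(I, J)
            for I in combinations(range(1, n + 1), r)
            for J in combinations(range(1, m + 1), r)]

def submatrix(M, Z):
    """The r x r matrix M^Z made of the entries m_ij with i in I, j in J."""
    I, J = Z
    return [[M[i - 1][j - 1] for j in J] for i in I]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def rank_at_most_one(M):
    """True iff every 2 x 2 minor of M vanishes."""
    n, m = len(M), len(M[0])
    return all(det2(submatrix(M, Z)) == 0 for Z in index_pairs(n, m, 2))

M = [[1, 2, 3],
     [4, 5, 6]]
u, v = [1, -2], [3, 0, 5]
C = [[ui * vj for vj in v] for ui in u]   # rank-one matrix u (x) v

print(index_pairs(2, 3, 2))
print(submatrix(M, ((1, 2), (1, 3))))
print(rank_at_most_one(C), rank_at_most_one(M))
```

For an $n \times m$ matrix there are $\binom{n}{r}\binom{m}{r}$ such pairs of order $r$.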
Given a set {X 1 , . . ., X N } ⊂ R n×m , µ ∈ R and Z ∈ A r , we introduce the matrix and only if there is a real µ > 1 and a vector λ ∈ R N with positive components such that Moreover, if we define the vectors t i as in (12), the coefficients k i through (14) and the matrices P and C i through (13), then P, C 1 , . . ., C N and k 1 , . . ., k N give a T N configuration induced by {X 1 , . . ., X N }.
For this reason and in view of Remark 3.3, we can introduce the following terminology: Definition 3.7.Given a T N -configuration P, C 1 , . . ., C N and k 1 , . . ., k N we let µ and λ be given by ( 16) and ( 17) and we call (λ, µ) ∈ R N +1 the defining vector of the T N configuration.
Proof of Proposition 3.6.Direction ⇐=.Fix a set {X 1 , . . ., X N } of matrices with the property that there is a common µ > 1 and a common λ with positive entries such that A µ Z λ = 0 for every Z ∈ A 2 .For each Z we consider the corresponding set {X Z 1 , . . ., Z Z N } and we use the formulas ( 12), ( 14) and ( 13) to find k 1 , . . ., k N , P (Z) and C i (Z) such that Since the coefficients k i are independent of Z, the formulas give that the matrices C i (Z) (and P (Z)) are compactible, in the sense that, if jℓ is an entry common to Z and Z ′ , then In particular there are matrics C i 's and P such that C i (Z) = C Z i and P (Z) = P Z and thus (11) holds.Moreover, we also know from Proposition 3.1 that rank(C Z i ) ≤ 1 for every Z and thus rank(C i ) ≤ 1.We also know that C Z 1 + . . .+ C Z N = 0 for every Z and thus C 1 + . . .+ C N = 0. Direction =⇒.Assume X 1 , . . ., X N induce a T N configuration as in (11).Then X Z 1 , . . ., X Z N induce a T N configuration with corresponding P Z , C Z 1 , . . ., C Z N and k 1 , . . ., k N , where the latter coefficients are independent of Z. Thus, by Corollary 3.4, A µ Z λ = 0. 3.3.Computing minors.We end this section with a further generalization, this time of (15): we want to extend the validity of it to any minor.(11) with defining vector (λ, µ).Define the vectors t 1 , . . ., t N as in (12) and for every Z ∈ A r of order r ≤ min{n, m} define the minor S : and A µ Z λ = 0. 
Fix any matrix A ∈ R m×m .In the following we will denote by cof(A) the m × m matrix defined 1 as cof(A) ij := (−1) i+j det(A j,i ), 1 Note that sometimes in the literature one refers to what we called cof(A) as the adjoint of A, and the adjoint of where A j,i is the m − 1 × m − 1 matrix obtained by eliminating from A the j-th row and the i-th column.It is well-known that We will need the following elementary linear algebra fact, which in the literature is sometimes called Matrix Determinant Lemma: Lemma 3.9.Let A, B be matrices in R m×m , and let rank(B) ≤ 1.Then, Moreover, we need another elementary computation, which is essentially contained in [26] and for which we report the proof at the end of the section for the reader's convenience.Lemma 3.10.Assume the real numbers µ > 1, λ 1 , . . ., λ N > 0 and k 1 , . . ., k N > 1 are linked by the formulas (14).Assume v, v 1 , . . ., v N , w 1 , . . ., w N are elements of a vector space satisfying the relations If we define the vectors t i as in (12), then Proof of Proposition 3.8.Fix the Z of the statement of the proposition.X Z 1 , . . ., X Z N induces T N with the same coefficients k 1 , . . .k N .This reduces therefore the statement to the case in which m = n, Z = ((1, . . .n), (1, . . ., n)) and the minor S is the usual determinant.
To prove the second part of the statement notice that A µ Z λ = 0 is equivalent to the following N equations: Fix i ∈ {1, . . ., N } and define matrices Y j := X j − X i , ∀j. {Y 1 , . . ., Y N } is still a T N configuration of the form and P ′ = −X i (recall that P = 0).Apply now (18) to find that and conclude the proof.
3.4.Proof of Lemma 3.10.It is sufficient to compute separately N j=1 t 1 j w j = N j=1 λ j w j and i−1 j=1 λ j w j .In fact, We can write Recalling that the defining vector and the numbers k i are related through (14), we compute Hence On the other hand, and Also, We can now compute (22): We use the just obtained identity Using that v 1 + . . .+ v N = 0 we conclude the desired identity.
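The Matrix Determinant Lemma used in this section can be checked numerically. The sketch assumes its standard form $\det(A + B) = \det(A) + \langle \mathrm{cof}(A)^T, B\rangle$ for $\mathrm{rank}(B) \le 1$, with $\mathrm{cof}(A)_{ij} = (-1)^{i+j}\det(A^{j,i})$ as in the convention fixed above, so that $\mathrm{cof}(A)\,A = \det(A)\,\mathrm{id}$. The sample matrices are arbitrary and the helper names are illustrative.

```python
def det(M):
    """Determinant by Laplace expansion along the first row."""
    n = len(M)
    if n == 0:
        return 1
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def cof(A):
    """cof(A)_ij = (-1)^(i+j) det(A with row j and column i removed)."""
    m = len(A)
    def removed(r, c):
        return [[A[a][b] for b in range(m) if b != c] for a in range(m) if a != r]
    return [[(-1) ** (i + j) * det(removed(j, i)) for j in range(m)] for i in range(m)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]
u, v = [1, 2, 3], [1, 0, -1]
B = [[ui * vj for vj in v] for ui in u]           # rank-one perturbation u (x) v

A_plus_B = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
K = cof(A)
# <cof(A)^T, B> = sum_ij cof(A)_ji * B_ij
pairing = sum(K[j][i] * B[i][j] for i in range(3) for j in range(3))

lhs = det(A_plus_B)
rhs = det(A) + pairing
print(lhs, rhs)
```

The identity is exact for rank-one `B`; for perturbations of higher rank the determinant picks up additional terms involving higher order minors.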

Inclusion sets relative to polyconvex functions
In this section we consider the following question.Given a set of distinct matrices do they belong to a set of the form for some strictly polyconvex function f : R n×m → R? Observe that A i = A j , for i = j if and only if X i = X j , for i = j.Below we will prove the following Proposition 4.1.Let f : R n×m → R be a strictly polyconvex function of the form f (X) = g(Φ(X)), where g ∈ C 1 is strictly convex and Φ is the vector of all the subdeterminants of X, i.e.
and v s (X) = (det(X Z 1 ), . . ., det(X Z #As )) for some fixed (but arbitrary) ordering of all the elements Z ∈ A s .If The expressions in (27) can be considerably simplied when the matrices X 1 , . . ., X N induce a T N configuration.
4.1.Proof of Proposition 4.1.The strict convexity of g yields, for i = j, A simple computation shows that for the function det(•) : R r×r → R: In the following equation, we will write, for an n × m matrix M and for Z ∈ A r , cof(M Z ) T to denote the n × m matrix with 0 in every entry, except for the rows and columns corresponding to the multiindex Z = (I, J), which will be filled with the entries of the matrix cof(M Z ) T ∈ R r×r , namely, if i / ∈ I or j ∈ J, then (cof(M Z ) T ) ij = 0 and, if we eliminate all such coefficients, the remaining r × r matrix equals cof(M Z ) T .Moreover, we will identify the differential of a map from R n×m to R with the obvious associated matrix.We thus have the formula In order to simplify the notation set now d i Z := ∂ Z g(Φ(X i )).The previous expression yields: Finally, summing c i − c j on both sides: we see that the previous inequality implies the conclusion ∀i = j, The result is a direct consequence of Lemma 3.9 and Proposition 3.8.First of all, by Proposition 3.8 we have Moreover, by ( 13), we get Finally, apply Lemma 3.9 to A = X Z i and These three equalities together give (28). and

Proof of Theorem 1.2
In this section we prove the main theorem of this paper.

Gauge invariance.
In the first part we state a corollary of some obvious invariance of polyconvex functions under certain groups of transformations.This invariance will then be used in the proof of Theorem 1.2 to bring an hypothetical T ′ N configuration into a "canonical form".Lemma 5.1.Let f : R n×m be strictly polyconvex and assume that K f contains a set of matrices {A 1 , . . ., A N } which induces a nondegenerate T ′ N configuration, denoted by where Then, for every S, T ∈ R n×m , a ∈ R, there exists another strictly polyconvex function f such that the family of matrices lie in K f , ∀i, and they have the following properties: • The matrices X i , Y i have the form • the matrices Z i are of the form Proof.We consider f of the form We now show that the modification of Zi into Z i with the properties listed in the statement of the present proposition will let us fulfill also the last requirement, namely that Analogously, we denote with c i = f ( Xi ).We write We can thus rewrite For every fixed j, we decompose in a unique way V = V j + V ⊥ j , where V ⊥ j = (V n j ) ⊗ n j and V j = V − V ⊥ j .Note that, since C j = u j ⊗ v j , this implies that C T j V j − C j , V j id is a scalar multiple of the orthogonal projection on span(n j ) ⊥ .Therefore, Consequently, XT i V has the following form: Finally, we define, Resuming the main computations, we have obtained that: we are finally able to say that the first part of the Proposition is proved provided that To simplify future computations, let us use the identities 36)-(37) are easily checked by the linearity of the previous expressions and the identity (24).

Proof of Theorem 1.2. Assume by contradiction the existence of a T ′
N configuration induced by matrices {A 1 , . . ., A N } which belong to the inclusion set K f of some stictly polyconvex function f ∈ C 1 (R n×m ).Note that the corresponding {X i } must be all distinct, because Y i = Df (X i ) and Thus {X 1 , . . ., X N } induce a T N configuration.We consider coefficients k 1 , . . ., k N and matrices P, Q, R, C i , D i , E i as in Definition 2.6.By Lemma 5.1 we can assume, without loss of generality, that P = 0 = Q and tr(R) = 0 .We are now going to prove that the system of inequalities where c i and t i j are as in Corollary 4.3, cannot be fulfilled at the same time.This will then give a contradiction.In order to follow our strategy, we need to compute the following sums: Let us start computing the sum for i = 1, j λ j X T j Y j .We rewrite it in the following way: where we collected in the coefficients g ij the following quantities: As already computed in (23), we have: On the other hand, Using the equalities ℓ C ℓ = 0 = ℓ D ℓ , then also i,j C T i D j = 0, and so We just proved that In particular, since C T i D i is trace-free for every i.We also have: Since both tr(R) and tr i k i (k i − 1)λ i C T i D i = 0, then j λ j c j = 0 and we get Recall the definition of t i , namely By the previous computation (i = 1) and (36), it is convenient to rewrite (39) as Once again, let us express the sum up to i − 1 in the following way: A combinatorial argument analogous to the one in the previous case gives We rewrite (43) as E i is readily computed using (44) and the definition of Z i : The evaluation of the previous expression at the vectors n i of Proposition 2.7(ii) yields Now, since C i v = 0, ∀v ⊥ n i , we must have The last equality implies that the right hand side of (45) is exactly ν i n i , where ν i has been defined in (38).We will now prove that there exists a nontrivial subset A ⊂ {1, . . ., N } such that j∈A ξ j ν j = 0, and this will conclude the proof, being ξ j > 0, ∀j.
then we can rewrite (45) as Now, consider the set A ⊂ {1, . . ., N } defined as A = {1} ∪ {j : n j cannot be written as a linear combination of vectors n ℓ , for any ℓ ≤ j}.
Clearly span({n s : s ∈ A}) = span(n 1 , . . ., n N ) ⊂ R m and moreover {n s : s ∈ A} are linearly independent.Define S := span({n s : s ∈ A}), and consider the relation for i ∈ A. This can be rewritten as for some coefficients d αi .Recall that By the properties of the matrices C i 's, we see that Im(R) ⊆ S. Now complete (if necessary) {n s : s ∈ A} to a basis B of R m adding vectors γ j with the property that (γ j , γ k ) = (γ j , n s ) = 0, ∀j = k, s ∈ A and γ j = 1, ∀j.By the previous observation about the image of R and (46), we are able to write the matrix of the linear map associated to R for the basis B as We denoted with 0 a,b the zero matrix with a rows and b columns, with T the dim(S)× (m − dim(S)) matrix of the coefficients of Rγ j with respect to {n s : s ∈ A}, and with * numbers we are not interested in computing explicitely.Moreover, we chose an enumeration of A with . The previous matrix must have the same trace as R, so and the proof is finished.

Stationary graphs and stationary varifolds
The aim of this section is to provide the link between stationary points for energies defined on functions (or graphs) and stationary varifolds for "geometric" energies.
6.1. Notation and preliminary definitions. Recall that general $m$-dimensional varifolds in $\mathbb{R}^{m+n}$ (introduced by L.C. Young in [30] and pioneered in geometric measure theory by Almgren [4] and Allard [1]) are nonnegative Radon measures on the Grassmannian $G(m, m+n)$ of (unoriented) $m$-dimensional planes of $\mathbb{R}^{m+n}$. In our specific case we are interested in a subclass, namely integer rectifiable varifolds, for which we can take the simpler Definition 6.1 below. A quick reference for the terminology used in this section is [7], whereas comprehensive introductions can be found in the foundational paper [1] and in the book [24].
Definition 6.1. An integer rectifiable varifold $V$ of dimension $m$ is a couple $(\Gamma, \theta)$, where $\Gamma \subset \mathbb{R}^{m+n}$ is an $m$-rectifiable set and $\theta : \Gamma \to \mathbb{N} \setminus \{0\}$ is a Borel map.
It is customary to denote $(\Gamma, \theta)$ as $\theta[\![\Gamma]\!]$ and to call $\theta$ the multiplicity of the varifold.
Definition 6.2. Let $U$ be an open set of $\mathbb{R}^{m+n}$, and let $\Phi : \mathbb{R}^{m+n} \to U$ be a diffeomorphism. The pushforward of an integer rectifiable varifold $V = \theta[\![\Gamma]\!]$ through $\Phi$ is defined as
$$\Phi_\# V := (\theta \circ \Phi^{-1})\,[\![\Phi(\Gamma)]\!].$$
For an integer rectifiable varifold $\theta[\![\Gamma]\!]$, it is customary to introduce a notion of approximate tangent plane, which exists for $\mathcal{H}^m$-a.e. point of $\Gamma$; we refer to [24, Theorem 3.1.8] for the relevant details. Provided it exists, the tangent plane at the point $y \in \Gamma$ will be denoted by $T_y\Gamma$ and it is an element of $G(m, m+n)$. In the following, we will identify the Grassmannian manifold with a suitable subset of orthogonal projections, i.e. for every $L \in G(m, m+n)$ we consider the linear map $P : \mathbb{R}^{m+n} \to \mathbb{R}^{m+n}$ which is the orthogonal projection onto $L$. With this identification we have
$$G(m, m+n) \sim \left\{ P \in \mathbb{R}^{(m+n)\times(m+n)} : P = P^T,\ P^2 = P,\ \mathrm{rank}(P) = \mathrm{tr}(P) = m \right\}.$$
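For planes that are graphs of linear maps, which is the case relevant to this section, the identification with orthogonal projections can be made explicit. The sketch below assumes the standard formula $P = M (M^T M)^{-1} M^T$, where $M$ is the $(m+n) \times m$ matrix with blocks $\mathrm{id}$ and $X$, so that $M^T M = \mathrm{id} + X^T X$; the paper's own factorization may be normalized differently. It verifies $P = P^T$, $P^2 = P$ and $\mathrm{tr}(P) = m$ in exact rational arithmetic; all function names are illustrative.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(k):
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]

def inverse(A):
    """Gauss-Jordan inverse over the rationals (A is assumed invertible)."""
    k = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(k)]
           for i, row in enumerate(A)]
    for c in range(k):
        p = next(r for r in range(c, k) if aug[r][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(k):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]

def graph_projection(X):
    """Orthogonal projection onto L = {(x, Xx)} for an n x m matrix X."""
    m = len(X[0])
    M = identity(m) + [list(row) for row in X]   # columns of M span L
    Mt = transpose(M)
    S = inverse(matmul(Mt, M))                   # (id + X^T X)^{-1}
    return matmul(matmul(M, S), Mt)

X = [[1, 2]]                 # n = 1, m = 2: the plane {(x1, x2, x1 + 2 x2)} in R^3
P = graph_projection(X)
trace_P = sum(P[i][i] for i in range(3))
point_on_plane = [[1], [0], [1]]   # (x, Xx) for x = (1, 0), as a column
print(P, trace_P)
```

Points of the plane are fixed by `P`, while vectors orthogonal to it, such as the normal $(-1, -2, 1)$ in this example, are sent to zero.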
Since we are interested in graphs, we introduce the following useful notation Definition 6.3.The set G 0 (m, m + n) is given by those m-dimensional planes L which are the graphs of a linear map X : R m → R n .Namely, if we regard X as an element of R n×m , L = {(x, Xx) : Regarding X as an element of R n×m , the orthogonal projection onto L, regarded as an element h(X) of R (m+n)×(m+n) , is then given by the formula h(X) := M (X)S(X)M (X) T  where or, more explicitely, The map h is a smooth diffeomorphism between R n×m and the open subset G 0 .We will use in general, i.e. for any matrix M ∈ R (m+n)×(m+n) the same splitting as in (47): In this section, we will use freely the following fact.Recall that, by (4), for every X ∈ R n×m the area element is given by By the Cauchy-Binet formula, [6, Proposition 2.69], where we used the notation introduced in Definition 3.5.Finally, throughout the section, we use the following notation: We can thus consider the corresponding varifold Γ u .If u ∈ W 1,m (Ω, R n ), then u has a precise representative which is however defined only up to a set of m-capacity 0 (but not everywhere).Moreover, if for maps u ∈ W 1,m ∩ C(Ω, R n ), for which the set-theoretic graph Γ u could be defined classically, it can be proven that Γ u does not necessarily have locally finite H m -measure, in spite of the fact that A(Du) belongs to L 1 loc .In particular the area formula fails.For this reason, following the notation and terminology of [16, Sec.1.5, 2.1], we introduce the "rectifiable part of the graph of u", which will be denoted by G u (the notation in [16] is in fact G u,Ω : we will omit the domain Ω since in our case it is always clear from the context).
First we denote the set of Lebesgue points of u by L u and we introduce the set A D (u) := {x ∈ Ω : u is approximately differentiable at x}. For the definition of approximate differentiability, see [16, Sec. 1.4, Def. 3]. We also set From now on, we always assume that u so that u(x) is the Lebesgue value at every point x ∈ R u . The rectifiable part of the graph of u is then loc , this allows us to introduce the integer rectifiable varifold 1 G u . When u ∈ W 1,p for p > m, the Lusin property (namely the fact that v(x) := (x, u(x)) maps sets of Lebesgue measure zero to sets of H m -measure zero, cf. again [16]) and Morrey's embedding imply By [16, Sec. 1.5, Th. 5], the approximate tangent plane T y G u coincides for H m -a.e. z 0 = (x 0 , u(x 0 )) ∈ G u or, with
1 In fact the G u can be oriented to give an integer rectifiable current of multiplicity 1 and without boundary in [16, Pr. 1, Sec. 2.1]. The varifold that we consider is then the one induced by the current in the usual sense.
The following proposition then allows one to pass from functionals defined on varifolds to classical functionals in the vectorial calculus of variations (and vice versa).
Proposition 6.4. Let u ∈ W 1,m (Ω, R n ), and define v(x) := (x, u(x)). Denote by C b (Ω × R n × G 0 ) the space of continuous and bounded functions on Ω × R n × G 0 . Then, for every ϕ ∈ C b (Ω × R n × G 0 ), the following holds Consider therefore a functional Define moreover F, G : G 0 → R as For any map u ∈ W 1,m (Ω, R n ), we can apply (49) to write: where we have defined the map Ψ on the open subset G 0 of the Grassmannian G(m, m + n) as We are thus ready to introduce the following functional.
Definition 6.5. Let V = θ Γ be an m-dimensional integer rectifiable varifold in R m+n with the property that the approximate tangent plane T x Γ belongs to G 0 for H m -a.e. x ∈ Γ. Then The above discussion then proves the following
Proposition 6.6.
6.3. First variations. We do not address here the issue of extending the functional Σ to general varifolds (namely of extending Ψ to all of G(m, m + n)). Rather, assuming that such an extension exists, we wish to show that the usual stationarity of varifolds with respect to the functional Σ is equivalent to stationarity with respect to two particular classes of deformations, which reduce to inner and outer variations in the case of graphs. We start by recalling the usual stationarity condition.
) and define X ε as the flow generated by g, namely X ε (x) = γ x (ε), if γ x is the solution of the following system We define the variation of V with respect to the vector field g ∈ C 1 c (R m+n ; R m+n ) as Given an orthogonal projection P ∈ G(m, m+n), we introduce the notation P ⊥ for id m+n − P (note that, if we identify P with the linear space L which is the image of P , then P ⊥ is the projection onto the orthogonal complement of L). From [9, Lemma A.2], we know that, for where $B_\Psi(P) \in \mathbb{R}^{(m+n)\times(m+n)}$ is defined through the relation
$$\langle B_\Psi(P), L\rangle = \Psi(P)\langle P, L\rangle + \langle d\Psi(P),\, P^\perp L P + P L^T P^\perp\rangle \quad \text{for every } L \in \mathbb{R}^{(m+n)\times(m+n)}. \qquad (51)$$
We are now ready to state our desired equivalence between stationarity of the map u for the energy E and stationarity of the varifold G u for the corresponding functional Σ. In what follows, given a function g on G u we will use the shorthand notation $\|g\|_{q,G_u}$ for the norm $\|g\|_{L^q(\mathcal{H}^m \llcorner G_u)}$.
for some C ≥ 0, then the integer rectifiable varifold for some number holds for some C ′ , then (52) holds for some C = C(C ′ , m, p, q). Moreover, C ′ = 0 if and only if C = 0, namely u is stationary for the energy E if and only if G u is stationary for the energy Σ.
Remark 6.9. As already noticed, when p > m we can replace G u with Γ u . Moreover, under such a stronger assumption, the proposition holds also for q = ∞, provided we set A(Du) The proof of the previous proposition is a consequence of a few technical lemmas and will be given in the next section.

7. Proof of Proposition 6.8
In the next lemma we study the growth of the matrix-fields associated to the inner and the outer variations, i.e.

A(X)
Define also the matrix-field V f : R n×m → R (m+n)×(m+n) to be In Lemma 7.2, we will prove that Combining Lemmas 7.1 and 7.2 with the area formula we obtain Lemma 7.3, from which we will infer Proposition 6.8.
, where h is the map defined in (47). Then, In the statement of the Lemma and in the proof, the symbol Λ ≲ Ξ means that there exists a non-negative constant C depending only on n, m and on The lemma above is needed to get enough summability to justify the integral formulas in (the statement and the proof of) Lemma 7.3. In some sense it is thus less crucial than the next lemma, which contains instead the core computations. For these reasons, the argument of Lemma 7.1, which contains several lengthy computations, is given in the appendix.
We next prove Lemma 7.2 and Lemma 7. If ℓ = 1, we identify d P g(P ) with the associated matrix in R (m+n)×(m+n) representing the differential, and we denote d P g(P )[H] by ⟨d P g(P ), H⟩. In this proof, we will use the following facts:
• The tangent plane of G(m, m + n) at the point P is given by as proved in [9, Appendix A].
• Let h : R n×m → G 0 be the map defined in (47). Then, it is simple to verify that Moreover, for every H ∈ T P G(m, m + n), one has:
• Recall that the area functional is defined as where $M(X) = \begin{pmatrix} \mathrm{id}_m \\ X \end{pmatrix}$.
Hence, for every X, Y ∈ R n×m , we have Recall the definition of B Ψ (P ) given in (51). Since for every H ∈ T P G(m, m + n) we have When evaluated at P = h(X), the previous expression reads (62) By (51), we know that, for every Therefore, we want to compute (62) when We wish to find an expression for Using the decomposition introduced in (48) of L in 4 submatrices, we compute Combining (60) with (64), we get and Combining (65), (66) and (67), we get that To expand (62), we now need to rewrite First, we must compute the trace part coming from (61): Hence, if $H = h(X)^\perp L h(X) + h(X) L^T h(X)^\perp$, we have just proved that: To conclude, we also need to compute
$$\big(\operatorname{tr}(SL_1) + \operatorname{tr}(SX^T L_2) + \operatorname{tr}(XSL_3) + \operatorname{tr}(XSX^T L_4)\big). \qquad (69)$$
Now we sum (68) and (69) to get ⟨B Ψ (h(X)), L⟩. Using that $S^{-1}(X) = X^T X + \mathrm{id}_m$ and the invariance of the trace under cyclic permutations, we rewrite
$$\operatorname{tr}(SL_1) + \operatorname{tr}(SX^T L_2) + \operatorname{tr}(XSL_3) + \operatorname{tr}(XSX^T L_4) - \operatorname{tr}(SX^T L_2) - \operatorname{tr}(SX^T L_4 X) + \operatorname{tr}(SX^T X L_1) + \operatorname{tr}(SX^T X L_3 X) = \operatorname{tr}(L_1) + \operatorname{tr}(L_3 X).$$
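The last trace identity can be unpacked term by term. The following is a short sketch using only the relation $S^{-1} = X^T X + \mathrm{id}_m$ and the cyclicity of the trace, with S = S(X) and the blocks $L_1, \dots, L_4$ as in the text.

```latex
\begin{align*}
\operatorname{tr}(SX^T L_2) - \operatorname{tr}(SX^T L_2) &= 0, \\
\operatorname{tr}(XSX^T L_4) - \operatorname{tr}(SX^T L_4 X) &= 0
  && \text{(cyclicity of the trace)}, \\
\operatorname{tr}(SL_1) + \operatorname{tr}(SX^T X L_1)
  &= \operatorname{tr}\big(S(\mathrm{id}_m + X^T X)L_1\big) = \operatorname{tr}(L_1), \\
\operatorname{tr}(XSL_3) + \operatorname{tr}(SX^T X L_3 X)
  &= \operatorname{tr}\big(S(\mathrm{id}_m + X^T X)L_3 X\big) = \operatorname{tr}(L_3 X).
\end{align*}
```

Summing the four lines gives exactly $\operatorname{tr}(L_1) + \operatorname{tr}(L_3 X)$, so all dependence on S disappears from the final expression.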
Combining our previous computations, we find Since L was arbitrary, we conclude that
7.2. Proof of Lemma 7.3. Fix g as in the statement of the Lemma. By (50), we know that The previous equality and (70) yield the conclusion.
Notice that we require (52) to hold only for C 1 maps with compact support, but Lemma 7.1 implies through an approximation argument that Indeed, to prove, for instance, that the first inequality holds for any v ∈ L ∞ ∩ W 1,m 0 , pick a sequence by the dominated convergence theorem.Indeed, we required the pointwise convergence of v k to v and moreover we can bound for every k and almost every x ∈ Ω: Hence (71) with v k instead of v implies the same inequality for v by taking the limit as k → ∞.
The proof of the second inequality of (71) is analogous. We combine (71) with (58) to write Now we use the trivial estimate v(x, y) ≤ g(x, y) for all x ∈ Ω, y ∈ R n , and the area formula (49) to conclude With analogous estimates, we also find $\|\Phi A^{1/q}(Du)\|_q^q \le \|g\|^q_{L^q(G_u)}$. Therefore, (53) holds with constant C ′ = 2C. Now assume (53). Choose the following sequence 2k (0) and $\|D\chi_k(y)\| \le \frac{1}{k}$, for all y ∈ R n . Using again the area formula (49), we write The monotone convergence theorem implies Now we want to use (58). Using the same notation as in the statement of Lemma 7.

Moreover using the pointwise bound
Again through (72), we infer that the last term converges to 0. This implies that In a completely analogous way, and it is immediate to see that this implies (52) with constant C′ = C ′ .

Some open questions
We list here a series of questions related to the topic of the present paper. Firstly, as already explained in the introduction, the main question which motivated the investigations of this paper is the following wide open problem.
Question 1. Is it possible to prove an analog of W. Allard's celebrated regularity theorem [1] if we consider strongly elliptic integrands (in the sense of Almgren) Ψ on the Grassmannian?
The answer to this question is far from being immediate. A major obstacle is the lack of the monotonicity formula, [2]. Actually most of the proof in [1] can be carried over if one knows the validity of a Michael-Simon inequality. More precisely, consider a rectifiable varifold V = θ Γ with density bounded below (e.g. θ ≥ 1) and anisotropic variation δ for some H Ψ ∈ L 1 . The anisotropic Michael-Simon inequality would then take the conjectural form
Question 2. Is it possible to prove a Michael-Simon inequality as (73) for (at least some) anisotropic energies?
Of course, Question 1 has its counterpart on graphs, which amounts to extending the partial regularity theorem of Evans for minimizers to stationary points.
Question 3. Is it possible to extend the partial regularity theorem of [12] to Lipschitz graphs that are stationary with respect to strongly polyconvex (or quasiconvex) energies?
Answering these questions in this generality seems out of reach at the moment. It is however possible to formulate several interesting intermediate questions, many of which are related to the "differential inclusions point of view" adopted in the present paper.
First of all we could consider stronger assumptions on the integrand Ψ. In the recent paper [9], A. De Rosa, the second named author and F. Ghiraldin introduce the so-called Atomic Condition. This condition characterizes those energies for which (the appropriate extension of) Allard's rectifiability result holds. The following question is thus natural (see the forthcoming paper [11] for results in this direction):
Question 4. What is the counterpart of the Atomic Condition for functionals on graphs and what can be concluded from it in the graphical case?
Secondly, a possible approach to Question 1 is a continuation-type argument on the space of all energies.Since the area functional has a particular status, the following question is particularly relevant.
Question 5. Does an Allard type result hold for integrands which are sufficiently close to the area?
In the forthcoming paper [29], the fourth named author proves a partial result in the above direction. Using methods coming from the theory of differential inclusions, [29] shows that graphs with small Lipschitz constant that are stationary with respect to functions sufficiently close to the area are regular. These results, together with the one in the present paper, seem to point to partial regularity for stationary varifolds (or graphs), as opposed to the situation of [22,25].
We note that a key step in the proof of Evans' partial regularity theorem is the so-called Caccioppoli inequality which, roughly, reads as follows: for a minimizer u defined on B 2 for all affine functions a(x) = b + Ax. The geometric counterpart of this estimate is used by Almgren in his partial regularity theorem for currents minimizing anisotropic energies, [5]. These inequalities are obtained by direct comparison with suitable competitors. A similar estimate is obtained, by purely PDE techniques, by Allard in the case of stationary varifolds and it is one of the key steps in establishing his regularity theorem for stationary varifolds. For co-dimension one varifolds which are stationary with respect to anisotropic convex integrands, a similar inequality is known to hold true, [3]. However, in general co-dimension, no condition on the integrand is known to ensure its validity, not even in neighbourhoods of the area integrand, Ψ = 1.
Question 6. Which conditions on the integrands f or Ψ ensure the validity of a Caccioppoli type inequality for stationary points?
In [13], it is proved that for differential inclusions of the form Du ∈ K, where K is a compact set of R 2×2 that does not contain T 4 configurations, compactness properties hold. In particular, if $\sup_j \|u_j\|_{W^{1,p}(B_1(0))} < +\infty$ for some p > 1, then there exists a subsequence u j k such that u j k converges strongly in W 1,q (B 1 (0)) for every q < p. This kind of compactness property can actually be used to prove partial regularity of solutions to elliptic systems of PDEs. The strategy of [13] does not apply directly to the higher dimensional case and motivates the following question.
Question 7. Let f ∈ C 1 (R n×m ) be a strictly polyconvex function and K f ⊂ R (2n+m)×m . Suppose W j : Ω → R (2n+m)×m is a sequence of maps such that $\sup_j \|W_j\|_\infty < +\infty$ and dist(W j (x), K f ) ⇀ 0 in the weak topology of L p . Then, up to subsequences, does W j converge strongly in W 1,p ?
To formulate the next questions, let us recall the following definitions, see, for instance, [19], or [21, Section 4.4]. A function For compact sets K ⊂ R n×m , we define
$$K^{rc} = \{X \in \mathbb{R}^{n\times m} : F(X) \le 0,\ \forall F : \mathbb{R}^{n\times m} \to \mathbb{R} \text{ rank-one convex s.t. } F(Z) \le 0,\ \forall Z \in K\}$$
and analogously K qc (resp. K pc ) where one uses quasiconvex (resp. polyconvex) functions instead of rank-one convex functions. Moreover, one has the following chain of inclusions
76) is fulfilled and we can apply the aforementioned result to obtain the desired equality (49).
On the other hand, if n ≥ m, then for Z ∈ A m we have I Z = id m , hence (79) becomes In this case To conclude the proof of the Lemma, we still need to prove that for every To perform the computation, we need to divide it into cases corresponding to the four blocks of the matrix h(X) as written in (47).To this end, recall the notation and moreover notice that h(X) is symmetric, therefore we just need to prove (80) in the case i ≤ j.
Another useful fact is the following. First notice that for all matrices N ∈ O(n), M ∈ O(m) (O(k) is the group of orthogonal matrices of order k), one has $S(NXM) = M^T S(X) M$.
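The invariance of S under orthogonal changes of coordinates follows in two lines. The sketch below assumes only $S(X) = (\mathrm{id}_m + X^T X)^{-1}$ together with $N^T N = \mathrm{id}_n$ and $M^T M = M M^T = \mathrm{id}_m$.

```latex
\begin{align*}
\mathrm{id}_m + (NXM)^T (NXM)
  &= M^T M + M^T X^T N^T N X M
   = M^T\big(\mathrm{id}_m + X^T X\big) M, \\
S(NXM)
  &= \big(M^T S(X)^{-1} M\big)^{-1}
   = M^{-1} S(X)\, M^{-T}
   = M^T S(X)\, M .
\end{align*}
% In particular \det S(NXM) = \det S(X), so the area element
% A(X) = \sqrt{\det(\mathrm{id}_m + X^T X)} satisfies A(NXM) = A(X).
```

Note that N drops out entirely, which is why only the choice of M matters for diagonalizing $Y^T Y$.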
From an easy computation we then conclude that, for every 1 ≤ i, j ≤ m + n and for every 82) tell us that we can check estimates (80) just on matrices Y := N XM with two additional hypotheses. Fix X ∈ R n×m , define Z = XM and denote the j-th column of a matrix A ∈ R n×m by A j . First, by a suitable choice of M , we can make sure that Y T Y = Z T Z = M T X T XM is diagonal. Once this choice is made, if n ≥ m, then we choose N = id n . Otherwise, if n < m, then we observe that at most n of the columns of Z are non-zero; let these be Z j 1 , . . ., Z j n and let us define J := {j 1 , . . ., j n } with 1 ≤ j 1 < j 2 < · · · < j n ≤ m. If for some j k we have Z j k = 0, then we set N = id n . Otherwise, the n × n matrix V formed using Z j 1 , . . ., Z j n has columns that are pairwise orthogonal and nonzero, hence there exists O ∈ O(n) such that V = OD, with D diagonal. In this case, we choose N = O T , so that the resulting Y has the property that
$$Y_j = \begin{cases} y_\ell e_\ell & \text{if } j = j_\ell,\ j_\ell \in \{j_1, \dots, j_n\}, \\ 0 & \text{otherwise,} \end{cases}$$
where y ℓ ∈ R and e ℓ are the vectors of the canonical basis of R n . Notice that this choice of M and N also implies that
$$A(Y) = \sqrt{\prod_{i=1}^m \big(1 + \|Y_i\|^2\big)} \quad \text{and} \quad S(Y) = \operatorname{diag}\big((1 + \|Y_1\|^2)^{-1}, \dots, (1 + \|Y_m\|^2)^{-1}\big).$$
We call (HP) these assumptions on the matrix Y ∈ R n×m .
First case, 1 ≤ i ≤ j ≤ m: In this case, h ij = S ij . We have Let us explain in detail how to get the desired estimate (80) in this case. Notice that either Y l is 0, and in this case there is nothing to prove, or Y l ≠ 0. Thanks to (HP), in Y there are at most min{m, n} non-zero columns. First let m ≤ n, then: If n < m and J is the set of indices corresponding to non-zero columns, we are in the hypothesis in which l ∈ J. Therefore we have m c=1 (1 + Y c 2 )
We also have and the desired estimate is obtained with a reasoning completely analogous to the one of (85).This concludes the proof of this case.
Second case, 1 ≤ i ≤ m < m + 1 ≤ j ≤ m + n: From now on we use m + j rather than j for the corresponding index. We thus have
$$h_{i,j+m}(X) = (S(X)X^T)_{ij} = \sum_{k=1}^m S_{ik} x_{jk}.$$
We compute the derivative using (84):
Third case, m + 1 ≤ i ≤ j ≤ m + n: As above we use m + i and m + j in place of i and j. The indices i and j will then satisfy 1 ≤ i ≤ j ≤ n and we have
$$h_{i+m,j+m}(X) = (XS(X)X^T)_{ij} = \sum_{1 \le l,k \le m} x_{il} S_{lk} x_{jk}.$$
We compute the derivative using (84): x ac S lc S kb x il x jk .

Figure 1. The geometric arrangement of a T 4 configuration.

denotes the projection on the first factor, i.e. π(z) = π((x, y)) = x.
6.2. Graphs and varifolds. If u ∈ W 1,p (Ω, R n ), Ω ⊂ R m and p > m, Morrey's embedding theorem shows the existence of a precise representative of u which is Hölder continuous. In what follows we will always assume that the map u is given pointwise by such a (Hölder) continuous precise representative. Correspondingly we introduce the notation Γ u for the (set-theoretic) graph {(x, u(x)) : x ∈ Ω}, which is a relatively closed subset of Ω × R n . The classical area formula (see for instance [16, Cor. 2, Ch. 3]) implies that Γ u is m-rectifiable and its H m measure is given by $\int_\Omega A(Du)$.
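Two elementary instances of the area formula may help fix ideas. They are illustrations only, under the assumption that the area element is $A(X) = \sqrt{\det(\mathrm{id}_m + X^T X)}$ and that Ω is bounded; the matrix B and the interval (α, β) are hypothetical names introduced here.

```latex
% (a) Linear map u(x) = Bx with B \in \mathbb{R}^{n \times m}: Du \equiv B, hence
\[
\mathcal{H}^m(\Gamma_u) = \int_\Omega A(B)\,dx
  = |\Omega|\,\sqrt{\det\big(\mathrm{id}_m + B^T B\big)} .
\]
% (b) Scalar curve, m = n = 1, \Omega = (\alpha, \beta), u \in W^{1,p}, p > 1:
\[
\mathcal{H}^1(\Gamma_u) = \int_\alpha^\beta \sqrt{1 + |u'(x)|^2}\,dx ,
\]
% which is the classical formula for the length of a graph.
```

In case (a) the factor $\sqrt{\det(\mathrm{id}_m + B^T B)}$ is the constant Jacobian of the map x ↦ (x, Bx), so the graph is an m-dimensional planar region whose area is that of Ω scaled by A(B).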

( 1 + Y i 2 )
$$(X^T DS_{ij})_{cd} = \sum_{1 \le a \le n} x_{ac}\,\partial_{ad} S_{ij} = -\sum_{1 \le l \le m,\ 1 \le a \le n} \big(S_{id} S_{jl} x_{al} x_{ac} + x_{ac} x_{al} S_{il} S_{jd}\big).$$
Now we use our previous observation (81)-(82) to consider Y satisfying (HP), so that in particular Y T Y is diagonal. In this case, we have
$$|\partial_{ab} S_{ij}(Y)| \le \sum_{1 \le l \le m} \big(|S_{ib} S_{jl} y_{al}| + |y_{al} S_{il} S_{jb}|\big).$$
For every 1 ≤ i, b, j, l ≤ m, 1 ≤ a ≤ n,
$$A(Y)\,|S_{ib} S_{jl} y_{al}| \le \frac{\sqrt{\prod_{c=1}^m (1 + \|Y_c\|^2)}\;|y_{al}|}{(1 + \|Y_b\|^2)(1 + \|Y_l\|^2)}.$$
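The expression for the partial derivatives of S used in these estimates can be recovered from the standard rule for differentiating a matrix inverse. The following derivation is a sketch, assuming $S(X) = (\mathrm{id}_m + X^T X)^{-1}$ and writing $\partial_{ab}$ for the derivative with respect to the entry $x_{ab}$ (1 ≤ a ≤ n, 1 ≤ b ≤ m).

```latex
\begin{align*}
\partial_{ab} S
  &= -S\,\partial_{ab}\big(\mathrm{id}_m + X^T X\big)\,S , \\
\big(\partial_{ab}(X^T X)\big)_{kl}
  &= \partial_{ab} \sum_{p=1}^n x_{pk} x_{pl}
   = \delta_{kb}\, x_{al} + x_{ak}\, \delta_{lb} , \\
\partial_{ab} S_{ij}
  &= -\sum_{1 \le k,l \le m} S_{ik}\big(\delta_{kb}\, x_{al} + x_{ak}\, \delta_{lb}\big) S_{lj} \\
  &= -\sum_{1 \le l \le m} \big(S_{ib}\, S_{jl}\, x_{al} + x_{al}\, S_{il}\, S_{jb}\big),
\end{align*}
% where the last step uses the symmetry of S and relabels k as l
% in the second sum.
```

Taking absolute values term by term gives the bound on $|\partial_{ab} S_{ij}(Y)|$ stated above.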

$$\partial_{ab} h_{i,j+m}(X) = \sum_{k=1}^m \delta^{ab}_{jk} S_{ik} + \sum_{k=1}^m x_{jk}\,\partial_{ab} S_{ik} = \sum_{k=1}^m \delta^{ab}_{jk} S_{ik} - \sum_{1 \le l,k \le m} \big(S_{ib} S_{kl} x_{al} x_{jk} + x_{al} x_{jk} S_{il} S_{kb}\big),$$
and also
$$(X^T Dh_{i,j+m}(X))_{ab} = \sum_{1 \le c \le n} x_{ca}\,\partial_{cb} h_{i,j+m} = \sum_{1 \le k \le m,\ 1 \le c \le n} x_{ca}\,\delta^{cb}_{jk} S_{ik} - \sum_{1 \le l,k \le m,\ 1 \le c \le n} \big(x_{ca} S_{ib} S_{kl} x_{cl} x_{jk} + x_{ca} x_{cl} x_{jk} S_{il} S_{kb}\big) = x_{ja} S_{ib} - \sum_{1 \le l,k \le m,\ 1 \le c \le n} \big(x_{ca} S_{ib} S_{kl} x_{cl} x_{jk} + x_{ca} x_{cl} x_{jk} S_{il} S_{kb}\big).$$
Since $S^{-1}(X) = \mathrm{id}_m + X^T X$,
$$\delta_{ij} = \sum_{1 \le k \le m} S_{ik}\Big(\delta_{kj} + \sum_{1 \le c \le n} x_{ck} x_{cj}\Big) = S_{ij} + \sum_{1 \le k \le m,\ 1 \le c \le n} S_{ik} x_{ck} x_{cj}, \qquad (86)$$
In the first case, if j = a, i = b, we have ∂ ab h ij+m (Y ) = 1 1 + Y ja 2 − y ai y jb (1 + Y i 2 )(1 + Y b 2 ) , and it is now easy to see that $A(Y)\,|\partial_{ab} h_{i,j+m}(Y)| \lesssim 1 + \|Y\|^{n-1}$. Since if j ≠ a or b ≠ i, $\partial_{ab} h_{i,j+m}(Y) = -y_{ai}\, y_{jb}\, S_{ii} S_{bb}$, the same estimate follows. To finish the second case, we still need to show that $A(Y)\,|(Y^T Dh_{i,j+m}(Y))_{ab}| \lesssim 1 + \|Y\|^{\min\{m,n\}-1}$. To do so, we recall (87) to estimate
$$|(Y^T Dh_{i,j+m}(Y))_{ab}| \le \sum_{k=1}^m S_{ib}\,|y_{jk}|\,S_{ka} + \sum_{k=1}^m |y_{jk}|\,S_{kb}\,\delta_{ai} + \sum_{k=1}^m |y_{jk}|\,S_{kb}\,S_{ai}.$$
With computations similar to the ones used to prove (85), we estimate for 1 ≤ i, b, a, k ≤ m, 1 ≤ j ≤ n, A(Y )S ib |y jk |S ka ≤ 0 if Y k = 0 or k = a, l =k (1 + Y l 2 ) otherwise, that implies $A(Y)\,S_{ib}\,|y_{jk}|\,S_{ka} \lesssim 1 + \|Y\|^{\min\{m,n\}-1}$. Finally, since also for every 1 ≤ j ≤ n, 1 ≤ k, b ≤ m A(Y )|y jk |S kb ≤ 0 if Y k = 0 or k = b, l =k (1 + Y l 2) otherwise, we find $A(Y)\,|y_{jk}|\,S_{kb} \lesssim 1 + \|Y\|^{\min\{m,n\}-1}$, ∀ 1 ≤ k, b ≤ m, 1 ≤ j ≤ n. This completes the proof of the second case.
S lk x jk + 1≤l,k≤m δ ab jk S lk x il + 1≤l,k≤m x il ∂ ab S lk x jk = 1≤k≤m δ ia S bk x jk + 1≤l≤m δ ja S lb x il − 1≤l,k,c≤m S lb S kc x ac x il x jk − 1≤l,k,c≤m
0 for every i.
Proof. (i) and (ii) are an obvious consequence of Definition 2.6 and of Definition 2.4. After extending n i to an orthonormal basis {n i , v j 2 , . . . , v j m } of R m we can explicitly compute