Schur polynomials and matrix positivity preservers

A classical result by Schoenberg (1942) identifies all real-valued functions that preserve positive semidefiniteness (psd) when applied entrywise to matrices of arbitrary dimension. Schoenberg's work has continued to attract significant interest, including renewed recent attention due to applications in high-dimensional statistics. However, despite a great deal of effort in the area, an effective characterization of entrywise functions preserving positivity in a fixed dimension remains elusive to date. As a first step, we characterize new classes of polynomials preserving positivity in fixed dimension. The proof of our main result is representation theoretic, and employs Schur polynomials. An alternate, variational approach also leads to several interesting consequences including (a) a hitherto unexplored Schubert cell-type stratification of the cone of psd matrices, (b) new connections between generalized Rayleigh quotients of Hadamard powers and Schur polynomials, and (c) a description of the joint kernels of Hadamard powers.


Introduction and main result
Endomorphisms of matrix spaces with positivity constraints have long been studied in connection with a variety of topics: the geometry of classical domains in complex space, matrix monotone functions [18], positive definite functions [3,4,7,23,26], hyperbolic or positive definite polynomials and global optimization algorithms [6,14]. In this paper, we study the entrywise calculus on the cone of positive semidefinite matrices, with the aim of characterizing positivity preservers in that setting.
Given $\rho \in (0, \infty)$, let $D(0, \rho)$ and $\overline{D}(0, \rho)$ denote the open and closed complex discs of radius $\rho$ centered at the origin, respectively. Given integers $1 \le k \le N$ and a set $I \subset \mathbb{C}$, let $P_N^k(I)$ denote the set of positive semidefinite $N \times N$ matrices with entries in $I$ and rank at most $k$. Let $P_N(I) := P_N^N(I)$. A function $f : I \to \mathbb{C}$ induces an entrywise map of matrix spaces, sending $A = (a_{jk}) \in P_N(I)$ to $f[A] := (f(a_{jk}))$. Starting from positive definite functions [7,23,26], it is natural to classify all entrywise functions $f[-]$ preserving positive semidefiniteness (positivity). It is an easy consequence of the Schur product theorem [24] that if $f : (-\rho, \rho) \to \mathbb{R}$ is analytic with non-negative Taylor coefficients, then $f[A] \in P_N(\mathbb{R})$ for all $A \in P_N((-\rho, \rho))$ and all $N \ge 1$. A celebrated result of Schoenberg shows the converse. Schoenberg's theorem and its ramifications have been persistently examined and revisited; see for instance Rudin [22], Berg-Christensen-Ressel-Porcu [3,4,8], and Hiai [16], to cite only a few. The present investigation evolves out of Schoenberg's result by imposing the challenging condition of dealing with matrices of fixed dimension. This is a much harder question, which remains open despite tremendous activity in the field.
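The entrywise map and the Schur product theorem argument can be illustrated on a small example. The following is only a sketch in exact rational arithmetic: the helper names (`det`, `is_psd`, `entrywise`, `gram`) and the test matrix are our own choices, and positivity is tested through principal minors, which is practical only for small matrices.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return Fraction(1)
    return sum(((-1) ** c) * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

def is_psd(m):
    """A real symmetric matrix is psd iff every principal minor is non-negative."""
    n = len(m)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            if det([[m[i][j] for j in idx] for i in idx]) < 0:
                return False
    return True

def entrywise(f, m):
    """The entrywise map A -> f[A] := (f(a_jk))."""
    return [[f(a) for a in row] for row in m]

def gram(vectors):
    """Sum of the rank-one matrices v v^T, which is psd by construction."""
    n = len(vectors[0])
    return [[sum(v[i] * v[j] for v in vectors) for j in range(n)] for i in range(n)]

# A hypothetical 3 x 3 psd matrix of rank two with rational entries.
A = gram([[Fraction(1), Fraction(1, 2), Fraction(-1, 3)],
          [Fraction(0), Fraction(1, 4), Fraction(1, 2)]])

# Non-negative coefficients: f[A] is a non-negative combination of Hadamard
# powers of A, each psd by the Schur product theorem.
f_good = lambda x: 1 + x + x * x / 2
# A negative coefficient can destroy positivity (here already on the diagonal).
f_bad = lambda x: x * x - x

print(is_psd(A), is_psd(entrywise(f_good, A)), is_psd(entrywise(f_bad, A)))
```

The second function fails only because of its negative linear coefficient; how negative a coefficient may be in a fixed dimension is exactly the question studied below.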
It is worth recalling that Schoenberg was motivated by the problem of isometrically embedding positive definite metrics into Hilbert space; see e.g. [26]. In [23], he sought to classify positive definite functions on spheres $S^{d-1} \subset \mathbb{R}^d$. This can be reformulated via Gram matrices, as classifying the entrywise functions preserving positivity on correlation matrices of all dimensions, with rank at most $d$. A strong need to study the fixed dimension case also arises out of current demands from the fast-expanding field of data science. In modern settings, functions $f$ are often applied entrywise to high-dimensional correlation matrices $A$, in order to improve their properties (better conditioning, Markov random field structure, etc.); see e.g. [5,15,21]. The "regularized" matrices $f[A]$ are ingredients in further statistical procedures, for which it is critical that they be positive semidefinite. Also, in applications the dimension of the problem is known, and so preserving positivity in all dimensions unnecessarily limits the class of functions that can be used. There is thus strong motivation from applications to study the fixed dimension case.
While characterization results have recently been obtained in fixed dimension under additional rank and sparsity constraints arising in practice [11,12,13], the original problem in fixed dimension has remained open for more than 70 years. A necessary condition for continuous functions was developed by Horn (and attributed to Loewner) in his doctoral thesis [17]. The result was recently extended in [11] to low-rank matrices, and without the continuity assumption (Theorem 1.2). Note that all real power functions $x^\alpha$ preserve positivity on $P_N^1((0, \rho))$, yet such functions need not have even a single positive derivative on $(0, \rho)$. However, Theorem 1.2 shows that working with a small one-parameter extension of $P_N^1((0, \rho))$ guarantees that $f^{(k)}$ is non-negative on $(0, \rho)$ for $0 \le k \le N - 3$. Theorem 1.2 is sharp, since the entrywise power $x^\alpha$, for $\alpha \in (N - 2, N - 1)$, preserves positivity on $P_N((0, \rho))$, but not on $P_{N+1}((0, \rho))$. See [9,10,16] for more on entrywise powers preserving positivity. Consequently, in this paper we study analytic functions which preserve $P_N$ for fixed $N$, when applied entrywise. Note that any analytic function mapping $(0, \rho)$ to $\mathbb{R}$ necessarily has real Taylor coefficients. Now a variant of Theorem 1.2 for analytic functions, obtained using generalized Vandermonde matrices, shows that the same conclusions hold if one works merely with rank-one matrices (Lemma 1.3): if $f$ is analytic and $f[-] : P_N^1((0, \rho)) \to P_N(\mathbb{R})$ for some integer $N \ge 1$, then the first $N$ non-zero Taylor coefficients $c_j$ are strictly positive.
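The sharpness statement for entrywise powers can be checked by hand in the smallest case $N = 2$, $\alpha = 1/2$. The sketch below uses floating-point arithmetic and a test matrix of our own choosing, $A = \mathbf{1}\mathbf{1}^T + \epsilon\, x x^T$ with $x = (1, 2, 3)$ and $\epsilon = 0.1$. It is an illustration under these assumptions, not a proof.

```python
import math

def quad_form(m, v):
    """The quadratic form v^T m v."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))

x = [1.0, 2.0, 3.0]
eps = 0.1
# A = 1 1^T + eps * x x^T is psd of rank two, with all entries in (0, 2).
A = [[1.0 + eps * xi * xj for xj in x] for xi in x]

alpha = 0.5  # lies in (N - 2, N - 1) for N = 2
B = [[math.pow(a, alpha) for a in row] for row in A]  # the entrywise power of A

# Every 2 x 2 principal submatrix of B is still psd: its determinant is positive ...
minors_2x2 = [B[i][i] * B[j][j] - B[i][j] * B[j][i]
              for i in range(3) for j in range(i + 1, 3)]

# ... but B itself is not psd: v is orthogonal to both 1 and x, and v^T B v < 0.
v = [1.0, -2.0, 1.0]
q = quad_form(B, v)  # roughly -1e-3

print(minors_2x2, q)
```

The vector `v` is chosen in the kernel of `A`, so the negative value of the quadratic form comes entirely from the negative second-order Taylor coefficient of $x^{1/2}$ about $1$, together with the higher-order terms.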
Given $f(z) = \sum_{k \ge 0} c_k z^k$ such that $c_0, \ldots, c_{N-1} > 0$, a natural and challenging question is whether the next non-zero coefficient $c_M$ can be negative; and if so, to provide a negative threshold for the coefficient $c_M$, where $M \ge N$. Resolving these questions, open since Horn's 1969 paper, provides a quantitative version of Schoenberg's theorem. Our main result answers these questions in the affirmative, and illustrates the complexity of the negative threshold bound. It is also surprising that preserving positivity on $P_N(D(0, \rho))$ is equivalent to preserving positivity on the much smaller set of real rank-one matrices, $P_N^1((0, \rho))$. Notice that the condition $c_0, \ldots, c_{N-1} \ge 0$ follows from Lemma 1.3. Theorem 1.4 now provides the first construction of a polynomial that preserves positivity on $P_N$, but not on $P_{N+1}$. Indeed, this is the case when $-C(c; z^M; N, \rho)^{-1} \le c_N < 0$, by Theorem 1.2.
Remark 1.6. Theorem 1.4 can naturally be used to provide a sufficient condition for an arbitrary analytic function to preserve positivity on $P_N(D(0, \rho))$. The reader is referred to [1] for more details.

Proof of the main result
We now sketch the proof of Theorem 1.4. Recall that the Schur product theorem provides the first examples of entrywise functions preserving positivity, namely, the functions of the form $\sum_{k=0}^{\infty} c_k z^k$ with $c_k \ge 0$. That these are the only functions preserving positivity in all dimensions is Schoenberg's theorem (Theorem 1.1). In some sense, our proof of the fixed dimension case in Theorem 1.4 returns to Schur by crucially using symmetric functions among other techniques, specifically, Schur polynomials and Schur complements. Indeed, the technical heart of the proof is an explicit Jacobi-Trudi type identity, which is valid in any field and may be interesting in its own right.
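Given a partition, i.e., a non-increasing $N$-tuple of non-negative integers $n = (n_N \ge \cdots \ge n_1)$, define the corresponding Schur polynomial $s_n(x_1, \ldots, x_N)$ over a field $\mathbb{F}$ with at least $N$ elements, to be the unique polynomial extension to $\mathbb{F}^N$ of
$$s_n(x_1, \ldots, x_N) := \frac{\det (x_i^{n_j + j - 1})_{i,j=1}^N}{\det (x_i^{j-1})_{i,j=1}^N}$$
for pairwise distinct $x_i \in \mathbb{F}$. The denominator is the Vandermonde determinant $\Delta_N(x_1, \ldots, x_N) := \det (x_i^{j-1}) = \prod_{1 \le i < j \le N} (x_j - x_i)$, and one has
$$s_n(1, \ldots, 1) = \prod_{1 \le i < j \le N} \frac{n_j - n_i + j - i}{j - i}.$$
The last equation can also be deduced from the Weyl Character Formula in type A.
The technical heart of the proof involves the following explicit determinantal identity.
Sketch of proof. We first show the following fact: Let $A := uv^T$ for $u, v \in \mathbb{F}^N$. Given a strict partition $n = (n_m > n_{m-1} > \cdots > n_1)$ and scalars $(c_{n_1}, \ldots, c_{n_m}) \in \mathbb{F}^m$, the following determinantal identity holds:
$$\det \sum_{j=1}^m c_{n_j} A^{\circ n_j} = \Delta_N(u) \Delta_N(v) \sum_{n' \subset n,\ |n'| = N} s_{\lambda(n')}(u)\, s_{\lambda(n')}(v) \prod_{k=1}^N c_{n'_k}, \qquad (2.5)$$
where $\lambda(n') := (n'_N - N + 1 \ge \cdots \ge n'_2 - 1 \ge n'_1)$, and the sum is over all subsets $n'$ of $n$ of cardinality $N$.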
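As a concrete illustration of Schur polynomials, the sketch below evaluates the classical ratio-of-determinants (bialternant) formula at a point with pairwise distinct coordinates, and the product formula for $s_n(1, \ldots, 1)$ that follows from the Weyl dimension formula in type A. We assume these standard formulas match the conventions of the text, with the partition listed in increasing order $(n_1 \le \cdots \le n_N)$; the function names are our own.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return Fraction(1)
    return sum(((-1) ** c) * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

def schur(n, x):
    """Bialternant formula det(x_i^(n_j + j - 1)) / det(x_i^(j - 1)).

    n = (n_1 <= ... <= n_N) is stored 0-indexed, so the exponent is n[j] + j.
    The entries of x must be pairwise distinct (non-zero Vandermonde determinant).
    """
    N = len(x)
    numerator = det([[xi ** (n[j] + j) for j in range(N)] for xi in x])
    vandermonde = det([[xi ** j for j in range(N)] for xi in x])
    return numerator / vandermonde

def schur_at_ones(n):
    """Product formula for s_n(1, ..., 1) (Weyl dimension formula, type A)."""
    N = len(n)
    value = Fraction(1)
    for i in range(N):
        for j in range(i + 1, N):
            value *= Fraction(n[j] - n[i] + j - i, j - i)
    return value

x = [Fraction(2), Fraction(3), Fraction(5)]
print(schur([0, 0, 1], x))  # x1 + x2 + x3
print(schur([0, 1, 1], x))  # elementary symmetric polynomial e_2
print(schur([0, 0, 2], x))  # complete homogeneous polynomial h_2
print(schur_at_ones([0, 1, 2]))
```

At a point with repeated coordinates the ratio is of the form 0/0, which is why the Schur polynomial is defined as the polynomial extension of this ratio.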
The proof of (2.5) uses the matrix $X(u, n) := (u_j^{n_k})_{1 \le j \le N,\ 1 \le k \le m}$ and the Cauchy-Binet formula. Using ( since the determinant in each of the remaining terms contains at least two rows of the rank-one matrix can be computed using (2.5) again, to yield: , with pairwise distinct $\epsilon_k \in (0, 1)$, and $t' \in (0, 1)$. Thus, $\Delta_N(u) \neq 0$. Taking the limit as $t' \to 0^+$, since the final term in (2.7) must be non-negative, it follows by Theorem 2.2 that It remains to show that . One now shows that Next, we claim that for all $1 \le m \le N$ and $A = uu^* \in P_N^1(D(0, \rho))$, every principal $m \times m$ submatrix of the matrix is positive semidefinite. Notice that the rank-one case of (1) follows by setting $m = N$.
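The Cauchy-Binet step behind (2.5) can be verified in exact arithmetic. For $A = uv^T$ one has $\sum_k c_{n_k} A^{\circ n_k} = X(u, n)\, \mathrm{diag}(c)\, X(v, n)^T$, so the determinant expands as a sum over $N$-element subsets $n'$ of the exponents of $\det X(u, n') \det X(v, n') \prod_k c_{n'_k}$. Each $\det X(u, n')$ is a generalized Vandermonde determinant, which is where Schur polynomials enter. The sketch below checks this expansion only. The vectors, exponents and coefficients are arbitrary choices of ours.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return Fraction(1)
    return sum(((-1) ** c) * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

def hadamard_power_sum(u, v, exps, coeffs):
    """sum_k c_k (u v^T)^{o n_k}; the (i, j) entry is sum_k c_k (u_i v_j)^{n_k}."""
    N = len(u)
    return [[sum(c * (u[i] * v[j]) ** n for n, c in zip(exps, coeffs))
             for j in range(N)] for i in range(N)]

def X(u, exps):
    """Generalized Vandermonde matrix (u_j^{n_k}) with rows indexed by j."""
    return [[uj ** n for n in exps] for uj in u]

def cauchy_binet_side(u, v, exps, coeffs):
    """Sum over N-subsets n' of det X(u, n') * det X(v, n') * prod of the c's."""
    N = len(u)
    total = Fraction(0)
    for subset in combinations(range(len(exps)), N):
        sub = [exps[k] for k in subset]
        total += det(X(u, sub)) * det(X(v, sub)) * prod(coeffs[k] for k in subset)
    return total

u = [Fraction(1, 2), Fraction(2), Fraction(-1)]
v = [Fraction(3), Fraction(1, 3), Fraction(5)]
exps = [0, 1, 2, 4]                      # a strict partition n_1 < ... < n_m
coeffs = [Fraction(2), Fraction(1), Fraction(3), Fraction(-1, 7)]

lhs = det(hadamard_power_sum(u, v, exps, coeffs))
rhs = cauchy_binet_side(u, v, exps, coeffs)
print(lhs == rhs)
```

When the number of exponents $m$ is smaller than $N$ the sum is empty and the determinant vanishes, consistent with the matrix then having rank less than $N$.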

Consequences of the main theorem
Theorem 1.4 leads to a host of consequences that initiate the development of an entrywise matrix calculus, in parallel to the well-studied functional calculus. We now discuss two of these consequences in detail: linear matrix inequalities and connections to Rayleigh quotients.

Linear matrix inequalities for Hadamard powers
where $\le$ stands for the Loewner ordering. Moreover, the constant $C(c; z^M; N, \rho)$ is sharp in (3.2).
Notice here that the right-hand side of (3.2) cannot involve fewer Hadamard powers, by Lemma 1.  As with the main theorem, the proof of (1) and (2) crucially uses symmetric functions, specifically, connections between Schur polynomials and Young tableaux.

Sketch of proof.
It is also easy to show (2) for $N = 1$. Thus, assume $N > 1$. We first show that (2) follows from (1). Indeed, if (1) holds, then using u ∈ [0, u j is such a monomial for all j, we get $u_1 = \cdots = u_N$, a contradiction. Hence $\det f[A] > 0$ as claimed. ✷
Theorem 1.4 also fits naturally into the framework of spectrahedra and the matrix cube problem [6,20]; see [1] for more details.

Rayleigh quotients
Given a domain $K \subset \mathbb{C}$, functions $g, h : K \to \mathbb{C}$, and a set of matrices $\mathcal{P} \subset \bigcup_{N \ge 1} P_N(K)$, define $C(h; g; \mathcal{P})$ to be the smallest real number such that $g[A] \le C(h; g; \mathcal{P}) \cdot h[A]$ for all $A \in \mathcal{P}$. That is, $C(h; g; \mathcal{P})$ is the extreme critical value of the family of linear pencils $\{-g[A] + \mathbb{R}\, h[A] : A \in \mathcal{P}\}$. This notation helps achieve a uniform and consistent formulation of the aforementioned theorems. We now discuss an alternate, variational approach to proving Theorem 1.4, which proceeds as follows:
(I) Bound $A^{\circ M}$ by lower Hadamard powers for a single matrix $A$, i.e., by $\alpha_A \cdot h_c[A]$ for the smallest constant $\alpha_A > 0$.
(II) Now take the supremum of $\alpha_A$ over all matrices $A \in P_N(D(0, \rho))$.
Notice that the first step (I) simply involves computing the extreme critical value $\alpha_A = C(h_c; z^M; A)$, using the above notation. This and an improved understanding of $\ker h_c[A]$ can be achieved as follows: Then $K(A) = \bigcap_{n \ge 0} \ker A^{\circ n}$, and the extreme critical value is finite for all $A$: Moreover, the bound $C(c; z^M; N, \rho)$ is sharp, and is obtained as the supremum of the Rayleigh constant $C(h_c; z^M; A)$ as $A$ runs over the smaller set $P_N^1((\rho - \epsilon, \rho))$ for any $\epsilon \in (0, \rho)$.
Sketch of proof. The first step is to show how the Schur polynomials $s_{\mu(M,N,j)}$ in Theorem 2.2 serve an additional purpose: they are precisely the "universal coefficients" involved in expressing $A^{\circ M}$ as a combination of lower Hadamard powers, for any matrix $A$ and over any field $\mathbb{F}$. More precisely, if $A$ is an $N \times N$ matrix with entries in $\mathbb{F}$, and $a_1, \ldots, a_N$ are its rows, then we first claim that Having proved the claim, the second step is to show that $K(A) = \ker h_c[A] \subset \ker A^{\circ M}$ for all $M \ge 0$. This is obvious if $0 \le M < N$, while for $M \ge N$, we use (3.7) to compute: It follows that $K(A) \subset \bigcap_{n \ge 0} \ker A^{\circ n}$. The reverse inclusion is easy to show, as is (the equality in) the next assertion. The subsequent inequality and the last sentence in the result follow from Theorem 1.4. ✷
It is also of interest to find a closed-form expression for the generalized Rayleigh quotient $C(h_c; z^M; A)$ for a given matrix $A$. The following result provides two such expressions, consequently revealing new and unexpected connections between Rayleigh quotients and Schur polynomials.
where $C^{\dagger/2}$ and $\varrho(C)$ denote the principal square root of the Moore-Penrose inverse of $C$, and the spectral radius of $C$, respectively. For instance, if $A = uu^*$ with $u$ having distinct coordinates, then Sketch of proof. The proof of (3.9) uses the theory of Kronecker normal forms and Rayleigh quotients, and is omitted for brevity. To show the first equality in (3.10), Finally, we show that the last equality in (3.10) holds more generally, for any rank-one matrix $A = uv^T$, where $u, v$ are vectors with distinct coordinates in any field $\mathbb{F}$. Indeed, notice by the proof of (2.5) that Equation ( ($D(0, \rho)$), or even on $P_N^1([0, \rho])$. Specifically, it is not continuous at the matrix $A = \rho \mathbf{1}_{N \times N}$. This spectral discontinuity phenomenon warrants further exploration, and is to be the subject of future work [2].

Stratification of the cone, and the simultaneous kernels
In the final section, we take a closer look at the simultaneous kernel $K(A)$ defined in (3.6). As we now discuss, this space crucially depends on a canonical block decomposition of the matrix $A$. We begin by isolating this refined structure. Consider the following two examples: Given the positivity of $A_1, A_2$, one can show that all entries of $B$ are equal, while $u_1 = -u_2$. In fact, if entries in each diagonal block of a positive semidefinite matrix lie in a $G$-orbit for some subgroup $G \subset \mathbb{C}^\times$, this imposes constraints on the off-diagonal blocks. This is distilled into the following result.
1. Each diagonal block $A_{I_j}$ of $A$ is a submatrix with rank at most one.
2. The entries of each diagonal block $A_{I_j}$ lie in a single $G$-orbit.
3. The diagonal blocks $A_{I_j}$ of $A$ satisfying (1), (2) have maximal size.
In this case, each off-diagonal block of $A$ also has rank at most one, with all its entries in a single $G$-orbit.
, where $a, d \ge 0$, $g, \bar{g} \in G$, $b, c \in \mathbb{C}$. We claim that $c \in b \cdot G$, and that the minor $\begin{pmatrix} a & b \\ ag & c \end{pmatrix}$ is singular. This is because $0 \le \det B = -a(|c|^2 + |b|^2 |g|^2 - 2\Re(b \bar{c} g)) = -a|c - bg|^2$.
Hence either $a = 0$, in which case $b = c = 0$, by the positivity of $B$, or $c = bg$. The proof repeatedly uses computations along similar lines, to show that there exists $C \in P_k(\mathbb{C})$ with $\mathrm{rank}(C) = \mathrm{rank}(A)$, and vectors $u_j \in \mathbb{C}^{|I_j|}$ with entries in a single $G$-orbit, such that $A_{I_i \times I_j} = c_{ij} u_i u_j^*$, for all $1 \le i, j \le k$. ✷
Denote by $(\Pi_N, \prec)$ the poset of all partitions of $\{1, \ldots, N\}$ under refinement. Then one has the partition map $\pi^G : P_N(\mathbb{C}) \to \Pi_N$, sending $0$ to $\{\{1, \ldots, N\}\}$ and all other matrices $A$ to $\pi^G(A)$. Define $S^G_\pi$ to be the fiber of this map:
$$S^G_\pi := \{A \in P_N(\mathbb{C}) : \pi^G(A) = \pi\}, \quad \forall \pi \in \Pi_N. \qquad (4.2)$$
Corollary 4.3. Fix a subgroup $G \subset \mathbb{C}^\times$. The sets $S^G_\pi$ form a Schubert cell-type stratification of the cone: Moreover, every $A \in P_N(\mathbb{C})$ has rank at most $|\pi^{\mathbb{C}^\times}(A)|$.
The stratification of the cone $P_N(\mathbb{C})$ is noteworthy in that the generalized Rayleigh quotient map $\Psi_{c,M}$ (defined in Section 3.2) is discontinuous at the point $\rho \mathbf{1}_{N \times N}$ as one is jumping across strata $S^{\{1\}}_\pi$. Motivated by Proposition 3.5, a precise description of the simultaneous kernel $K(A) = \bigcap_{n \ge 0} \ker A^{\circ n}$ is in order. It turns out that the map $A \mapsto K(A)$ depends crucially (and solely) on the stratification. The proof of this result is fairly involved, and we refer the reader to [1, Section 5] for details.
We conclude with the following surprising consequence of Theorem 4.5: as $A$ runs over the uncountable set of matrices in $P_N(\mathbb{C})$, the set of simultaneous kernels $K(A) = \bigcap_{n \ge 0} \ker A^{\circ n}$ is, nevertheless, a finite set of subspaces of $\mathbb{C}^N$, indexed by $\Pi_N$. This is in stark contrast to the situation for the usual matrix powers, in which case $\bigcap_{n \ge 1} \ker A^n = \ker A$ can vary over an uncountable set of subspaces of $\mathbb{C}^N$.
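The contrast between Hadamard powers and ordinary powers can be seen on rank-one examples $A = uu^T$ with rational entries. The sketch below rests on two assumptions of ours: the simultaneous kernel is computed from the powers $A^{\circ 0}, \ldots, A^{\circ (N-1)}$ only (relying on the fact recalled in Section 3 that higher Hadamard powers are combinations of lower ones), and the pattern of equal coordinates of $u$ is taken as the relevant partition in this special case. The helper names are hypothetical.

```python
from fractions import Fraction

def nullspace(rows):
    """Basis of {x : M x = 0}, read off from the reduced row echelon form of M."""
    m = [list(r) for r in rows]
    ncols = len(m[0])
    pivots = []
    r = 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        pivot_value = m[r][c]
        m[r] = [a / pivot_value for a in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    basis = []
    for free in range(ncols):
        if free in pivots:
            continue
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -m[row_idx][free]
        basis.append(vec)
    return basis

def rank_one(u):
    """The matrix u u^T."""
    return [[a * b for b in u] for a in u]

def simultaneous_kernel(A):
    """Common kernel of the Hadamard powers A^{o n} for n = 0, ..., N - 1."""
    stacked = [[a ** n for a in row] for n in range(len(A)) for row in A]
    return nullspace(stacked)

A1 = rank_one([Fraction(1), Fraction(1), Fraction(2)])
A2 = rank_one([Fraction(1), Fraction(1), Fraction(5)])
A3 = rank_one([Fraction(1), Fraction(2), Fraction(3)])

print(simultaneous_kernel(A1), simultaneous_kernel(A2), simultaneous_kernel(A3))
print(nullspace(A1), nullspace(A2))
```

Here `A1` and `A2` share the pattern of equal coordinates $\{\{1, 2\}, \{3\}\}$ and have the same simultaneous kernel, while their ordinary kernels differ. For `A3`, whose coordinates are distinct, the simultaneous kernel is trivial.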
Other ramifications of this work, as well as complete proofs can be found elsewhere [1,2].