On trees, tanglegrams, and tangled chains

Tanglegrams are a class of graphs arising in computer science and in biological research on cospeciation and coevolution. They are formed by identifying the leaves of two rooted binary trees. The embedding of the trees in the plane is irrelevant for this application. We give an explicit formula to count the number of distinct binary rooted tanglegrams with n matched leaves, along with a simple asymptotic formula and an algorithm for choosing a tanglegram uniformly at random. The enumeration formula is then extended to count the number of tangled chains of binary trees of any length. This work gives a new formula for the number of binary trees with n leaves. Several open problems and conjectures are included along with pointers to several followup articles that have already appeared.


Introduction
Tanglegrams are graphs obtained by taking two binary rooted trees with the same number of leaves and matching each leaf from the tree on the left with a unique leaf from the tree on the right. The embedding of the trees in the plane is irrelevant for this application. This construction is used in the study of cospeciation and coevolution in biology. For example, the tree on the left may represent the phylogeny of a host, such as a gopher, while the tree on the right may represent a parasite, such as a louse [19, page 71]. One important problem is to reconstruct the historical associations between the phylogenies of host and parasite under a model of parasites switching hosts, which is an instance of the more general problem of cophylogeny estimation. See [19,20] for applications in biology. Diaconis and Holmes have previously demonstrated how one can encode a phylogenetic tree as a series of binary matchings [7], which is a distinct use of matchings from that discussed here.
In computer science, the Tanglegram Layout Problem (TL) is to find a drawing of a tanglegram in the plane with the left and right trees both given as planar embeddings with the smallest number of crossings among (straight) edges matching the leaves of the left tree and the right tree [3]. These authors point out that tanglegrams occur in the analysis of software projects and clustering problems.
In this paper, we give the exact enumeration of tanglegrams with n matched pairs of vertices, along with a simple asymptotic formula and an algorithm for choosing a tanglegram uniformly at random. We refer to the number of matched vertices in a tanglegram as its size. Furthermore, two tanglegrams are considered to be equivalent if one is obtained from the other by replacing the tree on the left or the tree on the right by isomorphic trees. For example, in Figure 1, the two non-equivalent tanglegrams of size 3 are shown. We state our main results here postponing some definitions until Section 2. The following is our main theorem.
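Theorem 1. The number of tanglegrams of size $n$ is
$$t_n = \sum_{\lambda} \frac{\prod_{i=2}^{\ell(\lambda)} \left( 2(\lambda_i + \cdots + \lambda_{\ell(\lambda)}) - 1 \right)^2}{z_\lambda},$$
where the sum is over binary partitions of $n$ and $z_\lambda$ is defined by Equation (1). Note, if $\lambda$ has one part, the corresponding empty product in the numerator is 1.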
The first 10 terms of the sequence $t_n$ starting at $n = 1$ are
$$1, 1, 2, 13, 114, 1509, 25595, 535753, 13305590, 382728552.$$
Example. The binary partitions of $n = 4$ are $(4)$, $(2, 2)$, $(2, 1, 1)$ and $(1, 1, 1, 1)$, so
$$t_4 = \frac{1}{4} + \frac{3^2}{8} + \frac{3^2}{4} + \frac{(5 \cdot 3 \cdot 1)^2}{24} = 13,$$
as shown in Figure 2. It takes a computer only a moment to compute and under a minute to compute all 3160 integer digits of $t_{1000}$ using a recurrence based on Theorem 1, see Section 5. We can use the main theorem to study the asymptotics of the sequence $t_n$.
We can also compute approximations of higher degree. For example, we have
A side result of the proof is a new formula for the number of inequivalent binary trees, called the Wedderburn-Etherington numbers [18, A001190].
Theorem 3. The number of inequivalent binary trees with $n$ leaves is
$$b_n = \sum_{\lambda} \frac{\prod_{i=2}^{\ell(\lambda)} \left( 2(\lambda_i + \cdots + \lambda_{\ell(\lambda)}) - 1 \right)}{z_\lambda},$$
where the sum is over binary partitions of $n$.
A tangled chain is an ordered sequence of k binary trees with matchings between neighboring trees in the sequence. For k = 1, these are inequivalent binary trees, and for k = 2, these are tanglegrams, so the following generalizes Theorems 1 and 3.
In terms of computational biology, tangled chains of length $k$ formalize the essential input to a variety of problems on $k$ leaf-labeled (phylogenetic) trees (e.g. [22]).
Theorem 4. The number of ordered tangled chains of length $k$ and size $n$ is
$$t(k, n) = \sum_{\lambda} \frac{\prod_{i=2}^{\ell(\lambda)} \left( 2(\lambda_i + \cdots + \lambda_{\ell(\lambda)}) - 1 \right)^k}{z_\lambda},$$
where the sum is over binary partitions of $n$.
From the enumerative point of view, it is also quite natural to ask how likely a particular tree $T$ is to appear on one side or the other of a uniformly selected tanglegram. In Section 6, we give a simple explicit conjecture for the asymptotic growth of the expected number of copies of $T$ on one side of a tanglegram as a function of $T$ and the size of the tanglegram. For example, the cherries of a binary tree are pairs of leaves connected by a common parent. We conjecture that the expected number of cherries in one of the binary trees of a tanglegram of size $n$ chosen in the uniform distribution is $n/4$.
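The three enumeration theorems differ only in the exponent on the product, so they can be evaluated by one routine. The following is a minimal Python sketch, assuming the product formula with exponent $k$ and the standard definition of $z_\lambda$; the function names `binary_partitions`, `z` and `tangled_chains` are illustrative and do not come from the paper. It sums over binary partitions directly and is not the faster recurrence of Section 5.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def binary_partitions(n, largest=None):
    """Yield the binary partitions of n as weakly decreasing tuples of powers of 2."""
    if n == 0:
        yield ()
        return
    if largest is None:
        largest = 1 << (n.bit_length() - 1)  # largest power of 2 not exceeding n
    part = largest
    while part >= 1:
        if part <= n:
            for rest in binary_partitions(n - part, part):
                yield (part,) + rest
        part //= 2


def z(lam):
    """z_lambda: product over distinct parts p with multiplicity m of p**m * m!."""
    result = 1
    for part, mult in Counter(lam).items():
        result *= part ** mult * factorial(mult)
    return result


def tangled_chains(k, n):
    """Number of tangled chains of length k and size n (k=1: trees, k=2: tanglegrams)."""
    total = Fraction(0)
    for lam in binary_partitions(n):
        prod = 1
        for i in range(1, len(lam)):  # i = 2, ..., l(lambda) in 1-based indexing
            prod *= 2 * sum(lam[i:]) - 1
        total += Fraction(prod ** k, z(lam))
    assert total.denominator == 1  # the sum of fractions is an integer
    return int(total)
```

With $k = 2$ this reproduces $t_3 = 2$ and $t_4 = 13$, and with $k = 1$ it reproduces the Wedderburn-Etherington numbers listed in Section 2. Exact rational arithmetic is used because the individual summands are not integers.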
Further discussion of the applications of tanglegrams along with several variations on the theme are described in [17]. In particular, tanglegrams can be used to compute the subtree-prune-regraft distance between two binary trees. In a recent follow up paper, Gessel has used the formula given here for binary trees to count several variations on tanglegrams using the theory of species [11].
Gessel also noted that our formula for binary trees can be interpreted as an instance of Burnside's lemma. Let $S_n$ act on leaf-labeled binary trees with $n$ leaves by permuting the labels. The number of fixed points of $w \in S_n$ under this action only depends on the cycle type of $w$. If we multiply and divide our formula by $n!$, then $n!/z_\lambda$ counts the number of permutations in $S_n$ with cycle type $\lambda$. Hence, the product corresponding to a binary partition $\lambda$ counts the number of fixed points of a permutation $w$ with type $\lambda$. If $w$ has a cycle type which is not a binary partition, then $w$ has no fixed trees under this action. Similar reasoning can be applied to $S_n$ actions on pairs of trees to relate the formula for tanglegrams to fixed points, and this extends to tangled chains. This proves the following corollary.
Corollary 5. The product
$$\prod_{i=2}^{\ell(\lambda)} \left( 2(\lambda_i + \cdots + \lambda_{\ell(\lambda)}) - 1 \right)^k$$
counts the number of fixed points of any permutation $w \in S_n$ of cycle type $\lambda$, a binary partition of $n$, acting on ordered labeled tangled chains of size $n$ and length $k$.
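For small $n$ the case $k = 1$ of the corollary can be checked by brute force. The sketch below is an illustration only: it encodes a leaf-labeled binary tree as nested `frozenset`s (an encoding chosen here, not taken from the paper), applies every permutation of the labels, and compares the number of fixed trees with the product above, expecting zero fixed trees when the cycle type is not a binary partition.

```python
from itertools import combinations, permutations


def labeled_trees(labels):
    """All leaf-labeled rooted binary trees on `labels`.
    A leaf is an int; an internal vertex is the frozenset of its two children."""
    labels = tuple(labels)
    if len(labels) == 1:
        return [labels[0]]
    first, rest = labels[0], labels[1:]
    trees = []
    # The side of the root containing `first` takes r further labels (r < len(rest)).
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = (first,) + extra
            right = tuple(x for x in rest if x not in extra)
            for a in labeled_trees(left):
                for b in labeled_trees(right):
                    trees.append(frozenset((a, b)))
    return trees


def relabel(tree, perm):
    """Apply the permutation i -> perm[i] to the leaf labels of a tree."""
    if isinstance(tree, int):
        return perm[tree]
    return frozenset(relabel(child, perm) for child in tree)


def cycle_type(perm):
    """Cycle type of a permutation of 0..n-1, as a weakly decreasing tuple."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def fixed_point_product(lam):
    """The product over i >= 2 of 2(lam_i + ... + lam_l) - 1."""
    prod = 1
    for i in range(1, len(lam)):
        prod *= 2 * sum(lam[i:]) - 1
    return prod


def corollary_holds(n):
    """Check the k = 1 fixed-point count for every permutation in S_n."""
    trees = labeled_trees(range(n))
    for perm in permutations(range(n)):
        lam = cycle_type(perm)
        fixed = sum(1 for t in trees if relabel(t, perm) == t)
        binary = all(p & (p - 1) == 0 for p in lam)
        if fixed != (fixed_point_product(lam) if binary else 0):
            return False
    return True
```

For instance, a transposition in $S_4$ has cycle type $(2, 1, 1)$ and fixes exactly the $3 \cdot 1 = 3$ labeled trees in which the two swapped leaves form a cherry.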
The extended abstract proceeds as follows. In Section 2, we define our terminology. We sketch the proof of the main theorems in Section 3. Section 4 contains an algorithm to choose a tanglegram uniformly at random for a given n and we give an asymptotic approximation for the number of tanglegrams. We conclude with several open problems and conjectures in Section 6.
The full version of the paper is [2]. Several papers continuing the study of trees, tanglegrams and tangled chains have recently appeared on the arXiv including [6,10,11,16].

Background
In this section, we recall some vocabulary and notation on partitions and trees. This terminology can also be found in standard textbooks on combinatorics such as [21]. We use these terms to give the formal definition of tanglegrams and the notation used in the main theorems.
A partition $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_k)$ is a weakly decreasing sequence of positive integers. The length $\ell(\lambda)$ of a partition is the number of entries in the sequence, and $|\lambda|$ denotes the sum of the entries of $\lambda$. We say $\lambda$ is a binary partition if all its parts are equal to a nonnegative power of 2. Binary partitions have appeared in a variety of contexts, see for instance [14,15] and [18, A000123]. When writing partitions, we sometimes omit parentheses and commas.
If $\lambda$ is a nonempty binary partition with $m_i$ occurrences of the letter $2^i$ for each $i$, we also denote $\lambda$ by $(1^{m_0}, 2^{m_1}, \ldots, (2^j)^{m_j})$ and set
$$z_\lambda = 1^{m_0} 2^{m_1} \cdots (2^j)^{m_j} \, m_0! \, m_1! \cdots m_j!. \qquad (1)$$
The numbers $z_\lambda$ are well known since the number of permutations in $S_n$ with cycle type $\lambda$ is $n!/z_\lambda$ [21].
A tree is a connected graph with no cycles; some experts call this a non-plane tree since the embedding in the plane is irrelevant. A rooted tree has one distinguished vertex assumed to be a common ancestor of all other vertices. The neighbors of the root are its children. Each vertex other than the root has a unique parent going along the path back to the root; the other neighbors are its children. In a binary tree, each vertex either has two children or no children. A vertex with no children is a leaf, and a vertex with two children is an internal vertex.
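The role of $z_\lambda$ as the size of a centralizer can be checked numerically. The short Python sketch below is an illustration with hypothetical helper names `z` and `cycle_type`: it computes $z_\lambda$ from the multiplicities of the parts and compares $n!/z_\lambda$ with a brute-force count of the permutations of each cycle type. The identity holds for every partition, not only binary ones.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def z(lam):
    """z_lambda: product over distinct parts p with multiplicity m of p**m * m!."""
    result = 1
    for part, mult in Counter(lam).items():
        result *= part ** mult * factorial(mult)
    return result


def cycle_type(perm):
    """Cycle type of a permutation of 0..n-1 given as a tuple of images."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def cycle_type_counts(n):
    """Number of permutations in S_n of each cycle type, by exhaustive enumeration."""
    return Counter(cycle_type(p) for p in permutations(range(n)))
```

For example, `z((2, 1, 1))` is $2 \cdot 2! = 4$, and there are $4!/4 = 6$ transpositions in $S_4$.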
Two binary rooted trees with distinct labeled leaves are said to be equivalent if there is an isomorphism from one to the other as graphs mapping the root of one to the root of the other. Let $B_n$ be the set of inequivalent binary rooted trees with $n \geq 1$ leaves, and let $b_n$ be the number of elements in the set $B_n$. The sequence of $b_n$'s for $n \geq 1$ begins 1, 1, 1, 2, 3, 6, 11, 23, 46, 98.
We can inductively define a linear order on rooted trees as follows. We say that $T > S$ if either:
• $T$ has more leaves than $S$, or
• $T$ and $S$ have the same number of leaves, $T$ has subtrees $T_1$ and $T_2$ with $T_1 \geq T_2$, $S$ has subtrees $S_1$ and $S_2$ with $S_1 \geq S_2$, and either $T_1 > S_1$, or $T_1 = S_1$ and $T_2 > S_2$.
We assume that every tree $T$ in $B_n$, $n \geq 2$, is presented so that $T_1 \geq T_2$, where $T_1$ is the left subtree (or upper subtree if the tree is drawn with the root on the left or on the right) and $T_2$ is the right (or lower) subtree.
Each tree $T \in B_n$ represents a distinct $S_n$ orbit on leaf-labeled binary trees with $n$ leaves. We can define its automorphism group $A(T)$ as follows. Fix a labeling on the leaves of $T$ using the numbers $1, 2, \ldots, n$. Label each internal vertex by the union of the labels for each of its children. The edges in $T$ are pairs of subsets from $[n] := \{1, \ldots, n\}$, each representing the label of a child and its parent. Let $v$ be a permutation in the symmetric group $S_n$. Then, $v \in A(T)$ if permuting the leaf labels by the function $i \mapsto v(i)$ for each $i$ leaves the set of edges fixed.
A theorem from [13] tells us that if $T$ is a tree with subtrees $T_1$ and $T_2$, then $A(T)$ is isomorphic to $A(T_1) \times A(T_2)$ if $T_1 \neq T_2$, and to the wreath product $A(T_1) \wr \mathbb{Z}_2$ if $T_1 = T_2$. Since the automorphism group of a tree on one vertex is trivial, this implies that the general $A(T)$ can be obtained from copies of $\mathbb{Z}_2$ by direct and wreath products (see [17] for more details). Furthermore, if $T_1 \neq T_2$, then the conjugacy type of an element of $A(T)$ is $\lambda^1 \cup \lambda^2$, where $\lambda^i$ is the conjugacy type of an element of $A(T_i)$, $i = 1, 2$, and $\lambda^1 \cup \lambda^2$ is the multiset union of the two sequences written in decreasing order. If $T_1 = T_2$, then for an arbitrary element of $A(T)$ either the leaves in each subtree remain in that subtree, or all leaves are mapped to the other subtree. The conjugacy type of an element of $A(T)$ is then either $\lambda^1 \cup \lambda^2$, where $\lambda^i$ is the conjugacy type of an element of $A(T_i)$, $i = 1, 2$, or it is $2\lambda^1$, where $\lambda^1$ is the conjugacy type of an element of $A(T_1)$. In particular, the conjugacy type of any element of the automorphism group of a binary tree must be a binary partition.
Next, we define tanglegrams. Given a permutation $v \in S_n$ along with two trees $T, S \in B_n$, each with leaves labeled $1, \ldots, n$, we construct an ordered binary rooted tanglegram $(T, v, S)$ of size $n$ with $T$ as the left tree and $S$ as the right tree by identifying leaf $i$ in $T$ with leaf $v(i)$ in $S$. Note, $(T, v, S)$ and $(T', v', S')$ are considered to represent the same tanglegram provided $T = T'$, $S = S'$ as trees and $v' = uvw$ where $u \in A(T)$ and $w \in A(S)$. Let $T_n$ be the set of all ordered binary rooted tanglegrams of size $n$, and let $t_n$ be the number of elements in the set $T_n$. For example, $t_3 = 2$ and $t_4 = 13$. Figures 1 and 2 show the tanglegrams of sizes 3 and 4, where we draw the leaves of the left and right tree on separate vertical lines and show the matching using dotted lines. The dotted lines are not technically part of the graph, but this visualization allows us to give a planar drawing of the two trees.
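The definition can be tested directly for small sizes by listing $B_n$, building each automorphism group from the direct and wreath product description, and counting the orbits of $S_n$ under left and right multiplication. The Python sketch below does this under assumptions of its own: a leaf is encoded as the empty tuple, an internal vertex as a sorted pair of subtrees (so equal shapes are equal tuples), and leaves are numbered from 0 in left-to-right order. It is a brute-force check, not an efficient enumeration.

```python
from itertools import permutations


def shapes(n):
    """Inequivalent rooted binary trees with n leaves in canonical nested-tuple form."""
    if n == 1:
        return [()]
    result = set()
    for k in range(1, n // 2 + 1):
        for a in shapes(k):
            for b in shapes(n - k):
                result.add(tuple(sorted((a, b))))
    return sorted(result)


def automorphisms(t):
    """A(T) as a list of tuples p, where p[i] is the image of leaf i."""
    if t == ():
        return [(0,)]
    a, b = t
    group_a, group_b = automorphisms(a), automorphisms(b)
    k = len(group_a[0])  # number of leaves in the first subtree
    group = [pa + tuple(k + x for x in pb) for pa in group_a for pb in group_b]
    if a == b:
        # Equal subtrees: also allow elements that exchange the two subtrees.
        group += [tuple(k + x for x in pa) + pb for pa in group_a for pb in group_b]
    return group


def compose(p, q):
    """The permutation i -> p[q[i]]."""
    return tuple(p[x] for x in q)


def count_tanglegrams(n):
    """Sum over pairs (T, S) of the number of double cosets A(T) w A(S) in S_n."""
    groups = [automorphisms(t) for t in shapes(n)]
    all_perms = list(permutations(range(n)))
    total = 0
    for left in groups:
        for right in groups:
            seen = set()
            for w in all_perms:
                if w in seen:
                    continue
                total += 1  # w starts a new double coset
                seen.update(compose(u, compose(w, v)) for u in left for v in right)
    return total
```

Running `count_tanglegrams(3)` and `count_tanglegrams(4)` gives 2 and 13, matching the values of $t_3$ and $t_4$ stated above. The running time grows like $n!$, so this is only practical for very small $n$.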
We remark that the plane binary trees with $n \geq 2$ leaves are a different family of objects from $B_n$ that also come up in this paper. These are trees embedded in the plane so the left child of a vertex is distinguishable from the right child. The plane binary trees with $n + 1$ leaves are well known to be counted by the Catalan numbers
$$c_n = \frac{1}{n+1} \binom{2n}{n}.$$

Sketch of proof of the main theorem
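The focus of this section is the proof of Theorem 1. The theorem will follow from an auxiliary result, and the proof of Theorem 4 is similar and is omitted in this extended abstract. The number of tanglegrams is, by definition, equal to
$$t_n = \sum_{T \in B_n} \sum_{S \in B_n} |C(T, S)|,$$
where the sums on the right are over inequivalent binary trees with $n$ leaves, and $C(T, S)$ is the set of double cosets of the symmetric group $S_n$ with respect to the double action of $A(T)$ on the left and $A(S)$ on the right. See [17] for more details. Let us fix $T \in B_n$ and $S \in B_n$ and write $C = C(T, S)$. Then
$$|C| = \sum_{w \in S_n} \frac{1}{|C_w|},$$
where $C_w$ is the double coset of $S_n$ that contains $w$. It is known that
$$|C_w| = \frac{|A(T)| \, |A(S)|}{|A(T) \cap w A(S) w^{-1}|}.$$
For $u \in A(T)$ and $v \in A(S)$ of the same cycle type $\lambda$ there are exactly $z_\lambda$ permutations $w$ with $u = w v w^{-1}$, so summing over $w$ gives
$$|C| = \sum_{\lambda} \frac{z_\lambda \, |A(T)_\lambda| \, |A(S)_\lambda|}{|A(T)| \, |A(S)|}, \qquad (2)$$
where the sum is over binary partitions of $n$ and $A(T)_\lambda$ (respectively, $A(S)_\lambda$) denotes the elements of $A(T)$ (resp., $A(S)$) of type $\lambda$.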
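To get the formula for $t_n$ we want to sum Equation (2) over all pairs of trees, and fortunately a change of the order of summation helps. Indeed, we have
$$t_n = \sum_{T \in B_n} \sum_{S \in B_n} \sum_{\lambda} \frac{z_\lambda \, |A(T)_\lambda| \, |A(S)_\lambda|}{|A(T)| \, |A(S)|} = \sum_{\lambda} z_\lambda \left( \sum_{T \in B_n} \frac{|A(T)_\lambda|}{|A(T)|} \right)^2,$$
and the main theorem is proved once we have shown the following proposition.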
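Proposition 6. For a binary partition $\lambda$ of $n$,
$$\sum_{T \in B_n} \frac{|A(T)_\lambda|}{|A(T)|} = \frac{\prod_{i=2}^{\ell(\lambda)} \left( 2(\lambda_i + \cdots + \lambda_{\ell(\lambda)}) - 1 \right)}{z_\lambda},$$
where $A(T)_\lambda$ denotes the elements of $A(T)$ of type $\lambda$.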
The proposition also implies Theorem 3, as
$$b_n = \sum_{T \in B_n} 1 = \sum_{T \in B_n} \sum_{\lambda} \frac{|A(T)_\lambda|}{|A(T)|} = \sum_{\lambda} \sum_{T \in B_n} \frac{|A(T)_\lambda|}{|A(T)|}.$$
If $\lambda = 1^n$, then $|A(T)_\lambda| = 1$ for all $T \in B_n$, so the proposition is saying that
$$\sum_{T \in B_n} \frac{1}{|A(T)|} = \frac{(2n-3)!!}{n!}.$$
This is equivalent to $\sum_{T \in B_n} 2^{n-1}/|A(T)| = c_{n-1}$. Since $2^{n-1}/|A(T)|$ counts all plane binary trees isomorphic to $T$, this is just the well-known fact that there are $c_{n-1}$ plane binary trees with $n$ leaves. For a general $\lambda$, however, the proposition is far from obvious. What we need is a recursion satisfied by the expression on the right, analogous to the recursion $c_n = c_0 c_{n-1} + c_1 c_{n-2} + \cdots + c_{n-1} c_0$ for Catalan numbers.
Lemma 7. For a nonempty subset $S = \{i_1 < i_2 < \ldots < i_k\}$ of the natural numbers define Let $n \geq 2$, let $x$ denote variables $x_1, x_2, \ldots$, and let $x/2$ denote $x_1/2, x_2/2, \ldots$. Then The proof is by induction on $n$. See [2] for complete details.
Example. For $n = 3$, the lemma says that where the last three terms on the right-hand side correspond to subsets $\{1\}$, $\{1, 2\}$, and $\{1, 3\}$, respectively. As another example, take $x_i = 2$ for all $i$. Then $r_S(x) = (2|S| - 3)!!$ (where we interpret $(-1)!!$ as 1), $r_S(x/2) = 0$, and by the obvious symmetry of $S$ and $[n] \setminus S$ the lemma yields which is equivalent to the standard recurrence for Catalan numbers.
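The special case $\lambda = 1^n$ discussed above is easy to confirm numerically. The Python sketch below assumes the recursion $|A(T)| = |A(T_1)| \, |A(T_2)|$, doubled when $T_1 = T_2$, which follows from the direct and wreath product description in Section 2. The nested-tuple encoding of trees and all function names are choices made for this illustration.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def shapes(n):
    """Inequivalent rooted binary trees with n leaves; () is a leaf and an
    internal vertex is the sorted pair of its subtrees."""
    if n == 1:
        return ((),)
    result = set()
    for k in range(1, n // 2 + 1):
        for a in shapes(k):
            for b in shapes(n - k):
                result.add(tuple(sorted((a, b))))
    return tuple(sorted(result))


def aut_order(t):
    """|A(T)|: the product over the subtrees, doubled when the two subtrees are equal."""
    if t == ():
        return 1
    a, b = t
    return aut_order(a) * aut_order(b) * (2 if a == b else 1)


def catalan(m):
    """The Catalan number c_m = binom(2m, m) / (m + 1)."""
    return comb(2 * m, m) // (m + 1)


def odd_double_factorial(m):
    """m!! for odd m, with (-1)!! interpreted as 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def plane_tree_sum(n):
    """Sum over T in B_n of 2**(n-1) / |A(T)|."""
    return sum(Fraction(2 ** (n - 1), aut_order(t)) for t in shapes(n))


def inverse_order_sum(n):
    """Sum over T in B_n of 1 / |A(T)|."""
    return sum(Fraction(1, aut_order(t)) for t in shapes(n))
```

For $n = 4$ the two trees have automorphism groups of orders 2 and 8, so the first sum is $8/2 + 8/8 = 5 = c_3$.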
Proof of Proposition 6. Say $\lambda$ is a binary partition of $n$. The proof is by induction on $n$. For $n = 1$, the statement is obvious. Assume that the statement holds for all binary partitions up to size $n - 1$. Our task is to show the identity of the proposition by showing that its left-hand side satisfies a recurrence similar to (5). This can be done by a careful analysis of all possible cases and is omitted in this extended abstract.

Random generation of tanglegrams
Algorithm 1 (Random generation of w ∈ A(T )).
Input: Binary tree $T \in B_n$.
Procedure: If $T$ is the tree with one vertex, let $w$ be the unique element of $A(T)$. Otherwise, the root of $T$ has subtrees $T_1$ and $T_2$. Assume the leaves of $T_1$ are labeled $[1, k]$ and the leaves of $T_2$ are labeled $[k + 1, n]$. Use the algorithm recursively to produce $w_i \in A(T_i)$, $i = 1, 2$, where $A(T_1)$ is a subset of the permutations of $[1, n]$ which fix $[k + 1, n]$ and $A(T_2)$ is a subset of the permutations of $[1, n]$ which fix $[1, k]$. Construct $w$ as follows. Say $f : [1, k] \to [k + 1, n]$ mapping $i$ to $i + k$ induces an isomorphism of $T_1$ and $T_2$. Define the "tree flip permutation" $\pi$ to be the product of the transpositions interchanging $i$ with $f(i)$ for all $1 \leq i \leq k$.
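The final step of Algorithm 1 is not spelled out in the text above, so the Python sketch below fills it in with an assumption: when $T_1 \neq T_2$ it returns $w_1 w_2$, and when $T_1 = T_2$ it additionally applies the tree flip $\pi$ with probability $1/2$. This completion is consistent with the uniformity claimed in Theorem 8, but it is a reconstruction and not a quotation of the algorithm. Trees are encoded as nested tuples in canonical form (a leaf is `()`, equal shapes are equal tuples) and leaves are numbered from 0, both choices of this illustration.

```python
import random


def random_automorphism(t, rng=random):
    """Sample an element of A(T) as a tuple p with p[i] the image of leaf i.

    Assumed completion of the procedure: combine the recursively sampled w1 and
    w2, and if the two subtrees are equal, exchange them with probability 1/2.
    """
    if t == ():
        return (0,)  # the one-vertex tree has only the identity
    t1, t2 = t
    w1 = random_automorphism(t1, rng)
    w2 = random_automorphism(t2, rng)
    k = len(w1)  # number of leaves of the first subtree
    if t1 == t2 and rng.random() < 0.5:
        # Apply the flip after w1 w2: leaf i of T1 goes to k + w1[i],
        # and leaf k + j of T2 goes to w2[j].
        return tuple(k + x for x in w1) + w2
    # No flip: w1 acts on the first k leaves, w2 (shifted by k) on the rest.
    return w1 + tuple(k + x for x in w2)
```

For the balanced tree with four leaves, `(((), ()), ((), ()))`, repeated sampling produces all 8 elements of its automorphism group; for the caterpillar `((), ((), ((), ())))` it produces the 2 elements that fix or swap the leaves of the single cherry.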
Algorithm 2 (Random generation of $T$ with non-empty $A(T)_\lambda$ and $w \in A(T)_\lambda$).
Input: Binary partition λ of n.
Procedure: If n = 1, let T be the tree with one vertex, and let w be the unique element of A(T ).

Algorithm 3 (Random generation of tanglegrams).
Input: Integer $n$.
Procedure: Pick a random binary partition $\lambda$ of $n$ with probability proportional to $z_\lambda q_\lambda^2$, where $t_n = \sum_{\lambda} z_\lambda q_\lambda^2$. Use Algorithm 2 twice to produce random trees $T$ and $S$ and permutations $u \in A(T)_\lambda$, $v \in A(S)_\lambda$. Among the permutations $w$ for which $u = w v w^{-1}$, pick one uniformly at random from the $z_\lambda$ possibilities.
Output: Binary trees T and S and double coset A(T )wA(S), or equivalently (T, w, S).
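The first step of Algorithm 3 can be carried out with exact integer arithmetic. The Python sketch below assumes that $q_\lambda$ is the product $\prod_{i \geq 2} (2(\lambda_i + \cdots + \lambda_{\ell(\lambda)}) - 1)$ divided by $z_\lambda$, which is the value needed for $\sum_\lambda z_\lambda q_\lambda^2$ to agree with Theorem 1. Multiplying every weight by $n!$ makes the weights integers without changing the distribution. Only the choice of $\lambda$ is covered; the function names are hypothetical.

```python
import random
from collections import Counter
from math import factorial


def binary_partitions(n, largest=None):
    """Yield the binary partitions of n as weakly decreasing tuples of powers of 2."""
    if n == 0:
        yield ()
        return
    if largest is None:
        largest = 1 << (n.bit_length() - 1)
    part = largest
    while part >= 1:
        if part <= n:
            for rest in binary_partitions(n - part, part):
                yield (part,) + rest
        part //= 2


def z(lam):
    """z_lambda: product over distinct parts p with multiplicity m of p**m * m!."""
    result = 1
    for part, mult in Counter(lam).items():
        result *= part ** mult * factorial(mult)
    return result


def numerator(lam):
    """The product over i >= 2 of 2(lam_i + ... + lam_l) - 1."""
    prod = 1
    for i in range(1, len(lam)):
        prod *= 2 * sum(lam[i:]) - 1
    return prod


def partition_weights(n):
    """Integer weights n! * z_lambda * q_lambda**2 = (n!/z_lambda) * numerator**2."""
    return {lam: (factorial(n) // z(lam)) * numerator(lam) ** 2
            for lam in binary_partitions(n)}


def sample_partition(n, rng=random):
    """Pick a binary partition of n with probability proportional to its weight."""
    weights = partition_weights(n)
    r = rng.randrange(sum(weights.values()))
    for lam, weight in weights.items():
        if r < weight:
            return lam
        r -= weight
```

For $n = 4$ the weights are 6, 27, 54 and 225 for $(4)$, $(2,2)$, $(2,1,1)$ and $(1,1,1,1)$, summing to $4! \cdot 13$, so the all-ones partition is chosen about 72% of the time.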
Algorithm 4 (Random generation of $T \in B_n$). Algorithm 4 is not the first of its kind, see also [9].
Input: Integer $n$.
Procedure: Pick a random binary partition $\lambda$ of $n$ with probability proportional to $q_\lambda$. Use Algorithm 2 to produce a random tree $T$ (and a permutation $u \in A(T)_\lambda$).
Output: Binary tree T .

Algorithm 5 (Random generation of tangled chains).
Input: Positive integers k and n.
Procedure: Pick a random binary partition $\lambda$ of $n$ with probability proportional to $z_\lambda^{k-1} q_\lambda^k$, where $t(k, n) = \sum_{\lambda} z_\lambda^{k-1} q_\lambda^k$. Use Algorithm 2 $k$ times to produce random trees $T_i$ and permutations $u_i \in A(T_i)_\lambda$ for $i = 1, \ldots, k$. Among the permutations $w_i$ for which $u_i = w_i u_{i+1} w_i^{-1}$, pick one uniformly at random for each $i = 1, \ldots, k - 1$.
Theorem 8. For any positive integer $n$, the following hold. Algorithm 1 produces every permutation $w \in A(T)$ with probability $1/|A(T)|$. Algorithm 2 produces every pair $(T, w)$, where $w \in A(T)_\lambda$, with probability $1/(|A(T)| \cdot q_\lambda)$. Algorithm 3 produces every tanglegram with probability $1/t_n$. Algorithm 4 produces every inequivalent binary tree with probability $1/b_n$. Algorithm 5 produces every tangled chain of length $k$ of trees in $B_n$ with probability $1/t(k, n)$.

A recurrence for enumerating tanglegrams and tangled chains
In this section, we give a recurrence for computing $t_n$. Recall that for each nonempty binary partition $\lambda$, we can construct its multiplicity vector $m_\lambda = (m_0, m_1, m_2, m_3, \ldots)$, where $m_i$ is the number of times $2^i$ occurs in $\lambda$. The map $\lambda \mapsto m_\lambda$ is a bijection from binary partitions to vectors of nonnegative integers with only finitely many nonzero entries. The quantity $z_\lambda$ for a binary partition $\lambda$ is easily expressed in terms of the multiplicities in $m_\lambda$ as
$$z_\lambda = \prod_{i \geq 0} (2^i)^{m_i} \, m_i!.$$
We will use the functions $f_2(s)$ : with base cases $c(h, 0, s) = r(h, 0, s) = 1$.
The general case is spelled out in [2]. The proof is a direct consequence of Theorem 1. Similar recurrence relations hold for all tangled chains.

Final remarks
Variants on tanglegrams
Tanglegrams as described here fit into a more general setting of pairs of graphs with a bijection between certain subsets of the vertices (more completely described and motivated in [17]). One can also consider unordered tanglegrams by identifying $(T, v, S)$ with $(S, v^{-1}, T)$. For example, the 4th and 5th tanglegrams in Figure 2 are equivalent as unordered tanglegrams, and so are the 8th and 10th. From this picture, the reader can verify that there are 10 unordered tanglegrams of size 4.
Because of reversibility assumptions for the continuous time Markov mutation models commonly used to reconstruct phylogenetic trees, unrooted trees are the most common output of phylogenetic inference algorithms. Thus another variant of tanglegrams involves using unrooted trees in place of rooted ones. The motivation for studying these variants comes from noting that many problems in computational phylogenetics such as distance calculation between trees [1] "factor" through a problem on tanglegrams.

Connection with symmetric functions
The main theorems suggest that symmetric functions might be at play; note, for example, the similarity with the formula $h_n = \sum_{\lambda} z_\lambda^{-1} p_\lambda$, where $h_n$ is the homogeneous symmetric function, $p_\lambda$ the power sum symmetric function, and the sum is over all partitions of $n$. Is there a connection between tanglegrams (or more generally tangled chains) and symmetric functions?
Based on a manuscript version of this paper, Ira Gessel pointed out that there is indeed a connection between symmetric functions and the enumeration of tanglegrams based on the theory of species. He has beautifully spelled out this connection. This approach leads to a simple formula for the number of unordered tanglegrams and a generating function for the number of unrooted tanglegrams along with several other variations on tanglegrams [11].

Alternative proofs
Recently, Eric Fusy gave a combinatorial proof of our main results, which also yields a remarkable simplification of the random sampler for tangled chains [10].

The shape of a random tanglegram
Given an algorithm for random generation, it is natural to ask for the probability of certain substructures in trees, tanglegrams and tangled chains. For example, cherries (two leaves with a common parent) play an important role in the literature on tanglegrams, see [4, pp. 325-326]. In the original version of this abstract and the corresponding full length paper, we stated several open problems and conjectures on the limiting distribution of certain substructures. Many of these problems have now been solved by Konvalinka and Wagner [16] and Czabarka, Székely, and Wagner [6]. In particular, Konvalinka and Wagner show that the two halves of a random tanglegram essentially look like two independently chosen random plane binary trees.