$2\times 2$ monotone grid classes are finitely based

In this note, we prove that all $2 \times 2$ monotone grid classes are finitely based, i.e., defined by a finite collection of minimal forbidden permutations. This follows from a slightly more general result about certain $2 \times 2$ (generalized) grid classes having two monotone cells in the same row.


Introduction
In recent years, the emerging theory of grid classes has led to some of the major structural and enumerative developments in the study of permutation patterns. Particular highlights include the characterisation of all possible "small" growth rates (Huczynska and Vatter, 2006; Kaiser and Klazar, 2003; Vatter, 2011) and the subsequent result that all classes with these growth rates have rational generating functions (Albert et al., 2015).
To support results such as these, the study of grid classes themselves has gained importance. Restricting one's attention to monotone grid classes, it is known that the structure of the matrix defining a grid class determines both its growth rate (Bevan, 2015), and whether it is well-partially-ordered (Murphy and Vatter, 2003).
One remaining open question about monotone grid classes concerns their bases, that is, the sets of minimal forbidden permutations of the classes. Backed up by some computational evidence, it is widely believed that all monotone grid classes are finitely based, but this is only known to be true for certain families, most notably those whose row-column graphs are forests (Albert et al., 2013). To date, the only other monotone grid classes known to have a finite basis are two 2 × 2 grid classes. The first is the class of skew-merged permutations, Av(2143, 3412), treated in (Stankova, 1994), while the second is in Waton's PhD thesis (Waton, 2007). Inspired by Waton's approach, we show that a certain family of (non-monotone) 2 × 2 grid classes are all finitely based, from which we can conclude the following result.
Theorem 1.1. Every 2 × 2 monotone grid class is finitely based.
The rest of this section covers a number of prerequisite definitions. In Section 2 we introduce a more general construction than grid classes, based on juxtapositions, that are known to be finitely based, and use these to characterise the grid classes they contain. In Section 3 we consider three separate cases that will enable us to prove our more general result (Theorem 1.2), and thence Theorem 1.1.
Writing permutations in one-line notation, we say that the permutation σ is contained in a permutation π, denoted σ ≤ π, if there is a subsequence of the entries of π that have the same relative ordering as the entries of σ. A specific instance of a set of entries of π witnessing this containment is called a copy of σ in π. Containment forms a partial order on the set of all permutations, and sets of permutations which are closed downwards in this order are called permutation classes. Specifically, if C is a permutation class, π ∈ C and σ ≤ π, then we must have σ ∈ C. For convenience later, we regard the empty permutation as belonging to every permutation class.
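The containment relation just defined can be checked directly for small permutations. The following is a minimal brute-force sketch, not taken from the paper; the helper names `pattern` and `contains` are illustrative, and permutations are assumed to be given as tuples in one-line notation.

```python
from itertools import combinations


def pattern(seq):
    """Return the relative ordering of seq as a tuple over 1..len(seq)."""
    ranks = {value: rank for rank, value in enumerate(sorted(seq), start=1)}
    return tuple(ranks[value] for value in seq)


def contains(pi, sigma):
    """True if some subsequence of pi has the same relative ordering as sigma."""
    target = tuple(sigma)
    # combinations() keeps entries in their original left-to-right order,
    # so each candidate is a genuine subsequence of pi.
    return any(pattern(sub) == target for sub in combinations(pi, len(target)))
```

With this convention the empty permutation `()` is contained in every permutation, matching the remark that it belongs to every class. The search is exponential in the length of `pi`, so it is only suitable for small experiments.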
While permutation classes can be defined in a number of ways (for example, the set of all permutations that can be sorted by a stack forms a permutation class), a convenient characterisation can be given in terms of the unique set of minimal forbidden permutations that do not lie in the class. We call the set B the basis of a class C if C = {π : β ≰ π for all β ∈ B}, and B is minimal with this property, and we write C = Av(B). By its minimality, the set B must form an antichain under ≤, but since infinite antichains are known to exist in the containment partial order, B need not be finite. When the basis of C is finite, we say that C is finitely based. We frequently make use of a graphical perspective, in which we represent a permutation π by plotting the points (i, π(i)) (i = 1, . . . , |π|) in the plane. Indeed, we do not distinguish between the permutation π written in one-line notation, and the graphical representation of π.
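As an illustration of the notation Av(B), the sketch below enumerates the members of a class of a given length by brute force. The function names `avoids` and `av` are hypothetical helpers introduced here, not notation from the paper, and the approach is only practical for short permutations.

```python
from itertools import combinations, permutations


def pattern(seq):
    """Return the relative ordering of seq as a tuple over 1..len(seq)."""
    ranks = {value: rank for rank, value in enumerate(sorted(seq), start=1)}
    return tuple(ranks[value] for value in seq)


def contains(pi, sigma):
    """True if some subsequence of pi has the same relative ordering as sigma."""
    target = tuple(sigma)
    return any(pattern(sub) == target for sub in combinations(pi, len(target)))


def avoids(pi, basis):
    """True if pi contains no element of basis, i.e. pi lies in Av(basis)."""
    return not any(contains(pi, beta) for beta in basis)


def av(basis, n):
    """List all permutations of length n in Av(basis)."""
    return [pi for pi in permutations(range(1, n + 1)) if avoids(pi, basis)]
```

For example, `av([(2, 1)], 4)` returns only the increasing permutation, and `av([(2, 1, 4, 3), (3, 4, 1, 2)], 4)` lists the skew-merged permutations of length 4.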
For m, n ≥ 1, let M be an m × n matrix whose entries are permutation classes (including possibly the empty class). The grid class of the matrix M, denoted Grid(M), is the permutation class consisting of all permutations π for which (in the graphical perspective) there exist m − 1 horizontal and n − 1 vertical lines which divide the entries of π into mn rectangles, so that the (possibly zero) entries of π in each rectangle form a copy of a permutation from the class in the corresponding entry of M. When the entries of M are all either Av(12), Av(21) or ∅, then Grid(M) is a monotone grid class.
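For the 2 × 2 monotone case, the definition can be tested by trying every position of the single vertical and single horizontal line. The sketch below is an illustration under stated assumptions: cells are passed as predicates in the order top-left, top-right, bottom-left, bottom-right, and the function name `in_grid_2x2` is hypothetical. The predicates are applied to raw entry values, which is adequate for monotone cells because monotonicity depends only on relative order.

```python
from itertools import permutations


def is_increasing(seq):
    """Membership test for Av(21): entries increase left to right."""
    return all(a < b for a, b in zip(seq, seq[1:]))


def is_decreasing(seq):
    """Membership test for Av(12): entries decrease left to right."""
    return all(a > b for a, b in zip(seq, seq[1:]))


def in_grid_2x2(pi, top_left, top_right, bottom_left, bottom_right):
    """Brute-force test of membership in a 2 x 2 grid class.

    The vertical line puts the first v entries on the left; the horizontal
    line puts values <= h below it. Either line may sit at an extreme,
    which leaves a row or column empty.
    """
    n = len(pi)
    for v in range(n + 1):
        left, right = pi[:v], pi[v:]
        for h in range(n + 1):
            cells = (
                ([x for x in left if x > h], top_left),
                ([x for x in right if x > h], top_right),
                ([x for x in left if x <= h], bottom_left),
                ([x for x in right if x <= h], bottom_right),
            )
            if all(test(entries) for entries, test in cells):
                return True
    return False


def skew_merged(pi):
    """The X-shaped monotone grid class: decreasing cells on one diagonal."""
    return in_grid_2x2(pi, is_decreasing, is_increasing, is_increasing, is_decreasing)
```

Here `skew_merged` uses the arrangement with decreasing cells at top-left and bottom-right and increasing cells at bottom-left and top-right, which is the grid description of the skew-merged permutations mentioned in the introduction.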
We are mostly concerned with 2 × 2 matrices in this paper, and in this case it will prove convenient to refer to these grid classes more succinctly. If $M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$ is a matrix consisting of permutation classes, then we write to mean Grid(M). Additionally, when (say) A = Av(21), then we may refer to the cell A using , reflecting the fact that all points in this cell are increasing. Similarly, we may write when A = Av(12). Finally, where the entries of the 2 × 2 matrix M are either arbitrary or clear from the context, we may also simply refer to Grid(M) as .
We are ready to state our general theorem, from which Theorem 1.1 will follow.
Our approach makes use of an existing result, which although not originally presented in this way, can be cast in terms of grid classes. For permutation classes C and D, the (horizontal) juxtaposition of C and D is the 1 × 2 grid class Grid(C D). Similarly, the vertical juxtaposition of C and D is the corresponding 2 × 1 grid class.
Lemma 1.3 (Atkinson, 1999). Whenever C and D are finitely based, so are the horizontal and vertical juxtapositions of C and D.
For clarity, we occasionally write C D for the horizontal juxtaposition C D (we do not need the corresponding vertical juxtaposition notation).
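Membership of a horizontal juxtaposition can be tested by trying every position of the dividing line. The following sketch assumes each class is supplied as a membership predicate on standardised permutations; the names `in_horizontal_juxtaposition` and `unimodal` are illustrative only.

```python
from itertools import permutations


def pattern(seq):
    """Return the relative ordering of seq as a tuple over 1..len(seq)."""
    ranks = {value: rank for rank, value in enumerate(sorted(seq), start=1)}
    return tuple(ranks[value] for value in seq)


def is_increasing(seq):
    return all(a < b for a, b in zip(seq, seq[1:]))


def is_decreasing(seq):
    return all(a > b for a, b in zip(seq, seq[1:]))


def in_horizontal_juxtaposition(pi, in_left, in_right):
    """True if pi splits as a prefix in the left class and a suffix in the right class."""
    return any(
        in_left(pattern(pi[:v])) and in_right(pattern(pi[v:]))
        for v in range(len(pi) + 1)
    )


def unimodal(pi):
    """Juxtaposition of Av(21) on the left with Av(12) on the right."""
    return in_horizontal_juxtaposition(pi, is_increasing, is_decreasing)
```

As a small sanity check, the juxtaposition of Av(21) with Av(12) consists of the permutations that increase and then decrease, of which there are 2^(n−1) of each length n; both 213 and 312 are excluded.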

Juxtapositions and relative bases
In this section, we give a characterisation of 2 × 2 grid classes of the form where A, B, C and D are four fixed (but arbitrary) permutation classes. We begin by considering the following related class, formed by the horizontal juxtaposition of two vertical juxtapositions: Note that if A, B, C and D are finitely based, then by repeated application of Lemma 1.3 so too is F. Clearly, E ⊆ F. We are interested in the basis of E, which we can separate into two parts: those basis elements of E that lie within F, and those basis elements of E that are not in F. By minimality and since E ⊆ F, the elements of this latter set must also be basis elements of F. The set of basis elements of E that are contained in F we call the relative basis of E in F, and we have the following observation.
Observation 2.1. Let C and D be two permutation classes such that D is finitely based, and C ⊆ D. Then C is finitely based if and only if the relative basis of C in D is finite.
Consider any permutation π in the set F \ E. Since π lies in the juxtaposition class F, we can write π = π₁π₂, with π₁ in the left-hand vertical juxtaposition and π₂ in the right-hand one. We refer to the division line v that separates π₁ from π₂ as a v-line. Additionally, any horizontal division line in π₁ that demonstrates π₁ as a member of the vertical juxtaposition is called a left h-line of π, and similarly any valid horizontal division line in π₂ is called a right h-line. Thus, we can recognise π ∈ F by means of a division triple, (v, r, ℓ), where v is the v-line, r the right h-line, and ℓ the left h-line.
Fig. 1: The relationship between the divisions (v, r, ℓ) and (v′, r′, ℓ′) in the proof of Lemma 2.2. The small arrows indicate that the corresponding division lines have been chosen to be extremal in the direction specified by the arrows.
The condition that π ∈ F \ E can now be described as follows: for every division triple (v, r, ℓ) that recognises π ∈ F , the right h-line r and the left h-line ℓ cannot be at the same height. We use the symbol to denote the set of permutations in F which have a division triple (v, r, ℓ) where ℓ is no higher than r, and to denote those permutations which have a division where ℓ is no lower than r. Note that and are both in fact permutation classes, and also that F = ∪ . Our main result of this section now follows. It shows in particular that π ∈ F \ E cannot simultaneously lie in and , and hence the relative basis of E in F can be divided into two disjoint parts: those that lie in and those that lie in .
Lemma 2.2. Any 2 × 2 grid class E = is equal to the intersection of the corresponding classes and . That is,
Proof: First, it is clear that ⊆ ∩ , so suppose that we have a permutation π in ∩ . Consider π first as a member of . There exists at least one division triple (v, r, ℓ) which recognises this, and we choose any valid v-line v, together with the lowest right h-line r and the highest left h-line ℓ. Note in particular that for any right h-line that is lower than r, there must exist a basis element in the top right cell. If ℓ and r coincide, then we have π ∈ and we are done, so we may assume that ℓ is strictly lower than r.
Next, consider π as an element of . We pick a division (v′, r′, ℓ′) by first choosing any v-line v′ which either coincides with v or lies further to the left (the case where v′ is to the right of v will follow upon rotating the picture by 180°). Next choose any valid r′, noting that r′ must be at least as high as r to avoid introducing a basis element into the top right cell. Finally, choose ℓ′ to be as low as possible, subject to the division triple (v′, r′, ℓ′) remaining a valid division for membership of (see Figure 1). We claim that ℓ′ is at the same height as r′.
Suppose, for a contradiction, that ℓ′ lies strictly above r′, and let ℓ′′ be the left h-line that has the same height as r′. Since the division triple (v′, r′, ℓ′′) does not witness π ∈ (but (v′, r′, ℓ′) does), there must exist some basis element in the top left region defined by (v′, r′, ℓ′′). However, this region is contained in the top left region defined by (v, r, ℓ), so this is impossible.
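Lemma 2.2 can be checked experimentally on short permutations by enumerating division triples. The sketch below is illustrative and rests on assumptions not fixed by the text above: the cells are laid out with A top-left, B top-right, C bottom-left and D bottom-right, each class is given as a predicate that depends only on relative order, and the names `division_triples` and `classify` are hypothetical.

```python
from itertools import permutations


def is_increasing(seq):
    return all(a < b for a, b in zip(seq, seq[1:]))


def is_decreasing(seq):
    return all(a > b for a, b in zip(seq, seq[1:]))


def division_triples(pi, A, B, C, D):
    """Yield every triple (v, r, l) recognising pi as a member of F.

    Assumed layout: A over C on the left of the v-line, B over D on the right.
    A horizontal line at height h places the values <= h below it.
    """
    n = len(pi)
    for v in range(n + 1):
        left, right = pi[:v], pi[v:]
        left_lines = [h for h in range(n + 1)
                      if A([x for x in left if x > h]) and C([x for x in left if x <= h])]
        right_lines = [h for h in range(n + 1)
                       if B([x for x in right if x > h]) and D([x for x in right if x <= h])]
        for l in left_lines:
            for r in right_lines:
                yield v, r, l


def classify(pi, A, B, C, D):
    """Return (in E, some triple has l <= r, some triple has l >= r)."""
    triples = list(division_triples(pi, A, B, C, D))
    in_E = any(l == r for _, r, l in triples)
    l_not_higher = any(l <= r for _, r, l in triples)
    l_not_lower = any(l >= r for _, r, l in triples)
    return in_E, l_not_higher, l_not_lower
```

The lemma predicts that the first component of `classify` is true exactly when the other two are both true, which can be confirmed for all permutations up to a small length and any choice of monotone cells.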

Main results
We are ready to start proving our three main results.
Proof: First, let B denote the relative basis of E inside the juxtaposition F. Since F is finitely based, by Observation 2.1 it suffices to show that B is finite. By Lemma 2.2 and the comments preceding it, any π ∈ B lies in exactly one of or . Consider first the case where π ∈ . We will identify a bounded number of points in π that demonstrate π ∉ E.
We begin by identifying two division triples, (vL, rL, ℓL) and (vR, rR, ℓR): vL is the leftmost v-line recognising π ∈ , and vR is the rightmost such v-line. Subject to these choices, we pick ℓL and ℓR to be as high as possible, and rL and rR as low as possible.
We now prove the following claim: if (v, r, ℓ) is any other division triple recognising π ∈ where the left h-line ℓ is chosen as high as possible, then ℓ is at the same height as either ℓL or ℓR.
If ℓL and ℓR are at the same height, the claim follows immediately, so we can assume that ℓL is strictly higher than ℓR. The situation is as depicted in Figure 2: we identify four points, a, b, c and d, which are distinct (except possibly b = c) and which form the copies of 21 that define ℓL and ℓR. Note that a and c lie immediately above ℓL and ℓR, and, except that the relative positions of a and c can be interchanged providing b = c, the points must be arranged in the way shown in Figure 2 in order that π ∈ . For the same reason, all other points of π that lie in the marked rectangular regions 1, 2, 3 and 4 (defined by the bounding dotted and dashed lines) in Figure 2 must lie on the diagonal segments indicated.
Consider any division triple (v, r, ℓ) recognising π ∈ where ℓ is chosen as high as possible. If v lies further left than all points in the region labelled 4 in Figure 2, then we can choose ℓ at the same height as ℓL. On the other hand, if any point from region 4 lies to the left of v, then c must lie above ℓ, and thus ℓ is at the same height as ℓR. This completes the proof of the claim.
We can now identify the following bounded collection of points of π: (i) a basis element of C which defines vR, (ii) a basis element of D to define vL, and (iii) at most 4 points a, b, c and d defining the two left h-lines ℓR and ℓL. It remains to identify a bounded number of points to ensure that any division triple (v, r, ℓ) recognising π ∈ has ℓ strictly lower than r. For this, it suffices to consider only the extremal triples (v, r, ℓ) where ℓ is as high as possible, and r is as low as possible. We identify the extremal triple (vX, rX, ℓX) where the v-line vX is chosen to lie immediately to the left of all points in region 4 of Figure 2. By the earlier claim, ℓX has the same height as ℓL. The lowest right h-line rX must lie strictly above ℓX, and is defined by a basis element of D to the right of vX, with one point lying immediately below rX. Observe that for any extremal triple (v, r, ℓ) where v lies to the left of vX, we have that ℓ is at the same height as ℓX, and r can be no lower than rX. In particular, since π as a basis element is minimally not in E, if r is higher than rX then it is because of points in π that we have already identified. Similarly, the position of the line rR is fixed by a basis element of D to the right of vR. For any extremal triple (v, r, ℓ) where v is further right than vX, we know that ℓ is at the same height as ℓR, and r can be no lower than rR (because of the basis element of D). Thus, again by the minimality of π, if r is strictly higher than rR it is because of points that we have already identified.
Fig. 2: The relationships between the division triples (vL, rL, ℓL) and (vR, rR, ℓR), the points defining ℓL and ℓR, and the restrictions on the placement of points in the four rectangular regions 1-4.
From this, we conclude that if π ∈ is a basis element of E relative to F then the number of points in π is bounded, as π comprises the points identified in (i), (ii) and (iii) above, together with at most two basis elements of D.
The argument for a basis element π that lies in is similar, and we omit some of the details. The process begins by identifying the leftmost and rightmost v-lines vL and vR, and the corresponding highest right h-lines rL and rR. The left hand picture in Figure 3 illustrates that rL and rR cannot have different heights (else π ∈ ). In the right hand picture of Figure 3, the points forming a basis element of C that defines the line ℓL ensure that in any extremal triple (v, r, ℓ), r is lower than ℓ. Thus π consists of (i) a basis element of C which defines vR, (ii) a basis element of D to define vL, (iii) a copy of 21 to define rR, and (iv) a basis element of C to define ℓL.
A similar approach, of bounding the number of possible left and right h-lines, can be applied for the other two cases, so we only sketch the proofs.
Fig. 3: The relationships between the division triples (vL, rL, ℓL) and (vR, rR, ℓR) when π ∈ . On the left, if rL and rR are at different heights, then ℓL is at the same height as rL. On the right, if rL and rR are at the same height, then the points defining ℓL guarantee π ∉ E for every triple (v, r, ℓ).
is finitely based.

Proof (sketch):
We need only consider relative basis elements of E that lie in , as the argument for is symmetric. Thus, consider a basis element π ∈ of E. Define the division triples (vR, rR, ℓR) and (vL, rL, ℓL) recognising π ∈ by choosing vR to be the rightmost v-line, and vL the leftmost, and then selecting rL and rR as low as possible, and ℓL and ℓR as high as possible.
We claim that ℓR and ℓL have the same height. In Figure 4, the point c, which defines the line rR, forces the region below ℓR and between vL and vR to be empty. Consequently, the pair of points a and b (which forms a copy of 21 and hence defines the height of ℓR) must lie to the left of vL. This means that a and b also define the highest position of every left h-line ℓ in a division triple (v, r, ℓ) recognising π ∈ .
The proof concludes by noting that we can demonstrate π ∉ E by the following points: (i) a basis element of C which defines vR, (ii) a basis element of D to define vL, (iii) a copy of 21 to define ℓR, and (iv) a basis element of D to define rR.
Proof (sketch): As before, by symmetry it suffices to consider a relative basis element π ∈ of E. Define the division triples (vR, rR, ℓR) and (vL, rL, ℓL) as in earlier proofs.
We claim that in any division triple (v, r, ℓ) recognising π ∈ where ℓ is as high as possible, ℓ has the same height as either ℓL or ℓR. The situation is illustrated in Figure 5: if v lies to the right of the point a then ℓ can be no higher than ℓR. On the other hand, if v lies to the left of a, then the only available copy of 12 has b as the '2', so ℓ has the same height as ℓL. With these two left h-lines defined, we need only identify two copies of basis elements of D to define corresponding lowest right h-lines in each case. Thus, π ∉ E is identified by the following points: (i) a basis element of C to define vR, (ii) a basis element of D to define vL, (iii) at most two copies of 21 to define ℓR and ℓL, and (iv) at most two basis elements of D to define rR and rL.
Proof of Theorem 1.1: By Albert et al. (2013), every monotone grid class whose row-column graph is a forest is finitely based. The only 2 × 2 monotone grid classes whose row-column graphs are not forests are those where all four cells are non-empty. Any such 2 × 2 monotone grid class can be described as a grid class in one of the three forms covered by Lemmas 3.1, 3.2 and 3.3, upon taking the classes C and D to be Av(12) or Av(21), and possibly appealing to symmetry.

Concluding remarks
Non-monotone 2 × 2 grids. One obvious question arising from this work is how far one might be able to extend Theorem 1.2 within the context of 2 × 2 grids: in particular, can one replace the two monotone classes in the lower row by something more general? Any approach to this question would need to bear in mind that there do exist 2 × 2 grid classes which are not finitely based, even though each entry of the matrix is finitely based. The primary example of this, given both in Murphy's PhD thesis (Murphy, 2002) and in (Atkinson and Stitt, 2002), is where C = Av(321654). (Note this example is more normally written as a direct sum, C ⊕ C.) This example can likely be adapted to produce other instances where the grid class is not finitely based, even though its individual entries are.
Larger grids. There are a number of difficulties encountered when one tries to extend our results here to larger grids. Even in the "next" case of 2 × 3 grids, there seems to be no obvious analogue to Lemma 2.2 to enable us to consider relative bases inside some larger class. The primary issue is that our proof relied on the fact that the heights of all possible left h-lines (or, analogously, right h-lines) form a contiguous set of values, but this need no longer be the case.