A further analysis of Cuckoo Hashing with a Stash and Random Graphs of Excess r

Cuckoo hashing is a hash table data structure offering constant access time, even in the worst case. As a drawback, the construction fails with small, but practically significant probability. However, Kirsch et al. (2008) showed that a constant-sized additional memory, the so-called stash, is sufficient to reduce the failure rate drastically. So far, however, this has required a modified insertion procedure that spends additional running time searching for an admissible key. As a major contribution of this paper, we show that the same bounds on the failure probability hold even without this search process, and thus the performance increases. Second, we extend the analysis to simplified cuckoo hashing, a variant of the original algorithm offering increased performance. Further, we derive some explicit asymptotic approximations concerning the number of usual resp. bipartite graphs related to the data structures. Using these results, we obtain much more precise asymptotic expansions of the success rate. These calculations are based on a generating function approach and an application of the saddle point method. Finally, we provide numerical results to support the theoretical analysis.


Introduction
In computer science, hash tables are a frequently used tool to build dictionary-like data structures that support fast insertion, search and potentially also deletion operations, see, e.g., Cormen et al. (2001). All these algorithms are based on using hash functions that map data records (keys) to a unique memory cell of the table. We say that two different keys collide if both try to occupy a single memory slot. The critical point of each hash algorithm is the handling of colliding keys.
In particular, hash algorithms that offer constant access time even in the worst case are of high interest. One of these algorithms is cuckoo hashing, which was first proposed by Pagh and Rodler (2004). In contrast to common hash techniques like open addressing and hashing with chaining (see, e.g., Knuth (1998)), collisions are resolved by rearranging keys. This is achieved by using two independent hash functions h_1 and h_2, each of which maps a key to a unique position in the data structure. These are the only allowed storage locations of that key and hence, search operations require access to at most two memory cells. There are several different variants of cuckoo hashing known in the literature, see Fotakis et al. (2005), Dietzfelbinger and Weidling (2007), and Kutzelnigg (2010). In this paper we consider the following two: The standard algorithm introduced in Pagh and Rodler (2004) splits the available memory cells into two equal parts and grants each hash function exclusive access to one of these regions. However, this split-up is not necessary. We obtain a somewhat simplified algorithm if both hash functions address the whole table. It is shown in Kutzelnigg (2009, 2010) that this variation, henceforth referred to as simplified cuckoo hashing, offers improved search and insertion performance.
Despite this difference, the insertion process works for both algorithms considered here as follows: To insert a new key x, we put it into the table position indicated by h_1(x). Now, if this position was previously empty, the insertion is complete. Otherwise, we kick out the key y that previously occupied h_1(x) = h_1(y) and move it to its alternative position h_2(y). If this position was already in use by a key z, we continue moving z to its alternative position. We carry on in this way until an empty cell is found or we detect an endless loop. The latter case can be detected by counting the number of kick-out steps performed. The standard way to handle this critical situation is to rebuild the complete data structure using two new hash functions. Fortunately, the occurrence of an endless loop is a rare event. More precisely, both variants of cuckoo hashing succeed with probability 1 − O(1/m), provided that the load factor is less than 0.5, see Devroye and Morin (2003), Drmota and Kutzelnigg (2009), and Pagh and Rodler (2004).
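The insertion procedure described above can be sketched as follows for the standard (two-table) variant. This is a minimal illustration and not the implementation used for the experiments: the function names `cuckoo_insert` and `cuckoo_lookup`, the representation of empty cells by `None`, and the cut-off parameter `max_kicks` are assumptions made for this sketch.

```python
def cuckoo_insert(tables, hashes, key, max_kicks):
    """Try to insert `key` into a standard (two-table) cuckoo hash structure.

    tables: pair of lists; empty cells hold None (so None is not a valid key).
    hashes: pair of callables; hashes[j](key) is the cell of `key` in tables[j].
    Returns None on success, otherwise the key that is left without a cell
    after max_kicks + 1 placements (the caller would then rebuild the table).
    """
    side = 0  # a new key always starts at its h_1 position
    for _ in range(max_kicks + 1):
        pos = hashes[side](key)
        # Place the current key and pick up the previous occupant, if any.
        key, tables[side][pos] = tables[side][pos], key
        if key is None:
            return None
        side = 1 - side  # the evicted key moves to its alternative table
    return key


def cuckoo_lookup(tables, hashes, key):
    """A key can only reside in one of its two admissible cells."""
    return tables[0][hashes[0](key)] == key or tables[1][hashes[1](key)] == key
```

A lookup inspects at most two cells, which is the worst-case constant access time mentioned above. The cut-off `max_kicks` only decides when an insertion is declared unsuccessful; how to choose it is a separate question.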
The analysis is hereby usually based on the assumption that the values of the hash functions form a sequence of independent uniform random numbers. This seems to impose strong conditions on the hash functions in use. However, numerical results obtained using quite simple hash functions are in good accordance with the results obtained by this model. Thus, we assume the same conditions on the hash functions to be satisfied in this paper too. A further discussion of this model, hash functions suitable for practical implementation, and additional references are given in Kutzelnigg (2009), but see also Dietzfelbinger and Schellbach (2009).
Although the success probability of cuckoo hashing tends to one under the conditions stated above, there is still a non-negligible failure rate, even for large data structures. For instance, consider the outcome of one of the numerical experiments given in Table 8: There occurred 14355 errors among 10^7 constructions of data structures, each possessing in total 10^5 memory cells and a load factor α = 0.45. Clearly, rebuilding a table requires high additional running time. Hence, cuckoo hashing is not attractive for applications where this behavior is not acceptable. To overcome this weak point, Kirsch et al. (2008) suggested using a small additional memory, the so-called stash. This stash is used to store items that could not be placed in the table itself. Interestingly, this modification drastically reduces the failure rate. More precisely, the authors showed that an additional memory of s items is sufficient to achieve a success probability of 1 − O(m^{-s-1}) for the standard algorithm. However, the analysis is based on the usage of a modified insertion algorithm that requires additional steps to look for an admissible key that can be placed in the stash. Details will be discussed in Section 4. We will show that the same result can be achieved without this search process, as was already conjectured in Kirsch et al. (2008). Next, we prove that the same bound is valid for the simplified version of cuckoo hashing. Further, we demonstrate how detailed asymptotic expansions of the success rate can be obtained using a generating function approach. In particular, this can be used to show that the failure rate is in fact in Θ(m^{-s-1}). Finally, we support our analysis by numerical results.

The Cuckoo Graph
Our analysis is based on the cuckoo graph, a concept first developed in Devroye and Morin (2003). Hereby, standard cuckoo hashing without a stash is modeled using a bipartite multi graph. Each memory cell corresponds to a labeled node. Further, a node is colored "red" if and only if the corresponding cell belongs to the first half of the memory slots. All other vertices are colored "blue". We further assume that the whole data structure contains 2m storage positions, each part numbered from 1 to m. We label the nodes accordingly. Next, each key is encoded by an edge that joins the two possible storage locations of that key. Additionally, the i-th inserted edge is labeled by i to encode the evolution. Note that the obtained graph is bipartite by construction.
Clearly, it is impossible for the algorithm to succeed if a component exists that has more edges than vertices. Thus, all components of the cuckoo graph must either be trees or unicyclic, i.e. contain either none or exactly one cycle. Interestingly, this condition is not only necessary, but also sufficient. More precisely, the key observation in Devroye and Morin (2003) is that cuckoo hashing succeeds if and only if all connected components of the cuckoo graph are either trees or unicyclic. Further, it is common to call the "bad" parts, i.e. connected components possessing more than one cycle, complex.
Concerning simplified cuckoo hashing, a similar modeling is possible, see Kutzelnigg (2010). In contrast to the bipartite graph described above, the one-table data structure is represented by a non-bipartite edge and node labeled directed graph. Edges are again used to represent keys and they still connect the two possible storage locations. However, it is necessary to use directed edges to determine the primary storage position uniquely. This is closely related to the multi graph process described in Janson et al. (1993). Despite these differences in the definition of the graph model, simplified cuckoo hashing is again successful if and only if the shadow graph (ii) of its cuckoo graph does not contain a component with more than one cycle.
In addition to the obvious implementation of the simplified algorithm that allows both storage positions of a key to be equal, there is a further method. Assume that the data structure contains 2m storage positions. For each key x, we assume that the hash value of the second hash function h_2(x) is selected uniformly at random among the set {1, 2, ..., 2m} \ {h_1(x)}. This can be achieved by using a function k(x) that maps x to a (pseudo) random number from 1 to 2m − 1 and setting h_2(x) = (h_1(x) + k(x)) mod 2m. As a small drawback, the evaluation of h_2(x) requires an additional addition. On the other hand, each key now always possesses two different storage positions and the corresponding cuckoo graph does not contain self-loops. Consequently, we expect increased performance. In fact, this can be verified by a theoretical analysis, see Kutzelnigg (2010). However, note that the actual behavior is very similar, especially if tables are getting full.
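The construction of a second position that always differs from the first one can be sketched as below. The sketch assumes cells numbered 0, ..., num_cells − 1 (instead of 1, ..., 2m as in the text), and the function name `second_position` is chosen for this illustration only.

```python
def second_position(h1_value, k_value, num_cells):
    """Return (h1 + k) mod num_cells, which differs from h1 for every admissible k.

    h1_value: first position, in {0, ..., num_cells - 1}.
    k_value:  offset in {1, ..., num_cells - 1}, e.g. a pseudo random value k(x).
    """
    if not 1 <= k_value <= num_cells - 1:
        raise ValueError("k_value must lie in {1, ..., num_cells - 1}")
    return (h1_value + k_value) % num_cells
```

Since the offset is never a multiple of the table size, the two positions cannot coincide, and as the offset runs through all admissible values every other cell is hit exactly once. Hence a uniform offset yields a uniform second position among the remaining cells.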
(ii) Given a directed graph, we obtain its shadow graph by replacing each edge by an undirected one. Note that we consider multi graphs, thus multiple edges that occur in this process are retained.

Results
Theorem 1 Consider a standard cuckoo hash data structure possessing two subtables of size m and holding n = 2αm keys, where α ∈ (0, 0.5) is fixed. Whenever an endless loop is detected, the last kicked-out element is placed in the stash of size s ≥ 1. Then, the construction succeeds with probability
1 − c(α, s) m^{-s-1} + O(m^{-s-2}). (1)
Hereby, c(α, s) ≠ 0 depends on α and s, but not on m, and it can be calculated explicitly with our method.
An essential difference of this result compared to (Kirsch et al., 2008, Theorem 1) is that we do not require additional steps to search for admissible keys. Further, our asymptotic result is much more precise.
The proof of Theorem 1 is split up into two parts. First, a proof that the described insertion strategy succeeds with probability 1 − O(m^{-s-1}) is given in Section 4. These considerations are a refinement of the original analysis that can be found in Kirsch et al. (2008). The further proof is based on a generating function approach on the cuckoo graph and can be found in Section 7. In particular, we prove that c(α, s) ≠ 0 holds. Moreover, we present a method to compute this coefficient, and we have implemented it with Maple. However, note that the calculations are limited by the available memory of the machine that executed the computer algebra system. Using a workstation equipped with 12 GB RAM, we were successful in solving the problem for s ∈ {0, 1, 2}.
Note that during a successful insertion procedure, each memory slot is visited at most twice. Thus, an insertion procedure in a table previously holding i keys is clearly unsuccessful if it takes more than 2i + 1 kick-outs. However, we can usually do much better and stop earlier, see Kutzelnigg (2009) for corresponding numerical data. On the other hand, there is a risk that we stop procedures that would have been successful. The probability can be bounded as follows: For each k, there exists a sufficiently large number β = β(k) that does not depend on m, such that the probability that there is an insertion that takes more than β log m steps is O(m^{-k}), see (Kirsch et al., 2008, Lemma 4 and (1)). Hence, (1) still holds if an insertion procedure is stopped and declared unsuccessful after β log m kick-outs, provided that β is chosen sufficiently large.
Furthermore, we can analyze the simplified version by similar methods, see Sections 5 and 7 for details. Note that we distinguish between the straightforward implementation and the version without self-loops as described in the previous section.
Theorem 2 Consider a simplified cuckoo hash table possessing 2m memory cells and holding n = 2αm keys, where α ∈ (0, 0.5) is fixed. Suppose that h_2 is implemented such that h_2(x) ≠ h_1(x) holds for all keys x. Whenever an endless loop is detected, the last kicked-out element is placed in the stash of size s ≥ 1. Then, the construction succeeds with probability
1 − c(α, s) m^{-s-1} + O(m^{-s-2}).
Hereby, c(α, s) ≠ 0 depends on α and s, but not on m, and it can be calculated explicitly with our method.
Similar to the considerations given above, it is again possible to stop presumably unsuccessful insertion procedures early.
Until now, we have successfully calculated the numbers c(α, s) for all s ≤ 8. Unfortunately, Theorem 2 does not cover the situation where self-loops are not excluded. However, we can analyze it with our generating function approach and obtain:
Theorem 3 Suppose that the conditions of Theorem 2 are fulfilled, except that the data structure is implemented in such a way that self-loops may occur. Further, let s ∈ {0, 1, ..., 8}. Then, the probability that the construction of a simplified cuckoo hash table succeeds equals
1 − ĉ(α, s) m^{-s-1} + O(m^{-s-2}).
Hereby, ĉ(α, s) ≠ 0 depends on α and s, but not on m, and it can be calculated explicitly with our method.
The proof of this result can be found in Section 7.

Insertion
The results given in this section hold for both types of cuckoo graphs. However, note that the original paper of Kirsch et al. (2008) considered the bipartite version only.

The insertion procedure of Kirsch et al. (2008)
Once an endless loop is discovered, it is required to search for an admissible key to be put in the stash, such that the remaining keys can be placed in the original hash table. Clearly, a key is acceptable if and only if the cuckoo graph does not possess a complex component after removing the corresponding edge. We continue with an observation stated in Kirsch et al. (2008), and we use the same notation as in that paper: Given the cuckoo graph G, we denote the total number of edges that closed a cycle when inserted by f(G). Further, the number of connected components containing at least one cycle is denoted by T(G). Consequently, the number of elements that could not be stored in the table itself and have to be placed in the stash equals f(G) − T(G). Moreover, we continue describing the modified insertion procedure used in Kirsch et al. (2008): Whenever an endless loop is detected, the last insertion step created a component possessing two cycles. We proceed searching for an edge of that component that is contained in one of these two cycles. This is done by counting how often a memory cell is accessed during an insertion, and thus it is quite expensive in terms of computation resp. memory. Next, we delete the selected edge from the graph and put the corresponding key into the stash. Note that an edge that closes a cycle on insertion in the original graph still closes a cycle if we have removed some edges as described above. Hence, we have found a way to remove exactly f(G) − T(G) edges, such that all conflicts are resolved. Clearly, removing fewer edges cannot repair all problems, and hence the solution has minimal cardinality.

Insertion without search for an admissible key
However, removing an edge that belongs to a cycle is not the only possible way to decrease f(G) − T(G). We can alternatively choose the edge such that T(G) increases. This corresponds to breaking up a component into two parts, neither of which may be a tree. We proceed using the following definition:
Definition 1 Consider a connected graph H. The excess e(H) is given by
e(H) = #edges(H) − #nodes(H).
Furthermore, for a graph G, we define
ẽ(G) = Σ_H max(e(H), 0),
where the sum is taken over all components H of G.
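The quantity ẽ(G) can be computed with a union-find pass over the edges. The following sketch assumes that the excess of a component is its number of edges minus its number of nodes and that ẽ(G) sums the positive parts of the excess over all components; the function name `excess_sum` and the 0-based node numbering are choices made for this illustration.

```python
def excess_sum(num_nodes, edges):
    """Sum of max(#edges - #nodes, 0) over all connected components.

    num_nodes: nodes are numbered 0, ..., num_nodes - 1.
    edges: list of pairs (u, w); multiple edges and self-loops are allowed.
    """
    parent = list(range(num_nodes))

    def find(v):
        # Find the representative of v, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in edges:
        parent[find(u)] = find(w)

    node_count = {}
    for v in range(num_nodes):
        root = find(v)
        node_count[root] = node_count.get(root, 0) + 1
    edge_count = {}
    for u, _ in edges:
        root = find(u)
        edge_count[root] = edge_count.get(root, 0) + 1

    # Trees have excess -1 and unicyclic components excess 0; both contribute 0.
    return sum(max(edge_count.get(root, 0) - n, 0) for root, n in node_count.items())
```

For instance, two nodes joined by five parallel edges give the value 3, whereas a path or a triangle gives 0.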
The insertion procedure is modified as follows: As long as no irresolvable conflict occurs in a basic cuckoo hash data structure, we perform the insertions using the original algorithm of Pagh and Rodler (2004) that is described in Section 1. However, each time we detect an endless loop, we put the last kicked-out key in the stash and delete the corresponding edge. More precisely, we may even select that key arbitrarily among the set of all keys that have been kicked out in the current insertion procedure, but it is of course convenient to choose the last evicted key. We then proceed in the usual way to insert further keys, until the next problem is discovered. Thus, we avoid a complicated cycle detection mechanism.
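A minimal sketch of this strategy for the simplified (one-table) variant without self-loops is given below. It is an illustration under stated assumptions and not the code used for the numerical results: the name `build_with_stash`, the dictionary `positions` that plays the role of the two hash functions, the use of `None` for empty cells, and the generous cut-off of 2 · (number of keys) + 1 kick-outs are all choices made for this example.

```python
def build_with_stash(num_cells, positions):
    """Insert all keys; keys that end an endless loop homeless go to the stash.

    positions: dict mapping each key to its pair (p1, p2) of cells, p1 != p2.
    Returns (table, stash); empty cells hold None.
    """
    table = [None] * num_cells
    stash = []
    # A successful insertion never needs more kick-outs than this bound.
    max_kicks = 2 * len(positions) + 1
    for key in positions:
        current, pos = key, positions[key][0]
        for _ in range(max_kicks + 1):
            # Place the current key and pick up the previous occupant, if any.
            current, table[pos] = table[pos], current
            if current is None:
                break
            p1, p2 = positions[current]
            pos = p2 if pos == p1 else p1  # alternative cell of the evicted key
        else:
            # Endless loop: the last kicked-out key is moved to the stash.
            stash.append(current)
    return table, stash
```

No search for a particular edge on a cycle takes place; the key that happens to be homeless when the cut-off is reached is stashed, and the construction simply continues with the next key.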
The remainder of this section is used to prove that this strategy is always successful. First, we say that an edge is disturbing if its insertion increases ẽ(G). Recall that we are considering labeled and thus ordered edges. Since ẽ(G) cannot decrease by inserting an additional edge (iii), ẽ(G) equals the number of disturbing edges. However, it is possible to decrease ẽ(G) by removing certain edges, see Lemma 1. Next, we consider the influence of removed edges on further disturbing edges:
Lemma 2 Let G′ denote a graph arising from G, where we have already removed ẽ(G) edges such that ẽ(G′) = 0 holds. Assume that a new edge g is disturbing in G′. Then, it is disturbing in G too.
Proof: We consider two different situations in G′, see Figure 1. First, suppose that g is a bridge in G′ that connects H′_1 and H′_2. By similar reasoning as in the proof of Lemma 1, neither of these components can be a tree, because otherwise g would not be a disturbing edge. Thus, e(H′_1) ≥ 0 and e(H′_2) ≥ 0 hold. Assume that g is a bridge in G that connects components H_1 ⊇ H′_1 and H_2 ⊇ H′_2. We conclude that e(H_1) ≥ e(H′_1) ≥ 0 and e(H_2) ≥ e(H′_2) ≥ 0 are fulfilled. Hence g is disturbing in G. The proof of the two further cases follows by a similar argument as above. □
Fig. 1: A disturbing edge g. We distinguish whether g is a bridge in G′ or not.
Lemma 3 Assume that we always perform the following steps whenever the insertion procedure has entered an endless loop: We stop at an arbitrarily selected moment, put the current homeless key in the stash of potentially unlimited capacity, and delete the corresponding edge. By doing so, the stash finally holds ẽ(G) keys, where G denotes the original cuckoo graph without deleted edges.
Fig. 2: The two possible types of bicyclic components.
(iii) Note that the number of nodes is fixed, and hence remains unchanged.
Proof: Consider the graph G′ that encodes the current state of the data structure. The algorithm enters an endless loop whenever a component containing two cycles is created. This can either happen by closing a further cycle in a unicyclic component or by connecting two different unicyclic components. Hence, the new edge is disturbing for G′. There exist two different typical situations, depicted in Figure 2. Each of the nodes can be considered as the root of an additional attached tree, not depicted in the figure. Observe that the last inserted edge must be contained in the drawn part. None of the keys belonging to one of the depicted edges has a possible storage position in these potentially attached trees. Hence the presence of these appendices has no influence on the current insertion. In particular, none of the keys of the attached trees can ever be kicked out in the current insertion operation. Note that the arbitrary selection and deletion of one of the drawn edges decreases ẽ, because of Lemma 1.
Note that the number of occurrences of a disturbing edge is bounded by ẽ(G), due to Lemma 2. Hence, we place at most ẽ(G) keys in the stash. On the other hand, this is the minimum number of elements that have to be stored outside the table. □
To prove Theorem 1, it hence remains to show that P(ẽ(G) ≥ s + 1) has asymptotic expansion (1). This is done in Section 7. Due to (Kirsch et al., 2008, (2)), the relation P(ẽ(G) ≥ s + 1) = O(m^{-s-1}) is already known. We just want to correct a minor mistake. When an insertion hits a cyclic component, several (and not only one) vertices might be visited twice. In particular, k + 1 is not a valid upper bound for the number of steps that an insertion requires in a component of size k, as claimed in (Kirsch et al., 2008, Lemma 1). However, 2k is a sufficient bound, see Devroye and Morin (2003) or Kutzelnigg (2009) for further details.

Simplified Cuckoo Hashing
Instead of segmenting the available memory, this modified version grants both hash functions access to the whole table. This variant was first mentioned in Pagh and Rodler (2004); a detailed analysis can be found in Kutzelnigg (2009, 2010). In particular, the simplified algorithm offers improved performance of search and insertion operations.
In this section, we prove that the data structure of Theorem 2 succeeds with probability 1 − O(m^{-s-1}). First, recall that all lemmata given in Section 4 did not require a bipartite structure of G. Hence, ẽ(G) still equals the number of keys that cannot be stored in the table itself. Recall that we consider the implementation without self-loops only. This is based on the fact that it is possible to adapt the analysis of Kirsch et al. (2008) to this version. Nevertheless, because of the results given in Section 7, we conjecture that the same bound holds for the other implementation too. Our proof is based on a stochastic dominance argument, and we repeat resp. adapt the following definitions and results from Kirsch et al. (2008): For a distribution D, let G(2m, D) denote the distribution over graphs with 2m nodes, obtained by sampling l ∼ D and inserting l edges independently. Hereby, start point and end point of each edge are selected uniformly and independently, except that no self-loops are allowed. Clearly, the cuckoo graph of the simplified variant has distribution G(2m, D) when D is concentrated at n.
Given two graphs G and G′ possessing the same set of vertices, we say that G′ ≥ G holds if each edge of G is also contained in G′. Additionally, we generalize this relation to t-tuples of graphs by applying it component-by-component, i.e. (G′_1, ..., G′_t) ≥ (G_1, ..., G_t) holds if and only if G′_i ≥ G_i holds for all 1 ≤ i ≤ t.
Definition 2 Let µ and ν be two probability measures over t-tuples of graphs with common vertex set. Then, µ stochastically dominates ν, in short notation µ ⪰ ν, if E_µ[g] ≥ E_ν[g] holds for all non-decreasing functions g.
Further, let Po(λ) denote the Poisson distribution with parameter λ.
Lemma 4 Assume that λ > 0 is fixed. Then, the conditional distribution of G ∼ G(2m, Po(λ)), conditioned on the property that G has at least n edges, stochastically dominates G(2m, n).
Proof: The proof follows the lines of (Kirsch et al., 2008, Lemma 3). Obviously, the conditional distribution of G under the assumption that there are exactly n edges is G(2m, n). Since k_1 > k_2 implies G(2m, k_1) ⪰ G(2m, k_2), the result follows. □
It is easy to see that the parameter λ can be chosen such that the probability that G(2m, Po(λ)) has fewer than n edges is exponentially small:
Lemma 5 (Kirsch et al. (2008)) Let ε′ > 0 be fixed and define λ = (1 + ε′)n. Then we obtain
P(Po(λ) < n) = e^{-Ω(n)}.
Thus, it remains to show that for G ∼ G(2m, Po(λ)) the relation
P(ẽ(G) ≥ s + 1) = O(m^{-s-1}) (2)
holds. For a node v, let C_v denote the connected component that contains v. Furthermore, let B_v denote the number of edges that belong to C_v and closed a cycle on insertion.
Lemma 6 Assume that ε′ < 1/(2α) − 1 and t, k, n ≥ 1 hold. Then, we get
Proof: The condition on ε′ ensures that λ = (1 + ε′)n is selected such that λ/m < 1 holds. As in the proof of (Kirsch et al., 2008, Lemma 5), we consider a breadth-first search starting at v. We bound the number of cycles that are created by visiting a new node by a Poisson random variable in each step. The only difference is that a single edge now corresponds to a Poisson random variable with parameter λ/ 2m m , instead of λ/m^2. Thus, we can apply (Kirsch et al., 2008, Lemma 6) with m replaced by 2m and obtain the claimed result. □
By the same reasoning as in (Devroye and Morin, 2003, Lemma 1), we obtain a corresponding result for the size of a component (Lemma 7). Lemmata 6 and 7 together with Chernoff's bound are sufficient to prove (2), see Kirsch et al. (2008).
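The exponential decay addressed by Lemma 5 is easy to observe numerically. The following sketch evaluates the probability that a Poisson variable with parameter λ = (1 + ε′)n falls below n; the function name `poisson_lower_tail` and the sample value ε′ = 0.5 in the usage note are assumptions made for illustration.

```python
import math


def poisson_lower_tail(lam, n):
    """Return P(X < n) for X ~ Po(lam), summing the point masses in log space."""
    total = 0.0
    for k in range(n):
        # log of e^{-lam} lam^k / k!, evaluated without overflow
        log_mass = -lam + k * math.log(lam) - math.lgamma(k + 1)
        total += math.exp(log_mass)
    return total
```

With ε′ = 0.5, comparing `poisson_lower_tail(75.0, 50)` and `poisson_lower_tail(150.0, 100)` shows how quickly the probability of sampling too few edges shrinks as n doubles.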

Enumerating complex graphs
In this section, we review resp. establish the generating functions that enable us to count the number of cuckoo graphs of certain type resp. asymptotic approximations of these numbers. First, let us consider node and edge labeled usual, but directed graphs. Let F denote a family of such graphs. Then, the corresponding bivariate generating function is the formal power series
F(x, v) = Σ_{G ∈ F} x^{m(G)} v^{n(G)} / (m(G)! n(G)!),
where m(G) denotes the number of nodes and n(G) the number of edges of G. However, it is usually simpler to consider a univariate generating function first. For instance, node labeled rooted trees are enumerated by the well-known tree function t(x) satisfying t(x) = x e^{t(x)}, see Flajolet and Sedgewick (2009). Since a tree possesses exactly one node more than edges, the corresponding bivariate generating function t(x, v) satisfies
t(x, v) = t(2xv) / (2v).
We thus slightly abuse notation by denoting the univariate and the bivariate generating function by the same letter; however, the correct interpretation should be clear from the context. Similarly, we define trivariate generating functions enumerating bipartite graphs:
F(x, y, v) = Σ_{G ∈ F} x^{m_1(G)} y^{m_2(G)} v^{n(G)} / (m_1(G)! m_2(G)! n(G)!),
where m_1(G) resp. m_2(G) denotes the number of nodes of first resp. second type. Again, we avoid using the edge-counting variable v whenever possible. Further, we make use of the notation [x^m]A(x) to extract the m-th coefficient of a power series A(x) = Σ_{k ≥ 0} a_k x^k, that is, [x^m]A(x) = a_m.
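The functional equation t(x) = x e^{t(x)} determines the coefficients of the tree function one by one, which can be checked with exact rational arithmetic. The sketch below iterates the equation on truncated power series; the function names are chosen for this illustration, and the comparison value m^{m-1} is Cayley's classical count of rooted labeled trees.

```python
from fractions import Fraction
from math import factorial


def series_exp(a, order):
    """exp of a power series a with a[0] == 0, truncated after x**order."""
    e = [Fraction(1)] + [Fraction(0)] * order
    for n in range(1, order + 1):
        # From E' = A' E:  n e_n = sum_{k=1..n} k a_k e_{n-k}
        e[n] = sum(k * a[k] * e[n - k] for k in range(1, n + 1)) / n
    return e


def tree_function(order):
    """Coefficients [x^0], ..., [x^order] of t(x), where t(x) = x exp(t(x))."""
    t = [Fraction(0)] * (order + 1)
    for _ in range(order):
        # Each pass of t <- x exp(t) fixes at least one further coefficient.
        t = [Fraction(0)] + series_exp(t, order)[:order]
    return t


def rooted_tree_count(m, coefficients):
    """Number of rooted labeled trees on m nodes, i.e. m! [x^m] t(x)."""
    return coefficients[m] * factorial(m)
```

For example, `rooted_tree_count(4, tree_function(6))` evaluates to 64 = 4^3.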

Usual graphs
In addition to the already mentioned function t(x) counting rooted trees, the function t̃(x) counting unrooted trees is well known too. In particular, the relation
t̃(x) = t(x) − t(x)^2 / 2
holds, see Flajolet and Sedgewick (2009). Using these functions, we describe the generating functions of graphs with excess r ≥ 0. Since the number of nodes is uniquely determined for all these types of graphs, we can concentrate on counting nodes. Thus univariate generating functions are sufficient, because we get the bivariate function afterwards by replacing x with 2xv and multiplying the function by an additional factor of (2v)^r. However, this can only be done if self-loops and multiple edges are compensated in the original construction. We follow the approach of Janson et al. (1993) and assign a graph with adjacency matrix A = (a_ij) the compensation factor
In particular, the following results hold for the general multi graph model:
Lemma 8 (Janson et al. (1993)) Assume that our graph might contain self-loops. Then, the generating function C(x) of a connected graph with exactly one cycle is given by
Further, let E_r(x) denote the generating function of graphs consisting of complex components only and having excess r, i.e. exactly r more edges than vertices. Then the relation
Hereby, the constants e_rd are given as
Since we also consider the situation where each key possesses two different storage positions, we also require the generating functions counting graphs without self-loops.
Definition 3 Let ϑ_x denote the differential operator x ∂/∂x that corresponds to marking a node.
Lemma 9 Assume that we consider a multi graph without self-loops. Then, the generating function C(x) of a connected graph with exactly one cycle is given by
Further, let E_r(x) denote the generating function of graphs consisting of complex components only and having excess r, i.e. exactly r more edges than vertices. These functions satisfy the differential recurrence
(3)
Moreover, E_0 = 1 holds, since only the empty graph is complex and has excess 0.
Note that the previously mentioned results are not explicitly given in Janson et al. (1993). However, the case where both self-loops and multiple edges are forbidden is considered in that paper. Thus, the proof of Lemma 9 follows immediately by combining these methods and results. Finally, using the differential recursion (3) and a computer algebra system, it is easy to obtain all generating functions that are required in the next section.

Bipartite graphs
In contrast to usual graphs, only few results concerning bipartite graphs are known in the literature. However, trees and unicyclic components have been studied recently, see also Gimenez et al. (2005):
Lemma 10 (Drmota and Kutzelnigg (2009)) The generating functions t_1(x, y) resp. t_2(x, y) of rooted bipartite trees possessing a root node of first resp. second type and the generating function of unrooted bipartite trees t(x, y) are given by t_1(x, y) = x e^{t_2(x,y)}, t_2(x, y) = y e^{t_1(x,y)} and t(x, y) = t_1(x, y) + t_2(x, y) − t_1(x, y) t_2(x, y).
The partial derivatives of the functions t(x, y) and t_1(x, y) are given by
∂t(x, y)/∂x = t_1(x, y) / x and ∂t_1(x, y)/∂x = t_1(x, y) / (x (1 − t_1(x, y) t_2(x, y))).
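The derivatives with respect to x can be obtained directly from the implicit equations of Lemma 10. The following is a sketch of that calculation, where t_1, t_2 and t are used as shorthand for t_1(x, y), t_2(x, y) and t(x, y); the derivatives with respect to y follow by symmetry.

```latex
\begin{align*}
\frac{\partial t_2}{\partial x}
  &= y e^{t_1}\,\frac{\partial t_1}{\partial x}
   = t_2\,\frac{\partial t_1}{\partial x},\\
\frac{\partial t_1}{\partial x}
  &= e^{t_2} + x e^{t_2}\,\frac{\partial t_2}{\partial x}
   = \frac{t_1}{x} + t_1 t_2\,\frac{\partial t_1}{\partial x}
   \quad\Longrightarrow\quad
   \frac{\partial t_1}{\partial x} = \frac{t_1}{x\,(1 - t_1 t_2)},\\
\frac{\partial t}{\partial x}
  &= (1 - t_2)\,\frac{\partial t_1}{\partial x} + (1 - t_1)\,\frac{\partial t_2}{\partial x}
   = \bigl(1 - t_2 + t_2 - t_1 t_2\bigr)\,\frac{\partial t_1}{\partial x}
   = \frac{t_1}{x}.
\end{align*}
```

In operator notation, the last line states that marking a node of first kind in an unrooted bipartite tree yields a tree rooted at a node of first kind, that is, x ∂t/∂x = t_1.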
Further, the generating function of a connected bipartite graph with exactly one cycle is given by
Concerning components with positive excess, we proceed as in Janson et al. (1993) and adapt the calculation to the present situation. Thereby, we make use of the following shortened notation:
Definition 4 Let ϑ_x denote the differential operator x ∂/∂x that corresponds to marking a vertex of first kind. Similarly, we define the operators ϑ_y and ϑ_v for marking a node of second kind resp. an edge.
Further, we obtain the following result:
Lemma 11 Let E = E(x, y, v) denote the generating function for the complex part of a bipartite graph, that is, all its components have positive excess. Then, the following equation holds:
(4)
Proof: Obviously, the left hand side of our equation represents all complex bipartite graphs having a marked edge with the edge count decreased by one. Thus, the right hand side should yield all ways in which the complex part can grow by one edge. In particular, the terms correspond to the following operations from left to right:
• The new edge connects two nodes of a unicyclic component and thus closes a further loop in it.
• It might also join two different unicyclic components, and hence a bicyclic component arises.
• Furthermore, the new edge might attach a tree to an already complex component. There are two distinct situations: the node of first kind can belong to either of the two involved components.
• Similar to the previous case, but we select a unicyclic component instead of a tree.
• Finally, we may connect two nodes belonging to the complex part.
Additionally, it is straightforward to verify (4) by adapting the calculations of Janson et al. (1993), but we skip this tedious and technical alternative proof. □
To solve the differential equation (4), we transform it into a recursion formula:
Lemma 12 Let E_r(x, y) denote the generating function of bipartite graphs consisting of complex components only and having excess r. These functions satisfy the partial differential recurrence
Moreover, E_0 = 1 holds, since only the empty graph is complex and has excess 0.
Proof: We start rewriting (4). Using Lemma 10, we obtain
Furthermore, it is straightforward to verify that
holds. Thus we get
x,y,v) ϑ_x ϑ_y e^{C(x,y,v)} E(x, y, v). (5)
Next we partition E(x, y, v) into summands of equal excess. Thus we write
(6)
Further, we immediately obtain
and similarly
(7)
holds. Hereby (ϑ_x E_r)(xv, yv) means ϑ_x E_r(x, y) with x and y subsequently replaced by xv and yv.
Finally, we plug (6) and (7) into (5), equate the coefficients of v^{r-1} on both sides and then set v = 1.
Thus we obtain the claimed recurrence, which completes the proof. □
Solving the recursion of Lemma 12 in general seems to be out of reach, but using a computer algebra system, it is quite easy to get some results. In particular, the solutions exhibit a certain pattern, hence it is possible to write down an ansatz and compute the coefficients. Thus, we get for instance

Detailed Asymptotic Expansions
In this section, we present a generating function approach that enables us to calculate detailed asymptotic expansions of the number of cuckoo graphs satisfying ẽ(G) ≤ s. Together with the number of all cuckoo graphs, we further obtain the percentage of graphs that fulfill the very same property. Hence, the proofs follow the same idea as the analysis of the success probability given in Drmota and Kutzelnigg (2009) and Kutzelnigg (2009).

Usual graphs
We concentrate on the case where self-loops are allowed. The second case can be treated in the same way, by just exchanging the generating functions. Thus, we obtain a similar result; however, the non-zero coefficients of the asymptotic expansion are different. First, it is straightforward to count all directed node and edge labeled multi graphs possessing 2m nodes and n edges:
#G_{2m,n} = (2m)^{2n}.
Second, we determine #G^r_{2m,n}, the number of all graphs containing a complex part with r more edges than nodes, using the generating functions given in the previous section. In particular, each of these graphs contains 2m − n + r (unrooted) tree components. This is easy to see, because the insertion of an edge reduces the number of tree components by one, except if it increases the excess of the complex part. Next, the graph contains a (possibly empty) set of unicyclic components. Hence we infer by elementary combinatorial constructions (see, e.g., Flajolet and Sedgewick (2009)):
#G^r_{2m,n} = (2m)! n! [x^{2m} v^n] t̃(x, v)^{2m-n+r} / (2m − n + r)! · e^{C(x,v)} E_r(x, v). (8)
We use Cauchy's formula to obtain an integral representation of that formula. This integral can be asymptotically evaluated using the saddle point method, see Lemma 13 that can be found in the appendix. For technical reasons, we set ε = 1 − n/m. Then, it turns out that the saddle point is given by x_0 = (1 − ε) e^{ε-1}. In particular, the result for the special case r = 0, which corresponds to an empty stash, is given in Drmota and Kutzelnigg (2009) and Kutzelnigg (2009). However, in the present situation, it is not sufficient to provide the second order term of the expansion only; we require several further terms. The calculation of these asymptotic expansions has been done with Maple in a semi-automatic way. In particular, we obtain
(9)
where the c^r_i(ε) depend on ε, but not on m. The first non-zero coefficient is given by
and in particular c^0_0(ε) = 1 holds. We conclude that the fraction of cuckoo graphs that possess a cyclic part of excess r is given by #G^r_{2m,n} / #G_{2m,n}. Then, the percentage of graphs possessing a single bicyclic component of excess one equals
Furthermore, the probability that ẽ(G) is less than or equal to s for a randomly selected graph G is given by
Σ_{r=0}^{s} #G^r_{2m,n} / #G_{2m,n}.
Moreover, for 0 < i ≤ 8 we verified Σ_{r=0}^{i} c^r_i(ε) = 0, using our Maple worksheets. Thus, for s ≤ 8 we obtain an expansion of the form claimed in Theorem 3. Further, we can replace ε by 1 − 2α + O(m^{-1}). We obtain similar results for all other s satisfying s ≤ 8, which completes the proof of Theorem 3.
Finally, we prove that the failure rate of a data structure possessing a stash of size s is in fact Θ(m^{-s-1}). That is, we show that c(α, s), as defined in Theorem 2, is not equal to zero. To do so, we consider a complex component of excess s + 1, consisting of exactly two nodes connected by s + 3 edges. Clearly, the bivariate resp. univariate generating function b of such a component is given by
b(x, v) = 2^{s+3} x^2 v^{s+3} / (2 (s + 3)!) resp. b(x) = x^2 / (2 (s + 3)!).
We further consider (8) once more, but we replace E by b and repeat the application of the saddle point method. Similarly to (10), we obtain that the probability that exactly one such component occurs in the cuckoo graph, while all other components are either trees or unicyclic, is asymptotically given by
b(x_0) t(x_0)^{s+1} / (2^{s+1} m^{s+1}) ≠ 0.
Since the algorithm fails in the current situation, the proof of Theorem 2 is completed.
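The quantity ẽ(G) can also be explored empirically. The following Python sketch is not part of the original analysis. It assumes that ẽ(G) is the sum of (edges − nodes) over all components that contain a cycle, and that a random cuckoo graph for the usual (non-bipartite) case is obtained by drawing both endpoints of each of the n edges uniformly from 2m nodes, self-loops allowed. All function names are hypothetical.

```python
import random


def excess_of_complex_part(num_nodes, edges):
    """Sum of (edges - nodes) over all components containing a cycle."""
    parent = list(range(num_nodes))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    nodes, edge_count = {}, {}
    for u in range(num_nodes):
        r = find(u)
        nodes[r] = nodes.get(r, 0) + 1
    for u, _ in edges:
        r = find(u)
        edge_count[r] = edge_count.get(r, 0) + 1

    # Trees have excess -1 and are skipped; unicyclic components add 0.
    return sum(max(edge_count.get(r, 0) - nodes[r], 0) for r in nodes)


def stash_distribution(m, n, trials, seed=0):
    """Empirical distribution of the excess for random graphs (assumed model)."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        edges = [(rng.randrange(2 * m), rng.randrange(2 * m)) for _ in range(n)]
        r = excess_of_complex_part(2 * m, edges)
        counts[r] = counts.get(r, 0) + 1
    return counts


# The component used in the proof: two nodes joined by s + 3 edges has excess s + 1.
s = 2
print(excess_of_complex_part(2, [(0, 1)] * (s + 3)))
print(stash_distribution(m=200, n=160, trials=100))
```

The first call illustrates why the two-node component forces a stash of size s + 1: it holds s + 3 keys but offers only two cells.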

Bipartite graphs
In general, this proof follows the same idea as the previous one. However, things are a bit more complicated because of the bivariate generating functions that capture the bipartite structure of the cuckoo graph. Again, we start by counting all graphs without restrictions on the type of their components. Let G_{m1,m2,n} denote the set of all vertex and edge labeled bipartite multigraphs with m1 nodes of the first type, m2 nodes of the second type, and |E| = n edges. By definition, the number of all graphs having m nodes of each type and n edges is given by #G_{m,m,n} = m^{2n}.
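For very small parameters, the count #G_{m,m,n} = m^{2n} can be confirmed by brute force. The sketch below is only an illustration; the function name is hypothetical, and it assumes that each of the n labeled edges independently joins one of the m nodes of the first type to one of the m nodes of the second type.

```python
from itertools import product


def count_bipartite_multigraphs(m, n):
    """Enumerate all placements of n labeled edges between m left and m right nodes."""
    endpoints = list(product(range(m), range(m)))  # (left node, right node)
    return sum(1 for _ in product(endpoints, repeat=n))


for m, n in [(1, 3), (2, 2), (3, 2), (2, 3)]:
    # Each labeled edge has m * m choices, hence (m^2)^n = m^(2n) graphs.
    assert count_bipartite_multigraphs(m, n) == m ** (2 * n)
```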
We proceed by counting all bipartite graphs possessing a complex part of excess r. As in the univariate situation, such a graph further contains 2m − n + r unrooted tree components and a (possibly empty) set of unicyclic components. An asymptotic expansion of the resulting number can be derived by applying a double saddle point method, see Lemma 14 and Drmota and Kutzelnigg (2009). The saddle point is given by x_0 = y_0 = (n/m) e^{−n/m}. All calculations have again been done with Maple, and we obtain an expansion as in (9) that satisfies the same properties, but the non-zero coefficients are different. Finally, the probability that the construction of a data structure requires a stash of at most s items is given by the sum of the corresponding fractions over r = 0, ..., s, and we obtain the claimed results, setting again ε = 1 − n/m. Finally, we use the same approach as for the usual graph to construct a cuckoo graph that requires a stash of size s + 1 and occurs with probability Θ(m^{−s−1}).

From the data given in all three tables, we additionally observe the following properties: For fixed s > 0 and m, the percentage of hash tables that demand a stash of size s increases as the load factor α increases. On the other hand, increasing m while holding α constant decreases the expected number of items in the stash. Note that these numerical results are in good accordance with the theoretical analysis. We conclude that an additional memory of small size is sufficient to reduce the failure rate of cuckoo hashing drastically. However, small table sizes and/or load factors close to the limit load of 0.5 demand special attention.
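The relation between the excess r and the number of tree components, namely 2m − n + r, gives a simple way to reproduce such observations in a small experiment. The following sketch is an illustration under stated assumptions, not the program used for the tables: it draws both hash values uniformly at random, assumes n = ⌊2αm⌋ keys for load factor α, and takes the required stash size to be n − 2m + (number of tree components, isolated nodes included). All names are hypothetical.

```python
import random


def stash_size_bipartite(m, keys_h1, keys_h2):
    """Excess of the complex part, computed as n - 2m + #tree components."""
    parent = list(range(2 * m))  # left nodes 0..m-1, right nodes m..2m-1

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for a, b in zip(keys_h1, keys_h2):
        ra, rb = find(a), find(m + b)
        if ra != rb:
            parent[ra] = rb

    nodes = [0] * (2 * m)
    edges = [0] * (2 * m)
    for u in range(2 * m):
        nodes[find(u)] += 1
    for a in keys_h1:
        edges[find(a)] += 1

    # A component is a tree exactly if it has one edge fewer than nodes.
    trees = sum(1 for r in range(2 * m)
                if nodes[r] > 0 and edges[r] == nodes[r] - 1)
    return len(keys_h1) - 2 * m + trees


def mean_stash(m, alpha, trials, seed=0):
    """Average stash size over random instances (assumed model, n = floor(2*alpha*m))."""
    rng = random.Random(seed)
    n = int(2 * alpha * m)
    total = 0
    for _ in range(trials):
        h1 = [rng.randrange(m) for _ in range(n)]
        h2 = [rng.randrange(m) for _ in range(n)]
        total += stash_size_bipartite(m, h1, h2)
    return total / trials


for alpha in (0.30, 0.40, 0.48):
    print(alpha, mean_stash(m=200, alpha=alpha, trials=200))
```

Running the loop for several values of m and α gives a rough impression of the trends described above; the sample sizes here are far too small to resolve the rare events behind the higher-order terms.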

Summary and Conclusion
We showed that it is possible to use a stash without a complicated cycle-detection mechanism to determine an admissible key that can be placed in the stash. This is because we have proved that it is sufficient to put the last kicked-out key in the stash whenever an otherwise unresolvable situation occurs. The new insertion procedure thus offers better performance than the modified version required in Kirsch et al. (2008). As a further advantage, it is possible to break up large clusters and hence speed up further insertions. Further, we adapted the analysis to simplified cuckoo hashing, a variant of the original algorithm that grants both hash functions access to the whole table. Finally, we presented a method to obtain exact asymptotic expansions of the failure rate. All these results extend the original analysis given in Kirsch et al. (2008) and verify that this algorithm offers very interesting properties.
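The insertion rule summarized above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation analyzed in the paper: the class name, the salted use of Python's built-in hash as a stand-in for h_1 and h_2, and the bound max_loop used to declare a situation unresolvable are all assumptions. The sketch uses the standard variant with two tables of size m; None is reserved to mark empty cells.

```python
import random


class CuckooHashWithStash:
    """Standard cuckoo hashing with two tables and a small stash (sketch)."""

    def __init__(self, m, stash_size, max_loop=100, seed=0):
        rng = random.Random(seed)
        self.m = m
        self.tables = [[None] * m, [None] * m]
        # Two salts emulate two independent hash functions (assumption).
        self.salts = (rng.getrandbits(64), rng.getrandbits(64))
        self.stash = []
        self.stash_size = stash_size
        self.max_loop = max_loop  # hypothetical bound on the number of kick-outs

    def _pos(self, which, key):
        return hash((self.salts[which], key)) % self.m

    def lookup(self, key):
        return (self.tables[0][self._pos(0, key)] == key
                or self.tables[1][self._pos(1, key)] == key
                or key in self.stash)

    def insert(self, key):
        if self.lookup(key):
            return True
        which = 0
        for _ in range(self.max_loop):
            pos = self._pos(which, key)
            # Place the current key and pick up the previous occupant, if any.
            key, self.tables[which][pos] = self.tables[which][pos], key
            if key is None:
                return True
            which = 1 - which
        # No cycle detection: the last kicked-out key simply goes to the stash.
        if len(self.stash) < self.stash_size:
            self.stash.append(key)
            return True
        return False  # stash full; a real implementation would rehash here


table = CuckooHashWithStash(m=16, stash_size=2)
for k in range(10):
    table.insert(k)
print(len(table.stash), all(table.lookup(k) for k in range(10)))
```

Note that the key ending up in the stash is in general not the key that was originally inserted, which is exactly the simplification discussed above.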
As future work, we suggest a detailed analysis of the partial differential recursion of the generating functions of complex bipartite graphs with positive excess. Using these results, it will be possible to study the structure of these graphs.

A Asymptotic Expansions via the Saddle Point Method
This appendix provides results that are required to infer asymptotic expansions of the coefficients of generating functions counting the number of cuckoo graphs without "bad" components. These results can be obtained using a saddle point approach; see, e.g., Drmota (1994); Flajolet and Sedgewick (2009); Gardy (1995); Good (1957) for details concerning this method and Kutzelnigg (2009) for full proofs of the lemmata. Note that further coefficients of the asymptotic expansions can be calculated in the same way, but the expressions are so complicated that it does not make sense to provide them outside a computer algebra system. A Maple worksheet is available on request from the author.
Lemma 13 Let f(x) and g(x) be analytic functions locally around 0 such that all coefficients [x^m]f(x) and [x^m]g(x) are non-negative, f(0) = 0, and such that the "aperiodicity condition" gcd{m | [x^m]f(x) > 0} = 1 holds.
Let R be a compact interval of the positive real line that is contained in the radius of convergence of f(x) and g(x). Furthermore, set S = { x f′(x)/f(x) : x ∈ R }. Then an asymptotic expansion of [x^m] g(x) f(x)^k holds uniformly for m/k ∈ S, where x_0 is uniquely determined by m/k = x_0 f′(x_0)/f(x_0), and the constants κ_2 and H are expressed in terms of the cumulants κ_i and κ̄_i.
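A quick numerical experiment illustrates the kind of statement made in Lemma 13. The sketch below is an assumption-laden illustration, not a reproduction of the lemma: it uses only the standard leading saddle-point term g(x_0) f(x_0)^k / (x_0^m sqrt(2π k κ_2)) with g = 1, takes f to be the tree function t(x) = x e^{t(x)}, and compares against the exact coefficient [x^m] t(x)^k = (k/m) m^{m−k}/(m−k)! known from Lagrange inversion. For this choice, t(x_0) = 1 − k/m and κ_2 = t(x_0)/(1 − t(x_0))^3. The function names are hypothetical.

```python
import math


def exact_log_coeff(m, k):
    """log of [x^m] t(x)^k = (k/m) * m^(m-k) / (m-k)!, valid for 1 <= k < m."""
    j = m - k
    return math.log(k / m) + j * math.log(m) - math.lgamma(j + 1)


def saddle_point_log_coeff(m, k):
    """log of the leading term f(x0)^k / (x0^m * sqrt(2*pi*k*kappa2)), g = 1."""
    t0 = 1.0 - k / m               # solves x0 f'(x0)/f(x0) = 1/(1 - t0) = m/k
    x0 = t0 * math.exp(-t0)        # because t = x * exp(t)
    kappa2 = t0 / (1.0 - t0) ** 3  # second cumulant of log f(x0 * e^u) at u = 0
    return (k * math.log(t0) - m * math.log(x0)
            - 0.5 * math.log(2 * math.pi * k * kappa2))


for m, k in [(50, 30), (500, 300), (5000, 3000)]:
    ratio = math.exp(exact_log_coeff(m, k) - saddle_point_log_coeff(m, k))
    print(m, k, ratio)  # the ratio approaches 1 as k grows
```

The deviation of the ratio from 1 corresponds to the correction terms of the expansion, which is why several further coefficients are needed in the analysis of the stash.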
Let R_1 and R_2 be compact intervals of the positive real line such that R = R_1 × R_2 is contained in the regions of convergence of f(x, y) and g(x, y). Furthermore set S = { ... : (x, y) ∈ R }. Then we have [x^{m_1} y^{m_2}] g(x, y) f(x, y)^k = g(x_0, y_0) f(x_0, y_0)^k / (2π x_0^{m_1} y_0^{m_2} k ...).
and only if the removal of that edge does not create a new tree.

Proof: Denote the component that contains g by H. First, if g is contained in a cycle, e(H \ g) = e(H) − 1 holds. Thus ẽ decreases by one, except if the excess of H equals 0. But in the latter case, H \ g is a new tree. Second, if g is not contained in a cycle, it is a bridge connecting H_1 and H_2. Hence, the relation e(H) = e(H_1) + e(H_2) + 1 is satisfied. If both H_1 and H_2 are trees, H is also a tree and none of these components influences the calculation of ẽ. If only one component is a tree, say H_2, e(H) = e(H_1) holds and ẽ remains again unchanged. However, if both H_1 and H_2 possess nonnegative excess, we obtain that ẽ decreases by one.

Tab. 3: Stash sizes required for simplified cuckoo hashing possessing a table of size 2m and holding αm keys. The data structure is implemented such that both storage locations of a key are surely different (i.e., without self-loops).