4 Models for the performance analysis and the control of networks
In this paper, we discuss the problem of estimating the number of "elephants'' in a stream of IP packets. First, the problem is formulated in the context of multisets. Next, we explore some of the theoretical space complexity of this problem, and it is shown that it cannot be solved with less than $\Omega (n)$ units of memory in general, $n$ being the number of different elements in the multiset. Finally, we describe an algorithm, based on Durand-Flajolet's LOGLOG algorithm coupled with a thinning of the packet stream, which returns an estimator of the number of elephants using a small amount of memory. This algorithm allows a good estimation for particular families of random multiset. The mean and variance of this estimator are computed. The algorithm is then tested on synthetic data.