Frédéric Giroire - Order statistics and estimating cardinalities of massive data sets

dmtcs:3353 - Discrete Mathematics & Theoretical Computer Science, January 1, 2005, DMTCS Proceedings vol. AD, International Conference on Analysis of Algorithms - https://doi.org/10.46298/dmtcs.3353
Order statistics and estimating cardinalities of massive data setsArticle

Authors: Frédéric Giroire ORCID1

  • 1 Algorithms

We introduce a new class of algorithms to estimate the cardinality of very large multisets using constant memory and doing only one pass on the data. It is based on order statistics rather that on bit patterns in binary representations of numbers. We analyse three families of estimators. They attain a standard error of $\frac{1}{\sqrt{M}}$ using $M$ units of storage, which places them in the same class as the best known algorithms so far. They have a very simple internal loop, which gives them an advantage in term of processing speed. The algorithms are validated on internet traffic traces.


Volume: DMTCS Proceedings vol. AD, International Conference on Analysis of Algorithms
Section: Proceedings
Published on: January 1, 2005
Imported on: May 10, 2017
Keywords: traffic analysis,cardinality,estimates,very large multiset,[INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS],[INFO.INFO-DM] Computer Science [cs]/Discrete Mathematics [cs.DM],[MATH.MATH-CO] Mathematics [math]/Combinatorics [math.CO],[INFO.INFO-CG] Computer Science [cs]/Computational Geometry [cs.CG],[INFO.INFO-HC] Computer Science [cs]/Human-Computer Interaction [cs.HC]

3 Documents citing this article

Consultation statistics

This page has been seen 320 times.
This article's PDF has been downloaded 207 times.