Philippe Flajolet ; Éric Fusy ; Olivier Gandouet ; Frédéric Meunier - HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm

dmtcs:3545 - Discrete Mathematics & Theoretical Computer Science, January 1, 2007, DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07) - https://doi.org/10.46298/dmtcs.3545
HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm

Authors: Philippe Flajolet 1; Éric Fusy 1; Olivier Gandouet 2; Frédéric Meunier 1

This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, dedicated to estimating the number of \emphdistinct elements (the cardinality) of very large data ensembles. Using an auxiliary memory of m units (typically, "short bytes''), HYPERLOGLOG performs a single pass over the data and produces an estimate of the cardinality such that the relative accuracy (the standard error) is typically about $1.04/\sqrt{m}$. This improves on the best previously known cardinality estimator, LOGLOG, whose accuracy can be matched by consuming only 64% of the original memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond $10^9$ with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.


Volume: DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07)
Section: Proceedings
Published on: January 1, 2007
Imported on: May 10, 2017
Keywords: cardinality estimation,Probabilistic algorithm,[INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS],[INFO.INFO-DM] Computer Science [cs]/Discrete Mathematics [cs.DM],[MATH.MATH-CO] Mathematics [math]/Combinatorics [math.CO],[INFO.INFO-CG] Computer Science [cs]/Computational Geometry [cs.CG]

Linked publications - datasets - softwares

Source : ScholeXplorer IsRelatedTo ARXIV 1612.02284
Source : ScholeXplorer IsRelatedTo DOI 10.48550/arxiv.1612.02284
  • 10.48550/arxiv.1612.02284
  • 1612.02284
LogLog-Beta and More: A New Algorithm for Cardinality Estimation Based on LogLog Counting

131 Documents citing this article

Consultation statistics

This page has been seen 1016 times.
This article's PDF has been downloaded 1332 times.