This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, dedicated to estimating the number of \emph{distinct} elements (the cardinality) of very large data ensembles. Using an auxiliary memory of $m$ units (typically, ``short bytes''), HYPERLOGLOG performs a single pass over the data and produces an estimate of the cardinality whose relative accuracy (the standard error) is typically about $1.04/\sqrt{m}$. This improves on the best previously known cardinality estimator, LOGLOG, whose accuracy HYPERLOGLOG can match while consuming only 64% of the memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond $10^9$ with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.
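
To make the estimator concrete, the following is a minimal sketch of a HyperLogLog-style counter in Python, assuming a 64-bit hash and $m = 2^b$ registers. The class name, parameter names, and the choice of SHA-1 as the hash function are illustrative assumptions, not taken from the paper, and the large-range correction of the full algorithm is omitted.

\begin{verbatim}
# Minimal HyperLogLog-style sketch (illustrative; names are not from the paper).
import hashlib
import math

class HyperLogLog:
    def __init__(self, b=10):
        self.b = b                      # first b bits of the hash pick a register
        self.m = 1 << b                 # m = 2^b registers
        self.registers = [0] * self.m   # each register keeps a maximum "rank"
        # Bias-correction constant alpha_m (asymptotic formula, valid for m >= 128).
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # 64-bit hash of the item (SHA-1 truncated to 8 bytes, as an assumption).
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        j = h >> (64 - self.b)                  # register index: top b bits
        w = h & ((1 << (64 - self.b)) - 1)      # remaining 64 - b bits
        # rank = 1-based position of the leftmost 1-bit in w
        # (if w is all zeros, rank = 64 - b + 1 by convention).
        rank = (64 - self.b) - w.bit_length() + 1
        self.registers[j] = max(self.registers[j], rank)

    def estimate(self):
        # Raw estimate: normalized harmonic mean of 2^registers.
        z = sum(2.0 ** -r for r in self.registers)
        e = self.alpha * self.m * self.m / z
        if e <= 2.5 * self.m:                   # small-range (linear counting) correction
            v = self.registers.count(0)
            if v:
                e = self.m * math.log(self.m / v)
        return e
\end{verbatim}

As a usage illustration, with $b=10$ (so $m=1024$ registers) the expected standard error is about $1.04/\sqrt{1024} \approx 3.3\%$; inserting the items \texttt{"user-0"} through \texttt{"user-99999"} and calling \texttt{estimate()} should typically return a value within a few percent of $10^5$.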