This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, dedicated to estimating the number of \emph{distinct} elements (the cardinality) of very large data ensembles. Using an auxiliary memory of $m$ units (typically, ``short bytes''), HYPERLOGLOG performs a single pass over the data and produces an estimate of the cardinality such that the relative accuracy (the standard error) is typically about $1.04/\sqrt{m}$. This improves on the best previously known cardinality estimator, LOGLOG, whose accuracy can be matched by consuming only 64% of the original memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond $10^9$ with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.
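The figures quoted above can be illustrated with a minimal Python sketch. It is not the authors' reference implementation, and several choices are assumptions made for illustration: the class name `HyperLogLog`, the parameter `b` (with $m = 2^b$ registers), the use of a 64-bit prefix of SHA-1 as the hash function, the bias constant $\alpha_m \approx 0.7213/(1 + 1.079/m)$ for $m \ge 128$, and a linear-counting fallback for small cardinalities. With $m = 2048$ registers of 6 bits each (one reading that is consistent with the 1.5 kilobyte figure), the standard error $1.04/\sqrt{m}$ comes to about 2.3%. The 64% figure follows from comparing against the standard error of LOGLOG, about $1.30/\sqrt{m}$: $(1.04/1.30)^2 = 0.64$.

```python
import hashlib
import math


def standard_error(m):
    """Typical relative accuracy of HyperLogLog with m registers."""
    return 1.04 / math.sqrt(m)


def memory_ratio_vs_loglog():
    """Fraction of LogLog's memory needed to match its accuracy.

    Assumes LogLog's standard error is about 1.30 / sqrt(m).
    """
    return (1.04 / 1.30) ** 2


class HyperLogLog:
    """Illustrative sketch only; names and hash choice are assumptions."""

    HASH_BITS = 64  # assumption: 64-bit hash, so no large-range correction

    def __init__(self, b=11):
        if b < 7:
            raise ValueError("this sketch uses the alpha formula for m >= 128")
        self.b = b
        self.m = 1 << b                      # number of registers
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def _hash(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        h = self._hash(item)
        tail_bits = self.HASH_BITS - self.b
        index = h >> tail_bits               # first b bits pick a register
        tail = h & ((1 << tail_bits) - 1)    # remaining bits
        # Rank = position of the leftmost 1-bit in the tail (1-based);
        # an all-zero tail gets rank tail_bits + 1.
        rank = tail_bits - tail.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def merge(self, other):
        """Combine two sketches built with the same b (register-wise max)."""
        if self.b != other.b:
            raise ValueError("sketches must use the same number of registers")
        self.registers = [max(x, y) for x, y in zip(self.registers, other.registers)]

    def estimate(self):
        # Normalized harmonic mean of 2^register over all registers.
        harmonic_sum = sum(2.0 ** -r for r in self.registers)
        raw = self.alpha * self.m * self.m / harmonic_sum
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros > 0:
            # Small-range fallback: linear counting on empty registers.
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    hll = HyperLogLog(b=11)
    for i in range(100_000):
        hll.add(i)
    print("registers:", hll.m, "memory at 6 bits each:", hll.m * 6 // 8, "bytes")
    print("standard error:", round(standard_error(hll.m), 4))
    print("estimate:", round(hll.estimate()))
```

The `merge` method reflects why the algorithm parallelizes well: sketches built independently over disjoint parts of the data can be combined by taking the register-wise maximum, which yields the same registers as a single pass over all the data.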
;Simone Ciccolella
;Luca Denti
;Raghuram Raghuram Dandinasivara;Gianluca Della Vedova
;et al., 2025, PanSpace: Fast and Scalable Indexing for Massive Bacterial Databases, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2025.03.19.644115, https://doi.org/10.1101/2025.03.19.644115.
;Ruifeng Sun;Jianhang Liu
, 2025, MLDDoS: a distributed denial of service attack detection method using multi-level sketch, The Journal of Supercomputing, 81, 2, 10.1007/s11227-025-06942-3.
;Guanyu Li
;Cheng Guo
;Renyu Yang
;Shicheng Wang
;et al., 2025, SuperFE: A Scalable and Flexible Feature Extractor for ML-based Traffic Analysis Applications, Proceedings of the Twentieth European Conference on Computer Systems, pp. 818-834, 10.1145/3689031.3696081.
;Lu Tang
, 2025, Streaming Algorithms in Network Measurement, Engineering cyber-physical systems and critical infrastructures, pp. 53-76, 10.1007/978-3-031-83149-2_3.
;Jingya Wu
;Wenyan Lu;Huawei Li
;Xiaowei Li
;et al., 2025, FUS: FPGA-based Universal Sketch with homogeneous and heterogeneous memory architectures, CCF Transactions on High Performance Computing, 7, 3, pp. 275-290, 10.1007/s42514-025-00222-5.
;Jason Fan
;Rob Patro
, 2024, Where the patterns are: repetition-aware compression for colored de Bruijn graphs⋆, PubMed Central, 10.1101/2024.07.09.602727, https://www.ncbi.nlm.nih.gov/pmc/articles/11257547.
;Yun William Yu
, 2024, Secure Federated Boolean Count Queries Using Fully-Homomorphic Cryptography, Lecture notes in computer science, pp. 54-67, 10.1007/978-1-0716-3989-4_4.
, 2024, SoK: Applications of Sketches and Rollups in Blockchain Networks, IEEE Transactions on Network and Service Management, 21, 3, pp. 3194-3208, 10.1109/tnsm.2024.3372604.
;Asoke Datta
;Yesdaulet Izenov
;Florin Rusu
, 2024, Approximate Sketches, Proceedings of the ACM on Management of Data, 2, 1, pp. 1-24, 10.1145/3639321, https://doi.org/10.1145/3639321.
;Tianfan Zhang
;Li Wang;Yibo Xiao;Chao Yang
;et al., 2024, MpScope: Enabling multi-pipeline monitoring inside a switch, Computer Networks, 254, pp. 110764, 10.1016/j.comnet.2024.110764.
, 2024, CARBINE: Exploring Additional Properties of HyperLogLog for Secure and Robust Flow Cardinality Estimation, IEEE INFOCOM 2024 - IEEE Conference on Computer Communications, pp. 471-480, 10.1109/infocom52122.2024.10621185.
;Ozlem Kesgin;Noa Zilberman
, 2024, INDDoS+: Secure DDoS Detection Mechanism in Programmable Switches, 2024 IEEE 25th International Conference on High Performance Switching and Routing (HPSR), pp. 197-202, 10.1109/hpsr62440.2024.10635967.
;Wenhao Wu
;Wang Zhaohua
;Zhenyu Li
;Jianwei Niu
, 2024, SAROS: A Self-Adaptive Routing Oblivious Sampling Method for Network-wide Heavy Hitter Detection, Proceedings of the 8th Asia-Pacific Workshop on Networking, pp. 142-148, 10.1145/3663408.3663429.
;Jason Fan
;Rob Patro
, 2024, Meta-colored Compacted de Bruijn Graphs, ARCA (Università Ca' Foscari Venezia), pp. 131-146, 10.1007/978-1-0716-3989-4_9, https://hdl.handle.net/10278/5060402.
, 2024, Enhancing Accuracy for Super Spreader Identification in High-Speed Data Streams, Proceedings of the VLDB Endowment, 17, 11, pp. 3124-3137, 10.14778/3681954.3681988.
;Zhengyan Zhou
;Xinyang Chen
;Pengpai Shi;Yanni Wu;et al., 2024, CardSketch: Shift Attention for Network-wide Cardinality Telemetry, 2024 IEEE 49th Conference on Local Computer Networks (LCN), pp. 1-7, 10.1109/lcn60385.2024.10639631.
;Bernhard Y. Renard
, 2024, Fast and space-efficient taxonomic classification of long reads with hierarchical interleaved XOR filters, PubMed Central, 34, 6, pp. 914-924, 10.1101/gr.278623.123, https://www.ncbi.nlm.nih.gov/pmc/articles/11293544.
;Runlin Lei
;Sibo Wang
;Zhewei Wei
;Bolin Ding
, 2024, Learning-based Property Estimation with Polynomials, Proceedings of the ACM on Management of Data, 2, 3, pp. 1-27, 10.1145/3654994.
;Xiaofei Zhao
;Jean Pierre-Both;Konstantinos T. Konstantinidis
, 2024, BinDash 2.0: New MinHash Scheme Allows Ultra-fast and Accurate Genome Search and Comparisons, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2024.03.13.584875, https://doi.org/10.1101/2024.03.13.584875.
;Qi Liang
;Jing Shao
;Qile Wang
;Yilin Zhao
;et al., 2024, P2Sketch: Finding Persistent Items in Data Streams Based on Periodic Arrival, 2024 IEEE International Performance, Computing, and Communications Conference (IPCCC), pp. 1-8, 10.1109/ipccc59868.2024.10850146.
;Zhen Gao
;Pedro Reviriego
;Shanshan Liu
;Fabrizio Lombardi
, 2024, Dependability of the K Minimum Values Sketch: Protection and Comparative Analysis, IEEE Transactions on Computers, 74, 1, pp. 210-221, 10.1109/tc.2024.3475588.
;Chao Song
;Haipeng Dai
;Li Lu
;Ming Liu
, 2024, Compact Estimator for Streaming Triangle Counting, IEEE Transactions on Knowledge and Data Engineering, 36, 8, pp. 3712-3724, 10.1109/tkde.2024.3371228.
;Jianyu Wu
;Tong Yang
, 2024, HoppingTimer: A Near-optimal Framework for Basic Estimation of Data Streams in Hopping Windows, 2024 International Scientific and Technical Conference Modern Computer Network Technologies (MoNeTeC), pp. 1-13, 10.1109/monetec60984.2024.10768166.
;Jana Giceva
;Thomas Neumann
;Viktor Leis
, 2024, High-Performance Query Processing with NVMe Arrays: Spilling without Killing Performance, Proceedings of the ACM on Management of Data, 2, 6, pp. 1-27, 10.1145/3698813.
;Keran Kocher
;Ella Kummer;Anupama Unnikrishnan
, 2024, Deletions and Dishonesty: Probabilistic Data Structures in Adversarial Settings, Lecture notes in computer science, pp. 137-168, 10.1007/978-981-96-0894-2_5.
;Xin Yuan
;Shuo Wang
;Hongsheng Hu
;Minhui Xue
, 2024, Cardinality Counting in "Alcatraz": A Privacy-aware Federated Learning Approach, Proceedings of the ACM Web Conference 2024, pp. 3076-3084, 10.1145/3589334.3645655.
, 2024, UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting, Proceedings of the VLDB Endowment, 17, 7, pp. 1655-1668, 10.14778/3654621.3654632.
;Jan Kučera
, 2024, Windower: Feature Extraction for Real-Time DDoS Detection Using Machine Learning, NOMS 2024-2024 IEEE Network Operations and Management Symposium, 1, pp. 1-10, 10.1109/noms59830.2024.10575699.
;Pinghui Wang
;Rundong Li
;Junzhou Zhao
;Junlan Feng
;et al., 2024, A Compact and Accurate Sketch for Estimating a Large Range of Set Difference Cardinalities, 2024 IEEE 40th International Conference on Data Engineering (ICDE), pp. 1338-1351, 10.1109/icde60146.2024.00110.
;Dongdong Xie
;Junzhou Zhao
;Jinsong Li
;Zhicheng Li
;et al., 2024, Half-Xor: A Fully-Dynamic Sketch for Estimating the Number of Distinct Values in Big Tables, IEEE Transactions on Knowledge and Data Engineering, 36, 7, pp. 3111-3125, 10.1109/tkde.2024.3359710.
;Yitong Liu
;Zhicheng Li
;Rundong Li
, 2024, An LDP Compatible Sketch for Securely Approximating Set Intersection Cardinalities, Proceedings of the ACM on Management of Data, 2, 1, pp. 1-27, 10.1145/3639281.
;Paulo Gonçalves
;Rémi Gribonval
;Márton Karsai
, 2024, Temporal network compression via network hashing, Applied Network Science, 9, 1, 10.1007/s41109-023-00609-9, https://doi.org/10.1007/s41109-023-00609-9.
;Kia Shakiba
;Albert Lee
;Paul Chen
;Michael Stumm
, 2024, TTLs Matter: Efficient Cache Sizing with TTL-Aware Miss Ratio Curves and Working Set Sizes, Proceedings of the Nineteenth European Conference on Computer Systems, pp. 387-404, 10.1145/3627703.3650066.
;Guoju Gao
;Yu-e Sun;He Huang
;Yang Du
;et al., 2024, P$$^2$$S-Sketch: A Sketch Family for Priority-Aware Per-Flow Spread Measurement in Network Data Stream, Lecture notes in computer science, pp. 386-402, 10.1007/978-981-96-0821-8_26.
;Hanwen Zhang
;Guoju Gao
;Yu-e Sun
;He Huang
;et al., 2024, KTSketch: Finding k-Persistent t-Spread Flows in High-Speed Networks, Lecture notes in computer science, pp. 326-342, 10.1007/978-981-97-7241-4_21.
;Alexander Barquero
;Anisha Ashok Wadhwani;Jiang Bian
;Jaime Ruiz;et al., 2024, OCTOPUS: Disk-based, Multiplatform, Mobile-friendly Metagenomics Classifier, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2024.03.15.585215, https://doi.org/10.1101/2024.03.15.585215.
;Pascal Molli
;Brice Nédelec
;Hala Skaf-Molli
;Julien Aimonier-Davat
, 2024, CRAWD: Sampling-Based Estimation of Count-Distinct SPARQL Queries, pp. 98-115, 10.1007/978-3-031-77850-6_6, https://hal.science/hal-04726343.
;Or Ordentlich
;Ofer Shayevitz
, 2024, Statistical Inference With Limited Memory: A Survey, IEEE Journal on Selected Areas in Information Theory, 5, pp. 623-644, 10.1109/jsait.2024.3481296.
;Jiaqi Zheng
;Hao Qian
;Shiju Zhao;Hongxuan Zhang;et al., 2024, In Search of a Memory-Efficient Framework for Online Cardinality Estimation, IEEE Transactions on Knowledge and Data Engineering, 37, 1, pp. 392-407, 10.1109/tkde.2024.3486571.
;Rundong Li
;Pinghui Wang
;Yufang Sun;Rui Xing, 2024, QSketch: An Efficient Sketch for Weighted Cardinality Estimation in Streams, Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 2432-2443, 10.1145/3637528.3671695.
;Haoyu Wang;Lixiang Chen
;Yifeng Dong;Xing Chen
;et al., 2024, ByteCard: Enhancing ByteDance's Data Warehouse with Learned Cardinality Estimation, Companion of the 2024 International Conference on Management of Data, pp. 41-54, 10.1145/3626246.3653376.
;Yiqi Chen
;Tao Zhang
;Yang Wang
;Ran Shu
;et al., 2024, NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering, 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 1518-1531, 10.1109/micro61859.2024.00111.
;Qingquan Liao
;Wenbin Liu;Peng Xu
;Linlin Zhuo
;et al., 2024, Multi-source data integration for explainable miRNA-driven drug discovery, Future Generation Computer Systems, 160, pp. 109-119, 10.1016/j.future.2024.05.055.
;Jie Lu
;Quan Ren;Ziyong Li
;Yuxiang Hu
;et al., 2024, An Accurate and Invertible Sketch for Super Spread Detection, Electronics, 13, 1, pp. 222, 10.3390/electronics13010222, https://doi.org/10.3390/electronics13010222.
;Yongjie Wang
;Yuliang Lu
, 2024, HSS: A Memory-Efficient, Accurate, and Fast Network Measurement Framework in Sliding Windows, IEEE Transactions on Network and Service Management, 21, 6, pp. 5958-5976, 10.1109/tnsm.2024.3460751.
;Ilya Shchuckin
;Nikita Bobrov
;George Chernishev
, 2023, Fast Discovery of Inclusion Dependencies with Desbordante, DOAJ (DOAJ: Directory of Open Access Journals), pp. 264-275, 10.23919/fruct58615.2023.10143047.
;Raphael Azorin
;Gabriele Castellano
;Massimo Gallo
;Salvatore Pontarelli
;et al., 2023, SPADA: A Sparse Approximate Data Structure Representation for Data Plane Per-flow Monitoring, Proceedings of the ACM on Networking, 1, CoNEXT3, pp. 1-25, 10.1145/3629149, https://doi.org/10.1145/3629149.
;Dimitris Sacharidis
;Antonios Deligiannakis
, 2023, And synopses for all: A synopses data engine for extreme scale analytics-as-a-service, Zenodo (CERN European Organization for Nuclear Research), 116, pp. 102221, 10.1016/j.is.2023.102221, https://zenodo.org/record/7916098.
;Jiong Yu
;Zhenzhen He
;Xiaoqiao Xiong
, 2023, ConvMADE: Convolution Makes Cardinality Estimation Stronger, IEEE Access, 11, pp. 98005-98015, 10.1109/access.2023.3312312, https://doi.org/10.1109/access.2023.3312312.
;Olufemi O. Odegbile
;Dimitrios Melissourgos
;Haibo Wang
;Shiping Chen
, 2023, From CountMin to Super kJoin Sketches for Flow Spread Estimation, IEEE Transactions on Network Science and Engineering, 11, 3, pp. 2353-2370, 10.1109/tnse.2023.3279665.
;Qizhi Chen
;Yuanpeng Li
;Tong Yang
;Yaofeng Tu
;et al., 2023, JoinSketch: A Sketch Algorithm for Accurate and Unbiased Inner-Product Estimation, Proceedings of the ACM on Management of Data, 1, 1, pp. 1-26, 10.1145/3588935.
;Antonio Cruciani
;Daniele Pasquini;Paola Vocca
;Simone Angelini, 2023, propagate: A Seed Propagation Framework to Compute Distance-Based Metrics on Very Large Graphs, Lecture notes in computer science, pp. 671-688, 10.1007/978-3-031-43418-1_40.
;Jason Fan
;Rob Patro
, 2023, Meta-colored compacted de Bruijn graphs, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2023.07.21.550101, https://doi.org/10.1101/2023.07.21.550101.
, 2023, Probabilistic Data Structure in smart agriculture, 2023 Third International Conference on Secure Cyber Computing and Communication (ICSCCC), 12, pp. 780-785, 10.1109/icsccc58608.2023.10176985.
;Ziwei Wang
;Yunchuan Li
;Ruixin Yang
;Yan Zhao
;et al., 2023, Deep Learning-Based Bloom Filter for Efficient Multi-key Membership Testing, Data Science and Engineering, 8, 3, pp. 234-246, 10.1007/s41019-023-00224-9, https://doi.org/10.1007/s41019-023-00224-9.
;He Huang
;Yu-E Sun
;Zhaojie Wang
, 2023, MIME: Fast and Accurate Flow Information Compression for Multi-Spread Estimation, 2023 IEEE 31st International Conference on Network Protocols (ICNP), pp. 1-11, 10.1109/icnp59255.2023.10355571.
;Qingjun Xiao
, 2023, Online Detection of 1D and 2D Hierarchical Super-Spreaders in High-Speed Networks, Proceedings of the 7th Asia-Pacific Workshop on Networking, pp. 109-115, 10.1145/3600061.3600080.
;Lorenzo Beretta
;Jonas Klausen
;Jakob Bæk Tejs Houen;Mikkel Thorup
, 2023, Locally Uniform Hashing, 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pp. 1440-1470, 10.1109/focs57990.2023.00089.
;Bernhard Y. Renard
, 2023, Taxor: Fast and space-efficient taxonomic classification of long reads with hierarchical interleaved XOR filters, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2023.07.20.549822, https://doi.org/10.1101/2023.07.20.549822.
;Rui Shi
;Heng Chen
;Li Zhang
;Ruidong Li
;et al., 2023, Krypton: Real-Time Serving and Analytical SQL Engine at ByteDance, Proceedings of the VLDB Endowment, 16, 12, pp. 3528-3542, 10.14778/3611540.3611545.
;Jonas Traub;Volker Markl
, 2023, Survey of window types for aggregation in stream processing systems, The VLDB Journal, 32, 5, pp. 985-1011, 10.1007/s00778-022-00778-6, https://doi.org/10.1007/s00778-022-00778-6.
;Weizhuang Zhou
;Fook Mun Chan;Meenatchi Sundaram Muthu Selva Annamalai
;Xiaoxia Deng;et al., 2023, CoVnita, an end-to-end privacy-preserving framework for SARS-CoV-2 classification, Scientific Reports, 13, 1, pp. 7461, 10.1038/s41598-023-34535-8, https://doi.org/10.1038/s41598-023-34535-8.
;Minxuan Zhou
;Weihong Xu
;Ashish Venkat
;Tajana Rosing
;et al., 2023, Abakus: Accelerating k -mer Counting with Storage Technology, ACM Transactions on Architecture and Code Optimization, 21, 1, pp. 1-26, 10.1145/3632952, https://doi.org/10.1145/3632952.
;Alexander Dunkel
;Dirk Burghardt
, 2023, Protecting Privacy in Volunteered Geographic Information Processing, Volunteered Geographic Information, pp. 277-297, 10.1007/978-3-031-35374-1_14.
;Dirk Burghardt
, 2023, Using HyperLogLog to Prevent Data Retention in Social Media Streaming Data Analytics, ISPRS International Journal of Geo-Information, 12, 2, pp. 60, 10.3390/ijgi12020060, https://doi.org/10.3390/ijgi12020060.
;Arish Sateesan
;Jo Vliegen
;Stjepan Picek
;Nele Mentens
, 2023, Evolving Non-cryptographic Hash Functions Using Genetic Programming for High-speed Lookups in Network Security Applications, Lecture notes in computer science, pp. 302-318, 10.1007/978-3-031-30229-9_20.
;Supawit Chockchowwat
;Arav Chheda;Suwen Wang
;Riya Verma
;et al., 2023, A Step Toward Deep Online Aggregation, Proceedings of the ACM on Management of Data, 1, 2, pp. 1-28, 10.1145/3589269.
;Jim Apple
;Alvaro Alonso
;Otmar Ertl
;Niv Dayan
, 2023, Cardinality Estimation Adaptive Cuckoo Filters (CE-ACF): Approximate Membership Check and Distinct Query Count for High-Speed Network Monitoring, IEEE/ACM Transactions on Networking, 32, 2, pp. 959-970, 10.1109/tnet.2023.3302306, https://doi.org/10.1109/tnet.2023.3302306.
;Shiwei Yang
;Panpan Li
;Kangying Li;Lin Wen
, 2023, Multi-Resolution Odd Sketch for Mining Extended Jaccard Similarity of Dynamic Streaming sets, IEEE Transactions on Network Science and Engineering, 11, 3, pp. 2399-2414, 10.1109/tnse.2023.3275809.
;Yuexiao Cai
;Yunpeng Cao
;Shigang Chen
, 2023, Accurate and O(1)-Time Query of Per-Flow Cardinality in High-Speed Networks, IEEE/ACM Transactions on Networking, 31, 6, pp. 2994-3009, 10.1109/tnet.2023.3268980, https://doi.org/10.1109/tnet.2023.3268980.
;Yijun Wang
;Shengbo Liu
;Yunxiang Li
;Wenhong Tian
;et al., 2023, SPAC: Scalable Pattern Approximate Counting in Graph Mining, Lecture notes in computer science, pp. 214-232, 10.1007/978-3-031-22677-9_12.
;Nanyang Wang
;Wei Jiang
, 2023, Towards Accurate and Efficient Super Spreader Detection with Sketching, 2023 8th International Conference on Data Science in Cyberspace (DSC), pp. 506-511, 10.1109/dsc59305.2023.00079.
;Felix Droop;Mitra Darvish
;René Rahn;et al., 2023, Hierarchical Interleaved Bloom Filter: enabling ultrafast, approximate sequence queries, Genome biology, 24, 1, pp. 131, 10.1186/s13059-023-02971-4, https://doi.org/10.1186/s13059-023-02971-4.
;Igor Martayan
;Camille Marchet
;Antoine Limasset
, 2023, Fractional Hitting Sets for Efficient and Lightweight Genomic Data Sketching, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2023.06.21.545875, https://doi.org/10.1101/2023.06.21.545875.
, 2023, Sliding window-based approximate triangle counting with bounded memory usage, The VLDB Journal, 32, 5, pp. 1087-1110, 10.1007/s00778-023-00783-3.
;He Huang
;Yang Du
;Yu-E Sun
;Shigang Chen
, 2023, Coupon Filter: A Universal and Lightweight Filter Framework for More Accurate Data Stream Processing, Computer Networks, 228, pp. 109748, 10.1016/j.comnet.2023.109748.
;He Huang
;Yu-E Sun
;Kejian Li;Boyu Zhang
;et al., 2023, A Better Cardinality Estimator with Fewer Bits, Constant Update Time, and Mergeability, IEEE INFOCOM 2023 - IEEE Conference on Computer Communications, pp. 1-10, 10.1109/infocom53939.2023.10229088.
;Qimeng Song
;Yonglong Luo
, 2023, Differentially Private Top-$k$ Flows Estimation Mechanism in Network Traffic, IEEE Transactions on Network Science and Engineering, 11, 3, pp. 2462-2472, 10.1109/tnse.2023.3309250.
;Pinghui Wang
;Yiyan Qi
;Kuankuan Cheng;Junzhou Zhao
;et al., 2023, Fast Gumbel-Max Sketch and its Applications, IEEE Transactions on Knowledge and Data Engineering, 35, 9, pp. 9350-9363, 10.1109/tkde.2023.3237857.
;Yinghao Zhang
;You Li
;Jingwei Zhang
, 2023, Cardinality estimation with smoothing autoregressive models, World Wide Web, 26, 5, pp. 3441-3461, 10.1007/s11280-023-01195-7.
;Ziwei Wang
;Ruixin Yang
;Yan Zhao
;Rui Zhou
;et al., 2023, Learned Bloom Filter for Multi-key Membership Testing, Lecture notes in computer science, pp. 62-79, 10.1007/978-3-031-30637-2_5.
;Feifei Li
;Yulong Shen
, 2023, Generalized Measure-Biased Sampling and Priority Sampling, IEEE Transactions on Knowledge and Data Engineering, 36, 11, pp. 6251-6265, 10.1109/tkde.2023.3340673.
;Xinyu Pi;Yongjoo Park
, 2023, S/C: Speeding up Data Materialization with Bounded Memory, 2023 IEEE 39th International Conference on Data Engineering (ICDE), pp. 1981-1994, 10.1109/icde55515.2023.00393.
;Qing Li
;Guanglin Duan
;Dan Zhao
;Jingyu Xiao
;et al., 2023, Pontus: Finding Waves in Data Streams, Proceedings of the ACM on Management of Data, 1, 1, pp. 1-26, 10.1145/3588960.
;Ruixin Wang
;Yalun Cai
;Ruwen Zhang
;Tong Yang
;et al., 2023, OneSketch: A Generic and Accurate Sketch for Data Streams, Queen Mary Research Online (Queen Mary University of London), 35, 12, pp. 12887-12901, 10.1109/tkde.2023.3278028, https://qmro.qmul.ac.uk/xmlui/handle/123456789/88622.
;Jingru Cui;Zhe Zhang
, 2023, FastSO: A Fast Weighted Cardinality Estimation Algorithm, 2023 3rd International Conference on Electronic Information Engineering and Computer (EIECT), 1, pp. 494-499, 10.1109/eiect60552.2023.10441998.
;Giuseppe Bianchi
;Andrea Bianco
;Paolo Giaccone
, 2022, Staggered HLL: Near-continuous-time cardinality estimation with no overhead, Computer Communications, 193, pp. 168-175, 10.1016/j.comcom.2022.06.038.
;Giuseppe Bianchi
;Andrea Bianco
;Paolo Giaccone
, 2022, Designing Probabilistic Flow Counting over Sliding Windows, 2022 IEEE 11th IFIP International Conference on Performance Evaluation and Modeling in Wireless and Wired Networks (PEMWN), pp. 1-6, 10.23919/pemwn56085.2022.9963868.
;Alexander Spiegelman
;Edward Bortnikov
;Eshcar Hillel
;Idit Keidar
;et al., 2022, Fast Concurrent Data Sketches, ACM Transactions on Parallel Computing, 9, 2, pp. 1-35, 10.1145/3512758, https://doi.org/10.1145/3512758.
;Yun William Yu
, 2022, Navigating bottlenecks and trade-offs in genomic data analysis, PubMed Central, 24, 4, pp. 235-250, 10.1038/s41576-022-00551-z, https://www.ncbi.nlm.nih.gov/pmc/articles/10204111.
;Haibo Wang
;Olufemi O. Odegbile
;Shigang Chen
;Dimitrios Melissourgos
, 2022, Virtual Filter for Non-Duplicate Sampling With Network Applications, IEEE/ACM Transactions on Networking, 30, 6, pp. 2818-2833, 10.1109/tnet.2022.3182694.
;Changhun Jung
;David Mohaisen
;DaeHun Nyang
, 2022, Minimizing Noise in HyperLogLog-Based Spread Estimation of Multiple Flows, 2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 331-342, 10.1109/dsn53405.2022.00042.
;Arik Rinberg
;Ori Rottenstreich
, 2022, Compressing Distributed Network Sketches With Traffic-Aware Summaries, IEEE Transactions on Network and Service Management, 20, 2, pp. 1962-1975, 10.1109/tnsm.2022.3172299.
;Amr El Abbadi
;Ahmed Metwally
, 2022, SpaceSaving ±, Proceedings of the VLDB Endowment, 15, 6, pp. 1215-1227, 10.14778/3514061.3514068.
;Ferruccio Damiani
;Gianluca Torta
, 2022, Bringing Aggregate Programming Towards the Cloud, Lecture notes in computer science, pp. 301-317, 10.1007/978-3-031-19759-8_19.
, 2022, Current Trends in Data Summaries, ACM SIGMOD Record, 50, 4, pp. 6-15, 10.1145/3516431.3516433.
;Chaoyi Ma
;Olufemi O. Odegbile
;Shigang Chen
;Jih-Kwon Peir, 2022, Randomized Error Removal for Online Spread Estimation in High-Speed Networks, IEEE/ACM Transactions on Networking, 31, 2, pp. 558-573, 10.1109/tnet.2022.3197968.
;Chaoyi Ma
;Shigang Chen
;Yuanda Wang
, 2022, Fast and Accurate Cardinality Estimation by Self-Morphing Bitmaps, IEEE/ACM Transactions on Networking, 30, 4, pp. 1674-1688, 10.1109/tnet.2022.3147204.
;Xiulong Liu
;Zhelin Liang;Hongyan Sun;Weilian Xue
;et al., 2022, A Transaction Cardinality Estimation Approach for QoS-Adjustable Intelligent Blockchain Systems, IEEE Journal on Selected Areas in Communications, 40, 12, pp. 3672-3684, 10.1109/jsac.2022.3213327.
;Chen Tian
;Tong Yang
;Huiping Lin
;Chang Liu
;et al., 2022, FlyMon, Proceedings of the ACM SIGCOMM 2022 Conference, pp. 486-502, 10.1145/3544216.3544239.
;Huiping Lin
;Zheng Zhong;Tong Yang
;Muhammad Shahzad
, 2022, Enhanced Machine Learning Sketches for Network Measurements, IEEE Transactions on Computers, 72, 4, pp. 957-970, 10.1109/tc.2022.3185560.
;Zheng Yan
;Xuyang Jing
;Witold Pedrycz
, 2022, Applications of sketches in network traffic measurement: A survey, Aaltodoc (Aalto University), 82, pp. 58-85, 10.1016/j.inffus.2021.12.007, https://aaltodoc.aalto.fi/handle/123456789/112506.
;Zhewei Wei
;Bolin Ding
;Xiening Dai;Lu Lu
;et al., 2022, Sampling-based Estimation of the Number of Distinct Values in Distributed Environment, arXiv (Cornell University), pp. 893-903, 10.1145/3534678.3539390, http://arxiv.org/abs/2206.05476.
;Hala Skaf-Molli
;Pascal Molli
;Arnaud Grall;Thomas Minier
, 2022, Online approximative SPARQL query processing for COUNT-DISTINCT queries with web preemption, Semantic Web, 13, 4, pp. 735-755, 10.3233/sw-222842, https://doi.org/10.3233/sw-222842.
;Martin Hirzel
;Scott Schneider, 2022, Sliding-Window Aggregation Algorithms, Encyclopedia of Big Data Technologies, pp. 1-7, 10.1007/978-3-319-63962-8_157-2.
;Raphael C.-W. Phan
;Zhili Chen
;Dan Huang
, 2022, Persistent Items Tracking in Large Data Streams Based on Adaptive Sampling, IEEE INFOCOM 2022 - IEEE Conference on Computer Communications, pp. 1948-1957, 10.1109/infocom48880.2022.9796709.
;Yao Xiao
;Qun Huang
;Patrick P. C. Lee
, 2022, A High-Performance Invertible Sketch for Network-Wide Superspreader Detection, IEEE/ACM Transactions on Networking, 31, 2, pp. 724-737, 10.1109/tnet.2022.3198738.
;Antonio Blanca
;Robert S. Harris
;David Koslicki
;Paul Medvedev
, 2022, The minimizer Jaccard estimator is biased and inconsistent*, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2022.01.14.476226, https://doi.org/10.1101/2022.01.14.476226.
;N. Tessa Pierce-Ward
;David Koslicki
, 2022, Debiasing FracMinHash and deriving confidence intervals for mutation rates across a wide range of evolutionary distances, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2022.01.11.475870, https://doi.org/10.1101/2022.01.11.475870.
;Aristides Gionis
;Athanasios Katsamanis
;Panagiotis Karras
, 2022, SIEVE: A Space-Efficient Algorithm for Viterbi Decoding, Aaltodoc (Aalto University), pp. 1136-1145, 10.1145/3514221.3526170, https://aaltodoc.aalto.fi/handle/123456789/125396.
;Rasmus Pagh
, 2022, HyperLogLogLog, Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 753-761, 10.1145/3534678.3539246.
;Kenneth G. Paterson
;Anupama Unnikrishnan
;Fernando Virdia
, 2022, Adversarial Correctness and Privacy for Probabilistic Data Structures, Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pp. 1037-1050, 10.1145/3548606.3560621.
, 2022, Analysis of the possibility of using key-value store NoSQL databases for IFC data processing in the BIM-GIS integration process, Polish Cartographical Review, 54, 1, pp. 11-22, 10.2478/pcr-2022-0002, https://doi.org/10.2478/pcr-2022-0002.
;Renxiang Zhou;Hanhua Chen
;Hai Jin
, 2022, Scube: Efficient Summarization for Skewed Graph Streams, 2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS), pp. 100-110, 10.1109/icdcs54860.2022.00019.
;Renxiang Zhou;Hanhua Chen
;Jiang Xiao
;Hai Jin
;et al., 2022, Horae: A Graph Stream Summarization Structure for Efficient Temporal Range Query, 2022 IEEE 38th International Conference on Data Engineering (ICDE), pp. 2792-2804, 10.1109/icde53745.2022.00254.
;Pinghui Wang
;Junzhou Zhao
;Jing Tao
;Ye Yuan
;et al., 2022, Erasable Virtual HyperLogLog for Approximating Cumulative Distribution over Data Streams, IEEE Transactions on Knowledge and Data Engineering, 34, 11, pp. 5336-5350, 10.1109/tkde.2021.3052938.
;Shrideep Pallickara, 2022, Griddle: Effective Query Support over Voluminous Gridded Spatial Datasets, 2022 IEEE International Conference on Big Data (Big Data), pp. 792-799, 10.1109/bigdata55660.2022.10020796.
;Andrea Marino
, 2022, Degrees of Separation and Diameter in Large Graphs, Encyclopedia of Big Data Technologies, pp. 1-7, 10.1007/978-3-319-63962-8_59-2.
;Zheng Zhong;Jiarui Guo
;Zikun Li
;Tong Yang
;et al., 2022, BurstSketch: Finding Bursts in Data Streams, IEEE Transactions on Knowledge and Data Engineering, 35, 11, pp. 11126-11140, 10.1109/tkde.2022.3223686.
;Eva Hauthal
;Dirk Burghardt
, 2022, Analyzing the EU Migration Crisis as Reflected on Twitter, KN - Journal of Cartography and Geographic Information, 72, 3, pp. 213-228, 10.1007/s42489-022-00114-6, https://doi.org/10.1007/s42489-022-00114-6.
;Zaoxing Liu
;Nikita Ivkin;Xiaoqi Chen
;Vladimir Braverman
;et al., 2022, Flow-level loss detection with Δ-sketches, Proceedings of the Symposium on SDN Research, pp. 25-32, 10.1145/3563647.3563653.
;Kaixuan Ye
;Yiyan Qi
;Peng Jia
;Pinghui Wang
, 2022, Generalized Sketches for Streaming Sets, Applied Sciences, 12, 15, pp. 7362, 10.3390/app12157362, https://doi.org/10.3390/app12157362.
;He Huang
;Yu-E Sun
;Shigang Chen
;Guoju Gao
;et al., 2022, Self-Adaptive Sampling Based Per-Flow Traffic Measurement, IEEE/ACM Transactions on Networking, 31, 3, pp. 1010-1025, 10.1109/tnet.2022.3212066.
;Cheng-Lin Tsai;Cheng-Han Chuang;Xiu-Wen Ku;Jim Hao Chen, 2022, Tabular Interpolation Approach Based on Stable Random Projection for Estimating Empirical Entropy of High-Speed Network Traffic, IEEE Access, 10, pp. 104934-104953, 10.1109/access.2022.3210336, https://doi.org/10.1109/access.2022.3210336.
;Zhuochen Fan
;Qilong Shi
;Yixin Zhang
;Tong Yang
;et al., 2022, SHE: A Generic Framework for Data Stream Mining over Sliding Windows, Proceedings of the 51st International Conference on Parallel Processing, pp. 1-12, 10.1145/3545008.3545009.
;Xin Wang
;Yujun Zhang
, 2022, Towards Persistent Detection of DDoS Attacks in NDN: A Sketch-Based Approach, IEEE Transactions on Dependable and Secure Computing, 20, 4, pp. 3449-3465, 10.1109/tdsc.2022.3196187.
;Yun William Yu
, 2021, Secure federated Boolean count queries using fully-homomorphic cryptography, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2021.11.10.468090, https://doi.org/10.1101/2021.11.10.468090.
;Anees Ullah
;Pedro Reviriego
;Syed Riaz Ul Hassnain, 2021, Efficient Leading Zero Count (LZC) Implementations for Xilinx FPGAs, e-Archivo (Carlos III University of Madrid), 14, 1, pp. 35-38, 10.1109/les.2021.3101688, http://hdl.handle.net/10016/34413.
;Pedro Reviriego
;Adeel Akram
;Malik Najmus Siraj
, 2021, Switch-Based High Cardinality Node Detection, e-Archivo (Carlos III University of Madrid), 13, 4, pp. 190-193, 10.1109/les.2021.3062155, http://hdl.handle.net/10016/33713.
;Ben Kreuter;Ravi Kumar
;Pasin Manurangsi
;Jiayu Peng
;et al., 2021, Multiparty Reach and Frequency Histogram: Private, Secure, and Practical, Proceedings on Privacy Enhancing Technologies, 2022, 1, pp. 373-395, 10.2478/popets-2022-0019, https://doi.org/10.2478/popets-2022-0019.
;Pengfei Gu
;Yuming Zhao
, 2021, Approximate set union via approximate randomization, Theoretical Computer Science, 890, pp. 210-239, 10.1016/j.tcs.2021.09.016.
;Jonathan Schaeffer
;Claudio Satriano
;Helle Pedersen
;Jérôme Touvier;et al., 2021, RÉSIF-SI: A Distributed Information System for French Seismological Data, Archimer (Ifremer), 92, 3, pp. 1832-1853, 10.1785/0220200392, https://archimer.ifremer.fr/doc/00689/80067/.
;Haibo Wang
;Olufemi O Odegbile
;Shigang Chen
, 2021, Virtual Filter for Non-duplicate Sampling, Clark Digital Commons (Clark University), pp. 1-11, 10.1109/icnp52444.2021.9651974.
;Shigang Chen
;Youlin Zhang
;Qingjun Xiao
;Olufemi O. Odegbile
, 2021, Super Spreader Identification Using Geometric-Min Filter, IEEE/ACM Transactions on Networking, 30, 1, pp. 299-312, 10.1109/tnet.2021.3108033.
;Bastien Cazaux
;Antoine Limasset
, 2021, Toward optimal fingerprint indexing for large scale genomics, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2021.11.04.467355, https://doi.org/10.1101/2021.11.04.467355.
;Shuhuan Fan
;Jianming Liao;Mengshu Hou
, 2021, GACE: Graph-Attention-Network-Based Cardinality Estimator, Lecture notes in computer science, pp. 332-345, 10.1007/978-3-030-86475-0_32.
;Thomas Neumann
, 2021, JSON Tiles: Fast Analytics on Semi-Structured Data, Proceedings of the 2021 International Conference on Management of Data, pp. 445-458, 10.1145/3448016.3452809.
;Alexander Dunkel
;Dirk Burghardt
, 2021, Emojis as Contextual Indicants in Location-Based Social Media Posts, ISPRS International Journal of Geo-Information, 10, 6, pp. 407, 10.3390/ijgi10060407, https://doi.org/10.3390/ijgi10060407.
;Ryan Wiener;Divyakant Agrawal
;Amr El Abbadi
, 2021, KLL ± approximate quantile sketches over dynamic datasets, Proceedings of the VLDB Endowment, 14, 7, pp. 1215-1227, 10.14778/3450980.3450990.
;Minos Garofalakis
;Michael Shekelyan
, 2021, Data-Independent Space Partitionings for Summaries, Warwick Research Archive Portal (University of Warwick), 38, pp. 285-298, 10.1145/3452021.3458316.
;Zhifeng Bao
;Yuwei Peng
, 2021, A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration, Data Science and Engineering, 6, 1, pp. 86-101, 10.1007/s41019-020-00149-7, https://doi.org/10.1007/s41019-020-00149-7.
;Chaoyi Ma
;Olufemi O Odegbile
;Shigang Chen
;Jih-Kwon Peir, 2021, Randomized error removal for online spread estimation in data streaming, Proceedings of the VLDB Endowment, 14, 6, pp. 1040-1052, 10.14778/3447689.3447707.
;Yanan Jiang;Chen Tian
;Long Cheng
;Qun Huang
;et al., 2021, Rethinking Fine-Grained Measurement From Software-Defined Perspective: A Survey, IEEE Transactions on Services Computing, 15, 6, pp. 3649-3667, 10.1109/tsc.2021.3103968.
;Daehyeok Kim
;Zaoxing Liu
;Vyas Sekar
;Peter Steenkiste
, 2021, Telemetry Retrieval Inaccuracy in Programmable Switches, Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR), pp. 176-182, 10.1145/3482898.3483359.
, 2021, On the algebra of data sketches, Proceedings of the VLDB Endowment, 14, 9, pp. 1655-1667, 10.14778/3461535.3461553.
;Wei Ding
, 2021, Rough Estimator Based Asynchronous Distributed Super Points Detection on High Speed Network Edge, Algorithms, 14, 10, pp. 277, 10.3390/a14100277, https://doi.org/10.3390/a14100277.
;Martin Hirzel
;Scott Schneider, 2021, In-order sliding-window aggregation in worst-case constant time, arXiv (Cornell University), 30, 6, pp. 933-957, 10.1007/s00778-021-00668-3, http://arxiv.org/abs/2009.13768.
;Marc Fischer
;Vasiliki Kalavri
;Michael Kapralov
;Torsten Hoefler
, 2021, Practice of Streaming Processing of Dynamic Graphs: Concepts, Models, and Systems, arXiv (Cornell University), 34, 6, pp. 1860-1876, 10.1109/tpds.2021.3131677, https://arxiv.org/pdf/1912.12740v3.
;Gustavo Alonso
, 2021, SKT, Repository for Publications and Research Data (ETH Zurich), 14, 11, pp. 2369-2382, 10.14778/3476249.3476287, http://hdl.handle.net/20.500.11850/505690.
;C. Titus Brown
;Tamer A. Mansour
, 2021, MQF and buffered MQF: quotient filters for efficient storage of k-mers with their counts and metadata, BMC Bioinformatics, 22, 1, pp. 71, 10.1186/s12859-021-03996-x, https://doi.org/10.1186/s12859-021-03996-x.
;K. Murugan
, 2021, CEOF: Enhanced Clustering-based Entries Optimization scheme to prevent Flow table overflow, Wireless Networks, 28, 1, pp. 69-83, 10.1007/s11276-021-02823-8.
, 2021, Fast computation of distance-generalized cores using sampling, 2021 IEEE International Conference on Data Mining (ICDM), 16, pp. 609-618, 10.1109/icdm51629.2021.00072.
, 2021, SetSketch, arXiv (Cornell University), 14, 11, pp. 2244-2257, 10.14778/3476249.3476276, http://arxiv.org/abs/2101.00314.
;Alfonso Sanchez-Macian
;Shanshan Liu
;Fabrizio Lombardi
, 2021, On the Security of the K Minimum Values (KMV) Sketch, e-Archivo (Carlos III University of Madrid), 19, 5, pp. 3539-3545, 10.1109/tdsc.2021.3101280, http://hdl.handle.net/10016/35750.
;Pilin Junsangsri
;Shanshan Liu
;Fabrizio Lombardi
, 2021, Error-Tolerant Data Sketches Using Approximate Nanoscale Memories and Voltage Scaling, IEEE Transactions on Nanotechnology, 21, pp. 16-22, 10.1109/tnano.2021.3139394.
;Gil Einziger
;Shir Landau Feibish
;Jalil Moraney;Bilal Tayh
;et al., 2021, Routing-Oblivious Network-Wide Measurements, IEEE/ACM Transactions on Networking, 29, 6, pp. 2386-2398, 10.1109/tnet.2021.3061737.
;Bolin Ding
;Xu Chu
;Zhewei Wei
;Xiening Dai;et al., 2021, Learning to be a statistician, arXiv (Cornell University), 15, 2, pp. 272-284, 10.14778/3489496.3489508, http://arxiv.org/abs/2202.02800.
;Liran Katzir
;Aviv Yehezkel, 2021, Efficient Service Chain Verification Using Sketches and Small Samples, 2021 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN), 10.1109/nfv-sdn53031.2021.9665020.
;Martin Kiefer
;Joscha von Hein;Jorge-Arnulfo Quiané-Ruiz
;Volker Markl
, 2021, In the land of data streams where synopses are missing, one framework to bring them all, Proceedings of the VLDB Endowment, 14, 10, pp. 1818-1831, 10.14778/3467861.3467871.
;Dingyu Wang
, 2021, Information theoretic limits of cardinality estimation: Fisher meets Shannon, Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pp. 556-569, 10.1145/3406325.3451032.
;Guo-Ping Yang
;Nan Han
;Hao Chen
;Fa-Liang Huang
;et al., 2021, Cardinality Estimator: Processing SQL with a Vertical Scanning Convolutional Neural Network, Journal of Computer Science and Technology, 36, 4, pp. 762-777, 10.1007/s11390-021-1351-7.
;David Koslicki
, 2021, CMash: fast, multi-resolution estimation of k-mer-based Jaccard and containment indices, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2021.12.06.471436, https://doi.org/10.1101/2021.12.06.471436.
;Jerome Tollet
;Dave Barach;Giuseppe Bianchi
, 2021, FlowFight: High performance–low memory top-k spreader detection, Computer Networks, 196, pp. 108239, 10.1016/j.comnet.2021.108239.
;Yilei Wang
;Ke Yi
;Feifei Li
;Bin Wu
;et al., 2021, Weighted Distinct Sampling: Cardinality Estimation for SPJ Queries, Rare & Special e-Zone (The Hong Kong University of Science and Technology), pp. 1465-1477, 10.1145/3448016.3452821.
;Hun Namkung
;Anup Agarwal
;Antonis Manousis
;Peter Steenkiste
;et al., 2021, Sketchy With a Chance of Adoption: Can Sketch-Based Telemetry Be Ready for Prime Time?, 2021 IEEE 7th International Conference on Network Softwarization (NetSoft), pp. 9-16, 10.1109/netsoft51509.2021.9492582.
;Shaoxu Song
;Ziheng Wei
;Jingyun Fang
;Jiang Long, 2021, Approximating median absolute deviation with bounded error, Proceedings of the VLDB Endowment, 14, 11, pp. 2114-2126, 10.14778/3476249.3476266.
;Yun William Yu
, 2021, Expected 10-anonymity of HyperLogLog sketches for federated queries of clinical data repositories, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/2021.01.30.428918, https://doi.org/10.1101/2021.01.30.428918.
;Przemyslaw Uznanski
, 2020, Cardinality estimation using Gumbel distribution, Leibniz-Zentrum für Informatik (Schloss Dagstuhl), 10.4230/lipics.esa.2022.76, https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2022.76.
;Mohamed-Rafik Bouguelia
;Sepideh Pashami
;Slawomir Nowaczyk
, 2020, Decentralized and Adaptive K-Means Clustering for Non-IID Data Using HyperLogLog Counters, Lecture notes in computer science, pp. 343-355, 10.1007/978-3-030-47426-3_27, https://doi.org/10.1007/978-3-030-47426-3_27.
;Antonios Deligiannakis
, 2020, A Synopses Data Engine for Interactive Extreme-Scale Analytics, Zenodo (CERN European Organization for Nuclear Research), pp. 2085-2088, 10.1145/3340531.3412154, https://zenodo.org/record/3977402.
;Pengfei Gu
;Yuming Zhao
, 2020, Approximate Set Union via Approximate Randomization, Lecture notes in computer science, pp. 591-602, 10.1007/978-3-030-58150-3_48.
;Carlos Martins
;Raphael Espanha;Raul Azevedo;João Gama
, 2020, Fraud detection using heavy hitters, Proceedings of the 35th Annual ACM Symposium on Applied Computing, pp. 482-489, 10.1145/3341105.3373842.
;Pravein Govindan Kannan
;Bryan Kian Hsiang Low
;Mun Choon Chan
, 2020, FCM-sketch, Proceedings of the 16th International Conference on emerging Networking EXperiments and Technologies, pp. 78-92, 10.1145/3386367.3432729.
;Haipeng Dai
;Lei Meng;Jihong Yu
, 2020, Finding needles in a hay stream: On persistent item lookup in data streams, Computer Networks, 181, pp. 107518, 10.1016/j.comnet.2020.107518.
;Ramian Fathi
;David Schmid
;Alexander Dunkel
;Dirk Burghardt
;et al., 2020, Case Study on Privacy-Aware Social Media Data Processing in Disaster Management, ISPRS International Journal of Geo-Information, 9, 12, pp. 709, 10.3390/ijgi9120709, https://doi.org/10.3390/ijgi9120709.
;Ilias Poulakis;Sebastian Breß;Volker Markl
, 2020, Scotch, Proceedings of the VLDB Endowment, 14, 3, pp. 281-293, 10.14778/3430915.3430919.
;Andrea Pietracaprina
;Geppino Pucci
;Eli Upfal
, 2020, Distributed Graph Diameter Approximation, Algorithms, 13, 9, pp. 216, 10.3390/a13090216, https://doi.org/10.3390/a13090216.
;Junbo Zhang
;Akshay Gadre
;Zaoxing Liu
;Swarun Kumar
;et al., 2020, Joltik, Proceedings of the 26th Annual International Conference on Mobile Computing and Networking, pp. 1-14, 10.1145/3372224.3419204.
;Jorge Martinez
;Ori Rottenstreich
;Shanshan Liu
;Fabrizio Lombardi
, 2020, Remove Minimum (RM): An Error-Tolerant Scheme for Cardinality Estimate by HyperLogLog, IEEE Transactions on Dependable and Secure Computing, pp. 1, 10.1109/tdsc.2020.3013746.
;Pinghui Wang
;Yuchao Zhang
;Xiangliang Zhang
;Jing Tao
;et al., 2020, Accurately Estimating User Cardinalities and Detecting Super Spreaders Over Time, IEEE Transactions on Knowledge and Data Engineering, 34, 1, pp. 92-106, 10.1109/tkde.2020.2975625.
;Amadou Ngom
;Lin Ma
;Todd C. Mowry
;Andrew Pavlo
, 2020, Permutable compiled queries, Proceedings of the VLDB Endowment, 14, 2, pp. 101-113, 10.14778/3425879.3425882.
;Dana Dachman-Soled
;Mukul Kulkarni;Arkady Yerukhimovich
, 2020, Differentially-Private Multi-Party Sketching for Large-Scale Statistics, Proceedings on Privacy Enhancing Technologies, 2020, 3, pp. 153-174, 10.2478/popets-2020-0047, https://doi.org/10.2478/popets-2020-0047.
;Shir Landau-Feibish;Mark Braverman
;Jennifer Rexford
, 2020, BeauCoup, Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication, pp. 226-239, 10.1145/3387514.3405865.
, 2020, PlexDB: Efficient Compaction Algorithm for Deduplication of LSM-tree based Key-Value Store, Journal of Digital Contents Society, 21, 8, pp. 1501-1506, 10.9728/dcs.2020.21.8.1501.
;Sahil Garg
;Ravneet Kaur
;Shalini Batra
;Neeraj Kumar
;et al., 2019, Probabilistic data structures for big data analytics: A comprehensive review, Knowledge-Based Systems, 188, pp. 104987, 10.1016/j.knosys.2019.104987.
;George Cybenko
;Satinder Singh;Massimiliano Albanese
;Peng Liu
, 2019, Online and Scalable Adaptive Cyber Defense, Lecture notes in computer science, pp. 232-261, 10.1007/978-3-030-30719-6_10.
;Gerardo Schneider
;Wolfgang Ahrendt
;Ezio Bartocci
;Domenico Bianculli
;et al., 2019, A survey of challenges for runtime verification from advanced application domains (beyond software), Formal Methods in System Design, 54, 3, pp. 279-335, 10.1007/s10703-019-00337-w, https://doi.org/10.1007/s10703-019-00337-w.
;David Basin
, 2019, Cardinality Estimators do not Preserve Privacy, Proceedings on Privacy Enhancing Technologies, 2019, 2, pp. 26-46, 10.2478/popets-2019-0018, https://doi.org/10.2478/popets-2019-0018.
;Ben Langmead
, 2019, Dashing: fast and accurate genomic distances with HyperLogLog, Genome biology, 20, 1, pp. 265, 10.1186/s13059-019-1875-0, https://doi.org/10.1186/s13059-019-1875-0.
;Jennifer Lu
;Ben Langmead
, 2019, Improved metagenomic analysis with Kraken 2, Genome biology, 20, 1, pp. 257, 10.1186/s13059-019-1891-0, https://doi.org/10.1186/s13059-019-1891-0.
;Jennifer Lu
;Ben Langmead
, 2019, Improved metagenomic analysis with Kraken 2, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/762302, https://doi.org/10.1101/762302.
, 2019, Optimal Streaming and Tracking Distinct Elements with High Probability, ACM Transactions on Algorithms, 16, 1, pp. 1-28, 10.1145/3309193, https://doi.org/10.1145/3309193.
;Guoliang Li
, 2019, An end-to-end learning-based cost estimator, Proceedings of the VLDB Endowment, 13, 3, pp. 307-319, 10.14778/3368289.3368296.
;Paulo E. D. Pinto
;Valmir C. Barbosa
, 2019, Sketching Data Structures for Massive Graph Problems, Lecture notes in computer science, pp. 57-67, 10.1007/978-3-030-14177-6_5.
;Martin Hirzel
;Scott Schneider, 2019, Sliding-Window Aggregation Algorithms, Encyclopedia of Big Data Technologies, pp. 1516-1521, 10.1007/978-3-319-77525-8_157.
;Mourad Khayati
;Philippe Cudré-Mauroux
;Michał Piórkowski, 2019, Online Anomaly Detection over Big Data Streams, Applied Data Science, pp. 289-312, 10.1007/978-3-030-11821-1_16.
;Andrea Marino
, 2019, Degrees of Separation and Diameter in Large Graphs, Encyclopedia of Big Data Technologies, pp. 652-658, 10.1007/978-3-319-77525-8_59.
;Nandita Dukkipati;Nathan Lewis
;Yi Cui
;Yaogong Wang
;et al., 2019, PicNIC, Proceedings of the ACM Special Interest Group on Data Communication, pp. 351-366, 10.1145/3341302.3342093.
, 2019, When the levee breaks: a practical guide to sketching algorithms for processing the flood of genomic data, Genome biology, 20, 1, pp. 199, 10.1186/s13059-019-1809-x, https://doi.org/10.1186/s13059-019-1809-x.
;Griffin M Weber
, 2019, Federated queries of clinical data repositories: balancing accuracy and privacy, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/841072, https://doi.org/10.1101/841072.
;Philippe Pucheral, 2018, The Case for Personalized Anonymization of Database Query Results, Communications in computer and information science, pp. 261-285, 10.1007/978-3-319-94809-6_13.
;SL Salzberg
, 2018, KrakenHLL: Confident and fast metagenomics classification using unique k-mer counts, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/262956, https://doi.org/10.1101/262956.
;Karol Gotfryd
, 2018, Average Counting via Approximate Histograms, ACM Transactions on Sensor Networks, 14, 2, pp. 1-32, 10.1145/3177922.
, 2018, Distinct Sampling on Streaming Data with Near-Duplicates, Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pp. 369-382, 10.1145/3196959.3196978.
;Martin Hirzel
;Scott Schneider, 2018, Sliding-Window Aggregation Algorithms, Encyclopedia of Big Data Technologies, pp. 1-6, 10.1007/978-3-319-63962-8_157-1.
;Jasmina Malicevic;Nicolas Schiper;Ashvin Goel
;Willy Zwaenepoel
, 2018, Rock you like a hurricane, Infoscience (Ecole Polytechnique Fédérale de Lausanne), pp. 1-15, 10.1145/3190508.3190532, http://infoscience.epfl.ch/record/253574.
, 2018, Types of Stream Processing Algorithms, Encyclopedia of Big Data Technologies, pp. 1-7, 10.1007/978-3-319-63962-8_193-1.
;Rohit Kumar
;Toon Calders
;Torben Bach Pedersen
, 2018, Effective and efficient location influence mining in location-based social networks, Knowledge and Information Systems, 61, 1, pp. 327-362, 10.1007/s10115-018-1240-8.
;Andrea Marino
, 2018, Degrees of Separation and Diameter in Large Graphs, Encyclopedia of Big Data Technologies, pp. 1-7, 10.1007/978-3-319-63962-8_59-1.
;Masoud Moshref
;Antonio Carzaniga
;Nate Foster
;Robert Soulé
, 2018, Life in the Fast Lane, Proceedings of the Symposium on SDN Research, pp. 1-7, 10.1145/3185467.3185494.
, 2018, Three Big Data Tools for a Data Scientist’s Toolbox, Lecture notes in business information processing, pp. 112-133, 10.1007/978-3-319-96655-7_5.
;Barzan Mozafari
;Joseph Sorenson;Junhao Wang
, 2018, VerdictDB, arXiv (Cornell University), pp. 1461-1476, 10.1145/3183713.3196905, http://arxiv.org/abs/1804.00770.
;Hooman Zabeti, 2017, Improving Min Hash via the Containment Index with Applications to Metagenomic Analysis, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/184150, https://doi.org/10.1101/184150.
, 2017, HyperLogLog Hyperextended, Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 105-114, 10.1145/3097983.3098020.
;Gustavo Totoy;Cristina L. Abad
, 2017, Instrumenting cloud caches for online workload monitoring, Proceedings of the 16th Workshop on Adaptive and Reflective Middleware, 3, pp. 1-6, 10.1145/3152881.3152884.
;Martin Hirzel
;Scott Schneider, 2017, Low-Latency Sliding-Window Aggregation in Worst-Case Constant Time, Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems, pp. 66-77, 10.1145/3093742.3093925.
;Scott Schneider;Kanat Tangwongsan
, 2017, Sliding-Window Aggregation Algorithms, Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems, pp. 11-14, 10.1145/3093742.3095107.
;Aran Nayebi
, 2017, Efficient Hybrid Algorithms for Computing Clusters Overlap, Procedia Computer Science, 108, pp. 1050-1059, 10.1016/j.procs.2017.05.212, https://doi.org/10.1016/j.procs.2017.05.212.
;Jon Kleinberg
;Aneesh Sharma
, 2017, Detecting Strong Ties Using Network Motifs, arXiv (Cornell University), pp. 983-992, 10.1145/3041021.3055139, http://arxiv.org/abs/1702.07390.
;Liran Katzir
;Aviv Yehezkel, 2017, A Minimal Variance Estimator for the Cardinality of Big Data Set Intersection, Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 95-103, 10.1145/3097983.3097999.
;Muhammad Aamir Saleem
;Toon Calders
;Xike Xie
;Torben Bach Pedersen
, 2017, Activity-Driven Influence Maximization in Social Networks, Lecture notes in computer science, pp. 345-348, 10.1007/978-3-319-71273-4_28.
;Edmund Wong;Ethan Atkins;Nat Storer, 2017, LittleTable, Proceedings of the 2017 ACM International Conference on Management of Data, pp. 125-138, 10.1145/3035918.3056102.
;Yehuda Afek;Anat Bremler-Barr
;Edith Cohen
;Michal Shagam, 2017, Mitigating DNS random subdomain DDoS attacks by distinct heavy hitters sketches, Proceedings of the fifth ACM/IEEE Workshop on Hot Topics in Web Systems and Technologies, pp. 1-6, 10.1145/3132465.3132474.
;Bolin Ding
;Srikanth Kandula
, 2017, Approximate Query Processing, Proceedings of the 2017 ACM International Conference on Management of Data, pp. 511-519, 10.1145/3035918.3056097.
;Flip Korn;Natalya F. Noy;Christopher Olston;Neoklis Polyzotis
;et al., 2016, Goods, Proceedings of the 2016 International Conference on Management of Data, pp. 795-806, 10.1145/2882903.2903730.
;Hung Bui
;Mohammad Ghavamzadeh
;Georgios Theocharous;S. Muthukrishnan
;et al., 2016, Graphical Model Sketch, Lecture notes in computer science, pp. 81-97, 10.1007/978-3-319-46128-1_6.
, 2016, All-Distances Sketches, Encyclopedia of Algorithms, pp. 59-64, 10.1007/978-3-642-27848-8_574-1.
, 2016, Min-Hash Sketches, Encyclopedia of Algorithms, pp. 1282-1287, 10.1007/978-1-4939-2864-4_573.
;C. Titus Brown
, 2016, Efficient cardinality estimation for k-mers in large DNA sequencing data sets, bioRxiv (Cold Spring Harbor Laboratory), 10.1101/056846, https://doi.org/10.1101/056846.
;Dominik Pająk
;Roger Wattenhofer
, 2016, Approximating the Size of a Radio Network in Beeping Model, Lecture notes in computer science, pp. 358-373, 10.1007/978-3-319-48314-6_23.
;Min Chen
;Qingjun Xiao
, 2016, Persistent Spread Measurement, Wireless networks, pp. 77-104, 10.1007/978-3-319-47340-6_4.
;Min Chen
;Qingjun Xiao
, 2016, Per-Flow Cardinality Measurement, Wireless networks, pp. 47-76, 10.1007/978-3-319-47340-6_3.
;Min Chen
;Qingjun Xiao
, 2016, Introduction, Wireless networks, pp. 1-9, 10.1007/978-3-319-47340-6_1.