For any class of binary functions on [n]={1, ..., n} a classical result by Sauer states a sufficient condition for its VC-dimension to be at least d: its cardinality should be at least O(nd-1). A necessary condition is that its cardinality be at least 2d (which is O(1) with respect to n). How does the size of a 'typical' class of VC-dimension d compare to these two extreme thresholds ? To answer this, we consider classes generated randomly by two methods, repeated biased coin flips on the n-dimensional hypercube or uniform sampling over the space of all possible classes of cardinality k on [n]. As it turns out, the typical behavior of such classes is much more similar to the necessary condition; the cardinality k need only be larger than a threshold of 2d for its VC-dimension to be at least d with high probability. If its expected size is greater than a threshold of O(&log;n) (which is still significantly smaller than the sufficient size of O(nd-1)) then it shatters every set of size d with high probability. The behavior in the neighborhood of these thresholds is described by the asymptotic probability distribution of the VC-dimension and of the largest d such that all sets of size d are shattered.

Source : oai:HAL:hal-00853123v1

Volume: Vol. 10 no. 1

Section: Combinatorics

Published on: January 1, 2008

Submitted on: March 26, 2015

Keywords: [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST],[STAT.TH] Statistics [stat]/Statistics Theory [stat.TH]

This page has been seen 368 times.

This article's PDF has been downloaded 341 times.