Discrete Mathematics & Theoretical Computer Science |
For any class of binary functions on [n]={1, ..., n} a classical result by Sauer states a sufficient condition for its VC-dimension to be at least d: its cardinality should be at least O(nd-1). A necessary condition is that its cardinality be at least 2d (which is O(1) with respect to n). How does the size of a 'typical' class of VC-dimension d compare to these two extreme thresholds ? To answer this, we consider classes generated randomly by two methods, repeated biased coin flips on the n-dimensional hypercube or uniform sampling over the space of all possible classes of cardinality k on [n]. As it turns out, the typical behavior of such classes is much more similar to the necessary condition; the cardinality k need only be larger than a threshold of 2d for its VC-dimension to be at least d with high probability. If its expected size is greater than a threshold of O(&log;n) (which is still significantly smaller than the sufficient size of O(nd-1)) then it shatters every set of size d with high probability. The behavior in the neighborhood of these thresholds is described by the asymptotic probability distribution of the VC-dimension and of the largest d such that all sets of size d are shattered.