Computer Science > Machine Learning

Title:Bounding Embeddings of VC Classes into Maximum Classes

Abstract: One of the earliest conjectures in computational learning theory-the Sample
Compression conjecture-asserts that concept classes (equivalently set systems)
admit compression schemes of size linear in their VC dimension. To-date this
statement is known to be true for maximum classes---those that possess maximum
cardinality for their VC dimension. The most promising approach to positively
resolving the conjecture is by embedding general VC classes into maximum
classes without super-linear increase to their VC dimensions, as such
embeddings would extend the known compression schemes to all VC classes. We
show that maximum classes can be characterised by a local-connectivity property
of the graph obtained by viewing the class as a cubical complex. This geometric
characterisation of maximum VC classes is applied to prove a negative embedding
result which demonstrates VC-d classes that cannot be embedded in any maximum
class of VC dimension lower than 2d. On the other hand, we show that every VC-d
class C embeds in a VC-(d+D) maximum class where D is the deficiency of C,
i.e., the difference between the cardinalities of a maximum VC-d class and of
C. For VC-2 classes in binary n-cubes for 4 <= n <= 6, we give best possible
results on embedding into maximum classes. For some special classes of Boolean
functions, relationships with maximum classes are investigated. Finally we give
a general recursive procedure for embedding VC-d classes into VC-(d+k) maximum
classes for smallest k.