[Solved]: Vapnik-Chervonenkis Dimension: why cannot four points on a line be shattered by rectangles?

Problem Detail: So I’m reading “Introduction to Machine Learning” 2nd edition, by Bishop, et. all. On page 27 they discuss the Vapnik-Chervonenkis Dimension which is,

“The maximum number of points that can be shattered by H [the hypothesis class] is called the Vapnik-Chervonenkis (VC) Dimension of H, is denoted VC(H) and measures the capacity of H.”

Whereas “shatters” indicates a hypothesis $h in H$ for a set of N data points such that it separates the positive examples from the negative. In such an example it is said that “H shatters N points”. So far I think I understand this. However, the authors lose me with the following:

“For example, four points on a line cannot be shattered by rectangles.”

There must be some concept here I’m not fully understanding, because I cannot understand why this is the case. Can anyone explain this to me?

Asked By : BrotherJack

Answered By : JeffE

The definition of “a set $P$ can be shattered by rectangles” is that for every subset of $P$, there is a rectangle that contains precisely that subset and excludes the rest of $P$. Equivalently, every labeling of the points as positive and negative is consistent with at least one hypothesis in $H$. Now consider four points $p,q,r,s$ along a line in the plane. Since there is no rectangle that contains $p$ and $r$ but excludes $q$ and $s$, these four points cannot be shattered by rectangles.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/1917 Ask a Question Download Related Notes/Documents