Mathematics for the interested outsider

The Inclusion-Exclusion Principle

In combinatorics we have a method of determining the cardinality of a union in terms of the cardinalities of the sets in question and of their various intersections: the Inclusion–Exclusion Principle. Predictably enough, this formula is reflected in the subspaces of a vector space. We could argue directly in terms of bases, but it’s much more interesting to use the linear algebra we have at hand.

Let’s start with just two subspaces $U$ and $V$ of some larger vector space. We’ll never really need that space, so we don’t need to give it a name. The thing to remember is that $U$ and $V$ might have a nontrivial intersection; their sum may not be direct.

Now we consider the sum $U+V$ of the subspaces. Every vector in $U+V$ is the sum of a vector in $U$ and a vector in $V$. This tells us that there’s a linear map $S\colon U\oplus V\to U+V$ defined by $S(u,v)=u+v$. Note carefully that this is linear as a function of the pair $(u,v)$. It’s not bilinear (linear in each of $u$ and $v$ separately), which would mean bringing in the tensor product.

This function is always surjective, but it may fail to be injective. Specifically, if $U$ and $V$ have a nontrivial intersection then it gives us a nontrivial kernel: if $w\in U\cap V$ then $S(w,-w)=0$. But rather than talk about elements, let’s use the intersection to parametrize the kernel! We’ve got another linear map $T\colon U\cap V\to U\oplus V$, defined by $T(w)=(w,-w)$.
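To see this parametrization in coordinates, here’s a small numerical sketch (the particular subspaces of $\mathbb{R}^4$ are my own choice for illustration): the map $S$ becomes the block matrix $[B_U\,|\,B_V]$, its rank is $\dim(U+V)$, and by rank-nullity its kernel has dimension $\dim U+\dim V-\dim(U+V)$, which should match $\dim(U\cap V)$.

```python
import numpy as np

# Illustrative choice: U = span{e1, e2} and V = span{e2, e3} inside R^4,
# so U ∩ V = span{e2} is one-dimensional.
B_U = np.array([[1, 0], [0, 1], [0, 0], [0, 0]], dtype=float)
B_V = np.array([[0, 0], [1, 0], [0, 1], [0, 0]], dtype=float)

# S(u, v) = u + v is represented by the block matrix [B_U | B_V].
S = np.hstack([B_U, B_V])
dim_sum = np.linalg.matrix_rank(S)   # rank of S is dim(U + V)
nullity = S.shape[1] - dim_sum       # dim(ker S), by rank-nullity

print(dim_sum, nullity)  # 3 1 -- the kernel dimension matches dim(U ∩ V) = 1
```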

Now it’s clear that $T$ has a trivial kernel. On the other hand, its image is precisely the set of pairs that $S$ kills off. That means we’ve got a short exact sequence

$0\to U\cap V\xrightarrow{T}U\oplus V\xrightarrow{S}U+V\to 0$

An exact sequence has Euler characteristic zero, so the alternating sum of the dimensions vanishes. Juggling the terms, we find

$\dim(U+V)=\dim(U)+\dim(V)-\dim(U\cap V)$

What about three subspaces $U$, $V$, and $W$? Again we have a surjection

$S\colon U\oplus V\oplus W\to U+V+W$

defined by $S(u,v,w)=u+v+w$. Then the kernel contains triples like $(x,-x,0)$, $(x,0,-x)$, and $(0,x,-x)$. We use the pairwise intersections to cover the kernel by the map

$T\colon(U\cap V)\oplus(U\cap W)\oplus(V\cap W)\to U\oplus V\oplus W$

defined by $T(x,y,z)=(x+y,\,-x+z,\,-y-z)$. The image of $T$ is exactly the kernel of $S$, but this time it has a nontrivial kernel itself! Now we need another map

$R\colon U\cap V\cap W\to(U\cap V)\oplus(U\cap W)\oplus(V\cap W)$

defined by $R(x)=(x,-x,x)$. Now $R$ is injective, and its image is exactly the kernel of $T$. We stick these maps all together to get the long exact sequence

$0\to U\cap V\cap W\xrightarrow{R}(U\cap V)\oplus(U\cap W)\oplus(V\cap W)\xrightarrow{T}U\oplus V\oplus W\xrightarrow{S}U+V+W\to 0$

Taking the Euler characteristic and juggling again we find

$\dim(U+V+W)=\dim(U)+\dim(V)+\dim(W)-\dim(U\cap V)-\dim(U\cap W)-\dim(V\cap W)+\dim(U\cap V\cap W)$

We include the dimensions of the individual subspaces, exclude the dimensions of the pairwise intersections, and include back in the dimension of the triple intersection.
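As a small numerical sanity check (a numpy sketch of my own; the subspaces are the coordinate planes of $\mathbb{R}^3$, chosen so that every intersection is easy to read off), we can compare the two sides of this formula:

```python
import numpy as np

def dim_span(*mats):
    """Dimension of the sum of the subspaces spanned by the columns of mats."""
    return np.linalg.matrix_rank(np.hstack(mats))

def dim_meet(A, B):
    """dim(A ∩ B), via the two-subspace formula dim A + dim B - dim(A + B)."""
    return dim_span(A) + dim_span(B) - dim_span(A, B)

e1, e2, e3 = np.eye(3)
U = np.column_stack([e1, e2])   # the xy-plane
V = np.column_stack([e2, e3])   # the yz-plane
W = np.column_stack([e1, e3])   # the xz-plane

lhs = dim_span(U, V, W)  # dim(U + V + W)
rhs = (dim_span(U) + dim_span(V) + dim_span(W)
       - dim_meet(U, V) - dim_meet(U, W) - dim_meet(V, W)
       + 0)  # the triple intersection is {0} by construction here

print(lhs, rhs)  # 3 3
```

Note that `dim_meet` leans on the two-subspace formula, which the short exact sequence above established; only the pairwise intersections are computed this way.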

For larger numbers of subspaces we can construct longer and longer exact sequences. The $k$th term consists of the direct sum of the intersections of the subspaces, taken $k$ at a time. The $k$th map consists of including each $k$-fold intersection into each of the $(k-1)$-fold intersections it lies in, with positive and negative signs judiciously sprinkled throughout to make the sequence exact.

The result is that the dimension of a sum of subspaces is the alternating sum of the dimensions of the $k$-fold intersections. First add the dimension of each subspace, then subtract the dimension of each pairwise intersection, then add back in the dimension of each triple intersection, and so on.

From here it’s the work of a moment to derive the combinatorial inclusion-exclusion principle. Given a collection of sets, just construct the free vector space on each of them. These free vector spaces will have nontrivial intersections corresponding to the nonempty intersections of the sets, and we’ve got canonical basis elements floating around to work with. Then all these dimensions we’re talking about are cardinalities of various subsets of the sets we started with, and the combinatorial inclusion-exclusion principle follows. Of course, as I said before we could have proved it from the other side, but that would require a lot of messy hands-on work with bases and such.
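The combinatorial statement itself is easy to test directly; here’s a short plain-Python sketch (the particular sets are my own choosing) of the alternating sum over $k$-fold intersections:

```python
from itertools import combinations

def inclusion_exclusion(sets):
    """|A1 ∪ ... ∪ An| as the alternating sum over k-fold intersections."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (k + 1)  # include odd-fold, exclude even-fold
        for combo in combinations(sets, k):
            total += sign * len(set.intersection(*combo))
    return total

A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {1, 4, 5, 6}
print(inclusion_exclusion([A, B, C]), len(A | B | C))  # 6 6
```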

The upshot is that the combinatorial inclusion-exclusion principle is equivalent to the statement that exact sequences of vector spaces have trivial Euler characteristic! This little formula that we teach in primary or secondary school turns out to be intimately connected with one of the fundamental tools of homological algebra and the abstract approach to linear algebra. Neat!

What is also good about this observation is that this form of inclusion-exclusion holds in lattices more general than distributive lattices (which were the types of lattice emphasized in the post you linked to) — subspace lattices satisfy the weaker property of modularity.

Sorry, I don’t know exact sequences, but it seems that either you have a major error or I completely misunderstand your post.

$\dim(U+V+W)=\dim(U)+\dim(V)+\dim(W)-\dim(U\cap V)-\dim(U\cap W)-\dim(W\cap V)+\dim(U\cap V\cap W)$

Consider $\mathbb{R}^2$ as a vector space over $\mathbb{R}$ and let:
$U=\{(x,y):x=0\}$
$V=\{(x,y):y=0\}$
$W=\{(x,y):x=y\}$

These are clearly subspaces. $U+V=\mathbb{R}^2$ and their intersection is $\{0\}$; the same holds for $U,W$ and for $V,W$.

So the left side is $\dim(U+V+W)=\dim(\mathbb{R}^2)=2$, while the right side is $\dim(U)+\dim(V)+\dim(W)=3$, because all the intersections vanish.

The problem is that in general $(U+V)\cap W$ is not equal to $(U\cap W)+(V\cap W)$; the three subspaces above are a counterexample. My class was once given the task of determining whether inclusion-exclusion is valid for $n$ vector subspaces, and of course everybody gave a “proof” of it. :)
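The commenter’s counterexample is easy to confirm numerically; a minimal numpy sketch:

```python
import numpy as np

# The three lines in R^2 from the comment above (each column spans a subspace).
U = np.array([[0.0], [1.0]])   # the line x = 0
V = np.array([[1.0], [0.0]])   # the line y = 0
W = np.array([[1.0], [1.0]])   # the line x = y

# dim(U + V + W) is the rank of the matrix whose columns span all three lines.
lhs = np.linalg.matrix_rank(np.column_stack([U, V, W]))

# All pairwise intersections (and the triple one) are {0}, so the would-be
# inclusion-exclusion right-hand side is just the sum of the three dimensions.
rhs = sum(np.linalg.matrix_rank(M) for M in (U, V, W))

print(lhs, rhs)  # 2 3 -- the two sides disagree, as claimed
```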

Best, –k.g.

(Feel free to add TeX or correct my English; don’t know how to use it)

You learned the inclusion-exclusion principle in primary school? I don’t think anybody I know who didn’t do math contests in high school has the faintest idea what it is.

Anyway, I’ve always found the use of the free vector space construction in combinatorics extremely mysterious. For example, in poset theory one can use certain linear operators to show that certain graded posets are rank-unimodal and Sperner, and some of the consequences of this method don’t have known “direct” proofs. So does the usefulness of linear algebra in combinatorics ultimately stem from categorical and “topological” properties of the category of vector spaces? What makes this category unique for combinatorial purposes?


I’m working on a research project where I want to use cycles in the homology of this kind of chain complex to show that two finite families of subspaces of vector spaces are not the same. I cannot find a name for this chain complex construction or a reference for its construction. Can you tell me a reference, or at least a name that I can type into google?
