I am trying to figure out a good approach/formula for counting valid subsets of S. A valid subset is one where the largest element of the subset is less than the sum of the rest of the subset.

The way to begin is to start by computing the sets by hand for n = 1, 2, etc. After a while you'll either see a pattern or realize that the combinatorics are too complicated for you to figure out.

The other thing to do as you go through these exercises is to develop some notation and start to gain familiarity with how the sets work.

So let's take n = 1.

We only have two possible subsets, [itex]\emptyset[/itex] and {1}.

[itex]\emptyset[/itex] has no largest element so there's nothing we can say about it. From now on we won't consider the empty set.

The largest element of {1} is 1. Let's denote this as L({1}) = 1. For any set, L(A) is the largest element of the set.

What's the sum of the remaining elements? Well the set of remaining elements other than 1 is empty. The empty sum is zero. Let's denote this as S({1}) = 0. In other words for any set A, S(A) is the sum of the elements in A excluding L(A).

Now there are three possibilities for any set A:

* S(A) < L(A). In this case we'll call A an "L-set," for "less than."

* S(A) = L(A). In this case we'll call A an "E-set," for "equal."

* S(A) > L(A). In this case we'll call A a "G-set," for "greater than." These are the ones you're interested in. My notation is backward from your description, my apologies. This seemed natural to me because if you arrange the elements in order, the largest is on the right and the sum of the rest of them is on the left. So S is to the left of L in my mind. You see that a key part of the process is developing your own mental model of the problem.
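The three cases above can be sketched as a small Python helper. The function name `classify` and the one-letter return labels are just my shorthand for the L/E/G notation; the sketch assumes A is a non-empty collection of distinct positive integers.

```python
def classify(subset):
    """Return 'L', 'E' or 'G' for a non-empty collection of distinct positive integers."""
    elems = sorted(subset)
    largest = elems[-1]          # L(A): the largest element
    rest_sum = sum(elems[:-1])   # S(A): sum of everything except L(A); the empty sum is 0
    if rest_sum < largest:
        return "L"
    if rest_sum == largest:
        return "E"
    return "G"


print(classify({1}))             # singleton: S = 0 < 1
print(classify({1, 2, 3}))       # S = 3 = L
print(classify({2, 3, 4}))       # S = 5 > 4
```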

With this notation, let's just bang out a bunch of examples and see what we see.

Also note that your problem has a constraint that makes it much easier than it could be. For each n we only have to consider subsets of {1, 2, 3, ..., n}. In other words, we don't have to deal with arbitrary subsets of the natural numbers such as {47, 100, 148}, which is an L-set because 47 + 100 = 147 < 148. For us, when n = 3, the only possible 3-element set is {1,2,3}, which happens to be an E-set. So your problem is a special case of a much harder problem.

Ok for n = 1 we already saw that we have an enumeration of the subsets:

{1}: L

For n = 2 we have:

{1} : L
{2} : L
{1, 2} : L

We now make a couple of handy observations. Every singleton is an L-set. And every 2-element set is an L-set. Handy to know, saves us some thinking as we go. You see the point of these warmup exercises is to start to become familiar with the problem space.

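So if G(n) is the number of G-sets for n, we have G(1) = G(2) = G(3) = 0 and G(4) = 2. For n = 3 the only set with more than two elements is {1,2,3}, an E-set. For n = 4 the two G-sets are {2,3,4} and {1,2,3,4}.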
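If you want to check the hand enumeration, here is a minimal sketch that tallies L-, E- and G-sets for small n. The names `classify` and `tally` are my own, and it simply walks through every non-empty subset of {1, ..., n}.

```python
from itertools import combinations


def classify(subset):
    """'L', 'E' or 'G' depending on how S(A) compares with L(A)."""
    elems = sorted(subset)
    largest, rest_sum = elems[-1], sum(elems[:-1])
    if rest_sum < largest:
        return "L"
    return "E" if rest_sum == largest else "G"


def tally(n):
    """Count the L-, E- and G-sets among the non-empty subsets of {1, ..., n}."""
    counts = {"L": 0, "E": 0, "G": 0}
    for size in range(1, n + 1):
        for combo in combinations(range(1, n + 1), size):
            counts[classify(combo)] += 1
    return counts


for n in range(1, 5):
    print(n, tally(n))
```

The "G" entry of each tally is G(n); the "E" entry shows how many sets sit exactly on the boundary.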

I'll leave the rest to you. Clearly n = 5 is tedious and n = 6 more tedious still. But at some point you might see the pattern by which G-sets are formed.

I hope I provided an answer to the question, How do we start a problem like this?

What we do is start computing the objects of interest by hand for n = 1, 2, ...

As we go, we are forced to confront some conceptual issues such as remembering that an empty sum is zero. We also develop some notation to serve as a shorthand for ideas that we're using over and over. And as we go, we start to develop some skill at manipulating the objects of interest. We know without having to think that singletons and 2-bangers are L-sets. We know that the smallest G-set is {2,3,4}, and so on.

I think that this is going to get combinatorially tricky as the numbers get larger. But this is how to get started. Roll up your sleeves and look at examples.

ps -- If you know a programming language, this is the kind of problem that would be straightforward to program. The benefit of that approach is that you can produce a table for G(n) for much larger values of n than you could do by hand. However there is also some virtue to writing out a lot of cases by hand so you develop intuition for the problem.
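As a rough sketch of what such a program might look like (brute force, so only practical for modest n; the function name `count_g_sets` is my own):

```python
from itertools import combinations


def count_g_sets(n):
    """Brute-force G(n): subsets of {1, ..., n} whose largest element is below the sum of the rest."""
    total = 0
    # Singletons and 2-element sets are always L-sets, so start at size 3.
    for size in range(3, n + 1):
        for combo in combinations(range(1, n + 1), size):
            # combinations() yields tuples in increasing order, so combo[-1] is the largest.
            if sum(combo[:-1]) > combo[-1]:
                total += 1
    return total


for n in range(1, 13):
    print(n, count_g_sets(n))
```

The running time roughly doubles with each increment of n, so this is a way to build a table for pattern hunting rather than a method for large n.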

On the subject of finding shortcuts, it may be useful to realize that, for N>3, the whole set S={1,2,3,...N} is a 'G-set' (thus 'valid'), since the largest element is N and the sum of the others is N(N-1)/2, and (N-1)/2 > 1 when N>3. On these conditions, any subset including N will be 'valid', as the sum of the others will not be higher than in the worst case, S itself.

I've tried most of these techniques already. In actuality, my set S is more like {1, 2, 3, 4, 6, 9, 13, 19, 28, 41}, where the first three terms are hardcoded and the rest follow a(n) = a(n-1) + a(n-3).
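For reference, a minimal sketch that builds this set from the recurrence and counts its valid subsets by brute force. The names `build_sequence` and `count_valid` are hypothetical, and the approach assumes the terms are distinct positive integers, so that sets of size 1 or 2 can be skipped.

```python
from itertools import combinations


def build_sequence(length, seed=(1, 2, 3)):
    """First `length` terms with a(n) = a(n-1) + a(n-3) after the hardcoded seed."""
    terms = list(seed)
    while len(terms) < length:
        terms.append(terms[-1] + terms[-3])
    return terms[:length]


def count_valid(values):
    """Count subsets whose largest element is less than the sum of the other elements."""
    values = sorted(values)
    total = 0
    for size in range(3, len(values) + 1):
        for combo in combinations(values, size):
            # values is sorted, so combo[-1] is the largest element of the subset.
            if sum(combo[:-1]) > combo[-1]:
                total += 1
    return total


s = build_sequence(10)
print(s)
print(count_valid(s))
```

With 10 terms there are only 1023 non-empty subsets, so brute force is instant here; it stops being practical once the set has a few dozen terms.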

Problem: When I try to even list out all possible sets by hand, I don't see any patterns, nor do I know how to take advantage of combinatorics. Every time I think I've found a pattern, it diverges. I am aware of the points stated so far in this thread though. I think my issue is understanding the combinatorics behind the counting.

Quoting the post above: "On these conditions, any subset including N will be 'valid', as the sum of the others will not be higher than in the worst case, S itself."

And that's precisely the reason why it is not true that any subset containing N will be valid!

For example, for [itex]N\geq 3[/itex], the set [itex]\{1,2,N\}[/itex] is not valid, since the sum of the others is only 1 + 2 = 3.
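A quick check of both statements, as a sketch: the whole set {1, ..., N} is valid for N > 3, while {1, 2, N} is not. The helper name `is_valid` is my own.

```python
def is_valid(subset):
    """True when the largest element is strictly less than the sum of the others."""
    elems = sorted(subset)
    return sum(elems[:-1]) > elems[-1]


for n in range(3, 9):
    whole = set(range(1, n + 1))
    # Columns: N, validity of {1, ..., N}, validity of {1, 2, N}
    print(n, is_valid(whole), is_valid({1, 2, n}))
```

So containing N is not enough on its own; what matters is whether the other chosen elements add up to more than N.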

Here's a general suggestion that may or may not help: You say that you have calculated your desired result for small values of N. Search for that sequence in the On-Line Encyclopedia of Integer Sequences. If you get a match, the information might be useful.

I have had that site open pretty much the entire time I've been working on this and nothing's been particularly helpful, unfortunately. None of the sequences match (or any of the subsequences I get when I try creative ways to partition the results).

I wonder (without really knowing what I'm talking about) if the problem can be approached from the other end: counting ways to partition an integer K as a sum of distinct terms, all smaller than your given N; and then adding all these counts from K=N+1 to N(N-1)/2.
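One way this idea could be sketched is a standard subset-sum table: `ways[k]` holds the number of ways to write k as a sum of distinct terms from {1, ..., N-1}, and a single pass fills in every k at once. The function names are my own and this is only an illustration of the counting idea, assuming S = {1, ..., N}.

```python
def count_with_largest(n):
    """Number of subsets of {1, ..., n-1} whose sum exceeds n (G-sets with largest element n)."""
    max_sum = n * (n - 1) // 2           # sum of 1 + 2 + ... + (n-1)
    ways = [0] * (max_sum + 1)
    ways[0] = 1                          # the empty sum
    for term in range(1, n):
        # Go downwards so each term is used at most once (distinct terms).
        for k in range(max_sum, term - 1, -1):
            ways[k] += ways[k - term]
    # Add the counts for K = n+1, ..., n(n-1)/2.
    return sum(ways[n + 1:])


def count_g_sets(n):
    """G(n): add up the G-sets by their largest element."""
    return sum(count_with_largest(m) for m in range(1, n + 1))


for n in range(1, 21):
    print(n, count_g_sets(n))
```

Each call to `count_with_largest` does on the order of N^3 additions instead of looking at 2^(N-1) subsets.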

Perhaps, but for very large N I am not sure that will work, because it would require calculating that for each and every K.

Plus, having a theoretical answer or a practical method are two very different things; your comment suggests that you are looking for the latter, but the former is a necessary step anyway.

P.S.: Please compare the output of the following Python program (Python is almost pseudo-code, so it should be "understandable"), to see if this is what you expect. (I need some time to explain to myself why this seems to work; my subconscious tends to work faster --though not always right-- than my rational part, I guess.)

At first I thought this program would run in linear time because, while the problem is subdivided in two and recursively solved, I thought the left branch of the tree would extinguish pretty quickly. When trying with larger numbers I see that this is not really true. It's still way faster than the brute-force approach of generating all subsets and counting them.

Here is some rationale as to why the above program appears to work. Consider, for example, the case N=5, where you want all subsets of {1,2,3,4} whose sum is larger than 5. I have listed all subsets below.

And notice that, by construction, all sums *after* the middle row are equal to 4 plus the corresponding sums *before* the middle row.

Therefore, the sums larger than 5 are:
- the sums after the middle (rows 9-15) larger than 5, which are as many as the sums before the middle (rows 1-7) that are larger than 5-4 = 1;
- plus 1 if the middle row were larger than 5, which it isn't; so nothing added here;
- plus the sums before the middle (rows 1-7) which are larger than 5.
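To make the three bullet points concrete, here is a sketch that rebuilds a 15-row listing for {1,2,3,4} and checks the counts. The ordering is an assumption on my part: rows for {1,2,3} first, then {4} alone as the middle row, then each earlier row with 4 added. The name `rows` is hypothetical.

```python
def rows(m):
    """Non-empty subsets of {1, ..., m}: rows for m-1, then [m], then each earlier row plus m."""
    if m == 0:
        return []
    before = rows(m - 1)
    return before + [[m]] + [r + [m] for r in before]


table = rows(4)
for i, r in enumerate(table, start=1):
    print(i, r, sum(r))

before = [sum(r) for r in table[:7]]     # rows 1-7
middle = sum(table[7])                   # row 8, the set {4}
after = [sum(r) for r in table[8:]]      # rows 9-15

# Sums after the middle larger than 5 match sums before the middle larger than 5 - 4 = 1.
count_after = sum(1 for s in before if s > 1)
count_middle = 1 if middle > 5 else 0
count_before = sum(1 for s in before if s > 5)
print(count_after, count_middle, count_before, count_after + count_middle + count_before)
```

The total should agree with a direct count of the rows whose sum exceeds 5.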

So we are referring now *only* to the sums in the first rows (1-7).

These sums before the middle can be figured out recursively, as they follow the same pattern:
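A minimal sketch of a recursion that matches this rationale. It is my own reconstruction of the idea, not necessarily the program referred to earlier, and the name `count_above` and the use of memoization are assumptions.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_above(m, t):
    """Number of non-empty subsets of {1, ..., m} whose sum is larger than t."""
    if m == 0:
        return 0
    if t >= m * (m + 1) // 2:
        return 0                          # even 1 + 2 + ... + m cannot exceed t
    after = count_above(m - 1, t - m)     # rows after the middle: m plus an earlier row
    middle = 1 if m > t else 0            # the middle row, {m} on its own
    before = count_above(m - 1, t)        # rows before the middle: subsets without m
    return after + middle + before


print(count_above(4, 5))                  # the N = 5 case discussed above

# G-sets of {1, ..., n}, grouped by largest element m: subsets of {1, ..., m-1} with sum > m.
for n in range(1, 21):
    print(n, sum(count_above(m - 1, m) for m in range(2, n + 1)))
```

The cache keeps repeated (m, t) pairs from being recomputed, which is what stops the two-way branching from blowing up for moderate n.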