I'm developing a library intended for public release. It contains various methods for operating on sets of objects - generating, inspecting, partitioning and projecting the sets into new forms. In case it's relevant, it's a C# class library containing LINQ-style extensions on IEnumerable, to be released as a NuGet package.

Some of the methods in this library can be given unsatisfiable input parameters. For example, in the combinatoric methods, there is a method to generate all sets of n items that can be constructed from a source set of m items. For example, given the set:

1, 2, 3, 4, 5

and asking for combinations of 2 would produce:

1, 2
1, 3
1, 4
etc...
5, 3
5, 4

Now, it's obviously possible to ask for something that can't be done, like giving it a set of 3 items and then asking for combinations of 4 items while setting the option that says it can only use each item once.

In this scenario, each parameter is individually valid:

The source collection is not null, and does contain items

The requested size of combinations is a positive nonzero integer

The requested mode (use each item only once) is a valid choice

However, the state of the parameters when taken together causes problems.

In this scenario, would you expect the method to throw an exception (eg. InvalidOperationException), or return an empty collection? Either seems valid to me:

You can't produce combinations of n items from a set of m items where n > m if you're only allowed to use each item once, so this operation can be deemed impossible, hence InvalidOperationException.

The set of combinations of size n that can be produced from m items when n > m is an empty set; no combinations can be produced.

The argument for an empty set

My first concern is that an exception prevents idiomatic LINQ-style chaining of methods when you're dealing with datasets that may have unknown size. In other words, you might want to do something like this:

var result = someInputSet
    .CombinationsOf(4, CombinationsGenerationMode.Distinct)
    .Select(combo => /* do some operation to a combination */)
    .ToList();

If your input set is of variable size, this code's behaviour is unpredictable. If .CombinationsOf() throws an exception when someInputSet has fewer than 4 elements, then this code will sometimes fail at runtime without some pre-checking. In the above example this checking is trivial, but if you're calling it halfway down a longer chain of LINQ then this might get tedious. If it returns an empty set, then result will be empty, which you may be perfectly happy with.
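
To make the empty-set behaviour concrete, here is a minimal sketch of a combinations generator that naturally yields nothing when n > m. It is not the library's actual implementation: the class name, the `Pick` helper and the single-argument `CombinationsOf` overload (which always uses each item at most once) are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CombinationsSketch
{
    // Hypothetical stand-in for the library method; each item is used at most once.
    public static IEnumerable<T[]> CombinationsOf<T>(this IEnumerable<T> source, int size)
    {
        // Individually invalid arguments are still rejected.
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        return Pick(source.ToArray(), size, 0);
    }

    private static IEnumerable<T[]> Pick<T>(T[] items, int size, int start)
    {
        if (size == 0)
        {
            yield return new T[0];
            yield break;
        }
        // If fewer than `size` items remain, the loop body never runs,
        // so an unsatisfiable request produces an empty sequence.
        for (int i = start; i <= items.Length - size; i++)
        {
            foreach (T[] tail in Pick(items, size - 1, i + 1))
            {
                yield return new[] { items[i] }.Concat(tail).ToArray();
            }
        }
    }
}
```

With this sketch, `new[] { 1, 2, 3 }.CombinationsOf(4).ToList()` is an empty list, so a longer chain keeps flowing without any pre-check on the input size.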

The argument for an exception

My second concern is that returning an empty set might hide problems - if you're calling this method halfway down a chain of LINQ and it quietly returns an empty set, then you may run into issues some steps later, or find yourself with an empty result set, and it may not be obvious how that happened given that you definitely had something in the input set.

Since the empty set is mathematically correct, chances are that when you get it it actually is what you want. Mathematical definitions and conventions are generally chosen for consistency and convenience so that things just work out with them.
– asmeurer Nov 30 '16 at 17:00

@asmeurer They're chosen so that theorems are consistent and convenient. They're not chosen to make programming easier. (That is sometimes a side benefit, but sometimes they make programming harder, too.)
– jpmc26 Nov 30 '16 at 17:16

@jpmc26 "They're chosen so that theorems are consistent and convenient" - making sure that your program always works as expected is essentially equivalent to proving a theorem.
– artem Nov 30 '16 at 20:58

@jpmc26 I don't get why you mentioned functional programming. Proving correctness for imperative programs is quite possible, always advantageous, and can be done informally as well - just think a little bit and use common mathematical sense when you write your program, and you will spend less time testing. Statistically proven on sample of one ;-)
– artem Nov 30 '16 at 22:22

@DmitryGrigoryev 1/0 as undefined is much more mathematically correct than 1/0 as infinity.
– asmeurer Dec 1 '16 at 17:16

Mathematically, yes, but it's also very likely a source of error. If this type of input is expected, it might be better to require that the user catch the exception in a general purpose library.
– Casey Kuball Nov 30 '16 at 16:24

I disagree that it's a "very likely" cause of error. Suppose, for example, you were implementing a naive "matchmaking" algorithm from a largish input set. You might ask for all combinations of two items, find the single "best" match among them, and then remove both elements and start over with the new, smaller set. Eventually your set will be empty, and there will be no pairs left to consider: an exception at this point is obstructive. I think there are a lot of ways to end up in a similar situation.
– amalloy Nov 30 '16 at 17:09

Actually you do not know if this is an error in the eyes of the user. The empty set is something the user should check for, if necessary. if(result.Any()) DoSomething(result.First()); else DoSomethingElse(); is much better than try{result.First().DoSomething();}catch{DoSomethingElse();}
– Guran Dec 1 '16 at 9:25

@TripeHound That's what tipped the decision in the end: requiring the developer using this method to check, and then throw, has a much smaller impact than throwing an exception that they don't want, in terms of development effort, program performance, and simplicity of program flow.
– anaximander Dec 1 '16 at 14:12

@Darthfett Try comparing this to an existing LINQ extension method: Where(). If I have a clause: Where(x => 1 == 2) I don't get an exception, I get an empty set.
– Necoras Dec 1 '16 at 18:25

So I'd say: if all your parameters validate but the result is an empty set, return an empty set. You're not the only one doing it.

As said by @Bakuriu in the comments, the same holds for an SQL query like SELECT <columns> FROM <table> WHERE <conditions>. As long as <columns>, <table> and <conditions> are well formed and refer to existing names, you can build a set of conditions that exclude each other. The resulting query would just yield no rows instead of throwing an InvalidConditionsError.

Downvoted because it's not idiomatic in the C#/Linq language space (see sbecker's answer for how similar out of bounds problems are handled in the language). If this were a Python question it'd be a +1 from me though.
– James Snell Nov 30 '16 at 14:15

@JamesSnell I barely see how it relates to the out-of-bounds problem. We're not picking elements by index to reorder them, we're picking n elements in a collection to count the number of ways we can pick them. If at some point there are no elements left while picking them, then there is no (0) way of picking n elements from said collection.
– Mathias Ettinger Nov 30 '16 at 14:27

@MathiasEttinger Python's guidelines for how and when to use exceptions are very different to C#'s, so using behaviour from Python modules as a guideline is not a good metric for C#.
– Ant P Dec 1 '16 at 13:26

Instead of picking Python we could just pick SQL. Does a well-formed SELECT query over a table produce an exception when the result set is empty? (spoiler: no!). The "well-formedness" of a query depends only on the query itself and some metadata (e.g. does the table exist and have this field (with this type)?). Once the query is well-formed and you execute it, you expect either a (possibly empty) valid result or an exception because the "server" had a problem (e.g. db server went down or something).
– Bakuriu Dec 5 '16 at 8:56

+1, 0 is a perfectly valid, and indeed extremely important number. There are so many cases where a query might legitimately return zero hits that raising an exception would be highly annoying.
– Kilian Foth Nov 30 '16 at 13:07

I'm still none the wiser as to if this answer suggests the OP should throw or return an empty set...
– James Snell Nov 30 '16 at 14:19

Your answer says I could do either, depending on whether there is an error, but you don't mention whether you would consider the example scenario to be an error. In the comment above you say I should return an empty set "if there are no problems", but again, unsatisfiable conditions could be considered a problem, and you go on to say that I am not raising exceptions when I should. So which should I do?
– anaximander Nov 30 '16 at 14:21

Ah, I see - you may have misunderstood; I'm not chaining anything, I'm writing a method that I expect to be used in a chain. I'm wondering whether the hypothetical future developer who uses this method would want it to throw in the above scenario.
– anaximander Nov 30 '16 at 16:03

You are dealing with mathematical operations, so it might be good advice to stick with the mathematical definitions. From a mathematical standpoint, the number of r-sets of an n-set (i.e. nCr) is well defined for all r > n >= 0: it is zero. Therefore returning an empty set would be the expected case from a mathematical standpoint.
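
As a quick check of that claim, the standard product formula for the binomial coefficient can be written out. For integers r > n >= 0 the numerator runs through r consecutive integers from n down to n - r + 1, and since n - r + 1 <= 0 one of those factors is exactly 0:

```latex
\binom{n}{r} \;=\; \frac{n\,(n-1)\cdots(n-r+1)}{r!}
\qquad\Longrightarrow\qquad
\binom{n}{r} = 0 \quad \text{for } r > n \ge 0 .
```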

This is a good point. If the library were higher level, like picking combinations of colors to make a palette -- then it makes sense to throw an error. Because you know a palette with no colors isn't a palette. But a set with no entries is still a set and the math does define it as equaling an empty set.
– Captain Man Dec 1 '16 at 16:23

I find a good way of determining whether to use an exception, is to imagine people being involved in the transaction.

Taking fetching the contents of a file as an example:

Please fetch me the contents of file, "doesn't exist.txt"

a. "Here's the contents: an empty collection of characters"

b. "Erm, there's a problem, that file doesn't exist. I don't know what to do!"

Please fetch me the contents of file, "exists but is empty.txt"

a. "Here's the contents: an empty collection of characters"

b. "Erm, there's a problem, there's nothing in this file. I don't know what to do!"

No doubt some will disagree, but to most folk, "Erm, there's a problem" makes sense when the file doesn't exist, and "an empty collection of characters" makes sense when the file is empty.

So applying the same approach to your example:

Please give me all combinations of 4 items for {1, 2, 3}

a. There aren't any, here's an empty set.

b. There's a problem, I don't know what to do.

Again, "There's a problem" would make sense if eg null were offered as the set of items, but "here's an empty set" seems a sensible response to the above request.

If returning an empty value masks a problem (eg a missing file, a null), then an exception generally should be used instead (unless your chosen language supports option/maybe types, then they sometimes make more sense). Otherwise, returning an empty value will likely simplify the cost and complies better with the principle of least astonishment.

This advice is good, but does not apply to this case. A better example: How many days after the 30:th of january?: 1 How many days after the 30:th of february?: 0 How many days after the 30:th of madeupuary?: Exception How many working hours on the 30:th of february: Exception
– Guran Dec 1 '16 at 9:35

As it's for a general purpose library my instinct would be Let the end user choose.

Much like we have Parse() and TryParse() available to us we can have the option of which we use depending on what output we need from the function. You'd spend less time writing and maintaining a function wrapper to throw the exception than arguing over choosing a single version of the function.
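
A rough sketch of what such a pair could look like is below, in the spirit of Parse()/TryParse(): a lenient method, plus a thin wrapper that throws. The method names, the choice of InvalidOperationException and the `Pick` helper are all hypothetical and only serve to illustrate the idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LenientAndStrictSketch
{
    // Lenient variant: an unsatisfiable request simply produces no combinations.
    public static IEnumerable<T[]> CombinationsOf<T>(this IEnumerable<T> source, int size)
    {
        return Pick(source.ToArray(), size, 0);
    }

    // Strict variant: a thin wrapper that rejects the unsatisfiable case up front.
    public static IEnumerable<T[]> CombinationsOfOrThrow<T>(this IEnumerable<T> source, int size)
    {
        T[] items = source.ToArray();
        if (size > items.Length)
        {
            throw new InvalidOperationException(
                $"Cannot pick {size} distinct items from a set of {items.Length}.");
        }
        return items.CombinationsOf(size);
    }

    // Hypothetical core generator; each item is used at most once.
    private static IEnumerable<T[]> Pick<T>(T[] items, int size, int start)
    {
        if (size == 0)
        {
            yield return new T[0];
            yield break;
        }
        for (int i = start; i <= items.Length - size; i++)
        {
            foreach (T[] tail in Pick(items, size - 1, i + 1))
            {
                yield return new[] { items[i] }.Concat(tail).ToArray();
            }
        }
    }
}
```

The wrapper is only a few lines long, which is the point being made: offering both variants costs little, and the caller picks whichever failure mode suits them.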

+1 because I always like the idea of choice and because this pattern has a tendency to encourage programmers to validate their own input. The mere existence of a TryFoo function makes it obvious that certain input combinations can cause problems and if the documentation explains what those potential problems are, the coder can check for them and handle invalid inputs in a cleaner fashion than they could by just catching an exception or responding to an empty set. Or they can be lazy. Either way, the decision is theirs.
– aleppke Nov 30 '16 at 15:45

No, an empty set is the mathematically correct answer. Doing something else is redefining firmly agreed-upon mathematics, which shouldn't be done on a whim. While we're at it, shall we redefine the exponentiation function to throw an error if the exponent is equal to zero, because we decide we don't like the convention that any number raised to the "zeroth" power is equal to 1?
– Wildcard Nov 30 '16 at 17:51

I was about to answer something similar. The Linq extension methods Single() and SingleOrDefault() immediately sprung to mind. Single throws an exception if there is zero or > 1 result, while SingleOrDefault won't throw, and instead returns default(T). Perhaps OP could use CombinationsOf() and CombinationsOfOrThrow().
– RubberDuck Dec 1 '16 at 10:59

@Wildcard - you're talking about two different results to the same input which is not the same thing. The OP is only interested in choosing a) the result or b) throwing an exception which is not a result.
– James Snell Dec 1 '16 at 23:08

Single and SingleOrDefault (or First and FirstOrDefault) are a completely different story, @RubberDuck. Picking up my example from my previous comment: it's perfectly fine to answer none to the query "What even primes greater than two exist?", since this is a mathematically sound and sensible answer. Anyway, if you are asking "What is the first even prime greater than two?" there is no natural answer. You just cannot answer the question, because you are asking for a single number. It's not none (it's not a single number, but a set), not 0. Hence we throw an exception.
– Paul Kertscher Dec 2 '16 at 11:58

You need to validate the arguments provided when your function is called. And in fact, what you are asking is how to handle invalid arguments.
The fact that multiple arguments depend on each other doesn't change the fact that you validate the arguments.

Thus I would vote for the ArgumentException providing the necessary information for the user to understand what went wrong.

As an example, check the
public static TSource ElementAt<TSource>(this IEnumerable<TSource>, Int32) function in Linq, which throws an ArgumentOutOfRangeException if the index is less than 0 or greater than or equal to the number of elements in source. Thus the index is validated in regards to the enumerable provided by the caller.
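
The ElementAt behaviour described above can be observed with a small helper. The class and method names here are made up for the demonstration; only `Enumerable.ElementAt` itself is the real LINQ method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementAtSketch
{
    // Returns true when ElementAt rejects the index with ArgumentOutOfRangeException.
    public static bool ElementAtThrows<T>(IEnumerable<T> source, int index)
    {
        try
        {
            source.ElementAt(index);
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }
}
```

Here `ElementAtSketch.ElementAtThrows(new List<int> { 1, 2, 3 }, 4)` is true while the same call with index 1 is false: the list and the index 4 are each fine on their own, and it is only the combination that is rejected.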

+1 for citing an example of a similar situation in the language.
– James Snell Nov 30 '16 at 14:21

Oddly, your example got me thinking, and led me to something that I think makes a strong counter-example. As I noted, this method is designed to be LINQ-like, so that users can chain it together with other LINQ methods. If you do new[] { 1, 2, 3 }.Skip(4).ToList(); you get an empty set, which makes me think that returning an empty set is perhaps the better option here.
– anaximander Nov 30 '16 at 14:30

This is not a language semantics issue, the semantics of the problem domain already provides the correct answer, that is to return the empty set. Doing anything else violates the principle of least astonishment for someone working in the combinatoric problem domain.
– Ukko Nov 30 '16 at 15:48

I'm starting to understand the arguments for returning an empty set. Why it would make sense. At least for this particular example.
– sbecker Dec 1 '16 at 6:36

@sbecker, to bring the point home: If the question were "How many combinations are there of ..." then "zero" would be a completely valid answer. By the same token, "What are the combinations such that ..." has the empty set as a completely valid answer. If the question were, "What is the first combination such that ...", then and only then would an exception be appropriate, since the question is not answerable. See also Paul K's comment.
– Wildcard Dec 2 '16 at 18:34

You should do one of the following (though continuing to consistently throw on basic problems such as a negative number of combinations):

Provide two implementations, one that returns an empty set when the inputs together are nonsensical, and one that throws. Try calling them CombinationsOf and CombinationsOfWithInputCheck. Or whatever you like. You can reverse this so the input-checking one is the shorter name and the list one is CombinationsOfAllowInconsistentParameters.

Please note that the private method being different from the public ones is required for the throwing or action behavior to occur when the linq chain is created instead of some time later when it is enumerated. You want it to throw right away.
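
The public/private split matters because of how C# iterator methods work: a method containing `yield` runs none of its body until enumeration starts. The sketch below illustrates that general behaviour with a deliberately trivial sequence; the names are hypothetical and it is not the combinations code itself.

```csharp
using System;
using System.Collections.Generic;

public static class EagerValidationSketch
{
    // Iterator method: nothing in the body runs until enumeration starts,
    // so the argument check (and its throw) is deferred.
    public static IEnumerable<int> CountDeferred(int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        for (int i = 0; i < count; i++) yield return i;
    }

    // Plain method: validates immediately, then hands off to a private iterator.
    public static IEnumerable<int> CountEager(int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        return CountIterator(count);
    }

    private static IEnumerable<int> CountIterator(int count)
    {
        for (int i = 0; i < count; i++) yield return i;
    }
}
```

Calling `CountDeferred(-1)` returns without complaint and only throws once something enumerates the result, whereas `CountEager(-1)` throws at the call site, where the mistake is easiest to trace.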

Note, however, that of course it has to enumerate at least the first item in order to determine if there are any items. This is a potential drawback that I think is mostly mitigated by the fact that any future viewers can quite easily reason that a ThrowIfEmpty method has to enumerate at least one item, so should not be surprised by it doing so. But you never know. You could make this more explicit ThrowIfEmptyByEnumeratingAndReEmittingFirstItem. But that seems like gigantic overkill.
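
One possible shape for a ThrowIfEmpty method is sketched below. This is an assumption about how it might be written rather than a reproduction of any particular implementation, and the exception type and message are illustrative choices.

```csharp
using System;
using System.Collections.Generic;

public static class ThrowIfEmptySketch
{
    // Checks for emptiness right away by pulling the first item,
    // then re-emits that item followed by the rest of the sequence.
    public static IEnumerable<T> ThrowIfEmpty<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        IEnumerator<T> enumerator = source.GetEnumerator();
        bool hasFirst;
        try
        {
            hasFirst = enumerator.MoveNext();
        }
        catch
        {
            enumerator.Dispose();
            throw;
        }

        if (!hasFirst)
        {
            enumerator.Dispose();
            throw new InvalidOperationException("Sequence contains no elements.");
        }
        return ReEmit(enumerator);
    }

    // Caveat of this sketch: the result wraps a live enumerator,
    // so it is only safe to enumerate it once.
    private static IEnumerable<T> ReEmit<T>(IEnumerator<T> enumerator)
    {
        using (enumerator)
        {
            do
            {
                yield return enumerator.Current;
            }
            while (enumerator.MoveNext());
        }
    }
}
```

A caller who wants the strict behaviour can then append `.ThrowIfEmpty()` to the chain, while everyone else keeps the plain empty-set result.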

I think #2 is quite, well, awesome! Now the power is in the calling code, and the next reader of the code will understand exactly what it's doing, and won't have to deal with unexpected exceptions.

Please explain what you mean. How is something in my post "not behaving like a set" and why is that a bad thing?
– ErikE Dec 3 '16 at 2:34

You are basically doin the same of my answer, instead of wrapping the query you happend to query a If Condition, the result does not change u.u
– GameDeveloper Dec 3 '16 at 2:35

I can't see how my answer is "basically doin the same of" your answer. They seem completely different, to me.
– ErikE Dec 3 '16 at 2:36

Basically you are just enforcing the same condition as "my bad class". But you are not saying so ^^. Do you agree that you are enforcing the existence of 1 element in the query result?
– GameDeveloper Dec 3 '16 at 2:38

You're going to have to speak more clearly for the evidently stupid programmers who still don't get what you're talking about. What class is bad? Why is it bad? I'm not enforcing anything. I'm allowing the USER of the class to decide the final behavior instead of the CREATOR of the class. This allows one to use a consistent coding paradigm where Linq methods don't randomly throw because of a computational aspect that isn't really a violation of normal rules such as if you asked for -1 items at a time.
– ErikE Dec 3 '16 at 2:39

While I get where you're going with this, I do try to avoid methods having lots of options passed in that modify their behaviour. I'm still not 100% happy with the mode enum that's there already; I really don't like the idea of an option parameter that changes a method's exception behaviour. I feel like if a method needs to throw an exception, it needs to throw; if you want to hide that exception that's your decision as calling code, and not something you can make my code do with the right input option.
– anaximander Nov 30 '16 at 14:25

If the caller expects a set that isn't empty, then it's up to the caller to either handle an empty set or throw an exception. As far as the callee is concerned, an empty set is a perfectly fine answer. And you can have more than one caller, with different opinions about whether empty sets are fine or not.
– gnasher729 Dec 3 '16 at 20:11

Write code assuming first one option, then the other. Consider which one would work best in practice.

Add a "strict" boolean parameter to indicate whether you want the parameters to be strictly verified or not. For example, Java's SimpleDateFormat has a setLenient method to attempt parsing inputs that don't fully match the format. Of course, you'd have to decide what the default is.

Based on your own analysis, returning the empty set seems clearly right — you've even identified it as something some users may actually want and have not fallen into the trap of forbidding some usage because you can't imagine users ever wanting to use it that way.

If you really feel that some users may want to force nonempty returns, then give them a way to ask for that behavior rather than forcing it on everyone. For example, you might:

Make it a configuration option on whatever object is performing the action for the user.

It really depends on what your users expect to get. For (a somewhat unrelated) example if your code performs division, you may either throw an exception or return Inf or NaN when you divide by zero. Neither is right or wrong, however:

if you return Inf in a Python library, people will assault you for hiding errors

if you raise an error in a Matlab library, people will assault you for failing to process data with missing values

In your case, I'd pick the solution which will be least astonishing for end users. Since you're developing a library dealing with sets, an empty set seems like something your users would expect to deal with, so returning it sounds like a sensible thing to do. But I may be mistaken: you have a much better understanding of the context than anyone else here, so if you expect your users to rely on the set always being not empty, you should throw an exception right away.

Solutions which let the user choose (like adding a "strict" parameter) aren't definitive, since they replace the original question with a new equivalent one: "Which value of strict should be the default?"

It is a common understanding (in mathematics) that when you select elements from a set you may find none, and hence obtain an empty set. Of course you have to be consistent with mathematics if you go this way:

Common set rules:

Set.Foreach(predicate); // always returns true for empty sets

Set.Exists(predicate); // always returns false for empty sets
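
LINQ already follows these two rules. In the sketch below, `All` plays the role of the "for each" check and `Any` the role of "exists"; that mapping, and the class and method names, are assumptions made for the example.

```csharp
using System;
using System.Linq;

public static class EmptySetRulesSketch
{
    // "For all" over an empty sequence is vacuously true.
    public static bool ForAllOnEmpty()
    {
        return Enumerable.Empty<int>().All(x => x > 100);
    }

    // "Exists" over an empty sequence is always false.
    public static bool ExistsOnEmpty()
    {
        return Enumerable.Empty<int>().Any(x => x > 100);
    }
}
```

`ForAllOnEmpty()` returns true and `ExistsOnEmpty()` returns false, whatever predicate is used.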

Your question is very subtle:

It could be that the input of your function has to respect a contract: in that case any invalid input should raise an exception, and that's it; the function is not working under regular parameters.

It could be that input of your function has to behave exactly like a set, and therefore should be able to return an empty set.

Now if I were you I would go the "Set" way, but with a big "BUT".

Assume you have a collection that "by hypothesis" should have only female students:

Now your collection is no longer a "pure set", because you have a contract on it, and hence you should enforce your contract with an exception.

When you use your "set" functions in a pure way you should not throw exceptions in case of empty set, but if you have collections that are no more "pure sets" then you should throw exceptions where proper.

You should always do what feels more natural and consistent: to me a set should adhere to set rules, while things that are not sets should have their properly thought rules.

In your case it seems a good idea to do:

List SomeMethod(Set someInputSet)
{
    var result = someInputSet
        .CombinationsOf(4, CombinationsGenerationMode.Distinct)
        .Select(combo => /* do some operation to a combination */)
        .ToList();

    // The only information here is that the result is empty => there are no combinations.
    // BEWARE! A count of 0 may mean invalid input, but also an empty input set.
    if (result.Count == 0) // Add: "&&someInputSet.NotEmpty()"
    {
        // We go a step further: our API requires combinations, so this
        // method cannot satisfy the API request, and therefore we throw.
        throw new Exception("you requested impossible combinations");
    }

    return result;
}

But it is not really a good idea: we now have an invalid state that can occur at runtime at random moments. That is implicit in the problem, so we cannot remove it. Sure, we can move the exception inside some utility method (it is exactly the same code, moved to a different place), but that's wrong, and basically the best thing you can do is stick to regular set rules.

In fact, adding new complexity just to show you can write LINQ query methods does not seem worthwhile for your problem. I'm pretty sure that if the OP can tell us more about their domain, we could probably find the spot where the exception is really needed (if at all; it is possible the problem does not require any exception).

The problem is that you are using mathematically defined objects to solve a problem, but the problem itself may not require a mathematically defined object as its answer (in fact, here it returns a list). It all depends on the higher level of the problem, regardless of how you solve it; that's called encapsulation/information hiding.
– GameDeveloper Dec 1 '16 at 12:23

The reason your "SomeMethod" should no longer behave like a set is that you are asking for a piece of information "out of Sets"; specifically, look at the commented part "&&someInputSet.NotEmpty()"
– GameDeveloper Dec 1 '16 at 12:30

NO! You shouldn't throw an exception in your female class. You should use the Builder pattern which enforces that you can't even GET an instance of the FemaleClass unless it has only females (it's not publicly creatable, only the Builder class can hand out an instance), and possibly has at least one female in it. Then your code all over the place that takes FemaleClass as an argument doesn't have to handle exceptions because the object is in the wrong state.
– ErikE Dec 3 '16 at 1:37

You are assuming that the class is so simple that you can enforce its preconditions simply through the constructor; in reality, your argument is flawed. If you forbid having no females, then either you cannot have a method for removing students from the class, or your method has to throw an exception when there's 1 student left. You have just moved my problem elsewhere. The point is that there will always be some precondition/function state that cannot be maintained "by design" and that the user is going to break: that's why exceptions exist! This is pretty theoretical stuff, I hope the downvote is not yours.
– GameDeveloper Dec 3 '16 at 2:07

The downvote is not mine, but if it was, I'd be proud of it. I downvote carefully but with conviction. If the rule of your class is that it MUST have an item in it, and all the items must meet a specific condition, then allowing the class to be in an inconsistent state is a bad idea. Moving the problem elsewhere is exactly my intention! I want the problem to occur at building time, not at use time. The point of using a Builder class instead of a constructor throwing is that you can build up a very complex state in many pieces and parts through inconsistencies.
– ErikE Dec 3 '16 at 2:11