In ReverseString(), I would say return an empty string because the return type is string, so the caller is expecting that. Also, this way, the caller would not have to check to see if a NULL was returned.

In FindPerson(), returning NULL seems like a better fit. Whether NULL or an empty Person object (new Person()) is returned, the caller has to check for NULL or empty before doing anything with it (like calling UpdateName()). So why not just return NULL here, so that the caller only has to check for NULL?

Does anyone else struggle with this? Any help or insight is appreciated.

It will vary. You could also argue that the method should reject incorrect values up front, for instance when your stringToReverse parameter is passed in as empty or null. If you performed that check (and threw an exception), the method would always return something other than null.
– Aaron McIver, Nov 17 '11 at 18:42

@Boris I didn't see your comment. I actually just posted Special Case as an answer.
– Thomas Owens♦, Nov 17 '11 at 18:49

16 Answers

Stack Overflow has a good discussion about this exact topic in this Q&A. In the top-rated answer, kronoz notes:

Returning null is usually the best idea if you intend to indicate that
no data is available.

An empty object implies data has been returned, whereas returning null
clearly indicates that nothing has been returned.

Additionally, returning a null will result in a null exception if you
attempt to access members in the object, which can be useful for
highlighting buggy code - attempting to access a member of nothing
makes no sense. Accessing members of an empty object will not fail
meaning bugs can go undiscovered.

Personally, I like to return empty strings for functions that return strings to minimize the amount of error handling that needs to be put in place. However, you'll need to make sure that the group you're working with follows the same convention; otherwise the benefits of this decision won't be achieved.

However, as the poster in the SO answer noted, nulls should probably be returned if an object is expected so that there is no doubt about whether data is being returned.

In the end, there's no single best way of doing things. Building a team consensus will ultimately drive your team's best practices.

"Personally, I like to return empty strings for functions that return strings to minimize the amount of error handling that needs to be put in place." I never understood this argument. The null check is what, at most 10 chars? Why do we act like this is back bending labor?
– Aaron McIver, Nov 17 '11 at 18:44

Nobody said it was back bending labor. But it is labor. It adds a special case to the code, which means one more branch to test. Plus, it disrupts the flow of the code. for x in list_of_things() {...} is arguably quicker to grok than l = list_of_things(); if l != null {...}
– Bryan Oakley, Nov 17 '11 at 20:08

@Bryan: there is a big difference between an empty list of values, and an empty string. A string seldom represents a list of characters.
– kevin cline, Nov 18 '11 at 4:45

@kevin cline: of course. But in the case of a string, you may want that string to appear in a log whether it's null or not. Having to check if it's null so you can print an empty string obfuscates the real intent. Or perhaps it's something like a database field that contains user-contributed content that you want to add to a web page. It's easier to do emit(user.hometown) than if user.hometown == null { emit("") } else { emit(user.hometown) }
– Bryan Oakley, Nov 18 '11 at 12:11

@AaronMcIver: It's only 10 chars... every time you call it. Now, if you're a fan of AOP, there are some very nice ways to solve this problem.
– Steve Evers, Nov 18 '11 at 14:51

...and to be honest you can assume that every .NET programmer knows about the Try... pattern because it's used internally by the .NET framework. That means they don't have to read the documentation to understand what it does, which is more important to me than sticking to some purist's view of functions (understanding that result is an out parameter, not a ref parameter).

So I'd go with TryFindPerson because you seem to indicate it's perfectly normal to be unable to find it.

If, on the other hand, there's no logical reason that the caller would ever provide a personId that didn't exist, I would probably do this:

public Person GetPerson(int personId);

...and then I'd throw an exception if it was invalid. The Get... prefix implies that the caller knows it should succeed.
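As a rough illustration of the two naming conventions in Java: Java has no out parameters, so Optional stands in for the Try... pattern here, which is a substitution and not the .NET signature. The Person class, the PersonDirectory class and its in-memory map are hypothetical names made up for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;

// Hypothetical stand-in for the Person type discussed in the question.
final class Person {
    final int id;
    final String name;

    Person(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical in-memory directory; a real one would query a data store.
final class PersonDirectory {
    private final Map<Integer, Person> people = new HashMap<>();

    void add(Person person) {
        people.put(person.id, person);
    }

    // "Try" flavour: a missing person is a normal outcome, so the return
    // type itself tells the caller that absence has to be handled.
    Optional<Person> tryFindPerson(int personId) {
        return Optional.ofNullable(people.get(personId));
    }

    // "Get" flavour: the caller claims the id exists, so a miss is treated
    // as a bug and reported with an exception instead of a null.
    Person getPerson(int personId) {
        Person person = people.get(personId);
        if (person == null) {
            throw new NoSuchElementException("No person with id " + personId);
        }
        return person;
    }
}
```

A caller who considers a miss normal would use tryFindPerson and branch on isPresent(); a caller who already validated the id would use getPerson and let the exception surface.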

+1, I would have answered this question with a reference to Clean Code if you hadn't already done so. I like the mantra: "IF YOUR FUNCTION CAN'T DO WHAT ITS NAME SAYS, THEN THROW AN EXCEPTION." Muuuch better than checking for null every dozen lines or so....
– Graham, Nov 17 '11 at 19:45

I've given up assuming anything about the knowledge of people.
– CaffGeek, Nov 17 '11 at 22:02

"The problem with using null is that the person using the interface doesn't know if null is a possible outcome, and whether they have to check for it, because there's no not null reference type." Since reference types can be null, can't you always assume that null is a possible outcome?
– Tommy Carlier, Nov 18 '11 at 10:52

If a function returns a pointer (C / C++) or a reference to an object (Java), I always assume that it may return null.
– Giorgio, Nov 18 '11 at 12:09

"The problem with using null is that the person using the interface doesn't know if null is a possible outcome, and whether they have to check for it, [...]" If a method might return null, it should be explicitly mentioned as part of the API contract.
– Zsolt Török, Nov 18 '11 at 12:22

Nulls are awkward things in object-oriented programs because they
defeat polymorphism. Usually you can invoke foo freely on a variable
reference of a given type without worrying about whether the item is
the exact type or a sub-class. With a strongly typed language you can
even have the compiler check that the call is correct. However, since
a variable can contain null, you may run into a runtime error by
invoking a message on null, which will get you a nice, friendly stack
trace.

If it's possible for a variable to be null, you have to remember to
surround it with null test code so you'll do the right thing if a null
is present. Often the right thing is same in many contexts, so you end
up writing similar code in lots of places - committing the sin of code
duplication.

Nulls are a common example of such problems and others crop up
regularly. In number systems you have to deal with infinity, which has
special rules for things like addition that break the usual invariants
of real numbers. One of my earliest experiences in business software
was with a utility customer who wasn't fully known, referred to as
"occupant." All of these imply altering the usual behavior of the
type.

Instead of returning null, or some odd value, return a Special Case
that has the same interface as what the caller expects.
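A minimal Java sketch of what a Special Case might look like, loosely following the "occupant" example in the quote. The Customer interface, both implementations and CustomerRegistry are hypothetical names invented for illustration, not code from the quoted book.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface; all names in this sketch are illustrative only.
interface Customer {
    String name();
    boolean isKnown();
}

final class KnownCustomer implements Customer {
    private final String name;

    KnownCustomer(String name) {
        this.name = name;
    }

    @Override public String name() { return name; }
    @Override public boolean isKnown() { return true; }
}

// The Special Case: same interface as a real customer, harmless defaults.
final class UnknownCustomer implements Customer {
    @Override public String name() { return "occupant"; }
    @Override public boolean isKnown() { return false; }
}

final class CustomerRegistry {
    private static final Customer UNKNOWN = new UnknownCustomer();
    private final Map<String, Customer> byCode = new HashMap<>();

    void register(String code, Customer customer) {
        byCode.put(code, customer);
    }

    // Never returns null: a miss yields the shared Special Case instance.
    Customer find(String code) {
        return byCode.getOrDefault(code, UNKNOWN);
    }
}
```

Callers can invoke name() on whatever find returns without a null test; code that genuinely needs to tell the two apart can still ask isKnown().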

I would think ReverseString() would return the reversed string and it would throw an IllegalArgumentException if passed in a Null.
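One way that could look in Java, as a sketch only; the StringUtil class name is an assumption, and the parameter name is borrowed from the comment on the question.

```java
// Hypothetical utility class for the ReverseString example.
final class StringUtil {
    private StringUtil() {}

    // Rejects null up front, so every call that returns at all
    // returns a real string (possibly the empty string).
    static String reverseString(String stringToReverse) {
        if (stringToReverse == null) {
            throw new IllegalArgumentException("stringToReverse must not be null");
        }
        return new StringBuilder(stringToReverse).reverse().toString();
    }
}
```

With this shape, an empty input simply yields an empty output, and the caller never needs a null check on the result.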

I think FindPerson() should follow the NullObject Pattern or raise an unchecked exception about not finding something if you should always be able to find something.

Having to deal with Null is something that should be avoided. Having Null in languages has been called a Billion Dollar Mistake!

I call it my billion-dollar mistake. It was the invention of the null
reference in 1965. At that time, I was designing the first
comprehensive type system for references in an object oriented
language (ALGOL W).

My goal was to ensure that all use of references should be absolutely
safe, with checking performed automatically by the compiler. But I
couldn't resist the temptation to put in a null reference, simply
because it was so easy to implement.

This has led to innumerable errors, vulnerabilities, and system
crashes, which have probably caused a billion dollars of pain and
damage in the last forty years.

In recent years, a number of program analysers like PREfix and PREfast
in Microsoft have been used to check references, and give warnings if
there is a risk they may be non-null. More recent programming
languages like Spec# have introduced declarations for non-null
references. This is the solution, which I rejected in 1965.

I see both sides of this argument, and I realize some rather influential voices (e.g., Fowler) advocate not returning nulls in order to keep code clean, avoid extra error-handling blocks, etc.

However, I tend to side with proponents of returning null. I find there is an important distinction between a method responding with "I don't have any data" and it responding with "I have this empty String."

Since I've seen some of the discussion referencing a Person class, consider the scenario where you attempt to look up an instance of the class. If you pass in some finder attribute (e.g., an ID), a client can immediately check for null to see if no value was found. This is not necessarily exceptional (hence not needing exceptions), but it should also be documented clearly. Yes, this requires some rigor on the part of the client, and no, I don't think that's a bad thing at all.

Now consider the alternative where you return a valid Person object... that has nothing in it. Do you put nulls in all of its values (name, address, favoriteDrink), or do you now go populate those with valid but empty objects? How is your client to now determine that no actual Person was found? Do they need to check if the name is an empty String instead of null? Isn't this sort of thing actually going to lead to as much or more code clutter and conditional statements than if we'd just checked for null and moved on?

Again, there are points on either side of this argument I could agree with, but I find this makes the most sense to the most people (making the code more maintainable).

Return null if you need to know if the item exists or not. Otherwise, return the expected data type. This is especially true if you're returning a list of items. It's usually safe to assume if the caller is wanting a list, they'll want to iterate over the list. Many (most? all?) languages fail if you try to iterate over null but not when you iterate over a list that is empty.

I find it frustrating to try and use code like this, only to have it fail:

for thing in get_list_of_things() {
    do_something_clever(thing)
}

I shouldn't have to add a special case for the case when there are no things.
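In Java terms, the contract being asked for might be sketched like this. ThingSource and joinAll are hypothetical names; the point is only that the producer never hands back null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical producer of "things".
final class ThingSource {
    private final List<String> things = new ArrayList<>();

    void add(String thing) {
        things.add(thing);
    }

    // Never returns null: with nothing stored, the caller gets an empty list.
    List<String> getListOfThings() {
        return Collections.unmodifiableList(new ArrayList<>(things));
    }

    // Iterates without any null or emptiness check; for an empty list
    // the loop body simply runs zero times.
    static String joinAll(List<String> list) {
        StringBuilder joined = new StringBuilder();
        for (String thing : list) {
            joined.append(thing).append(';');
        }
        return joined.toString();
    }
}
```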

In cases where a list or other group of values is expected, I always elect to return an empty list, for exactly this reason. The default action (iterating across an empty list) is almost always right, and if it isn't then the caller can explicitly check to see if the list is empty.
– TMN, Nov 17 '11 at 19:03

I would recommend using the Null Object pattern whenever possible. It simplifies the code when calling your methods, and you don't get to read ugly code like

if (someObject != null && someObject.someMethod () != whateverValue)

For instance, if a method returns a collection of objects that fit some pattern, then returning an empty collection makes more sense than returning null, and iterating over this empty collection will have virtually no performance penalty. Another case would be a method that returns the class instance used to log data: returning a Null object rather than null is preferable in my opinion, as it does not force users to always check whether the returned reference is null.
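The logging case could be sketched roughly as follows. AppLogger, RecordingLogger, NullLogger and Loggers are all hypothetical names chosen for this example and do not refer to any real logging library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical logging interface for illustration.
interface AppLogger {
    void log(String message);
}

// A "real" logger that keeps messages in memory.
final class RecordingLogger implements AppLogger {
    final List<String> lines = new ArrayList<>();

    @Override public void log(String message) {
        lines.add(message);
    }
}

// The Null Object: accepts every call and intentionally does nothing.
final class NullLogger implements AppLogger {
    @Override public void log(String message) {
        // no-op by design
    }
}

final class Loggers {
    private static final AppLogger NULL_LOGGER = new NullLogger();
    private static AppLogger configured; // stays null until configure() is called

    private Loggers() {}

    static void configure(AppLogger logger) {
        configured = logger;
    }

    // Never returns null, so callers can log without checking first.
    static AppLogger getLogger() {
        return configured != null ? configured : NULL_LOGGER;
    }
}
```

Calling Loggers.getLogger().log("...") is then safe whether or not a logger was ever configured.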

In cases where returning null makes sense (calling the findPerson () method, for example), I'd try to at least provide a method that tells whether the object is present (personExists (int personId), for example). Another example would be the containsKey () method on a Map in Java. It makes callers' code cleaner, as you can easily see that there is a possibility that the desired object might not be available (person does not exist, key is not present in map). Constantly checking if a reference is null obfuscates the code in my opinion.
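A small sketch of such a pair of methods, backed by a java.util.Map. PersonStore and findPersonName are hypothetical names; containsKey and get are the standard Map methods.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical store keyed by person id.
final class PersonStore {
    private final Map<Integer, String> namesById = new HashMap<>();

    void add(int personId, String name) {
        namesById.put(personId, name);
    }

    // Lets callers ask the question explicitly instead of probing for null.
    boolean personExists(int personId) {
        return namesById.containsKey(personId);
    }

    // Returns null when the id is unknown, mirroring Map.get.
    String findPersonName(int personId) {
        return namesById.get(personId);
    }
}
```

A caller can then write if (store.personExists(id)) { ... }, which states the intent more directly than comparing a returned reference against null.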

In my opinion there is a difference between returning NULL, returning some empty result (e.g. the empty string or an empty list), and throwing an exception.

I normally take the following approach. I consider a function or method f(v1, ..., vn) call as the application of a function

f : S x T1 x ... x Tn -> T

where S is the "state of the world", T1, ..., Tn are the types of the input parameters, and T is the return type.

I first try to define this function. If the function is partial (i.e. there are some input values for which it is not defined) I return NULL to signal this. This is because I want the computation to terminate normally and tell me that the function I have requested is not defined on the given inputs. Using, e.g., an empty string as return value is ambiguous because it could be that the function is defined on the inputs and the empty string is the correct result.

I think the extra check for a NULL pointer in the calling code is necessary because you are applying a partial function, and it is the task of the called method to tell you if the function is not defined for the given input.

I prefer to use exceptions for errors that prevent the computation from being carried out (i.e. it was not possible to find any answer).
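To make the three outcomes concrete, here is a small sketch with a made-up partial function. FileNames.extensionOf is hypothetical, and treating a null argument as the "cannot compute" case is an assumption of this example, not something the answer prescribes.

```java
// Hypothetical helper illustrating null vs. empty result vs. exception.
final class FileNames {
    private FileNames() {}

    // Returns the text after the last '.', the empty string when the name
    // ends with '.', and null when the name contains no '.' at all
    // (the function is not defined for that input).
    static String extensionOf(String fileName) {
        if (fileName == null) {
            // The computation cannot be carried out at all: signal an error.
            throw new IllegalArgumentException("fileName must not be null");
        }
        int lastDot = fileName.lastIndexOf('.');
        if (lastDot < 0) {
            return null; // undefined on this input
        }
        return fileName.substring(lastDot + 1); // may legitimately be ""
    }
}
```

Here "report." has an extension that happens to be empty, while "README" has none; returning "" for both would erase exactly the distinction described above.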

For example, suppose I have a class Customer and I want to implement a method

Customer findCustomer(String customerCode)

to search for a customer in the application database by its code.
In this method, I would

are part of the semantics of what I am doing and I would not just "skip them" in order to make the code read better. I do not think it is a good practice to simplify the semantics of the problem at hand just to simplify the code.

Of course, since the check for null occurs very often, it is good if the language supports some special syntax for it.

I would also consider using the Null Object pattern (as suggested by Laf) as long as I can distinguish the null object of a class from all other objects.

null is the best thing to return if and only if the following conditions apply:

the null result is expected in normal operation. It could be expected that you may not be able to find a person in some reasonable circumstances, so findPerson() returning null is fine. However if it is a genuinely unexpected failure (e.g. in a function called saveMyImportantData()) then you should be throwing an Exception. There's a hint in the name - Exceptions are for exceptional circumstances!

null is used to mean "not found / no value". If you mean something else, then return something else! Good examples would be floating point operations returning Infinity or NaN - these are values with specific meanings so it would get very confusing if you returned null here.

The function is meant to return a single value, such as findPerson(). If it was designed to return a collection e.g. findAllOldPeople() then an empty collection is best. A corollary of this is that a function which returns a collection should never return null.

In addition, make sure that you document the fact that the function can return null.

If you follow these rules, nulls are mostly harmless. Note that if you forget to check for null, you will normally get a NullPointerException immediately afterwards, which is usually a pretty easy bug to fix. This fail fast approach is much better than having a fake return value (e.g. an empty string) which gets quietly propagated around your system, possibly corrupting data, without throwing an exception.
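The fail-fast difference can be seen in a small sketch. The Hometowns class and its methods are hypothetical; the only library behaviour relied on is that calling a method on a null reference throws NullPointerException.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup used to contrast null with a fake empty value.
final class Hometowns {
    private final Map<String, String> byUser = new HashMap<>();

    void put(String user, String town) {
        byUser.put(user, town);
    }

    // Returns null for an unknown user.
    String findOrNull(String user) {
        return byUser.get(user);
    }

    // Returns "" for an unknown user: a fake value that looks valid.
    String findOrEmpty(String user) {
        return byUser.getOrDefault(user, "");
    }

    static String mailingLabel(String town) {
        // Dereferencing a null town throws NullPointerException right here.
        return "Town: " + town.toUpperCase();
    }
}
```

With findOrEmpty, an unknown user silently produces the label "Town: " and the mistake travels on; with findOrNull, the same mistake stops at the first use of the value.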

Finally, if you apply these rules to the two functions listed in the question:

FindPerson - it would be appropriate for this function to return null if the person was not found

ReverseString - seems like it would never fail to find a result if passed a string (since all strings can be reversed, including the empty string). So it should never return null. If something goes wrong (out of memory?) then it should throw an exception.

1) If the semantics of a function are that it can return nothing, the caller MUST test for it; otherwise it will, e.g., take your money and give it to no one.

2) It's good to make functions that semantically always return something (or throw).

With, e.g., a logger, it makes sense to return a logger that does not log if no logger was defined. When returning a collection, it almost never makes sense (except on a "low enough level", where the collection itself is THE data, not what it contains) to return nothing, because an empty set is a set, not nothing.

With the person example, I would go the "Hibernate" way (get vs load, IIRC, but there it's complicated by proxy objects for lazy loading) of having two functions, one that returns null and a second that throws.

Methods returning collections should return an empty collection, but others could return null. With collections, it may well happen that you have the object but there are no elements in it, so the caller only has to validate the size rather than both null and size.

To me there are two cases here. If you're returning some sort of list you should always return the list no matter what, empty if you don't have anything to put in it.

The only case where I see any debate is when you're returning a single item. I find myself preferring to return null in case of failure in such a situation on the basis of making things fail fast.

If you return a null object and the caller needed a real object they might go ahead and try to use the null object producing unexpected behavior. I think it's better for the routine to go boom if they forget to deal with the situation. If something's wrong I want an exception ASAP. You're less likely to ship a bug this way.

NULL should be returned if the application expects data to be available but the data is not available. For example, a service that returns the CITY based on a zip code should return null if the city is not found. The caller can then decide to handle the null or blow up.

An empty list should be returned if there are two normal possibilities:
data available or no data available. For example, a service that
returns the CITY(s) whose population is greater than a certain number
can return an empty list if no data satisfies the given criteria.
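Both services might be sketched in Java as below. CityService, its method names and the way data is loaded are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical service holding a few cities in memory.
final class CityService {
    private final Map<String, String> cityByZip = new HashMap<>();
    private final Map<String, Integer> populationByCity = new HashMap<>();

    void addCity(String zip, String city, int population) {
        cityByZip.put(zip, city);
        populationByCity.put(city, population);
    }

    // A single value is expected: null signals that no city matches the zip.
    String cityForZip(String zip) {
        return cityByZip.get(zip);
    }

    // Zero matches is an ordinary answer, so the result is an empty list.
    List<String> citiesLargerThan(int minPopulation) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : populationByCity.entrySet()) {
            if (entry.getValue() > minPopulation) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```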