If you can write a generic class, like List, you deserve some respect. You’re delivering generic behavior to potentially any type that you, or someone else, can use it for. While it’s one of the fancier features of programming languages, it’s also a great workaround for what would otherwise be type-unsafe operations in a type-safe language.

A workaround, you say? Do I not respect the ingenuity of the concept? Yes, I do. I’ve even made generic classes myself and use them around the clock to get things done. But think about it: What are we actually using them for?

Unless you’re doing C++ (with stuff like shared_ptr), you’re probably using generics exclusively for collections. Maybe you use them for the occasional function that needs same-type parameters, like compare(T, T) or something similar, but it’s probably for collections most of the time.
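To make the compare(T, T) case concrete, here is a minimal Java sketch of a function whose two parameters must share a type. The class name GenericCompare and the method max are illustrative names chosen for this example, not part of any library.

```java
// Hypothetical example: GenericCompare and max are illustrative names.
class GenericCompare {
    // T must be comparable to itself, which forces both arguments to one type.
    static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(max(3, 7));            // Integer with Integer
        System.out.println(max("pear", "apple")); // String with String
        // max(3, "apple") is rejected by the compiler: no single T fits both.
    }
}
```

Apart from helpers like this one, most day-to-day generic code is some variation of a typed collection.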

And that is kind of weird. Why is the fact that you can have collections of objects not a first-class language construct? The whole point of making software is to process multiple things of the same type. We could say it’s the essence of making software. And even though some languages do support native arrays, most people prefer collection classes.

I know that collection classes work very well. Still, the fact that we have to rely on classes to do something that is the essential purpose of programming has some drawbacks.

Interaction with a class-based collection is limited to plain functions, whereas native collections would be able to better integrate into the language.

Collections make it difficult for the compiler to get a grip on what’s really happening—which can prevent it from making smart optimizations.

You need generics to keep the use of those classes as type-safe as the rest of the language.

In languages such as Java, generic collection classes are limited to containing references to other objects, although value-based semantics would be far more efficient for some objects (that’s why IBM experimented with so-called packed objects and packed arrays in its Java runtime).

Class-based collections are objects themselves, which makes them shareable by default. Although that can come in handy, it’s not normally what you need. That’s why languages started getting immutable arrays and various ways of making things constant: to overcome that.

This same by-reference behavior also resulted in the distinction between no array (null) and an empty array (having 0 elements), but that distinction is rarely of any use. This is why every junior programmer has to learn to always check for both situations, and why even senior programmers might hesitate to return null when the result is empty: you’re never sure whether the caller of your function will check for both null and empty.
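The null-versus-empty problem from the last point can be sketched in Java as follows. The class and method names (NullVersusEmpty, findNamesOrNull, findNames, countSafely) are made up for this illustration.

```java
import java.util.List;

// Hypothetical example: all names here are illustrative.
class NullVersusEmpty {
    // Signals "nothing found" with null.
    static List<String> findNamesOrNull(boolean found) {
        return found ? List.of("Ann") : null;
    }

    // Signals "nothing found" with an empty list.
    static List<String> findNames(boolean found) {
        return found ? List.of("Ann") : List.of();
    }

    // A defensive caller has to guard against both states.
    static int countSafely(List<String> names) {
        if (names == null || names.isEmpty()) {
            return 0;
        }
        return names.size();
    }

    public static void main(String[] args) {
        System.out.println(countSafely(findNamesOrNull(false)));
        System.out.println(countSafely(findNames(false)));
    }
}
```

Both lookups mean the same thing to the caller, yet only the second one is safe to use without the extra null check.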

I love thinking about the future of software development and, for that matter, the future of programming languages, and I think there is a case to be made for promoting object collections to first-class citizens.

True, there are languages that already support collections out of the box, but they still suffer from some of the issues above and most of them are dynamic languages. I would like to see if we can go beyond merely fixing the issues I mentioned, especially in a type-safe language.

The first thing we could do is to make it possible to define a plural name with every class. Like this:

class Person/Persons {
    String name,
    Integer age
}

This may sound like a weird idea, but it’s actually been around since Apple introduced AppleScript in 1993. It’s not just that you can write

function foo(Persons persons) { …. }

instead of

function foo(ArrayList<Person> persons) { …. }

It also means that the compiler really knows that persons is not just another object but a list of Person objects. That makes it possible to introduce some very neat operators. For example, to filter we could write

persons [ age > 18 ]

instead of

persons.filter({ Person p | p.age > 18 })

Although in recent years some programming languages have gotten closer to the first form, they typically still base that on handing over a closure to a filter function. The first form is not just the shortest form for a query like that, it is also fully type-safe and can potentially be optimized because the compiler knows what’s going on.
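For comparison, this is roughly what the closure-based form looks like in present-day Java (a sketch assuming Java 16 or later for records and Stream.toList()). The Person record and the adults method are illustrative stand-ins for the class used in this post.

```java
import java.util.List;

// Hypothetical example: FilterSketch, Person and adults are illustrative names.
class FilterSketch {
    record Person(String name, int age) {}

    // Closure-based counterpart of the proposed: persons [ age > 18 ]
    static List<Person> adults(List<Person> persons) {
        return persons.stream()
                .filter(p -> p.age() > 18) // the predicate is an opaque lambda
                .toList();
    }

    public static void main(String[] args) {
        List<Person> persons = List.of(new Person("Ann", 34), new Person("Bob", 12));
        System.out.println(adults(persons));
    }
}
```

The result is type-safe thanks to generics, but the filter itself is an ordinary library call, so the compiler sees a lambda handed to a method rather than a query over Person objects.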

We can do the same for a map-operation. For example:

persons { name() }

This could return all the names for the persons concerned. Because following paths like this is very common in object-oriented programming, we could even go further and make it possible to define plural names for properties too. You could then write queries like

persons.names()

This is not something that would normally be possible, because names() will never be a function defined in a generic List class. You can also property-chain that as in:

persons.addresses().cities()

This would return all the cities in which those people live. (Ignore for now that you would probably want the result to contain only unique cities. I will write about that another time.)
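In today's Java, the map operation and the property chain would be written along these lines (again a sketch assuming Java 16 or later). The Person and Address records and the helper names are assumptions made for this example.

```java
import java.util.List;

// Hypothetical example: PathSketch, Person, Address, names and cities are illustrative.
class PathSketch {
    record Address(String city) {}
    record Person(String name, List<Address> addresses) {}

    // Counterpart of: persons { name() }  or  persons.names()
    static List<String> names(List<Person> persons) {
        return persons.stream().map(Person::name).toList();
    }

    // Counterpart of: persons.addresses().cities()
    static List<String> cities(List<Person> persons) {
        return persons.stream()
                .flatMap(p -> p.addresses().stream()) // one list of all addresses
                .map(Address::city)
                .toList(); // duplicates are kept, as in the text above
    }

    public static void main(String[] args) {
        List<Person> persons = List.of(
                new Person("Ann", List.of(new Address("Oslo"), new Address("Rome"))),
                new Person("Bob", List.of(new Address("Oslo"))));
        System.out.println(names(persons));
        System.out.println(cities(persons));
    }
}
```

Each step in the path needs an explicit map or flatMap, which is the plumbing that plural property names would hide.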

The explicit definition of singular and plural also means that we can have set-level methods. Functions like sum() and average() could, for example, be defined as such, making it possible to write:

persons.ages().average()

and

persons.count()

And none of these expressions can fail with a NullPointerException, because lists would have value semantics and thus can never be null.
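A rough Java counterpart of these set-level methods is sketched below (assuming Java 16 or later). AggregateSketch, averageAge and count are illustrative names, and returning 0.0 for an empty list is an arbitrary choice made for the sketch, not something the proposal prescribes.

```java
import java.util.List;

// Hypothetical example: AggregateSketch, Person, averageAge and count are illustrative.
class AggregateSketch {
    record Person(String name, int age) {}

    // Counterpart of: persons.ages().average()
    static double averageAge(List<Person> persons) {
        return persons.stream()
                .mapToInt(Person::age)
                .average()    // OptionalDouble: an empty list has no average
                .orElse(0.0); // assumption: fall back to 0.0 when the list is empty
    }

    // Counterpart of: persons.count()
    static int count(List<Person> persons) {
        return persons.size();
    }

    public static void main(String[] args) {
        List<Person> persons = List.of(new Person("Ann", 30), new Person("Bob", 40));
        System.out.println(averageAge(persons));
        System.out.println(count(persons));
    }
}
```

Both methods still throw a NullPointerException when persons is null, which is exactly the failure that value semantics for lists would rule out.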

I’m deliberately ignoring the fact that we can have sets, dictionaries, and other types of collections. That’s because I’m proposing a more conceptual view of collections. In general, it makes no sense to have an object in a list twice. I might write about that later on too, but for now it means that I regard every list as a set anyway. A dictionary (hashmap) can be represented as a list with a structure in it (key, value), and I would prefer that the compiler or runtime take care of the rest.
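The idea of a dictionary as a list of (key, value) structures can be approximated in Java like this (a sketch assuming Java 16 or later). Here the index is built by hand, whereas the proposal would leave that to the compiler or runtime. The Entry record and both methods are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical example: DictionarySketch, Entry, lookup and index are illustrative.
class DictionarySketch {
    record Entry(String key, int value) {}

    // Conceptual view: a dictionary is just a list of (key, value) structures.
    static Optional<Integer> lookup(List<Entry> entries, String key) {
        return entries.stream()
                .filter(e -> e.key().equals(key))
                .map(Entry::value)
                .findFirst(); // linear scan over the list
    }

    // What a compiler or runtime might do behind the scenes: build a hash index.
    // Collectors.toMap throws IllegalStateException if a key occurs twice.
    static Map<String, Integer> index(List<Entry> entries) {
        return entries.stream().collect(Collectors.toMap(Entry::key, Entry::value));
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(new Entry("a", 1), new Entry("b", 2));
        System.out.println(lookup(entries, "b"));
        System.out.println(index(entries).get("b"));
    }
}
```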

What I mainly want to put forward is that collections are so fundamentally connected to programming and processing data that we limit ourselves in thinking that a list is “just another” object in the class hierarchy. Making collections feel more at home in a programming language may lead to a whole new thinking about object orientation. The time might be right to think about set-oriented languages.

2 thoughts on “Why Generic Programming is a Workaround”

I like the idea, but not the plural form of a class. I would not mind Person and Person[], and persons.name (with persons of type Person[], returning an array of the type of name). Although I also like the use of […] for a predicate.