Seems to be similar to the direction taken by John De Goes (using algebras to describe your problem domain, building programs from those algebras via free monads/applicatives/etc., interpreting those programs using natural transformations). I think there is definitely something to this kind of thinking.

Both authors argue that modern programming languages do not have the best ergonomics for this style of programming, although most of it is at least possible in languages such as Haskell and Scala.

It depends on whether reusing your data or reusing your code is more important; and that's application specific.

Reusing code means you need to shoehorn your data into as generic a container as possible, because types are what make one bit of code incompatible with some other piece of data. Reduce the number of types, and you naturally increase the reusability of code. But there's irreducible complexity; the structure of data doesn't go away, if it's not explicit in types, it'll be implicit in code. There's a risk of it being so implicit that it's not until runtime that a description of the data being manipulated is known - this is a logical consequence of building your program by composing functions.

It might be that you need to do lots of computation, and you need to heavily optimize your computation primitives. Reusing this code, and the right code, is very important. So the data needs to change shape. This is what you get in databases - the constraints of the relational model - and in monadic code dealing with lists, etc.

On the other hand, programs are usually easier to understand when the data structures are mapped out explicitly, and the operations programs perform are understood in terms of projections between types or deltas within data structures. Sometimes, when projecting between types, all you need to know about a function are its input and output types, and you know what it does.
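As a minimal illustration of that last point (a hypothetical example, not from the comment): with a fully polymorphic signature, the types alone can pin down the behaviour.

```haskell
-- The only total function with this type is the first projection,
-- so the input and output types alone tell you what it does.
project :: (a, b) -> a
project (x, _) = x
```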

Most line of business applications don't have much reused code, outside of libraries for UI and accessing external services. Data is more important. And that's fine.

> I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships.

I don't think Torvalds' quote applies in this case: it's not really "killing datastructures", so much as "using abstract datastructures"; i.e. it's like programming to an interface rather than a concrete instance. In fact, that's exactly what it is, since he's using typeclasses (which can be thought of as a sort of interface). We could achieve a similar thing with existential types (AKA modules).

I understood the Torvalds quote as being more like: would it be worse if some chunk of your library code disappeared, or if some chunk of your datastructures disappeared? If it's the former, then your library functions are "too smart", i.e. they're probably doing non-obvious things to their inputs. If it's the latter, then the data is more "self-describing".

Note that I use the phrase "library functions", since (first-class) functions are a perfectly good way to represent data (e.g. Church encoding and its variants).

First-class functions can perfectly represent data, but that doesn't mean they're a perfectly good way to represent data.

For example, if you use Church encoding for your numbers, how do you do static analysis of bounds checking? If you implement if-statements by passing two functions to a function which calls one or the other for truth and falsity, how do you determine dead code at compile time? You just move a bunch of problems from compile time to run time, where they're harder to reason about, in most Turing-complete languages.
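For concreteness, here is a sketch of Church-encoded numbers in Haskell (an illustrative example, assuming `RankNTypes`): the number exists only as code, so there is no data structure for a static analysis to inspect.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Church numeral is "apply f n times"; the number *is* a function.
type ChurchNat = forall a. (a -> a) -> a -> a

zero, two :: ChurchNat
zero _ x = x
two  f x = f (f x)

-- Recovering an Int means running the function: any bounds information
-- only becomes visible at run time.
toInt :: ChurchNat -> Int
toInt n = n (+ 1) 0
```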

Perhaps this is an artificial dichotomy? What we really want is verification that some properties hold. These properties should be specified in a uniform way, irrespective of the time that the check can apply.

When I specify a property, why do I also have to pick the time ('run time' or 'compile time'), and why is the language used to specify exactly the same property so different in each case? I think there is some unnecessary conflation here.

> For example, if you use Church encoding for your numbers, how do you do static analysis of bounds checking?

I dunno; I wouldn't use Church encoding in that situation. For the same reason I wouldn't use infinite-precision rationals to represent 24-bit R/G/B colours on a microcontroller, and I wouldn't use floats in a computer algebra system.

You seem to have strawmanned "functions are a perfectly good datastructure" into "always use functions for everything", which is (a) obviously not what I claimed and (b) not even the point I was trying to make; it was just a footnote pointing out that "code vs data" has nuances because sometimes our "data" is code.

> If you implement if-statements by passing two functions to a function which calls one or the other for truth and falsity, how do you determine dead code at compile time?

Supercompilation. Still, there's not much point taking something as expressive as functions, and using them to implement something impoverished and abstract like booleans. Much better to implement something actually relevant for your domain; for example, using functions to implement booleans, then using those booleans to implement permission checks (e.g. `if(isAdmin, doAction, permissionDenied)`), is harder and more complicated than just implementing the permissions checks directly using functions.
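The `if(isAdmin, doAction, permissionDenied)` idea might be sketched in Haskell like this (the names are the comment's hypothetical ones, and `RankNTypes` is assumed):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Church boolean is its own eliminator: it selects one of two branches.
type ChurchBool = forall r. r -> r -> r

isAdmin, isGuest :: ChurchBool
isAdmin t _ = t
isGuest _ f = f

-- The permission check implemented directly with the function,
-- with no intermediate Bool in sight.
checkPermission :: ChurchBool -> String -> String
checkPermission user action = user action "permission denied"
```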

Note that this is something that actually happens all the time! Any time an OO programmer has two objects with different implementations of the same method (say, an `Admin` object whose `doAction` method does the action, and an `UnprivilegedUser` object whose `doAction` method shows a "permission denied" message), they're essentially "implementing if-statements by passing two functions to a dynamic dispatcher which calls one or the other for truth and falsity". It's not a problem.

Note that Smalltalk actually does implement "if statements" using objects with different `ifTrue` methods. Yet the only reason they do so, and in fact why any language has things as inherently-meaningless as booleans (and functions, strings, arrays, etc.), is because the language designers/implementers aren't able to predict the domains that we'll be dealing with. The best they can do is provide generic building blocks which are sufficiently expressive for us to construct semantically-meaningful domains for ourselves. Hence any attempt to implement <generic building block A> using <generic building block B> is usually pointless from a practical perspective (although is useful for learning): it's basically a constructive proof that <generic building block A> is expressive enough that we don't need <generic building block B> at all, and hence is self-defeating!

> You seem to have strawmanned "functions are a perfectly good datastructure" into "always use functions for everything"

You said that using functions for everything was, and I quote, "perfectly good". That seems to imply to me that it cannot be improved, and conveyed an unwarranted absolute position.

> Note that Smalltalk actually does implement "if statements"

Smalltalk is of course what I had in mind when I wrote that. And of course Smalltalk's lasting contribution is the dynamic optimization gymnastics it inspired to compensate for its design choices, and not the design choices themselves.

> language designers/implementers aren't able to predict the domains that we'll be dealing with. The best they can do is provide generic building blocks which are sufficiently expressive for us to construct semantically-meaningful domains for ourselves

I'll deny this to my dying day. The most expressive language with the most generic primitives is not the best language, and it is far, very far from the best we language designers can do. The languages that give people the most ability to construct domains for themselves are the ones that end up with the lowest common denominators between codebases and programmers. It's the reason Lisp and Smalltalk didn't take over the world. Too much freedom is a bad thing; form is liberating, and more importantly, prosocial.

An average new graduate wouldn't be productive in any codebase in a few days. And even if they were, that's a terribly low bar to set. Software development education sucks and most people have to learn on the job, so why not teach them ways to write more reliable and maintainable code?

That is probably good advice, but this style of coding (which comes down to lawless type classes without inherent utility) does make it much more difficult to read unfamiliar code.

Usually documentation isn't fleshed out in internal modules, so I end up flipping between various instances and call sites just to figure out what the type class is supposed to do. This doesn't mean that there aren't valid use cases for this, but please don't use it as the default.

There is an easy design choice - which Haskell didn't take - which would allow us to have transparent Identity functors, associative Compose and many others: Add a conversion rule to your language and don't eagerly expand definitions at the type level.
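A sketch of what such conversion rules might look like (purely illustrative; the comment doesn't spell them out, and this is not valid Haskell):

```
-- Treat these definitions as transparent during type checking,
-- converting only when needed rather than eagerly expanding:
Identity a               ~>  a
Compose Identity f a     ~>  f a
Compose (Compose f g) h  ~>  Compose f (Compose g h)
```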

and so on. This works, since explicit conversion during type checking allows you to associate type classes with otherwise transparent definitions.

Of course, this is no silver bullet and complicates other things. Everything works out beautifully though, if you combine conversion with bidirectional type checking, at the expense of less powerful type inference.

Pardon my ignorance, but can't this be more easily translated into terms of (modern) object oriented programming, where you avoid mutating state?

Here, the "natural transformation" is a function that takes an input object which adheres to a specific interface (i.e. provides a set of specific methods), operates only on that input interface, and returns another object that conforms to the expected output interface.

Avoiding mutating state means that either those objects have no attributes, or that the state can only be set on construction. (In FP speak, this corresponds to currying of common parameters. Or to an internal abstract type, if the constructor is private and you can only create it via a specific set of static methods.)

Is this modern style of OOP equivalent to "Functor-Oriented Programming", or am I missing something?

> Avoiding mutating state means that either those objects have no attributes, or that the state can only be set on construction. (In FP speak, this corresponds to currying of common parameters. Or to an internal abstract type, if the constructor is private and you can only create it via a specific set of static methods.)

I don't think the style you're talking about is something that most people would call "object oriented programming" ("modern" or otherwise). Indeed the usual name for the programming style in which you avoid mutating state is "functional programming".

Functors are functors, natural transformations are natural transformations; neither has much in common with classes, methods or interfaces. Functors are being used to achieve composition here, but the use of functors is an essential part of the pattern; talking about "composition-oriented programming" would be vague to the point of meaninglessness (every programmer believes they're being compositional), and there simply isn't an OO equivalent of open recursive types as far as I know.

The unification-fd library referenced in the article is probably a pretty good example. I have not looked at it, though, and it would be nice to have explicit bits of code pointed out in an explanation.

> With functor oriented programming, polymorphism is not necessarily about using functions polymorphically. Instead, polymorphism provides correctness guarantees by ensuring that a particular function can only touch the specific layers of functors in its type signature and is independent of the rest of the data structure. One benefits from polymorphism even when a function is only ever invoked at a single type.

Even though you don't call `summarize1` polymorphically, the signature tells you that it's only looking at the structure of the list rather than what's inside it - a reader can immediately see that it's going to do something like return the length of the list. Whereas `summarize2` could do other things, e.g. sum the elements of the list, add and subtract alternating elements...
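The `summarize` functions being referred to aren't shown here; presumably they look something like this sketch:

```haskell
-- Polymorphic: can only inspect the list's structure, not its elements.
summarize1 :: [a] -> Int
summarize1 = length

-- Monomorphic: free to inspect and combine the elements themselves.
summarize2 :: [Int] -> Int
summarize2 = sum
```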

A list is a pretty simple structure, these kinds of guarantees become more useful when the structure itself is complicated. E.g. in the project I'm currently working on I have a tree-like datatype that only ever contains Strings, but I've written it generically so I can be confident that my structural functions only work on the structure, and I have a clear separation between functions that operate on the structure of the tree and functions that operate on the content of the tree.
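A hypothetical sketch of that separation (the actual project's types aren't shown in the comment):

```haskell
-- Written generically even though, in practice, it only ever holds Strings.
data Tree a = Leaf a | Node [Tree a]
  deriving (Eq, Show)

-- Structural: the polymorphic type guarantees this can only rearrange
-- the tree, never inspect or fabricate its contents.
mirror :: Tree a -> Tree a
mirror (Leaf x)  = Leaf x
mirror (Node ts) = Node (reverse (map mirror ts))

-- Content-level operations go through Functor and touch only the payload.
instance Functor Tree where
  fmap f (Leaf x)  = Leaf (f x)
  fmap f (Node ts) = Node (map (fmap f) ts)
```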

It would be nice to know what kind of code OP writes, i.e. are we talking application code or library code, and if it's libraries, are we talking networking libraries, graphics libraries, or math libraries?

Usually I find that Haskellers write very "low-level" code, i.e. implement math abstractions or the like, while I prefer to write applications (in Haskell), where this abstraction technique may not be suitable.

With application code you’re interfacing with some real-life system, and need to adapt to all the corner cases of this, whereas with library code you might be implementing something entirely theoretical, where everything fits together perfectly in some abstraction because there are no corner cases to consider (it’s just a clean math concept).

With a math concept, the elegant concept already exists beforehand, and you just implement it. With application code, there might be 27 different but-ifs, and-also-remembers that you need to consider, where developing an elegant, coherent concept of the entire app is very challenging, or even impossible.

With some application code, I find I’m often the first one to make sense of it all, taking all the corner cases and creating a coherent model out of it, i.e. making it all fit into some well-defined abstraction. And then a new corner case appears and my model falls on the floor, and needs to be redesigned.

The definition of exact real arithmetic doesn’t change over time. There’s no customer who calls you up and asks you to add a feature that doesn’t fit into your existing model/abstraction.

It's basically a functional programming language where functions can be called on a value with method syntax; the receiver is then passed as the first parameter. So instead of `afunction(param1, param2)` you would write `param1.afunction(param2)`.

Are you suggesting that the difference between a.foo(b) and foo(a, b) is analogous to the difference between goto and functions? Functions give structure and restrictions to goto. Method syntax is just a different way to write the exact same thing.

> I'm suggesting all of programming is syntactic sugar.

I wasn't critiquing anything, just suggesting that moving the first argument of a function call to come before a `.` isn't enough to consider something OO.

There's typically more to OO than just method syntax. Dynamic dispatch, subtyping, etc. And saying something is OO without encapsulation isn't really saying anything, given that encapsulation is arguably the most important part of OO.

I'd argue that it's not remotely the most important part, nor is subtyping. Encapsulation encourages better OO, but you can take a well-encapsulated OO program, remove all encapsulation by making all the data public, and that program is still a well-structured OO program. OO is about the way objects talk to each other to accomplish work: the messages, the API between objects. It's not about the many little things, like encapsulation, that encourage better messaging. There are dozens of practices that encourage better OO, but you can write good OO without them if you know what good OO is and don't need those training wheels.

Well, now we have to get semantic; if by encapsulation you mean keeping your data private and only exposing behavior publicly, then you're wrong. If by encapsulation you mean all behavior that uses an object's data is in the object, and no outside code relies on that object's data even if it is public, then you're agreeing with me.

You clearly didn't play out my thought experiment above or you wouldn't be making the statement you just made.

I agree that in your thought experiment, the program remains OO, but the language is not. You can write OO programs in many languages, but I would argue that you don't have an OO language unless it has special support for OO programming.

I wasn't arguing it was an OO language, merely that it was OO code. OO used to be done in C in exactly the same basic way, passing the struct that is the object as the first arg to all the member functions; Python does this explicitly as well.

A strict type system makes sure your program is statically correct, and catches many bugs caused by the programmer. The real world, though, is dynamic, and I prefer to check all input data at runtime, e.g. having each function check that what's passed into it is correct. It not only catches a lot of bugs, both those caused by the programmer and those caused by interacting with the outside world, it also makes the code easier to test. This probably sounds like swearing in church to a Haskell programmer, though.

I'm not sure what you mean, or perhaps you have an incorrect understanding of how Haskell works. Of course data is still checked at runtime; for example a parser will check that a string is formatted in a valid way, or a lookup in a hash table will check that a key is valid. This is done at runtime as it would be in virtually any language.

However there's no way in Haskell (short of low-level type system tricks) to check that a function with an argument of type `Int` actually receives an `Int` when it is called. There's no need to, because it is guaranteed by the type system. The strength of these guarantees, coupled with the ability to define rich, expressive types, is what makes Haskell programs very robust. For example you can represent all possible results of a valid JSON parse:
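The type in question was presumably something like this simplified sketch of a JSON value (the real aeson library's `Value` uses `Text` and a hash map for objects, but the shape is the same):

```haskell
-- One constructor per kind of JSON value a parse can produce.
data Value
  = Object [(String, Value)]
  | Array  [Value]
  | String String
  | Number Double
  | Bool   Bool
  | Null
  deriving (Eq, Show)
```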

Which variant of this type is actually encountered will not be determined until runtime, so the program is required to respond to the different cases; however a function which takes type `Value` will never need to worry that it doesn't actually receive one of the above variants at runtime.

> Instead of writing transformations between data structures, one writes natural transformations between functors, where a natural transformation between functors F and G is a polymorphic function of type forall a. F a -> G a.
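That type can be written down directly; for example (a standard illustration, not taken from the article), `safeHead` is a natural transformation from the list functor to `Maybe`:

```haskell
{-# LANGUAGE RankNTypes #-}

-- A natural transformation: a single polymorphic function between functors.
type Nat f g = forall a. f a -> g a

safeHead :: Nat [] Maybe    -- i.e. forall a. [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x
```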