First, let me say I love Julia! It is a terrific new tool in my box for machine learning.

I am an old-time programmer who is used to type safety and static type checking, and with Julia I am experiencing many of the same frustrations I have with Python: when I try to use someone else’s code, it can be a bear.

For example, I am currently using Flux to do some NLP. I want to make a text classifier using a Conv layer. I feed it my word embeddings and get a mystery crash deep inside NNlib. Obviously I am not passing it the kind of input it expects. The Flux code has no argument types, and the comments do not mention types. I waste huge amounts of time trying to reverse engineer the stack trace and guess what I did wrong.

This, to a certain extent, is a side effect of Julia being so generic. With multiple dispatch, you usually don’t restrict a function to a specific set of types, as anything that respects the interface (more on this below) will work. For example, this lets functions written in pure Julia be extended with AD or interval arithmetic, even when those functions were not written with this in mind.
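To make that concrete, here is a minimal sketch in plain Julia. `Dual` and `poly` are hypothetical names made up for illustration, not part of any AD package, and the type defines only the handful of operations this one function needs.

```julia
# A tiny forward-mode dual number (hypothetical, for illustration only).
struct Dual <: Number
    val::Float64   # function value
    der::Float64   # derivative carried alongside the value
end

# Only the arithmetic that `poly` below actually uses.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:+(a::Dual, b::Real) = Dual(a.val + b, a.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.val * b.der + a.der * b.val)
Base.:*(a::Real, b::Dual) = Dual(a * b.val, a * b.der)

# Generic code: no type annotations, no knowledge of Dual.
poly(x) = 3 * x * x + 2 * x + 1

y = poly(2.0)             # ordinary Float64 evaluation
d = poly(Dual(2.0, 1.0))  # same code; d.der holds the derivative at x = 2
```

Had `poly` been declared as `poly(x::Float64)`, the second call would fail with a `MethodError`, which is why library authors often leave arguments unannotated.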

The downside is that interfaces are not formally defined, and even where recognized conventions exist, they are not enforced by the language. A non-working program could then be a mistake on the programmer’s part, a bug in, e.g., Flux, or some corner case of the interface that no one has explored so far.

My usual strategy is to make a self-contained minimal example of the bug (which you should do), and ask here, then open an issue.

I’ve been playing with Flux recently and it’s been a lot of fun. But also slightly frustrating because being a rather experimental package it’s in a state of flux itself, and the internals are only very lightly documented at the moment. (Not to complain; my own packages suffer from many documentation problems of their own.)

For Flux specifically, I found it very helpful to build up my Flux models interactively in the REPL with a “toy sized” batch of data. This can be done incrementally, adding one layer at a time until the shape of the data matches correctly between layers. The loss function and a single step of an optimizer can be tested in a similar way. Once all this is working I would try a training run on some medium sized example data.
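A rough sketch of that workflow, using only Base Julia: the "layers" here (`embed`, `flatten`, `project`) are hypothetical stand-ins written as plain functions, not Flux layers, and `trace_shapes` is a made-up helper. The point is only the pattern of pushing a toy batch through one layer at a time and looking at the shapes.

```julia
# Hypothetical stand-in layers: plain functions on arrays.
embed(x)   = reshape(x, size(x, 1), size(x, 2), 1, size(x, 3))  # (len, dim, batch) -> (len, dim, 1, batch)
flatten(x) = reshape(x, :, size(x, ndims(x)))                    # collapse everything except the batch dimension
project(W) = x -> W * x                                          # dense projection with weight matrix W

# Apply the layers one by one, printing the shape after each step.
function trace_shapes(layers, x)
    println("input   => ", size(x))
    for (i, layer) in enumerate(layers)
        x = layer(x)
        println("layer ", i, " => ", size(x))
    end
    return x
end

toy = rand(Float32, 10, 5, 2)   # 10 tokens, 5-dimensional embeddings, batch of 2
W   = rand(Float32, 3, 50)      # maps the 50 flattened features to 3 outputs
out = trace_shapes([embed, flatten, project(W)], toy)
```

If a layer is given the wrong shape, the error surfaces right after the last printed line, which narrows down where to look.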

On a meta level, I think it’s useful to pursue the tightest feedback loop possible for experimentation and discovery, and this is where the interactive environments that can be created using Python and Julia really shine. It’s that qualitative difference in creativity which comes with having the result right now, rather than in five minutes after recompiling and reloading the data. But there’s a cost for production code bases compared to static languages, which seems to come in the form of more tests. Things the compiler doesn’t check for you inevitably need to be tested manually.

Coming back to your original problem: yes, we don’t have interface definitions in the language, and this leads to a large amount of duck typing. Right now, I don’t think there’s a better solution than clear API documentation. Type constraints on functions are a non-solution in general because the type tree (being a tree) often can’t capture the full set of types a function will work on (*). Amusingly, this is exactly the same problem faced by C++ template libraries which use trait-based dispatch rather than normal class inheritance. These C++ libraries fail in exactly the same way you’re experiencing: deep within the implementation, in a place the user should never see (witness, e.g., something like boost::iostreams)! The problem isn’t solved there either, as C++ concepts have been rejected from the last several standards.

(*) Yes, Unions can help out here, but are inherently non-extensible. Manually defined traits help and are extensible, but are often clumsy to use.
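For readers who haven’t seen manually defined traits, here is a minimal sketch of the pattern. All names (`Convolvable`, `convolvable`, `checked_size`) are hypothetical and not taken from Flux or any other package.

```julia
# Trait types (hypothetical names).
abstract type Convolvable end
struct IsConvolvable  <: Convolvable end
struct NotConvolvable <: Convolvable end

# Default: a type does not have the trait unless it opts in.
convolvable(::Type) = NotConvolvable()
# Opt in every 4-dimensional array, wherever it sits in the type tree.
convolvable(::Type{<:AbstractArray{<:Any,4}}) = IsConvolvable()

# The public entry point dispatches on the trait, so misuse fails here
# with a readable message instead of deep inside the implementation.
checked_size(x) = checked_size(convolvable(typeof(x)), x)
checked_size(::IsConvolvable, x) = size(x)
checked_size(::NotConvolvable, x) =
    throw(ArgumentError("expected a 4-dimensional array, got $(typeof(x))"))

ok = checked_size(rand(2, 2, 1, 1))
```

Unlike a `Union`, this stays open: another package can add its own `convolvable(::Type{TheirType}) = IsConvolvable()` method later. The clumsiness is the extra layer of boilerplate every trait-aware function needs.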

Thank you Chris, Tamas, Jonathan and Petr. That is exactly the kind of advice I was looking for. I will concentrate on building up small things in the REPL.

I like the flexibility of Julia, and use it in my own code. But it sure would be nice if there were some optional way to declare ‘contracts’.

For example, in the case of Flux.Conv it seems to only work for 2D convolutions and AbstractArray data of dimensions [w,h,d,n], so if you have a single grey scale image, you have to reshape it as [w,h,1,1]. Furthermore, while Conv works on TrackedArrays, operations like transpose on a TrackedArray cause a stack dump. I guess the return of transpose(TrackedArray) is no longer an AbstractArray. It sure would be nice if the compiler could optionally catch that sort of thing.
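Short of compiler support, the closest thing I can sketch is a small wrapper whose method signatures act as the contract. `conv_input` is a hypothetical helper of my own, not a Flux function, and it checks at run time rather than compile time.

```julia
img   = rand(Float32, 28, 28)               # one grey scale image, w x h
batch = reshape(img, size(img)..., 1, 1)    # w x h x 1 channel x 1 image

# Hypothetical helper: the signatures spell out which shapes are accepted.
conv_input(x::AbstractArray{<:Real,4}) = x                             # already [w,h,d,n]
conv_input(x::AbstractArray{<:Real,2}) = reshape(x, size(x)..., 1, 1)  # single image
conv_input(x::AbstractArray) =
    throw(ArgumentError("need a 2-d image or a 4-d batch, got $(ndims(x)) dimensions"))
```

Calling `conv_input` before handing data to the layer turns a crash deep in the stack into an error at the call site that names the problem.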