Prior to the UK referendum vote to leave the EU on June 23rd, my espoused view on the EU was rather academic. My internal argument was that the EU had its problems but was, on the whole, a positive force. Free trade is good, and for that we need free movement of people and capital; weighing up all the positives and negatives, I judged the EU a net positive. That said, I could understand that others would take a different view – perhaps a more global outlook would be positive, alongside access to the European free market. Ultimately though, I thought the stability of remaining in trumped just about everything else.

Waking up to the results on June 24th, I was deeply shocked, saddened and upset. This is not a normal response to a purely academic outcome – I was clearly much more emotionally invested in the concept of the EU than I had realised.

To me, I discovered, the EU is about so much more than its trade agreements and open borders. It is about a shared political identity. The vision of a Europe free from conflict, in which every country works towards goals that matter for the next century and beyond. A vision in which petty tribalism is put aside in favour of our shared humanity. A vision in which divisions across continents are as relevant to the political discussion as divisions across countries.

It was put to me yesterday that as a northerner, I should be more concerned with the North/South divide in the UK than EU membership. Putting aside the obvious irrelevance of the North/South UK divide to the EU issue (though in no way diminishing the importance of the problem), it is precisely this thinking that I detest so much. The problems of inequality and disharmony are magnified enormously on the global scale compared to the UK. It is on the global scale we should seek to address them.

The EU is an imperfect entity (though contrary to popular opinion it has changed over time, largely for the better) that represents and embodies this aspiration. To a large extent, we now have a continent of equal opportunity. For the half a billion EU citizens, there is nothing to stop them seeking the best opportunities across a continent, and they can be confident that fundamental principles of political stability, anti-corruption and environmental standards will be fit for purpose everywhere (and yes, I’m well aware in places this aspiration falls a little short, but it’s only a matter of time before the new member states reach parity).

This makes us all richer. Not just richer in the financial sense, but richer culturally, and richer in our shared humanity.

I look forward to the day that Turkey (and Russia and …) will join the EU, because then we will have another state able to claim to share the ideals of the EU – peace, the rule of law and outward-looking positivity – and another country that can join the level playing field. Moreover, I look forward to a world in which the state can be considered largely an administrative region and every person enjoys the same rights, freedoms and opportunities as I do in Western Europe. Nationalistic tribalism needs to be a thing of the past. Sure, we can celebrate our cultural distinctions, but they should never be used to divide.

I do not identify as a European because I was born in Europe. I identify as a European because I share a common cause with hundreds of millions of fellow Europeans across a continent in a way that has made great things possible.

It seems that my fresh shiny installation of Ubuntu Wily isn’t liked much by the recent versions of Vivado. After a little too long, I managed to get Vivado working just fine, though it wasn’t trivial.

Basically the problem comes down to the fact that Ubuntu 14.04 is the most recent release supported by Vivado. The solution, then, is to run Vivado in a 14.04 chroot which, with schroot, can be made pretty seamless.

The following shows how I got it working, as much for my reference as anything else. See man pages, as well as the Ubuntu and Debian docs on chroot and schroot for more info on the various aspects of the problem. You’ll obviously need schroot installed as a prerequisite.

We use Vivado 2015.3 here, but other versions should be much the same with suitable tweaks to paths and so on.

Essentially, install a new root file system that can be accessed through a chroot with something like:
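# the chroot location is just an example – put it wherever suits
# (you'll need the debootstrap package installed)
sudo mkdir -p /srv/chroot/vivado-trusty
sudo debootstrap trusty /srv/chroot/vivado-trusty http://archive.ubuntu.com/ubuntu/

and then tell schroot about it with an entry along these lines in /etc/schroot/chroot.d/vivado-trusty.conf:

[vivado-trusty]
description=Ubuntu 14.04 (trusty) for Vivado
type=directory
directory=/srv/chroot/vivado-trusty
users=youruser
root-users=youruser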

Now append the following to your schroot fstab (typically /etc/schroot/default/fstab) so all the relevant system directories are visible (since the purpose is not specifically to sandbox Vivado, there is no problem with exposing everything through the chroot):
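# bind-mount the host system into the chroot; some of these entries may
# already be present in the default schroot profile
/proc           /proc           none    rw,bind         0       0
/sys            /sys            none    rw,bind         0       0
/dev            /dev            none    rw,bind         0       0
/dev/pts        /dev/pts        none    rw,bind         0       0
/home           /home           none    rw,bind         0       0
/tmp            /tmp            none    rw,bind         0       0
/opt            /opt            none    rw,bind         0       0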

At this point I installed lsb. I’m not sure this is necessary for running Vivado itself, but some of the support tools may fail quietly if it’s not present (specifically, this may be a problem for the license acquisition step). This can be done from the chroot, accessing it as root:

sudo schroot -c vivado-trusty
apt-get install lsb

Now you should have the chroot set up properly and can enter it with schroot -c vivado-trusty.

At this point, I simply used a previously installed version of Vivado, though it would also be a good time to install Vivado from scratch if you haven’t already.

If you use a previously installed version, it needs to be visible inside the chroot. In my case it was installed under my home directory (root being small by comparison, due to the way I’d set up my partitions), so it was already visible inside the chroot through the mounts set up above.

It was necessary for me to run install_fnp.sh (inside Xilinx/Vivado/2015.3/bin/unwrapped/lnx64.o) which sets up the “trusted” server (or whatever it’s called).

One important thing that was problematic for generating the license was that the ethernet device has a different name in 15.10 (due to the move to persistent/predictable interface naming). The Vivado license manager expects the older style naming of “eth0”, “eth1” etc. to look up the MAC address. Without changing the ethernet device name, every attempt to get a new license presented me with a greyed-out box.

I reverted this by adding net.ifnames=0 biosdevname=0 to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub, so the line now looks something like:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash net.ifnames=0 biosdevname=0"

and running sudo update-grub. At this point, the license manager was able to generate an html page that allowed me to generate a new license.

This all got Vivado up and running. For a bit of polish I created two additional files to make running Vivado simpler. In ~/bin I created
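two small scripts along these lines (a sketch – adjust the paths and chroot name to suit your setup).

~/bin/vivado:

#!/bin/bash
# Enter the trusty chroot and run the helper, passing any arguments through.
exec schroot -c vivado-trusty -- ~/bin/vivado_in_env "$@"

~/bin/vivado_in_env:

#!/bin/bash
# The parentheses create a sub-shell, so sourcing settings64.sh doesn't leak
# environment changes into anything else. After sourcing, the real vivado
# should be first on the PATH.
(
    source /opt/Xilinx/Vivado/2015.3/settings64.sh
    vivado "$@"
)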

chmod both of those to be +x and you can now run vivado from outside the chroot and it will all automagically work. You shouldn’t even notice it’s running in a chroot.

(If you’re interested, the parentheses in vivado_in_env create a sub-shell that prevents the annoying environment leakage that breaks other commands when source /opt/Xilinx/Vivado/2015.3/settings64.sh is run.)

I came to a really neat realisation a while ago: when using test-driven development (TDD), the testing of your software defines the specification of your software. This may be obvious, but it really highlights the fundamental difference in approach between agile software development and more specification-first approaches (like the waterfall model) – with agile methodologies, the spec of the written software is defined only by the tests. Done properly, there is no disconnect between the specification and the implementation, only between the backlog and the specification.

I’ve recently been looking at behaviour driven development (BDD) and how it fits with TDD and general good practice in development. There’s loads of bluster around BDD and no end of frameworks and new languages that define special new ways of generating your integration tests such that normal people can read the tests and yet they still, magically, actually perform the test. I was always uncomfortable, though, with the notion that BDD and TDD were actually all that different when it comes down to it; I couldn’t see what was offered by the various tools.

I came across this lovely blog post, which very much clarified my own thoughts (and confirmed a few too). The central point of the post, which I’ll restate here, is BDD is all about communication. Given that, BDD is just TDD with proper communication.

There are TDD purists who will argue that if you’re doing more than unit testing (e.g. you’re doing integration testing too), you’re not doing TDD. I would argue that’s tosh: the whole of your code should benefit from a genuine test-driven approach. At the highest level, the tests should be derived from the user stories, and as subsequent requirements are introduced, they are manifested as tests that should be satisfied, which may well in turn depend on further capabilities that are described by more tests. At some point (depending on the size of the project), you reach the unit-test stage. As already described, this hierarchy is then your specification.

So how does one fit BDD into all this? Well, it simply becomes about the process by which tests are defined. A test is always the satisfying of a need, from the highest level, which the end user probably cares about (and which maps to the user stories), down to the lowest level, which the developer cares about (e.g. there should be a frobinator method on this class which frobinates the two inputs). But here’s the point: all the language used to describe those tests is in terms of what behaviour is required, which is exactly what BDD is about!

In Python, my implementation strategy for this is simply to use unittest and then describe every test set and individual test in terms of what the software should do. So something like “There should be a widget which calculates how many tins of custard I need to buy” or “This class should have a dot product method that returns the dot product of itself with an argument.” This makes it crystal clear what the intended behaviour is, and just thinking along these documentation lines clarifies the purpose of the test in a way I never achieved by simply writing tests to satisfy some hypothetical functionality.
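As a minimal sketch of what that looks like in practice (the geometry module and Vector class here are purely illustrative):

import unittest

from geometry import Vector  # hypothetical module under test


class TestVector(unittest.TestCase):
    """The Vector class should provide basic linear algebra operations."""

    def test_dot_product(self):
        """This class should have a dot product method that returns the
        dot product of itself with an argument."""
        a = Vector([1, 2, 3])
        b = Vector([4, 5, 6])
        self.assertEqual(a.dot(b), 32)


if __name__ == '__main__':
    unittest.main()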

In effect, the docstring becomes a first-class component of the tests. Without the descriptive, behaviour-driven docstring, the test is invalid.

Now, I think this is great for various reasons, but two are worth pointing out: firstly, it totally removes the disconnect between BDD and TDD, and secondly, it acts as a really neat guide for architecting the software. I found myself neatly guided through the design process by describing my software using tests. I think it’s because the process of defining tests forces one to break the expected functionality into chunks. If you find later that your architecture can be improved, then great – isn’t refactoring (which neatly maps to rearchitecting in the general sense) one of the central tenets of TDD?

This all leads to an interesting result, which is particularly relevant in the context of medical software, living as it does in the world of IEC 62304 – a standard that fits neatly with the waterfall model, but not necessarily very well with agile techniques. When one has the testing system described above, one already has a plain-English representation of the specification – and better than that, all the tests that define it, and by extension an implementation of the software that satisfies the spec perfectly! It’s a small step to write a sphinx plugin that auto-generates the spec on demand.

There’s an interesting series of blog posts (beginning here) I was reading in which agile methodologies for medical software are discussed. One of the solutions (described as “Agile methods, solution 2”) is supported very nicely by what I describe above.

And so it has come to pass that Gnucash no longer suffices as my accounting software. It saddens me as I think the project is great and for many applications it would still be a great solution. There is not one overwhelming flaw with it, but a series of small deficiencies that occur simply because it’s a project run by volunteers with limited time.

The experience of using Gnucash has been great – I know far more about bookkeeping as a consequence of having to think hard about how to do it properly in Gnucash than I expect I would have if I’d jumped straight to a more modern solution. I’ve had a few problems that sunk time, including the handling of VAT just being more effort than it should be, but the final problem that brought me to a decision was the difficulty in exporting a suitable set of reports for the accountant at the end of the year. It eventually worked, but not without much effort and various frustrating bugs in layout. I dare say I could spend a while learning Scheme and writing my own reports, but the time has come to throw money at the problem.

For me, a cloud solution just seems to be the right way to do this. Web based solutions generally provide all the benefits of low up-front cost, device independent access and continuous upgrades along with perfect cross-platform support, which as a Linux user I truly value.

A cursory investigation flagged up three possible options for me as a fledgling micro business (albeit with a few complicated transactions to handle): Freeagent, Xero and Kashflow. Freeagent was the first I looked into – my (naive as it turned out) assumption was that being UK based, Freeagent would better handle the UK tax situation, though given the apparent pervasiveness of Xero, that was the one I tried first.

It was a little frustrating getting Xero set up – primarily because I had to fully grasp the notion of all the start-up balances (still learning!) – but I got there in a couple of hours. The online help was really excellent here. Another hour later and I had entered the historical data since the last year end (actually slightly earlier, to begin on a new VAT quarter). What really impressed me was the single click to get the VAT return that I’d spent 3 hours fiddling with in Gnucash earlier (and, pleasingly, it was exactly what I’d written in my return to HMRC!). So much for a non-UK company having difficulty with UK tax.

I’m very impressed with Xero. I initially felt pretty constrained by all the transactions happening through actions (sales, purchases, etc), which was at odds with my being used to the general ledger being the top-level concept in Gnucash, which invoices and bills influence. I had to think for a few minutes about how I wanted to do things in this new way, but my conclusion is that Xero actually presents it better. The only accounts that one can easily modify are the bank accounts (though of course, that carries an implicit change in some other account because it is a true double entry system behind the scenes), and I wasn’t convinced at first that this was flexible enough. I now think it probably is for almost all the situations I’m ever going to need.

The invoices in Xero are (almost) delightful. Once the data is entered, a beautiful invoice is rendered (with almost no work on my part beyond adding a logo and payment details). One click sends it off to the intended recipient (as well as to me if I so desire). I played around a bit with modifying the contacts (the general term for people you might interact with, i.e. an invoice recipient) and the invoice settings, and they seem really quite flexible – multiple recipients, alternative email content etc. The one issue I had, which may well be down to me not knowing how to do it, was having to enter every line on the invoice afresh – there was no attempt to autofill based on what I’d written on previous invoices or the line above, nor the ability to copy a line. This is something Gnucash does better :).

Overall, Xero will fit well with my needs, and the interface is simple, clean and intuitive (to me, with a reasonable understanding of double-entry bookkeeping).

I thought I ought to compare with an alternative, so I signed up to the free trial of Freeagent. Kashflow I’m sure is great – I was put off by something, though I’ve forgotten what. Kashflow also seems not to do much in the way of bank feeds. (Yodlee does sound like a dodgy solution, and that seriously puts me off Freeagent; Xero has a partner agreement with HSBC, so they seem to be able to do bank feeds “properly”, though Yodlee is an option.) If I get a chance to play with Kashflow I’ll report back.

The Freeagent sign-up process was very quick and simple, but left me wondering how to insert the opening balances or change the chart of accounts. After a bit of searching I found the relevant help page, but I do think that should have been a somewhat guided process at sign-up – the system wasn’t useful to me until that essential information had been entered. It is also possible to change at least some of the accounts – like income and expenditure accounts. I could not find out how to change the asset accounts though. This is substantially less flexible than Xero and seems oddly restrictive.

On the whole, Freeagent seems slightly less slick than Xero. For example, it’s very prescriptive about how I need to enter a date. In Xero, I just type 3/4 and it gets interpreted on the fly as 3rd April 2014 (so I can edit it if necessary). In Freeagent, I get a complaint that the date was mis-entered. This is superficially a minor issue, but it’s just unnecessary and IMO shows a lack of attention to detail. It was a more general perception as well – the interface in Xero is a lovely balance of presenting and inferring the right information at the right time; once I’d worked out the general layout, everything seemed to be where I expected it to be. In Freeagent, things just don’t seem as intuitive. That’s probably not a wholly fair assessment, but it was certainly a feeling I had. Despite the two being very similar in the way they operate, I still feel pretty lost in Freeagent. I’ve no doubt that will fade with time, but the feeling went very quickly with Xero.

On the whole, I actually think both Xero and Freeagent would be great for my purposes. The extra flexibility of Xero might help in the long run, but many people wouldn’t need that. Despite that, my opinion so far is I’ll be going with Xero. That’s not to say that Freeagent isn’t very good, I just think Xero is fantastic.

It has been some time since I posted about pyFFTW, the last post being well over a year ago when I introduced the wisdom functionality. Despite that, development has been plodding along steadily (and quiet releases made!) and with the 0.9.1 release it has now reached the stage where it satisfies much of my wishlist.

Please go and check out the latest docs for all the features (the tutorial gives a pretty good overview). The main big improvement since the last post is the pretty neat interfaces module, which offers a drop-in replacement for both numpy.fft and scipy.fftpack. Though not all of the functionality of those modules is provided by code within pyFFTW, the namespace is completed by importing parts of numpy.fft or scipy.fftpack as appropriate. There is some slowdown in using pyfftw.interfaces, but this can be largely alleviated with the cache functionality.
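As a rough sketch of the sort of thing this enables (see the tutorial for the full details):

import numpy
import pyfftw.interfaces.numpy_fft as numpy_fft
import pyfftw.interfaces.cache

# Keep the planned FFTW objects around between calls, which removes
# most of the per-call overhead mentioned above.
pyfftw.interfaces.cache.enable()

a = numpy.random.randn(1024) + 1j * numpy.random.randn(1024)

# Used exactly as numpy.fft.fft would be.
result = numpy_fft.fft(a)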

Both the above interfaces build on the simplified interface for constructing FFT objects, pyfftw.builders, which is itself of potential interest to end users who want to squeeze out every drop of performance without getting bogged down in the details of creating pyfftw.FFTW objects directly.

In addition, some nice helper functionality has been added to pyfftw.FFTW objects. Mainly, the objects can now be called to yield the suitable output, updating the arrays as desired, with normalisation optionally applied to inverse FFTs. This makes creating and calling a pyfftw.FFTW object as simple as:
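import numpy
import pyfftw

# A 32-byte-aligned input array (the 0.9.1-era helper; exact helper names
# may differ in newer versions).
a = pyfftw.n_byte_align_empty(128, 32, dtype='complex128')
a[:] = numpy.random.randn(128) + 1j * numpy.random.randn(128)

# Build the FFTW object through the builders interface...
fft_object = pyfftw.builders.fft(a)

# ...and just call it to get the output.
result = fft_object()

# Updating with a new array that is neither explicitly aligned nor of the
# right dtype: the call handles the correction automatically.
b = numpy.random.randn(128)
result2 = fft_object(b)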

Though a is explicitly aligned to 32 bytes in the above example, the builder will by default handle that automatically, and the FFTW object will consequently align all array updates to make sure the alignment is maintained (the optimum alignment is determined by inspecting the CPU). In the above example, the update to the array is not explicitly aligned and is of the incorrect dtype for the FFTW object; the correction is automatically handled by the call.

That, in a nutshell, covers the most important improvements, though there are loads of little usability enhancements and bug fixes besides. Please, everyone, go and have a play with it and report any bugs on the github page.

Common wisdom says that the mex files that Matlab builds are good for Matlab, and Matlab only. Not having trivial access to an installation of Matlab and needing to access the very neat and useful Field II package, this was not wholly satisfactory for me.

Field II is a package for simulating an ultrasound field. It doesn’t matter what it does really, except that it only comes with precompiled mex files (no source :(). These mex files (in my case, the 64-bit version with extension .mexa64) are really just poorly disguised ELF shared libraries, so all the usual library inspection tools work – ldd, nm, objdump – as well as all manner of nefarious library hacking.

It turns out that it actually wasn’t that big a deal to get the library to work. Octave handily provides a Matlab API, so implicitly has a chunk of code that does the right stuff; the only problem then is one of ABI compatibility. Now, a quick glance through the mex documentation seems to suggest that only standard types, or pointers to exotic types, are passed around on the stack, and also dictates that the Matlab internal objects are opaque, so the ABI for the Octave library is probably going to be compatible (another learning point for me: be nice to all potential users and keep function parameters clean of exotic types that aren’t references – thank you Mathworks!).

Before we even have a hope of linking the library though, we must first find out what’s missing. A peer into the distributed mex file tells us what we need to worry about:

$ ldd Mat_field.mexa64

gives us, among other things, the missing libraries (Mat_field.mexa64 is the mex file of interest):

libmx.so => not found
libmex.so => not found
libmat.so => not found

We can easily create a set of dummy libraries to keep the linker happy with these:
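# one way to do this: compile an empty source file into stand-in shared
# libraries with the names that ldd reported as missing
touch empty.c
gcc -shared -fPIC -o libmx.so empty.c
gcc -shared -fPIC -o libmex.so empty.c
gcc -shared -fPIC -o libmat.so empty.c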

Now we only need to tie it all together with the octave libs whence the missing symbols will come. You can see what octave is doing when it builds an oct file by passing the -v (verbose) flag to mkoctfile. Since we don’t actually need to compile everything, we just want to relink, so we hope the following will work (fortunately, none of the distributed mex files had the .mex extension, so I could claim it myself, otherwise there might well be a name collision that needs working around)…

Unfortunately, the distributed mex file uses a couple of symbols for functions that are not documented by Mathworks. These functions seemed to be version-specific implementations of documented functions, so I just reimplemented them as wrappers around the documented functions (and by extension, those that are implemented in Octave). In my case, the missing functions were mxCreateDoubleMatrix_700 and mxGetString_700 (Octave tells you the missing symbols when you try to use the library). The following sort of wrapper did the trick:
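#include "mex.h"

/* The _700-suffixed symbols appear to be version-specific variants of the
 * documented functions, so (assuming they share the documented signatures)
 * thin wrappers around the documented functions are enough. */

mxArray *mxCreateDoubleMatrix_700(mwSize m, mwSize n, mxComplexity flag)
{
    return mxCreateDoubleMatrix(m, n, flag);
}

int mxGetString_700(const mxArray *pm, char *str, mwSize buflen)
{
    return mxGetString(pm, str, buflen);
}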

I’ve done a bit of coding in the past with SSE instructions and achieved quite a significant speedup. I’ve also been playing recently with OpenCL as a means of implementing a fast, cross-device version of the Dual-Tree Complex Wavelet Transform (DT-CWT).

Since I lack a GPU capable of supporting OpenCL, I’m rather limited to targeting the CPU. This is a useful task in itself, as I would like a DT-CWT library that is fast on any platform, making use of whatever hardware exists.

OpenCL implements a vector datatype, which the CPU compilers will try to map to SSE instructions. If no vector instructions are available, the compiler should just unroll the vector and perform scalar operations on each element (as would likely happen on most GPUs).

At the core of the DT-CWT is a convolution. The faster the convolution can be done, the faster the DT-CWT is done. This means a good starting point for a DT-CWT implementation is producing an efficient implementation of a convolution.

Given the difficulty of writing and debugging OpenCL, and the need to initially target a CPU, it made sense to me to create, in the first instance, a pure C version of an efficient convolution.
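Something along these lines (a straightforward direct implementation – a sketch rather than the exact code behind the timings below):

#include <stddef.h>

/* Direct convolution: output has length (length - kernel_length + 1). */
void convolve_naive(const float *input, size_t length,
                    const float *kernel, size_t kernel_length,
                    float *output)
{
    for (size_t i = 0; i < length - kernel_length + 1; i++) {
        float acc = 0.0f;
        for (size_t k = 0; k < kernel_length; k++) {
            acc += input[i + k] * kernel[k];
        }
        output[i] = acc;
    }
}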

This is compiled with the following options:

gcc -std=c99 -Wall -O3 -msse3 -mtune=core2

Running this with a 1024 sample input vector and a 16 sample kernel, it takes about 24.5μs per loop (on my rather old Core2 Duo).

The first modification is to reimplement this with SSE instructions, computing 4 output values in parallel. If the kernel is of length N, we create N 4-vectors and copy each element of the kernel into every element of the corresponding 4-vector. We can then compute 4 output values at a time. For this, we require that the length of the input vector is a multiple of 4.

Note that we need to compute the last value in the output separately, as the output length is not a multiple of 4. We could also extend this principle to inputs whose length is not a multiple of 4 (but I don’t here!).

I use the SSE intrinsics (specifically, SSE3), included with:

#include <pmmintrin.h>
#include <xmmintrin.h>

The __m128 type is used to denote an SSE 4-vector that can be used with the intrinsics.
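Putting that together, the unaligned SSE version looks something like this (a sketch of the approach described above, not the exact code that was timed):

#include <pmmintrin.h>
#include <xmmintrin.h>
#include <stddef.h>

void convolve_sse_unaligned(const float *input, size_t length,
                            const float *kernel, size_t kernel_length,
                            float *output)
{
    size_t output_length = length - kernel_length + 1;

    /* Splat each kernel tap across a 4-vector. */
    __m128 *kernel_vecs = _mm_malloc(kernel_length * sizeof(__m128), 16);
    for (size_t k = 0; k < kernel_length; k++) {
        kernel_vecs[k] = _mm_set1_ps(kernel[k]);
    }

    /* Compute 4 outputs per iteration using unaligned loads. */
    size_t i = 0;
    for (; i + 4 <= output_length; i += 4) {
        __m128 acc = _mm_setzero_ps();
        for (size_t k = 0; k < kernel_length; k++) {
            __m128 block = _mm_loadu_ps(input + i + k);  /* unaligned load */
            acc = _mm_add_ps(acc, _mm_mul_ps(block, kernel_vecs[k]));
        }
        _mm_storeu_ps(output + i, acc);
    }

    /* The leftover outputs (the output length is not a multiple of 4)
     * are computed one at a time. */
    for (; i < output_length; i++) {
        float acc = 0.0f;
        for (size_t k = 0; k < kernel_length; k++) {
            acc += input[i + k] * kernel[k];
        }
        output[i] = acc;
    }

    _mm_free(kernel_vecs);
}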

We’re still not done though! Each load from the dataset is an unaligned load (_mm_loadu_ps). Aligned loads using SSE are significantly faster than unaligned loads. To make the load aligned, the address of the data to be copied in must be a multiple of 16 (with AVX, Intel’s latest vector instruction set incarnation, I believe this has been increased to 32).

Unaligned loads are a necessity in the code so far as we step over every sample, meaning we never know what the byte-alignment of any given block of 4 floats will be. By somehow persuading every load to be aligned, we might be able to get more speed.

We can do this by making 4 copies (well, almost whole copies) of the input array, each with a differing alignment, such that every possible set of 4 floats can be found on the crucial 16-byte alignment.

If we start with the original data, with the numbers showing the sample numbers:
original: [0, 1, 2, 3, 4, 5, …]
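then the four (16-byte aligned) copies each start one sample later:

copy 0: [0, 1, 2, 3, 4, 5, …]
copy 1: [1, 2, 3, 4, 5, 6, …]
copy 2: [2, 3, 4, 5, 6, 7, …]
copy 3: [3, 4, 5, 6, 7, 8, …]

For the block of 4 floats starting at sample i + k (with i a multiple of 4), the copy shifted by k mod 4 samples holds that block at an offset that is itself a multiple of 4 floats, so every load in the inner loop can be an aligned _mm_load_ps.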

A quick test demonstrates that the speed up from the aligned load overwhelms any increase in time due to extra copies being made.

The time with the same input and kernel is now about 5.4μs.

One final tweak is to fix the length of the kernel loop, replacing kernel_length with a constant. This allows the compiler to unroll the whole of the kernel loop and yields a time of about 4.0μs (code here).

Inspecting the generated assembly for the inner loop suggests that we’ve pretty much reached the limit of what is possible, with most of the time spent doing an aligned load, followed by a 4-vector multiplication, followed by a 4-vector add – the essence of a convolution.

Running on larger arrays suggests that the speedup scales well (I tested up to 32,768 samples). In the repository is a Python script that you can play with to convince yourself.