BofAs Birnbaum: Memory Matters to Program in Parallel

If youre going to speed up the processing of bond, stock or other securities transactions by processing different parts of the overall task in parallel, memory matters. And so does understanding how memory works, said Jeffrey Birnbaum, global head of platform solutions at Bank of America Merrill Lynch. If you dont understand how memory works, he said at a workshop at the High Performance Computing on Wall Street conference at the Roosevelt Hotel Monday, you have no chance at creating good, fast code, using parallel programming techniques. If you dont understand how your systems caches of memory or other subsystem work, a lot of the high performance that parallelism promises  where chores are divided up and attacked at the same time  will be lost. Dont be afraid, he said, in fact, to duplicate memory. If need be, in effect, give the same information on bonds or stocks or other instruments to each thread of processing that is going on at once. Give each thread an exclusive copy of the data that is needed. To get started in a trading or operations shop with parallel programming, Birnbaum recommended taking a section of a large program and trying out chopping that up into component chores. Then, see if that section gets the boost in performance that is expected. But, he cautioned, dont expect the overall program to run 10 times faster, because of that try out. The section might run 10 times faster, but the overall program will not speed up overall by that amount. Just the amount, he said, that the section contributes to overall performance. Other recommendations from Birnbaum:

Use proven programming languages, such as C and C++, or Java and C#. There are other alternatives. Some incredible stuff has been done in Erlang, he said, and CUDA is hot. But moves to these languages can be costly, and new approaches may pass them by.

Check out the ecosystem of tools that surrounds the languages you choose. In C and C++ running on x86 processors, for instance, chipmaker Intel provides thread-building blocks of code that can make initial attempts at parallel programming easier. An extension to C++ called Ct is also coming, which will perform a lot of the parallelism under the cover for programmers.

Make sure you really have a problem that can be broken down into parallel pieces. Solving a Rubik's Cube, he noted, really can only be done serially. Of course, he said, even if an entire program can't be parallelized, there may be parts within it that can be.

The results can be dramatic, he said. If, as his shop does, you spray thousands of bonds at a computer and can work them in parallel, the speed benefits are significant. He estimated that with parallel programming, processing times could be reduced from 30 seconds to five seconds per bond.