Tag Archives: J

Post navigation

An Iverson ghost is an embedding of APL like array programming features in nonAPL languages and tools.

You would be surprised at how often Iverson ghosts appear. Whenever programmers are challenged with processing large numeric arrays they rediscover bits of APL. Often they’re unaware of the rich heritage of array processing languages but in NumPy's case, they indirectly acknowledged the debt. In Numerical Python the authors wrote:

“The languages which were used to guide the development of NumPy include the infamous APL family of languages, Basis, MATLAB, FORTRAN, S and S+, and others.”

Not only do developers frequently conjure up Iverson ghosts. They also invariably turn into little apostles of array programming that won’t shut up about how cutting down on all those goddamn loops clarifies and simplifies algorithms. How learning to think about operating on entire arrays, versus one dinky number at a time, frees the mind. Why it’s almost as if array programming is a tool of thought.

Where have I heard this before?

Ahh, I’ve got it, when I first encountered APL almost fifty years ago.

Yes, I am an old programmer, a fossil, a living relic. My brain is a putrid pool of punky programming languages. Python is just the latest in a longish line of languages. Some people collect stamps. I collect programming languages. And, just like stamp collectors have favorite stamps, I find some programming languages more attractive than others. For example, I recognize the undeniable utility of C/C++, for many tasks they are the only serious options, yet as useful and pervasive as C/C++ are they have never tickled my fancy. The notation is ugly! Yeah, I said it; suck on it C people. Similarly, the world’s most commonly used programming language JavaScript is equally ugly. Again, JavaScript is so damn useful that programmers put up with its many warts. Some have even made a few bucks writing books about its meager good parts.

I have similar inflammatory opinions about other widely used languages. The one that is making me miserable now is SQL, particularly Microsoft’s variant T-SQL. On purely aesthetic grounds I find well-formed SQL queries less appalling than your average C pointer fest. Core SQL is fairly elegant but the macro programming features that have grown up around it are depraved. I feel dirty when forced to use them which is just about every other day.

At the end of my programming day, I want to look on something that is beautiful. I don’t particularly care about how useful a chunk of code is or how much money it might make, or what silly little business problem it solves. If the damn code is ugly I don’t want to see it.

People keep rediscovering array programming, best described in Ken Iverson’s 1962 book A Programming Language, for two basic reasons:

It’s an efficient way to handle an important class of problems.

It’s a step away from the ugly and back towards the beautiful.

Both of these reasons manifest in NumPy‘s resounding success in the Python world.

As usual, efficiency led the way. The authors of Numerical Python note:

Why are these extensions needed? The core reason is a very prosaic one, and that is that manipulating a set of a million numbers in Python with the standard data structures such as lists, tuples or classes is much too slow and uses too much space.

Faced with a “does not compute” situation you can either try something else or fix what you have. The Python people fixed Python with NumPy. Pythonistas reluctantly embraced NumPy but quickly went apostolic! Now books like Elegant SciPy and the entire SciPy toolset that been built on NumPy take it for granted.

Is there anything in NumPy for programmers that have been drinking the array processing Kool-Aid for decades? The answer is yes! J programmers, in particular, are in for a treat with the new Python3 addon that’s been released with the latest J 8.07 beta. This addon directly supports NumPy arrays making it easy to swap data in and out of the J/Python environments. It’s one of those best of both worlds things.

The following NumPy examples are from the SciPy.orgNumPy quick start tutorial. For each NumPy statement, I have provided a J equivalent. J is a descendant of APL. It was largely designed by the same man: Ken Iverson. A scumbag lawyer or greedy patent troll might consider suing NumPy‘s creators after looking at these examples. APL’s influence is obvious. Fortunately, Ken Iverson was more interested in promoting good ideas that profiting from them. I suspect he would be flattered that APL has mutated and colonized strange new worlds and I think even zealous Pythonistas will agree that Python is a delightfully strange world.

While browsing in a favorite bookstore with my son, I spotted a display of horoscope themed Christmas tree ornaments. The ornaments were glass balls embossed with golden birth signs like Aquarius, Gemini, Cancer, et cetera, and a descriptive phrase that “summed up” the character of people born under a sign. Below my birth sign golden text declared, “Imaginative and Suspicious.”

I said to my son, “I hate it when astrological rubbish is right.”

I am imaginative and suspicious; it’s a curse. When it comes to money my “suspicious dial” is permanently set on eleven. I assume everyone is out to cheat and defraud me until there is overwhelming evidence to the contrary. Paranoia is generally crippling but when it comes to cold hard cash it’s a sound retention strategy.

Prompted by an eminent life move, I found myself in need of a cash flow forecasting tool. Normal people deal with forecasting problems by buying standard finance programs or cranking up spreadsheets; imaginative and suspicious people roll their own.

SWAG

Used highly portable, durable, and version control friendly inputs and outputs.

Reflected what ordinary people, not tax accountants, actually do with money.

Is open source and unencumbered by parasitic software patents.

Amazingly, my short list of no-brainer requirements eliminates most standard finance programs. Time to code!

SWAG Inputs

The bulk of SWAG is a JOD generated self-contained J script. You can peruse the script here. SWAG inputs and outputs are brain-dead simple TAB delimited text tables. Inputs consist of monthly, null-free, numeric time series tables, scenario tables, and name cross-reference tables. Outputs are simple, null-free, numeric time series tables. Input and output time series tables have identical formats.

A few examples will make this clear. The following is a typical SWAG input and output time series table.

The first header line is a simple list of names. The first name “Date” heads a column of first of month dates in YYYY-MM-DD format. The SWAG clock has month resolution and dates are the only nonnumeric items. Names beginning with “E” like E0, E1, …, are aggregated expenses. Names beginning with “I” like I0, I1, I2 … are income totals. “R” names are reserves: basically savings, investments, equity and so forth. “D” names are various debts. BB is basic period balance, NW is period net worth and “U” names are utility series. Utility series facilitate calculations. Remaining names are self-explanatory totals. Be aware that this table has been formatted for this blog. Examples of raw input and output tables can be found here.

The next ingredient in the SWAG stew is what many call a scenario. A scenario is a collection of prospective assumptions and actions. In one scenario you buy a Mercedes and assume interest rates remain low. In another, you take the bus and rates explode. When forecasting I evaluate five basic scenarios, grim, pessimistic, expected, optimistic, and exuberant. Being a negative Debbie Downer type I rarely invest time in exuberant scenarios. I concentrate on grim and pessimistic scenarios because once you are mentally prepared for the worst anything better feels like a lottery win.

The following is a typical SWAG scenario table. Scenario tables, like time series tables, are simple TAB delimited text files.

Again the first header row is a simple list of names. Most scenario names are self-explanatory but four OnDate, OffDate, Method, and MethodArguments merit some explanation. SWAG series methods assume, history, reserve, transfer, borrow, and spend are modeled on what people typically do with cash.

assume sets expected interest rates and other global assumptions for a given time period. SWAG series methods operate over a well-defined time period. The period is defined by OnDate and OffDate.

history looks at past periods and estimates a numeric value that is projected into the future. Currently, history computes simple means but the underlying code can use arbitrary time series verbs.

borrow borrows money and sets future loan payments. borrow supports simple amortization loans but is also capable of reading an arbitrary payment schedule that can be used for exotic2 loans.

transfer moves money between reserves, debts, expenses and income series.

spend does just what you expect.

SWAG series methods adjust all the series affected by the method. As you might expect SWAG arguments methods are detailed. MethodArguments uses a restricted J syntax to set SWAG arguments. Argument order does not matter but only supported names are allowed. Many examples of SWAG MethodArguments can be found in the EXCEL spreadsheet tp.xlsx. I use EXCEL as a scenario editor. By setting EXCEL filters, you can manage many scenarios.

The final SWAG input is a name cross-reference table. It is another TAB delimited text file that defines SWAG names. You can inspect a typical cross-reference table here.

Running SWAG

To run SWAG you:

Prepare input files.

Start J, any front-end jconsole, JQT or JHS will do, and load the Swag script.

Prepare Input Files

By far the most difficult step is the first. Here you review your financial status which means checking bank balances, stock values, loan balances and so on. Depending on your holdings this could take anywhere from minutes to hours. I call this updating actuals. Not only is updating actuals the most difficult and time-consuming step it is also the most valuable. Money that is not closely watched leaks away.

I store my actuals in a simple tabbed spreadsheet. Each tab maintains an image of a text file. I enter my data and then cut and paste the sheets into a text editor where I apply final tweaks and then save the sheets as TAB delimited text files.

Monthly income, expenses and debts are easy to update but some of my holdings do not offer monthly statements. The verb RawReservesFromLastin Swag.ijs fills in missing months with the last known values. When I’m finished preparing input files I’m left with four actual TAB delimited files, RawIncome.txt, RawExpenses.txt, RawReserves.txt, and RawDebts.txt. You can inspect example actual files here.

Start J and load the Swag script.

The SWAG script is relatively self-contained. It can be run in any J session that loads the standard J profile. Load Swag with the standard load utility.

load 'c:/pd/fd/swag/swag.ijs'

Here SWAG is loaded in JHS.

Execute RunTheNumbers

RunTheNumbers sets the SWAG configuration, loads scenarios, copies actuals to each scenario, and then evaluates each scenario. Scenarios are numbered. I use positive numbers for “production” scenarios and negative numbers for test scenarios. It sounds more complicated that it is. This is all you have to do to execute RunTheNumbers

RunTheNumbers writes a pair of TAB delimited forecast and statistics files for each scenario it evaluates.

Open swag.xlsx and press “Refresh All”

The spreadsheet swag.xlsx loads SWAG TAB delimited text files and plots results.3 I plot monthly cash flow, estimated net worth and debt/equity for each scenario. The following is a typical cash flow plot. It estimates mean monthly cash balance over the scenario time range.

The polynomial displayed on the graph is an estimated trend line. Things are looking bleak.

Here’s a typical net worth plot.

In this happy scenario, we die broke and leave a giant bill for the government.4

So far SWAG has met my basic needs and forced me to pay more attention to the proverbial bottom line. As I use the system I will fix bugs, refine rough spots, and add strictly necessary features. Feel free to use or modify SWAG for your own purposes. If you find SWAG useful please leave a note on this blog or follow the SWAG repository on GitHub.

What do you call dis-integrated collections of programs that you use to solve problems? Declaring such dog piles “systems” demeans the word “system” and gives the impression that everything has been planned. This is not how I roll. “Mob” is far more appropriate. It conveys a proper sense of disorder and danger.↩

When borrowing money you should always plan on paying it all back. Insist on a complete iron clad repayment schedule. If such a schedule cannot be provided run like hell or prepare for the thick end of a baseball bat to be rammed up your financial ass.↩

It may be necessary to adjust file paths on the EXCEL DATA ribbon to load SWAG TAB delimited text files.↩

jodhelp changes (#1, #3 and #5)

jodhelp has always been a kludge. In programmer speak a kludge is some half-baked facility added to a system after more essential features have stabilized. The original versions of jodhelp pointed at my rough notes. It was all the “documentation” I needed! Then others stared using JOD which resulted in an “evolved” online version of my notes. I originally thought that hosting my notes online would simultaneously serve user needs and cut the amount of time I spent maintaining documentation. In retrospect this wasn’t even wrong!

I used Google Documents to host my notes. If you’ve ever wondered why completely free Google Documents hasn’t obliterated expensive Microsoft Word or hoary old excellent I invite you to maintain a set of long-duration-documents with Google Documents. During jodhelp‘s online lifetime the basic internal format of Google Documents changed in a screw-your-old-documents upgrade which forced me to spend days repairing broken hyper-links and reformatting. I was not amused; you still get what you pay for!

I originally choose Google Documents because of its alleged global accessibility. Sadly, Google Documents is now often blocked by corporate and national firewalls. Even when it isn’t blocked it renders like a dog peeing on a fire hydrant. All these problems forced me to rewrite JOD documentation with a completely reliable tool: good old-fashioned . The result of my labors, jod.pdf, is now distributed by the joddocument addon and is easily browsed with jodhelp.

After jod.pdf‘s appearance another irritant surfaced: synchronizing jod.pdf and the online version. I tried using pandoc and markdown to generate both the online and PDF versions from the same source files but jod.pdf is too complex for not-to-fancy portable approaches. I was faced with a choice, lower my jod.pdf standards, or get rid of something I never really liked. I opted to drown a child and abandon online help. I don’t expect a lot of mourners at the funeral.

Using the new version of jodhelp requires installing the addon joddocument and configuring a J PDF reader. It’s also good idea to define a JQT PF key to pop up JOD help with a keystroke. To configure a J PDF reader edit the configuration file:

~config/base.cfg

this file is directly available from the JQT Edit\Configure menu. base.cfg defines a number of operating system dependent utilities. Make changes to the systems you use, save your changes, and restart J. The following example shows my Win64 system settings.

jodsource changes (#2)

The jodsource addon is a collection of JOD dump scripts. Dump scripts are serialized versions of binary JOD dictionaries. When executed they merge objects into the current JOD put dictionary. I use them primarily to move dictionaries around but they have other uses as well. Prior to this version I distributed the three main JOD development dump scripts, joddev, jod, and utils in one compressed zip file to reduce the size of JAL downloads.

The distributed script jodsourcesetup.ijs used the zfiles addon to extract these scripts and rebuild JOD development dictionaries. This worked on 32 bit Windows systems but failed elsewhere. J now runs on 32/64 bit Windows, Mac, Linux, IOS and Android systems. To better support all these variants I eliminated the zfiles dependency and pruned the JOD development dictionaries. The result is a more portable and smaller jodsource addon.

Bye bye volume sizing (#4)

Early versions of JOD ran in the now bygone era of floppy disks. It was possible to create many JOD dictionaries on a single standard 800 kilobyte 3.5 inch floppy. Compared to modern porcine-ware JOD, which many J’ers consider a huge system, is lithe and lean. In floppy days it was important to check if there was enough space on a floppy before creating another huge 48K empty JOD dictionary. This is a bit ridiculous today! If you don’t have 48K free on whatever device you are running you have far more serious problems than not being able to create JOD dictionaries.

Volume sizing code remained in JOD for years until it started giving me problems. Returning the size of very large network volumes can be time-consuming and there are serious portability issues. Every operating system calls different facilities to return volume sizes. Even worse, security settings on corporate networks and cloud architectures sometimes refuse to divulge national secrets like free byte counts.

To eliminate all these headaches this version of JOD no longer checks volume size when the FREESPACE noun is zero. To restore the previous behavior you have to edit the file

~addons/general/jod.ijs`

and change the line FREESPACE=:0 to whatever byte count you want. Alternatively, you could NGAF3 and just assume you have 48K free on your terabyte size volumes.

Still to come

You may have surmised from JOD’s version number that the system is still not feature complete. The JOD manual lists a few words that I am planning to implement. I only develop JOD when I need something or I am bored out of my mind at work and need a break. Such intermittent motivators seldom insure project completion but I have found a new reason to finish JOD. To list a book on Goodreads or Amazon you need an ISBN number. The hardcopy version of the JOD manual is a sort-of-published book. To complete the publishing process I need an ISBN. If I am going to bother with such formalities I might as well complete the system the manual describes. So there you have it a new software development motivator: vanity.

The version number is *‘ed because you are always a point release from done!↩

The genesis block is the first block on the Bitcoin blockchain. Satoshi Nakamoto, the mysterious entity that created Bitcoin, mined the genesis block on January 3, 2009. It’s been five years since the genesis block’s birth and Satoshi is still unknown, Bitcoin is bigger than ever, and the blockchain is longer than 300,000 blocks and growing.

One of the most important features of the blockchain is its immutability. After the Bitcoin network accepts a block and adds it to the blockchain it can never be altered. This makes Bitcoin blocks rare durable binary artifacts. The cryptographic hash algorithms that underpin the Bitcoin protocol enforce block immutability. If someone decides to tinker with a block, say maliciously flip a single bit, the block’s hash will change and the network will reject it. This is what makes it almost impossible to counterfeit Bitcoins. Bitcoins have been lost and stolen but they have never been successfully counterfeited. This sharply contrasts with funny money like the US dollar that is so routinely and brazenly counterfeited that many suspect the US government turns a blind eye.

The exceptional durability of Bitcoin blocks, coupled with the mysterious origins of Bitcoin, makes the genesis block one of the most intriguing and important byte runs in the world. This post was inspired by the now defunct post 285 bytes that changed the world. I would love to give you a link but this post has vanished. A secondary, but excellent reference is John Ratcliff’s How to Parse the Bitcoin BlockChain. I am adapting John’s nomenclature in what follows.

When programmers start exploring Bitcoin they often cut their teeth on parsing the genesis block. If you Google “blockchain parsing” you’ll find examples in dozens of programming languages. The most popular are C, C++, Java, PHP, C#, JavaScript, and the rest of the mainstream suspects. What you will not find, until now, are J examples.

So what does J bring to the table that makes yet another genesis block parser worth a look? Let’s take a look at Bitcoin addresses. The following is the Bitcoin address of this blog’s tip jar. Feel free to send as many Satoshis and full Bitcoins as you like to this address.

tip=. '17MfYvFqSyeZcy7nKMbFrStFmmvaJ143fA'

There is nothing deep or mysterious about this funny string of letters; it’s just a plain old number in Bitcoin base 58 clothing. So, what is this number in standard format? Here’s how it’s calculated with J.

The second line that defines dfb58, (decimal from base 58), is the complete J program! That’s it folks. You can troll the internet for days looking at base 58 to big integer converters and it’s unlikely you will find a shorter or more elegant conversion program. Not only is the J version short and sweet it’s also fast and versatile. Suppose you wanted to convert ten thousand Bitcoin addresses. The following converts ten thousand copies of tip.

At this point fanboys of mainstream programming languages typically pipe up with something like, “changing number encodings is inherently trivial; what about something more demanding like going the other way, say converting Bitcoin public keys to the base 58 address format?”

The public key in the genesis block is encoded in what many call the “challenge script.” Here is the genesis block’s challenge script in hex.

Public keys take a number of forms in the blockchain. John Ratcliff’s post summarizes the many forms you will run into. The genesis block uses the 65 byte ECDSA form. Converting this form to base 58 requires taking SHA-256 and RIPEMD-160 hashes. These hashes are available in OpenSSL which is conveniently distributed with J 8.02 JQT. Here’s how to convert the genesis block’s public key to base 58 with J.

The ChallengeScript noun holds the bytes given in hex above. The verbs sr150, s256 and Base58Check are available in the J scripts sslhash and ParseGenesisBlock that I have put in the jacks repository on GitHub.

The following J verb ParseGenesisBlock reads the first full node Bitcoin block file and then extracts and checks the genesis block. ParseGenesisBlock tests the various verbs, (functions), it employs. As a side effect it clearly describes the layout of the genesis block and provides test data for anyone that’s interested.

If this post peeks your curiosity about J a good place to start learning about the language is the recently released New Dictionary of J. You can download a version of J for Windows, Linux, OS/X, IOS, and Android at Jsoftware’s main site.

I have pushed out a JOD update that makes it possible to run the addon on J 8.02 systems. In the last eight months a QT based J IDE has been developed that runs on Linux, Windows and Mac platforms. To maintain JOD’s compatibility across all versions of J from 6.02 on I had to tweak a few verbs.

The only significant changes are how JOD interacts with various J system editors. I have tested the system on Windows J 6.02, J 7.01, J 8.02, and Mac J 7.01 systems. I expect it will behave on 32 and 64 bit Linux systems from J 7.01 on, but I have yet to test these setups. My hardware budget limits my ability to run common variants of Windows, Linux and Mac systems.

JOD is still not complete; that’s why the version number has not been bumped past 1.0.0. The missing features are noted in the table of contents of jod.pdf, (also available in the joddocument addon), with the suffix “NIMP,” which means “not implemented.” I will fill in these blanks as I need them. Most of the time JOD meets my needs so don’t hold your breath.

I joke that my job title should be software archaeologist because I often find myself resurrecting, not refactoring, code that dates to primitive and primeval eras. The language I’m typically hired to resurrect is APL. APL, the language with funny symbols, is a software vampire. People keep paying us to kill it, but no matter how many stakes we pound through its heart it keeps coming back.

There are good reasons for this. APL embodies many timeless ideas and I’m confident that programming in the future will look a lot more like APL than many expect. If you doubt me just press the Siri button on your iPhone and ask, “Integrate X squared times sine X from 0 to 2.” What comes back has more of an APL than QWERTYUIOP flavor. Strange Unicode characters are creeping into many mainstream languages. This is a good thing because restricting programming to the miserly key sets of ancient typewriters was, is, and always will be a spectacularly bad idea. Ken Iverson deserves rich accolades for pointing this out more than fifty years ago and beating this drum incessantly during his lifetime. Iverson taught that notation is a tool of thought and that if you care about ideas you must care about how they are expressed. Why is this even remotely controversial?

Siri’s results use appropriate mathematical notations. As we move away from keyboards programming languages and mathematical notation will merge. APL was way ahead of its time in this respect.

The genius of APL continues to exert influence on many programming languages, but APL’s rise had little to do with its abstract notation and a lot to do with how it was implemented. APL was one of the first programming environments that nonprogrammers could use. It was the spreadsheet of the late 1960’s and 1970’s and just like spreadsheets of today a lot of utterly horrid, poorly structured, lame amateur messes were created with it. If you’ve ever cracked open a gigantic Excel model that looks like it was developed by a roomful of quarreling ADHD afflicted unionized chimpanzees then you know what the standard APL mess feels like. Many programmers blamed APL for this just like gun control advocates blame firearms for shootings. They argued that it would have been impossible to concoct such monsters in clean compiled languages like Pascal. “It wouldn’t even compile.” This is not even wrong. I’ve dealt with plenty of dreadful messes that do compile! The tool is always neutral; don’t blame the paintbrush for the painting.

Allowing rubes to code yields mountains of rubbish and the occasional ruby. It will shock many programmers to learn they are not the only smart people in the world. It turns out that nonprogrammers occasionally have good ideas and, miraculously, some of them can ably express their ideas in code. Before spreadsheets such user rubies congealed in APL where some still run. Part of my day job is extracting these precious stones from layers and layers of kluges, hacks, patch jobs, retro-fits and workarounds and recoding them in modern programming languages like C# and JavaScript.

Recently I recovered1 an ancient inverted file system embedded in the APL systems of my employer and rendered it in C#. This system uses the extension .dbi. I don’t know who created this system; the code is old. The most recent code comments date from the year 2000, but I am pretty sure that .dbi files predate component files in APL+WIN, formerly STSC APL, which pushes the design back to the 1980’s or earlier. I know many APL’ers check this blog. If any of you know who created the original .dbi APL code please leave a note.

Somehow this .dbi system survived unsupported, with few user complaints, for decades of daily use. How is this possible? Astonishingly, good ideas age well and the core .dbi idea is inverted data. Modern high-performance databases make heavy use of this method. Inversion is so effective that hoary old interpreted APL code still beats compiled and optimized ADO.Net when fetching large numeric vectors and tables.

Restoring the .dbi system was a two-step process.2 I first converted the APL system to J. I used J because it is a close relative of APL but not so close that you can cut and paste. Translating nontrivial APL to J forces you to understand the APL at the nit-bitty level. The translation to J also allowed me to fix the APL interface. The original system used global variables, rampant branches and other lamentable coding practices that C# will not abide. After matching the APL and J systems I then translated the J to C# and then rematched all three systems.

Comparing multiple systems is a very effective testing technique. I found bugs in all three systems. I fixed the J and C# bugs but left the original APL code unchanged. Software archaeology is a delicate field. You don’t “fix” old code just like you don’t correct errors in cuneiform tablets. Original and important program code belongs in museums with other significant cultural artifacts.

Original inverted file code probably belongs in a museum. This .dbi APL code is old, but it certainly derives from earlier programs so it’s not museum worthy. Even if it was the APL and C# .dbi systems belong to my employer. However, I am placing the J scaffold version, which matches the performance of the other systems, into the public domain. The script is available on GitHub and here. The .dbi system gets right down to bits in some cases and illustrates some J techniques for dealing with indexed binary inverted file data. Enjoy!

.dbi files held many gigabytes of actuarially tuned data. Dumping them was not an option. We either had to convert to a new store or produce a component that could read old data in new systems.↩

Restoring old code is somewhat like restoring old pictures. When working on old pictures you’re always tempted to improve them. With pictures you usually have a choice. This may not hold for old code. Changes in software may force updates.↩