Introduction to J Programming Language [2004]

Take a look at the brief introduction to the J programming language. It took me two days to write the article. I got familiar with this language a little more than a month ago, when looking for a convenient way for programming on the Pocket PC. All this month, I read books and articles about J, K and APL. I also tried out programs of other developers, wrote some trivial ones of my own and recently wrote my first, non-trivial program, which I improved and described in this article. My overall impression of J programming language is admiration.

You’re welcome to read my article and share your thoughts about any mistakes or ambiguities found in it.

Introduction

Evolutionary Branches

In these days of universal computer literacy, it’s difficult to find an active person, who has no idea of programming languages that allow to control the computer. But even among the experts, you can hear the opinion that “all languages are similar”. Sure enough, popular languages that originated from Fortran, such as C, C ++, Java, Python, differ in the details of implementation and syntax. With all the differences between these details, as a rule, a program written in any of these languages can be easily read by a person familiar with any other of these programming languages.

The basis of this similarity is the historical formulation of programming, as the problem of moving (and processing) data from one memory cells of a computer to other ones. Plus, the movement is element-wise. Assembler (as opposed to programming in machine code) made it possible to name cells and their groups with symbolic names. In Fortran, platform- and architecture-independent operations on the named cells have been introduced. Then, the C language provided even more architecture-independent access to computer resources, allowing to operate directly on references and pointers to cells in the linear memory. Even in C++ and Java languages, a program is just a linear (with branching and loops) instruction for moving (and processing) the data located in the linear memory. These languages are just variants of the Turing linear machine in which operators are the combinations of Turing commands. By the way, Alan Turing is the author of the historical formulation mentioned above. The evolution of these languages moved towards hiding differences between various implementations of Turing machines. However, it provided as much as possible direct access to what is common between these machines (that is, to Turing’s one-dimensional memory tape and program tape with numbered commands to perform moves). The apotheosis of this evolution are probably the languages for virtual machines, Python and Java, in which cross-platform differences are smoothed over most strongly, but still, programs are read and look like Fortran.

Another historical peculiarity of the languages from this evolutionary branch is the scalar nature of their basic operations that affect the cells one by one. It’s connected with the fact that first multipurpose processors were scalar, as well as the Turing machine itself.

Summarizing, we could say that the mentioned above language branch developed by the principle of “being as close as possible to the “hardware” (i.e. to the Turing machine), abstracting the differences between its various implementations”. Of course, there’s nothing wrong with it, but the hardware in these languages has defined the course of development. Moreover, hardware also develops all the time. As for today, it’s far from being scalar, one-dimensional, one-processor and Turing[ous].

Another branch was developing in-parallel with the already mentioned one. I guess we should count its existence from the book by Kenneth E. Iverson “A Programming Language”, 1961, in which the author described the APL language. Iverson got the Turing prize for his work in this language. In his lecture (Kenneth E. Iverson: Notation as a Tool of Thought. Commun. ACM 23(8): 444-465(1980)) he described the basic principles underlying the APL. In my view, the main of them is the refusal to follow the details of Turing machines implementation. He left it to developers of compilers and interpreters. Instead of that, he wanted to make the programming language a reflection of mathematical ideas and objects by analogy with the mathematical notation. He wanted it to be compact enough, so that using it people could not only control the computing, but also think…

The APL language was implemented on big electronic computing machines (the ones that used to occupy floors and basements of some buildings). It was used in areas requiring the efficient processing of large volumes of data (mostly in banks and stock exchange), but was not popular in the mass. There are several reasons for that. First of all, mass computers, starting from those having Z80 processor, were absolutely scalar. Secondly, large volumes of data could not be stored in them. But APL also had non-standard symbols, that were absent on “normal” keypads. By the way, APL is sometimes called “a Chinese BASIC” due to its symbols. Though still living this language is doomed to exist within individual elite clubs.

But it’s not the end of the story. Both Moore’s law and the Internet interfered. Due to the first one, the power of “home” processors increased significantly. Even some vector constructions appeared in them. The second one provided those “giant” data volumes, for processing which APL is so useful. It may seem that APL should have experience a rebirth, but its symbols did not manage to enter regular coding. They complicated programs exchange in the Internet and, therefore, the language spreading. The entry barrier turned out to be too high.

Being aware of that, in 1980s Iverson and his colleagues (I’ll mention just Roger Hui and Arthur Whitney) began the project of developing a new version of APL, which was named J afterwards. This language dates its history from 1990, when it was mentioned by R. K. W. Hui, K. E. Iverson, E. E. McDonnell, and A. T. Whitney, in «APL/?,» (APL90 Conference Proceedings, APL Quote Quad 20, No. 4, ACM, New York) and described in the book by Kenneth E. Iverson: J Introduction and Dictionary (Iverson Software, Toronto, Canada, 1991). They took into account APL experience in many areas. In particular, they methodically introduced vector operations on multidimensional arrays (using rank of the operator and k-cells). The alphabet of this language became ASCII. We can compare the difference between APL and J to the difference between Fortran and C (or rather C++, considering the fact that J supports the object-oriented programming).

But history was not linear here either (refer to Kenneth E. Iverson: A Personal View of APL, IBM Systems Journal, Vol. 30 No. 4 1991). At the very beginning of J development, Arthur Whitney left the group and began developing his own K language. One of disagreements (not personal, of course) between Whitney and Iverson was the excessive complication of the J language (to Whitney’s mind) with the concepts of rank (in K, operators work elementwise), and also excessive capabilities (such as complex numbers, 3D graphics). But Whitney was the one to introduce ranks in 1982 at the APL conference in Heidelberg. He mentioned that the convolution of an array along one of the axes (+/[I] A) could be written with the help of the rank assignment operator (+/«I A). Nevertheless, there are no ranks in K. This language is more compact, simpler and turns out to be adjusted to the database field. On the basis of the K language, Whitney’s Company ( Kx Systems ) has developed a relational database named kdb. As for today, it’s a leading product in this field. It has also surpassed the widely advertised Oracle as for its rates in TPC tests. Rumour has it that the source codes of the kdb database (supporting SQL, ODBC, JDBC, remote HTTP access and plenty of other functions that are standard for the current field) is stored in the 26 text files named with single-letters. Each file is said to contain approximately one full standard screen of code that can be viewed without scrolling. They also say that there are no loops in kdb. All of that can be just rumors. But still, kdb distributive together with K interpreter and examples take (just!) 200 kB. This fact is available for the independent vefirication. Anyway, we’re going to talk about the J programming language that provides much more capabilities for a wide range of tasks. As for far-fetched analogies, I could also say that the difference between J and K is similar to C and Pascal differences (or rather C++ and object-based Pascal, as we can also program with objects in K).

Convex Hull

Perhaps, the best way to “feel” the language is to write some useful (and, as a rule, classical) program in it. Of course, if you want to make money, it’s better to write a non-classical program. But classical ones are more suitable for having fun with a new language, since they allow comparing results with other programs.

Say, we have a classical task to build a convex hull of a set of points. This procedure is really necessary in computational geometry. For instance, we can use it in various algorithms of collision detection. It comes down to choosing and ordering points that fit the traverse (say, counter-clockwise) of the convex hull of this set of points.

What is the convex hull? Let’s imagine that a plane is a piece of rubber. We’ve stuck pins randomly into it (pins will be our points). If we now take a thread and tighten it around all the pins, it will touch only some of them (the external ones). Its form will look just like a “convex hull”. The task of our algorithm is to find and enumerate one by one the pins we’ve touched with the thread.

Take a look at this picture:

The convex hull of this set will be (in the counter-clockwise order) points with the following coordinates: (-2,0), (1,-6), (5,-4), (10,-1), (8,5), (2,4). We should now write a program that will solve this task as quickly as, say, a task with a million of points.

What to Do?

This task also has its own history with algorithms, beginning with the simplest one. Its running time grows as N2 with an increase in the number of points. There’re also complex, randomized algorithms. Their running time depends on the result (the number of points in the final hull). There’s also a Graham scan. It is intermediate and scales reliably as N log N. We are going to implement it in this article.

This algorithm is to sort (time complexity is O(n log n) the points by polar angles, relative to the one at the leftmost of them (it belongs to the hull by definition). The next step is to pass sequentially the resulting starlike outline (in ~N time), starting with the left point, and remove all points, the angle of which (relative to the interior of the polygon) is more than 180˚.The remaining points will form concave angles. Thus, the hull will be convex.

Now let’s run the J interpreter and get down to solving our task. Installed? Launched?

When running the interpreter, you will see a blinking cursor that is exactly three spaces away from the left. It’s the input field. If we type something and press “Return”, the expression will be computed and the next string will show the result. Let’s type something! For example:

2+2
4

Wow! It works!

In J programming language, negative numbers are marked with the underscore (_) instead of the minus (-), like in other languages. This allows distinguishing the minus that is a part of constant definition from the minus that is an operation. For instance:

2-4
_2

It’s minus 2. Primitive operations in J (or verbs according to the language terminology, which emphasizes that they mean action) are coded by symbol combinations from ASCII, single and pair ones. At that, the operation denoted with the same symbol combination depends on the number of arguments that have been passed to the verb. There can be two arguments — on the left and on the right from the verb. That is the dyadic case of the verb. If there’s just one argument on the right – it’s the monadic case. For instance, we have just used „-“ verb (minus) in the dyadic case, when it means “subtraction”. Meanwhile, in the monadic case „-“ (minus) means “negation”.

-_2
2

It’s simple. Since J is a vector programming language, let’s complicate our example with vectors. Here’s a vector consisting of three numbers:

1 2 3
1 2 3

Separate vector elements by spaces. Monad operations affect one-dimensional vectors elementwise (multidimensional vectors are much more complex, but we don’t need them for now).

-_2 1 2 3
2 _1 _2 _3

Let’s look it up it at the J Vocabulary… Here’s the „%:“ verb. In the monadic case, it corresponds to the operation of finding the square root.

%: 4 2
2 1.41421

Now, like this:

%: _1
0j1

Oh, it turns out that J supports complex numbers!!! The square root of minus one (-1) is the imaginary unit. That must be the way these numbers are indicated here!

Let’s try to do it the following way:

0j1 * 0j_1
1

Thus, the imaginary unit multiplied by the conjugate (the imaginary part of the opposite sign) imaginary number, returns just a unit, which is the module of the imaginary unit. If we look it up at the vocabulary, we’ll see that „+“ operation in the monad variant corresponds to the complex conjugation. That is, it does not change real numbers

+1
1

Just like in other languages operating on real numbers only, in which „+“ is defined with „-“ for “symmetry”. As for complex arguments, „+“ changes the sign of the imaginary part.

+ 0j1
0j_1

But complex numbers are a very natural form to represent point coordinates on a plane. We know how to enter complex numbers, create lists. Now, let’s compile a list of our points and name it the following way:

points=: 1j_6 5j_4 7j_2 4j_2 10j_1 _2j0 9j0 5j1 7j2 2j4 8j5
You should note that J derives nothing here, but that’s not the point… Unlike C, we do not assign a value to a variable (the place in the memory) in J, but a name to a value. Therefore, a point is not a variable. It’s a name for the array of our points. We can call this array by name.

points
1j_6 5j_4 7j_2 4j_2 10j_1 _2 9 5j1 7j2 2j4 8j5

We can also assign this name to some other value. But till it’s not assigned, we can not change the value of the name. Thus, no one can change even one point in the point list, even its imaginary or real part. In the terminology of J, an operation is a verb, while an operand is a noun. Our “points” name is a pronoun that does not name, but points at it referring to the mentioned above noun.

J programming language is strong in its primitive operations. You can find the full list of them in the J dictionary and vocabulary. Plenty of operations can be explained by the fact that the language is vector. There are much more variants of actions with arrays, than with scalars. There are even two ways of multiplying a matrix by a vector in two ways – either by lines, or by columns. In order to write and understand programs without consulting J vocabulary, you should learn the latter by heart. If you want to write simple programs, it’s enough to run your eyes over the vocabulary several times and get a general idea of the available operations. At first you will have to read it using the dictionary, but you will soon learn all the necessary information.

Refer to the J vocabulary… It turns out that the array sorting operation is inbuilt and denoted as “/:” (ascending sort) and “\:” (descending sort). When sorting complex numbers, we should compare their real parts. Thus, we can sort our points:

/: points
5 0 9 3 1 7 2 8 10 6 4

The result of the monadic application of „/:“ verb (sort) is permutation of indices into the sorted array. We can get a sorted array by applying the dyadic version of the sorting verb.

points /: points
_2 1j_6 2j4 4j_2 5j_4 5j1 7j_2 7j2 8j5 9 10j_1

Now, the leftmost point is the first one in the array, while the rightmost is the last one. The string is a bit too long. We can avoid writing points two times with the help of adverbs. They change the verb action that is on the left from them. Thus, they form a new, derived verb. Particularly, there’s „~“ adverb, which converts a dyad verb to a monad one according to the following rule: „u~ x“ -> „x u x“, as if it “surrounds” the u verb with nouns. For example, using this adverb we can write 2*2 as *~2. We will make our sort a command:

/:~ points
_2 1j_6 2j4 4j_2 5j_4 5j1 7j_2 7j2 8j5 9 10j_1

Getting back to Graham’s scan, we will define the leftmost point, with regard to which we will sort all other points by angles, using of the following expression:

{./:~ points
_2

In which „{.“ in monadic application just takes the first element from the list on the right (the sorted list). But is it efficient to sort the entire array in order to choose only one element, the minimal one? Sorting is not efficient, but the interpreter can identify “{./:~” verb combination and replace it with a calculation of the minimum, without sorting. I am not sure that the current J version will identify this combination, but it will definitely identify plenty of others that are enumerated in the Supplement to the J Dictionary.

Thus, we have found the leftmost point. Now, we should sort the remaining points at the angles around this one. What should we do? Exactly! Place the origin of coordinates into the found point, i.e. subtract it from the rest

points - {./:~ points
3j_6 7j_4 9j_2 6j_2 12j_1 0 11 7j1 9j2 4j4 10j5

And again, we’ve mentioned points twice. It’s not bad, but it’s better to have just one monadic verb (our own), since the carried out calculation has just one argument of the point. Therefore, the verb should also be monadic. In order to write such verb, we should use a hook. It appears when there are two successive verbs in an expression: “(u v) y”. In which u and v are verbs, and y is a noun. You should note that there is no hook without parentheses. Instead, there’s just a simple consistent application of u and v verbs to y verb. Thus, “u v y” -> “u(v y)”, or, using @ conjunction: „u v y“ -> „(u @ v) y“ -> „u(v y)“, like we have done before. But we can write the expression in a totally different way. In the monadic “(u v) y” and the dyadic “x (u v) y”, it’s easier to represent the hook in the form of the below charts:

monad dyad
u u
/ \ / \
y v x v
| |
y y

We can see that if „u“ -> „-“, „v“->{.@/:~ (@unites two verbs into one that corresponds to the consistent application of the original ones). „y“->»points” is a monadic hook that is equivalent to our last expression.

(- {.@/:~) points
3j_6 7j_4 9j_2 6j_2 12j_1 0 11 7j1 9j2 4j4 10j5

Thus, the result is the same. The one in brackets is a proverb of our own. Getting an array of points in the form of an argument, it relocates the centre of the coordinate system to the leftmost of them. We can name this verb by creating a compound verb (a proverb):

centerleft =: -{.@/:~
and use it

centerleft points
3j_6 7j_4 9j_2 6j_2 12j_1 0 11 7j1 9j2 4j4 10j5

In the same form, we will have to arrange the final verb that will compute the convex hull. It may seem that it’s taking too long. Exactly, but it’s only because the convex hull building is not our main goal. What we want is to focus our attention on J language peculiarities. And we have covered quite a lot of stuff. We can already enter and name new data, we can sort it. We got familiar with the hooks and wrote a program centering the points on a plane around the leftmost one. And this program consists of 7 (!) symbols.

But let’s move on. Now, we should sort these points by angles. To begin with, we should calculate the angles. In order to calculate the argument of a complex number and other CIRCULAR functions in J we should use a dyadic operation that is defined as «o.» The left argument of o. dyad is an integer defining the type of the function being calculated. «12 o. y» dyad corresponds to calculating the argument of «y» complex number.

But before we get down to sorting, let’s think a bit. The problem is that we should exclude from sorting the left point we’ve placed at the origin of coordinates. Then we should place it at the beginning of the sorted list, since it belongs to the convex hull.

In order to make it shorter, let’s use a fork. A fork works almost like a hook. But it’s a bit more complex, since it contains not two, but three consecutive verbs. The dyadic «x (u w v) y» and monadic “(u w v) y” cases of fork application can be illustrated with the following chart:

We used a verb train and the constant function (12”_). How does this train work? No, there is nothing more complex than a fork in J! If a verb train is longer than three verbs, we divide it into threes (with the last pair in case there are not enough of verbs). Each group is calculated according to the rules of the fork or the hook. Thus, we can use parentheses in the last expression ‘(12”_ o. (}. — {.)) spoints’ and it will be equal to two forks ‘((12”_ spoints) o. ((}. — {.) spoints))‘. In which ‘12”_’ is not a noun, but a proverb. It’s a function returning «12» constant, regardless of its argument. By the way, we can create a constant function from any constant by adding ‘”_’. Do you understand now, how verb trains work?

The dyadic sorting organizes the left argument (obtained here by means of a fork, discarding the fiest “}. spoints”) point) in the order assigned by the right argument (by the already calculated angles). As a result, we’ll get points that are sorted by angles. We’ll get all of the points, except for the one we’ve discarded. Now, we should add this first one to the beginning with the help of another fork. We will indicate the binding of two arrays using the dyadic “,” operation. Therefore,

will bind the first of points to the beginning of the sorted array. This point did not take part in the sort. Thus, when uniting it with the sort that is temporarily located in spoints, we will write our verb as follows:

s =: ({. , }. /: 12"_ o. }. - {.) @ /:~
When applying it to the unsorted array of points, we will get the same points. But they will be sorted in the form of a “star”, in regard to the leftmost one, which will be the first in the list.

s points
_2 1j_6 5j_4 4j_2 7j_2 10j_1 9 5j1 7j2 8j5 2j4

It’s time to look at the first picture with points, so that we could estimate how close we are to building a convex hull. We just have to remove the convex angles of the star. Then, there will remain concave angles. Thus, the resulting polygon will be convex by definition. That’s exactly what we need. We are going to solve this task iteratively by throwing away at each iteration all angles that we can reach and define as concave.

Let’s take a look at one more angle. It consists of three points. The traverse goes from the first to the third one. We will consider the angle at the second point. The picture:

There are two versions of the third point location. Variant 3, when angle 123 is convex (that’s what we need) and variant3’, when angle 123’ is concave (we do not need that as it says that 2 point does NOT belong to the convex hull and we should throw it away). But how do we distinguish 3 and 3’ situations? It will help if we know that the cross product of two vectors in the plane directed completely perpendicular to this plane and proportional to the SINE of the angle between the vectors. The angle between 12 and 13 vectors is positive, while the angle between 12 and 13’ vectors is negative. Thus, the sine has the same sign. Therefore, we should calculate the cross product [12 x 13]. How can we do that? There’s a famous formula for the cross product of vectors A=(Ax,Ay,Az) and B=(Bx,By,Bz) in the form of the determinant:

In which i, j, k are unit vectors in the direction of x,y,z coordinate axes. Thus, xi and yi are coordinates of the corresponding points (i=1,2,3). Therefore, we need to calculate the value (x2-x1)(y3-y1)-(x3-x1)(y2-y1), the sign of which, if positive, me will tell us that we have to deal with the situation 123, while the negative sign will mean that the situation is similar to 123’. This expression is NULL when 1, 2 and 3 points are all located at one straight line. In this case, we will consider that 2 point belongs to the convex hull. How do we calculate it?

Suppose, there are three points defined in the following form: p1 = x1 + I y1, p2 = x2 + I y2, p3 = x3+ I y3. Let’s take a look at the product:

(p2-p1)*Conjugate[(p3-p1)],
its imaginary part which is equal to “-((x2-x1)(y3-y1)-(x3-x1)(y2-y1))”.

When we’ve got three points as an argument, we should write the verb that will calculate this test, the following way:

l =: 11"_ o. [: (* +)/ }. - {.
Got that? At first we subtract the first point from the other ones, and then discard it. Then, using “/” adverb, we insert the “*+” operation (multiply “*” by the complex conjugated “+”) between the two remaining array elements. “[:” verb is called a “header”. It discards the appropriate branch of the fork, which makes an operation at its top monadic. Finally, extract the imaginary part from the result with the help of the circular o. function, with the left 11 argument.

Now we should apply this test to sequential threes of the sorted points. In order to build such moving “window”, there’s a ready “\” adverb in J language. It derives a dyadic verb, the left argument of which is the number of threes that we place into the moving window of the operator. Its right argument is the array, which we apply to the current operator:

3 l\ (s points)
_30 _10 6 _3 _4 _3 6 _5 _17

You should note that there are two points less, than in the initial array. It happened due to the fact that we cannot build threes around the first and the last points. A test to check whether the appropriate angle of the polygon is convex can be written as follows:

0 > 3 l\ s points
1 1 0 1 1 1 0 1 1

The result is a binary array, with ones across the vertices with convex angles. But let’s remember that we have sorted everything around the leftmost point. Therefore, the first and the last vertices definitely belong to the hull. That’s why we can just add two ones to the acquired Boolean list. We’ll place one of them at the beginning, and the other one at the end.

1, 0 > 3 l\ s points , 1
1 1 1 0 1 1 0 1 0 1 1

Now we should find “#” dyad in J Dictionary. It will remove from the right list the elements that have 0 opposite them in the left list, the Boolean one.

But! There can be remaining some obtuse angles. The reiteration will remove them:

rr rr s points
_2 1j_6 5j_4 10j_1 8j5 2j4

When there are no obtuse angles, the result is the convex hull we’ve been looking for. In our case it happens during the third iteration.

rr rr rr s points
_2 1j_6 5j_4 10j_1 8j5 2j4

To reuse the verb several times more, we should use the “^:n” adverb, in which n is the number of times.

rr^:3 s points
_2 1j_6 5j_4 10j_1 8j5 2j4

In fact, this corresponds to raising a verb to the power. Raising a verb to an infinite power means applying it till the result stops changing. In our case, till there are no concave angles remained. That’s exactly what we need. Thus, we’re close to defining the final hull verb that will solve the task of building the convex hull:

hull =: [: rr^:_ s
This definition means the following: “apply the iteration for removing convex angles to the starlike sorted list of points” When applying it to our list, we will get the following:

hull points
_2 1j_6 5j_4 10j_1 8j5 2j4

Now, let’s take a look at the initial picture! Is that it?

Thus, the full text of our program for finding the convex hull can be written as:

Since it uses a bubble sort, its operation time is scaled as N2. Changing the sort to, say, quicksort would make this program significantly longer.

Draw your own conclusions… I will just say that, to my mind, writing a program in J is more like deriving a formula, than rubbing every smallest detail of calculations in computer’s face.

NOTE: this review is illustrative and you can optimize this program. Its operation time has the N log N asymptotics, as befits the Graham’s algorithm. But you can improve the constant multiplier of this asymptotics. For instance, you can avoid the rightmost sort in «s» replacing it with a permutation of the leftmost point (with the minimal real part) to the first position in the array. Such move can be done in ~N time (in contrast to the sort requiring N log N). You can do all of that as an exercise.

[*] if we take this point not belonging to the hull, it may lead to some saving of resources in the future, as such point at the outline is clearly excessive. Those wanting to use Graham’s scan for further calculations, can like the idea of discarding such points. For educational purposes, it’s better to leave the points located at one straight line, within the outline. But they belong to the hull even instinctively (according to the method of drawing a thread).

Comments

Most of us work with strings one way or another. There’s no way to avoid them — when writing code, you’re doomed to concatinate strings every day, split them into parts and access certain characters by index. We are used to the fact that strings are fixed-length arrays of characters, which leads to certain limitations when working with them. For instance, we cannot quickly concatenate two strings. To do this, we will at first need to allocate the required amount of memory, and then copy there the data from the concatenated strings.