An Introduction to Object Oriented Programming For APL Programmers

by Morten Kromberg, Dyalog Ltd, December 2005

Introduction

This introduction has been written as a
companion to version 11.0 of Dyalog APL. Our goal is not only to explain the
details of the new functionality in version 11.0, but also to be mildly
provocative and entertaining, to convey a flavour of the thinking
behind the object oriented extensions to APL, and to say something about how the
development team imagines that you might make use of the new features, in the
belief that this will make them easier to understand and use.

It is recommended that you have version 11.0
of Dyalog APL available for experiments as you work through the guide, and use
it to verify your understanding of the new features as they are introduced.

If you have an electronic copy of this
guide, and the Paste Text as Unicode option enabled (see
Options|Configure|Trace/Edit), you should be able to copy and paste code from
the guide into the Dyalog APL editor and session. Alternatively, the folder OO4APL,
included with the version 11.0 installation, contains a workspace with the same
name as each of the classes and namespaces used in the guide.

Version 11.0 of Dyalog APL introduces Classes to the APL language. A class is
a blueprint from which one or more Instances
of the class can be created (instances are sometimes also referred to as Objects). For example, the following
defines a class which implements a simple timeseries analysis module:
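A minimal sketch of such a class, reconstructed from the description that follows. The member names, the default value and the line counts come from the text; the expressions inside PolyFit are our assumption (a least-squares fit against observation times 1, 2, 3, ... with ⎕IO←1).

```apl
:Class TimeSeries
    :Field Public Obs              ⍝ observed values
    :Field Public Degree←3         ⍝ degree of the fitted polynomial

    ∇ r←PolyFit x;coeffs
      :Access Public
      ⍝ Fit a polynomial to Obs and evaluate it at the points x
      coeffs←Obs⌹(⍳⍴Obs)∘.*0,⍳Degree   ⍝ least-squares coefficients
      r←coeffs+.×⍉x∘.*0,⍳Degree         ⍝ fitted values at x
    ∇

    ∇ make args
      :Access Public
      :Implements Constructor
      Obs←args                          ⍝ store the observations
    ∇
:EndClass
```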

The above description declares that instances of TimeSeries will have four Members: two Public Fields called Obs and Degree (the latter having a default value of 3), a Public Method called PolyFit (header plus 4 lines of code) and a Constructor, which is implemented by the function make (header plus 3 lines of code). Note that methods (or functions) begin and end with a ∇. The term Public means that the members are for public consumption by all users of the class.

The system function ⎕NEW is
used to create new instances using the class definition. The first element of
the right argument to ⎕NEW must be a
class; the second element contains the instance
parameters, which are passed to the constructor:

ts1←⎕NEW TimeSeries (1 2 2.5 3 6)

During instantiation,
the constructor function make is called, and it initialises the instance by storing its argument
in the public field Obs. We can now use the instance ts1 in much the
same way as if it were a namespace:
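For instance, assuming a TimeSeries class along the lines described above, expressions such as the following could be tried (a sketch; results are only shown where they follow directly from the text):

```apl
ts1←⎕NEW TimeSeries (1 2 2.5 3 6)
ts1.Obs              ⍝ 1 2 2.5 3 6
ts1.⎕NL ¯2 3         ⍝ names of the public fields and methods
ts1.PolyFit ⍳5       ⍝ fitted values, using the default Degree of 3
```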

For negative arguments, ⎕NL returns
a vector of names rather than a matrix. The Degree field allows us to decide the
degree of the polynomial function used when fitting the curve. The following
example uses a straight line:

ts1.Degree←1
ts1.PolyFit ⍳5
0.7 1.8 2.9 4 5.1

Arrays of
instances are handled in much the same way as arrays of namespaces:
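A short sketch, again assuming the TimeSeries class described above. The second set of observations and the name tss are made up for illustration.

```apl
⍝ Create one instance per vector of observations
tss←{⎕NEW TimeSeries ⍵}¨(1 2 2.5 3 6)(1 3 2 4 5)
tss.Degree                ⍝ 3 3
tss.Degree←1              ⍝ set the field in every instance
{⍵.PolyFit ⍳5}¨tss        ⍝ one vector of fitted values per instance
```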

APL developers have often used naming
standards, and in recent versions of Dyalog APL namespaces, to collect related
functionality into modules. Users of Dyalog APL will recognise that an instance
is very similar to a namespace. One
of the big advantages of classes is that they make it possible to clone a
namespace and create multiple data contexts, without copying the code. This
saves space, but more importantly it means that you are less likely to lose
track of where the source code is.

Imagine that we have prototyped our way
through to a classical solution to fitting multiple polynomials. Looking
back, we are able to copy the following expressions from our session:
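As a sketch of what such expressions might look like: the names polyfit and polycalc are taken from the text, while the definitions are our assumption (least-squares fitting with ⎕IO←1).

```apl
polyfit←{⍵⌹(⍳⍴⍵)∘.*0,⍳⍺}        ⍝ ⍺: degree, ⍵: observations, result: coefficients
polycalc←{⍺+.×⍉⍵∘.*0,⍳¯1+⍴⍺}    ⍝ ⍺: coefficients, ⍵: x values, result: fitted values

(1 polyfit 1 2 2.5 3 6) polycalc ⍳5   ⍝ 0.7 1.8 2.9 4 5.1
```

The last line reproduces the straight-line fit obtained from ts1.PolyFit earlier in the guide.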

In a traditional APL system, we could now
create a workspace called POLY with these functions inside, write some documentation explaining
how to call them, and then store that documentation in a variable in the
workspace, or in a separate document. Anyone wanting to use the functions would
have to find the relevant documentation and make sure that he did not already
have any functions with these names in his application (or variables with the
same name as the documentation). A namespace could be used to isolate the names
from our application code.

Classes make it possible for the developer
to encapsulate functionality in a way which keeps related code and data
together, avoids name conflicts and
provides some degree of documentation which suggests and can limit how the
solution is used. This makes the module easier to learn to use, while the
control over how the module can be used makes it easier to maintain. At the
same time, splitting an application into objects with well defined behaviour
and interfaces is a valuable tool of thought when dealing with complex design
issues.

On the other hand, it is also clear that the
simple functions polyfit and polycalc are more generally useful
than the PolyFit method of the class TimeSeries, which exposes a specific
use of polynomial fitting. The encapsulation of data within instances can make
it harder, slower, and sometimes virtually impossible to go across the grain
and use the properties and methods in a way which is different from that which was
intended by the class designer. OO fans may argue that object orientation will
help you think more carefully about how things will be used and this is to your
advantage. However, APL is often used in problem areas where requirements
change very unexpectedly. Providing a
flexible solution with OO design is as much of an art, and requires the same
insight into where the solution might be heading, as any other technique.

A key design goal for version 11.0 has been
to make it as easy as possible to blend the array and functional paradigms
which already make APL so productive, with the object oriented view of data, a
Tool of Thought in its own right.

This is one of the reasons why, if you have
a namespace POLY which contains the two dynamic fns we developed above, you can add
a line which says:

:Include #.POLY...

at the beginning of :Class TimeSeries, and subsequently write PolyFit as:
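A sketch of such a method, assuming that the included namespace provides a dyadic polyfit (degree on the left, observations on the right) and a dyadic polycalc (coefficients on the left, x values on the right):

```apl
∇ r←PolyFit x
  :Access Public
  ⍝ Delegate the work to the functions included from the namespace
  r←(Degree polyfit Obs) polycalc x
∇
```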

This Introduction
to Object Oriented Programming for APL Programmers will attempt to
illuminate the issues and put the reader in a better position to decide when
and how to combine array, functional and object thinking. In order to achieve
this, we will:

First, briefly explore the thinking which
led to the emergence of OO, to get an idea about the type of problems which OO
is likely to help us solve.

Introduce the fundamentals of OO programming
using a number of examples written in Dyalog APL version 11.0.

Illustrate how the new OO functionality in
Dyalog APL makes it easier than ever before to implement components which can
be consumed by other development tools.

Where possible, try to remember to discuss
alternative solutions, and present some guidelines on how to choose between the
various techniques which are available. Given that the temperament and
environment of the developer, the department and the company will weigh heavily
on any choice of technique, it is clear from the outset that there will be no
universal answers.

Although OO feels like a recent invention to
many of us, the first OO language saw the light of day around the same time as
the first APL interpreter. SIMULA (SIMUlation LAnguage) was designed and
implemented by Ole-Johan Dahl and Kristen Nygaard at the Norwegian Computing
Centre between 1962 and 1967, based on ideas which Nygaard had developed during
the 1950s[1]. At the same time that Ken Iverson was working on new ways to
conceptualise algebra and computation involving large groups of numbers,
Nygaard was searching for ways to think about a different type of system using
symbolic notation[2].

The focus of the NCC work was on the
simulation of complex systems. Nygaard explained the rationale behind SIMULA as
follows:

SIMULA
represents an effort to meet this need with regard to discrete-event networks,
that is, where the flow may be thought of as being composed of discrete units
demanding service at discrete service elements, and entering and leaving the
elements at definite moments [sic] of time. Examples of such systems are ticket
counter systems, production lines, production in development programs, neuron
systems, and concurrent processing of programs on computers.

The desire to describe and model so-called
discrete-event networks led to object-oriented notation, in which the
description of the ways in which the service elements interacted with each
other is separated from the details of how each service element (or object) manages its internal state. As
with APL, the language subsequently evolved into a notation which could be
executed by computers.

SIMULA turned out to be a powerful notation
for simulating complex systems, and other OO languages followed. Initially, OO
languages were used for planning and simulation applications (much the same
areas as APL has been most successfully applied to), but with the arrival of
graphical user interfaces, which are a form of discrete-event network, and
concurrent or networked computing systems, OO languages proved that they had
more to offer. As systems, and the teams used to implement them, have grown in size
and complexity, OO has grown from a humble start as a specialist modelling
technique to become the most popular paradigm for describing computer systems.

With Dyalog APL version 11.0, Arrays,
Functions and Objects are now happily married. It is possible to have arrays of
instances, and instances can contain arrays (of more instances, if necessary).
The challenge is to pick the best architecture for a given problem!

As an illustration, let us take a look at
one of the classical examples which the inventors of SIMULA used the new
language to model: The queue.
Customers arrive at random intervals and enter the queue. An algorithm
simulates the time required to process each customer. The goal is to run a
number of simulations with different parameters and see how long the queue gets
and how long customers have to wait. With luck, we will discover how the queue
or system of queues can be optimized, and measure the effect of improving the
system without having to perform expensive experiments. In the post-modern age,
these systems are probably being used to see how much longer the queues will get if the Post Office spends less money. Assuming that the planning
department has not already been made redundant.

The following
is a simple Queue class, written in Dyalog APL version 11.0[3].
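A minimal sketch of a class matching the description that follows. The member names come from the text; the layout of each queue entry, the simulated service time and the use of ⎕AI for timing are our assumptions.

```apl
:Class Queue
    :Field Public History←⍬       ⍝ one entry per customer served
    :Field Private Customers←⍬    ⍝ customers currently in the queue

    ∇ r←Length
      :Access Public
      r←⊃⍴Customers
    ∇

    ∇ Join name
      :Access Public
      ⍝ Store name, arrival time (ms) and queue length on entry
      Customers,←⊂name(3⊃⎕AI)(1+⊃⍴Customers)
      :If 1=⊃⍴Customers ⋄ Serve&0 ⋄ :EndIf   ⍝ start a server thread if idle
    ∇

    ∇ Serve dummy;name;arrived;length
      ⍝ Private: runs in a background thread until the queue is empty
      :Repeat
          ⎕DL 1+0.1×?20                      ⍝ assumed service time, 1.1 to 3 seconds
          (name arrived length)←⊃Customers
          History,←⊂name((3⊃⎕AI)-arrived)length
          Customers←1↓Customers
      :Until 0=⊃⍴Customers
    ∇
:EndClass
```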

The class has two public methods called Join and Length, which are
used to add customers to the end of the queue and report the current length of
the queue, respectively. There is a public
field History which contains a record of customers who passed through the system.
Note that members of a class are private unless they are declared to be
public. Private members cannot be referenced from outside the class.

While the TimeSeries class
in the previous chapter only had public members, Queue has a private field called Customers, and a private method
called Serve, which is launched in a background thread when there are customers
in the queue (thread-savvy readers are requested to ignore the potential race
condition if Serve drops the last element of Customers at the same time as Join
adds one).

If
we have a workspace containing the above class, we can experiment with it from
the session:
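A sketch of such a session, assuming a Queue class with the public methods Join and Length described above (the customer names are made up):

```apl
aQueue←⎕NEW Queue
aQueue.Join 'Andy'
aQueue.Join 'Bill'
aQueue.Length         ⍝ number of customers still waiting
```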

The variable History contains a
log of the customers who passed through the queue, the time they spent in line
and the length of the queue when they entered it. Since History
was declared as a public field, we can refer to it from outside:
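For example, assuming a Queue class in which each History entry holds the customer, the time spent in line and the queue length on entry, and in which serving a customer takes no more than a few seconds (both assumptions):

```apl
aQueue←⎕NEW Queue
aQueue.Join 'Andy'
⎕DL 5                 ⍝ give the server thread time to finish
↑aQueue.History       ⍝ one row per customer served
```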

In our Queue class, we
have decided that the Customers field, which contains the list of customers in the queue, is private. We do not want users of our
class to reference it. We have provided a public method Length
which makes it possible to determine how long it is.

Why have we not simply made the variable
public and allowed the user to inspect aQueue.Customers
using ⍴ or other primitive functions,
rather than doing extra work to implement a method ourselves? We would
typically do this if we want to reserve the right to change this part of the
implementation in the future, or if we do not wish to take responsibility for
the potential bugs resulting from the use of these members (Warranty void if
seal broken).

If we had exposed Customers
directly, users would have the right to expect that we would continue to have a
one-dimensional vector called Customers with similar characteristics in the future. It would be virtually
impossible for us (as class designers) to estimate the impact of a change to
this variable, without reading all the application code to see exactly how it was being used. And even
then, we would probably get it wrong. The user might also feel tempted to
modify the variable, which might cause bugs to be reported to us, even though
there were no errors in our code.

If our class evolved into a more general
tool where the contents of the queue were not necessarily customers, we could
not rename it, so we would end up with a system where variable names were
misleading[4]. Nor could we turn it into a 2-dimensional matrix, store it on file,
or make any other architectural changes which might be convenient as requirements
evolve and we need to store more or different types of information about the
queue.

We have thought ahead a little bit and
decided that it will always be reasonable for users of the Queue
class to ask what the current length of the queue is, and therefore we have
exposed a public method for this purpose.

Information
hiding is
one of the cornerstones of OO. The ability to decide which members of a class
are visible to the users on the one hand and the developer of the class on the
other is seen as the key to reduced workload, improved reliability and
maintainability of the code on both
sides of the divide. Dyalog APL version 11.0 takes a strict view of
encapsulation. It is not possible to
reference private members (a VALUE
ERROR will be signalled if you try).

aQueue.Customers
VALUE ERROR
aQueue.Customers
^

Clearly, if the designer of a class you are
using has decided to hide information which you really need before 9am
tomorrow, this can be frustrating. Before you get too concerned, the good news
is that there are a variety of techniques for getting past the gatekeepers in
an emergency, or in a prototyping session. We will discuss a couple of these
in the following chapters. However, to enjoy the full benefits of OO, it is
important to have the discipline to use such tricks only when required, and
re-factor the class in question at the first available opportunity.

The variable History is
obviously susceptible to the same problems as Customers, and
seasoned OO designers would probably consider it to be bad form to expose it.
This is perfectly true: exposing all the result data in this form in order to
make it easy to analyze is a result of traditional APL thinking. In subsequent
chapters, we will investigate a number of alternatives which we could have used
to expose the data.

3. Working with Classes

In order to help you successfully experiment
with APL as we explore more OO functionality, let us take a closer look at the
practical details of functionality which has been introduced in the first two
chapters, and get comfortable with actually working with classes and instances.

The easiest way to create a class is
probably to use the editor. Start an )ED session, prefixing the name of
your new class by a circle (ctrl+O on most keyboards). We're going to use a
class for generating random numbers to illustrate some important issues:

)ed ○Random

This will present you with an empty class,
which only contains the :Class and :EndClass statements. Insert a few lines to create the following simple
class:
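A sketch of what such a class might contain. The two expressions at the top, the method names Reset and Flat and the message written by the constructor are taken from the surrounding text; the private field Seed and the bodies of the methods are our assumptions. The sketch also assumes that a name assigned by an expression in the class script can be read from an instance method.

```apl
:Class Random
    InitialSeed←42
    ⎕←'Default Initial Seed is: ',⍕InitialSeed
    :Field Private Seed←0

    ∇ make seed
      :Access Public
      :Implements Constructor
      ⍝ A seed of 0 selects the default initial seed
      Seed←seed
      :If seed=0 ⋄ Seed←InitialSeed ⋄ :EndIf
      Reset
    ∇

    ∇ Reset
      :Access Public
      ⍝ Restart the sequence from the seed chosen at construction
      ⎕RL←Seed
      ⎕←'Random Generator Initialised, Seed is ',⍕Seed
    ∇

    ∇ r←Flat n
      :Access Public
      ⍝ n random numbers between 0 and 1, with 2 decimal digits
      r←(¯1+?n⍴100)÷100
    ∇
:EndClass
```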

This class can be used to generate sequences
of random numbers with a flat distribution between 0 and 1 (with only 2
digits, to allow us to easily recognize them in the examples). One advantage of
encapsulating it in a class is that it can manage its own seed (⎕RL)
completely separately from the rest of the system. We can generate repeatable
or suitably randomized sequences according to the requirements of our
application.

As you exit from the editor, APL evaluates
the class definition from top to bottom. Most of the script consists of
function definitions, but in our class there are two APL expressions, which are
executed as the script is fixed in the workspace:

InitialSeed←42
⎕←'Default Initial Seed is: ',⍕InitialSeed

As a result, you should see one line written
to the session as you leave the editor. If there are errors in any of the
executable lines in your script, you will see one or more error messages in the
status window, and the class will not be fixed in the workspace. If you are
unable to correct (or comment out) all the errors, you can save your work and
return to it later by changing the type of the name to a character vector.

Lets perform some experiments with our new
class:

rand1←⎕NEW Random 0
Random Generator Initialised, Seed is 42

The constructor also has an output
statement, which shows us which initial seed was selected for the instance.
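One way to check that instances keep their own seed is sketched here, assuming a Random class with a public method Flat which returns the requested number of values:

```apl
rand1←⎕NEW Random 0
rand2←⎕NEW Random 0
a←rand1.Flat 5
z←?6 6 6              ⍝ roll in the root; this uses the root's ⎕RL
b←rand2.Flat 5
a≡b                   ⍝ 1 if both instances produced the same sequence
```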

As can be seen above, each instance produces
the same sequence of numbers, if the same initial seed is used. The sequence is
unaffected by the use of ? in the root, or indeed
anywhere else in the application.

We can create a generator with a non-default
initial seed, and we can reset the sequence. If you are accustomed to using
namespaces, the above behaviour will not come as a surprise, as you will be
accustomed to each namespace having a separate set of system variables.
However, the encapsulation provided by an instance is even stronger, as
illustrated by the following example:
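A sketch of such an experiment, assuming the Random class and its Flat method as described earlier:

```apl
rand4←⎕NEW Random 0
rand4.Flat 3
rand4.(?6)        ⍝ an APL expression executed "on" the instance from outside
rand4.Flat 3      ⍝ the instance sequence continues, regardless of the ?6
```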

If rand4 had been a namespace rather
than an instance, the call to ? inside rand4 would have modified ⎕RL in the namespace, and the subsequent call to Flat
would have continued from a different point in the sequence. However, APL
expressions executed on an instance from outside are executed in an APL
execution space, which is separate
from the space in which the class members run.

In effect, when an instance of a class is
created, APL encapsulates it within a namespace. This has always been the case
for instances of COM or DotNet classes, and as a result, Dyalog APL also allows
the use of APL expressions in parentheses following the dot after the name of
one of these instances. Such APL expressions have access to all the public
members of the instance, but are (obviously, since Excel cannot run APL
expressions) executed outside the instance itself, as in the following example:
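A sketch of such an expression, assuming a Windows machine with Microsoft Excel installed (the name XL and the separator text are arbitrary):

```apl
'XL' ⎕WC 'OLEClient' 'Excel.Application'
XL.(Version,' under ',OperatingSystem)
```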

In this example (which would work the same
way in Dyalog APL versions 9 or 10), there is an APL expression which
references the public properties Version and OperatingSystem
and catenates them together. For consistency, the same approach is used for
instances of APL-based classes, rather than simply running the expressions as
if the instance was a namespace. Thus, the behaviour of an APL class will not change
if it is exported to a DotNet Assembly or a COM DLL and subsequently used from
APL.

When an instance method such as Flat is
referenced in one of these expressions, it runs in the instance environment.
For example, the example on the previous page could have been written:
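A sketch of such an expression, assuming that rand4 is an instance of the Random class described above:

```apl
rand4.(⌽(Flat 3)(?6)(Flat 3))
```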

The reverse is required because the
rightmost call to Flat happens first :-). The
important point is that the call to (?6) in the middle of the expression
executes in and uses the ⎕RL in the
APL space and does not modify the
value of ⎕RL in
the instance space.

Note: There
are a couple of small potential surprises which are worth mentioning:

First, the APL space inherits the values of ⎕IO, ⎕ML and
other system variables from the point where the object is instantiated, not from where it is being used.
Secondly, if you mistype the name of a property or field in an
assignment, this will create a variable in the APL space. For example:
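A sketch of the second surprise, using the TimeSeries class from the introduction (the misspelt name Degre is deliberate):

```apl
ts1←⎕NEW TimeSeries (1 2 2.5 3 6)
ts1.Degre←1          ⍝ typo: creates a new variable Degre in the APL space
ts1.Degree           ⍝ the field itself still has its default value, 3
```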

This is the same behaviour as you would get
if you made a spelling error in APL, but might come as a bit of a surprise in
an OO setting. However, we believe that it is desirable to allow users to
introduce their own names for analytical purposes. For example, if iPlan
is an instance of some object which exposes properties named Actual
and Budget, it may be very useful to introduce a new property:

iPlan.(Variance←Actual-Budget)

It is possible that a future version of
Dyalog APL will allow the class designer or the user to place restrictions on
the introduction of new names into the APL space.

The strict
encapsulation described in the previous chapter may be a bit disconcerting
to APL developers, who are accustomed to having access to data on a "want to
know" rather than a "need to know" basis :-). What if we want to know what values ⎕RL or InitialSeed currently have in the instance, because the instance seems to be
misbehaving?

The first thing which is important to
realize is that if you set a stop in a method, or trace into a call, the
internal environment where the method is running is available to you while the
method is on the stack. To experience this first hand, create a new instance of
Random, trace into a call to Reset or Flat and examine
the value of ⎕RL while one of
these functions is suspended.

It is also important to realize that classes
and instances are dynamic in APL (as you would expect)! If you edit a class and
fix it, all existing instances will be updated to include the new definition.
You can inject temporary methods into a class for debugging purposes. Type )ED Random and add a public method to the class:

∇ r←RL x
:Access Public
r←⎕RL
:If x≠0 ⋄ ⎕RL←x ⋄ :EndIf
∇

Using our new method RL, we
can now query and set ⎕RL in the instance
as follows:

rand1.RL 77
42
rand1.RL 0
77

The system function ⎕CLASS
returns the class of an instance. This can be useful in a debugging situation
where you are faced with a misbehaving instance of unknown pedigree and need to
know which class to edit. You could just display the instance, since the default
display will often tell you the class name. However, as we will learn a bit later,
it is possible to change the display form, so this is not a reliable way to determine the
class.

⎕←rand1
#.[Instance of Random]
⎕CLASS rand4
#.Random

In one of the following chapters, we will
show how it is possible to define a derived
class. A derived class extends an
existing class by inheriting its
definition and adding to it. For an instance of a derived class, the result of ⎕CLASS
will have more than one element, and document the entire class hierarchy. The first element always
starts with a reference to the class which was used to create the instance.

The ultimate workaround or back door to
break encapsulation is of course the introduction of a public method with a
name like Execute, which allows you to execute any APL expression you like in the
instance space. You can use the :Include keyword to embed a namespace containing suitable development tools
in your classes. An example namespace called OOTools can be
found in the workspace of the same name in the OO4APL folder. It
includes a number of functions which may be useful during development. The
functions with names beginning with i
will execute in the instance space, those beginning with s will run in the shared space (more about the shared space later):