Daily Challenge Day 14

Friday is admin day, and after completing admin decided to take a look at how far I had progressed with converting my Technical Library to LibreOffice Basic (LoB), since it has been a few months since last worked on it. Whilst doing so I noticed that one of the new exported module files (.bas) was over 64kB. To convert to LoB I have combined vba modules together, due to to limitations of variable and type scope in LoB. So the process, comprises of modifications in LoB, FreeBasic and vba. The larger module doesn’t appear to be a problem: confirming my suspicions that the 64kb limit was not entirely valid. So instead of writing a post for my daily challenge I went googling information on the issue.

Here is one discussion group on the issue: Does VBA have a 64K limit on Modules? Part of the answer is given by the quote attributed to Chip Pearson. I say part because in effect it is still talking about file size, and I suggest the issue is about memory management.

I’m guessing the issue has something to do with the 8080 chip, and CPM/80 and MS DOS, where a program was limited to run in 64kB of memory. The following articles discuss the allocation of memory in MS DOS, and the Excel/vba limits:

Whilst the discussions are interesting they seem to get locked into the file size, and lose sight of allocation of memory: that is file size does not directly relate to the amount of memory required by the program.

I remember when programming in Turbo Pascal, that there was an array size and data structure limit of 64kB: it doesn’t leave much for the program if the array is stored in the .COM file. As far as I know it wasn’t stored in the file, the .COM contains instructions to allocate RAM memory for the array, not the array itself. The point being is that the .COM file itself could be less than 64kB and the program can still exceed the 64kB limit. On the other hand Turbo Pascal supported overlay files (.OVR), which permitted programs larger than 64kB. If I remember correctly the overlay files, could replace each other in memory, so they were only useful if instructions could be broken into independent chunks (eg. swap editing functions for printing functions). Apparently the next step beyond the limits of the .COM file was the .EXE file which, worked with multiple 64kB segments.

The limits on the size of arrays and data structures impacts the practicality of computers for solving engineering problems. As systems of equations are solved using matrix techniques, and therefore the more variables and equations to solve, the larger the matrices involved, and therefore more memory required. Since the matrices are largely filled with zeros, some efficiency can be achieved by using banded matrices: this is the approach typically taken for matrix structural analysis (MSA), for example our plane frame program. But if the problem gets larger still, then techniques have to be adopted which allow the use of hard-disk space, and solve the system of equations incrementally, passing through the matrix like a wave. That is to say cannot work with the whole matrix at once, only part of it, and therefore the partial results from files have to be combined together.

Anycase with the arrival of MS Windows the need for these hard-disk techniques diminished, as large arrays can now be defined without getting a compiler error. Similarly the major difference between spreadsheets and databases, for example between Lotus 123 and dBase was that spreadsheets were limited by RAM, whilst databases worked incrementally on files. I remember for one project, I had to take memory from two other computers to get enough RAM to process railway curve data for a long stretch of railway. I’d suggested that it would be easier to process the data in FoxBase, which the company had licenses for, but I was required to use Lotus because that’s all the engineer on the project could use. I was there because I could do stuff with computers. Actually Lotus was the easy way to process the data given that needed different formula for different records, and for the most part that was apparent as manually went through each row. I did create multiple flags to check various conditions so that could select the appropriate formula: the nested ‘if’ statements were too long for an individual cell, and also cumbersome to read where they did fit. Still I reckon if I could have created the appropriate test conditions, then FoxBase would have been the more robust solution and suitable for future projects. The Lotus 123 solution, would need modifying on each and every future project.

Now whilst Windows Page files and Swap files, and generally virtual memory, remove the need for programmers to manually manage the allocation of RAM, and the need to write their own page files: there is still the possibility of hitting memory limits.

VBA has to run in the memory allocated by Excel or other application, chances are that the VBA compiler is based on the old compilers with the 64kB segment limits. Therefore the vba module exported to a .bas file less than 64kB, may still have problems if the subroutines in that module generate code and data segments which exceed the 64kB limits. On the other hand a .bas file can be larger than 64kB on condition it doesn’t generate compiled code and data segments larger than 64kB. Though the data objects may be held outside the 64kB code segments, and managed as virtual memory, as relatively large arrays can be defined/declared in vba.

Anycase my point is, the issue is with the management and use of the RAM, not size of plain text files, and not the size of compiled files. My understanding is that compiled files contain instructions on how to allocate and use memory, therefore it is possible for a few short instructions to demand the use of a large amount of memory.

So before start hacking a vba program up into using smaller modules, first look at the data structures being used, for example try reducing the size of dimensioned arrays, and experiment with local and public variables (they may be using different memory segments).

Additionally, rewrite the program making use of functions and subroutines. That is, it is not necessarily the module that is too large but one or more of the procedures in the module, which is demanding a compiled segment space greater than 64kB. Therefore try to keep procedures small, sticking to a single task. For example if print procedures out onto A4 paper, and the ends of loops and block statements are stretched onto a second sheet, then the procedure is too long and potentially unreadable. Another more restrictive guide to length of a procedure is the limits set by the forth programming language, the dialect I was introduced to, was limited to 16 lines (it may have been less 4 or 8). Another limit is the old text screen limit of 25 lines: if the procedure doesn’t fit in 25 lines then it doesn’t fit on the screen and therefore have to scroll the screen to view the remainder: but can never view all in one hit.

To get a bird’s-eye view consider a product structure tree in the form of an indented list. A single level list is easier to read than a fully exploded assembly list. If a procedure does little more than call a list of procedures, then it is relatively easy to read that procedure and also relatively easy to convert from one programming language to another. So aim to have a programming style which is compatible with multiple programming languages and makes translation from one to the other relatively easy: that is your program looks the same no matter what programming language it is written in.