Make BASIC Fast Again, part 3

by Allen Huffman · Published January 31, 2018 · Updated February 2, 2018

Welcome to part 3 of a multi-part series on simple things you can do to speed up Microsoft BASIC on the Radio Shack Color Computer. These tips may be applicable to other systems with Microsoft (or other) BASICs.

Part 3 – Variables

Previously, we took a look at FOR/NEXT and numbers, and showed some simple things that will speed them up in Color BASIC.

Today, we will discuss variables.

You know my name, look up my value

Color BASIC has two types of variables: numeric and string. These variables may be single, such as “A” or “A$”, or an array of one or more elements, such as “A(3)” or “A$(3)”.

Unlike languages such as C, our BASIC doesn’t require you to declare variables before you use them. You just set and go:

10 A=1
20 A$="HELLO WORLD"

When an unknown variable is encountered, BASIC adds it to a list of all known variables. The end result is a lookup table of all in-use variables in the program. Here’s a poorly-written BASIC program to demonstrate this:

This program has a subroutine at line 100 that will print whatever string is in the A$ variable centered on the screen, based on the screen width of SW. (This allows the code to be used on the 40 or 80 column screen of the Color Computer 3.)

When this runs, BASIC begins scanning through the program and starts initializing variables as it finds them. It looks something like this:

In line 10, NM$ is seen, so BASIC tries to find it in the variable table. It is not there, so NM$ is added to the variable table.

In line 20, CP$ is seen, so BASIC tries to find it in the variable table. It is not there, so CP$ is added to the variable table.

In line 30, SW is seen, so BASIC tries to find it in the variable table. It is not there, so SW is added to the variable table.

In line 50, A$ is seen, so BASIC tries to find it in the variable table. It is not there, so A$ is added to the variable table. We GOSUB to line 100.

In line 110, LN is seen, so BASIC tries to find it in the variable table. It is not there, so LN is added to the variable table. LEN wants to get the length of A$, and A$ is found in the variable table so it uses that string.

In line 120, SW is found in the variable table, so we use its value and divide it by 2. LN is found in the variable table, so we use its value and divide it by 2. A$ is found in the variable table, so we use its value to print to the screen.

In line 130, we RETURN, which takes us back to just after the GOSUB in line 50, and since nothing else is there, we go on to line 60.

In line 60, A$ is found in the variable table, and we also find CP$ in the variable table, so we update the existing A$ with whatever CP$ is set to.

…and so on…

Based on this walk-through, the variable table should look like:

NM$
CP$
SW
A$
LN

As the variable table grows, it will take longer to access variables at the end of the table since BASIC has to scan through each variable until it finds a match. If no match is found, a new variable is added. Thus, even declaring a new variable will take longer the more existing variables are already defined.
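
To make the scan cost concrete, here is a rough Python sketch of the walk-through above. It is only a toy model: the list, the find_or_add helper, and the idea of counting "entries examined" are my own assumptions for illustration, not Color BASIC's actual internals.

```python
# Toy model (an assumption, not Color BASIC's real internals): the variable
# table is a plain list that is searched from the front on every reference.

def find_or_add(table, name):
    """Scan the table front to back; append the name if it is missing.

    Returns how many existing entries were examined.
    """
    for position, known in enumerate(table, start=1):
        if known == name:
            return position      # found after examining `position` entries
    table.append(name)           # not found: every existing entry was examined
    return len(table) - 1

# Variable references in the order the walk-through meets them
# (lines 10, 20, 30, 50, 110, 120, then 60).
references = ["NM$", "CP$", "SW", "A$", "LN",
              "A$", "SW", "LN", "A$", "A$", "CP$"]

table = []
costs = [find_or_add(table, name) for name in references]

print(table)  # variables in order of first use
print(costs)  # entries examined for each reference
```

In this model, every later reference to LN examines all five entries, while NM$ would be found after examining just one.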

Do you have anything to (pre) declare?

Instead of letting variables get declared in the order they are encountered in the program, you can declare them at the top of your program and control the order. For example:

5 A$="":SW=0:LN=0:NM$="":CP$=""

If you had that line at the top of the poorly-written example, it would change the order of the variable table to:

A$
SW
LN
NM$
CP$

But if you don’t need the initial values (i.e., SW=0 is pointless since it is set manually later, and the strings get set as they are used), there is a better way: DIM.

DIM is used to dimension an array of variables, such as:

DIM A$(10)

That lets you have 11 different A$ elements, A$(0) through A$(10) (it’s base-0, sorry), such as:

A$(0)="BACON"
A$(5)="EGGS"
A$(10)="SPATULA"

But, you can also use DIM just to declare an entry in the variable table, without any initial value set:

DIM A$
DIM SW
DIM LN
DIM NM$
DIM CP$

But that looks silly. It’s less silly to just put them all on one line like this:

5 DIM A$,SW,LN,NM$,CP$

This lets you control the order of the variables in the table: put variables that need to be fast near the start, and lesser-used variables, where speed isn’t as important, near the end.
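
As a rough illustration of what that reordering buys, here is a small Python sketch comparing the two table orders shown earlier. The entries_examined helper and the idea of summing "entries examined" are assumptions of this toy model, not measurements from a real CoCo. The references are the ones the centering subroutine makes in lines 110 and 120.

```python
# Toy model (assumed, not measured): a lookup costs one unit per table entry
# examined in a front-to-back scan.

def entries_examined(table, name):
    """Entries a front-to-back scan examines before it finds `name`."""
    return table.index(name) + 1

first_use_order = ["NM$", "CP$", "SW", "A$", "LN"]  # order of first use
dim_order = ["A$", "SW", "LN", "NM$", "CP$"]        # order from the DIM line

# References made by the centering subroutine (lines 110 and 120).
subroutine_refs = ["LN", "A$", "SW", "LN", "A$"]

cost_first_use = sum(entries_examined(first_use_order, n) for n in subroutine_refs)
cost_dim = sum(entries_examined(dim_order, n) for n in subroutine_refs)

print(cost_first_use)  # 21 entries examined per call
print(cost_dim)        # 10 entries examined per call
```

The counts only describe the scanning work in this model. Real timings also include everything else BASIC does on those lines, so the measured difference will be smaller than the ratio suggests.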

Order (of declaration) is important

With this knowledge, you should now be able to pass this quiz.

VARIABLE ORDER: Which is faster?

Version 1

10 LO=1:HI=1000:NM=42
20 FOR I=LO TO HI
30 IF I=NM THEN PRINT "DON'T PANIC!"
40 NEXT

Version 2

10 NM=42:LO=1:HI=1000
20 FOR I=LO TO HI
30 IF I=NM THEN PRINT "DON'T PANIC!"
40 NEXT

To test, we will use our benchmark program. You will notice the benchmark already does a DIM on line 5, so we’ll add our new variables to it in the order that we use them in the program:

Running this (poorly written) example shows 321. Every time line 50 has to compare I to NM, it has to scan through the entire variable table of TE, TM, B, A, TT, I, LO and HI before finding NM. If we want it faster, we could put the two variables we check inside the loop (I and NM) at the start:

5 DIM TE,TM,B,A,TT,I,NM,LO,HI

…and that shows 314, only slightly faster. I could have put them at the very start, which would be even faster, but I wanted the benchmark variables to be there so the benchmark is consistent.

The more variables in use, the longer it takes to find ones at the end. Here is an extreme standalone example:

This prints around 387. That is testing A, the first variable in the list. But if we were testing Z at the end:

40 IF Z=42 THEN PRINT "DON'T PANIC!"

…it prints around 436. Checking Z requires scanning through 25 variable entries before it finds Z. This is slower.

I guess my point here is, for places where timing is most important, use variables that are declared at the start. If you are just using a FOR/NEXT loop to repeat something, keep in mind it only has to look up that variable once when processing the FOR:

That prints 65. Since FOR/NEXT only has to look up the variable one time, not every time, it’s not much of a savings using a variable at the start of the list. Because of this, you can place FOR/NEXT variables that you aren’t checking (just things used for repeat functions or time loops, etc.) at the end of your DIM, and save the earlier variable slots for things that need to be faster.

And now maybe something we discussed in part 2 makes a bit more sense. When using NEXT with a variable (“NEXT I”), it is slower. BASIC has to look up that variable each time. With this in mind, let’s do one more test:

…while the FOR is about as fast to start, the NEXT has to look up Z each time. This reports around 131. FOR/NEXT loops are fairly predictable for timing when you just use “NEXT”, but if you use “NEXT x” with the variable, they are slower depending on how far down in the variable list the variable is.
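
The difference between “NEXT” and “NEXT Z” can be sketched with some simple counting. This Python snippet is a toy model only: it assumes 26 variables declared A through Z in that order, a loop of 1000 passes, and a cost of one unit per table entry examined. None of those numbers come from the benchmark itself.

```python
import string

# Assumption: variables A..Z were declared in alphabetical order.
variables = list(string.ascii_uppercase)
ITERATIONS = 1000  # assumed loop count, for illustration only

def scan_cost(name):
    """Entries examined to find `name` in a front-to-back scan."""
    return variables.index(name) + 1

def loop_lookup_cost(loop_var, iterations, named_next):
    """Total entries examined for the loop variable over a whole FOR/NEXT loop."""
    cost = scan_cost(loop_var)                    # FOR looks the variable up once
    if named_next:
        cost += iterations * scan_cost(loop_var)  # "NEXT Z" looks it up every pass
    return cost

print(loop_lookup_cost("Z", ITERATIONS, named_next=False))  # 26
print(loop_lookup_cost("Z", ITERATIONS, named_next=True))   # 26026
print(loop_lookup_cost("A", ITERATIONS, named_next=True))   # 1001
```

With a bare NEXT, where the loop variable sits in the table hardly matters. With a named NEXT, the position is paid for on every pass.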

Bonus: Faster than constants!

Before leaving the subject of variables, let’s see how using a variable compares to using a constant. For example…

This gives us 216. It seems scanning through all those variables to find Z is still faster than parsing the constant. But, as you can see, at some point, finding Z will take longer than parsing the constant. If your program uses a huge number of variables, this is something to consider.

For this reason, programs that do things like PRINT to screen positions will run a tad faster by using variables instead of constants. Instead of doing calculations in the PRINT, like…

PRINT@32+X,"*";

…it might make more sense to work out a way to have that 32 added to X to begin with, or have it in a variable. For example, if I wanted to do a program that makes an asterisk bounce back and forth on the screen, I could calculate the middle of the screen like this:

PRINT@32*8+X,"*";

The original Color Computer has 32 characters per line, and 16 lines, so multiplying 32 by 8 gets us to one of the middle lines on the screen. Let’s test:

…we get 98! Wow, that’s a huge improvement! (And yeah, I did all the math in the FOR loops, but that math is only done when the FOR is created, so if I do this six times, it only has to parse that math six times for each FOR loop.)

I like variables.

There is much more we could discuss on the subject of variables, so maybe we’ll revisit them later.

In future installments, we still need to look at GOTO and GOSUB, and various forms of INPUT. We should also revisit variables and test the speed of arrays.

We should also look at another type of optimizing: optimizing for space, rather than speed.

If you have other topics you’d like discussed, please leave a comment.

In 1982, I received my first computer: a $299.99 Commodore VIC-20. A year later, I moved on to a 64K Radio Shack Color Computer ("CoCo"). In 1990, I co-founded Sub-Etha Software "in Support of the CoCo and OS-9".
This later led me to a job at Microware, creator of OS-9. I am author of the CoCoFest Chronicles, a compilation of my fest reports covering the 1990s era. I also host the CoCopedia.com wiki. These days, I am enjoying excavating my original VIC-20 tapes and thousands of CoCo floppy disks...

2 Responses

Hi, here again.
In my idea for the new BASIC, besides variables being able to be integers, which is already a very good improvement, we will not have to search for them. When rolling or precompiling the program, 2 bytes will be hidden after each variable in the source, simply indicating the position of the variable in the table. There will be one table for integers and other tables for the rest. Based on that index, you will instantly locate the variable no matter where it is or how many letters are in its identifier, since you will directly access an index into a 2-byte array (if it is an integer variable). The names of the variables will be saved in another table