Is it possible to have programs longer than 32K with the Hack instruction set?

Is it possible to have programs longer than 32K with the Hack instruction set?

Is it possible to have programs longer than 32K with the Hack instruction set, or is a new instruction set needed?

At present, the highest bit is reserved for use as a differentiator between A and C instructions. This limits the largest addressable location using the '@' keyword to 32,768 instead of the 65,536 that would otherwise be available if all 16 bits were used.

The problem comes when you have labels in your program. Since labels are addressed using the '@' keyword (@lineNumber), this effectively sets the limit on program size to 32K.

I've thought about using a 'bank' register. Its value would be used to determine which chunk of memory is accessed when the '@' command is used. For instance a value of '00' to access the first chunk of 32K, '01' to access the second chunk of 32K etc. But this won't work because of instructions like '@SP' and '@ARG' where the bank is always '00'. That is, even if you bank switch to the appropriate section of program code (>32K) the commands are oblivious to this and operate as if they are in bank '00'. Further, even if you manage to write some clever code to handle this, there is another problem with nested statements where calls can be made to any location but are then expected to return to the previous location. This requires tracking the bank of the caller, which implies the need of a stack, which then implies a maximum call depth... So at the end of the day, banks don't seem to work.

Re: Is it possible to have programs longer than 32K with the Hack instruction set?

Administrator

It would be possible to have 64K programs, but the assembler needs to be modified a bit.

You can load an address greater than 32767 into A using two instructions:
@ not(address)
A=!A

(Extra credit: why do we need to use '!' for this instead of '-' ?)

The tricky part for the assembler is that it no longers knows the exact address for code labels during pass 1 since the amount of generated code depends on the value of labels that are defined later in the code.

The assembler will need to generate a table of references and labels during pass 1 and between pass 1 and pass 2 it will need to determine which @ commands will need two instructions and adjust the symbol table appropriately. (It may be easier to assume all @ commands require 2 instructions and then figure out which ones only need 1.)

Also note that the assembler could also generate 2 instructions sequences to handle @ commands with negative numbers. @-42 could be coded as:
@41
A=!A

Re: Is it possible to have programs longer than 32K with the Hack instruction set?

So tweaking the assembler worked fabulously for getting 65K programs to run!

Any thoughts on how to go about programs longer than 65K? I'm writing custom libraries to create a more fleshed out OS. These libraries alone (not even including any application code) are more than 65K long.

The program counter can count only to a maximum of 65K (2^16 - 1). What tricks can be used to overcome this limitation?

I suspect one solution might be bank switching. I'm thinking of having an additional register called 'BKSEL'. Each time a label is referenced, in addition to setting the A register with the label's address, the BKSEL register will be set with the appropriate bank number... One problem I can already see with this is ensuring that all the code belonging to a label is located in the same bank (i.e. it is not clipped).

In this video, it is mentioned that bank switching is not used in computers. And instead, data is stored in larger (but slower) storage media such as hard/flash/floppy/cd/dvd disks. What is the technique used to get data from these media? For example, how can a 32bit computer access data from a 128GB hard drive?

Re: Is it possible to have programs longer than 32K with the Hack instruction set?

I think bank switching was used in early computers. The Apple ][ had the so-called language card, which you could plug-in and have additional memory. I think it used bank-switching.

Also on the IBM PC you could (still can?) bank-switch the ROM, so you can have access to more RAM if you don't need the BIOS.

I think the technique they refer to in the video is Virtual RAM. But it requires a larger address space, so it's not directly applicable to the HACK platform.

Somewhat similar to the bank-switching is the idea to use two registers to form a larger address space. This was used in the 8-bit processors like 6502 to have a 16-bit address. In the 8086 two 16-bit registers were used to form a 20-bit address space capable of addressing 1 MiB of memory. This meant that you had multiple addresses for most of the memory locations.

I think the bank-switching technique is used when you need to go beyond what the processor is capable of. For example, the 6502 can only address 64 KiB of memory. If you want to go beyond that, you need to have an external to the CPU way to switch which part of RAM or ROM it sees at a given address. If you can augment the CPU, like adding additional registers in the CPU, then it's not really bank-switching. You can add more address pins as well, to widen the address to 32 bits.

Re: Is it possible to have programs longer than 32K with the Hack instruction set?

I'll confirm Ivan's comments on back switching and virtual memory. I programmed IBM PC/XTs that used bank switching.

With virtual memory, the application program's view of memory is one continuous address space and the program need not concern itself with loading or enabling various parts of itself. All modern general-purpose computers use virtual memory.

With bank switching, the program needs to manage the bank switching hardware when it needs to call different parts of itself.

[This description assumes an architecture where the programs run from RAM.]

The normal bank switching scheme only affects a portion of RAM so the low RAM always contains the same program code. The parts of the program that occupy the switchable banks are called overlays, and the portion of the program that does the required bank switching is called the overlay manager. The overlay manager must reside in the low RAM.

When a routine in an overlay needs to be called, the overlay manager must handle the call.

Re: Is it possible to have programs longer than 32K with the Hack instruction set?

Any thoughts on how to go about programs longer than 65K? I'm writing custom libraries to create a more fleshed out OS. These libraries alone (not even including any application code) are more than 65K long.

Re: Is it possible to have programs longer than 32K with the Hack instruction set?

Administrator

ivant wrote

And enable 32-bit jumps as well, right?

Well, that's the big architectural question for a Hack computer with large, flat, address space. You need another register to hold the target address high word. Jump to 0x1234ABCD would need to be something like

@0x1234
H=A
@0x5432
A=~A
0;LJMP

LJMP would need one of the spare instructions bits set, or maybe just retask the 'a' bit to mean </l>long when jump bits are set in the instruction since M;JMP is illegal, or at least extremely useless 8^).

Hmm. If there's a new 'L' bit in the instruction then you could use it to gate the H register to the high RAM address and get access to more than 64K RAM address space.

Re: Is it possible to have programs longer than 32K with the Hack instruction set?

Thank you everyone for all the info, I appreciate it!

I read up a bit on virtual memory and paging and I think that's what I'm looking for. This playlist in particular has been quite helpful.

My plan is to treat the current 64k program ROM as program RAM (i.e. make it writable). And then to have a bigger memory that emulates "disk" storage. Thus chunks of program can be loaded from the large memory into program memory as needed. I should mention that I am not using the TECS emulator so I am able to adapt my emulator accordingly.

With that said, I haven't taken any formal CS classes so it might be a while before I figure out how to implement the virtual memory code/hardware. (And the above plan will likely change as I learn more).