I prefer the old model, with hubexec only being $1000 and above. I think it's totally fine that the first chunk of hub ram can't be hubexec'd.

Also, Seairth, the cog memory needs to be in the first block so that immediate mode numbers (9 bits) can address cog memory space without some funky remapping.

Exactly my thought.

Hubexec will need addresses longer than 9 bits anyway. And we may still be able to use addresses above 512K without restrictions in future FPGA versions. Could any ROM be placed at the end of the address space?

Losing 4K of 512K for Hubexec is not so bad, compared to a complicated memory system.

Another thought would be that there is no hub memory below $1000. So HUB starts at $1000 and ends at $1000+512K. Then there would be space for the ROM at $0.

But sharing the same address space, with the target depending on whether an address is long-aligned, just to 'save' 4K more for HubExec needs a lot of explanation and documentation, and is not worth it. Saying that HubExec cannot start below $1000 takes one sentence.

And I am quite sure that any compiler writer will be happier with a simpler memory model.

Enjoy!

Mike

This came up because the ROM gets loaded into $00000..$03FFF on startup and I needed to be able to execute it in place.

I think having these new address rules is not bad, at all. It does not impinge upon anything that existed before, but allows you to execute from the hub in what was cog/LUT-only area, but at offsets that you never would have used in cog/LUT code. In fact, the assembler errors out if you are in cog/LUT mode (ORG, as opposed to ORGH) and try to assemble anything at non-long-aligned addresses. That is within the context of a cog's memory, though, and not the hub's memory.

You can think of it like this: everything is hub-executable, except addresses of the form %0000000000xxxxxxxxxx00. Is that really so bad? It doesn't affect anything you might do with hub memory. It just means that if the cog's program counter is in the address range %0000000000xxxxxxxxxx00, it gets its instruction from its own memory, as opposed to the hub's.
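The rule above can be sketched as a small decode function. This is only an illustration of the proposal as described in this thread, not the actual Verilog; the name `fetch_source` and the constants are made up for the example, and addresses are treated as byte addresses.

```python
LUT_BASE = 0x800   # assumed: byte addresses $800..$FFF map to the LUT
HUB_ONLY = 0x1000  # at or above this, every fetch comes from the hub


def fetch_source(pc: int) -> str:
    """Return 'cog', 'lut', or 'hub' for a byte-address program counter.

    Hypothetical helper: models the proposed rule that only long-aligned
    addresses below $1000 fetch from the cog's own memory.
    """
    if pc < 0:
        raise ValueError("program counter must be non-negative")
    if pc >= HUB_ONLY or pc & 0b11:
        # High addresses, or any non-long-aligned address: hub exec.
        return "hub"
    # Long-aligned and below $1000: cog registers first, then LUT.
    return "cog" if pc < LUT_BASE else "lut"
```

So `fetch_source(0x000)` is a cog fetch, while `fetch_source(0x001)` is a hub fetch, which is the whole trick.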

I was thinking about how addresses above $1000 are hub-exec, while addresses $0..$7FF are cog-exec and $800..$FFF are LUT-exec, and what a pain it is that you can't have hub-executable code below $1000. Then, it dawned on me that cog/LUT-exec could be restricted to long-aligned addresses, only, allowing hub exec to occur on non-long-aligned addresses below $1000. Here's the new way:

My initial reaction was just like the rest of you... No way!
In fact I wrote a post outlining some other suggestions but before I posted it I had second thoughts...

I love it !!!

The writeup just says that hub-exec is limited to hub memory above 4KB (this may go up in later revisions).
Then describe a quirk that can be used to get hub-exec to work in the lower 4KB.
This makes it easy to understand.

Besides, most code that will likely end up in the lower 4KB Hub space will be some form of hub-exec bootup & monitor code that will likely be provided in an object (or copied from ROM). So the normal user can forget about this.

As a secondary benefit, a number of us wanted a space for mailboxes to pass info between cogs. This could now be forced upon us.

(4.) The hub ROM is read via COGID WC, sequentially. This happens at boot-up and the contents are loaded into the first 16KB of RAM and executed by cog0. 16KB is complete overkill, for now, but it is sufficient room for a complete on-chip development system in the future.

Alternatively, you could get rid of DAT and add HUB, COG, and LUT. Then ORG is dependent on which section you are in. This might provide a bit better documentation than trying to spot ORGx directives to differentiate between code/data locations.

Actually, In the cases of COG and LUT, it would just be implied that they start with the appropriate ORG ($0000 for COG, $0800 for LUT). For HUB, it starts at whatever the current hub address is (basically the same as what DAT is now). ORG would only be used to override the defaults.

Why can't the ROM entry point just be at $01000? So when the first cog starts up it starts running the ROM code at $01000 instead of $00000 (or now $00001). Then the cog/lut image can be in the first 4k if you want, and the code at $01000 can load that into the cog and jump to $00000.

This seems a lot cleaner and simpler to me than concocting this oddball thing where jumping to odd addresses below $01000 = hub exec, but jumping to long aligned addresses goes to cog.

Code generation for hub exec code in the bottom 4k now has to make sure everything stays not long aligned for destinations for branches. Which will be the case if there is no embedded data, I guess, but still it's now a concern where it wasn't before. At least in the hand coding space you have to make sure you specify (or calculate) non-long aligned addresses for branches when in the first 4k so as not to accidentally branch into cog or lut code (which is perfectly reasonable to do in hub exec code).
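To make that concern concrete, here is a rough sketch of the kind of check a code generator or assembler would need. The function name and the list-of-targets interface are invented for the example; no existing tool is being described.

```python
def risky_branch_targets(targets, hub_only_start=0x1000):
    """Return the branch targets that would land in cog/LUT, not hub.

    Hypothetical check: under the proposed rule, a byte address below
    hub_only_start that is long-aligned is fetched from cog/LUT memory,
    so hub-exec code that meant to branch within the hub must avoid it.
    """
    return [t for t in targets if t < hub_only_start and t % 4 == 0]


# Only $0404 is both below $1000 and long-aligned.
flagged = risky_branch_targets([0x0001, 0x0404, 0x0805, 0x2000])
```

Of course a branch into cog or LUT code may be intended, so a real tool would have to know which one the programmer meant.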

To me it makes things more complex than they need to be, and that's not fun.

This should make the verilog very simple. In all cases, the 2 LSBs are set to zero. This means that hub instructions will still be long aligned, and the addressing is contiguous. This shouldn't be an issue for the cog or lut for the most part. Your code above would instead look like:

Surely an assembler (or compiler) manages all that housekeeping, once you specify the segment.
It only appears complex, if you try to go deeper & code in hex, but PCs are great at this sort of low level house keeping.

I agree that complex is not good. Simple linear address mapping is easiest to understand and doesn't tax one's mind unnecessarily.

The ROM could be made to load starting at $1000.

I'm also thinking that we don't need a special ORGL instruction, but just the ORG we use for the cog, allowing values (long-index) $000..$1F7, then $200..$3FF.

Man, I wish we could use simple long-index addresses within cog space to get around this everything-times-4 issue. It's a brain-bender.
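For anyone else tripping over the times-4 mapping, here it is as a quick sketch. The helper names are made up; the ranges are the ones mentioned above ($000..$1F7 for the cog, $200..$3FF for the LUT), and the byte addresses follow from multiplying the long index by 4.

```python
def long_index_to_byte_address(index: int) -> int:
    """Convert a cog-space long index ($000..$3FF) to a byte address."""
    if not 0 <= index <= 0x3FF:
        raise ValueError("long index must be in $000..$3FF")
    return index * 4  # each long is 4 bytes


def region_of_long_index(index: int) -> str:
    """Return 'cog' for indexes below $200, 'lut' for $200..$3FF."""
    if not 0 <= index <= 0x3FF:
        raise ValueError("long index must be in $000..$3FF")
    return "cog" if index < 0x200 else "lut"
```

With this, long index $1F7 lands at byte address $7DC, and $200 lands at $800, the start of the LUT range.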

Chip,
The ROM could still load into RAM starting at $00000. Just make the cog start hub exec at $01000. Then the image to be loaded into cog/lut could be in the first 4k, and the code at $01000 in hub could load that first 4k into the cog/lut and jump to it.
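As a rough model of that startup idea (not real boot code): the sketch below treats hub memory as a byte buffer and splits the first 4KB into a cog image and a LUT image. The function name, the even 512/512 split, and the little-endian long layout are assumptions made for the illustration.

```python
import struct

IMAGE_BYTES = 0x1000      # first 4KB of hub holds the cog/LUT image
LONGS_PER_REGION = 512    # assumed: 512 longs for cog, 512 for LUT


def load_cog_lut_image(hub):
    """Unpack the first 4KB of hub memory into (cog, lut) lists of longs.

    Hypothetical helper modelling what the code at $01000 would do before
    jumping to $00000. Little-endian longs are assumed.
    """
    if len(hub) < IMAGE_BYTES:
        raise ValueError("hub image must be at least 4KB")
    longs = struct.unpack("<1024I", bytes(hub[:IMAGE_BYTES]))
    cog = list(longs[:LONGS_PER_REGION])
    lut = list(longs[LONGS_PER_REGION:])
    return cog, lut
```

The point is just that the copy step is a plain linear block move, with no special address rules needed.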

This of course is just one example of a simple startup, but in any case, I think keeping stuff simpler here is the wise choice.

This just means that when the lower 2KB HUB is used in hub-exec mode, instructions MUST be long aligned.

I expect mostly we will desire long aligned hub-exec to minimise clock stalls.

Is there any reason that the ROM could not just be loaded into HUB 2KB (%000000001_000000000_00) and above, and execution start at this address ???
Maybe the FUSES could be read into HUB $0 upwards, and the secure section cleared if security is enabled.

As for ORGC/ORGL/ORGH...
I would prefer to use ORGx for all of them rather than plain ORG for one of them (i.e. always declare specifically).
Actually ORGCOG/ORGLUT/ORGHUB is more intuitive.

Why do we need an ORGL (or ORGLUT) ???
Couldn't ORGC (ORGCOG) just be expanded to 4KB where the lower 2KB is COG/Registers and the upper 2KB is LUT ?

Also, won't LUT be used more for cog-exec or extended cog memory, rather than LUT usage?
If so, then maybe we need to think of a better name than LUT. Extended COG Memory is a bit of a mouthful though.