OK guys, after many hours of pondering this I have decided to release the source code, but not all of it! Just the main game itself. I'll still be in charge of the overview and of putting the game together, so that I stay involved rather than just handing the project over.

You'll need a level file: download the game from STNICCC, extract it, pick a level file, unpack it (it's ICE packed) and save it to disk. In the source, find the label 'level_data' and incbin the file there. Set the label 'stage' to whatever stage you're playing. At the top of the source, find the line 'move.w #-1,cheat_flag' and uncomment it to turn the cheat off. Have fun!

So let it be written, So let it be done. I'm sent here by the chosen one.

Yeah, this is kinda spooky and sad (I mean that respectfully, of course).

But, dear Stephen... it's one last "thank you" from me, and I truly hope somebody with equal talent picks this up and continues the hard work already done.

Steve

It doesn't assemble directly in Devpac; there are some typos in the source code. But after fixing two missing "(" and some REPTs placed on the same line as a label (which Devpac does not seem to like), it works. Don't forget to ICE-unpack the levels with a tool like Multi depacker.

I wanted to check whether a single level could work on a 16 MHz CPU (the complete game doesn't), but it doesn't either. Even switching to 16 MHz while playing the level hangs the game. I still don't know why.

(There are also graphics and gameplay bugs on level 1; I don't know whether the source code is not final or the level data is corrupted.)

I need to check the Devpac 3 manual. I have never seen a label x, and there are plenty of them here. It probably just means the current one, but then why is it not local? And "set 0"?

In any case, the preview release works well at 16 and 32 MHz in Steem Debugger. And as I remember, it worked fine on a Mega STE at 16 MHz.

Famous Schrodinger's cat hypothetical experiment says that cat is dead or alive until we open box and see condition of poor animal, which deserved better logic. Cat is always in some certain state - regardless from is observer able or not to see what the state is.

x is not a label, it's a macro variable defined to 0, then the block between REPT and ENDR is repeated 10 times. x is incremented by 4*13 between each REPT loop.
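To make the expansion concrete, here is a minimal Python sketch of what the assembler does with such a block. It is only a model: `expand_rept` and the `move.l` body are hypothetical names chosen for illustration, and the real loop body in the game source may look different. Only the start value 0, the repeat count 10 and the step 4*13 come from the explanation above.

```python
def expand_rept(body, count, start=0, step=4 * 13):
    """Return `count` copies of `body`, with {x} replaced by a growing offset."""
    x = start                      # plays the role of `x set 0` before the REPT
    lines = []
    for _ in range(count):         # plays the role of the REPT ... ENDR block
        lines.append(body.format(x=x))
        x += step                  # x grows by 4*13 between repetitions
    return lines


# Hypothetical loop body; the instruction used in the game source may differ.
unrolled = expand_rept("    move.l  {x}(a0),{x}(a1)", count=10)
print("\n".join(unrolled))
```

Each repetition ends up as its own instruction in memory with a fixed offset, so nothing has to be counted or tested at run time.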

That's because on ST it's more efficient to unroll loops in memory instead of making actual loops - I guess it's different on Mega STE because of the associative cache, but it's difficult to check on emulators, I think none of them emulate this cache.

I was talking about some occurrences of a label being declared on the same line as a REPT:

Above is just a part - there are 273 lines in total on screen. I guess I have not used macros for a long time.

Btw, unrolling is more efficient even with a cache, especially when looping over short code, because there is no dbf or other jump back to the start of the loop, which just eats CPU time while doing nothing "useful".

Anyway, the more interesting question is why you had problems at 16 MHz on the Mega STE. It may be cache sensitive. I need to check it again on an MSTE. In any case, it works well and smoothly at 8 MHz, so there is no need to push further.

I already did a mod of this game: I replaced the ICE-packed files with better packing that depacks much faster. I really don't get why ICE is used so much (it is seen even in some commercial releases). viewtopic.php?f=28&t=29047&hilit=rtype+deluxe&start=50

Maybe we could do a new release for floppy users - I will provide you with the repacked files and the depacker source, so you can add it to the source file and assemble. I really have too many things to work on currently, and you have already fixed the source.

AtariZoll wrote: Btw, unrolling is more efficient even with a cache, especially when looping over short code, because there is no dbf or other jump back to the start of the loop, which just eats CPU time while doing nothing "useful".

Yes, there's the jump test time, but it all depends on the size of the loop itself, given that the cache avoids memory access - provided the loop code doesn't write or read on memory locations that collide with the ones of the code, which should be cached after the first iteration.
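The size of that jump overhead can be put in rough numbers. The sketch below uses nominal 68000 cycle counts (20 cycles for `move.l (a0)+,(a1)+`, 10 for a taken `dbf`, 14 for the final `dbf` that falls through). The choice of loop body is an assumption for illustration, and the model ignores the ST's bus rounding as well as any cache effect, so treat the result as an order of magnitude only.

```python
MOVE_L_CYCLES = 20      # move.l (a0)+,(a1)+  (nominal 68000 timing)
DBF_TAKEN_CYCLES = 10   # dbf when the counter has not expired (branch taken)
DBF_EXPIRED_CYCLES = 14 # dbf on the last pass (counter expired, falls through)


def looped_cycles(n):
    """Cycles for n copies done with a one-instruction body plus dbf."""
    return n * MOVE_L_CYCLES + (n - 1) * DBF_TAKEN_CYCLES + DBF_EXPIRED_CYCLES


def unrolled_cycles(n):
    """Cycles for the same n copies written out n times (REPT style)."""
    return n * MOVE_L_CYCLES


n = 10
overhead = looped_cycles(n) - unrolled_cycles(n)
print(f"looped: {looped_cycles(n)}  unrolled: {unrolled_cycles(n)}  "
      f"overhead: {overhead} cycles")
```

With a body this short, the loop control costs about a third of the total time; with a longer body, the same fixed cost per pass matters much less, which is the point made above.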

AtariZoll wrote: Anyway, the more interesting question is why you had problems at 16 MHz on the Mega STE. It may be cache sensitive. I need to check it again on an MSTE.

It shouldn't be, at least if I understand correctly how the cache works (unless the blitter is used to generate code, or some other nasty trick). But it doesn't work on Hatari when set to a frequency > 8 MHz either, which is very strange because there doesn't seem to be any time-critical code or generated/self-modifying code. I'm going to try disabling some subroutines to see if I can find a culprit.

AtariZoll wrote: I already did a mod of this game: I replaced the ICE-packed files with better packing that depacks much faster. I really don't get why ICE is used so much (it is seen even in some commercial releases). http://atari-forum.com/viewtopic.php?f= ... e&start=50

Maybe we could do a new release for floppy users - I will provide you with the repacked files and the depacker source, so you can add it to the source file and assemble. I really have too many things to work on currently, and you have already fixed the source.

Sorry, I had forgotten about your thread and version. It was nearly a year ago, and I didn't have time to try to compile the source before now. Thanks for the work.

I think with respect we should establish and document the correct configuration to build the project.

The source code is quite big at 400k, so the text buffer has to be allocated at least 500k. This is set in Editor\Preferences.

The compiled code is about 523k, so we need to allocate a maximum of 600k. This is set in Program\Assemble (Max: 600k).

So you need an ST or emulator configuration with at least 2 MB for this project.
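As a quick sanity check of the 2 MB figure, the two allocations can simply be added up. This is only a back-of-the-envelope sketch: the list of RAM configurations is an assumption, and the memory used by TOS and by Devpac itself is not counted, so the real requirement is somewhat higher than the sum shown here.

```python
EDITOR_BUFFER_K = 500   # text buffer set in Editor\Preferences
ASSEMBLE_MAX_K = 600    # output limit set in Program\Assemble

# Assumed common ST RAM sizes in KB; other configurations exist.
RAM_CONFIGS_K = (512, 1024, 2048, 4096)


def smallest_config(required_k, configs_k=RAM_CONFIGS_K):
    """Return the smallest RAM size that holds `required_k`, or None."""
    for size in sorted(configs_k):
        if size >= required_k:
            return size
    return None


required_k = EDITOR_BUFFER_K + ASSEMBLE_MAX_K
print(f"buffers alone: {required_k}k -> needs a {smallest_config(required_k)}k machine")
```

The two buffers alone already exceed 1 MB, which is why a 1 MB machine is not enough.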

I think it is important to mention why it doesn't assemble using Devpac v2.09. In many places the code uses the conditional assembly directive IFEQ\ELSE. This is not available in Devpac 2.09. Changing the code to use ELSEIF gets it further and shows the same errors outlined above.

Assembling on a hard disk in a subdirectory caused errors for me. Specifying the full path name fixed this. Adding my path to the INCDIR didn't solve the problem, but this should be explained for people trying to assemble the code in different locations.

I only have Devpac 2.09 and 3.10 because that's what I used back in the day.

So exactly which version of Devpac did Bod use or did he use a different assembler or toolchain?

I have assembled longer sources than this with Devpac 3 without problems. I should look up what the latest version is; 3.1 is the highest version I have.

I think it must be Devpac, at least it looks pretty much like source written for it. I see 3.5 mentioned for the Amiga, but for the ST 3.1 seems to be the last one.

OK, I managed to assemble it after only removing 2 brackets in the problematic lines. I did not correct the lines where a label and REPT were together. It works fine - level 4 in Steem Debugger.

But this is not the same as the STNICCC preview, as stated by bod/STAX himself. We need RTYPE.TOS (2.2 KB depacked), which controls the intro and the level loading. Sadly he is not among us anymore, so we need to disassemble it. I will probably do it, changing the depacking to begin with. In any case, it will not fit on a single DD floppy.

It is not clear whether some bug fixing, code optimizing or speeding up of slow parts is needed - if that is possible at all without serious rewrites.

Yes, some errors are misleading in Devpac 3 when it comes to macros.

I also found that there are bugs; at least in stage 1, the boss has graphic glitches.

However, I recommend building this with vasm on another system; it takes about 2 seconds. Devpac takes a lot more time, and when I switch Hatari to >8 MHz to build faster, the generated code crashes for an unknown reason (?).

Anyway, I'm here to say that I was looking for a way to make this game work at higher CPU frequencies (mostly because I have a Mega STE and wanted to see if the 16 MHz mode would improve the game experience). In all released versions the game works only at 8 MHz.

After some testing I found that the game hangs when running at >8 MHz because it uses a peculiar way of doing vsync:
- first, timer B is used to change the hardscroll parameters in the lower part of the screen to display the status bar (status_tb)
- then a new timer B opens the lower border (lower_tb1)
- then, around 28 lines later, a new timer B in event mode (.ltb2) decrements vbl_count, used by a function called "sync" which the game uses to sync the start of a new screen buffer.

(There is a "vsync" function but it's not used in-game, only when the system VBL functions are active, for the stage loader I guess.)

Of course at 16Mhz+ the lower border is not opened, then vbl_count is not decremented, and the game hangs in the sync function.
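The dependency chain can be modelled in a few lines of Python. This is a toy model under stated assumptions, not the game's code: it assumes timer B in event mode only ticks on displayed scanlines, that lower_tb1 fires on line 199, that an opened border adds 40 displayed lines, and that "sync" waits until vbl_count reaches zero. Only the names status_tb, lower_tb1, .ltb2, vbl_count and sync, plus the 28-line delay, come from the description above.

```python
NORMAL_LINES = 200        # displayed lines per frame without border tricks
EXTRA_BORDER_LINES = 40   # assumed number of lines gained by opening the border
LOWER_TB1_LINE = 199      # assumed line where lower_tb1 fires
LTB2_DELAY = 28           # .ltb2 fires about 28 displayed lines later


def run_frame(border_opens, vbl_count):
    """Simulate one frame; return vbl_count after the timer B chain."""
    displayed = NORMAL_LINES + (EXTRA_BORDER_LINES if border_opens else 0)
    ltb2_line = LOWER_TB1_LINE + LTB2_DELAY
    # In event mode the timer only counts displayed lines, so .ltb2 can
    # only fire if enough lines are displayed after lower_tb1.
    if ltb2_line <= displayed:
        vbl_count -= 1
    return vbl_count


def sync(border_opens, max_frames=5):
    """Return the frame on which sync would return, or None if it hangs."""
    vbl_count = 1
    for frame in range(1, max_frames + 1):
        vbl_count = run_frame(border_opens, vbl_count)
        if vbl_count <= 0:
            return frame
    return None


print("border opened:    ", sync(border_opens=True))
print("border not opened:", sync(border_opens=False))
```

In this model, a failed border opening means the last timer B never reaches its count, so the wait loop never ends, which matches the hang described above.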

I tried to change the timer B from event mode to delay mode to get the same sync without having to open the lower border, it works in game but for some reason during the game init it hangs in the sync function as well. It may be related to something I didn't see in the timer B handling or maybe the blitter (the blitter is called just before setting up the lower border opening timer).

In the end I just ditched all timer B events except the one displaying the status bar and decremented vbl_count at the end of this function, and it works perfectly at 8Mhz and 16Mhz+cache. I still don't know why the timer B was used this way to make a software vsync at a precise line in the lower border.

I configured the game to run at 25 fps and there are still a lot of slowdowns at 16 MHz, but far fewer than at 8 MHz, even without any optimization specific to the Mega STE associative cache.

(Note that on Hatari the game still hangs during init, but it works on my Mega STE at 16 MHz.)

Now if I find some more time I will get in touch with a R-Type superplayer I know to check if the collision detection in this game must be pixel perfect (it's often not the case in shoot'em ups, smaller collision masks are used). I searched on the interwebs and couldn't find the answer for R-Type.

I'm going to clean up the source code and post my modified version here.

fenarinarsa wrote: After some testing I found that the game hangs when running at >8 MHz because it uses a peculiar way of doing vsync:
- first, timer B is used to change the hardscroll parameters in the lower part of the screen to display the status bar (status_tb)
- then a new timer B opens the lower border (lower_tb1)
- then, around 28 lines later, a new timer B in event mode (.ltb2) decrements vbl_count, used by a function called "sync" which the game uses to sync the start of a new screen buffer.

(There is a "vsync" function but it's not used in-game, only when the system VBL functions are active, for the stage loader I guess.)

Of course at 16Mhz+ the lower border is not opened, then vbl_count is not decremented, and the game hangs in the sync function.

Fenarinarsa I guess you're testing that 16MHz under emulator. That mode works differently than on the real Atari.

What does that mean for opening the border? Under an emulator, a NOP takes 4 low-res pixels with an 8 MHz CPU and 2 low-res pixels with a 16 MHz CPU.

On a Mega STE, the bottom border should open properly because the memory bus speed is the same: a NOP takes 4 low-res pixels on either a 16 MHz or an 8 MHz CPU.

Cyprian wrote: ... In the case of Hatari/Steem, 16 MHz means a 16 MHz CPU and a 4 MHz memory bus. In the case of the Mega STE, 16 MHz means a 16 MHz CPU and a 2 MHz memory bus (exactly the same as a standard ST).

What does that mean for opening the border? Under an emulator, a NOP takes 4 low-res pixels with an 8 MHz CPU and 2 low-res pixels with a 16 MHz CPU.

On a Mega STE, the bottom border should open properly because the memory bus speed is the same: a NOP takes 4 low-res pixels on either a 16 MHz or an 8 MHz CPU.

I would not agree with that. There are 2 kinds of memory in the Mega STE: regular ST RAM and cache RAM. If the CPU accesses an address that is already in the cache, the access is 2x faster than when it is not.

So if there is a loop containing a NOP, its first access will be as with an 8 MHz clock, but then it is cached, and the following ones will be at higher speed - 250 ns instead of 500 ns. Additionally, the CPU can always perform internal operations at 16 MHz, and will wait (virtually switching to 8 MHz) only when ST RAM is accessed, plus ROM I guess (but honestly I never checked whether it is faster in the MSTE than in an ST).

Concretely, a NOP fetched from the cache executes 2x faster. When fetched from ST RAM it looks like this: 500 ns for the fetch, then 250 ns internal. What comes after that depends on the exact cache logic and on whether the next instruction is in the cache: the CPU may fetch the following instruction from the cache immediately, or wait 250 ns and read it from ST RAM (it must fit the 500 ns frame). And since we are talking about a loop, the NOP will execute exactly 2x faster at 16 MHz, and so will the other instructions.

There are plenty of titles that work incorrectly at 16 MHz on the MSTE.
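The effect on a delay loop can be estimated with the two figures quoted in this post: 500 ns for a NOP fetched from ST RAM and 250 ns for one served from the cache. The sketch below is a simplification: it ignores the loop's branch instruction, assumes the whole loop fits in the cache after the first pass, and the function name is hypothetical.

```python
ST_RAM_NOP_NS = 500   # NOP fetched from ST RAM (same as at 8 MHz)
CACHED_NOP_NS = 250   # NOP fetched from the Mega STE cache at 16 MHz


def nop_loop_time_ns(nops_per_pass, passes, cache_enabled):
    """Estimate the duration of a loop of NOPs executed `passes` times."""
    first_pass = nops_per_pass * ST_RAM_NOP_NS   # nothing is cached yet
    per_nop = CACHED_NOP_NS if cache_enabled else ST_RAM_NOP_NS
    later_passes = (passes - 1) * nops_per_pass * per_nop
    return first_pass + later_passes


slow = nop_loop_time_ns(nops_per_pass=10, passes=100, cache_enabled=False)
fast = nop_loop_time_ns(nops_per_pass=10, passes=100, cache_enabled=True)
print(f"without cache: {slow} ns, with cache: {fast} ns, ratio: {slow / fast:.2f}")
```

The first pass still runs at ST RAM speed, so the ratio approaches 2 without reaching it, and a delay loop calibrated for 8 MHz finishes in roughly half the intended time.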

AtariZoll wrote: I would not agree with that. There are 2 kinds of memory in the Mega STE: regular ST RAM and cache RAM. If the CPU accesses an address that is already in the cache, the access is 2x faster than when it is not.

You're right, 16Mhz on MegaSTE actually makes no sense when you don't enable the 16k associative cache (you get only ~3% performance boost I think), because the CPU has to wait for the bus. When enabling the cache it's a different story. It does not give a 100% boost unless you optimize your code to fit in the 16k cache - and it's quite hard since it caches everything, instructions + data - but you get at least a good performance boost. My tests showed that it can be as good as +80%.
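To relate the "+80%" figure to cache behaviour, here is a deliberately crude model in which all execution time is memory access, a cache hit costs half a miss, and nothing else matters. Both function names and the 500/250 ns costs are assumptions for illustration; a real program also spends time on internal CPU work, so this only indicates the trend.

```python
def speedup(hit_rate, miss_ns=500.0, hit_ns=250.0):
    """Speed relative to an all-miss run, for a given cache hit rate (0..1)."""
    average_ns = hit_rate * hit_ns + (1.0 - hit_rate) * miss_ns
    return miss_ns / average_ns


def hit_rate_for(target_speedup, miss_ns=500.0, hit_ns=250.0):
    """Hit rate needed to reach `target_speedup` in the same model."""
    return (miss_ns - miss_ns / target_speedup) / (miss_ns - hit_ns)


print(f"hit rate needed for +80%: {hit_rate_for(1.8):.1%}")
print(f"speedup at a 50% hit rate: {speedup(0.5):.2f}x")
```

In this model the full 2x is only reached with a 100% hit rate, which fits the remark that code and data must both fit in the 16k cache to get the maximum boost.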

And of course, this performance boost comes with a different execution time for instructions that are already in the cache, since the CPU does not have to wait for the bus. So you actually cannot do cycle-exact sync effects in this mode (like fullscreen) because you don't know whether your sync loop is already in the cache or not.

It's also not relevant to disable the cache for a few cycles because when disabled, all data in the cache is invalidated.

Unfortunately no emulator implements the Mega STE cache behavior, so I configure them to 16 MHz, which is the closest I can get. In the case of R-Type DX, of course, I tested it on the real hardware at 16 MHz first, and it hung the same way it does under Steem or Hatari.

fenarinarsa wrote:You're right, 16Mhz on MegaSTE actually makes no sense when you don't enable the 16k associative cache (you get only ~3% performance boost I think), because the CPU has to wait for the bus. When enabling the cache it's a different story. It does not give a 100% boost unless you optimize your code to fit in the 16k cache - and it's quite hard since it caches everything, instructions + data - but you get at least a good performance hit. My tests showed that it can be as good as +80%.

It all depends on circumstances. Typically it is ST-RAM access that holds the CPU back; ROM access and many hardware registers are faster. Most instructions on the 68k are also typically not "slow enough" to really benefit from a faster clock, since they access ST-RAM frequently enough for that to hold them back (i.e. they don't spend too much time inside the CPU "contemplating" the data, with mul/div instructions being the great exception). So yes, a 3-5% speed boost on a game/demo is probably about as good as it gets.

But you get a much better speedboost in TOS than in games/demos running 16MHz without cache, since much is running in ROM, and it is noticeable. Perhaps like ~30% on average in desktop applications.

I do agree though, 16MHz without cache is not worth the effort imho, too much work for very little gain. TT/Fastram change the equation though, but only works with software that is fastram aware, which basically no game is.

Greenious wrote: It all depends on circumstances. Typically it is ST-RAM access that holds the CPU back; ROM access and many hardware registers are faster. Most instructions on the 68k are also typically not "slow enough" to really benefit from a faster clock, since they access ST-RAM frequently enough for that to hold them back (i.e. they don't spend too much time inside the CPU "contemplating" the data, with mul/div instructions being the great exception). So yes, a 3-5% speed boost on a game/demo is probably about as good as it gets.

But you get a much better speedboost in TOS than in games/demos running 16MHz without cache, since much is running in ROM, and it is noticeable. Perhaps like ~30% on average in desktop applications.

I do agree though, 16MHz without cache is not worth the effort imho, too much work for very little gain. TT/Fastram change the equation though, but only works with software that is fastram aware, which basically no game is.

ROM access on the ST is not faster, and neither are HW registers. Actually some are slower - extra wait states. And at 8 MHz it can't be faster, since RAM access is at full speed. Did you perform an actual ROM speed test on a Mega STE? I think that even if ROM works at the same speed as ST RAM, with the cache it will be up to some 80% faster, depending on the code executed. And there is really no point in talking about using 16 MHz on the MSTE without cache - nobody uses it. Maybe some really rare software that crashes with the cache on can benefit a little.

AtariZoll wrote: ROM access on the ST is not faster, and neither are HW registers. Actually some are slower - extra wait states. And at 8 MHz it can't be faster, since RAM access is at full speed. Did you perform an actual ROM speed test on a Mega STE? I think that even if ROM works at the same speed as ST RAM, with the cache it will be up to some 80% faster, depending on the code executed. And there is really no point in talking about using 16 MHz on the MSTE without cache - nobody uses it. Maybe some really rare software that crashes with the cache on can benefit a little.

Well, obviously you'll need to generate a faster DTACK and use faster EPROMs than the ones originally supplied by Atari, but no, the ST-RAM and the rest of the bus work somewhat independently, so you can indeed access ROM and things like TT/Fast/alt-RAM independently and magnitudes faster than 8 MHz. And I did not claim all hardware registers were faster; indeed some, as you point out, are slower. (Even ICD Adspeed back in the day included an option to use "fast rom access", dependent on you fitting fast enough EPROMs of course.)

But swing by the hardware forum and read up on Exxos 16Mhz booster, I'm sure you'll learn a thing or two.

Greenious wrote: Well, obviously you'll need to generate a faster DTACK and use faster EPROMs than the ones originally supplied by Atari, but no, the ST-RAM and the rest of the bus work somewhat independently, so you can indeed access ROM and things like TT/Fast/alt-RAM independently and magnitudes faster than 8 MHz. And I did not claim all hardware registers were faster; indeed some, as you point out, are slower. (Even ICD Adspeed back in the day included an option to use "fast rom access", dependent on you fitting fast enough EPROMs of course.)

But swing by the hardware forum and read up on Exxos 16Mhz booster, I'm sure you'll learn a thing or two.

I think we were talking here about regular Atari models, not about what is possible with diverse accelerators, or furthermore what would be possible... Actually, we had a discussion here (and as I remember Exxos was involved too) about a possible custom MMU, which could make much faster ST RAM possible. Everything is possible with diverse mods and upgrades. Anyway, I will soon do a simple TOS ROM speed test on a Mega STE, using 16 MHz, cache-off mode. If it is only some 10-20% faster, then ROM access is not faster.

Thanks for this. I think I need to do a new release of it. Any tips for some further changes (nothing big)?
