Thank you - this thread is the result of me wanting to look into some of Helgef's remarks.
I can always help you split the topics if you need that. The same goes for editing the first post if you want to add new stuff.

- The most practical solution, I think, everything considered, is this:
- We move this thread to a different forum, Scripts and Functions perhaps.
- It's too early to try to make this a 'tidy' thread. The posts will keep being changed and added to for some time to come.
- At some future date, I could think about consolidating them in one place, in a perfected form.
- I will start a new tutorial, soon, explicitly regarding how to speed up scripts. It would be a one-post wonder, that I would continue updating from time to time. It would summarise everything that I'd discovered from doing benchmark tests.
- To maintain some order in this thread, the OP will list/link to each benchmark test.

- I remembered seeing a nice QPC function when reading through threads, although it turns out it wasn't in this thread. It's a function by Helgef, based on a function by wolf_II.
[QPC function, returns milliseconds]
Anagrams - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 64#p158464
[QPC function, returns seconds]
Code Puzzle Thread - Page 3 - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 46#p186646
- I have added my version of the function to 'ASSUME ALWAYS' in the OP. It returns milliseconds. It's basically the same as Helgef's.

[classic AHK knowledge]
benchmark tests - append text (with/without prior VarSetCapacity)
If you don't prepare the capacity first, appending is around 8 times slower (this is very much dependent on how many times expansion/copying is necessary).
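A minimal sketch of that comparison, assuming AutoHotkey v1.1. It is not the original test script: the loop count, the appended text and the variable names are all assumed for illustration, and A_TickCount is used for brevity (it is coarse, hence the large loop count).

```autohotkey
#NoEnv
SetBatchLines, -1

vNum := 2000000               ; number of appends (assumed value)
vText := "abcdefghij"         ; text appended each time (assumed value)
vSize := A_IsUnicode ? 2 : 1  ; bytes per character in this build

; test 1: no prior VarSetCapacity, the variable is expanded as needed
vTickCount1 := A_TickCount
Loop, % vNum
	vOutput1 .= vText
vTickCount1 := A_TickCount - vTickCount1

; test 2: reserve the final size (in bytes) before appending
VarSetCapacity(vOutput2, vNum * StrLen(vText) * vSize)
vTickCount2 := A_TickCount
Loop, % vNum
	vOutput2 .= vText
vTickCount2 := A_TickCount - vTickCount2

MsgBox, % "without VarSetCapacity: " vTickCount1 " ms`nwith VarSetCapacity: " vTickCount2 " ms"
```

The ratio between the two figures will vary with the loop count and text length, since those determine how often the variable has to be expanded.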

- I had thought that the for loop might be faster. Why did we get different results? (Did you perform the tests twice, swapping the order? Did you perform the operation more times?) Why are earlier tests faster within the same script run? Should we have a delay after a hotkey is executed, or a dummy test?
- I've tried to incorporate every piece of advice I've been given. So getting incorrect results at this stage is really facepalm.
- Btw how do x64/x32 compare generally?
- Also I've been wondering about 'CPU' benchmark tests, or other measures, i.e. the method that takes less of a toll on the system.
- Yes, Helgef, it's milliseconds. I've now corrected it everywhere in this thread. [@nnnik: could you edit your post to make it say 'benchmark tests (ms)', thanks.]
QueryPerformanceCounter MSDN page:
Retrieves the current value of the performance counter, which is a high resolution (<1us) time stamp that can be used for time-interval measurements.
...
A pointer to a variable that receives the current performance-counter value, in counts.
QueryPerformanceFrequency MSDN page:
A pointer to a variable that receives the current performance-counter frequency, in counts per second.
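Going by those two descriptions, counts divided by counts per second gives seconds, and multiplying by 1000 gives milliseconds. A minimal sketch, assuming AutoHotkey v1.1; the function name QPC_Ms is made up here, and this is not the exact function from the OP or from the linked posts.

```autohotkey
vStart := QPC_Ms()
Sleep, 100
MsgBox, % "elapsed: " (QPC_Ms() - vStart) " ms"
return

; hypothetical helper: returns the performance counter as milliseconds
QPC_Ms()
{
	static vFreq := 0
	vCount := 0
	if !vFreq
		DllCall("QueryPerformanceFrequency", "Int64*", vFreq)  ; counts per second
	DllCall("QueryPerformanceCounter", "Int64*", vCount)      ; counts
	return vCount / vFreq * 1000                               ; seconds * 1000 = milliseconds
}
```

The frequency is stored in a static variable because it only needs to be retrieved once per script run.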

- I'd be interested in any benchmark tests re. optimising DllCall results, e.g. specifying dll name or not, specifying .dll or not, specifying W/A or not, using Int or "Int" and anything else. I haven't been able to get any clear-cut results so far.
- The documentation does mention using LoadLibrary and using a function address (for dlls that aren't pre-loaded).
DllCall
https://autohotkey.com/docs/commands/DllCall.htm#load
- Also, any good example dlls/functions for testing would be helpful, it's hard to find the best ones for testing.
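One possible shape for such a test, comparing a call by name against a call by function address. This is a sketch, assuming AutoHotkey v1.1; MulDiv is just the example function used on the documentation page, and the loop count and variable names are assumed. kernel32 is already loaded, so GetModuleHandle is used; for a dll that isn't loaded yet, LoadLibrary would take its place.

```autohotkey
#NoEnv
SetBatchLines, -1

vNum := 1000000  ; number of calls per test (assumed value)
vFreq := vStart := vEnd := 0
DllCall("QueryPerformanceFrequency", "Int64*", vFreq)

; test 1: function specified by name
DllCall("QueryPerformanceCounter", "Int64*", vStart)
Loop, % vNum
	DllCall("MulDiv", "Int", 3, "Int", 4, "Int", 3)
DllCall("QueryPerformanceCounter", "Int64*", vEnd)
vTime1 := (vEnd - vStart) / vFreq * 1000

; test 2: function address looked up once, then reused
pMulDiv := DllCall("GetProcAddress", "Ptr", DllCall("GetModuleHandle", "Str", "kernel32", "Ptr"), "AStr", "MulDiv", "Ptr")
DllCall("QueryPerformanceCounter", "Int64*", vStart)
Loop, % vNum
	DllCall(pMulDiv, "Int", 3, "Int", 4, "Int", 3)
DllCall("QueryPerformanceCounter", "Int64*", vEnd)
vTime2 := (vEnd - vStart) / vFreq * 1000

MsgBox, % "by name: " vTime1 " ms`nby address: " vTime2 " ms"
```

The same skeleton could be reused for the other variations (dll name or not, W/A suffix or not, quoted or unquoted types) by changing only the DllCall line inside each loop.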

- Also, doing a benchmark test, with the exact same code being tested twice in a row, with one being faster than the other, is concerning.
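One way to probe this is to run a discarded warm-up pass first, and then run each test in both positions. A sketch, assuming AutoHotkey v1.1; TestA, TestB and QPC_Ms are placeholder names, and the test bodies are arbitrary stand-ins for the code being compared.

```autohotkey
#NoEnv
SetBatchLines, -1

; warm-up passes, results discarded
TestA()
TestB()

; each test appears once in an early and once in a late position
oTests := ["TestA", "TestB", "TestB", "TestA"]
vOutput := ""
for _, vName in oTests
{
	vStart := QPC_Ms()
	%vName%()  ; dynamic function call
	vOutput .= vName ": " Round(QPC_Ms() - vStart, 3) " ms`n"
}
MsgBox, % vOutput
return

TestA()  ; placeholder test body
{
	Loop, 1000000
		vTemp := A_Index + 1
}

TestB()  ; placeholder test body
{
	while (A_Index <= 1000000)
		vTemp := A_Index + 1
}

QPC_Ms()  ; hypothetical helper: performance counter in milliseconds
{
	static vFreq := 0
	vCount := 0
	if !vFreq
		DllCall("QueryPerformanceFrequency", "Int64*", vFreq)
	DllCall("QueryPerformanceCounter", "Int64*", vCount)
	return vCount / vFreq * 1000
}
```

If the two runs of the same test still differ noticeably after the warm-up, the difference is more likely noise than a property of the code being tested.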

Use quotes for the types to avoid creating unnecessary variables. #NoEnv is important when omitting quotes for types, as documented. Also, if DllCall performance is an issue, AHK_H has DynaCall, which is generally faster, if I am not mistaken.

@nnnik, I think script code is a little too high level for branch prediction to have this effect.

@Helgef yes, it is. However, AutoHotkey itself will speed up after the first test. (The low-level code that handles all sorts of stuff.)
That's consistent with our findings that the 1st test is slower than the 2nd test of the same type, no matter what happens.
Depending on which things we do, our branch prediction might already be trained for specific actions in AutoHotkey, while it isn't trained for others.

- Note: All of the variables mentioned in WINDOWS FOLDER LOCATIONS, here:
jeeswg's Explorer tutorial - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=7&t=31755
appeared in the list of slow variables, apart from A_ComSpec and A_WinDir.
- Note: Clearly tests of file/registry loop variables should be done within loops for more useful results.
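A sketch of what such an in-loop test could look like, assuming AutoHotkey v1.1.21+ for the 'Loop, Files' syntax. The folder, the variable being read and the variable names are assumptions for illustration; swap in whichever A_LoopFileXXX variable is of interest.

```autohotkey
#NoEnv
SetBatchLines, -1

Loop, Files, % A_WinDir "\*", F
{
	; read one file loop variable 1000 times for the current file
	vTickCount := A_TickCount
	Loop, 1000
		vTemp := A_LoopFileLongPath
	vTickCount := A_TickCount - vTickCount
	MsgBox, % A_LoopFileName "`n1000 reads of A_LoopFileLongPath: " vTickCount " ms"
	break  ; one file is enough for a quick check
}
```

A_TickCount is coarse, so this only separates clearly slow variables from fast ones; a QPC-based timer would be needed for finer comparisons.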

- I came across 5 A_LoopFileXXX variables that took more than 1 second for 1000 reads in the file loop.
- I found no A_LoopRegXXX variables that took more than 1 second for 1000 reads in the registry loop.
- Slower variables in the file loop:

- A_LoopFileLongPath and A_LoopFileShortPath are addressed in this AHK test build:
Test build - Obj.Count(), OnError(), long paths, experimental switch-case - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 24&t=47682
- My code here essentially recreates the logic used in an AHK file loop. Most of the key file data is placed into a WIN32_FIND_DATA struct, which the A_LoopFileXXX variables no doubt refer to. AHK has to convert those UTC dates to local dates, which probably explains the relative slowness.
259-char path limit workarounds - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=26170
- (I had proposed A_LoopFileTimeModifiedUTC and A_LoopFileTimeCreatedUTC variables (and A_LoopFileTimeAccessedUTC for completeness), as they would be faster, but also more useful, when comparing files across DST and time zone differences.)
- (Similarly, an A_LoopFileAttribValue variable would be faster than A_LoopFileAttrib, as you'd get the raw number from the struct instead of converting the number to letters (from the list 'RASHNDOCT'), and more useful, as any unusual properties would also be listed.)
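To illustrate the number-to-letters conversion mentioned in the last point, here is a sketch, assuming AutoHotkey v1.1.17+ (for Format). The function name AttribValueToLetters is made up, and the raw number is fetched with the GetFileAttributes API, since A_LoopFileAttribValue is only a proposal. The flag values are the standard Windows file attribute constants.

```autohotkey
vAttrib := DllCall("GetFileAttributes", "Str", A_WinDir, "UInt")
MsgBox, % Format("0x{:X}", vAttrib) " -> " AttribValueToLetters(vAttrib) "`nFileExist: " FileExist(A_WinDir)
return

; hypothetical helper: converts a raw attribute number to letters from 'RASHNDOCT'
AttribValueToLetters(vAttrib)
{
	; letters in 'RASHNDOCT' order, paired with their FILE_ATTRIBUTE_XXX flag values
	static oMap := [["R",0x1], ["A",0x20], ["S",0x4], ["H",0x2], ["N",0x80], ["D",0x10], ["O",0x1000], ["C",0x800], ["T",0x100]]
	vOutput := ""
	for _, oPair in oMap
		if (vAttrib & oPair[2])
			vOutput .= oPair[1]
	return vOutput
}
```

Any bits outside those nine flags are visible in the raw number but are dropped by the conversion to letters, which is the extra information the raw value would preserve.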