To generate higher frequencies, the algorithm must be as time-efficient as possible, and the guys at scienceprog managed to get the loop down to 10 CPU cycles. I can't see how they made it so short, so I would really like to understand it.
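To put the 10 cycles in perspective, here is a rough sketch of the arithmetic. It assumes the loop writes one output sample per pass, which I have not confirmed from the scienceprog code. The helper name `sample_rate_hz` and the 16 MHz clock used below are my own illustration, not taken from their project.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: output samples per second for a loop that
   spends `cycles_per_sample` CPU cycles on each output sample.
   Assumes exactly one sample is written per loop pass. */
static uint32_t sample_rate_hz(uint32_t f_cpu_hz, uint32_t cycles_per_sample)
{
    assert(cycles_per_sample > 0);  /* guard against division by zero */
    return f_cpu_hz / cycles_per_sample;
}
```

With an assumed 16 MHz clock, `sample_rate_hz(16000000UL, 10)` gives 1600000 samples per second, while a 20-cycle loop would give only 800000. That is why every cycle saved in the loop matters.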

I have made some progress in understanding it. The %0, %1, etc. are placeholders for the variables passed in from the C code to the assembly: %0 stands for the first entry in the operand list, %1 for the second, and so on.

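The line "add r18, %0" is the same as "add r18, ad0", because ad0 is the first variable in the list at the bottom of the assembly. Strictly speaking, the compiler substitutes the register that holds ad0, since the AVR add instruction only works on registers.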
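To check my understanding of what that single line does, here is a small host-side model in plain C. It is not AVR code, and the function name `avr_add` is made up for illustration. It only mimics the documented behaviour of the AVR add instruction: the destination register gets the 8-bit sum and the carry flag is set on overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model of the AVR instruction "add Rd, Rr":
   Rd <- Rd + Rr, truncated to 8 bits. Returns the carry flag
   (1 if the true sum did not fit in 8 bits, otherwise 0). */
static uint8_t avr_add(uint8_t *rd, uint8_t rr)
{
    assert(rd != 0);                 /* need a valid "register" to update */
    uint16_t sum = (uint16_t)*rd + rr;
    *rd = (uint8_t)sum;              /* keep only the low 8 bits */
    return (uint8_t)(sum >> 8);      /* bit 8 of the sum is the carry */
}
```

For example, with `r18 = 200` and `ad0 = 100`, calling `avr_add(&r18, ad0)` leaves 44 in `r18` and returns a carry of 1, because 300 does not fit in 8 bits.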