I also took a look at the generated assembly file, and it looks fine to me for all my test programs. The compiler loads 1 into DPTR (presumably the 1 byte I want to allocate with malloc), lcalls the subroutine _malloc, and then moves the returned 3-byte address into r2, r3 and r4. The NULL check is nothing more than three compare-and-jump-if-not-equal instructions against 0, followed by writing 0xf0 to P2, as it should be.
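For reference, here is a minimal sketch of the kind of C test that would compile to the sequence described above. It is a reconstruction, not the exact program: the function name `null_check_test` is hypothetical, and it assumes that 0xf0 on P2 marks the failure (NULL) case, which is how the fall-through after the three compares reads. The `#else` branch is only a stand-in so the sketch also builds on a PC.

```c
#include <stdlib.h>

#ifdef __SDCC
#include <8051.h>               /* SDCC header that declares the P2 port */
#else
static unsigned char P2;        /* host stand-in for the port (assumption) */
#endif

/* Hypothetical test: allocate one byte and flag a failed allocation on P2.
 * Returns 1 if malloc succeeded, 0 if it returned NULL. */
unsigned char null_check_test(void)
{
    unsigned char *p = malloc(1);   /* the "1" loaded into DPTR before lcall _malloc */

    if (p == NULL) {                /* one compare per byte of the 3-byte pointer */
        P2 = 0xF0;                  /* assumed failure indicator */
        return 0;
    }

    free(p);
    return 1;
}
```

Calling `null_check_test()` from `main` and watching P2 should show 0xf0 only when the allocation fails.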
In the other case, the pointer address is loaded, 0xaa is put in the accumulator, and __gptrput is lcalled, as it should be. Then the pointer address is moved into the registers again, followed by an lcall to __gptrget and a move of the accumulator to P2, since __gptrget returns its result in the accumulator.
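The write and read-back described here would correspond to C along the following lines. Again this is only a sketch under assumptions: `write_read_test` is a made-up name, the pointer is assumed to be a generic (3-byte) pointer as the listing suggests, and whether the compiler really emits both helper calls depends on its optimisation settings. The `#else` branch is a host stand-in for P2.

```c
#include <stdlib.h>

#ifdef __SDCC
#include <8051.h>               /* SDCC header that declares the P2 port */
#else
static unsigned char P2;        /* host stand-in for the port (assumption) */
#endif

/* Hypothetical test: write 0xAA through the pointer, read it back, show it on P2.
 * Returns the value read back, or 0 if the allocation failed. */
unsigned char write_read_test(void)
{
    unsigned char *p = malloc(1);
    unsigned char value;

    if (p == NULL) {
        return 0;
    }

    *p = 0xAA;      /* store through the generic pointer (the __gptrput call) */
    value = *p;     /* load through the generic pointer (the __gptrget call) */
    P2 = value;     /* accumulator moved to P2 in the listing */

    free(p);
    return value;
}
```

If the heap is working, P2 should end up showing 0xaa.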