Does the processor have special behaviour for multiple prefetches on the same cache line?

For example:

    mov eax,[ptr]
    prefetchnta [eax]
    prefetchnta [eax+4]
    prefetchnta [eax+8]
    prefetchnta [eax+12]
    prefetchnta [eax+16]
    prefetchnta [eax+20]

Does the load buffer allocate entries for these prefetches?

The redundant prefetches will be canceled just as a redundant load would be. If the code is also doing a large number of loads and stores, the redundant prefetches could still negatively affect performance. They also, of course, consume fetch/decode resources. If your code is memory-latency limited, these extra prefetches might not be a problem.
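If you would rather not issue the redundant prefetches at all, one option is to prefetch once per cache line instead of once per element. The following is only a rough sketch in C, not a tuned implementation. It assumes a 64-byte cache line and a GCC- or Clang-compatible compiler that provides `__builtin_prefetch` (with locality 0, which on x86 typically compiles to `prefetchnta`). The function name `sum_with_prefetch` and the constants `CACHE_LINE_BYTES` and `PREFETCH_LINES_AHEAD` are made up for illustration, and the prefetch distance would need to be measured on your own hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache-line size in bytes; 64 is common on x86 but not guaranteed. */
#define CACHE_LINE_BYTES 64
/* Hypothetical prefetch distance in cache lines; tune by measurement. */
#define PREFETCH_LINES_AHEAD 8

/* Sum an int array, issuing one non-temporal prefetch per cache-line stride
   instead of one per element. */
long sum_with_prefetch(const int *data, size_t n)
{
    const size_t ints_per_line = CACHE_LINE_BYTES / sizeof(int);
    const size_t ahead = PREFETCH_LINES_AHEAD * ints_per_line;
    long total = 0;

    assert(data != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Prefetch only when i starts a new line-sized stride, and only if
           the target element is still inside the array. */
        if (i % ints_per_line == 0 && i + ahead < n) {
            /* rw = 0 (read), locality = 0 (no temporal locality). */
            __builtin_prefetch(&data[i + ahead], 0, 0);
        }
        total += data[i];
    }
    return total;
}
```

Because consecutive prefetch addresses are a full line apart, each one targets a different cache line, so none of them is redundant in the sense discussed above. In a hot loop you would normally unroll by `ints_per_line` rather than test `i % ints_per_line` on every iteration.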