To make up for my earlier response, I have written a small timing routine
that generates a 1000x1000 matrix of uniformly random values between 0
and 1 (using cvxopt), then runs each of the suggested approaches 10,000
times with randomly generated diagonal vectors.
I have a slow system, but the averaged results are:
0.8 seconds to run David's striding code (in-place modification)
6 seconds to run Anne's code
18 seconds to run a for-loop
This is an impressive speedup: the striding code is more than 20 times
faster than the for-loop.
For smaller matrices (100x100) there is still about an order of magnitude
between the for-loop and the suggested setdiag.
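
For anyone who wants to try a comparison of this kind without cvxopt,
here is a minimal sketch using plain numpy and timeit. It is not the
timer I ran. The three function names are hypothetical, and the strided
and fancy-indexing versions are only my assumption of what a striding
approach and an index-based approach might look like; they may differ
from the code David and Anne posted. It also reuses a single diagonal
vector and a smaller repeat count to keep the run short.

```python
import timeit

import numpy as np


def set_diag_strided(a, d):
    # Assumed stand-in for a striding approach: in an n x n array the
    # diagonal is every (n+1)-th element in flattened (row-major) order.
    a.flat[::a.shape[1] + 1] = d


def set_diag_fancy(a, d):
    # Assumed stand-in for an index-based approach: integer-array indexing.
    idx = np.arange(a.shape[0])
    a[idx, idx] = d


def set_diag_loop(a, d):
    # Plain Python for-loop, one element at a time.
    for i in range(a.shape[0]):
        a[i, i] = d[i]


def time_all(n=1000, repeats=100, seed=0):
    # Returns total seconds per method for `repeats` calls on one n x n matrix.
    rng = np.random.default_rng(seed)
    a = rng.random((n, n))  # uniform values in [0, 1)
    d = rng.random(n)       # one random diagonal vector, reused for every call
    results = {}
    for func in (set_diag_strided, set_diag_fancy, set_diag_loop):
        results[func.__name__] = timeit.timeit(lambda: func(a, d), number=repeats)
    return results


if __name__ == "__main__":
    for name, seconds in time_all().items():
        print(f"{name}: {seconds:.4f} s")
```

All three functions modify the matrix in place, so absolute timings
will depend on the machine and on the numpy version.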
Code available upon request (requires cvxopt).
Cheers,
Mclean