I'd like to sample the elements of a symmetric square matrix uniformly. For example, for an $N\times N$ matrix, I'd like to keep only $\alpha$% of the matrix elements to build a sparse matrix, while preserving the symmetry property.

A simple method would be to scan the upper-right part of the matrix, generate for each element a random number uniformly between 0 and 1, compare it to $\alpha/2$ to accept or reject the element, and symmetrize the matrix at the end.
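A minimal sketch of this baseline, using only Python's standard library. The function name `sample_upper_triangle` is hypothetical, and the acceptance probability `p` is left as a parameter rather than fixed to a particular threshold; the function returns the set of kept index pairs instead of building an actual matrix.

```python
import random

def sample_upper_triangle(n, p, rng):
    """Scan the upper triangle (diagonal included), keep each entry with
    probability p, and mirror every kept entry across the diagonal."""
    kept = set()
    for i in range(n):              # every row is visited in this scheme
        for j in range(i, n):       # row i only contributes n - i candidates
            if rng.random() < p:
                kept.add((i, j))
                kept.add((j, i))    # symmetrize
    return kept

pattern = sample_upper_triangle(6, 0.3, random.Random(0))
```

Note that the last row contributes a single candidate, `(n - 1, n - 1)`, which is exactly the inefficiency described next.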

However, in my case, sampling a row is costly while sampling a column is not: I'd like to minimize the number of rows that I visit. In the strategy above, every row is visited, yet the last row has only a single element checked. This is not optimal, since I need to access the last row (which is very costly) to decide on a single element. I'd rather discard as many rows as possible if the symmetry property ensures that they will be covered anyway.

Hence, a solution could be to randomly choose $\sqrt{\alpha} N$ rows, sample each element of these rows with probability $\sqrt\alpha$, and symmetrize afterwards. However, I am not sure this would produce a uniform random sampling similar to the one I would obtain with the first strategy.
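For concreteness, the proposed strategy could be sketched as follows. The name `sample_row_subset` is hypothetical, and rounding $\sqrt{\alpha} N$ to the nearest integer is an assumption of this sketch.

```python
import math
import random

def sample_row_subset(n, alpha, rng):
    """Visit about sqrt(alpha) * n randomly chosen rows, keep each entry of
    a visited row with probability sqrt(alpha), then symmetrize."""
    root = math.sqrt(alpha)
    rows = rng.sample(range(n), round(root * n))  # the only rows visited
    kept = set()
    for i in rows:
        for j in range(n):
            if rng.random() < root:
                kept.add((i, j))
                kept.add((j, i))  # symmetrize
    return rows, kept

rows, pattern = sample_row_subset(10, 0.25, random.Random(1))
```

In this sketch, an entry whose row and column indices both fall outside the chosen rows is never kept.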

What would be the best strategy while maintaining the uniform sampling property?

Please don't hesitate to tell me if the question is not clear enough, or if it should rather be asked on StackOverflow. Thanks!

1 Answer

Write down the $N^2$ pairs of integers $(n,m)$, with both $n$ and $m$ ranging from $1$ to $N$, and randomly select $\alpha$% of them; these are the nonzero elements $M_{nm}$ of your sparse matrix. Then evaluate the row with the largest number of nonzero elements and reflect it in the diagonal to get the corresponding column. Continue with the row that now has the largest number of nonzero elements still to be evaluated, reflect it in the diagonal, and so on until you have evaluated all nonzero elements.
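One possible reading of this procedure, as a sketch in standard-library Python. The names `select_pairs` and `greedy_row_order` are hypothetical, indices run from 0 rather than 1, and it is assumed that visiting row $r$ settles every selected pair that has $r$ as its row or column index (by symmetry).

```python
import random
from collections import defaultdict

def select_pairs(n, alpha, rng):
    """Pick round(alpha * n * n) of the n * n index pairs uniformly,
    without replacement."""
    k = round(alpha * n * n)
    return {divmod(flat, n) for flat in rng.sample(range(n * n), k)}

def greedy_row_order(pairs):
    """Return the rows to visit: repeatedly take the row that settles the
    most not-yet-settled selected entries."""
    # pending[r] = indices c such that the pair {r, c} is still unsettled.
    pending = defaultdict(set)
    for i, j in pairs:
        pending[i].add(j)
        pending[j].add(i)
    visited = []
    while pending:
        row = max(pending, key=lambda r: len(pending[r]))
        if not pending[row]:
            break                        # nothing left to settle
        visited.append(row)
        for other in list(pending[row]):
            pending[other].discard(row)  # settled via the reflection
        del pending[row]
    return visited

pairs = select_pairs(6, 0.25, random.Random(0))
rows_to_visit = greedy_row_order(pairs)
```

Every selected pair ends up with at least one of its two indices in `rows_to_visit`. The greedy choice is a heuristic for keeping that list short and is not guaranteed to give the minimum number of rows.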

Note that diagonal elements appear with one half the probability of the off-diagonal elements, which may or may not be what you want; if it is not, just give the pairs with $n=m$ double the weight when selecting the nonzero elements.
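One way to apply such a weight is weighted sampling without replacement using random keys (the scheme of Efraimidis and Spirakis): each pair receives the key $u^{1/w}$ with $u$ uniform on $(0,1)$ and $w$ its weight, and the pairs with the largest keys are kept. This is a swapped-in technique rather than something the answer prescribes. The function name `select_pairs_weighted` and the parameter `diag_weight` are hypothetical, and the sketch enumerates all $N^2$ pairs, so it only suits moderate $N$.

```python
import heapq
import random

def select_pairs_weighted(n, alpha, rng, diag_weight=2.0):
    """Select round(alpha * n * n) index pairs without replacement,
    giving diagonal pairs weight diag_weight and all others weight 1."""
    k = round(alpha * n * n)
    keyed = []
    for i in range(n):
        for j in range(n):
            w = diag_weight if i == j else 1.0
            # Random key u ** (1 / w): a larger weight pushes the key toward 1.
            keyed.append((rng.random() ** (1.0 / w), (i, j)))
    # Keep the k pairs with the largest keys.
    return {pair for _, pair in heapq.nlargest(k, keyed)}

pairs = select_pairs_weighted(5, 0.2, random.Random(3))
```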