In Piotr Indyk's presentation on how the sFFT works (I very much liked the bins-and-balls explanation), this was the first time I heard about a Hadamard-based algorithm with a good running time. I asked Piotr about it, and here are the references for that work:

Alas, it is deeply buried in there. A good explanation is given in: O. Goldreich. Modern Cryptography, Probabilistic Proofs and Pseudorandomness. Algorithms and Combinatorics, 17, 1999, Appendix C.2.3, Improved Implementation of Algorithm. However, to the best of my knowledge, there is no actual implementation (i.e., code) available.
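For reference, the dense baseline that such sparse Hadamard algorithms improve upon is the $O(n \log n)$ fast Walsh-Hadamard transform. Here is a minimal sketch of my own (not code from the book, just the standard butterfly recursion):

```python
import numpy as np

def fwht(a):
    """Fast Walsh-Hadamard transform (unnormalized), O(n log n) for n a
    power of two -- the dense baseline that sparse Hadamard algorithms
    improve on when only a few coefficients are large."""
    a = np.asarray(a, dtype=float).copy()
    n = len(a)
    h = 1
    while h < n:
        # Butterfly step: combine pairs of blocks of size h.
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

# Sanity check: the Hadamard matrix squares to n times the identity,
# so applying the transform twice recovers n * x.
x = np.array([1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0])
assert np.allclose(fwht(fwht(x)), len(x) * x)
```

The sparse algorithms referenced above avoid touching all $n$ entries by querying the signal adaptively to isolate the few large Walsh coefficients.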

Given an $n$-length input signal $\mathbf{x}$, it is well known that its Discrete Fourier Transform (DFT), $\mathbf{X}$, can be computed in $O(n \log n)$ complexity using a Fast Fourier Transform (FFT). If the spectrum $\mathbf{X}$ is exactly $k$-sparse (where $k \ll n$), can we do better? We show that asymptotically in $k$ and $n$, when $k$ is sub-linear in $n$ (precisely, $k \propto n^{\delta}$ for some $0 < \delta < 1$), and the support of the non-zero DFT coefficients is uniformly random, we can exploit this sparsity in two fundamental ways: (i) sample complexity: we need only $M = O(k)$ deterministically chosen samples of the input signal $\mathbf{x}$ (with $M < 4k$ for $\delta < 0.99$); and (ii) computational complexity: we can reliably compute the DFT $\mathbf{X}$ using $O(k \log k)$ operations, where the constants in the big-O are small and are related to the constants involved in computing a small number of DFTs of length approximately equal to the sparsity parameter $k$. Our algorithm succeeds with high probability, with the probability of failure vanishing to zero asymptotically in the number of samples acquired, $M$. Our approach is based on filterless subsampling of the input signal $\mathbf{x}$ using a small set of carefully chosen uniform subsampling patterns guided by the Chinese Remainder Theorem (CRT). The idea is to cleverly exploit, rather than avoid, the resulting aliasing artifacts induced by this subsampling. Specifically, our subsampling operation on $\mathbf{x}$ is designed to create aliasing patterns on the spectrum $\mathbf{X}$ that "look like" parity-check constraints of good erasure-correcting sparse-graph codes. We show how computing the sparse DFT $\mathbf{X}$ is equivalent to the decoding of these sparse-graph codes. These codes further allow for fast peeling-style decoding, similar to that of fountain codes. The resulting DFT computation is low in both sample complexity and decoding complexity. We accordingly dub our algorithm the FFAST (Fast Fourier Aliasing-based Sparse Transform) algorithm. We analytically connect our proposed CRT-based aliasing framework to random sparse-graph codes, and analyze the performance of our algorithm using density evolution techniques from coding theory. Additionally, our approach uncovers a new deterministic and efficient class of compressive sensing measurement matrices as an alternative to the popular choice of random matrices.
Based on the qualitative nature of the subsampling patterns needed, our analysis is decomposed into two regimes: (i) "very-sparse" ($0 < \delta \leq 1/3$), where $M < 2.5k$, and (ii) "less-sparse" ($1/3 < \delta < 1$), where $M < 4k$ for $\delta < 0.99$. While our theory is asymptotic, our simulation results reveal that in practice our algorithm works even for moderate values of $k$ and $n$, e.g., $k = 200$ and $n = 4080$. Further, when $k = 300$ and $n$ is approximately 3.8 million, our algorithm achieves computational savings by a factor of more than 6000, and savings in the number of input samples by a factor of more than 3900, over the standard FFT. We analyze the 1-D DFT here, but our approach can be extended in a straightforward manner to multi-dimensional DFTs. Further, while our analysis is for the exactly sparse case, our approach can be extended to be robust to the more general approximately sparse and noisy measurement cases as well, albeit at the cost of increased sample and computational complexity. We provide extensive simulation results in Section VIII that corroborate our theoretical findings, and validate the empirical performance of the FFAST algorithm for a wide variety of 1-D and 2-D DFT settings for signals having an exactly or approximately sparse Fourier spectrum.
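To get a feel for the aliasing idea at the heart of FFAST, here is a small numpy sketch (my own toy illustration, not the authors' code) showing that the DFT of a uniformly subsampled signal is exactly the folded, i.e. aliased, version of the full spectrum; FFAST engineers several such foldings, with coprime factors chosen via the CRT, so that the folded bins act like parity checks that can be peeled off:

```python
import numpy as np

# Subsampling x by a factor d folds the length-n spectrum X into n/d
# aliased bins: each bin of the short DFT is (1/d times) the sum of
# the X[k] that land on it.
n, d = 12, 3                  # toy sizes; d must divide n
rng = np.random.default_rng(0)
x = rng.standard_normal(n)

X = np.fft.fft(x)             # full length-n DFT
X_sub = np.fft.fft(x[::d])    # DFT of the subsampled signal, length n/d

# Fold the spectrum: bin j collects X[j], X[j + n/d], X[j + 2n/d], ...
folded = X.reshape(d, n // d).sum(axis=0) / d

assert np.allclose(X_sub, folded)
```

With a sparse $\mathbf{X}$, most folded bins contain zero or one non-zero coefficient, which is what makes the peeling decoder in the paper work.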

Kannan tells me that in the next version of their paper, sFFT will be included in the comparison, and that an implementation should be out by the end of the summer. Woohoo!

Finally, I note that the talk by Mike Mahoney on Revisiting the Nystrom Method for Improved Large-Scale Machine Learning marks the rebirth of randomization thanks to the use of theoretically stronger mathematical results (Johnson-Lindenstrauss), and is behind the Randomized Numerical Linear Algebra movement (the RandNLA tag on Nuit Blanche). During the presentation, I think I heard Mike say that, because of rounding errors, the randomized approach had better accuracy than full LAPACK, which reminded me of this entry: Slowly but surely they'll join our side of the Force... Anyway, I have seen most of the figures of Mike's presentation in the following two papers:

We reconsider randomized algorithms for the low-rank approximation of symmetric positive semi-definite (SPSD) matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods based on leverage scores. We complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projection methods. These bounds are qualitatively superior to existing bounds---e.g., improved additive-error bounds for the spectral and Frobenius norm errors and relative-error bounds for the trace norm error.
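For readers who want to play with the basic idea, here is a toy sketch of my own (using plain uniform column sampling rather than the leverage-score schemes studied in the paper) of the Nystrom low-rank approximation of an SPSD kernel matrix:

```python
import numpy as np

# Nystrom sketch: approximate an SPSD matrix K from l sampled columns as
# K ~ C @ pinv(W) @ C.T, where C = K[:, idx] and W = K[idx][:, idx].
rng = np.random.default_rng(0)
n, l = 200, 40

# Build an SPSD matrix with fast-decaying spectrum: an RBF kernel
# on random 1-D points (an illustrative choice, not data from the paper).
pts = np.sort(rng.standard_normal(n))
K = np.exp(-(pts[:, None] - pts[None, :]) ** 2)

idx = rng.choice(n, size=l, replace=False)   # uniform column sampling
C = K[:, idx]
W = K[np.ix_(idx, idx)]

# rcond truncates tiny singular values of W for numerical stability.
K_nys = C @ np.linalg.pinv(W, rcond=1e-10) @ C.T

rel_err = np.linalg.norm(K - K_nys, 'fro') / np.linalg.norm(K, 'fro')
```

Uniform sampling already does well when the matrix has rapidly decaying spectrum, as here; the paper's point is that leverage-score-based sampling and projections come with much stronger worst-case guarantees.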