Optimizing the step size in gradient descent: how is it done?

I came across a method from a Netflix Prize participant described as "Aggressive Parametrization". It appears he is applying an optimization technique to the learning rate itself, adjusting it between iterations of a gradient descent algorithm. What are some methods for optimizing the step size this way?
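To make the question concrete, here is a minimal sketch of one well-known heuristic of this kind, the "bold driver" rule: grow the learning rate after a step that decreases the loss, and shrink it (discarding the step) after one that increases it. This is just an illustrative example of adapting the step size between iterations, not necessarily the technique used in the Netflix Prize post; the function names and parameters are my own.

```python
import numpy as np

def bold_driver(f, grad, x, lr=0.1, grow=1.1, shrink=0.5, iters=200):
    """Gradient descent with the 'bold driver' step-size heuristic.

    After each iteration: if the loss decreased, keep the step and
    multiply the learning rate by `grow`; otherwise undo the step
    and multiply the learning rate by `shrink`.
    """
    loss = f(x)
    for _ in range(iters):
        x_new = x - lr * grad(x)
        loss_new = f(x_new)
        if loss_new < loss:
            # Step helped: accept it and be bolder next time.
            x, loss = x_new, loss_new
            lr *= grow
        else:
            # Step overshot: reject it and be more cautious.
            lr *= shrink
    return x

# Toy example: minimize f(x) = ||x - 3||^2, whose minimum is at x = (3, 3).
f = lambda x: np.sum((x - 3.0) ** 2)
g = lambda x: 2.0 * (x - 3.0)
x_min = bold_driver(f, g, np.zeros(2))
```

Other approaches in the same spirit include exact or backtracking line search along the gradient direction, and per-parameter adaptive rates.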