It is used within a Manipulate[], so I'm trying to speed it up a bit. It returns a list of points, one for each step the gradient descent takes (i.e. {{1, 2}, {3, 2}, ...}). I understand that using For[] is not optimal, but I'm not sure how to replace it.
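For reference, here is a minimal sketch of the kind of For[]-based routine described above; the function name, the fixed step size gamma, and the iteration count are assumptions for illustration, not the original code:

    (* fixed-step gradient descent, collecting the iterates with For[]/AppendTo;
       f is an expression in vars, e.g. gradientDescentFor[x^2 + y^2, {x, y}, {1., 2.}] *)
    gradientDescentFor[f_, vars_List, x0_List, gamma_ : 0.1, n_ : 50] :=
     Module[{grad, pt = N[x0], pts, i},
      grad = Function[Evaluate[vars], Evaluate[D[f, {vars}]]]; (* symbolic gradient, built once *)
      pts = {pt};
      For[i = 1, i <= n, i++,
       pt = pt - gamma (grad @@ pt); (* constant "learning rate" step *)
       AppendTo[pts, pt]];
      pts]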

Question: Is there a better, cleaner, and/or faster way to do this in Mathematica?
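(One idiomatic replacement for the For[] loop, at least with a fixed step size, is NestList, which builds the list of iterates directly; a sketch along the lines of the code above:)

    gradientDescentNest[f_, vars_List, x0_List, gamma_ : 0.1, n_ : 50] :=
     Module[{grad},
      grad = Function[Evaluate[vars], Evaluate[D[f, {vars}]]];
      NestList[# - gamma (grad @@ #) &, N[x0], n]]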

Also, the function seems to work for most functions, such as $x^2+y^2$ and $\sin x\cos y$, but it fails for a linear error function like the one below. The points grow rapidly, reaching the order of $10^{80}$ in only 10 steps. Why does it not work for this error function?

The code behind that link uses numerical differentiation. Apart from that, there is seemingly no difference. But that does not make it work unless one uses tiny(!) step sizes (aka learning rates).
– Henrik Schumacher, Nov 7 '17 at 7:03

1 Answer

For neural networks, one often prescribes a "learning rate", i.e. a constant step size. It is quite well known in optimization circles that this is a very, very bad idea, since the gradient alone does not tell you how far you can travel without ascending the objective function (we want to descend!).
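A one-dimensional example makes this concrete: for $f(x) = \frac{a}{2}x^2$ with $a > 0$, the fixed-step iteration reads $x_{k+1} = x_k - \gamma\, a\, x_k = (1 - \gamma a)\, x_k$, which diverges geometrically as soon as $\gamma > 2/a$. A learning rate that is perfectly fine for one objective can therefore blow up on another; this is exactly the growth to the order of $10^{80}$ observed in the question.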

In the following, I show you an implementation of gradient descent with "Armijo step size rule with quadratic interpolation", applied to a linear regression problem. (Actually, with regression problems, it is often better to use the Gauss-Newton method.)
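For a model that is linear in its parameters, a single Gauss-Newton step already solves the problem exactly, since it reduces to linear least squares; in Mathematica one can then simply call a built-in such as Fit (hypothetical data):

    Fit[{{0., 1.1}, {1., 2.9}, {2., 5.2}}, {1, x}, x]  (* best-fit line c1 + c2 x *)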

This is the code for the steepest descent. One has to supply an objective function f and a function generating its differential:
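What follows is a minimal sketch of such a routine; the name armijoDescent, the Armijo constant sigma, and the Clip safeguard on the interpolated step are illustrative choices, not a definitive implementation. Here f maps a parameter vector to a number, and Df maps it to the gradient vector:

    armijoDescent[f_, Df_, x0_List, maxIter_ : 1000, tol_ : 1.*^-8] :=
     Module[{x = N[x0], g, u, du, t, tNew, fx, sigma = 1.*^-4, path},
      path = {x};
      Do[
       g = Df[x];
       If[Norm[g] < tol, Break[]];
       u = -g; du = g . u; (* directional derivative; negative along a descent direction *)
       fx = f[x]; t = 1.;
       (* Armijo rule: accept t only if it yields sufficient decrease; otherwise
          replace t by the minimizer of the quadratic interpolant of s -> f[x + s u],
          safeguarded by Clip so that t cannot collapse to 0 *)
       While[f[x + t u] > fx + sigma t du,
        tNew = -du t^2/(2. (f[x + t u] - fx - t du));
        t = Clip[tNew, {0.1 t, 0.9 t}]];
       x = x + t u;
       AppendTo[path, x],
       {maxIter}];
      path]

For instance, armijoDescent[#.# &, 2 # &, {1., 2.}] returns the list of iterates from {1., 2.} to the minimizer {0., 0.}.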

Setting up the fidelity function and its derivative. It is a good idea to do that outside of any loop, since this involves symbolic computations, which tend to be rather slow; so we perform them only once. (With big data, symbolic computations should be avoided entirely. In that case, one should implement the derivative in a cleverer way.)
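A sketch of such a setup for the linear model $m x + c$; the synthetic data, the variable names, and the vector-argument wrappers feeding armijoDescent above are all assumptions for illustration:

    SeedRandom[1];
    data = Table[{x, 2. x + 1. + RandomReal[{-0.1, 0.1}]}, {x, 0., 1., 0.05}];

    fidelity = Total[(m First[#] + c - Last[#])^2 & /@ data]; (* expression in m, c *)
    gradient = D[fidelity, {{m, c}}];                         (* symbolic gradient, computed once *)

    f = Function @@ {{m, c}, fidelity};   (* pure functions, built outside the loop *)
    Df = Function @@ {{m, c}, gradient};

    path = armijoDescent[f @@ # &, Df @@ # &, {0., 0.}];
    Last[path] (* close to the true parameters {2., 1.} *)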
