In Newton's method, one computes the gradient of a cost function (the 'slope') as well as its Hessian matrix (i.e., the second derivative of the cost function, or 'curvature'). I understand the intuition that the less 'curved' the cost landscape is at some specific weight, the bigger the step we want to take across the terrain. (This is why it is somewhat superior to simple gradient ascent/descent.)
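
A minimal sketch of that step-size intuition in one dimension, assuming a hypothetical quadratic cost f(w) = 0.5 * c * w^2 with curvature c. The names `newton_step` and `slope` are illustrative only and do not come from any library:

```python
def newton_step(grad, hess):
    """1-D Newton update: the slope divided by the curvature, with a minus sign."""
    return -grad / hess


def slope(w, c):
    """Gradient of the assumed cost f(w) = 0.5 * c * w**2."""
    return c * w


# Same slope, different curvature: the flatter landscape gets the bigger step.
flat_step = newton_step(1.0, 0.1)    # low curvature
sharp_step = newton_step(1.0, 10.0)  # high curvature

# On an exactly quadratic cost, one Newton step lands on the minimum at w = 0.
w = 2.0
c = 0.1
w_after = w + newton_step(slope(w, c), c)

print(flat_step, sharp_step, w_after)
```

Plain gradient descent would use the same fixed learning rate in both cases, which is the difference the question refers to.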

What I am having a hard time visualizing is the intuition behind 'curvature'. I get that it is the 'curviness of a function', but isn't that the slope? Obviously it is not, since that is what the gradient measures, so what is the intuition behind curvature?

Thanks Henry. I get that, but I don't know why I am having a hard time visualizing it. Maybe I need more coffee. When I visually inspect a curve, I am seeing its gradient, no? Perhaps there is a mismatch between the colloquial usage of curvature in English and curvature in the math? Curvature_english = gradient_math?
– Mohammad, Sep 1 '12 at 14:15

@Mohammad: A function with a fixed slope (constant first derivative, zero second derivative) would be a straight line with no curvature, even though its values change because of the slope.
– Henry, Sep 1 '12 at 14:43

The gradient and the radius of curvature at a point x are both normal to the equipotential surface through x.

The radius of curvature is the radius of the largest sphere that is tangent to the surface at that point. It has nothing to do with the slope, only with the shape of the equipotential surface. Curvature is the reciprocal of the radius of curvature. It is used in place of the radius because the radius of curvature is infinite when the curvature is 0, and 0 is more convenient to use in computations than infinity.
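
As a rough two-dimensional illustration of "curvature is the reciprocal of the radius", here is a sketch that estimates curvature from the circle through three points. It relies on the standard circumradius formula R = abc / (4 * area); the function name `curvature_from_points` and the sample points are assumptions made for this example:

```python
import math


def curvature_from_points(p, q, r):
    """Curvature (1 / radius) of the circle through three 2-D points.

    Circumradius is R = a*b*c / (4 * area), so curvature = 4 * area / (a*b*c).
    Collinear points give area 0, hence curvature 0 (an infinite radius).
    """
    a = math.dist(p, q)
    b = math.dist(q, r)
    c = math.dist(r, p)
    # The cross product of two edge vectors is twice the signed triangle area.
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    area = abs(cross) / 2.0
    return 4.0 * area / (a * b * c)


# Three points on a circle of radius 5: curvature should be about 1/5.
radius = 5.0
circle_points = [(radius * math.cos(t), radius * math.sin(t)) for t in (0.0, 0.1, 0.2)]
circle_curvature = curvature_from_points(*circle_points)

# Three points on a straight line: curvature 0, no infinity needed.
line_curvature = curvature_from_points((0.0, 0.0), (1.0, 1.0), (2.0, 2.0))

print(circle_curvature, line_curvature)
```

The straight-line case shows why curvature is the more convenient quantity: the code returns an ordinary 0 where the radius would have to be infinite.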

The gradient is a vector that points "uphill" and has magnitude equal to the slope.

Imagine you have a straight line with a slope of, say, 45 degrees, so its equation is y = x.
Differentiating it once gives the gradient, a constant: dy/dx = 1. Differentiating it twice gives the curvature: d²y/dx² = 0. So in this case y = x has no curvature, but it does have a gradient. By the way, a horizontal straight line, say y = 4, would have neither gradient nor curvature.

Intuitively, I would imagine holding a straight plastic stick in your hands. When the stick is parallel to the ground, it has no gradient or curvature. If you tilt one end, 45 degrees say, then it has a gradient but still no curvature. If you bend the plastic a bit, it has a gradient and curvature (in this case it has a range of gradients, a different gradient at each point on the stick).

btw, the stick could have a single curvature along its whole length whilst having multiple gradients. If you bend the stick with one bend, then that shape would be roughly a parabola, represented by a quadratic equation y = ax^2 + bx + c. The gradient would be dy/dx = 2ax + b (different at each x) and the curvature would be d²y/dx² = 2a (a constant).
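
The stick examples can be checked numerically with central finite differences. This is only a sketch: the helper names and the coefficients a = 3, b = 2, c = 1 are arbitrary choices for illustration, and "curvature" here means the second derivative, as in the rest of this answer:

```python
def first_derivative(f, x, h=1e-3):
    """Central-difference estimate of the gradient dy/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of the second derivative d2y/dx2."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def tilted_stick(x):
    """Straight stick tilted at 45 degrees: y = x."""
    return x


def bent_stick(x):
    """Stick with one bend: y = a*x^2 + b*x + c with a=3, b=2, c=1 (assumed)."""
    return 3.0 * x**2 + 2.0 * x + 1.0


xs = [-1.0, 0.0, 1.0]

line_gradients = [first_derivative(tilted_stick, x) for x in xs]
line_curvatures = [second_derivative(tilted_stick, x) for x in xs]
bent_gradients = [first_derivative(bent_stick, x) for x in xs]
bent_curvatures = [second_derivative(bent_stick, x) for x in xs]

print("tilted stick:", line_gradients, line_curvatures)
print("bent stick:  ", bent_gradients, bent_curvatures)
```

Up to rounding, the tilted stick shows a gradient of 1 and a curvature of 0 everywhere, while the bent stick shows gradients 2ax + b that change with x and a curvature that stays at 2a.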