In the gradient descent algorithm, say $f(x)$ (a quadratic function) is the objective function. So the algorithm is defined as

$$x_i = x_i - a\frac{\partial f(x)}{\partial x_i}$$

I just don't quite understand the meaning of the subtraction. I can intuitively follow that we are going in the direction of steepest descent, but I have some questions. The derivative of $f(x)$ is going to give us the equation of a line. So when we substitute the value of $x_i$ into $f'(x)$, what we get is a $y$ coordinate, $y_i$. So I don't understand how we can subtract a $y$ coordinate from an $x$ coordinate?

1 Answer

The direction of $\nabla f$ is the direction of greatest increase of $f$. (This can be shown by writing out the directional derivative of $f$ using the chain rule, and comparing the result with a dot product of the direction vector with the gradient vector.) You want to go toward the direction of greatest decrease, so move along $-\nabla f$.
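To see the update from the question at work, here is a minimal sketch in Python. The particular quadratic $f(x) = (x_0 - 1)^2 + 2(x_1 + 3)^2$, the step size `a = 0.1`, the starting point, and the function names are all assumptions chosen for illustration; they are not part of the question. Each coordinate $x_i$ is moved by $-a\,\partial f/\partial x_i$, i.e. along $-\nabla f$.

```python
# Illustrative quadratic objective (an assumption, not from the question):
# f(x) = (x0 - 1)^2 + 2*(x1 + 3)^2, whose minimum is at (1, -3).
def f(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2


# Gradient of f: the vector of partial derivatives (df/dx0, df/dx1).
def grad_f(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


def gradient_descent(x0, a=0.1, steps=200):
    """Repeat the update x_i <- x_i - a * df/dx_i for a fixed number of steps."""
    x = list(x0)
    history = [f(x)]  # objective value before any step
    for _ in range(steps):
        g = grad_f(x)
        # Move every coordinate a small distance along -grad f.
        x = [xi - a * gi for xi, gi in zip(x, g)]
        history.append(f(x))
    return x, history


# Hypothetical starting point, chosen only for the demonstration.
x_min, history = gradient_descent([5.0, 5.0])
print("final x:", x_min)
print("f at start, after one step, at end:", history[0], history[1], history[-1])
```

With these choices the objective value drops on the first step and the iterates approach $(1, -3)$. A step size that is too large for the given quadratic would make the iterates overshoot instead, so `a` has to be chosen with the function in mind.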

Hi, yes, I was able to get the general idea. So $\nabla f$ gives us the equation of a straight line. And when we substitute the value of $x_i$ into that straight-line equation we get a $y_i$ coordinate. So are we subtracting this $y_i$ coordinate from $x_i$, which is an $x$ coordinate?
– karthik A, Sep 7 '12 at 5:34