
2. Basics of Neural Network Programming

In NN programming, stacking the training examples as columns of a matrix makes the implementation much easier.

Dimension

X dim: n_x*m (n_x: size of each input x; m: size of the training set)

Y dim: 1*m
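These shapes can be checked directly in NumPy. The feature count and number of examples below are arbitrary values chosen for illustration:

```python
import numpy as np

n_x = 4   # size of each input x (assumed for illustration)
m = 6     # number of training examples (assumed for illustration)

# Stack the m examples as columns: X is (n_x, m), Y is (1, m)
X = np.random.randn(n_x, m)
Y = np.random.randint(0, 2, size=(1, m))

print(X.shape)  # (4, 6)
print(Y.shape)  # (1, 6)
```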

Forward/Backward pass = Forward/Backward Propagation Step

2.2 Logistic Regression

The sigmoid function goes smoothly from 0 up to 1, and it crosses the vertical axis at 0.5.
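A minimal sketch of the sigmoid, showing the behavior described above:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0))    # 0.5 — crosses the vertical axis at 0.5
print(sigmoid(10))   # close to 1
print(sigmoid(-10))  # close to 0
```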

2.3 Logistic Regression cost function

loss (error) function

Squared error makes gradient descent not work well: it gives a non-convex optimization problem with many local optima.

It measures how good the output y_hat is when the true label is y.

It is applied to a single training example.

cost function

It is the average (1/m times the sum) of the loss function applied to each of the training examples.

It measures how well your parameters are doing on the entire training set.
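The loss (per example) and cost (averaged over the training set) can be sketched as follows, using the cross-entropy loss for logistic regression; the example labels and predictions are made up for illustration:

```python
import numpy as np

def loss(y_hat, y):
    # Cross-entropy loss for a single example:
    # L(y_hat, y) = -(y*log(y_hat) + (1-y)*log(1-y_hat))
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def cost(Y_hat, Y):
    # Cost J(w, b): average of the loss over all m training examples
    m = Y.shape[1]
    return np.sum(loss(Y_hat, Y)) / m

Y = np.array([[1, 0, 1]])          # true labels (illustrative)
Y_hat = np.array([[0.9, 0.1, 0.8]])  # model outputs (illustrative)
print(cost(Y_hat, Y))
```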

2.4 Gradient Descent

It starts at the initial point and then takes a step in the steepest downhill direction.

Alpha (learning rate): controls how big a step we take on each iteration of gradient descent.

2.5 Derivatives

intuitive understanding: slope of the function

slope: height/width
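The slope-as-height/width intuition can be checked numerically by nudging the input by a tiny width and measuring the resulting height. The function f(a) = a² is an illustrative choice:

```python
def slope(f, a, width=1e-6):
    # Approximate derivative: height / width for a tiny nudge of size `width`
    height = f(a + width) - f(a)
    return height / width

f = lambda a: a ** 2   # its derivative is 2a
print(slope(f, 3))     # approximately 6
```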

2.7 Computation Graph

The computations of a NN are organized as a forward propagation step (forward pass), in which we compute the output of the NN, followed by a backpropagation step (backward pass), in which we compute the gradients or derivatives.

2.8 Derivatives with a Computation Graph

Chain rule
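A small worked example in the spirit of these notes, assuming the function J = 3(a + b*c) with illustrative values a=5, b=3, c=2. The forward pass computes J left to right; the backward pass applies the chain rule right to left:

```python
# Forward pass: J = 3 * (a + b*c), computed step by step
a, b, c = 5, 3, 2
u = b * c     # u = 6
v = a + u     # v = 11
J = 3 * v     # J = 33

# Backward pass: chain rule, right to left
dJ_dv = 3           # J = 3v
dJ_du = dJ_dv * 1   # dv/du = 1
dJ_da = dJ_dv * 1   # dv/da = 1
dJ_db = dJ_du * c   # du/db = c
dJ_dc = dJ_du * b   # du/dc = b

print(J, dJ_da, dJ_db, dJ_dc)  # 33 3 6 9
```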

2.9 Logistic Regression Gradient descent

One training example

w:=w-alpha*dw

b:=b-alpha*db
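The update rules above can be sketched for a single training example. The example features, label, learning rate, and iteration count are assumptions chosen for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical single training example with two features
x = np.array([1.0, 2.0])
y = 1.0
w = np.zeros(2)
b = 0.0
alpha = 0.1  # learning rate

for _ in range(100):
    z = np.dot(w, x) + b
    a = sigmoid(z)      # prediction y_hat
    dz = a - y          # dL/dz for the cross-entropy loss
    dw = dz * x         # dL/dw
    db = dz             # dL/db
    w = w - alpha * dw  # w := w - alpha*dw
    b = b - alpha * db  # b := b - alpha*db

print(sigmoid(np.dot(w, x) + b))  # prediction moves toward the label y = 1
```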

2.11 Vectorization

It is the art of getting rid of explicit for loops.

GPU: Graphics Processing Unit

SIMD (Single Instruction, Multiple Data): Using built-in functions that don't require explicitly implementing a for loop enables you to take much better advantage of parallelism and do the computations much faster.
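A quick comparison of an explicit for loop against NumPy's built-in np.dot on the same dot product; the vector size is an arbitrary choice for illustration, and the exact speedup depends on the machine:

```python
import time
import numpy as np

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# Explicit for loop
tic = time.time()
c_loop = 0.0
for i in range(n):
    c_loop += a[i] * b[i]
loop_ms = (time.time() - tic) * 1000

# Vectorized built-in (exploits SIMD parallelism under the hood)
tic = time.time()
c_vec = np.dot(a, b)
vec_ms = (time.time() - tic) * 1000

print(f"loop: {loop_ms:.1f} ms, vectorized: {vec_ms:.1f} ms")
```

Both compute the same value; the vectorized version is typically orders of magnitude faster.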