Gradient Descent:
In our last topic, we learned that, through the logic of backpropagation, the weights in an ANN are adjusted to minimize the cost function. In this section we will learn how exactly this minimization of the cost function happens.
The above figure shows the very basic structure of a perceptron, where information is propagated backwards based on the value of the cost function. The weights in the input layer are then adjusted based on the information propagated back to the neuron.
How do we minimize the cost function? There are several ways to achieve this task.
Brute Force: One way is to use a brute-force approach, where the ANN tries out every possible combination of weights and picks the combination that gives the lowest cost after evaluating thousands of candidates. The problem with this method is that, as the number of input variables (and hence weights) increases, the curse of dimensionality takes effect: the number of weight combinations grows exponentially, so the search quickly becomes computationally infeasible rather than just slightly less efficient.
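To see why this blows up, here is a minimal sketch, assuming a toy quadratic cost function and a coarse grid of 100 candidate values per weight (both are illustrative choices, not part of this series). Even with just two weights there are already 100 × 100 combinations to evaluate, and every extra weight multiplies that by another 100.

```python
import itertools
import numpy as np

def brute_force_search(cost, n_weights, grid_size=100):
    """Try every combination of candidate weight values and keep the cheapest one."""
    candidates = np.linspace(-1.0, 1.0, grid_size)   # 100 candidate values per weight
    best_w, best_cost = None, float("inf")
    # grid_size ** n_weights combinations -- grows exponentially with n_weights
    for w in itertools.product(candidates, repeat=n_weights):
        c = cost(np.array(w))
        if c < best_cost:
            best_w, best_cost = np.array(w), c
    return best_w, best_cost

# Illustrative quadratic cost with its minimum at w = 0.3 for every weight.
cost = lambda w: float(np.sum((w - 0.3) ** 2))

# Feasible only for a tiny number of weights: 2 weights -> 10,000 evaluations,
# but 10 weights would already need 100**10 = 10^20 evaluations.
best_w, best_cost = brute_force_search(cost, n_weights=2)
print(best_w, best_cost)
```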
Gradient Descent: Another way to reach the minimum of the cost function is gradient descent. Gradient descent is a first-order iterative optimization algorithm for finding the minimum of the cost function. It is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must instead be found by an optimization algorithm. In gradient descent we estimate the gradient (slope) of the cost curve at the current point. If the slope is negative, we are to the left of the minimum; if the slope is positive, we are to the right of the minimum. At each step we move the weights in the direction opposite to the slope, and our goal is to reach the point where the slope is zero, which is the minimum.
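Here is a minimal sketch of this idea, assuming a simple one-dimensional cost C(w) = (w - 3)^2 and an illustrative learning rate (neither comes from this series). The update rule w ← w − learning_rate × slope moves the weight against the slope until the slope is close to zero.

```python
def cost(w):
    """Illustrative cost curve with its minimum at w = 3."""
    return (w - 3.0) ** 2

def gradient(w):
    """dC/dw: the slope of the cost curve at the current point."""
    return 2.0 * (w - 3.0)

w = -4.0              # start to the left of the minimum, where the slope is negative
learning_rate = 0.1   # illustrative step size

for step in range(100):
    slope = gradient(w)
    # Negative slope -> move right, positive slope -> move left.
    w = w - learning_rate * slope

print(w, cost(w))     # w converges toward 3, where the slope is zero
```

A small learning rate means many small steps toward the minimum; too large a learning rate can overshoot the point where the slope is zero and fail to converge.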
Next Section: Artificial Neural Network Part 7