Gradient Descent

What is Gradient Descent?

Gradient Descent is an optimization algorithm for finding a minimum of a differentiable function, widely used in machine learning and deep learning to optimize the parameters of models. It iteratively updates the parameters by moving in the direction of the negative gradient of the function, converging to a local minimum (which is the global minimum when the function is convex).

How does Gradient Descent work?

Gradient Descent works by initializing the parameters of the model (typically with random values) and then iteratively updating them using the following rule:

theta = theta - learning_rate * gradient_of_function(theta)
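
As a minimal sketch of this rule in Python, the loop below minimizes the simple quadratic f(theta) = (theta - 3)**2, whose gradient is 2 * (theta - 3). The function, starting value, and step count are illustrative choices, not part of any particular model:

# Minimal gradient descent on f(theta) = (theta - 3)**2.
# Its gradient is 2 * (theta - 3), so the minimum is at theta = 3.

def gradient_of_function(theta):
    return 2 * (theta - 3)

theta = 0.0            # initial parameter value (fixed here; often random)
learning_rate = 0.1    # step size hyperparameter

for step in range(100):
    theta = theta - learning_rate * gradient_of_function(theta)

print(theta)  # approaches 3.0, the minimizer of f

Each iteration shrinks the distance to the minimizer by a constant factor here, which is why the printed value is very close to 3 after 100 steps.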

The learning rate is a hyperparameter that controls the step size of each update: if it is too large, the updates can overshoot and diverge; if it is too small, convergence becomes slow. There are several variants of Gradient Descent, such as Stochastic Gradient Descent (SGD), which updates the parameters using a single data point at each step, and Mini-Batch Gradient Descent, which uses a small subset (mini-batch) of data points for each update, as sketched below.
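
The sketch below illustrates Mini-Batch Gradient Descent on a toy linear-regression problem with a mean squared error loss. The synthetic data, batch_size, and variable names are illustrative assumptions, not part of the original text; setting batch_size to 1 recovers plain SGD, and setting it to the full dataset size recovers standard (full-batch) Gradient Descent:

import numpy as np

# Mini-batch gradient descent for linear regression (mean squared error).
# Synthetic data: y = 2x + 1 plus noise. All names here are illustrative.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = 2 * X[:, 0] + 1 + 0.1 * rng.normal(size=200)

w, b = 0.0, 0.0
learning_rate = 0.1
batch_size = 16   # 1 -> SGD; len(X) -> full-batch gradient descent

for epoch in range(50):
    indices = rng.permutation(len(X))       # shuffle data each epoch
    for start in range(0, len(X), batch_size):
        batch = indices[start:start + batch_size]
        xb, yb = X[batch, 0], y[batch]
        error = (w * xb + b) - yb
        # Gradients of the mean squared error over the mini-batch
        grad_w = 2 * np.mean(error * xb)
        grad_b = 2 * np.mean(error)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

print(w, b)  # should approach the true values 2.0 and 1.0

Mini-batches trade off the two extremes: each update is cheaper and noisier than a full-batch step, but far more stable than a single-point SGD step.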

Resources for learning Gradient Descent: