Singular Value Decomposition



The singular value decomposition (SVD) is a factorization of a real or complex matrix. It generalizes the eigendecomposition of a positive semidefinite normal matrix (for example, a symmetric matrix with positive eigenvalues) to any m x n matrix via an extension of the polar decomposition. It has many useful applications in signal processing and statistics.
Formally, the singular value decomposition of an m x n real or complex matrix M is a factorization of the form UΣV*, where U is an m x m real or complex unitary matrix, Σ is an m x n rectangular diagonal matrix with non-negative real numbers on the diagonal, and V is an n x n real or complex unitary matrix. The diagonal entries σi of Σ are known as the singular values of M. The columns of U and the columns of V are called the left-singular vectors and right-singular vectors of M, respectively.
Visualization of the SVD of a two-dimensional, real shearing matrix M: starting from a unit disc together with the two canonical unit vectors, the action of M distorts the disc into an ellipse. The SVD decomposes M into three simple transformations: an initial rotation V*, a scaling Σ along the coordinate axes, and a final rotation U. The lengths σ1 and σ2 of the semi-axes of the ellipse are the singular values of M.
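
As a concrete sketch, the snippet below computes the SVD of a shearing matrix like the one in the visualization using NumPy's np.linalg.svd and checks that the three factors reconstruct M (the specific matrix is an illustrative choice, not fixed by the text):

import numpy as np

# A 2x2 real shearing matrix, as in the visualization above (illustrative choice).
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Full SVD: M = U @ diag(s) @ Vt, with U and Vt unitary (orthogonal here)
# and s the non-negative singular values in descending order.
U, s, Vt = np.linalg.svd(M)

print("singular values:", s)                # lengths σ1, σ2 of the ellipse's semi-axes
assert np.allclose(M, U @ np.diag(s) @ Vt)  # the three factors reconstruct M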

Gradient Descent

"Do not worry about your problems with mathematics, I assure you mine are far greater."
-Albert Einstein

Gradient descent is a first-order optimization algorithm. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point. If instead one takes steps proportional to the positive of the gradient, one approaches a local maximum of that function; the procedure is then known as gradient ascent.
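
As a minimal sketch of this procedure (the objective function, starting point, learning rate, and step count below are illustrative assumptions, not from the text):

import numpy as np

def gradient_descent(grad, x0, lr=0.1, steps=100):
    # Repeatedly take steps proportional to the negative of the gradient.
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - lr * grad(x)  # use x + lr * grad(x) instead for gradient ascent
    return x

# Minimize f(x, y) = (x - 3)^2 + (y + 1)^2; its gradient is supplied explicitly.
grad_f = lambda v: np.array([2 * (v[0] - 3), 2 * (v[1] + 1)])
print(gradient_descent(grad_f, x0=[0.0, 0.0]))  # converges toward the minimum at (3, -1)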

Lagrange Multipliers

The method of Lagrange multipliers is used to determine the maximum or minimum of a function subject to the constraint that the points lie on a given curve or surface. At such an extremum the gradient of the function and the gradient of the constraint are parallel, so we introduce a scalar λ, known as the Lagrange multiplier, and find the extrema by solving ∇f = λ∇g together with the constraint equation.
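
As a small worked sketch (the objective f and constraint g below are illustrative choices), SymPy can solve the parallel-gradient condition directly:

import sympy as sp

x, y, lam = sp.symbols("x y lambda", real=True)

# Find extrema of f(x, y) = x + y subject to g(x, y) = x^2 + y^2 - 1 = 0.
f = x + y
g = x**2 + y**2 - 1

# At an extremum the gradients are parallel: grad f = lam * grad g, with g = 0.
eqs = [sp.Eq(sp.diff(f, x), lam * sp.diff(g, x)),
       sp.Eq(sp.diff(f, y), lam * sp.diff(g, y)),
       sp.Eq(g, 0)]
print(sp.solve(eqs, [x, y, lam]))  # two candidate points: (±√2/2, ±√2/2)

The solver returns both candidate points; evaluating f at each one identifies which is the maximum and which is the minimum on the circle.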