Linear Regression with multiple variables

Multivariate Linear Regression

Linear regression with multiple variables is also known as “multivariate linear regression”.

Multiple Features

Hypothesis:

  hθ(x) = θ0x0 + θ1x1 + θ2x2 + ⋯ + θnxn = θᵀx    (where x0 = 1)

Concepts:

  • xj(i) = value of feature j in the ith training example
  • x(i) = the input (features) of the ith training example
  • m = the number of training examples
  • n = the number of features

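A minimal NumPy sketch of the vectorized hypothesis, assuming a design matrix X whose first column is all ones (x0 = 1); the data values are hypothetical:

    import numpy as np

    # Hypothetical data: m = 3 examples, n = 2 features (plus the x0 = 1 column).
    X = np.array([[1.0, 2104.0, 3.0],
                  [1.0, 1600.0, 3.0],
                  [1.0, 2400.0, 4.0]])
    theta = np.array([80.0, 0.1, 25.0])   # [theta_0, theta_1, theta_2]

    # h_theta(x) = theta^T x, computed for every example at once.
    h = X @ theta
    print(h)   # one prediction per training example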

Gradient descent for multiple variables

The gradient descent equation itself is generally of the same form; we just have to repeat it for our ‘n’ features:

  repeat until convergence: {
    θj := θj − α · (1/m) · Σi=1..m ( hθ(x(i)) − y(i) ) · xj(i)      (for j = 0, 1, …, n)
  }

Concepts:

  • All parameters θ0, θ1, …, θn are updated simultaneously on each iteration.
  • For n = 1, this reduces to the update rule from univariate linear regression.

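A short NumPy sketch of one such update, written in vectorized form; X, y, theta, and alpha are placeholder names, and X is assumed to include the x0 = 1 column:

    import numpy as np

    def gradient_descent_step(X, y, theta, alpha):
        """One simultaneous update of all parameters theta_0 ... theta_n."""
        m = len(y)
        predictions = X @ theta                    # h_theta(x(i)) for every i
        gradient = (X.T @ (predictions - y)) / m   # partial derivatives of J(theta)
        return theta - alpha * gradient            # update every theta_j at once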

Feature Scaling:

We can speed up gradient descent by having each of our input values in roughly the same range.

This is because θ will descend quickly on small ranges and slowly on large ranges, and so will oscillate inefficiently down to the optimum when the variables are very uneven.

The way to prevent this is to modify the ranges of our input variables so that they are all roughly the same.

Ideally:   −1 ≤ x(i) ≤ 1  or  −0.5 ≤ x(i) ≤ 0.5

Two techniques to help with this are feature scaling and mean normalization:

  • Feature scaling: divide the input values by the range (max − min) of the input variable.
  • Mean normalization: subtract the average value of an input variable from each of its values.
  • Combined: xi := (xi − μi) / si, where μi is the average of feature i and si is its range (or standard deviation); see the sketch below.

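A minimal NumPy sketch of mean normalization combined with feature scaling (the data values are hypothetical, and the x0 = 1 column would be excluded from scaling):

    import numpy as np

    def normalize_features(X):
        """Rescale each column to roughly [-0.5, 0.5] via (x - mean) / range."""
        mu = X.mean(axis=0)                 # per-feature average
        s = X.max(axis=0) - X.min(axis=0)   # per-feature range (std dev also works)
        return (X - mu) / s, mu, s

    # Hypothetical raw features: house size (sq ft) and number of bedrooms.
    X_raw = np.array([[2104.0, 3.0],
                      [1600.0, 3.0],
                      [2400.0, 4.0]])
    X_scaled, mu, s = normalize_features(X_raw)
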
Learning Rate:

The job of gradient descent is to find the value of θ that minimizes the cost function J(θ).

Debugging gradient descent: plot the cost J(θ) against the number of iterations of gradient descent. If J(θ) ever increases, α probably needs to be decreased.

Automatic convergence test: declare convergence if J(θ) decreases by less than a small threshold (e.g., 10⁻³) in one iteration; in practice it is hard to choose this threshold, so inspecting the plot is usually preferred.

Summary:

  • If α is too small: slow convergence.
  • If α is too large: J(θ) may not decrease on every iteration and thus may not converge.

To choose α, try a range of values spaced roughly threefold apart, e.g. …, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30, 100, …

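A small sketch of how one might monitor convergence while trying different values of α; the cost and update formulas follow the definitions above, and all names are placeholders:

    import numpy as np

    def cost(X, y, theta):
        """J(theta) = (1/2m) * sum((X theta - y)^2)."""
        residuals = X @ theta - y
        return (residuals @ residuals) / (2 * len(y))

    def run_gradient_descent(X, y, alpha, num_iters=400, tol=1e-3):
        theta = np.zeros(X.shape[1])
        history = []
        for _ in range(num_iters):
            theta = theta - alpha * (X.T @ (X @ theta - y)) / len(y)
            history.append(cost(X, y, theta))
            # Automatic convergence test: stop once J(theta) decreases by < tol.
            if len(history) > 1 and history[-2] - history[-1] < tol:
                break
        return theta, history   # plot history vs. iteration number to debug alpha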

Features and Polynomial Regression

We can change the behavior or curve of the hypothesis function by creating new features from existing ones, e.g. x2 = x1² and x3 = x1³, so that a linear model can fit a quadratic, cubic, or other polynomial curve such as hθ(x) = θ0 + θ1x1 + θ2x1² + θ3x1³.

Note:

If you choose features in this way, then feature scaling becomes very important: if x1 has range 1–1000, then x1² has range 1–10⁶ and x1³ has range 1–10⁹.

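A brief NumPy sketch of creating and scaling polynomial features from a single hypothetical feature x1:

    import numpy as np

    x1 = np.array([100.0, 500.0, 1000.0])         # hypothetical raw feature
    X_poly = np.column_stack([x1, x1**2, x1**3])  # new features: x1, x1^2, x1^3

    # The ranges now differ by orders of magnitude, so scale before gradient descent.
    mu = X_poly.mean(axis=0)
    s = X_poly.max(axis=0) - X_poly.min(axis=0)
    X_scaled = (X_poly - mu) / s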

Computing Parameters Analytically

Gradient descent gives one way of minimizing the cost function J.

Let’s discuss a second way of doing so, this time performing the minimization explicitly and without resorting to an iterative algorithm.

Normal Equation Method

Normal Equation Formula:   θ = (XᵀX)⁻¹Xᵀy
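
A minimal NumPy sketch of the normal equation, where X is the design matrix with a leading column of ones and the data values are hypothetical; using pinv means a non-invertible XᵀX still yields a solution:

    import numpy as np

    def normal_equation(X, y):
        """Closed-form least-squares solution: theta = pinv(X^T X) X^T y."""
        return np.linalg.pinv(X.T @ X) @ (X.T @ y)

    # Hypothetical data: 3 examples, intercept column plus one feature.
    X = np.array([[1.0, 2104.0],
                  [1.0, 1600.0],
                  [1.0, 2400.0]])
    y = np.array([400.0, 330.0, 369.0])
    theta = normal_equation(X, y)   # no feature scaling or learning rate needed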

Gradient Descent vs. Normal Equation:

  • Gradient descent: needs to choose α; needs many iterations; works well even when the number of features n is large.
  • Normal equation: no need to choose α; no iterations; must compute (XᵀX)⁻¹, which is roughly O(n³); slow if n is very large.

Notes:

In practice, once n is large (roughly 10,000 features or more), it is usually better to switch from the normal equation to an iterative method such as gradient descent. The normal equation also needs no feature scaling.


Normal Equation: Non-invertibility

If XᵀX is non-invertible, the common causes are:

  • Redundant features, where two features are very closely related (i.e., linearly dependent).
  • Too many features (e.g., m ≤ n); in this case, delete some features or use regularization.

Using the pseudo-inverse (pinv) rather than a true inverse will still return a value of θ even when XᵀX is non-invertible.

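A quick sketch for diagnosing a singular XᵀX, using hypothetical data in which one feature is an exact multiple of another:

    import numpy as np

    # The third column is exactly twice the second, so the columns of X are
    # linearly dependent and X^T X is non-invertible.
    X = np.array([[1.0,  50.0, 100.0],
                  [1.0,  80.0, 160.0],
                  [1.0, 120.0, 240.0]])

    rank = np.linalg.matrix_rank(X.T @ X)
    print(rank < X.shape[1])   # True: drop a redundant feature or regularize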

← Previous: Linear Algebra

Next: Octave/Matlab Tutorial →