Linear Regression with One Variable

Linear regression is the first learning algorithm we cover. We need to see what the model looks like and, more importantly, what the overall process of supervised learning looks like.

Model Representation
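
The model (hypothesis) used throughout is hθ(x) = θ0 + θ1·x, a straight line with intercept θ0 and slope θ1. A minimal sketch in Python (the sample values are illustrative, not from the notes):

  # Hypothesis for linear regression with one variable: h_theta(x) = theta0 + theta1 * x
  def hypothesis(theta0, theta1, x):
      return theta0 + theta1 * x

  # Illustrative values only: a line with intercept 1.0 and slope 2.0, evaluated at x = 3.0
  print(hypothesis(1.0, 2.0, 3.0))  # -> 7.0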


Cost Function
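
The cost used here is the squared-error cost J(θ0, θ1) = (1/2m) · Σ (hθ(x) − y)², summed over the m training examples: the (halved) average squared difference between predictions and targets. A minimal sketch (the toy data set is an illustrative assumption):

  # Squared-error cost: J(theta0, theta1) = (1 / (2m)) * sum((h(x_i) - y_i)^2)
  def compute_cost(theta0, theta1, xs, ys):
      m = len(xs)
      return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

  xs, ys = [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]   # illustrative toy data
  print(compute_cost(0.0, 1.0, xs, ys))        # a perfect fit gives cost 0.0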

Cost Function Intuition-1:

Analyzing the Cost Function with a Simplified Hypothesis Function by setting θ0 = 0
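
A minimal sketch of this simplification: with θ0 fixed at 0 the cost depends on θ1 alone, so it can be traced out as a single curve (the toy data below is an illustrative assumption):

  # With theta0 = 0 the hypothesis is h(x) = theta1 * x, so the cost is a
  # function of theta1 only: J(theta1) = (1 / (2m)) * sum((theta1 * x_i - y_i)^2)
  xs, ys = [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]   # illustrative toy data
  m = len(xs)
  for theta1 in [0.0, 0.5, 1.0, 1.5, 2.0]:
      cost = sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
      print(theta1, cost)   # the cost bottoms out at theta1 = 1.0 for this data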


Cost Function Intuition-2:

Analyzing the Cost Function with the Actual Hypothesis Function

Note:

Contour Plots
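
A contour plot shows the full cost J(θ0, θ1) from above: every contour line is a set of (θ0, θ1) pairs that share the same cost value. A minimal sketch of how such a plot can be drawn (NumPy, matplotlib, and the toy data are assumptions, not part of the notes):

  import numpy as np
  import matplotlib.pyplot as plt

  xs = np.array([1.0, 2.0, 3.0])   # illustrative toy data
  ys = np.array([1.0, 2.0, 3.0])
  m = len(xs)

  # Evaluate J(theta0, theta1) on a grid of parameter values
  theta0_grid, theta1_grid = np.meshgrid(np.linspace(-2.0, 2.0, 100),
                                         np.linspace(-1.0, 3.0, 100))
  preds = theta0_grid[..., None] + theta1_grid[..., None] * xs   # broadcast over the data
  J = ((preds - ys) ** 2).sum(axis=-1) / (2 * m)

  plt.contour(theta0_grid, theta1_grid, J, levels=30)   # each ellipse is one cost level
  plt.xlabel("theta0")
  plt.ylabel("theta1")
  plt.show()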

Notes:


Gradient Descent Algorithm
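
The update rule is: repeat until convergence, θj := θj − α · ∂J(θ0, θ1)/∂θj for j = 0 and j = 1, with both parameters updated simultaneously. A minimal sketch of the simultaneous-update pattern (the two derivative functions are placeholders, not from the notes):

  # One gradient descent step. The update is SIMULTANEOUS: both partial derivatives
  # are evaluated at the old (theta0, theta1) before either parameter is overwritten.
  def gradient_step(theta0, theta1, alpha, dJ_dtheta0, dJ_dtheta1):
      temp0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
      temp1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)
      return temp0, temp1   # assign together: theta0, theta1 = gradient_step(...)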


Minimizing the Cost Function with Gradient Descent

Example:

  • The distance between each ‘star’ in the graph above represents a step determined by our parameter α; a smaller α results in a smaller step, and a larger α results in a larger step.
  • The direction in which the step is taken is determined by the partial derivative of J(θ0, θ1).
  • Depending on where one starts on the graph, one could end up at a different point; the image above shows two different starting points that end up in two different places.


Gradient Descent Intuition

Understanding the derivative term:

Understanding the learning rate α:
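
A minimal numeric sketch of the effect of α, using gradient descent on the one-dimensional function J(θ) = θ² (the function, starting point, and step counts are illustrative assumptions): too small a learning rate converges slowly, a moderate one converges quickly, and too large a rate overshoots the minimum and diverges.

  # Gradient descent on J(theta) = theta^2, whose derivative is 2 * theta.
  def run(alpha, steps=10, theta=1.0):
      for _ in range(steps):
          theta = theta - alpha * 2 * theta
      return theta

  print(run(alpha=0.01))   # too small: after 10 steps still far from the minimum at 0
  print(run(alpha=0.1))    # moderate: close to 0
  print(run(alpha=1.5))    # too large: |theta| doubles every step, i.e. divergence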

Notes:


Gradient Descent for Linear Regression
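
Plugging the squared-error cost into the general update rule gives the concrete gradients ∂J/∂θ0 = (1/m) Σ (hθ(x) − y) and ∂J/∂θ1 = (1/m) Σ (hθ(x) − y)·x, summed over the m training examples. A minimal sketch of a single simultaneous update step (toy data and α are illustrative assumptions):

  # One gradient descent update for linear regression with one variable.
  def linreg_step(theta0, theta1, xs, ys, alpha):
      m = len(xs)
      errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
      grad0 = sum(errors) / m                               # dJ/dtheta0
      grad1 = sum(e * x for e, x in zip(errors, xs)) / m    # dJ/dtheta1
      return theta0 - alpha * grad0, theta1 - alpha * grad1   # simultaneous update

  xs, ys = [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]   # illustrative toy data
  print(linreg_step(0.0, 0.0, xs, ys, alpha=0.1))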


Problem with the Gradient Descent Algorithm: susceptible to local optima

Solution: The cost function for linear regression is always a bowl-shaped (convex) function, which has no local optima other than the single global optimum.


Algorithm in Action:


Finally: the line of closest fit to the training data set


Batch Gradient Descent Algorithm
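
"Batch" refers to the fact that every single step of gradient descent uses the entire batch of training examples when computing the gradients. A minimal end-to-end sketch (data, α, and iteration count are illustrative assumptions):

  # Batch gradient descent: each iteration sums the error over ALL m training examples.
  def batch_gradient_descent(xs, ys, alpha=0.1, iterations=1000):
      theta0, theta1 = 0.0, 0.0
      m = len(xs)
      for _ in range(iterations):
          errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
          grad0 = sum(errors) / m
          grad1 = sum(e * x for e, x in zip(errors, xs)) / m
          theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
      return theta0, theta1

  xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]   # illustrative toy data (y = 2x)
  print(batch_gradient_descent(xs, ys))   # approaches theta0 ≈ 0.0, theta1 ≈ 2.0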


