I’m currently enrolled in Andrew Ng’s Machine Learning class on coursera.org (highly recommended!) and thought I’d write a post about one of the first topics we’re covering, linear regression.  Though Dr. Ng explains the mechanics of the technique very clearly, there aren’t many examples of situations in which it’s commonly used, or of the types of problems it solves.  I’ll do my best to fill in this gap!

What is linear regression?

Linear regression is the technique of approximating output values for a given input set using a linear function.  Given an input matrix X and an output vector Y, the goal is to find coefficients A and B such that XA + B produces a vector that is as close to Y as possible.  Once you’ve found A and B, you can predict output values for new, similar inputs.
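To make that concrete, here’s a minimal sketch of solving for A and B with ordinary least squares in Python using numpy (the course itself uses Octave, and the data here is made up purely for illustration):

import numpy as np

# Toy data: each row of X is one example, each entry of Y is the output
# we want to approximate.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
Y = np.array([2.1, 4.0, 6.2, 7.9])

# Append a column of ones so the intercept B is learned alongside A:
# [X  1] @ [A; B] should be as close to Y as possible.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])

# Ordinary least squares: minimize ||X_aug @ coeffs - Y||^2
coeffs, residuals, rank, _ = np.linalg.lstsq(X_aug, Y, rcond=None)
A, B = coeffs[:-1], coeffs[-1]
print("A =", A, "B =", B)   # roughly A = [1.96], B = 0.15 for this toy data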

When to use it?

Linear regression assumes that the input values in X and the dependent values in Y are linearly related.  It also produces a model that is a continuous function of its inputs, so its predictions are continuous values.  So, if you suspect that your inputs and outputs have a linear relationship, and the output is effectively continuous (rather than discrete), try using linear regression to approximate the relationship.
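Continuing the sketch above, using the learned A and B to predict continuous outputs for new inputs is just a matrix multiply and an add (again, the numbers are only illustrative):

# Predict (continuous) outputs for inputs the model hasn't seen.
X_new = np.array([[5.0], [6.0]])
Y_pred = X_new @ A + B
print(Y_pred)   # roughly [9.95, 11.91] given the toy data above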
