I’m currently enrolled in Andrew Ng’s Machine Learning class on coursera.org (highly recommended!) and thought I’d write a post about one of the first topics we’re covering, linear regression. Though Dr. Ng explains the mechanics of this technique very clearly, there aren’t many examples of situations in which it is commonly used, or of the types of problems it solves. I’ll do my best to fill in this gap!

What is linear regression?

Linear regression is the technique of approximating output values for a given set of inputs using a linear function (a first-degree polynomial) of those inputs. Given an input matrix X and an output vector Y, find a coefficient vector A and an intercept B such that XA + B produces a vector that is as close to Y as possible, typically in the least-squares sense. Once A and B are found, they can be used to predict output values for new, similar inputs.
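To make the definition concrete, here is a minimal sketch for the single-input case, where X is one column of numbers and A and B are plain scalars. The names fit_line and predict and the sample data are made up for illustration, and the sketch uses the closed-form least-squares solution rather than an iterative method such as gradient descent.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing the sum of squared errors of a*x + b against y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    a = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - a * mean_x
    return a, b


def predict(a, b, x):
    """Evaluate the fitted line at a new input x."""
    return a * x + b


# Made-up sample data lying roughly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 9.0, 10.8]
a, b = fit_line(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}, prediction at x = 6: {predict(a, b, 6.0):.3f}")
```

With more than one input column the same idea applies, but A becomes a vector and the fit is usually done with a linear algebra library.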

When to use it?

Linear regression assumes that the input values in X and the dependent values in Y have a linear relationship. It also produces coefficients that define a continuous function. So, if you suspect that your inputs and outputs are linearly related, and the output is effectively continuous (rather than discrete), try using linear regression to approximate the relationship.
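One rough way to test that suspicion before fitting anything is to compute the Pearson correlation coefficient between a single input and the output. The sketch below is only an illustration: pearson_r is a hypothetical helper and both data sets are made up. A value near 1 or -1 suggests a straight line may fit well, while a value near 0 suggests it will not, though it is still worth plotting the data since the coefficient alone can mislead.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up data: the first set is roughly linear, the second is y = x squared.
roughly_linear = pearson_r([1, 2, 3, 4, 5], [3.1, 4.9, 7.2, 9.0, 10.8])
symmetric_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
print(f"roughly linear data: r = {roughly_linear:.3f}")
print(f"symmetric curve:     r = {symmetric_curve:.3f}")
```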


Let’s say you’re developing a simple alternative to the TriMet website, where riders can (among other things) check the real-time location of buses and trains. This means that your website will need to query the TriMet TransitTracker service whenever one of your site’s users performs a search.

So now your life has many more problems than it did just a moment ago. How does your code get called when the TransitTracker service:

  • returns the real-time location information?
  • is unavailable?

Further, how do you implement your site so that it shows on-time arrivals in green, and late arrivals in red?
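One possible shape for an answer is to pass in two callbacks, one for each outcome, and keep the color decision in its own small function. The sketch below is an assumption-laden illustration, not the real TransitTracker API: Arrival, ServiceUnavailable, search_arrivals, the fetch parameter, and the fake fetchers are all hypothetical names, and "late" is assumed to mean that the estimated time is after the scheduled time.

```python
from dataclasses import dataclass


@dataclass
class Arrival:
    route: str
    scheduled_minutes: int  # minutes after midnight, per the timetable
    estimated_minutes: int  # minutes after midnight, per the real-time estimate


class ServiceUnavailable(Exception):
    """Raised by a (hypothetical) fetcher when the service cannot be reached."""


def arrival_color(arrival):
    """Assumed rule: red when the estimate is after the schedule, else green."""
    if arrival.estimated_minutes > arrival.scheduled_minutes:
        return "red"
    return "green"


def search_arrivals(fetch, stop_id, on_success, on_unavailable):
    """Call fetch(stop_id) and route the outcome to exactly one callback."""
    try:
        arrivals = fetch(stop_id)
    except ServiceUnavailable as err:
        on_unavailable(err)
        return
    on_success(arrivals)


# Stand-in fetchers with made-up data; a real one would query the service.
def fake_fetch(stop_id):
    return [Arrival("Route A", 600, 600), Arrival("Route B", 610, 617)]


def broken_fetch(stop_id):
    raise ServiceUnavailable("service did not respond")


def show(arrivals):
    for arrival in arrivals:
        print(arrival.route, arrival_color(arrival))


def show_error(err):
    print("Real-time data is unavailable:", err)


search_arrivals(fake_fetch, 1234, show, show_error)
search_arrivals(broken_fetch, 1234, show, show_error)
```

Passing the fetcher in as a parameter keeps the callback logic testable without touching the network.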
