This is the first in a series of posts I will write about design patterns.

Design patterns in software development have been heavily influenced by the work of Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, known as the Gang of Four (GoF).  They literally wrote the book on patterns, Design Patterns: Elements of Reusable Object-Oriented Software.  In this book, the authors describe patterns for managing object creation, composing objects into larger structures, and coordinating control flow between objects.  Since publication, other developers have identified and described more patterns in practically every area of software design. Try googling your favorite software topic + “design patterns” to see what kind of patterns other developers have identified and described: android design patterns, embedded design patterns, machine learning design patterns, etc.

It’s very useful to know about the patterns in abstract, even if you don’t know the details of a particular pattern.  As the authors state, knowing the patterns helps developers identify and use the “right” design faster.  Knowing the patterns provides these benefits to the developer:

  1. They describe common problems that occur in software development.  As a developer, especially a new developer, it’s easy to think of every problem as completely unique to the program that you’re writing.  But very often, they are unique only because of poor modeling or simply lack of experience.  Simply knowing common problems can help a developer model a system in terms of those problems, which often reduces the size, number, and complexity of the problems that remain to be solved.
  2. They are considered “best known methods” for solving typical/frequent problems that arise in programming & architecture.  Knowing the “best known method” for solving a problem eliminates a lot of thought, effort, and time devoted to solving it, which reduces the time that a developer must spend on a particular problem.
  3. Knowledge of the patterns simplifies & clarifies communication between developers when talking about a particular problem or solution.  When a developer who is familiar with design patterns hears “I used the singleton pattern on the LogFile class,” the developer immediately knows that (if implemented correctly) there will only be one or zero instances of the LogFile class living in the program at one time.

When to use it

It’s pretty easy to describe when to use a pattern – whenever your program contains the exact problem that is solved by one of the patterns.  They can even be used if your program contains a similar problem to that solved by one of the patterns, but in this case, the implementation of the pattern may need to be modified to fit the particulars of your program.

However, it’s not always obvious your software’s problem(s) can be solved by a GoF pattern.  In other words, the program may be such a mess that it needs to be refactored simply to transform a problem into one that can be solved with a GoF pattern.  Hopefully by learning about the patterns, you’ll be able to recognize non-obvious applications in your own software.

 

I’ll cover the patterns by subject, and within a subject I’ll try to cover what I feel are the most broadly applicable patterns first.  Stay updated by following me on RSS, linkedin, or twitter (@avitevet)!

Many problems in real life can be represented as optimization problems that are subject to various constraints.  How far can I go without stopping at the gas station if I expect to drive 60% on the highway and 40% in the city?   What’s the most enjoyment I can get with $10 of chocolate bars, given that I want at least one Butterfinger bar but like Snickers twice as much?  How can I achieve the best GPA given my current grades in the classes, each class’s grading system, and that I only have 2 more days to study for finals?

The simplex method is an algorithm for finding a maximal function value given a set of constraints.  We’ll start with a non-trivial example that shows why we need a rigorous method to solve this problem, then move on to a simple example that illustrates most of the main parts of the simplex method.  You’ll learn when to use it, you can check out my cheatsheet, and you can review my code on github!

Continue reading

I recently interviewed with a large company where I was asked how I would check that a given word was a valid English word.  Knowing that there were on the order of a few tens of thousands of stems, and that each stem may have many variations, I figured that there were maybe a few hundred thousand possible words in the English language.  With each word having average length probably in the 7 range, there might be a million or so characters in this set.  I concluded that the check could be performed by searching an in-memory dictionary, and proposed either a std::unordered_set, or a custom trie, with a performance test necessary to see which would be faster.

I decided to perform this test.

Continue reading

I recently was asked essentially this question in an interview: given a string S and a set of strings P, can the strings from P be concatenated with repetitions to form S?  Or put another way, can strings from P with repetitions cover S without overlaps?

My first instinct was to use suffix tries and an overlap graph, but the solution that we eventually reached used no sophisticated data structures, just the array of strings and some recursion in a divide and conquer approach.  Here’s my extended thoughts on the problem.

Continue reading

Suffix trie for string "panamabananas"

I’ll admit it up front, this is not as much of a “when to use it” post as it is a “what is it” post.  Never fear, I’ll still fill you all in on when to use it.  Also, I’m talking about the data structure for storing strings, not the synonym for attempts.

What is it?

A trie is a data structure that is used for fast string searching.  A trie has a root node, and a node for each character in the string.  The nodes are constructed such that traversing the tree from the root to a leaf reconstructs one of the strings that is stored in it.

Continue reading

I’m currently enrolled in Andrew Ng’s Machine Learning class on coursera.org (highly recommended!) and thought I’d write a post about one of the first topics we’re covering, linear regression.  Though Dr. Ng explains the mechanics of this technique very clearly, there’s not many examples of situations in which the technique is commonly used, or examples of the types of problems that are solved by the techniques.  I’ll do my best to fill in this gap!

What is linear regression?

Linear regression is the technique of approximating output values for a given input set using a polynomial function.  Given an input matrix X & an output vector Y, find coefficients A & B such that XA + B produces a vector that is as close to Y as possible.  Finding the coefficients A & B allows a person to then predict output values given similar input values.

When to use it?

Linear regression assumes that the relationship between the input values in X and the dependent values in Y have a linear relationship.  Also, linear regression produces values that can be used as coefficients in a continuous function.  So, if you suspect that your inputs and outputs have a linear relationship, and the output is effectively continuous (rather than discrete), try using linear regression to approximate the relationship.

Continue reading