As you have surely noticed, I haven’t been writing new posts since I started my new job at Digimarc! However, I’m learning a lot and decided it’s time to give something back. I’ll start with the image processing basics I’ve picked up. This post is about a common way that still images are represented by computers: as raster, or rasterized, images.

Vector vs. Rasterized images

There are two basic types of still images in common use today: vector images and raster images. This post is about raster, or rasterized, images.

A raster image is made of contiguous blocks of color. These blocks are called picture elements, or “pixels” for short. Pixels are usually treated as squares, arranged in uniform rows and columns, so a raster image is a bit like a wall built of square Legos.

Is that an eye?

When the pixels are sufficiently small, or when they are viewed from sufficiently far away, our human visual system cannot distinguish individual pixels and they blend together into what appears to be a smooth, continuous image.

It’s Lena’s left eye! Lena is a commonly-used example of a grayscale raster image

By using colored blocks and a sufficient number of pixels, raster images can represent complex imagery such as people, nature, drawn art, and more. In fact, you are probably already familiar with the versatility of raster images – all modern digital photography produces raster images!

We can better understand the details of raster graphics by ignoring color for the moment and considering black & white, or grayscale, images.

Grayscale

In a grayscale image, each pixel is represented by a single number that describes how much black (or white) appears in that pixel. This number is often referred to as the intensity, and it can be helpful to think of the intensity as a percentage between 0% and 100%.

Zoomed in view of 3×3 image

The image above is a zoomed-in view of an image that’s 3 pixels high and 3 pixels wide. This is typically referred to as a 3×3 image (width × height). In this image, every pixel is either black or white – none are gray.

One way to think of this image is as white paper in a dark room, where we can illuminate each pixel with white light. With no illumination, a pixel stays black. Dim white illumination turns it gray, and bright white illumination turns it white. We can quantify the brightness of the illumination as the intensity, where 0% = no light and 100% = bright white light.

This scheme, where increasing intensity produces lighter colors, is known as additive. The image below has been annotated with the additive intensities for each pixel.

3×3 image, with borders between pixels, and annotated with additive intensities for each pixel
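
In code, a 3×3 grayscale image like this can be stored as one intensity per pixel. Here’s a minimal C++ sketch; the particular pattern of black and white values is just for illustration, not necessarily the exact image above.

  #include <cstdio>

  int main() {
      // image[row][column]: 0.0 = 0% (no light, black), 1.0 = 100% (bright white)
      double image[3][3] = {
          {1.0, 0.0, 1.0},
          {0.0, 1.0, 0.0},
          {1.0, 0.0, 1.0},
      };

      // Print each pixel's intensity as a percentage.
      for (int row = 0; row < 3; ++row) {
          for (int col = 0; col < 3; ++col) {
              std::printf("%4.0f%%", image[row][col] * 100.0);
          }
          std::printf("\n");
      }
  }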

Though many interesting images can be created using only black and white pixels that are uniformly square and located on a uniform grid, more realistic detail can be represented when shades of gray are allowed as well.

In the image of Lena’s eye above, the 82% annotation points to a near-white pixel, the 14% annotation to a near-black pixel, and the 40% annotation to a mid-tone pixel. Allowing intensities anywhere between 0% and 100% lets us represent this eye clearly enough that it can still be discerned even when highly zoomed in.

Color

In a grayscale image, the pixels vary in lightness between black and white. Each pixel has a single intensity: the amount of white illumination on an otherwise-black pixel.

Rather than varying the intensity of white illumination, which produces shades of gray between black and white, we could vary the intensity of a colored illumination instead. For example, varying the intensity of red illumination produces shades of red, from black at 0% up to bright red at 100%.

Red-Green-Blue (RGB) images

Full color images are created by combining multiple single-color images. Each of these single-color images is called a separation or a channel. In a full color image, each pixel therefore has multiple intensities – one for each channel – and the pixel’s color is the combination of each channel’s color at that pixel.

Full color Lena eye

The full color eye above is composed of a red channel, green channel, and blue channel, which are shown below.

Red channel from the full color Lena eye. Note that this is different from a red colorized version of the grayscale image.
Green channel from the full color Lena eye
Blue channel from the full color Lena eye

One aspect of this image that’s immediately noticeable is that each channel is relatively dark on average, yet the full color image is relatively light. That happens because the Red-Green-Blue (RGB) scheme is additive – the light from the three channels adds together at each pixel, so as intensity increases, the amount of light increases and the image gets brighter. Three moderately dark channels can still sum to a fairly bright color.

The RGB scheme can be interpreted as the red, green, and blue components of white light illuminating white paper in a dark room. For example, if the blue channel’s intensity is 0% for a pixel, none of the blue component of white light is illuminating that pixel. At 10%, a little bit of the blue component illuminates it; at 100%, all of the blue component does.

A pixel with intensities 50% red, 50% blue, and 0% green would have a purplish color. Can you think of how to create a white, gray, or yellow pixel?
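
Here’s a small C++ sketch of one way to represent such pixels, using intensities between 0.0 (0%) and 1.0 (100%); the RgbPixel struct is just for illustration, and it includes answers to the question above.

  #include <cstdio>

  struct RgbPixel {
      double red;    // 0.0 = no red light, 1.0 = full-intensity red light
      double green;
      double blue;
  };

  int main() {
      // The purplish example from the text, plus answers to the question above.
      const RgbPixel purplish = {0.5, 0.0, 0.5};
      const RgbPixel white    = {1.0, 1.0, 1.0};  // all three components at full intensity
      const RgbPixel gray     = {0.5, 0.5, 0.5};  // equal, partial intensities
      const RgbPixel yellow   = {1.0, 1.0, 0.0};  // red + green light, no blue

      std::printf("purplish: %.0f%% red, %.0f%% green, %.0f%% blue\n",
                  purplish.red * 100, purplish.green * 100, purplish.blue * 100);
  }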

This website shows how different RGB intensity percentages create different colors. The table also includes another component, alpha, which controls the transparency of the color: 0% alpha is completely transparent, while 100% alpha is completely opaque. Transparency allows the background color to mix into the foreground color.
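
As a sketch of how that mixing works (using the common convention where alpha is opacity: 0.0 is fully transparent, 1.0 is fully opaque), a foreground color can be blended over a background with a simple linear mix:

  struct Rgb { double r, g, b; };  // channel intensities in [0, 1]

  // Linear blend of a foreground color over a background color.
  // alpha = 0.0 keeps only the background; alpha = 1.0 keeps only the foreground.
  Rgb blendOver(const Rgb& fg, const Rgb& bg, double alpha) {
      return {
          alpha * fg.r + (1.0 - alpha) * bg.r,
          alpha * fg.g + (1.0 - alpha) * bg.g,
          alpha * fg.b + (1.0 - alpha) * bg.b,
      };
  }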

RGB is a common scheme because computer screens are typically manufactured with elements that emit red, green, and blue light, which combine to create millions of colors. Storing image data as RGB means it translates almost directly to the display: each pixel on the display simply emits red, green, and blue light at the intensities stored in the corresponding image pixel.

Conclusion

Those are the absolute basics of raster still images! A raster image describes a picture as a uniform grid of tiny squares called pixels. Each pixel has a color, which is described by one or more intensities.

Raster images are well suited to complex static imagery, such as photography. Pixels allow for arbitrary variation and irregular shapes, which appear throughout the natural world.

Questions? Leave a comment in the comment section below.

The GoF book defines the Factory Method pattern in terms of an ICreator interface that declares a virtual function, CreateIProduct(), which returns the base class IProduct.  Classes that derive from ICreator implement CreateIProduct() to return a subclass of IProduct.  In other words, a SubclassOfCreator creates a SubclassOfProduct.

Like the Abstract Factory pattern, this enforces object cohesion – the SubclassOfCreator creates only the types that it can work with.  It also allows for extensibility, because the framework is only providing the ICreator and IProduct interfaces.  The client code will derive from ICreator and IProduct, which eliminates the need for client-specific behaviors in the framework’s code.
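
Here’s a minimal C++ sketch of that structure, using the ICreator and IProduct names from above; the Use() method is just for illustration.

  #include <memory>

  // The framework provides only these two interfaces.
  class IProduct {
  public:
      virtual ~IProduct() = default;
      virtual void Use() = 0;  // illustrative product behavior
  };

  class ICreator {
  public:
      virtual ~ICreator() = default;
      // The factory method: each subclass decides which IProduct to create.
      virtual std::unique_ptr<IProduct> CreateIProduct() = 0;
  };

  // Client code supplies the concrete types.
  class SubclassOfProduct : public IProduct {
  public:
      void Use() override { /* product-specific behavior */ }
  };

  class SubclassOfCreator : public ICreator {
  public:
      std::unique_ptr<IProduct> CreateIProduct() override {
          return std::make_unique<SubclassOfProduct>();
      }
  };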

Continue reading

Let’s discuss the Abstract Factory pattern to continue our tour of design patterns in C++.  This is an interesting technique for maintaining coherence between objects while allowing an entire set of objects to be switched or swapped easily.  This can be useful for letting an application use multiple UI toolkits (Apple vs. Windows), allowing different enemies with different capabilities in a game, using different submodels in a complicated simulation, and much more.
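
As a taste of the structure, here’s a minimal C++ sketch built around the UI toolkit example; the interface and class names (IWidgetFactory, IButton, and so on) are made up for illustration.

  #include <memory>

  // Products that need to stay coherent with each other.
  class IButton { public: virtual ~IButton() = default; };
  class IMenu   { public: virtual ~IMenu()   = default; };

  // The abstract factory declares one creation method per product type.
  class IWidgetFactory {
  public:
      virtual ~IWidgetFactory() = default;
      virtual std::unique_ptr<IButton> CreateButton() = 0;
      virtual std::unique_ptr<IMenu>   CreateMenu()   = 0;
  };

  // One concrete factory per object set; a Windows-style factory would mirror this.
  class AppleButton : public IButton {};
  class AppleMenu   : public IMenu {};

  class AppleWidgetFactory : public IWidgetFactory {
  public:
      std::unique_ptr<IButton> CreateButton() override { return std::make_unique<AppleButton>(); }
      std::unique_ptr<IMenu>   CreateMenu()   override { return std::make_unique<AppleMenu>(); }
  };

  // Application code depends only on the interfaces, so swapping the whole
  // widget set just means passing in a different factory.
  void BuildUi(IWidgetFactory& factory) {
      auto button = factory.CreateButton();
      auto menu   = factory.CreateMenu();
      // ... lay out the UI using only the IButton / IMenu interfaces ...
  }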

Continue reading

I’m researching image processing techniques for my new job.  I’m finding lots of things that I never took the time to understand, even when I had encountered them before.  One of them is color maps.  Color maps are ways to convert a set of scalar values into colors.  They can be used to visualize non-visual data, or to enhance visual data.
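
As a minimal sketch of the idea, here’s one possible color map in C++ that linearly interpolates from blue (low values) to red (high values); real color maps such as viridis or jet choose their colors much more carefully.

  #include <algorithm>

  struct Rgb { double r, g, b; };  // intensities in [0, 1]

  // Map a scalar in [minValue, maxValue] to a color between blue and red.
  Rgb blueToRed(double value, double minValue, double maxValue) {
      // Normalize into [0, 1], clamping values outside the range.
      double t = std::clamp((value - minValue) / (maxValue - minValue), 0.0, 1.0);
      return {t, 0.0, 1.0 - t};  // t = 0 -> pure blue, t = 1 -> pure red
  }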

Continue reading

We’ll start the discussion of design patterns with the object creation patterns.  First up is the Singleton pattern.  Conceptually, this is used when you want exactly one instance of an object.  A common example is a logger.  Sometimes an application wants all its components to log data to the same destination, so developers create a Singleton logger, and every component can easily get a handle to its instance and use its API.  But the Singleton pattern has significant drawbacks, and there are usually better ways to handle situations where you want a single instance.
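
For reference, here’s a minimal C++ sketch of one common way to implement it (a function-local static, sometimes called a Meyers singleton); the Logger name and its API are just for illustration.

  #include <iostream>
  #include <string>

  class Logger {
  public:
      // Every caller gets a reference to the same instance,
      // created the first time Instance() is called.
      static Logger& Instance() {
          static Logger instance;
          return instance;
      }

      void Log(const std::string& message) { std::cout << message << '\n'; }

      // Prevent copies, which would defeat the "exactly one instance" goal.
      Logger(const Logger&) = delete;
      Logger& operator=(const Logger&) = delete;

  private:
      Logger() = default;  // only Instance() can construct the Logger
  };

  // Usage from any component: Logger::Instance().Log("something happened");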

Continue reading

This is the first in a series of posts I will write about design patterns.

Design patterns in software development have been heavily influenced by the work of Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, known as the Gang of Four (GoF).  They literally wrote the book on patterns, Design Patterns: Elements of Reusable Object-Oriented Software.  In this book, the authors describe patterns for managing object creation, composing objects into larger structures, and coordinating control flow between objects.  Since publication, other developers have identified and described more patterns in practically every area of software design. Try googling your favorite software topic + “design patterns” to see what kind of patterns other developers have identified and described: android design patterns, embedded design patterns, machine learning design patterns, etc.

It’s very useful to know about the patterns in the abstract, even if you don’t know the details of a particular pattern.  As the authors state, knowing the patterns helps developers identify and use the “right” design faster.  Knowing the patterns provides these benefits to the developer:

  1. They describe common problems that occur in software development.  As a developer, especially a new developer, it’s easy to think of every problem as completely unique to the program that you’re writing.  But very often, a problem seems unique only because of poor modeling or simply a lack of experience.  Knowing the common problems can help a developer model a system in terms of those problems, which often reduces the size, number, and complexity of the problems that remain to be solved.
  2. They are considered “best known methods” for solving typical problems that arise in programming & architecture.  Knowing the “best known method” for solving a problem eliminates much of the thought, effort, and time that a developer would otherwise have to devote to it.
  3. Knowledge of the patterns simplifies & clarifies communication between developers when talking about a particular problem or solution.  When a developer who is familiar with design patterns hears “I used the singleton pattern on the LogFile class,” the developer immediately knows that (if implemented correctly) there will only be one or zero instances of the LogFile class living in the program at one time.

When to use it

It’s pretty easy to describe when to use a pattern – whenever your program contains the exact problem that is solved by one of the patterns.  They can even be used if your program contains a similar problem to that solved by one of the patterns, but in this case, the implementation of the pattern may need to be modified to fit the particulars of your program.

However, it’s not always obvious that your software’s problems can be solved by a GoF pattern.  Sometimes the program is such a mess that it needs to be refactored simply to transform a problem into one that can be solved with a GoF pattern.  Hopefully, by learning about the patterns, you’ll be able to recognize non-obvious applications in your own software.

I’ll cover the patterns by subject, and within a subject I’ll try to cover what I feel are the most broadly applicable patterns first.  Stay updated by following me on RSS, LinkedIn, or Twitter (@avitevet)!

Many problems in real life can be represented as optimization problems that are subject to various constraints.  How far can I go without stopping at the gas station if I expect to drive 60% on the highway and 40% in the city?   What’s the most enjoyment I can get with $10 of chocolate bars, given that I want at least one Butterfinger bar but like Snickers twice as much?  How can I achieve the best GPA given my current grades in the classes, each class’s grading system, and that I only have 2 more days to study for finals?

The simplex method is an algorithm for maximizing a linear function subject to a set of linear constraints.  We’ll start with a non-trivial example that shows why we need a rigorous method to solve this kind of problem, then move on to a simple example that illustrates most of the main parts of the simplex method.  You’ll learn when to use it, you can check out my cheatsheet, and you can review my code on GitHub!
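
To make that concrete, the chocolate bar question can be written as a small linear program. With purely made-up prices ($2.00 per Snickers, $2.50 per Butterfinger), letting s and b be the number of each bar, and ignoring the detail that you can only buy whole bars, it looks something like this:

  \begin{align*}
  \text{maximize}\quad   & 2s + b                   && \text{(enjoyment: Snickers counts double)} \\
  \text{subject to}\quad & 2.00\,s + 2.50\,b \le 10 && \text{(spend at most \$10)} \\
                         & b \ge 1                  && \text{(at least one Butterfinger)} \\
                         & s \ge 0
  \end{align*}

The simplex method walks from one corner of the feasible region defined by these constraints to a better neighboring corner until no further improvement is possible.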

Continue reading

I recently interviewed with a large company where I was asked how I would check that a given word was a valid English word.  Knowing that there are on the order of a few tens of thousands of stems, and that each stem may have many variations, I figured there were maybe a few hundred thousand possible words in the English language.  With an average word length of around 7 characters, there might be a million or so characters in this set.  I concluded that the check could be performed by searching an in-memory dictionary, and proposed either a std::unordered_set or a custom trie, with a performance test to see which would be faster.

I decided to perform this test.
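
For reference, here’s a minimal sketch of the std::unordered_set approach; the word list path is just an example (many Linux systems ship one at /usr/share/dict/words).

  #include <fstream>
  #include <iostream>
  #include <string>
  #include <unordered_set>

  int main() {
      // Load a word list (one word per line) into an in-memory hash set.
      std::unordered_set<std::string> dictionary;
      std::ifstream wordFile("/usr/share/dict/words");
      for (std::string word; std::getline(wordFile, word); ) {
          dictionary.insert(word);
      }

      // Checking a candidate is an average-case O(1) hash lookup.
      const std::string candidate = "raster";
      std::cout << candidate
                << (dictionary.count(candidate) ? " is" : " is not")
                << " a known word\n";
  }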

Continue reading

I recently was asked essentially this question in an interview: given a string S and a set of strings P, can the strings from P be concatenated with repetitions to form S?  Or put another way, can strings from P with repetitions cover S without overlaps?

My first instinct was to use suffix tries and an overlap graph, but the solution that we eventually reached used no sophisticated data structures – just the array of strings and some recursion in a divide and conquer approach.  Here are my extended thoughts on the problem.
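
One way to phrase that recursion in C++ (a sketch of the idea, not necessarily the exact code from the interview) is to peel a matching string off the front of S and recurse on the remainder:

  #include <string>
  #include <vector>

  // Returns true if s can be formed by concatenating strings from parts,
  // with repetitions allowed and no overlaps.
  bool canCover(const std::string& s, const std::vector<std::string>& parts) {
      if (s.empty()) return true;  // nothing left to cover
      for (const auto& p : parts) {
          // If p is a prefix of s, try to cover the rest of s.
          if (!p.empty() && s.compare(0, p.size(), p) == 0 &&
              canCover(s.substr(p.size()), parts)) {
              return true;
          }
      }
      return false;
  }

Memoizing on the start index of the remaining suffix avoids re-solving the same subproblem repeatedly.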

Continue reading