Tuesday, March 29th, 2005
Paul Graham’s “Writing, Briefly”
I liked Paul Graham’s tips on writing. The essay reads like a bullet
list, which makes it a little choppy, but it succeeds in offering many
useful ideas in a form compact enough to take in all at once.
Here are my favorite points:
- write version 1 fast
- rewrite many times
- if you can’t get started, verbally explain your point to a friend
- don’t try to sound impressive
- work in fairly long chunks of time
- use simple words
/web 〆
Tuesday, March 29th, 2005
Image Processing Test Image: The Burger Girl
Here’s a test image I enjoy. Click the image to see an uncompressed 512x512 version (640KB PNG).

/image processing 〆
Sunday, March 27th, 2005
Example MATLAB Code Testing SSIM and CW-SSIM
While learning about structural image quality techniques, I wrote some test code to experiment with them. Since I didn’t have the MATLAB Image Processing Toolbox conveniently available, the code shells out (calls external command-line programs) to ImageMagick in a few places, so watch out for that.
See my test code for the complex wavelet domain structural similarity metric (CW-SSIM). Note that I’m also using Eero Simoncelli’s steerable pyramid tools.
/image processing 〆
Wednesday, March 23rd, 2005
MarsEdit … Tried the Demo Once and I’m Already Loving It
I decided to give MarsEdit from Ranchero Software a try and it was very enjoyable. It took all of 5 minutes to get it working with my Blosxom weblog software. You simply point it towards the folder your weblog files are stored in and it configures the rest very nicely. I like how it lets you compose drafts before publishing them to your blog … just like composing an email before sending it.
The only reason it took 5 minutes instead of 2 is that I wanted the preview mode to use Markdown. As suggested here, the MarsEdit developers have already made it easy to change the default preview mode to Markdown. Simply set up a default by typing the following into a Terminal window (it should all go on one line).
defaults write com.ranchero.MarsEdit previewWithMarkdownAlways YES
/web 〆
Sunday, March 20th, 2005
Looking for new design ideas: my blog is ugly …
I’m on a search for ideas on how to give my website and blog a nice
makeover. Open Source Web Design might be nice: there seems
to be no fee associated with their designs.
Update: here’s a nice implementation of rounded corners without using
images called Nifty Corners. I like it.
/web 〆
Sunday, March 20th, 2005
What makes an image look good?
I gave a presentation on image quality and some related topics
(global and local image phase, steerable pyramid wavelet transforms,
statistical modeling of natural images, and structural image quality).
Some of the most interesting questions resulting from the talk were:
- How should one interpret the diagram from the Phase & Perception of
Blur paper? Specifically, what do the converging lines represent? My
current interpretation is that they are equal-phase contours
corresponding to a well-localized feature point at any scale.
- What is the Gaussian scale mixture (GSM) model? I hope to better
explain and interpret this in an upcoming blog entry.
- How do SSIM and CW-SSIM compare to the latest perceptual error-based
models of image quality (such as ones derived from the Watson paper)? A
specific test could evaluate structural methods with images that are
degraded by only a just-noticeable difference (JND). In other words,
look at errors that are just visible at the threshold of human
perception instead of the gross “suprathreshold” errors that we looked
at before.
/image processing 〆
Wednesday, March 16th, 2005
Papers on Perceptual Image Quality Metrics, Image Phase, Subband Transforms, and Image Statistics
This entry documents the most interesting papers I’ve been reading and studying this quarter. I have sorted them into categories and then chronologically within each category to show the influence that early papers have had on the newer ones.
Image Phase
1975 Kuglin and Hines, “The phase correlation image alignment method”
1979 Oppenheim, Lim, Kopec, and Pohlig, “Phase in speech and pictures”
1980 Hayes, Lim, and Oppenheim, “Signal reconstruction from phase or magnitude”
1999 Thomson, “Visual coding and the phase structure of natural scenes”
2000 Kovesi, “Phase congruency: A low-level image invariant”
2003 Wang and Simoncelli, “Local Phase Coherence and the Perception of Blur”
Subband Transforms: Steerable Pyramids
1991 Freeman and Adelson, “The design and use of steerable filters”
1991 Simoncelli, “Shiftable Multi-scale Transforms”
1995 Simoncelli, “The steerable pyramid: A flexible architecture for multi-scale derivative computation”
2000 Portilla, “A Parametric Texture Model based on Joint Statistics of Complex Wavelet Coefficients”
Statistical Image Modeling
2002 Srivastava, “On advances in statistical modeling of natural images”
2005 Simoncelli, “Statistical Modeling of Photographic Images”
2005 Wang, “Reduced-Reference Image Quality Assessment Using a Wavelet-Domain Natural Image Statistic Model”
Perceptual Image Quality
1998 Watson, “Toward a perceptual video-quality metric”
1998 Eckert, “Perceptual quality metrics applied to still image compression”
2001 Chen and Pappas, “Perceptual Coders and Perceptual Metrics”
2002 Wang, “Why is Image Quality Assessment So Difficult?”
2004 Pappas, “Perceptual Criteria for Image Quality Evaluation”
2004 Wang, “Image Quality Assessment: From Error Visibility to Structural Similarity”
2005 Wang, “Translation Insensitive Image Similarity in Complex Wavelet Domain”
/image processing 〆
Tuesday, March 15th, 2005
Eero Simoncelli’s “Statistical Modeling of Photographic Images”
Main idea:
Out of the huge set of possible images, a particular subset of likely images exists, and these images can be described using a probability model.
Three probability models are discussed:
- The Gaussian Model
  - pros
    - easy computations
    - single parameter
    - direct application to compression and noise removal
  - cons
    - unconstrained phase (can destroy image content)
    - doesn’t capture structure in most real images
- The Wavelet Marginal Model
  - pros
    - captures non-Gaussian histogram characteristics (with peaks at zero and long tails)
    - better fit (reduced entropy) leads to improved compression and noise removal
  - cons
    - important image information is still not captured
    - wavelet coefficients are not independent: their high-order statistics are correlated
- Wavelet Joint Models
  - pros
    - adapts to local variance
    - Gaussian scale mixture (GSM) model is useful
    - gives much improved noise removal results
  - cons
    - still can’t capture all image structure
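To make the GSM idea a little more concrete, here is a minimal Python sketch (standard library only). It assumes the usual definition of a Gaussian scale mixture, x = sqrt(z) * u, where u is Gaussian and z is a positive random multiplier that plays the role of the local variance. The two-state choice of z, the sample count, and the function names are my own illustrative assumptions and are not taken from the paper. The point is only that mixing Gaussians of different variance gives a histogram with a sharper peak and longer tails than a single Gaussian, which shows up as a kurtosis above 3.

```python
import math
import random

def kurtosis(samples):
    """Sample kurtosis (fourth standardized moment); about 3.0 for a Gaussian."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    fourth = sum((s - mean) ** 4 for s in samples) / n
    return fourth / (var ** 2)

def sample_gsm(n, rng):
    """Draw n samples x = sqrt(z) * u with u ~ N(0, 1).

    The multiplier z is picked from {0.25, 4.0} with equal probability.
    This two-state prior is an illustrative assumption only.
    """
    return [math.sqrt(rng.choice((0.25, 4.0))) * rng.gauss(0.0, 1.0)
            for _ in range(n)]

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
gsm = sample_gsm(20000, rng)

print("Gaussian kurtosis:", round(kurtosis(gaussian), 2))
print("GSM kurtosis:     ", round(kurtosis(gsm), 2))
```

For this particular two-state mixture the theoretical kurtosis is 3 * E[z^2] / E[z]^2, which works out to roughly 5.3, so the GSM samples should look clearly heavier-tailed than the plain Gaussian ones.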
/image processing 〆
Tuesday, March 15th, 2005
Feedback and Answers on SSIM
Thanks to my dedicated reader, Steve, for providing feedback on my recent entries on image quality using structural similarity. He had these ideas:
- Start with a low-quality image (such as one that is already blurry) and degrade it more. Are the results still good? Does SSIM measure this further degradation in a reasonable way?
- What happens with an image that is all noise and then gets distorted? There is no structure to start with.
I ran a quick test to check out the first idea. The results follow. Click the thumbnails to view full-sized images. The image on the left is the image that has been blurred once, while the one on the right has been blurred twice.

The additional blurring operation gave an MSE of 9.9 and an MSSIM of 0.975. Qualitatively, this result makes sense: I think we lost much more visual information with the original blur than with this one.
In response to the second question (what if the original image is noise only), I found that the results depend on the type of distortion. Distortion by shifting the mean or stretching the contrast gave results similar to those obtained when using natural images (MSSIM = 0.998 or so).
However, it was interesting to look at the distortion caused by compressing the noise image with JPEG to reach an MSE of 60. To reach that MSE, the JPEG algorithm couldn’t compress the noise image (shown below) very much. I can’t distinguish between the “original” and “degraded” images, so my intuition is that the compressed noise-only image has high image quality. The high MSSIM result of 0.952 agrees with that intuition.

/image processing 〆
Monday, March 14th, 2005
The Importance of Phase in Images
Many papers have suggested that phase information in an image is very important. A report from Alan Oppenheim in 1979 entitled Phase in Speech and Pictures demonstrated that much of the structural information in an image is preserved even when it is represented by phase alone.
He describes an experiment in which an image is decomposed into phase and magnitude parts using a Fourier transform, then the magnitude is set to unity, and an image is reconstructed from the remaining phase information.
The idea is that Fourier phase includes important information about the features and details in an image. The following figures show an original and the phase-only reconstruction of an example image. These were produced by the following MATLAB commands:
% start with a grayscale image stored in variable "im"
im = double(im);                  % work in double precision
im_fourier = fft2(im);
im_phase = angle(im_fourier);     % keep only the Fourier phase
% set the magnitude to unity and invert the transform
im_reconstruct_from_phase = abs(ifft2(exp(1i*im_phase)));
% display original & reconstructed image in separate figures
% (scaled for visibility)
figure; imshow(im,[])
figure; imshow(im_reconstruct_from_phase.^.4,[])

Many of the high-frequency structures have been preserved in the phase-only image. Indeed, the transformation into a phase-only image can be approximately interpreted as a high pass filtering operation.
It turns out that the intelligibility of the phase-only representation depends on the magnitude “smoothness” of the signal being looked at. Since most natural images contain mostly low frequency content, their magnitude rolls off quickly at high frequency and this leads to the situation where the “high pass” interpretation of the phase-only transform holds.
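The “high pass” reading can be checked on a small one-dimensional signal. The sketch below is in Python with only the standard library, using a naive DFT so that it stays self-contained. It divides every Fourier coefficient by its own magnitude, which flattens the spectrum: coefficients that were small (typically the high frequencies) are boosted relative to the large low-frequency ones. The test signal (a single bright bar on a dark background) and all function names are hypothetical choices of mine, not something from the Oppenheim paper.

```python
import cmath
import math

def dft(signal):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT with 1/N normalization."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def phase_only(signal, eps=1e-12):
    """Keep each coefficient's phase and set its magnitude to one."""
    spectrum = dft(signal)
    # Coefficients with (near) zero magnitude have no defined phase; drop them.
    unit = [c / abs(c) if abs(c) > eps else 0j for c in spectrum]
    # For a real input the result is real up to rounding, so keep the real part.
    return [v.real for v in idft(unit)]

# Hypothetical test signal: dark background with one bright bar.
n = 64
signal = [1.0 if 20 <= t <= 40 else 0.0 for t in range(n)]
recon = phase_only(signal)

# List the sample positions with the largest response.
strongest = sorted(range(n), key=lambda t: abs(recon[t]), reverse=True)[:4]
print(sorted(strongest))
```

Because every coefficient of the reconstruction has unit magnitude, its spectrum is flat by construction. I would expect the largest responses to sit near the two edges of the bar rather than in its flat interior, but it is worth printing `recon` and checking for yourself.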
/image processing 〆
Thursday, March 10th, 2005
Overview of Zhou Wang’s “Image Quality Assessment: From Error Visibility to Structural Similarity”
The main idea in this paper (available here) is that human visual perception is built to understand a scene based on its structure, suggesting that structural information is the key component of visual quality. A good way to measure image quality, then, is to quantify the degradation of structure in a distorted image relative to the original.
This is a change in the fundamental assumption from past image quality work. Previous approaches measure perceptual image quality assuming that image intensity is the key component of visual quality. These methods often measure intensity error and then penalize these errors according to visibility.
To get started, let’s go over some definitions of commonly used “image quality” terms and abbreviations.
- image quality: a field of study with goals of quantifying subjective human-perceived visual quality and developing objective measures that accurately predict subjective quality
- subjective image quality: human-perceived visual quality, often measured for a group of test subjects and reported as a mean opinion score (MOS)
- objective image quality: quantitative measures that can accurately predict subjective image quality
- full-reference: the complete undistorted original image is available
- no-reference or blind: only the distorted image is available
- reduced-reference: partial information (extracted features) about the original image is available
- MSE: mean squared error, the average of squared pixel intensity differences
- PSNR: peak signal-to-noise ratio
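As a quick illustration of the last two definitions, here is a small Python sketch of MSE and PSNR for 8-bit images. It uses the standard formula PSNR = 10 * log10(L^2 / MSE), where L is the peak pixel value (255 for 8-bit grayscale). The pixel values and function names are made up for the example, and each image is flattened to a plain list to keep the code dependency-free.

```python
import math

def mse(reference, distorted):
    """Mean squared error between two equally sized lists of pixel values."""
    if len(reference) != len(distorted):
        raise ValueError("images must have the same number of pixels")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(reference, distorted)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)

# Hypothetical 8-bit pixels, flattened to one list per image.
reference = [52, 55, 61, 66, 70, 61, 64, 73]
distorted = [54, 54, 60, 69, 68, 62, 66, 70]
print(mse(reference, distorted), psnr(reference, distorted))
```

Note that PSNR is just a logarithmic rescaling of MSE, so any two images with the same MSE get the same PSNR no matter how different they look.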
Error-Sensitivity Approach
The assumption here is that the perceived distortion is directly related to the error signal. These approaches apply a sequence of steps:
- preprocessing to scale and align the images and account for human color perception
- CSF (contrast sensitivity function) filtering to account for the human spatial and temporal frequency response
- channel decomposition into temporal and spatial subbands
- error normalization according to a perceptual masking model
- error pooling to weight the errors and produce a single quality number
Some common problems with these approaches have been emphasized in this paper, including:
- the quality definition problem: it’s not clear that error visibility corresponds well with image quality
- the supra-threshold problem: most perceptual studies have been
conducted with small errors near the JND (just noticeable difference)
threshold, so the resulting models don’t account for large errors
very well
- the natural image complexity problem: the images used to measure
perceptual thresholds are very simple compared to natural images
- the cognitive interaction problem: foveation (where a person is likely
to look in an image) and cognitive interpretation of the image also lead
to variable image quality perception
Structural Similarity Approach
The goal of the new approach is to “find a more direct way to compare the structures of the reference and the distorted signals.” The assumption is that humans extract structural information from images, not pixel intensities.
An image quality metric based on structural similarity can overcome many of the problems associated with the error-sensitivity method. The SSIM index is one specific implementation of a structural similarity approach — it is not the only possible architecture that uses the structural similarity paradigm, but it is interesting as a first example of structural similarity’s utility.
SSIM: An Example Structural Approach

Algorithm Description
The figure above shows a proposed image quality measurement system
that compares registered images x and y. The similarity
measure SSIM(x,y) is a function of luminance l(x,y),
contrast c(x,y), and structure s(x,y). Also, it is
necessary to include three constants (C1, C2, and C3) to prevent
unstable results when the denominators approach zero.
The average intensity (ux and uy) is used to define the luminance function
l(x,y) = (2*ux*uy + C1) / (ux^2 + uy^2 + C1).
The standard deviation (sx and sy) is used to define the contrast function
c(x,y) = (2*sx*sy + C2) / (sx^2 + sy^2 + C2).
The covariance (sxy), normalized by the standard deviations, is the correlation between the mean-removed signals and is used to represent structural similarity:
s(x,y) = (sxy + C3) / (sx*sy + C3).
Finally, the similarity is computed as a combination of the luminance, contrast, and structure terms in the general form
SSIM(x,y) = l(x,y)^a * c(x,y)^b * s(x,y)^g
where a > 0, b > 0, and g > 0 are parameters that determine the relative weighting of each term.
For the specific implementation in this paper, SSIM is simplified by choosing a = b = g = 1 and C3 = C2/2, giving
SSIM(x,y) = ((2*ux*uy + C1)*(2*sxy + C2)) / ((ux^2 + uy^2 + C1)*(sx^2 + sy^2 + C2)).
Local image statistics are measured in an 11x11 circular-symmetric Gaussian-weighted window around each pixel to generate an SSIM value for each pixel. A few other numbers are needed to fully define the parameters C1 and C2. The dynamic range of the pixels is defined as L (255 for 8-bit grayscale). Then, C1 and C2 are given as functions of L and two small constants K1 << 1 and K2 << 1:
C1 = (K1*L)^2
C2 = (K2*L)^2
In the paper, the authors use these settings: K1 = 0.01; K2 = 0.03. A single number representing overall image quality is computed by averaging the SSIM values over all M local windows to give a mean:
MSSIM(X,Y) = 1/M * sum( SSIM(:) ).
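The formulas above translate almost directly into code. The following Python sketch implements the simplified SSIM (a = b = g = 1, C3 = C2/2) for a single patch and averages it over patches to get MSSIM. One simplification to be aware of: it uses plain unweighted statistics over the whole patch instead of the paper’s sliding 11x11 weighted window, so its numbers will not match the reference MATLAB implementation exactly. The example pixel values and function names are hypothetical.

```python
def ssim_patch(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Simplified SSIM (a = b = g = 1, C3 = C2/2) for two equal-length patches.

    Uses unweighted statistics over the whole patch; the paper weights
    pixels with an 11x11 window, so values will differ somewhat.
    """
    n = len(x)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    ux = sum(x) / n
    uy = sum(y) / n
    # Unbiased variance and covariance estimates (divide by n - 1).
    sx2 = sum((a - ux) ** 2 for a in x) / (n - 1)
    sy2 = sum((b - uy) ** 2 for b in y) / (n - 1)
    sxy = sum((a - ux) * (b - uy) for a, b in zip(x, y)) / (n - 1)
    numerator = (2 * ux * uy + c1) * (2 * sxy + c2)
    denominator = (ux ** 2 + uy ** 2 + c1) * (sx2 + sy2 + c2)
    return numerator / denominator

def mean_ssim(patches_x, patches_y):
    """MSSIM: the average of the per-patch SSIM values."""
    scores = [ssim_patch(px, py) for px, py in zip(patches_x, patches_y)]
    return sum(scores) / len(scores)

# Hypothetical 8-bit patch and two distorted versions of it.
patch = [100, 110, 120, 130, 140, 150, 160, 170, 180]
shifted = [p + 8 for p in patch]       # mean shift, structure untouched
flattened = [140] * len(patch)         # same mean, all structure removed

print(ssim_patch(patch, patch))
print(ssim_patch(patch, shifted))
print(ssim_patch(patch, flattened))
```

The mean-shifted patch keeps its contrast and structure terms at exactly 1 and loses only a little in the luminance term, while the flattened patch keeps the luminance term at 1 and loses nearly everything else. That is the behavior the structural approach is designed to have.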
Test Results
Using the example MATLAB implementation referenced in the paper, I
compared MSSIM with mean-squared error (MSE) for a few images. The
following figure shows the test images I used. Also, there is a
high-resolution version (540kB).

From left-to-right starting across the top row, these images are
1. the original version
2. jpeg-compressed
3. blurred
4. added gaussian white noise
5. mean-shifted
6. contrast-stretched
All of the distorted versions were created to give an equal mean-squared error (MSE) of 60, which demonstrates that MSE does not correlate well with perceived quality. It is clear that the image quality of 2 and 3 is much worse than that of the others. Let’s see if MSSIM works better.
Table 1: Comparing Image Quality Measures
Image #    MSE    MSSIM
1            0    1.000
2           60    0.817
3           60    0.881
4           60    0.638
5           60    0.998
6           60    0.998
Structural similarity accurately predicts the high quality of images 5 and 6, the mean-shifted and contrast-stretched images.
It is interesting to discuss the results from image 4, the one with gaussian white noise added. MSSIM is the lowest for this image, contradicting my expectation that image 4 has a perceptual image quality somewhere between the worst images (2 and 3) and the best images (5 and 6). I wonder why this result didn’t match my expectations …
Anyway, I hope you enjoyed this summary. Please send me suggestions and/or comments.
/image processing 〆
Wednesday, March 2nd, 2005
Where am I: My Latitude and Longitude in GeoURL
GeoURL is an interesting website that implements a location-to-URL
reverse directory that can be used to find URLs by proximity to a given
location.
I used TerraServer to find my location. It turns out I’m at
42.05062 degrees latitude and -87.68261 degrees longitude (western
hemisphere longitudes are negative). Interestingly, my building didn’t
exist in 2002 when this satellite picture was taken; it was just a
parking lot then.
To become a part of the GeoURL database, I added the following <meta>
tags to my website’s <head> section:
<meta name="ICBM" content="42.05062, -87.68261" />
<meta name="DC.title" content="Alan The Dork" />
Then, I told the GeoURL server that my page needs to be indexed by using
the ping form mentioned in step 4 of these instructions.
Now, you can look at the sites near me.
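I don’t know how GeoURL actually computes proximity, but a lookup like this can be sketched with the haversine great-circle distance. The Python below (standard library only) is that sketch: the site list, the example URLs, and the function names are all hypothetical, and the only value taken from this entry is the latitude/longitude pair from my ICBM tag.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest(origin, sites):
    """Sort (url, lat, lon) tuples by distance from origin = (lat, lon)."""
    return sorted(sites, key=lambda s: haversine_km(origin[0], origin[1], s[1], s[2]))

here = (42.05062, -87.68261)  # coordinates from the ICBM tag above
sites = [  # hypothetical directory entries
    ("http://example.com/far", 40.7128, -74.0060),
    ("http://example.org/near", 41.8781, -87.6298),
]

for url, lat, lon in nearest(here, sites):
    print(url, round(haversine_km(here[0], here[1], lat, lon), 1), "km")
```

A real directory with many entries would want a spatial index rather than sorting every site on each query, but for a handful of entries the brute-force sort is fine.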
/web 〆