Interquartile mean
The
interquartile mean (IQM) is a
statistical
measure of central tendency, much like
the
mean (in more popular terms called the
average), the
median,
and the
mode.
The IQM is a truncated mean and so is very similar to the scoring method used in sports that are evaluated by a panel of judges: discard the
lowest and the highest scores; calculate the mean value of the remaining
scores.
In calculation of the IQM, the lowest 25% and the highest 25% of the
scores are discarded. These points are called the first and third
quartiles, hence the name of the IQM. (Note that the
second quartile is also called the median). The method is best
explained with an example:
Consider the following dataset:
- 5, 8, 4, 38, 8, 6, 9, 7, 7, 3, 1, 6
First sort the list from lowest-to-highest:
- 1, 3, 4, 5, 6, 6, 7, 7, 8, 8, 9, 38
There are 12 observations (datapoints) in the dataset, thus we have 4 quartiles of 3 numbers. Discard the lowest and the highest 3 values:
1, 3, 4, 5, 6, 6, 7, 7, 8, 8, 9, 38
We now have 6 of the 12 observations remaining; next, we calculate the average of the
mean of these numbers:
- xIQM = (5 + 6 + 6 + 7 + 7 + 8) / 6 = 6.5
The Interquartile Mean shares some properties from both the
mean as well as
the
median:
- Like the median, the IQM insensitive to outliers; in the example given, the highest value (38) was an obvious outlier of the dataset, but its value is not used in the calculation of the IQM. On the other hand, the common average (the arithmetic mean) is sensitive to these outliers: xmean = 8.5.
- Like the mean, the IQM is a discrete parameter, based on a large number of observations from the dataset. The median is always equal to one of the observations in the dataset (assuming an odd number of observations). The mean can be equal to any value between the lowest and highest observation, depending on the value of all the other observations. The IQM can be equal to any value between the first and third quartiles, depending on all the observations in the interquartile range.
The above example consisted of 12 observations in the dataset, which
made the determination of the quartiles very easy. Of course, not all
datasets have a number of observations that is divisible by 4. We can
adjust the method of calculating the IQM to accommodate this. Ideally we
want to have the IQM equal to the
mean for symmetric distributions,
e.g.:
- 1, 2, 3, 4, 5
has a mean value x
mean = 3, and since it is a symmetric
distribution, x
IQM = 3 would be desired.
We can solve this by using a weighted average of the quartiles and
the interquartile dataset:
Consider the following dataset of 9 observations:
- 1, 3, 5, 7, 9, 11, 13, 15, 17
There are 9/4 = 2.25 observations in each quartile, and 4.5 observations
in the interquartile range. Truncate the fractional quartile size, and
remove this number from the 1st and 3rd quartiles (2.25 observations in
each quartile, thus the lowest 2 and the highest 2 are removed).
1, 3, (5), 7, 9, 11, (13), 15, 17
Thus, there are 3
full observations in the interquartile range,
and 2 fractional observations. Since we have a total of 4.5 observations
in the interquartile range, the two fractional observations each count
for 0.75 (and thus 3x1 + 2x0.75 = 4.5 observations).
The IQM is now calculated as follows:
- xIQM = {(7 + 9 + 11) + 0.75 x (5 + 13)} / 4.5 = 9
In the above example, the mean has a value x
mean = 9. The
same as the IQM, as was expected. The method of calculating the IQM for
any number of observations is analogous; the fractional
contributions to the IQM can be either 0, 0.25, 0.50, or 0.75.