
Standard deviation

In probability and statistics, the standard deviation, generally denoted σ ('sigma'), is the most commonly used measure of statistical dispersion, and it is expressed in the same units as the data. It is calculated as the non-negative square root of the variance and is therefore never negative.
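As an illustration of the definition above, the following minimal sketch computes the standard deviation of a whole population as the square root of the variance. The function name `population_std` and the data values are made up for this example.

```python
import math

def population_std(values):
    """Population standard deviation: non-negative square root of the variance."""
    n = len(values)
    mean = sum(values) / n
    # Variance: average squared deviation from the mean.
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)

# Hypothetical measurements in centimetres; the result is also in centimetres.
data_cm = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sigma = population_std(data_cm)
print(sigma)  # prints 2.0
```

Because the variance is an average of squares, it can never be negative, so neither can its square root.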

The standard deviation of a sample, as opposed to a population, is denoted s.
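The text does not give the formula for s. A common convention, assumed here, divides the sum of squared deviations by n − 1 rather than n. Python's standard `statistics` module follows that convention in `stdev`, while `pstdev` divides by n. The sketch below contrasts the two on made-up data.

```python
import statistics

# Hypothetical data, treated first as a sample and then as a full population.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

s = statistics.stdev(values)       # sample standard deviation (divides by n - 1)
sigma = statistics.pstdev(values)  # population standard deviation (divides by n)

print(f"s = {s:.4f}, sigma = {sigma:.4f}")
```

For the same data, s is slightly larger than σ, and the gap shrinks as n grows.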

See also: mean, skewness, kurtosis, raw score, standard score.

Geometric Interpretation of Mean and Standard Deviation

Given a set of numbers $x_1, x_2, x_3$, it is desired to define the mean and standard deviation of these numbers. We will imagine an n-dimensional hypercube in $\mathbb{R}^n$ (here $n = 3$). Let the hypercube be large enough to contain all the numbers, so let it have sides of length $L \ge \max(x_1, x_2, x_3)$. Let the point $A = (x_1, x_2, x_3)$ be a point inside this hypercube. For convenience, we will visualize this by means of a three-dimensional diagram, in which point A is inside a cube.

Now draw the main diagonal of the cube, which passes through the points $O = (0, 0, 0)$ and $M = (L, L, L)$, and call it OM.

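Now find a point $B = (B_0, B_0, B_0)$ on line OM such that line OB and line BA are perpendicular:

$\overrightarrow{OB} \cdot \overrightarrow{BA} = 0$.

Writing $A = (x_1, x_2, x_3)$, this is

$B_0 (x_1 - B_0) + B_0 (x_2 - B_0) + B_0 (x_3 - B_0) = 0$.

Divide both sides by $B_0$ (assuming $B_0 \neq 0$),

$x_1 + x_2 + x_3 - 3 B_0 = 0$,

therefore

$B_0 = \frac{x_1 + x_2 + x_3}{3}$.

Thus the length of OB is

$|OB| = \sqrt{3}\, B_0 = \frac{x_1 + x_2 + x_3}{\sqrt{3}}$.

Then the mean of the numbers is

$\bar{x} = \frac{|OB|}{\sqrt{3}} = \frac{x_1 + x_2 + x_3}{3}$.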
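The three-dimensional construction can be checked numerically. The sketch below takes three made-up numbers and places B on the diagonal with every coordinate equal to (x1 + x2 + x3)/3. It then confirms that OB and BA are perpendicular and recovers the mean as the length of OB divided by √3. The variable names are illustrative only.

```python
import math

# Hypothetical data; A = (x1, x2, x3) is the point inside the cube.
x1, x2, x3 = 2.0, 4.0, 9.0
A = (x1, x2, x3)

# B lies on the main diagonal, so all three coordinates are equal.
b0 = (x1 + x2 + x3) / 3
B = (b0, b0, b0)

OB = B                                      # vector from the origin O to B
BA = tuple(a - b for a, b in zip(A, B))     # vector from B to A

dot = sum(p * q for p, q in zip(OB, BA))    # should be 0 (perpendicular)
length_OB = math.sqrt(sum(p * p for p in OB))
mean = length_OB / math.sqrt(3)

print(f"OB . BA = {dot:.3f}, |OB| = {length_OB:.3f}, mean = {mean:.3f}")
```

Here the mean comes out as 5.000, which agrees with (2 + 4 + 9)/3.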

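This can be easily generalized to a higher dimension n. For any set of numbers $x_1, x_2, \ldots, x_n$, their mean is

$\bar{x} = \frac{|OB|}{\sqrt{n}}$,

where B is a point on line OM such that (recapitulating):

$\overrightarrow{OB} \cdot \overrightarrow{BA} = 0$, with $\overrightarrow{OB} = |OB|\, \hat{u}$,

where

$\hat{u} = \frac{1}{\sqrt{n}} (1, 1, \ldots, 1)$

is the unit vector in the direction of the main diagonal OM, and

$A = (x_1, x_2, \ldots, x_n)$.

This requirement that the dot product of OB and BA be equal to 0 means that lines OB and BA are perpendicular.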
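The general n-dimensional case can be sketched in the same way. The helper `project_onto_diagonal` below is a hypothetical name. It assumes B is obtained by projecting A onto the unit vector along the main diagonal, which is one way to satisfy the perpendicularity requirement.

```python
import math

def project_onto_diagonal(xs):
    """Project A = (x_1, ..., x_n) onto the main diagonal.

    Returns (t, OB, BA), where t is the signed length of OB.
    """
    n = len(xs)
    u = [1.0 / math.sqrt(n)] * n               # unit vector along the diagonal
    t = sum(x * ui for x, ui in zip(xs, u))    # OA . u
    OB = [t * ui for ui in u]
    BA = [x - b for x, b in zip(xs, OB)]
    return t, OB, BA

xs = [1.0, 2.0, 3.0, 4.0, 10.0]                # made-up data
t, OB, BA = project_onto_diagonal(xs)

mean = t / math.sqrt(len(xs))
perp = sum(p * q for p, q in zip(OB, BA))      # approximately 0

print(f"mean = {mean:.3f}, OB . BA = {perp:.3f}")
```

Using the signed length t rather than an absolute length keeps the result correct when the numbers sum to a negative value. That case goes slightly beyond the cube picture in the text.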

The standard deviation can then be defined as

$\sigma = \frac{|BA|}{\sqrt{n}} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$.

In other words, the standard deviation is the (hyperdimensional) distance between the event (point A) and the vector mean of the event (point B, whose coordinates all equal the mean), scaled by $1/\sqrt{n}$. Likewise, the mean is the distance from the origin to the projection of the event onto the main diagonal, scaled by the same factor. The mean thus measures how far along the main diagonal the event lies, and the standard deviation measures how far the event lies from the main diagonal.
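As a rough check of this interpretation, the sketch below computes both quantities purely from distances and compares them with Python's `statistics.mean` and `statistics.pstdev`. The function name `mean_and_std_from_geometry` is made up for this example. The division by √n is the scaling that turns the raw distances into the usual mean and population standard deviation.

```python
import math
import statistics

def mean_and_std_from_geometry(xs):
    """Mean and population standard deviation via the main-diagonal picture."""
    n = len(xs)
    root_n = math.sqrt(n)
    t = sum(xs) / root_n              # signed distance from O to B along the diagonal
    B = [t / root_n] * n              # B has all coordinates equal
    dist_to_diagonal = math.dist(xs, B)   # |BA|, requires Python 3.8+
    return t / root_n, dist_to_diagonal / root_n

xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # made-up data
geo_mean_value, geo_std = mean_and_std_from_geometry(xs)

print(f"geometry:   mean = {geo_mean_value:.4f}, std = {geo_std:.4f}")
print(f"statistics: mean = {statistics.mean(xs):.4f}, std = {statistics.pstdev(xs):.4f}")
```

Both lines should report the same values up to rounding.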