Completeness (statistics)
Suppose a random variable X (which may be a sequence (X1, ..., Xn) of scalar-valued random variables) has a probability distribution belonging to a known family of probability distributions, parametrized by θ, which may be either vector- or scalar-valued. A function g(X) is an unbiased estimator of zero if the expectation E(g(X)) remains zero regardless of the value of the parameter θ. Then X is a complete statistic precisely if it admits no unbiased estimator of zero other than functions that are zero with probability 1 for every value of θ.
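In symbols, the defining condition is usually stated as follows (a standard formulation; g ranges over measurable functions of X):

\[
\operatorname{E}_\theta\bigl[g(X)\bigr] = 0 \ \text{ for all } \theta
\quad\Longrightarrow\quad
P_\theta\bigl(g(X) = 0\bigr) = 1 \ \text{ for all } \theta .
\]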
For example, suppose X1, X2 are independent, identically distributed random variables, normally distributed with expectation θ and variance 1. Then X1 − X2 is an unbiased estimator of zero (a short verification follows this paragraph). Therefore the pair (X1, X2) is not a complete statistic. On the other hand, the sum X1 + X2 can be shown to be a complete statistic. That means that there is no non-zero function g such that

\[
\operatorname{E}\bigl(g(X_1 + X_2)\bigr)
\]

remains zero regardless of changes in the value of θ.
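The first claim follows from linearity of expectation:

\[
\operatorname{E}_\theta(X_1 - X_2) = \operatorname{E}_\theta(X_1) - \operatorname{E}_\theta(X_2) = \theta - \theta = 0
\qquad \text{for every } \theta ,
\]

so X1 − X2 is a non-zero function of (X1, X2) whose expectation is zero for every value of θ.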
The completeness of X1 + X2 may be seen as follows. The probability distribution of X1 + X2 is normal with expectation 2θ and variance 2. Its probability density function is therefore

\[
f_\theta(x) = \frac{1}{2\sqrt{\pi}}\, \exp\!\left(-\frac{(x - 2\theta)^2}{4}\right).
\]
The expectation above would therefore be a constant times

\[
\int_{-\infty}^{\infty} g(x)\, \exp\!\left(-\frac{(x - 2\theta)^2}{4}\right) dx .
\]
A bit of algebra reduces this to

\[
k(\theta) \int_{-\infty}^{\infty} h(x)\, e^{x\theta}\, dx ,
\]

where \( k(\theta) \) is nowhere zero and \( h(x) = g(x)\, e^{-x^2/4} \).
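The algebra in question is a completion of the square in the exponent:

\[
\exp\!\left(-\frac{(x - 2\theta)^2}{4}\right)
= \exp\!\left(-\frac{x^2 - 4\theta x + 4\theta^2}{4}\right)
= e^{-\theta^2}\, e^{-x^2/4}\, e^{x\theta} ,
\]

which gives \( k(\theta) \) proportional to \( e^{-\theta^2} \) (in particular nowhere zero) and \( h(x) = g(x)\, e^{-x^2/4} \).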
As a function of θ this is a two-sided Laplace transform of h(x), and it cannot be identically zero unless h(x) is zero almost everywhere. Since the factor \( e^{-x^2/4} \) is never zero, that in turn can happen only if g(x) is zero almost everywhere.
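The contrast between the two statistics can also be illustrated numerically. The following sketch (Python with NumPy; the non-zero test function g(t) = t³ is an arbitrary choice for illustration) estimates E(X1 − X2), which stays near zero for every θ, and E(g(X1 + X2)), which visibly depends on θ:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000  # Monte Carlo sample size

for theta in (-1.0, 0.0, 2.0):
    # X1, X2 i.i.d. N(theta, 1), as in the example above
    x1 = rng.normal(theta, 1.0, n)
    x2 = rng.normal(theta, 1.0, n)

    # X1 - X2 is a non-zero function of (X1, X2) with expectation 0
    # for every theta, so the pair (X1, X2) is not complete.
    diff_mean = np.mean(x1 - x2)

    # For the sum X1 + X2, the (arbitrarily chosen) non-zero function
    # g(t) = t**3 has an expectation that changes with theta; completeness
    # says no non-zero g can have expectation identically zero in theta.
    cube_mean = np.mean((x1 + x2) ** 3)

    print(f"theta={theta:+.1f}  E[X1-X2]~{diff_mean:+.4f}  E[(X1+X2)^3]~{cube_mean:+.3f}")
```

No finite simulation can prove completeness, of course; the sketch only illustrates that this particular non-zero function of the complete statistic does not have expectation identically zero, whereas X1 − X2 does.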
One reason for the importance of the concept is the Lehmann–Scheffé theorem,
which states that an unbiased estimator that is a function of a complete, sufficient statistic is the best unbiased estimator, i.e., the one whose mean squared error is no larger than that of any other unbiased estimator or, more generally, whose expected loss is no larger for any convex loss function.
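In one common formulation: if T is a complete sufficient statistic for θ and \( \operatorname{E}_\theta\bigl[g(T)\bigr] = \tau(\theta) \) for all θ, then g(T) is the essentially unique uniformly minimum-variance unbiased estimator of τ(θ), so that

\[
\operatorname{Var}_\theta\bigl(g(T)\bigr) \le \operatorname{Var}_\theta(U)
\]

for every θ and every unbiased estimator U of τ(θ) with finite variance.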