Recall that the DFT of a sequence x_j of length n is defined by the formula

    X_k = Σ_{j=0}^{n-1} x_j e^{-2πi jk/n},    k = 0, ..., n-1.
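For concreteness, the definition above can be evaluated directly in O(n^2) time; a minimal sketch in Python (the function name is illustrative):

```python
import cmath

def naive_dft(x):
    """Directly evaluate X_k = sum_j x_j * exp(-2*pi*i*j*k/n); O(n^2)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]
```

This direct evaluation serves as the correctness reference against which any fast algorithm, Rader's included, can be checked.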
The final summation, above, is precisely a cyclic convolution of the two sequences a_q and b_q of length n-1 (q = 0, ..., n-2) defined by:

    a_q = x_{g^q}
    b_q = e^{-2πi g^{-q}/n}

where g is a primitive root modulo n.
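Under the standard Rader construction (a_q = x_{g^q} and b_q = e^{-2πi g^{-q}/n}, with g a primitive root modulo n, supplied by the caller), the permute-then-convolve structure can be sketched as follows; the cyclic convolution is evaluated naively here purely to show the indexing:

```python
import cmath

def rader_dft(x, g):
    """DFT of prime length n via a cyclic convolution of length n-1.
    g must be a primitive root modulo n (assumed, not checked)."""
    n = len(x)
    m = n - 1
    gpow = [pow(g, q, n) for q in range(m)]       # g^q mod n
    ginv = [pow(g, m - q, n) for q in range(m)]   # g^{-q} mod n, since g^{n-1} = 1
    a = [x[gpow[q]] for q in range(m)]            # a_q = x_{g^q}
    b = [cmath.exp(-2j * cmath.pi * ginv[q] / n)  # b_q = e^{-2 pi i g^{-q} / n}
         for q in range(m)]
    X = [sum(x)] + [0] * m                        # X_0 is the plain sum of inputs
    for p in range(m):
        # cyclic convolution (a * b)_p, indices taken mod n-1
        conv = sum(a[q] * b[(p - q) % m] for q in range(m))
        X[ginv[p]] = x[0] + conv                  # X_{g^{-p}} = x_0 + (a * b)_p
    return X
```

For example, with n = 7 one can take g = 3; the result agrees with the direct DFT definition.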
This algorithm, then, requires O(n) additions plus O(n log n) time for the convolution. In practice, the O(n) additions can often be absorbed into the convolution: if the convolution is performed by a pair of FFTs, then the sum of the x_k is given by the DC (0th) output of the FFT of a_q, and x_0 can be added to all of the convolution's outputs by adding it to the DC term of the product spectrum prior to the inverse FFT. Still, this algorithm intrinsically requires more operations than FFTs of nearby composite sizes, and in practice typically takes 3-10 times as long.
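Both DC-term tricks can be sketched as follows, using a plain O(m^2) DFT as a stand-in for the FFTs (function names are illustrative). One normalization caveat: with an inverse transform that carries the 1/m factor, the x_0 added to the DC bin must be pre-scaled by m:

```python
import cmath

def dft(v, inverse=False):
    """O(m^2) DFT stand-in for an FFT; the inverse includes the 1/m factor."""
    m = len(v)
    s = 2j if inverse else -2j
    out = [sum(v[j] * cmath.exp(s * cmath.pi * j * k / m) for j in range(m))
           for k in range(m)]
    return [o / m for o in out] if inverse else out

def rader_convolution_tricks(x, g):
    """Prime-length DFT in which the O(n) extra additions are folded
    into the convolution's transforms, as described in the text."""
    n = len(x)
    m = n - 1
    a = [x[pow(g, q, n)] for q in range(m)]                # a_q = x_{g^q}
    b = [cmath.exp(-2j * cmath.pi * pow(g, m - q, n) / n)  # b_q = e^{-2 pi i g^{-q}/n}
         for q in range(m)]
    A, B = dft(a), dft(b)
    X = [0] * n
    # Trick 1: the DC output of the transform of a is sum(a) = sum(x) - x_0,
    # so X_0 = x_0 + A[0] with no separate summation pass.
    X[0] = x[0] + A[0]
    prod = [A[k] * B[k] for k in range(m)]
    # Trick 2: adding a constant to the DC bin before the inverse transform
    # adds it to every output; with this 1/m-normalized inverse, the constant
    # must be pre-scaled by m.
    prod[0] += x[0] * m
    conv = dft(prod, inverse=True)
    for p in range(m):
        X[pow(g, m - p, n)] = conv[p]                      # X_{g^{-p}}
    return X
```

For n = 5 one can take g = 2, and the outputs match the direct DFT.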
If Rader's algorithm is performed by using FFTs of size n-1 to compute the convolution, rather than by zero padding as mentioned above, the efficiency depends strongly upon n and the number of times that Rader's algorithm must be applied recursively. The worst case would be if n-1 were 2n_2 where n_2 is prime, with n_2 - 1 = 2n_3 where n_3 is prime, and so on. In such cases, supposing that the chain of primes extended all the way down to some bounded value, the recursive application of Rader's algorithm would actually require O(n^2) time. Such n_j are called Sophie Germain primes, and such a sequence of them is called a Cunningham chain. The lengths of Cunningham chains, however, are observed to grow more slowly than log_2(n), so Rader's algorithm applied in this way is probably not Ω(n^2), though it is possibly worse than O(n log n) for the worst cases. Fortunately, a guarantee of O(n log n) complexity can be achieved by zero padding.
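The zero-padding route can be sketched as follows: the length-(n-1) cyclic convolution is embedded in a linear convolution of power-of-two length at least 2(n-1)-1, computed with a radix-2 FFT (so no further recursion on awkward sizes is needed), and then wrapped back to cyclic form. A minimal sketch, with illustrative function names:

```python
import cmath

def fft_pow2(x, inverse=False):
    """Radix-2 Cooley-Tukey FFT; len(x) must be a power of two.
    The inverse here is unnormalized (caller divides by the length)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_pow2(x[0::2], inverse)
    odd = fft_pow2(x[1::2], inverse)
    sign = 2j if inverse else -2j
    out = [0] * n
    for k in range(n // 2):
        t = cmath.exp(sign * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def cyclic_convolve_padded(a, b):
    """Cyclic convolution of length m via zero padding to a power of two
    >= 2m - 1, guaranteeing O(m log m) work for any m."""
    m = len(a)
    size = 1
    while size < 2 * m - 1:
        size *= 2
    fa = fft_pow2(a + [0] * (size - m))
    fb = fft_pow2(b + [0] * (size - m))
    lin = [v / size for v in fft_pow2([p * q for p, q in zip(fa, fb)],
                                      inverse=True)]
    # Wrap the linear convolution back to cyclic length m.
    out = [0] * m
    for i, v in enumerate(lin[:2 * m - 1]):
        out[i % m] += v
    return out
```

Because the padded length is a power of two, this inner transform never hits another "prime minus one" size, which is what breaks the worst-case chain described above.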
Since Rader's algorithm only depends upon the periodicity of the DFT kernel, it is directly applicable to any other transform (of prime order) with a similar property, such as a number-theoretic transform or the discrete Hartley transform.