![]() | |
![]() |
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
![]() | |
![]() | |
![]() |
It is well known that the error estimate from any MC calculations (or in
general from a simulation) converges as where N is the
number of random samples or equivalently the computer time expended.
Recently there has been much research into whether non-random sequences could
result in faster convergence, but still have the advantages of MC in being
applicable to high-dimensional problems.
Consider the integral:
The Monte Carlo estimate for this integral is:
where the points are to be uniformly distributed in the
s-dimensional unit cube. An important, and very general, result
that bounds the absolute error of the Monte Carlo estimate is
the Koksma-Hlwaka inequality [25]:
where is the discrepancy of the points
as
defined by Eq. (
), and
V(f) is the total variation of f on
in the sense of
Hardy and Krause [38].
The total variation is roughly the average absolute value of the
derivative of f(x).
Note that Eq. () gives a deterministic
error bound for integration because V(f)
depends only on the nature of the function. Similarly,
the discrepancy of a point set is a purely geometric property of that point
set. When given a numerical quadrature problem, we must cope with whatever
function we are given, it is really only the points,
, that we control.
Thus one approach to efficient integration is to seek
point sets with small discrepancies.
Such sets are necessarily not random
but are instead referred to as quasi-random numbers (QRNs).
The s dimensional discrepancy of the N points,
is defined as:
Here B is any rectangular subvolume inside the s-dimensional unit cube
with sides parallel to the axes and volume .
The function
counts the number of
points of
in B.
Hence a set of points with a low discrepancy covers the unit cube more uniformly
than one with a large discrepancy.
A lattice has a very low discrepancy, however adding additional points to the
lattice is slow in high dimensions.
The discrepancy of a lattice does not decrease smoothly with increasing N.
A remarkable fact is that there is a lower (Roth) bound to the
its discrepancy [39]:
This gives us a target to aim at for the construction of low-discrepancy point
sets (quasi-random numbers). For comparison, the estimated error with
random sequences is:
The quantity in brackets, the variance, only depends on f.
Hence the standard deviation of the Monte Carlo estimate is
. This is much worse than the bounds of Eq. (
) as a function of the
number of points.
It is this fact that has
motivated the search for quasi-random points.
What goes wrong with this argument as far as most high-dimensional
problems in physical science is that the Koksma-Hlwaka bound is
extremely poor for large s. Typical integrands become more and more
singular the more one differentiates. The worse case is the
Metropolis rejection step: the acceptance is itself
discontinuous. Even assuming V(f) were finite, it is likely to be so
large (for large s) that truly astronomical values of N would be
required for the bound in Eq. () to be less than the MC
convergence rate given in Eq. (
).
We will now present a very brief description of quasi-random
sequences. Those interested in a much more detailed review of the
subject are encouraged to consult the recent work of Niederreiter [25]. An example of a one-dimensional set of quasi-random
numbers is the van der Corput sequence. First we choose a base, b,
and write an integer n in base b as . Then we define the van der Corput sequence as
. For base b=3, the first
12 terms of the van der Corput sequence are:
One sees intuitively how this
sequence, while not behaving in a random fashion, fills in all the
holes in a regular and low-discrepancy way.
There are many other low-discrepancy point sets and sequences. Some, like the Halton sequence [40] use the van der Corput sequence. There have been many others which are thought to be more general and have provably smaller discrepancy. Of particular note are the explicit constructions of Faure [41] and Niederreiter [42].
The use of quasi-random numbers in quadrature has not been widespread
because the claims of superiority of quasi-random over pseudo-random
for quadrature have not been shown empirically especially for high dimensional
integrals and never in MCMC simulations [43].
QRNs work best in low dimensional spaces
where the spacing between the points can be made small with respect to
the curvature of the integrand. QRNs work well for very smooth integrands.
The Boltzmann distribution is highly peaked for R a
many dimensional point. Empirical studies have shown that
the number of integration points needed for QRN error to be less than the MC
error increases very rapidly with the dimensionality.
Until the asymptotic region is reached, the error of QRN is not very different from
simple MC. Also there are many MC tricks which can be used to reduce the
variance such as importance sampling, antithetic sampling and so forth.
Finally MC has a very robust way of estimating errors; that is one of its
main advantages as a method. Integrations with QRNs have to rely on
empirical error estimates since the rigorous upper bound is exceedingly large.
It may happen that the integrand in some way correlates with the QRN sequence
so that these error estimates are completely wrong, just as sometimes happens
with other deterministic integration methods.
There seem to be some advantages of quasi-random over pseudo-random for simple smooth integrands that are effectively low dimensional (say less than 15 dimensions), but they are much smaller than one is lead to expect from the mathematical results.
The National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
ashoks@ncsa.uiuc.edu
Last modified: September 16, 1997