Class Session 04: Errors and fluctuations

Errors and Central limit theorem


Consider a direct sampling simulation, the outcome is usually a sequence of values

which are different realizations of a random variable ξ , of distribution π(ξ). We assume that these values are independent and identically distributed (i.i.d.).

Using these N realisations, the best estimation of the mean

of the distribution π(ξ) of ξ is

Can we determine the statistical error associated with this estimation?

The mean XN being the average of N i.i.d. random variables, the central limit theorem should apply and the result writes:


Test this prediction with the following example: Sample N random numbers

taken from a uniform distribution on [0,1] and give an estimation of the average with its associated error.

Quality Control

We consider the direct sampling of the distribution

  • How can you sample this distribution?
  • Compute the average of ξ analytically.
  • Estimate the error using the previous method. Compare your results for different values of α. (First take α=3 and then try α=1.2).
import math, numpy, random
Ntime = 500000
random.seed(2)
alpha = 3.
xi = []
for itime in range(Ntime):
    xi.append('think a little bit')
xerror = numpy.std(numpy.array(xi))/math.sqrt(Ntime)
xm = numpy.mean(numpy.array(xi))
print 'alpha = ',alpha
print 'exact mean', alpha / (alpha - 1.)
print 'simulation', xm,'+/-', xerror
To understand the difference between the different values of α let us introduce the "Quality control" method, which consists in plotting the running average defined as

import pylab, math, numpy, random
Ntime = 10000
random.seed(2)
alpha = 3
xmeanexact = alpha / (alpha - 1.)
xicontrol = []
xtot = 0.
for itime in range(Ntime):
    xi = 'think a little bit'
    xtot += xi
    xicontrol.append( xtot / float(itime + 1))
pylab.title('Quality Control, alpha = 3')
pylab.xlabel('time')
pylab.ylabel('running average of xi')
time = [t for t in range(Ntime)]
meanexact = [ xmeanexact for t in range(Ntime)]
pylab.plot(xicontrol, 'r-')
pylab.plot(time, meanexact, 'k-')
pylab.axis([0, Ntime, xmeanexact * 0.9, xmeanexact * 1.1])
pylab.show()
  • Run this program for Ntime=10000, 50000, 250000 and compare your results for different values of α.
  • Compute analytically the variance of ξ as a function of α. Comment.

Stable distribution


Take Ntime=10,100,1000... Study the distribution of the variable

as N becomes large. Show that after a proper rescaling of this variable, the distribution converge (for large N) to a unique curve. What is this limit distribution?
 import pylab, math, numpy, random
 Nsample = 20000
 Ntime = 100
 random.seed(2)
 alpha = 3
 xmean = alpha/(alpha-1.)
 Nfactor = float(Ntime)**0.5   #(1. - 1./alpha)
 xsum = [0. for n in range(Nsample)]
 for isample in range(Nsample):
     xi = [0. for n in range(Ntime)]
     for itime in range(Ntime):
         xi[itime] = random.random()**(-1./alpha)
     xsum[isample]=(sum(xi)/float(Ntime) - xmean) * Nfactor
 pylab.hist(xsum, bins=30,range=(-10,10),normed=True)
 pylab.title('Example 3, Rescaled Distributions, alpha = 3')
 pylab.xlabel('distribution')
 pylab.ylabel('x')
 pylab.show()
[Print this page]