The Monte Carlo method [3,4] is used to generate samples
of correlated random variables for the three
classes. The marginal distributions for the
random variables
and
for
and
are shown in Figure 1. While the correlation coefficients
and
, the correlation
coefficient
is varied between 0.0 and 0.9 in steps of 0.1.
The 2-D projections for
versus
for
are depicted in Figure 2 for
0.0, 0.3, 0.6
and 0.9.
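As an illustration of how a correlated sample can be generated for a chosen correlation coefficient, the following minimal sketch draws pairs of variables through a Cholesky factorization of the correlation matrix. It assumes unit Gaussian marginals, which need not match the marginal distributions of Figure 1, and the function name `correlated_pair` is hypothetical.

```python
import numpy as np

def correlated_pair(rho, n_events, rng):
    """Draw n_events pairs of unit Gaussians with correlation coefficient rho.

    Assumption: Gaussian marginals, chosen only for illustration.
    """
    corr = np.array([[1.0, rho],
                     [rho, 1.0]])
    # Lower-triangular factor with chol @ chol.T == corr.
    chol = np.linalg.cholesky(corr)
    # Independent unit Gaussians, one row per event.
    z = rng.standard_normal((n_events, 2))
    # Mixing the independent columns introduces the requested correlation.
    return z @ chol.T

rng = np.random.default_rng(12345)
for rho in (0.0, 0.3, 0.6, 0.9):
    sample = correlated_pair(rho, 100_000, rng)
    measured = np.corrcoef(sample, rowvar=False)[0, 1]
    print(f"target rho = {rho:.1f}, sample rho = {measured:.3f}")
```

The printed sample correlation should agree with the target value within the statistical precision of the generated sample.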
Each PDF template is built from
1 million events, since a large number of events is needed to compute
the multi-dimensional PDF.
The ensemble test generates 10,000 experiments (i.e. samples). The size of each
sample is approximately 900 events, with each class contributing a fraction of about 1/3. The total
number of events per experiment is Poisson distributed and the fraction of
events
for class
is randomly
assigned so that
. The
simulated fluctuation on
reflects the statistical uncertainty on
the total number of events for each class.
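The generation of the ensemble can be sketched as follows. This is one possible reading of the procedure, not necessarily the exact one used here: the total number of events is drawn from a Poisson distribution with mean 900, and each total is split among the three classes with a multinomial draw using true fractions of 1/3, so that the per-experiment fractions sum to one. The function name `generate_class_counts` is hypothetical.

```python
import numpy as np

def generate_class_counts(n_experiments, mean_total, true_fractions, rng):
    """Return an (n_experiments, n_classes) array of per-class event counts."""
    true_fractions = np.asarray(true_fractions, dtype=float)
    # Poisson-distributed total number of events for each experiment.
    totals = rng.poisson(mean_total, size=n_experiments)
    # Assumption: each total is split among the classes with a multinomial draw.
    return np.array([rng.multinomial(n, true_fractions) for n in totals])

rng = np.random.default_rng(2024)
counts = generate_class_counts(10_000, 900, [1 / 3, 1 / 3, 1 / 3], rng)

# Per-experiment class fractions; each row sums to one by construction.
fractions = counts / counts.sum(axis=1, keepdims=True)

print("mean events per class:  ", counts.mean(axis=0))
print("mean fraction per class:", fractions.mean(axis=0))
```

A Poisson total followed by a multinomial split is statistically equivalent to drawing each class count from an independent Poisson distribution with mean 300, which is why the per-class counts fluctuate from experiment to experiment.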
The power of an ensemble test is that the law of large numbers and the central limit
theorem ensure that the fitted fraction
and the statistical error
on the fitted fraction for the 10,000 experiments
are Gaussian distributed. The sample mean
should be 1/3 within
the statistical uncertainties and the mean
should be the
expected statistical error on the measurement of the unknown fractions.
The residual is defined as the difference between the fitted fraction and
the true fraction:
.
Here it is essential to use the mean of the Poisson distribution, rather than the generated value,
in the definition of the residual; using the generated value leads to an artificially narrow pull distribution.
The pull is the residual divided by the statistical error on the fitted fraction. According to
the central limit theorem, the pulls must be normally
distributed with zero mean and unit width. Hence,
a pull experiment based on a high-statistics ensemble test allows for a detailed
study of the fitting
procedure for the unknown parameters and their statistical uncertainties.
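The pull computation can be illustrated with a toy example. The sketch below does not reproduce the fit used here: as a labeled assumption, it replaces the fit with perfect class separation, so that the "fitted" yield of one class is simply the generated Poisson count and its error is the square root of that count. The function name `compute_pulls` and the mean yield of 300 are illustrative choices.

```python
import numpy as np

def compute_pulls(fitted, fitted_error, true_value):
    """Pull = (fitted value - true value) / fitted statistical error."""
    fitted = np.asarray(fitted, dtype=float)
    fitted_error = np.asarray(fitted_error, dtype=float)
    return (fitted - true_value) / fitted_error

rng = np.random.default_rng(42)
mean_yield = 300.0  # assumed Poisson mean for one class
generated = rng.poisson(mean_yield, size=10_000)

# Toy stand-in for the fit (assumption): perfect class separation, so the
# fitted yield equals the generated count and its error is sqrt(count).
fitted = generated.astype(float)
fitted_error = np.sqrt(fitted)

# The true value is the Poisson mean, not the per-experiment generated count.
pull = compute_pulls(fitted, fitted_error, mean_yield)

print(f"pull mean  = {pull.mean():+.3f}")
print(f"pull width = {pull.std(ddof=1):.3f}")
```

In this toy, subtracting the generated count instead of the Poisson mean would make every residual exactly zero, an extreme case of the narrowing that occurs when the generated value is used as the true value.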
As an example, the results of the pull experiment are depicted in
Figure 3 for the multi-D method with
.
As noted before, the pull
should be normally distributed since
and
. Deviation from this
behavior
is an indication of a true bias of the fitting method since the
ensemble test relies on a very large number of experiments.
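One simple way to quantify such a deviation, sketched here under the assumption that the pulls are close to Gaussian, is to compare the mean and width of the pull distribution with 0 and 1 using their own statistical errors. The function name `pull_summary` and the artificial shift of 0.1 are hypothetical and serve only as an illustration.

```python
import numpy as np

def pull_summary(pulls):
    """Return (mean, error on mean, width, error on width) of a pull sample."""
    pulls = np.asarray(pulls, dtype=float)
    n = pulls.size
    mean = pulls.mean()
    width = pulls.std(ddof=1)
    mean_err = width / np.sqrt(n)
    # Gaussian approximation for the error on the standard deviation.
    width_err = width / np.sqrt(2.0 * (n - 1))
    return mean, mean_err, width, width_err

rng = np.random.default_rng(7)
unbiased = rng.standard_normal(10_000)  # pulls from an ideal, unbiased fit
biased = unbiased + 0.1                 # hypothetical fit with a small bias

for label, sample in (("unbiased", unbiased), ("biased", biased)):
    mean, mean_err, width, width_err = pull_summary(sample)
    print(f"{label:9s} mean = {mean:+.3f} +/- {mean_err:.3f}, "
          f"width = {width:.3f} +/- {width_err:.3f}")
```

With 10,000 experiments the error on the pull mean is about 0.01, so a shift of 0.1 in the pull mean appears as a deviation of roughly ten standard errors.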