The results of the pull experiment are summarized in Tables 1,
2, and 3. They clearly show that the standard
1D method breaks down between
and
(cf. Table 1).
This confirms our suspicion that the use of 1D PDFs is adequate only
for very small correlations. Applying the 1D method to correlated
variables can lead to large inconsistencies.
The PCA method is clearly biased for all values
of
(cf. Table 2).
This was to be expected since the PCA method was designed for
the calculation of likelihood ratios for signal selection over a small
background. Although the PCA method is not adequate here, it performs
better than the 1D method for large correlations
.
Overall, the multi-D method (cf. Table 3) is the
approach with the smallest biases:
-0.022,
-0.055, -0.012 for
,
, and
, respectively.
Although the biases are small
for the multi-D approach, they are real, since the error on the
mean is expected to be
for
a sample size of
. In fact, this reveals the biases inherent in the use
of binned data. The binning effects of the PDFs on the extraction
of the estimators
are a known limitation of the binned maximum
likelihood method, and ensemble tests are a robust way to quantify
the loss of information due to binned data.
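As a rough illustration of the significance argument above, the sketch below converts a mean pull into a number of standard errors. It uses the standard result that the error on the mean of N pulls of unit width is 1/sqrt(N). The function name and the ensemble size in the usage line are hypothetical and are not taken from this study.

```python
import math


def pull_mean_significance(mean_pull, n_experiments, pull_width=1.0):
    """Return the mean pull in units of its expected standard error.

    Assumes the pulls are independent with standard deviation `pull_width`,
    so the error on their mean is pull_width / sqrt(n_experiments).
    """
    err_on_mean = pull_width / math.sqrt(n_experiments)
    return mean_pull / err_on_mean


if __name__ == "__main__":
    # Hypothetical ensemble size; the bias value is one of those quoted above.
    print(pull_mean_significance(-0.055, n_experiments=10_000))
```

With an ensemble of that assumed size, a mean pull of -0.055 would sit several standard errors away from zero, which is the sense in which a numerically small bias can still be significant.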
The binning effects were studied by changing the number of bins used in the
maximum likelihood fits. As the number of bins decreases (increases), the bias
on the pull increases (decreases). When the number of bins
,
the fit is unstable because too few events fall in each bin for the
calculation of the template joint PDFs
.
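A bin-count scan of this kind could be set up along the following lines. This is only a toy sketch under stated assumptions, not the analysis used here: it fits a single signal fraction to one-dimensional data, the shapes (signal pdf 2x, background pdf 2(1 - x) on [0, 1]) are invented for illustration, the templates are filled from a finite Monte Carlo sample, and all function names and default sizes are hypothetical. It is not expected to reproduce the numbers in the tables.

```python
import math
import random
import statistics


def sample(n, f_sig, rng):
    """Draw n events on [0, 1]: signal pdf 2x, background pdf 2(1 - x)."""
    return [math.sqrt(rng.random()) if rng.random() < f_sig
            else 1.0 - math.sqrt(rng.random()) for _ in range(n)]


def histogram(xs, nbins):
    """Count events in nbins equal-width bins on [0, 1]."""
    counts = [0] * nbins
    for x in xs:
        counts[min(int(x * nbins), nbins - 1)] += 1
    return counts


def template(n, f_sig, nbins, rng, floor=1e-9):
    """Binned template PDF from n Monte Carlo events (floored to avoid log(0))."""
    return [max(c / n, floor) for c in histogram(sample(n, f_sig, rng), nbins)]


def fit_fraction(data, sig, bkg):
    """Binned maximum likelihood fit of the signal fraction; returns (f_hat, error)."""
    def nll(f):
        return -sum(n * math.log(f * s + (1 - f) * b)
                    for n, s, b in zip(data, sig, bkg) if n)

    lo, hi = 0.0, 1.0
    for _ in range(60):  # ternary search; the NLL is convex in f
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if nll(m1) < nll(m2):
            hi = m2
        else:
            lo = m1
    f_hat = 0.5 * (lo + hi)
    # Error from the second derivative of the NLL at the minimum.
    curv = sum(n * ((s - b) / (f_hat * s + (1 - f_hat) * b)) ** 2
               for n, s, b in zip(data, sig, bkg))
    return f_hat, (1 / math.sqrt(curv) if curv > 0 else float("inf"))


def pull_scan(bin_counts, n_exp=200, n_data=500, n_templ=5000, f_true=0.3, seed=1):
    """Return {nbins: (mean pull, pull standard deviation)} from toy ensembles."""
    rng = random.Random(seed)
    results = {}
    for nbins in bin_counts:
        sig = template(n_templ, 1.0, nbins, rng)
        bkg = template(n_templ, 0.0, nbins, rng)
        pulls = []
        for _ in range(n_exp):
            data = histogram(sample(n_data, f_true, rng), nbins)
            f_hat, err = fit_fraction(data, sig, bkg)
            if math.isfinite(err):
                pulls.append((f_hat - f_true) / err)
        results[nbins] = (statistics.mean(pulls), statistics.stdev(pulls))
    return results


if __name__ == "__main__":
    for nbins, (mean, width) in pull_scan([2, 5, 10, 50, 200]).items():
        print(f"{nbins:4d} bins: pull mean = {mean:+.3f}, pull width = {width:.3f}")
```

Each bin count uses one pair of templates for the whole ensemble, so a statistical fluctuation in a sparsely populated template bin shows up as a shift of the pull mean rather than averaging out.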
Another feature that can be studied with the ensemble test is the possible
bias of the statistical error returned by the maximum likelihood fitter. In
the study presented here, the sample standard deviation is
, but
in some cases
is not consistent with unity, since the error on the sample
standard deviation is given by
for
. From Tables 1,
2, and 3,
for
and
are
consistent with unity, while
for
is not.
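The consistency check on the pull width can be sketched in the same spirit. The snippet assumes the usual Gaussian approximation, in which the error on the sample standard deviation of N pulls of unit width is about 1/sqrt(2N). The function name, the tolerance of two standard errors, and the numbers in the usage lines are hypothetical choices, not values from this study.

```python
import math


def width_consistent_with_unity(sample_std, n_experiments, n_sigma=2.0):
    """Check whether a pull width agrees with 1 within n_sigma standard errors.

    Uses the Gaussian approximation: error on the sample standard deviation
    of unit-width pulls is 1 / sqrt(2 * n_experiments).
    """
    err_on_std = 1.0 / math.sqrt(2.0 * n_experiments)
    return abs(sample_std - 1.0) <= n_sigma * err_on_std


if __name__ == "__main__":
    # Hypothetical widths and ensemble size, for illustration only.
    print(width_consistent_with_unity(1.01, n_experiments=1000))
    print(width_consistent_with_unity(1.20, n_experiments=1000))
```

A width significantly above unity would indicate that the fitter underestimates the statistical error, and a width significantly below unity that it overestimates it.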
The ensemble test allows the inherent binning effects to be quantified. In a complicated statistical analysis, biases due to binned data must be evaluated and quoted as systematic errors if they are not negligible compared to other uncertainties.