Imagine you have a process that follows a binomial distribution: for
each trial conducted, an event either occurs or it does not; the two
outcomes are referred to as "successes" and "failures". If, by experiment,
you want to measure the frequency with which successes occur, the best
estimate is given simply by k / N,
for k successes out of N trials.
However, our confidence in that estimate will be shaped by how many trials
were conducted, and how many successes were observed. The static member
functions binomial_distribution<>::find_lower_bound_on_p
and binomial_distribution<>::find_upper_bound_on_p
allow you to calculate the confidence intervals for your estimate of
the occurrence frequency.

The sample program binomial_confidence_limits.cpp
illustrates their use. It begins by defining a procedure that will print
a table of confidence limits for various degrees of certainty:

#include <iostream>
#include <iomanip>
#include <boost/math/distributions/binomial.hpp>

void confidence_limits_on_frequency(unsigned trials, unsigned successes)
{
   //
   // trials = Total number of trials.
   // successes = Total number of observed successes.
   //
   // Calculate confidence limits for an observed
   // frequency of occurrence that follows a binomial
   // distribution.
   //
   using namespace std;
   using namespace boost::math;

   // Print out general info:
   cout <<
      "___________________________________________\n"
      "2-Sided Confidence Limits For Success Ratio\n"
      "___________________________________________\n\n";
   cout << setprecision(7);
   cout << setw(40) << left << "Number of Observations" << "= " << trials << "\n";
   cout << setw(40) << left << "Number of successes" << "= " << successes << "\n";
   cout << setw(40) << left << "Sample frequency of occurrence" << "= " << double(successes) / trials << "\n";

The procedure now defines a table of significance levels: these are the
probabilities that the true occurrence frequency lies outside the calculated
interval:
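Such a table might be set up along the following lines (the particular values shown here are an assumption for illustration; the sample program may use a different set):

```cpp
// Significance levels to tabulate: each is the probability that
// the true occurrence frequency lies OUTSIDE the interval we
// calculate for it. (Illustrative values, not necessarily the
// exact set used by binomial_confidence_limits.cpp.)
double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
```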

And now for the important part - the intervals themselves - for each
value of alpha, we call find_lower_bound_on_p
and find_upper_bound_on_p
to obtain lower and upper bounds respectively. Note that since we are
calculating a two-sided interval, we must divide the value of alpha in
two.

Please note that calculating two separate single-sided bounds,
each with risk level α, is not the same thing as calculating a two-sided
interval. Had we calculated two single-sided intervals, each with a risk α
that the true value lies outside it, then:

The risk that it is less than the lower bound is α.

and

The risk that it is greater than the upper bound is also α.

So the risk that it lies outside either the upper or the lower bound
is 2α, and the probability that it lies inside both bounds is therefore
only 1 - 2α, not nearly as high as one might have thought: with α = 0.05,
for example, the coverage would be only 90% rather than 95%. This is why
α/2 must be used in the calculations below.

In contrast, had we been calculating a single-sided interval, for example:
"Calculate a lower bound so that we are P% sure that the
true occurrence frequency is greater than some value"
then we would not have divided by two.

Finally, note that binomial_distribution
provides a choice of two methods for the calculation; we print out the
results from both methods in this example:

As you can see, even at the 95% confidence level the bounds are really
quite wide (this example is chosen to be easily compared to the one in
the NIST/SEMATECH e-Handbook of Statistical Methods).
Note also that the Clopper-Pearson calculation method (CP above) produces
quite noticeably more pessimistic estimates than the Jeffreys Prior method
(JP above).

Now, even when the confidence level is very high, the limits are
quite close to the experimentally calculated value of 0.2. Furthermore,
the difference between the two calculation methods is now quite
small.