"A machine is set to pack 3 kg of ground beef per pack. Over a
long period of time it is found that the average packed was 3 kg with
a standard deviation of 0.1 kg. Assume the packing is normally distributed."

We start by constructing a normal distribution, called packs in the
code that follows, with the given parameters: a mean of 3 kg and a
standard_deviation of 0.1 kg.

We might want to ensure that 95% of packs are over a minimum weight
specification of 2.9 kg. We then want the value of the mean such that
P(X < 2.9) = 0.05.

Using the mean of 3 kg, we can estimate the fraction of packs that
meet the specification of 2.9 kg.

double minimum_weight = 2.9; // kg
cout << "Fraction of packs >= " << minimum_weight << " with a mean of " << mean
  << " is " << cdf(complement(packs, minimum_weight)) << endl;
// Fraction of packs >= 2.9 with a mean of 3 is 0.841345

This is 0.84, less than the target fraction of 0.95. If we want 95%
to be over the minimum weight, what should we set the mean weight to
be?

Using the KK StatCalc program supplied with the book and the method
given on page 126 gives 3.06449.
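The figure of 3.06449 can be cross-checked without any statistics library. The following is a minimal sketch using only the C++ standard library; the helper names (std_normal_cdf, std_normal_quantile, fraction_at_or_above, required_mean) are hypothetical and are not part of the Math Toolkit, and the quantile is found by simple bisection rather than by the library's own method.

```cpp
#include <cassert>
#include <cmath>

// Standard normal CDF, written with the complementary error function.
double std_normal_cdf(double z) {
    return 0.5 * std::erfc(-z / std::sqrt(2.0));
}

// Standard normal quantile found by bisection on the CDF
// (a simple stand-in for a library quantile function).
double std_normal_quantile(double p) {
    assert(p > 0.0 && p < 1.0);
    double lo = -10.0, hi = 10.0;
    for (int i = 0; i < 200; ++i) {
        double mid = 0.5 * (lo + hi);
        if (std_normal_cdf(mid) < p) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

// Fraction of packs at or above `limit` for a normal(mean, sd) packer.
double fraction_at_or_above(double limit, double mean, double sd) {
    assert(sd > 0.0);
    return 0.5 * std::erfc((limit - mean) / (sd * std::sqrt(2.0)));
}

// Mean that puts a fraction `under_fraction` of packs below `limit`.
double required_mean(double limit, double under_fraction, double sd) {
    return limit - sd * std_normal_quantile(under_fraction);
}
```

Under these assumptions, required_mean(2.9, 0.05, 0.1) should agree with the quoted 3.06449 to the digits shown, and fraction_at_or_above(2.9, 3.0, 0.1) with 0.841345.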

We can confirm this by constructing a new distribution which we call
'xpacks' with a safety margin mean of 3.06449 thus:

double over_mean = 3.06449;
normal xpacks(over_mean, standard_deviation);
cout << "Fraction of packs >= " << minimum_weight << " with a mean of " << xpacks.mean()
  << " is " << cdf(complement(xpacks, minimum_weight)) << endl;
// Fraction of packs >= 2.9 with a mean of 3.06449 is 0.950005

Using this Math Toolkit, we can calculate the required mean directly
thus:

double under_fraction = 0.05; // so 95% are above the minimum weight mean - sd = 2.9
double low_limit = standard_deviation;
double offset = mean - low_limit - quantile(packs, under_fraction);
double nominal_mean = mean + offset;
// mean + (mean - low_limit - quantile(packs, under_fraction));
normal nominal_packs(nominal_mean, standard_deviation);
cout << "Setting the packer to " << nominal_mean << " will mean that "
  << "fraction of packs >= " << minimum_weight
  << " is " << cdf(complement(nominal_packs, minimum_weight)) << endl;
// Setting the packer to 3.06449 will mean that fraction of packs >= 2.9 is 0.95

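This calculation is generalized as the free function called find_location.
Constructing a new normal distribution, good_packs, from the mean it
returns gives the same confirmation as before: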

cout << "Setting the packer to " << nominal_mean << " will mean that "
  << "fraction of packs >= " << minimum_weight
  << " is " << cdf(complement(good_packs, minimum_weight)) << endl;
// Setting the packer to 3.06449 will mean that fraction of packs >= 2.9 is 0.95

After examining the weight distribution of a large number of packs,
we might decide that, after all, the assumption of a normal distribution
is not really justified. We might find that the fit is better to a
Cauchy Distribution. This distribution has wider 'wings': although
most of the values lie closer to the center than they would for the
normal, there are also more values that lie far from the center than
the normal would predict.

This might happen because a larger than normal lump of meat is either
included or excluded.

We first create a Cauchy Distribution, using the original mean as its
location and the original standard deviation as its scale, and estimate
the fraction that lie above our minimum weight specification.

cauchy cpacks(mean, standard_deviation);
cout << "Cauchy Setting the packer to " << mean << " will mean that "
  << "fraction of packs >= " << minimum_weight
  << " is " << cdf(complement(cpacks, minimum_weight)) << endl;
// Cauchy Setting the packer to 3 will mean that fraction of packs >= 2.9 is 0.75

Note that fewer of the packs meet the specification: only 75%, compared
with 84% for the normal distribution and the target of 95%. Now we can
repeat the find_location, using the cauchy distribution as template
parameter, in place of the normal used above (the result is called lc
in the code below).
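For the Cauchy case the arithmetic can be checked by hand, because both the cumulative distribution and the quantile have closed forms. The sketch below uses only the C++ standard library; the function names are hypothetical, and the rule "location = limit - scale * standard quantile" is our assumption about what any location finder must compute for a location-scale family, not a copy of the library's source.

```cpp
#include <cassert>
#include <cmath>

const double kPi = std::acos(-1.0);

// Fraction of a Cauchy(location, scale) population at or above `limit`.
double cauchy_fraction_at_or_above(double limit, double location, double scale) {
    assert(scale > 0.0);
    return 0.5 - std::atan((limit - location) / scale) / kPi;
}

// Quantile of the standard Cauchy distribution (location 0, scale 1).
double std_cauchy_quantile(double p) {
    assert(p > 0.0 && p < 1.0);
    return std::tan(kPi * (p - 0.5));
}

// Location that puts a fraction `under_fraction` of packs below `limit`.
double cauchy_required_location(double limit, double under_fraction, double scale) {
    return limit - scale * std_cauchy_quantile(under_fraction);
}
```

With a limit of 2.9, a scale of 0.1 and an under-fraction of 0.05, this should reproduce both the 0.75 found for a location of 3 and a required location close to 3.53138.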

Note that the safe_mean setting needs to be much higher, 3.53138 instead
of 3.06449, so we will make rather less profit.

And again confirm that the fraction meeting specification is as expected.

cauchy goodcpacks(lc, standard_deviation);
cout << "Cauchy Setting the packer to " << lc << " will mean that "
  << "fraction of packs >= " << minimum_weight
  << " is " << cdf(complement(goodcpacks, minimum_weight)) << endl;
// Cauchy Setting the packer to 3.53138 will mean that fraction of packs >= 2.9 is 0.95

Finally we could estimate the effect of a much tighter specification,
that 99% of packs meet the specification. The figures quoted here are
those for the normal distribution.

double safe_mean_99 = find_location<normal>(minimum_weight, 0.01, standard_deviation);
normal packs99(safe_mean_99, standard_deviation);
cout << "Setting the packer to " << safe_mean_99 << " will mean that "
  << "fraction of packs >= " << minimum_weight
  << " is " << cdf(complement(packs99, minimum_weight)) << endl;

Setting the packer to 3.13263 will mean that the fraction of packs >=
2.9 is 0.99, but will more than double the mean loss from 0.0644 to
0.133 kg per pack.
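The 'more than double' comparison can be reproduced with a few lines of arithmetic. The quoted 3.13263 corresponds to the normal model; the Cauchy figure is added here only for contrast and is our own calculation. This is a sketch using the C++ standard library, with hypothetical function names and with the standard normal quantiles hard-coded from standard tables rather than computed.

```cpp
#include <cassert>
#include <cmath>

// Standard normal quantiles for 95% and 99%, taken from standard tables
// (hard-coded so that the sketch needs no statistics library).
const double kZ95 = 1.6448536270;
const double kZ99 = 2.3263478740;

// Overfill margin (required mean minus the limit) for a normal packer,
// given the standard normal quantile `z` of the fraction that must pass.
double normal_margin(double z, double sd) {
    assert(sd > 0.0);
    return z * sd;
}

// The same margin for a Cauchy model with the given scale.
double cauchy_margin(double pass_fraction, double scale) {
    assert(pass_fraction > 0.5 && pass_fraction < 1.0 && scale > 0.0);
    const double pi = std::acos(-1.0);
    return scale * std::tan(pi * (pass_fraction - 0.5));
}
```

Adding normal_margin(kZ99, 0.1) to the 2.9 kg limit gives about 3.1326 kg, so the mean loss relative to a 3 kg nominal pack rises from about 0.064 kg to about 0.133 kg. Under a Cauchy model with the same scale, cauchy_margin(0.99, 0.1) is roughly 3.18 kg, which shows how costly heavy tails become at tight specifications.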

Of course, this calculation is not limited to packs of meat. It applies
to dispensing anything, and it also applies to a 'virtual' material
like any measurement.

The only caveat is that the calculation assumes that the standard deviation
(scale) is known with a reasonably low uncertainty, something that
is not so easy to ensure in practice, and that the distribution is
well defined: a Normal Distribution, a Cauchy Distribution, or some
other.

If one is simply dispensing a very large number of packs, then it may
be feasible to measure the weight of hundreds or thousands of packs.
With a healthy number of 'degrees of freedom', the confidence intervals
for the standard deviation are not too wide, typically about ±10% for
hundreds of observations.
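The 'about ±10%' figure can be sanity-checked with a common large-sample approximation, in which the relative standard error of a sample standard deviation from n normal observations is about 1/sqrt(2(n-1)). This is an assumption on our part, not the method used in the text: the exact interval is based on the chi-squared distribution and is not symmetric. The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Approximate relative half-width of a 95% confidence interval for a
// standard deviation estimated from n normally distributed observations.
// Uses the large-sample approximation se(s)/s ~ 1/sqrt(2(n-1)).
double sd_relative_half_width_95(int n) {
    assert(n > 1);
    const double z975 = 1.959964;  // two-sided 95% standard normal quantile
    return z975 / std::sqrt(2.0 * (n - 1));
}
```

For n = 200 this gives a half-width just under 10%, while for n = 10 it exceeds 40%, where the approximation itself is already rough.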

For other applications, where it is more difficult or expensive to
make many observations, the confidence intervals are depressingly wide.

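An alternative to over-filling is to keep the mean at 3 kg and use a
packer with a smaller standard deviation, which we can first estimate
by guessing. A standard deviation of 0.05 kg is quite a good guess,
but the 5% quantile is then a little over the 2.9 kg target, so the
standard deviation could be a tiny bit more. So we could do some more
guessing to get closer, say by increasing the standard deviation to
0.06 kg, constructing another new distribution called pack06.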
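The two guesses can be checked with one line of arithmetic each, since the 5% quantile of a normal distribution is the mean plus the standard normal 5% quantile times the standard deviation. This is a minimal sketch with a hypothetical function name and the quantile hard-coded from standard tables.

```cpp
#include <cassert>

// Standard normal 5% quantile from standard tables (hard-coded assumption).
const double kZ05 = -1.6448536270;

// 5% quantile of a normal(mean, sd) distribution of pack weights.
double weight_at_5_percent(double mean, double sd) {
    assert(sd > 0.0);
    return mean + kZ05 * sd;
}
```

With a mean of 3 kg, a standard deviation of 0.05 kg puts the 5% quantile near 2.918 kg, and 0.06 kg puts it near 2.901 kg, so both guesses sit just above the 2.9 kg target.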

Now we are getting really close, but to do the job properly, we might
need to use a root finding method, for example the tools provided, and
used elsewhere, in the Math Toolkit; see Root Finding Without
Derivatives.
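To show what 'doing the job properly' by root finding might look like, here is a sketch that uses plain bisection from the C++ standard library. It is a stand-in chosen for brevity, not the Math Toolkit's root-finding tools, and the function names and the bracketing interval are our own assumptions.

```cpp
#include <cassert>
#include <cmath>

// Fraction of a normal(mean, sd) population below `limit`.
double fraction_below(double limit, double mean, double sd) {
    return 0.5 * std::erfc((mean - limit) / (sd * std::sqrt(2.0)));
}

// Plain bisection for the sd at which `under_fraction` of packs fall
// below `limit`. Assumes limit < mean and that [lo, hi] brackets the root.
double solve_sd(double limit, double mean, double under_fraction,
                double lo, double hi) {
    assert(0.0 < lo && lo < hi);
    for (int i = 0; i < 100; ++i) {
        double mid = 0.5 * (lo + hi);
        // When limit < mean, the fraction below the limit grows with sd.
        if (fraction_below(limit, mean, mid) < under_fraction) lo = mid;
        else hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

Calling solve_sd(2.9, 3.0, 0.05, 0.05, 0.07) brackets the answer between the first guess and a deliberately larger value, and should converge to a standard deviation just under 0.061 kg.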

But in this (normal) distribution case, we can and should be even smarter
and make a direct calculation.

We want to find the standard deviation that would be required to meet
this limit, so that the p-th quantile is located at z (minimum_weight).
In this case, the 0.05 (5%) quantile is at 2.9 kg pack weight, when
the mean is 3 kg, ensuring that 0.95 (95%) of packs are above the minimum
weight.

Rearranging, we can directly calculate the required standard deviation:

normal N01; // standard normal distribution with mean zero and unit standard deviation.
p = 0.05;
double qp = quantile(N01, p);
double sd95 = (minimum_weight - mean) / qp;
cout << "For the " << p << "th quantile to be located at "
  << minimum_weight << ", would need a standard deviation of " << sd95 << endl;
// For the 0.05th quantile to be located at 2.9, would need a standard deviation of 0.0607957
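For reference, the rearrangement behind sd95 can be written out explicitly. Here the symbols are our own notation: z is the minimum weight, \mu the mean, \sigma the standard deviation, and q_p the p-th quantile of the standard normal distribution (qp in the code).

```latex
\[
P(X < z) = p
\;\Longrightarrow\;
\frac{z - \mu}{\sigma} = q_p
\;\Longrightarrow\;
\sigma = \frac{z - \mu}{q_p}
\]
\[
\sigma = \frac{2.9 - 3}{-1.64485} \approx 0.0608 \ \text{kg}
\]
```

Both numerator and denominator are negative when the limit lies below the mean and p is less than 0.5, so the resulting standard deviation is positive.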

We can now construct a new (normal) distribution pack95 for the 'better'
packer, and check that our distribution will meet the specification.

normal pack95(mean, sd95);
cout << "Fraction of packs >= " << minimum_weight << " with a mean of " << mean
  << " and standard deviation of " << pack95.standard_deviation()
  << " is " << cdf(complement(pack95, minimum_weight)) << endl;
// Fraction of packs >= 2.9 with a mean of 3 and standard deviation of 0.0607957 is 0.95

This calculation is generalized in the free function find_scale, giving
the same standard deviation (called ssc in the code below).
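As a rough illustration of the arithmetic a scale finder has to perform for any location-scale family, the sketch below divides the distance from the location to the limit by the matching quantile of the standard distribution. It uses only the C++ standard library; the function names are hypothetical, and this is our assumption about the calculation rather than the library's source.

```cpp
#include <cassert>
#include <cmath>

// Scale that places the p-th quantile at `z`, given the location and the
// p-th quantile `q_std` of the standard (location 0, scale 1) distribution.
double scale_for_quantile(double z, double location, double q_std) {
    double scale = (z - location) / q_std;
    assert(scale > 0.0);  // z and q_std must lie on the same side of the center
    return scale;
}

// Standard Cauchy quantile, available in closed form.
double std_cauchy_quantile(double p) {
    const double pi = std::acos(-1.0);
    return std::tan(pi * (p - 0.5));
}
```

Passing the standard normal 5% quantile, -1.6448536270 from standard tables, with z = 2.9 and a location of 3 gives about 0.0608 kg. Passing std_cauchy_quantile(0.05) instead gives a much smaller scale, about 0.0158 kg, because of the heavier tails.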

Note that our guess of 0.06 was close to the accurate value of 0.060795683191176959.

We can again confirm our prediction thus:

normal pack95c(mean, ssc);
cout << "Fraction of packs >= " << minimum_weight << " with a mean of " << mean
  << " and standard deviation of " << pack95c.standard_deviation()
  << " is " << cdf(complement(pack95c, minimum_weight)) << endl;
// Fraction of packs >= 2.9 with a mean of 3 and standard deviation of 0.0607957 is 0.95

Notice that these two deceptively simple questions:

Do we over-fill to make sure we meet a minimum specification (or
under-fill to avoid an overdose)?

and/or

Do we measure better?

are actually extremely common.

The weight of beef might be replaced by a measurement of more or less
anything: drug tablet content, Apollo landing rocket firing, X-ray
treatment doses...

The scale can be variation in dispensing or uncertainty in measurement.