Solution for Programming Exercise 5.2

Exercise 5.2:

A common programming task
is computing statistics of a set of numbers. (A statistic is a number that
summarizes some property of a set of data.) Common statistics include the mean
(also known as the average) and the standard deviation (which tells how spread
out the data are from the mean). I have written a little class called
StatCalc that can be used to compute these statistics, as well as the
sum of the items in the dataset and the number of items in the dataset. You can
read the source code for this class in the file StatCalc.java.
If calc is a variable of
type StatCalc, then the following methods are defined:

calc.enter(item) where
item is a number, adds the item to the dataset.

calc.getCount() is a function
that returns the number of items that have been added to the dataset.

calc.getSum() is a function
that returns the sum of all the items that have been added to the dataset.

calc.getMean() is a function
that returns the average of all the items.

calc.getStandardDeviation() is
a function that returns the standard deviation of the items.

Typically, all the data are added one after the other by calling the
enter() method over and over, as the data become available. After all
the data have been entered, any of the other methods can be called to get
statistical information about the data. The methods getMean() and
getStandardDeviation() should only be called if the number of items is
greater than zero.

Modify the current source code, StatCalc.java, to add instance
methods getMax() and getMin(). The getMax() method
should return the largest of all the items that have been added to the dataset,
and getMin() should return the smallest. You will need to add two new
instance variables to keep track of the largest and smallest items that have
been seen so far.

Test your new class by using it in a program to compute statistics for a set
of non-zero numbers entered by the user. Start by creating an object of type
StatCalc:

StatCalc calc; // Object to be used to process the data.
calc = new StatCalc();

Read numbers from the user and add them to the dataset. Use 0 as a sentinel
value (that is, stop reading numbers when the user enters 0). After all the
user's non-zero numbers have been entered, print out each of the six statistics
that are available from calc.

Discussion

For the StatCalc class to handle minimums and maximums, some of
what must be added to the class is obvious. We need two new instance
variables, min and max, and two getter methods to return the values
of those instance variables. So, we can add these lines to the class
definition:
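A minimal sketch of those additions follows. The existing members of StatCalc are elided so that the fragment compiles on its own, and the comments are illustrative rather than taken from the original file:

```java
class StatCalc {

    // ... count, sum, squareSum and the existing methods are elided ...

    private double max;  // Largest item seen so far.
    private double min;  // Smallest item seen so far.

    /** Return the smallest item that has been entered. */
    public double getMin() {
        return min;
    }

    /** Return the largest item that has been entered. */
    public double getMax() {
        return max;
    }
}
```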

But then there is the problem of making sure that min and
max have the right values. min records the smallest number
seen so far. Every time we have a new number to add to
the dataset there is a possibility that min will change, so
we have to compare min with the newly added number. If the new number is
smaller than the current min, then the number becomes the new value of
min (since the new number is now the smallest number we have seen so
far). We do something similar for max. This has to be done whenever a
number is entered into the dataset, so it has to be added to the
enter() method, giving:
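A sketch of the resulting method, shown inside a cut-down version of the class so that it compiles on its own (getSum(), getMean(), and getStandardDeviation() are omitted here). This is only the first attempt described above, not the final version:

```java
class StatCalc {

    private int count;         // Number of numbers that have been entered.
    private double sum;        // The sum of all the items.
    private double squareSum;  // The sum of the squares of all the items.
    private double max;        // Largest item seen so far.
    private double min;        // Smallest item seen so far.

    public void enter(double num) {
        count++;
        sum += num;
        squareSum += num*num;
        if (num > max)   // We have a new maximum.
            max = num;
        if (num < min)   // We have a new minimum.
            min = num;
    }

    public int getCount() {
        return count;
    }

    public double getMin() {
        return min;
    }

    public double getMax() {
        return max;
    }
}
```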

Unfortunately, if this is all we do, there is a bug in our
program! For example, if the dataset consists of the numbers 21, 17,
and 4, the computer will insist that the minimum is 0, rather than 4. The
problem is that the variables min and max are initialized to
zero. (If no initial value is provided for a numerical instance variable, it
gets the default initial value, zero.) Since min is 0, none of the
numbers in the dataset pass the test "if (num < min)", so the value
of min never changes. A similar problem holds for max, but it
will only show up if all the numbers in the dataset are less than zero. For the
other instance variables, count, sum, and squareSum,
the default initial value of zero is correct. For min and
max, we have to do something different.
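The failure is easy to reproduce outside the class. The sketch below uses two hypothetical helper methods, naiveMin and naiveMax, that are not part of StatCalc. They start from zero in the same way that default-initialized instance variables do:

```java
class MinBugDemo {

    // Mimics an instance variable "min" that starts at the default value 0.
    static double naiveMin(double... data) {
        double min = 0;
        for (double num : data) {
            if (num < min)
                min = num;
        }
        return min;
    }

    // Mimics an instance variable "max" that starts at the default value 0.
    static double naiveMax(double... data) {
        double max = 0;
        for (double num : data) {
            if (num > max)
                max = num;
        }
        return max;
    }
}
```

Here naiveMin(21, 17, 4) returns 0.0 rather than 4.0, and naiveMax(-3, -8) returns 0.0 rather than -3.0.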

One possible way to fix the problem is to treat the first number entered as
a special case. When only one number has been entered, it's certainly the
largest number so far and also the smallest number so far, so it should be
assigned to both min and max. This can be handled in the
enter() method:

public void enter(double num) {
        // (This is NOT the version I used in my final answer.)
    count++;
    sum += num;
    squareSum += num*num;
    if (count == 1) {  // This is the first number.
        max = num;
        min = num;
    }
    else {
        if (num > max)  // We have a new maximum.
            max = num;
        if (num < min)  // We have a new minimum.
            min = num;
    }
}

This works fine. However, I decided to use an alternative approach. We would
be OK if we could initialize min to have a value that is bigger than
any possible number. Then, when the first number is entered, it will have to
satisfy the test "if (num < min)", and it will become the value of
min. But to be "bigger than any possible number," min would
have to be infinity. The initial value for max has to be smaller than
any possible number, so max has to be initialized to negative
infinity. And that's what we'll do!

Recall that the standard class Double contains constants
Double.POSITIVE_INFINITY and Double.NEGATIVE_INFINITY that
represent positive and negative infinity. We can
use these named constants to provide initial values for the instance variables
min and max. So, the declarations become:
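A sketch of the declarations with their initial values. To keep the fragment self-contained, they are shown in a small class with the hypothetical name MinMaxTracker, which contains only the part of StatCalc that deals with min and max:

```java
class MinMaxTracker {

    private double max = Double.NEGATIVE_INFINITY;  // Largest item seen.
    private double min = Double.POSITIVE_INFINITY;  // Smallest item seen.

    public void enter(double num) {
        if (num > max)   // True for the first number, whatever it is.
            max = num;
        if (num < min)   // True for the first number, whatever it is.
            min = num;
    }

    public double getMin() {
        return min;
    }

    public double getMax() {
        return max;
    }
}
```

With these initial values, no special case is needed for the first number: any finite number is greater than negative infinity and less than positive infinity.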

With this change, the StatCalc class works correctly. The complete
class is shown below. (By the way, you might think about what happens if getMin()
or getMax() is called before any data have been entered. What actually happens?
What should happen? What is the minimum or maximum of a set of numbers that contains no
numbers at all?)

The main program is fairly straightforward, so just for fun I decided to
use a Scanner instead of TextIO
to read the user's input (see Subsection 2.4.6).
The user's data are read and
entered into the StatCalc object in a loop:
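A sketch of such a loop, wrapped in a hypothetical method readUntilZero so that it can be tried on its own. The nested class Tally is a minimal stand-in for StatCalc, included only to make the fragment compile; in the real program the variable calc is a StatCalc:

```java
import java.util.Scanner;

class SentinelLoopSketch {

    // Minimal stand-in for StatCalc (an assumption made for this sketch).
    static class Tally {
        private int count;
        private double sum;
        void enter(double num) {
            count++;
            sum += num;
        }
        int getCount() {
            return count;
        }
        double getSum() {
            return sum;
        }
    }

    // Reads numbers from in until a 0 is read. The 0 is a sentinel;
    // it ends the input but is not itself entered into the dataset.
    static Tally readUntilZero(Scanner in) {
        Tally calc = new Tally();
        double item;  // One number read from the input.
        do {
            item = in.nextDouble();
            if (item != 0)
                calc.enter(item);
        } while (item != 0);
        return calc;
    }
}
```

A Scanner can also read from a string, which makes the loop easy to try out: readUntilZero(new Scanner("21 17 4 0")) enters three items whose sum is 42.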

The subroutine call "calc.enter(item);" enters the user's item.
That is, it does all the processing necessary to include this data item in the
statistics it is computing. After all the data have been entered, the
statistics can be obtained by using function calls such as
"calc.getMean()". The statistics are output in statements such as:

System.out.println(" Average: " + calc.getMean());

Note that a function call represents a value, and so can be used anyplace
where a variable or literal value could be used. I don't have to assign the
value of the function to a variable. I can use the function call directly in
the output statement. Another note: In this program, I decided not to use
formatted output, since it seems appropriate to print the answers with
as much accuracy as possible. For formatted output, the statement
used to print the mean could be something like:

System.out.printf(" Average: %10.3f\n", calc.getMean());
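To see what the format specifier %10.3f does, the same conversion can be applied with String.format. This is a small sketch, not part of the solution; the method name formatMean is hypothetical, and Locale.US is an assumption made so that the decimal separator is a period on every machine:

```java
import java.util.Locale;

class FormatSketch {

    // Formats a value right-justified in a field 10 characters wide,
    // with exactly 3 digits after the decimal point.
    static String formatMean(double mean) {
        return String.format(Locale.US, "%10.3f", mean);
    }
}
```

For the dataset 21, 17, 4 the mean is 14, so formatMean(14.0) gives "14.000" preceded by four spaces of padding.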

The complete main program is shown below.

Although that completes the exercise, one might wonder: Instead of modifying
the source code of StatCalc, could we make a subclass of
StatCalc and put the modifications in that? The answer is yes, but we
need to use the slightly obscure special variable super that was
discussed in Subsection 5.6.2.

The new instance variables and instance methods can simply be put into the
subclass. The problem arises with the enter() method. We have to
redefine this method so that it will update the values of min and
max. But it also has to do all the processing that is done by the
original enter() method in the StatCalc class. This is what
super is for. It lets us call a method from the superclass of the
class we are writing. So, the subclass can be written:
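A sketch of such a subclass follows. The subclass name StatCalcWithMinMax is an assumption made for this sketch, and the StatCalc shown here is a cut-down stand-in for the original, unmodified class, included only so that the fragment compiles on its own:

```java
// Stand-in for the original StatCalc, without min and max.
class StatCalc {
    private int count;
    private double sum;
    private double squareSum;

    public void enter(double num) {
        count++;
        sum += num;
        squareSum += num*num;
    }

    public int getCount() {
        return count;
    }

    public double getSum() {
        return sum;
    }

    public double getMean() {
        return sum / count;
    }
}

// Adds min and max to StatCalc without changing its source code.
class StatCalcWithMinMax extends StatCalc {

    private double max = Double.NEGATIVE_INFINITY;  // Largest item seen.
    private double min = Double.POSITIVE_INFINITY;  // Smallest item seen.

    @Override
    public void enter(double num) {
        super.enter(num);  // Call the enter() method from StatCalc.
        if (num > max)
            max = num;
        if (num < min)
            min = num;
    }

    public double getMin() {
        return min;
    }

    public double getMax() {
        return max;
    }
}
```

Because count, sum, and squareSum are private in StatCalc, the subclass cannot update them directly; the call super.enter(num) is the only way for it to reuse that processing.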

Revised StatCalc Class
/*
* An object of class StatCalc can be used to compute several simple statistics
* for a set of numbers. Numbers are entered into the dataset using
* the enter(double) method. Methods are provided to return the following
* statistics for the set of numbers that have been entered: The number
* of items, the sum of the items, the average, the standard deviation,
* the maximum, and the minimum.
*/
public class StatCalc {

    private int count;   // Number of numbers that have been entered.
    private double sum;  // The sum of all the items that have been entered.
    private double squareSum;  // The sum of the squares of all the items.
    private double max = Double.NEGATIVE_INFINITY;  // Largest item seen.
    private double min = Double.POSITIVE_INFINITY;  // Smallest item seen.

    /**
     * Add a number to the dataset.  The statistics will be computed for all
     * the numbers that have been added to the dataset using this method.
     */
    public void enter(double num) {
        count++;
        sum += num;
        squareSum += num*num;
        if (num > max)
            max = num;
        if (num < min)
            min = num;
    }

    /**
     * Return the number of items that have been entered into the dataset.
     */
    public int getCount() {
        return count;
    }

    /**
     * Return the sum of all the numbers that have been entered.
     */
    public double getSum() {
        return sum;
    }

    /**
     * Return the average of all the items that have been entered.
     * The return value is Double.NaN if no numbers have been entered.
     */
    public double getMean() {
        return sum / count;
    }

    /**
     * Return the standard deviation of all the items that have been entered.
     * The return value is Double.NaN if no numbers have been entered.
     */
    public double getStandardDeviation() {
        double mean = getMean();
        return Math.sqrt( squareSum/count - mean*mean );
    }

    /**
     * Return the smallest item that has been entered.
     * The return value will be infinity if no numbers have been entered.
     */
    public double getMin() {
        return min;
    }

    /**
     * Return the largest item that has been entered.
     * The return value will be -infinity if no numbers have been entered.
     */
    public double getMax() {
        return max;
    }

}  // end class StatCalc
Main Program
import java.util.Scanner;

/**
 * Computes and displays several statistics for a set of non-zero
 * numbers entered by the user.  (Input ends when user enters 0.)
 * This program uses StatCalc.java.
 */
public class SimpleStats {

    public static void main(String[] args) {

        Scanner in = new Scanner(System.in);

        StatCalc calc;  // Computes stats for numbers entered by user.
        calc = new StatCalc();

        double item;    // One number entered by the user.

        System.out.println("Enter your numbers.  Enter 0 to end.");
        System.out.println();

        do {
            System.out.print("? ");
            item = in.nextDouble();
            if (item != 0)
                calc.enter(item);
        } while (item != 0);

        System.out.println("\nStatistics about your data:\n");
        System.out.println("   Count:              " + calc.getCount());
        System.out.println("   Sum:                " + calc.getSum());
        System.out.println("   Minimum:            " + calc.getMin());
        System.out.println("   Maximum:            " + calc.getMax());
        System.out.println("   Average:            " + calc.getMean());
        System.out.println("   Standard Deviation: "
                                           + calc.getStandardDeviation());

    }  // end main()

}  // end SimpleStats