Convinced that an ML technique could provide a significantly better
shooting policy, we chose a neural network as our initial attempt.
We plan to experiment with other ML techniques on the same task in the
future. We first considered how we should structure the neural network in order
to learn a function from the current state of the world to an
indication of whether the shooter should start accelerating or remain
still and wait. The output of this function was fairly
straightforward. It would indicate whether starting to accelerate in
a world state described by the input values was likely to lead to a
goal (outputs close to 1) or a miss (outputs close to 0). However,
deciding how to represent the world state, i.e., the inputs to the
neural network, was a core part of our research.

One option was to use coordinates for both the shooter
and the ball. However, such inputs would not have generalized beyond
the very limited training situation. Furthermore, they would have led
to a higher-dimensional function (6 dimensions) than turned out to be
necessary. Instead, we chose to use just 3 easily computable,
coordinate-independent predicates.
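
As a rough illustration of the function described above, the following
Python sketch shows a small feedforward network that maps three input
values to a single output between 0 and 1. The hidden-layer size, the
sigmoid activations, the placeholder weights, the 0.5 decision
threshold, and all function names are assumptions made for this
example; they are not the trained network reported here.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function; squashes any moderate real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Feedforward pass: inputs -> sigmoid hidden units -> one sigmoid output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def should_accelerate(score: float, threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: start accelerating when the output is high."""
    return score > threshold


# Placeholder weights for illustration only (assumed, not learned values).
HIDDEN_WEIGHTS = [[0.5, -0.3, 0.8], [-0.6, 0.4, 0.1]]  # 2 hidden units, 3 inputs each
HIDDEN_BIASES = [0.0, 0.1]
OUTPUT_WEIGHTS = [1.2, -0.7]
OUTPUT_BIAS = 0.05

# Three made-up input values standing in for the three predicates.
score = forward([0.2, 0.9, -0.1], HIDDEN_WEIGHTS, HIDDEN_BIASES,
                OUTPUT_WEIGHTS, OUTPUT_BIAS)
decision = should_accelerate(score)
```

In such a setup the shooter would evaluate the network on each new
world state and begin accelerating the first time the output exceeds
the threshold; outputs near 0 mean it keeps waiting.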

Since the line along which the agent steered was computed before it
started moving (the line connecting the agent's initial position and
the point 170 units wide of the goal), and since the ball's trajectory
could be estimated (with some error due to noise) after getting two
distinct position readings, the shooter was able to determine the
point at which it hoped to strike the ball, or the Contact
Point. It could then cheaply compute certain useful predicates:

The physical meaning of these inputs is illustrated in
Figure 5(a). These inputs proved to be sufficient for learning
the task at hand. Furthermore, since they contained no
coordinate-specific information, they allowed training in a narrow
setting to apply much more widely, as shown at the end of this section.
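
The Contact Point computation described above can be sketched as the
intersection of two lines: the shooter's steering line and the ball's
estimated trajectory. The Python sketch below is a minimal
illustration under stated assumptions: positions are 2-D (x, y) tuples
in an arbitrary coordinate frame, the steering target (the point wide
of the goal) is passed in by the caller, sensor noise is ignored, and
all function names are hypothetical.

```python
def estimate_ball_direction(first_reading, second_reading):
    """Direction of ball travel from two distinct position readings."""
    (x1, y1), (x2, y2) = first_reading, second_reading
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        raise ValueError("the two position readings must be distinct")
    return (dx, dy)


def line_intersection(p, d, q, e, eps=1e-12):
    """Intersection of line p + t*d with line q + s*e, or None if parallel."""
    cross = d[0] * e[1] - d[1] * e[0]
    if abs(cross) < eps:
        return None
    # Solve p + t*d = q + s*e for t using 2-D cross products.
    t = ((q[0] - p[0]) * e[1] - (q[1] - p[1]) * e[0]) / cross
    return (p[0] + t * d[0], p[1] + t * d[1])


def contact_point(agent_start, steering_target, ball_first, ball_second):
    """Point where the steering line meets the ball's estimated path.

    This treats both paths as infinite lines; it does not check that the
    intersection lies ahead of the ball or of the agent.
    """
    steer_dir = (steering_target[0] - agent_start[0],
                 steering_target[1] - agent_start[1])
    ball_dir = estimate_ball_direction(ball_first, ball_second)
    return line_intersection(agent_start, steer_dir, ball_second, ball_dir)


# Made-up coordinates: the agent steers along the x-axis while the ball
# moves straight down the line x = 5.
point = contact_point((0.0, 0.0), (10.0, 0.0), (5.0, 5.0), (5.0, 4.0))
```

Because the two readings are noisy, a point computed this way is only
an estimate of where the shooter would actually meet the ball.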