Summary:
Do systems need a fully automated test harness to pass a test?
I was chatting with Dave Reynolds about what is expected to pass an
entailement test.
The tests are expressed as
Graph1 entails Graph2
In practice many APIs (including ours) do not directly support such an
operation.
Hence Dave automatically transforms Graph2 into a query which he can then
execute againsts Graph1, and pass the test.
That looks fine to me.
For some of the tests, he has a more complex query rewrite that he does
manually, and then passes the test. I am discouraging him from reporting such
tests as passed. (These reflect the lack of support for the comprehension
axioms - the query rewrite essentially compensates for this).
===
What are other people doing? How much manual and/or automatic rewrite do
people do?
Jeremy