I have been trying to teach myself how to write rules and run inference with them. I created a very simple rule for testing, added triples to my knowledge store, ran sem_apis.create_entailment with the new rule, and queried for the inferred triples to verify that everything was working. I then added a new triple that was relevant to the rule to see what Oracle would do. I figured that either (a) my query would return the same results as before (implying that no inferencing was done with the new triple) or (b) the query would show new inferences based on the new triple. To my surprise, Oracle gave me option (c): my rules_index was no longer valid! I have to force the entailment to be rerun in order to get the new inferred triples. I am very, very new to this RDF/Semantic game, and this unexpected behavior brings up a few questions:

1. Is there a way to configure Oracle 11g so that inferencing is performed automatically on newly added triples?

2. If I have data that is continuously being added to my knowledge store (say, just for fun, more than 1 triple every second) and my rules_index becomes invalid with every newly added triple, it seems to me that my rules_index will practically never be valid. Is there some standard or alternate way of handling this?

3. Where are the inferred triples actually stored? If they are not really stored in any table, is there a way for me to do so? In other words, if I can infer a relationship between two entities, I would like to store it somewhere (assert it?) so that I do not have to keep inferring this same relationship every time I run create_entailment. As my triples reach into the millions, I foresee a performance hit.

As you observed, the rules index is not updated every time a triple is added. The create_entailment procedure has to be explicitly called to update the rules index. This is for performance reasons, and our recommendation is to call the create_entailment procedure after some updates have been made, for example, on a nightly basis.
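As a rough sketch of the nightly refresh suggested above, the call could be wrapped in a scheduler job. The model name `family`, the rulebase name `family_rb`, the rules index name `family_rb_rix`, and the job name are all hypothetical placeholders, and the 2 a.m. schedule is only an assumption; substitute your own names and timing, and check the documentation for your release before relying on it.

```sql
-- Refresh the rules index once, by hand.
-- 'family_rb_rix', 'family' and 'family_rb' are placeholder names.
BEGIN
  SEM_APIS.CREATE_ENTAILMENT(
    'family_rb_rix',
    SEM_Models('family'),
    SEM_Rulebases('family_rb'));
END;
/

-- Or schedule the same call to run every night at 02:00 (assumed schedule).
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'REFRESH_FAMILY_RIX',   -- hypothetical job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN SEM_APIS.CREATE_ENTAILMENT(''family_rb_rix'', SEM_Models(''family''), SEM_Rulebases(''family_rb'')); END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=2; BYMINUTE=0; BYSECOND=0',
    enabled         => TRUE);
END;
/
```

Between runs, newly added triples are simply not yet reflected in the inferred data.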

But this does not mean that semantic data cannot be queried before the rules index is updated. When new triples are added, the rules index goes into an 'INCOMPLETE' state (to indicate that there may be new triples that could be inferred but have not been yet), and SEM_MATCH can still be used for queries, provided 'INCOMPLETE' is passed in as an argument (see section 1.6 in the documentation for details). Similarly, when triples are deleted, the rules index goes into an 'INVALID' state (to indicate that some of the inferred triples might no longer be valid), but SEM_MATCH can still be used by passing in 'INVALID' as an argument. This functionality lets a user continue querying semantic data while ensuring that the user is aware that triples have been added or deleted and that the rules index status has changed as a result. In other words, the user has to explicitly pass in 'INCOMPLETE' or 'INVALID' to indicate that he or she accepts results that may be incomplete or invalid.
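For illustration, a query against a rules index that is in the 'INCOMPLETE' state might look roughly like the following. The model `family`, the rulebase `family_rb`, the namespace, and the `:grandParentOf` property are hypothetical placeholders; the status is passed as the index_status argument, which follows the filter argument.

```sql
-- Query inferred and asserted triples while the rules index is INCOMPLETE.
-- All names and the namespace below are placeholders for illustration.
SELECT x, y
  FROM TABLE(SEM_MATCH(
         '(?x :grandParentOf ?y)',
         SEM_Models('family'),
         SEM_Rulebases('family_rb'),
         SEM_Aliases(SEM_Alias('', 'http://www.example.org/family/')),
         NULL,             -- filter: none
         'INCOMPLETE'));   -- index_status: accept a possibly incomplete rules index
```

After deletions, the same query would pass 'INVALID' in place of 'INCOMPLETE'.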

The inferred triples are stored in internal tables and can be viewed using the view SEMI_<rules_index_name>. You do not have to worry about the same relationship being inferred every time; create_entailment is optimized to handle that. If you want to store triples inferred through other means, we recommend storing them in a model, just as you did with the original set of triples; you could even use the same model.
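As a small sketch, assuming a rules index named `family_rb_rix` (a placeholder) and that the view lives in the MDSYS schema as described in the documentation, you could check how many inferred triples are currently stored like this. Your account may need privileges on the view.

```sql
-- Count the inferred triples held for the (hypothetical) rules index.
SELECT COUNT(*) AS inferred_triples
  FROM MDSYS.SEMI_FAMILY_RB_RIX;
```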