Description

UF had trouble completing a full reindex with VIVO 1.5.1 and eventually traced the failure to a form feed character (U+000C) that had been pasted into VIVO from a PDF in 2011. The same error can no longer be triggered by pasting into an editing form, but the bad data can be reintroduced by uploading an N-Triples file, and it still causes the indexing error.

This error may well predate 1.5: while still running 1.4.1, UF had noticed for some time that merged organizations were not being removed from the search index, so the index may not have been updating successfully even then.

The bad data has been removed, but to avoid a recurrence it would be helpful to trap characters that break indexing, whether the underlying bug is in VIVO, Jena, or Solr. If VIVO could at least catch the exception and skip the offending record rather than abort the whole indexing process, that would be a big improvement.
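
As a rough illustration of both requests, here is a minimal Java sketch. It is not VIVO's actual indexing code: the class name IndexingGuard, the methods stripInvalidXmlChars and indexAll, and the use of plain strings as "records" are all hypothetical. It also assumes the failure comes from characters that are not legal in XML 1.0 (the form feed, U+000C, is one of them); the real cause may lie elsewhere.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of VIVO, Jena, or Solr.
final class IndexingGuard {

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the text with every code point that XML 1.0 forbids removed. */
    static String stripInvalidXmlChars(String text) {
        StringBuilder cleaned = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isValidXmlChar(cp)) {
                cleaned.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return cleaned.toString();
    }

    /**
     * Feeds each record to the indexer. A record whose indexing throws is
     * collected and skipped, so one bad record does not abort the whole run.
     * The returned list holds the records that failed.
     */
    static List<String> indexAll(List<String> records, Consumer<String> indexer) {
        List<String> failed = new ArrayList<>();
        for (String record : records) {
            try {
                indexer.accept(record);
            } catch (RuntimeException e) {
                failed.add(record); // a real implementation would also log e
            }
        }
        return failed;
    }
}
```

Either measure alone would have let the reindex finish: sanitizing values before they reach the indexer removes the form feed, and the per-record try/catch keeps any other unexpected bad value from stopping the run while leaving a list of records to investigate.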