How to compile the examples:

How to run the examples:

To run the examples, you need to give Java a special system property named
"java.library.path". The value of this property is the path of the directory
containing the JNI library libxapian_jni.so (the file extension varies by
platform).
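
If the JVM reports that it cannot find the library, it can help to check what
the property actually contains. The following is a minimal diagnostic sketch
using only the standard Java API, not part of the Xapian bindings: the class
name LibraryPathCheck is made up for illustration, and the base name
"xapian_jni" is an assumption derived from the libxapian_jni.so file name
above. The file name returned by System.mapLibraryName() may not match the
name the library was built with on every platform.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class LibraryPathCheck {
    // Assumption: base name taken from "libxapian_jni.so" mentioned above.
    static final String LIB_BASE_NAME = "xapian_jni";

    /** Lists where the library file would be expected in each path entry. */
    static List<File> candidates(String libraryPath, String baseName) {
        List<File> result = new ArrayList<>();
        if (libraryPath == null || libraryPath.isEmpty()) {
            return result;
        }
        // Platform-specific file name, e.g. "libxapian_jni.so" on Linux.
        String fileName = System.mapLibraryName(baseName);
        String separator = Pattern.quote(File.pathSeparator);
        for (String dir : libraryPath.split(separator)) {
            if (!dir.isEmpty()) {
                result.add(new File(dir, fileName));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path");
        System.out.println("java.library.path = " + path);
        for (File f : candidates(path, LIB_BASE_NAME)) {
            System.out.println((f.isFile() ? "found:   " : "missing: ") + f);
        }
    }
}
```

Run it with the same -Djava.library.path option you intend to use for the
examples; at least one line should start with "found:".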

Alternatively, you can do without the -Djava.library.path option by
setting the LD_LIBRARY_PATH environment variable, or by installing the JNI
library in the appropriate directory so your JVM finds it automatically
(for example, on Mac OS X you can copy it into /Library/Java/Extensions/).

The Java bindings have been tested recently with OpenJDK versions 1.8.0_77,
1.7.0_03, and 1.6.0_38, but they should work with any Java toolchain with
suitable JNI support - please report success stories or any problems to the
development mailing list: xapian-devel@lists.xapian.org

Strings and binary data

The Xapian C++ API is largely agnostic about character encoding, and uses
the std::string type as an opaque container for a sequence of bytes.
In places where the bytes represent text (for example, in the
Stem, QueryParser and TermGenerator classes), UTF-8 encoding is used.
In Java, the String class uses UTF-16 encoding, and can't hold arbitrary
binary data.
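
The limitation can be seen with the standard Java API alone. The sketch below
(the class name BinaryRoundTrip is made up for illustration, and no Xapian
classes are involved) decodes a byte sequence as UTF-8 into a String and
encodes it back. Valid UTF-8 text survives the round trip, but bytes which are
not valid UTF-8 are replaced during decoding and so are not recovered.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryRoundTrip {
    /** Decodes the bytes as UTF-8 into a String, then encodes back to bytes. */
    static byte[] viaString(byte[] data) {
        String s = new String(data, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Valid UTF-8 text: "café".
        byte[] text = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        // Arbitrary binary data: 0xff and 0xfe never occur in valid UTF-8.
        byte[] binary = { (byte) 0xff, (byte) 0xfe, 0x00, 0x01 };

        System.out.println("text survives:   "
            + Arrays.equals(text, viaString(text)));      // true
        System.out.println("binary survives: "
            + Arrays.equals(binary, viaString(binary)));  // false
    }
}
```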

The approach taken to this problem by these bindings (in Xapian 1.4.4 and
later) is to map C++ std::string to/from Java byte arrays (byte[]) in
places where the data is inherently binary (serialisation functions) or likely
to be binary (document values).

This loses a bit of generality compared to the C++ API - for example, in C++
a term can contain arbitrary binary data, but in Java it has to be a
Unicode string. But users rarely actually need or want that generality,
and losing it means that you can just work with Java String.

Document values work best when the values are compactly encoded, so a binary
encoding is usually appropriate. However, if you really want to put a text
value in a document value slot you can explicitly convert String to/from
a byte array of UTF-8 data like so:
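
A minimal sketch of the conversion, using only the standard Java API. The
class name TextValueSketch and the slots map are hypothetical: the map stands
in for a document's value slots so that the example runs without Xapian
installed. With the bindings you would pass the byte[] to, and read it back
from, the document value methods instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class TextValueSketch {
    // Hypothetical stand-in for a document's value slots (slot -> bytes).
    static final Map<Integer, byte[]> slots = new HashMap<>();

    /** Converts a String to a byte array of UTF-8 data. */
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Converts a byte array of UTF-8 data back to a String. */
    static String decode(byte[] value) {
        return new String(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        int slot = 0;
        // Store the text as UTF-8 bytes.
        slots.put(slot, encode("na\u00efve caf\u00e9"));
        // Read the bytes back and decode them as UTF-8.
        System.out.println(decode(slots.get(slot)));
    }
}
```

Naming the charset explicitly avoids depending on the platform default
encoding, which can differ between systems.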