setIndexDeletionPolicy

Expert: allows an optional IndexDeletionPolicy implementation to be
specified. You can use this to control when prior commits are deleted from
the index. The default policy is KeepOnlyLastCommitDeletionPolicy
which removes all prior commits as soon as a new commit is done (this
matches behavior before 2.2). Creating your own policy can allow you to
explicitly keep previous "point in time" commits alive in the index for
some time, to allow readers to refresh to the new commit without having the
old commit deleted out from under them. This is necessary on filesystems
like NFS that do not support "delete on last close" semantics, which
Lucene's "point in time" search normally relies on.
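
As a rough illustration of keeping prior commits alive, the following self-contained sketch models a count-based retention rule in plain Java. It does not use Lucene's IndexDeletionPolicy or IndexCommit types: the class KeepLastNCommitsModel, its method, and the representation of commits as generation numbers ordered oldest first are all assumptions made for illustration. A real policy might instead retain commits by age.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model, not a Lucene class: decides which commits a keep-last-N rule would delete. */
final class KeepLastNCommitsModel {

    /**
     * @param commitGenerations commit identifiers ordered oldest first (an assumption of this model)
     * @param keep number of most recent commits to retain; must be at least 1
     * @return the commits that would be deleted, oldest first
     */
    static List<Long> commitsToDelete(List<Long> commitGenerations, int keep) {
        if (keep < 1) {
            throw new IllegalArgumentException("keep must be >= 1");
        }
        int deleteCount = Math.max(0, commitGenerations.size() - keep);
        // Everything before the last `keep` entries is eligible for deletion.
        return new ArrayList<>(commitGenerations.subList(0, deleteCount));
    }

    public static void main(String[] args) {
        List<Long> commits = Arrays.asList(1L, 2L, 3L, 4L);
        System.out.println("keep=1 deletes " + commitsToDelete(commits, 1));
        System.out.println("keep=3 deletes " + commitsToDelete(commits, 3));
    }
}
```

With keep = 1 the model deletes every commit except the newest, which corresponds to what the text describes for KeepOnlyLastCommitDeletionPolicy. A larger keep leaves older "point in time" commits available to readers that have not yet refreshed.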

setMergePolicy

Expert: MergePolicy is invoked whenever there are changes to the
segments in the index. Its role is to select which merges to do, if any,
and return a MergePolicy.MergeSpecification describing the merges.
It also selects merges to do for forceMerge.

setRAMPerThreadHardLimitMB

Expert: Sets the maximum memory consumption per thread; exceeding it triggers
a forced flush. A DocumentsWriterPerThread is forcefully flushed
once it exceeds this limit, even if getRAMBufferSizeMB() has
not been exceeded. This is a safety limit to prevent a
DocumentsWriterPerThread from exhausting its address space, which is
bounded by its internal 32-bit signed integer based memory addressing.
The given value must be less than 2 GB (2048 MB).
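
The 2048 MB bound follows from the 32-bit signed addressing mentioned above, and the arithmetic can be checked with a short sketch. HardLimitCheck and its methods are hypothetical helpers, not part of Lucene. The lower bound (greater than zero) is an assumption; the text only states the upper bound.

```java
/** Hypothetical helper, not part of Lucene: checks a per-thread hard limit given in MB. */
final class HardLimitCheck {

    /** Upper bound from the text: the value must be less than 2048 MB. */
    static final int LIMIT_EXCLUSIVE_MB = 2048;

    /** Converts megabytes to bytes using long arithmetic to avoid int overflow. */
    static long toBytes(int megabytes) {
        return megabytes * 1024L * 1024L;
    }

    /** True if every byte offset within the buffer fits in a signed 32-bit int. */
    static boolean fitsSignedInt(int megabytes) {
        return toBytes(megabytes) <= Integer.MAX_VALUE;
    }

    /** Assumed validity rule: strictly positive and strictly below 2048 MB. */
    static boolean isValidHardLimitMB(int megabytes) {
        return megabytes > 0 && megabytes < LIMIT_EXCLUSIVE_MB;
    }

    public static void main(String[] args) {
        System.out.println("2047 MB fits: " + fitsSignedInt(2047));
        System.out.println("2048 MB fits: " + fitsSignedInt(2048));
    }
}
```

2048 MB is exactly 2^31 bytes, one more than Integer.MAX_VALUE, which is why the limit is exclusive.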

setMaxBufferedDeleteTerms

Determines the maximum number of delete-by-term operations that will be
buffered before both the buffered in-memory delete terms and queries are
applied and flushed.

Disabled by default (writer flushes by RAM usage).

NOTE: This setting won't trigger a segment flush.

Takes effect immediately, but only the next time a document is added,
updated or deleted. Also, if you only delete-by-query, this setting has no
effect, i.e. delete queries are buffered until the next segment is flushed.

setMaxBufferedDocs

Determines the minimum number of documents required before the buffered
in-memory documents are flushed as a new Segment. Large values generally
give faster indexing.

When this is set, the writer will flush every maxBufferedDocs added
documents. Pass in DISABLE_AUTO_FLUSH to prevent
triggering a flush due to number of buffered documents. Note that if
flushing by RAM usage is also enabled, then the flush will be triggered by
whichever comes first.

Disabled by default (writer flushes by RAM usage).

Takes effect immediately, but only the next time a document is added,
updated or deleted.

setRAMBufferSizeMB

Determines the amount of RAM that may be used for buffering added documents
and deletions before they are flushed to the Directory. Generally for
faster indexing performance it's best to flush by RAM usage instead of
document count and use as large a RAM buffer as you can.

When this is set, the writer will flush whenever buffered documents and
deletions use this much RAM. Pass in
DISABLE_AUTO_FLUSH to prevent triggering a flush
due to RAM usage. Note that if flushing by document count is also enabled,
then the flush will be triggered by whichever comes first.
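
The "whichever comes first" rule can be modelled in a few lines of plain Java. This is a sketch of the decision only, not IndexWriter's actual flush logic: FlushTriggerModel is a hypothetical class, the sentinel value -1 for DISABLE_AUTO_FLUSH is an assumption, and so is treating each threshold as reached once the buffered amount is greater than or equal to it.

```java
/** Hypothetical model, not Lucene code: decides whether either flush trigger has fired. */
final class FlushTriggerModel {

    /** Sentinel meaning "this trigger is disabled"; the value -1 is an assumption of this sketch. */
    static final int DISABLE_AUTO_FLUSH = -1;

    /**
     * @param bufferedDocs    number of documents currently buffered
     * @param ramUsedMB       RAM currently used by buffered documents and deletions
     * @param maxBufferedDocs document-count threshold, or DISABLE_AUTO_FLUSH
     * @param ramBufferSizeMB RAM threshold in MB, or DISABLE_AUTO_FLUSH
     * @return true if at least one enabled trigger has been reached
     */
    static boolean shouldFlush(int bufferedDocs, double ramUsedMB,
                               int maxBufferedDocs, double ramBufferSizeMB) {
        boolean byDocCount = maxBufferedDocs != DISABLE_AUTO_FLUSH
                && bufferedDocs >= maxBufferedDocs;
        boolean byRam = ramBufferSizeMB != DISABLE_AUTO_FLUSH
                && ramUsedMB >= ramBufferSizeMB;
        // Whichever trigger is reached first causes the flush.
        return byDocCount || byRam;
    }

    public static void main(String[] args) {
        // Document count reached before the RAM threshold.
        System.out.println(shouldFlush(1000, 3.0, 1000, 16.0));
        // Document-count trigger disabled, RAM threshold not yet reached.
        System.out.println(shouldFlush(5000, 3.0, DISABLE_AUTO_FLUSH, 16.0));
    }
}
```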

The maximum RAM limit is inherently determined by the JVM's available
memory. However, an IndexWriter session can consume significantly
more memory than the given RAM limit, since this limit is just
an indicator of when to flush memory-resident documents to the Directory.
Flushes are likely to happen concurrently while other threads are adding
documents to the writer. For application stability, the available memory in
the JVM should be significantly larger than the RAM buffer used for indexing.

NOTE: the accounting of RAM usage for pending deletions is only
approximate. Specifically, if you delete by Query, Lucene currently has no
way to measure the RAM usage of individual Queries, so the accounting will
under-estimate (for each buffered delete Query a constant number of bytes is
used to estimate RAM usage). You should compensate by either calling commit()
periodically yourself, or by using LiveIndexWriterConfig.setMaxBufferedDeleteTerms(int)
to flush and apply buffered deletes by count instead of RAM usage. Note that
enabling LiveIndexWriterConfig.setMaxBufferedDeleteTerms(int) will not
trigger any segment flushes.

NOTE: It's not guaranteed that all memory-resident documents are
flushed once this limit is exceeded. Depending on the configured
FlushPolicy, only a subset of the buffered documents may be flushed, and
therefore only part of the RAM buffer is released.
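
To make the partial release concrete, the sketch below models one possible flush policy: when the total buffered RAM exceeds the limit, only the largest per-thread buffer is flushed. Whether the configured FlushPolicy behaves this way is an assumption. LargestBufferFlushModel and its methods are hypothetical and do not correspond to Lucene classes.

```java
/** Hypothetical model, not Lucene code: flushes only the largest per-thread buffer. */
final class LargestBufferFlushModel {

    /**
     * @return index of the largest per-thread buffer if the total exceeds the limit, otherwise -1
     */
    static int selectBufferToFlush(double[] perThreadMB, double ramBufferSizeMB) {
        double total = 0.0;
        int largest = -1;
        for (int i = 0; i < perThreadMB.length; i++) {
            total += perThreadMB[i];
            if (largest == -1 || perThreadMB[i] > perThreadMB[largest]) {
                largest = i;
            }
        }
        return total > ramBufferSizeMB ? largest : -1;
    }

    /** RAM still buffered after the selected buffer (if any) has been flushed. */
    static double remainingAfterFlush(double[] perThreadMB, double ramBufferSizeMB) {
        double total = 0.0;
        for (double mb : perThreadMB) {
            total += mb;
        }
        int flushed = selectBufferToFlush(perThreadMB, ramBufferSizeMB);
        return flushed >= 0 ? total - perThreadMB[flushed] : total;
    }

    public static void main(String[] args) {
        double[] buffers = {6.0, 8.0, 4.0};
        System.out.println("flush index: " + selectBufferToFlush(buffers, 16.0));
        System.out.println("still buffered (MB): " + remainingAfterFlush(buffers, 16.0));
    }
}
```

In this model, three buffers of 6, 8 and 4 MB against a 16 MB limit lead to a flush of the 8 MB buffer only, so 10 MB of documents remain in memory after the flush.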

setReaderTermsIndexDivisor

Sets the termsIndexDivisor passed to any readers that IndexWriter opens,
for example when applying deletes or creating a near-real-time reader in
DirectoryReader.open(IndexWriter, boolean). If you pass -1, the
terms index won't be loaded by the readers. This is only useful in advanced
situations when you will only .next() through all terms; attempts to seek
will hit an exception.

Takes effect immediately, but only applies to readers opened after this
call.

NOTE: divisor settings > 1 do not apply to every PostingsFormat
implementation; in particular, they do not apply to the default one in this
release. The divisor only makes sense for terms indexes that can efficiently
re-sample terms at load time.

setTermIndexInterval

Expert: set the interval between indexed terms. Large values cause less
memory to be used by IndexReader, but slow random-access to terms. Small
values cause more memory to be used by an IndexReader, and speed
random-access to terms.

This parameter determines the amount of computation required per query
term, regardless of the number of documents that contain that term. In
particular, it is the maximum number of other terms that must be scanned
before a term is located and its frequency and position information may be
processed. In a large index with user-entered query terms, query processing
time is likely to be dominated not by term lookup but rather by the
processing of frequency and positional data. In a small index or when many
uncommon query terms are generated (e.g., by wildcard queries) term lookup
may become a dominant cost.

In particular, numUniqueTerms/interval terms are read into
memory by an IndexReader, and, on average, interval/2 terms
must be scanned for each random term access.
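
The two quantities above can be worked through numerically. TermIndexEstimate is a hypothetical helper that only evaluates the formulas from the text; the sample values of one million unique terms and intervals of 32 and 128 are arbitrary.

```java
/** Hypothetical helper: evaluates numUniqueTerms/interval and interval/2 from the text. */
final class TermIndexEstimate {

    /** Number of terms an IndexReader reads into memory (integer division). */
    static long termsHeldInMemory(long numUniqueTerms, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        return numUniqueTerms / interval;
    }

    /** Average number of terms scanned for each random term access. */
    static double averageTermsScanned(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        return interval / 2.0;
    }

    public static void main(String[] args) {
        long numUniqueTerms = 1_000_000L;
        for (int interval : new int[] {32, 128}) {
            System.out.println("interval=" + interval
                    + " inMemory=" + termsHeldInMemory(numUniqueTerms, interval)
                    + " avgScanned=" + averageTermsScanned(interval));
        }
    }
}
```

Quadrupling the interval from 32 to 128 cuts the in-memory terms from 31250 to 7812 while raising the average scan from 16 to 64 terms, which is the memory versus lookup-speed trade-off described above.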

Takes effect immediately, but only applies to newly flushed/merged
segments.

NOTE: This parameter does not apply to every PostingsFormat implementation;
in particular, it does not apply to the default one in this release. It only
makes sense for term indexes that are implemented as a fixed gap between terms.
For example, Lucene41PostingsFormat instead implements the term index based
upon how terms share prefixes. To configure its parameters (the minimum and
maximum size for a block), you would instead use
Lucene41PostingsFormat.Lucene41PostingsFormat(int, int),
which can also be configured on a per-field basis: