Apache Solr Release Notes

Introduction

Apache Solr is an open source enterprise search server based on the Apache Lucene Java
search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search,
caching, replication, and a web administration interface. It runs in a Java
servlet container such as Jetty.

Getting Started

You need a Java 1.7 VM or later installed.
In this release, there is an example Solr server including a bundled
servlet container in the directory named "example".
See the tutorial at http://lucene.apache.org/solr/tutorial.html

In Solr 3.6, all primitive field types were changed to omit norms by default when the
schema version is 1.5 or greater (SOLR-3140), but TrieDateField's default was mistakenly
not changed. As of Solr 4.10, TrieDateField omits norms by default (see SOLR-6211).

Creating a SolrCore via CoreContainer.create() no longer requires an
additional call to CoreContainer.register() to make it available to clients
(see SOLR-6170).

CoreContainer.remove() has been removed. You should now use CoreContainer.unload() to
delete a SolrCore (see SOLR-6232).

solr.xml parsing has been improved to better account for the expected data types of
various options. As part of this fix, additional error checking has also been added to
provide errors in the event of duplicated options, or unknown option names that may
indicate a typo. Users who have modified their solr.xml in the past and now upgrade may
get errors on startup if they have typos or unexpected options specified in their solr.xml
file. (See SOLR-5746 for more information.)

SOLR-6183: New spatial BBoxField for indexing rectangles with search support for most predicates.
It includes extra score relevancy modes in addition to distance: score=overlapRatio|area|area2D.
(David Smiley, Ryan McKinley)

SOLR-6232: You can now unload/delete cores that have failed to initialize
(Alan Woodward)

SOLR-6020: Auto-generate a unique key in schema-less example if data does not have an id field.
The UUIDUpdateProcessor was improved to not require a field name in configuration and to generate
a UUID into the uniqueKey field.
(Vitaliy Zhovtyuk, hossman, Steve Rowe, Erik Hatcher, shalin)

SOLR-6294, SOLR-6437: Remove the restriction that JSON documents can only be added by wrapping
them in an array; the new path /update/json/docs accepts them unwrapped.
(Noble Paul, hossman, Yonik Seeley, Steve Rowe)

SOLR-6257: More than two "!"-s in a doc ID throws an
ArrayIndexOutOfBoundsException when using the composite id router.
(Steve Rowe)

SOLR-5746: Bugs in solr.xml parsing have been fixed to more correctly deal with the various
datatypes of options people can specify; additional error handling of duplicated/unidentified
options has also been added.
(Maciej Zasada, hossman)

SOLR-6383: RegexTransformer returns no results after replaceAll if regex does not match a value.
(Alexander Kingson, shalin)

SOLR-6387: Add better error messages throughout Solr and supply a workaround for
Java bug #8047340 to SystemInfoHandler: On Turkish default locale, some JVMs fail
to fork on MacOSX, BSD, AIX, and Solaris platforms.
(hossman, Uwe Schindler)

SOLR-6268: HdfsUpdateLog has a race condition that can expose a closed HDFS FileSystem instance and should
close its FileSystem instance if either inherited close method is called.
(Mark Miller)

SOLR-6089: When using the HDFS block cache, when a file is deleted, its underlying data entries in the
block cache are not removed, which is a problem with the global block cache option.
(Mark Miller, Patrick Hunt)

LUCENE-5803: Solr's schema now uses DelegatingAnalyzerWrapper. This uses less heap
for cached TokenStreamComponents because it caches per FieldType not per Field, so
indexes with many fields of the same type just use one TokenStream per thread.
(Shay Banon, Uwe Schindler, Robert Muir)

SOLR-6259: Reduce CPU usage by avoiding repeated costly calls to Document.getField inside
DocumentBuilder.toDocument for use-cases with large number of fields and copyFields.
(Steven Bower via shalin)

SOLR-6120: On Windows, when the war is not extracted, the zkcli.bat script
will print a helpful message indicating that the war must be unzipped instead
of a Java error about a missing class.
(shalin, Shawn Heisey)

Support for DiskDocValuesFormat (i.e., fieldTypes configured with docValuesFormat="Disk")
has been removed due to poor performance. If you have any existing fieldTypes using
DiskDocValuesFormat please modify your schema.xml to remove the 'docValuesFormat'
attribute, and optimize your index to rewrite it into the default codec, prior to
upgrading to 4.9. See LUCENE-5761 for more details.

SOLR-5285: Added a new [child ...] DocTransformer for optionally including
Block-Join descendant documents inline in the results of a search. This works
independent of whether the search itself is a block-join related query and is
supported by the xml, json, and javabin response formats.
(Varun Thacker via hossman)

SOLR-6150: Add new AnalyticsQuery to support pluggable analytics
(Joel Bernstein)

SOLR-6006: Separate test and compile scope dependencies in the Solrj and
Solr contrib ivy.xml files, so that the derived Maven dependencies get
filled out properly in the corresponding POMs.
(Steven Scott, Steve Rowe)

In previous versions of Solr, Terms that exceeded Lucene's MAX_TERM_LENGTH were
silently ignored when indexing documents. Beginning with Solr 4.8,
an error will be generated when attempting to index a document with a term
that is too large. If you wish to continue to have large terms ignored,
use "solr.LengthFilterFactory" in all of your Analyzers. See LUCENE-5472 for
more details.

Solr 4.8 requires Java 7 or greater. Java 8 is verified to be
compatible and may bring some performance improvements. When using
Oracle Java 7 or OpenJDK 7, be sure to not use the GA build 147 or
update versions u40, u45 and u51! We recommend using u55 or later.
An overview of known JVM bugs can be found on
http://wiki.apache.org/lucene-java/JavaBugs

ZooKeeper is upgraded from 3.4.5 to 3.4.6.

<fields> and <types> tags have been deprecated. There is no longer any reason to
keep them in the schema file; they may be safely removed. This allows intermixing of
<fieldType>, <field> and <copyField> definitions if desired. Currently, these tags
are supported, so either style may be used. Whether they will be formally
deprecated for 5.0 is TBD.

SOLR-5795: New DocExpirationUpdateProcessorFactory supports computing an expiration
date for documents from the "TTL" expression, as well as automatically deleting expired
documents on a periodic basis.
(hossman)

SOLR-5783: Requests to open a new searcher will now reuse the current registered
searcher (w/o additional warming) if possible in situations where the underlying
index has not changed. This reduces overhead in situations such as deletes that
do not modify the index, and/or redundant commits.
(hossman)

SOLR-5884: When recovery is cancelled, any call to the leader to wait to see
the replica in the right state for recovery should be aborted.
(Mark Miller)

SOLR-5799: When registering as the leader, if an existing ephemeral
registration exists, wait a short time to see if it goes away.
(Mark Miller)

LUCENE-5472: IndexWriter.addDocument will now throw an IllegalArgumentException
if a Term to be indexed exceeds IndexWriter.MAX_TERM_LENGTH. To recreate previous
behavior of silently ignoring these terms, use LengthFilter in your Analyzer.
(hossman, Mike McCandless, Varun Thacker)

Detailed Change List

SOLR-5950: Maven config: make the org.slf4j:slf4j-api dependency transitive
(i.e., not optional) in all modules in which it's a dependency, including
solrj, except for the WAR, where it will remain optional.
(Uwe Schindler, Steve Rowe)

SOLR-5550: shards.info is not returned by a short circuited distributed query.
(Timothy Potter, shalin)

SOLR-5777: Fix ordering of field values in JSON updates where
field name key is repeated
(hossman)

SOLR-5734: We should use System.nanoTime rather than System.currentTimeMillis
when calculating elapsed time.
(Mark Miller, Ramkumar Aiyengar)

SOLR-5760: ConcurrentUpdateSolrServer has a blockUntilFinished call when
streamDeletes is true that should be tucked into the if statement below it.
(Mark Miller, Gregory Chanan)

SOLR-5761: HttpSolrServer has a few fields that can be set via setters but
are not volatile.
(Mark Miller, Gregory Chanan)

SOLR-5907: The hdfs write cache can cause a reader to see a corrupted state.
It now defaults to off, and if you were using solr.hdfs.blockcache.write.enabled
explicitly, you should set it to false.
(Mark Miller)

SOLR-5811: The Overseer will retry work items until success, which is a serious
problem if you hit a bad work item.
(Mark Miller)

SOLR-5796: Increase how long we are willing to wait for a core to see the ZK
advertised leader in its local state.
(Timothy Potter, Mark Miller)

SOLR-5834: Overseer threads are only being interrupted and not closed.
(hossman, Mark Miller)

CloudSolrServer and LBHttpSolrServer no longer declare MalformedURLException
as thrown from their constructors.

Due to a bug in previous versions the default value of the 'discountOverlaps' property
of DefaultSimilarity was not being set appropriately if you were using the implicit
DefaultSimilarityFactory instead of explicitly configuring it. To preserve
consistent behavior for people who upgrade, the implicit behavior is now contingent
on the <luceneMatchVersion/> -- discountOverlaps=false for 4.6 and below,
discountOverlaps=true for 4.7 and above. See SOLR-5561 for more information.
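
As a rough illustration of the version rule above, the following Python sketch
returns the implicit default for a given match version. It is not Solr code: the
function name is hypothetical, and it assumes the version is written in dotted
form such as "4.6".

```python
def implicit_overlap_default(lucene_match_version: str) -> bool:
    """Illustrative only: restates the rule from the note above.

    Hypothetical helper, not a Solr API. Assumes a dotted version
    string such as "4.6" or "4.7".
    """
    # Compare only the major and minor components, numerically,
    # so that "4.10" sorts after "4.7".
    major, minor = (int(part) for part in lucene_match_version.split(".")[:2])
    return (major, minor) >= (4, 7)
```

Configurations that set the property explicitly are not affected by this rule;
only the implicit default changes with the match version.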

SOLR-5208: Support for the setting of core.properties key/values at create-time on
Collections API
(Erick Erickson)

SOLR-5428, SOLR-5690: New 'stats.calcdistinct' parameter in StatsComponent returns
set of distinct values and their count. This can also be specified per field
e.g. 'f.field.stats.calcdistinct'.
(Elran Dvir via shalin)

SOLR-5378, SOLR-5528: A new SuggestComponent that fully utilizes the Lucene suggester
module and adds pluggable dictionaries, payloads and better distributed support.
This is intended to eventually replace the Suggester support through the
SpellCheckComponent.
(Areek Zillur, Varun Thacker via shalin)

SOLR-5492: Return the replica that actually served the query in shards.info
response.
(shalin)

SOLR-5506: Support docValues in CollationField and ICUCollationField.
(Robert Muir)

SOLR-5023: Add support for deleteInstanceDir to be passed from SolrJ for Core
Unload action.
(Lyubov Romanchuk, shalin)

SOLR-5451: SyncStrategy closes its http connection manager before the
executor that uses it in its close method.
(Mark Miller)

SOLR-5460: SolrDispatchFilter#sendError can get a SolrCore that it does not
close.
(Mark Miller)

SOLR-5461: Request proxying should only set con.setDoOutput(true) if the
request is a post.
(Mark Miller)

SOLR-5481: SolrCmdDistributor should not let the http client do its own
retries.
(Mark Miller)

LUCENE-5347: Fixed Solr's ZooKeeper client to copy files to ZooKeeper using
binary transfer. Previously data was read with default encoding and stored
in ZooKeeper as UTF-8. This bug was found after upgrading to forbidden-apis
1.4.
(Uwe Schindler)

SOLR-4376: DataImportHandler uses wrong date format for last_index_time if
a delta-import is run first before any full-imports.
(Sebastien Lorber, Arcadius Ahouansou via shalin)

SOLR-5494: CoreContainer#remove throws NPE rather than returning null when
a SolrCore does not exist in core discovery mode.
(Mark Miller)

SOLR-5547: Creating a collection alias using SolrJ's CollectionAdminRequest
sets the alias name and the collections to alias to the same value.
(Aaron Schram, Mark Miller)

SOLR-5577: Likely ZooKeeper expiration should not slow down updates by a given
amount, but instead cut off updates after a given time.
(Mark Miller, Christine Poerschke, Ramkumar Aiyengar)

SOLR-5580: NPE when creating a core with both explicit shard and coreNodeName.
(YouPeng Yang, Mark Miller)

SOLR-5552: Leader recovery process can select the wrong leader if all replicas
for a shard are down and trying to recover, as well as lose updates that should
have been recovered.
(Timothy Potter, Mark Miller)

SOLR-5569: A replica should not try and recover from a leader until it has
published that it is ACTIVE.
(Mark Miller)

SOLR-5568: A SolrCore cannot decide to be the leader just because the cluster
state says no other SolrCores are active.
(Mark Miller)

SOLR-5496: We should share an http connection manager across non search
HttpClients and ensure all http connection managers get shutdown.
(Mark Miller)

SOLR-5583: ConcurrentUpdateSolrServer#blockUntilFinished may wait forever if
the executor service is shutdown.
(Mark Miller)

SOLR-5586: All ZkCmdExecutors should be initialized with the zk client
timeout.
(Mark Miller)

SOLR-5587: ElectionContext implementations should use
ZkCmdExecutor#ensureExists to ensure their election paths are properly
created.
(Mark Miller)

If you are using methods from FieldMutatingUpdateProcessorFactory for getting
configuration information (oneOrMany or getBooleanArg), those methods have
been moved to NamedList and renamed to removeConfigArgs and removeBooleanArg,
respectively. The original methods are deprecated, to be removed in 5.0.
See SOLR-5264.

SOLR-5353: Enhance CoreAdmin api to split a route key's documents from an index
and leave behind all other documents.
(shalin)

SOLR-5027: CollapsingQParserPlugin for high performance field collapsing on high cardinality fields.
(Joel Bernstein)

SOLR-5395: Added a RunAlways marker interface for UpdateRequestProcessorFactory
implementations indicating that they should not be removed in later stages
of distributed updates (usually signalled by the update.distrib parameter)
(yonik)

SOLR-5367: Unmarshalling delete by id commands with JavaBin can lead to class cast
exception.
(Mark Miller)

SOLR-5359: ZooKeeper client is not closed when it fails to connect to an ensemble.
(Mark Miller, Klaus Herrmann)

SOLR-5042: MoreLikeThisComponent was using the rows/count value in place of
flags, which caused a number of very strange issues, including NPEs and
ignoring requests for the results to include the score.
(Anshum Gupta, Mark Miller, Shawn Heisey)

SOLR-4882: SolrResourceLoader was restricted to only allow access to resource
files below the instance dir. The reason for this is security related: Some
Solr components allow to pass in resource paths via REST parameters
(e.g. XSL stylesheets, velocity templates,...) and load them via resource
loader. For backwards compatibility, this security feature can be disabled
by a new system property: solr.allow.unsafe.resourceloading=true
(Uwe Schindler)

SOLR-5323: Disable ClusteringComponent by default in collection1 example.
The solr.clustering.enabled system property needs to be set to 'true'
to enable the clustering contrib (reverts SOLR-4708).
(Dawid Weiss)

XML configuration parsing is now more strict about situations where a single
setting is allowed but multiple values are found. In the past, one value
would be chosen arbitrarily and silently. Starting with 4.5, configuration
parsing will fail with an error in situations like this. If you see error
messages such as "solrconfig.xml contains more than one value for config path:
XXXXX" or "Found Z configuration sections when at most 1 is allowed matching
expression: XXXXX" check your solrconfig.xml file for multiple occurrences of
XXXXX and delete the ones that you do not wish to use. See SOLR-4953 &
SOLR-5108 for more details.
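
The stricter behavior can be pictured with a short Python sketch that fails
when a single-valued setting appears more than once, instead of silently
picking one. This is an illustration of the idea only, not Solr's parser; the
helper name, the error wording, and the sample fragment are assumptions made
for the example.

```python
import xml.etree.ElementTree as ET


def single_config_value(xml_text: str, path: str):
    """Return the text of the one element matching `path`, or None if absent.

    Raises ValueError when more than one element matches, rather than
    choosing one arbitrarily (hypothetical helper; not Solr's parser).
    """
    root = ET.fromstring(xml_text)
    matches = root.findall(path)
    if len(matches) > 1:
        raise ValueError(
            f"config contains more than one value for config path: {path}"
        )
    return matches[0].text if matches else None


# A made-up configuration fragment with a duplicated setting.
DUPLICATED = """
<config>
  <indexConfig>
    <useCompoundFile>true</useCompoundFile>
    <useCompoundFile>false</useCompoundFile>
  </indexConfig>
</config>
"""
```

Calling single_config_value(DUPLICATED, "indexConfig/useCompoundFile") raises
ValueError; deleting one of the two elements makes the call succeed.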

In the past, schema.xml parsing would silently ignore "default" or "required"
options specified on <dynamicField/> declarations. Beginning with 4.5, attempting
to configure these on a dynamic field will cause an init error. If you
encounter one of these errors when upgrading an existing schema.xml, you can
safely remove these attributes, regardless of their value, from your config and
Solr will continue to behave exactly as it did in previous versions. See
SOLR-5227 for more details.

The UniqFieldsUpdateProcessorFactory has been improved to support all of the
FieldMutatingUpdateProcessorFactory selector options. The <lst name="fields">
init param option is now deprecated and should be replaced with the more standard
<arr name="fieldName">. See SOLR-4249 for more details.

UpdateRequestExt has been removed as part of SOLR-4816. You should use UpdateRequest
instead.

CloudSolrServer can now use multiple threads to add documents by default. This is a
small change in runtime semantics when using the bulk add method - you will still
end up with the same exception on a failure, but some documents beyond the one that
failed may have made it in. To get the old, single threaded behavior, set parallel updates
to false on the CloudSolrServer instance.

SOLR-2345: Enhanced geodist() to work with an RPT field, provided that the
field is referenced via 'sfield' and the query point is constant.
(David Smiley)

SOLR-5082: The encoding of URL-encoded query parameters can be changed with
the "ie" (input encoding) parameter, e.g. "select?q=m%FCller&ie=ISO-8859-1".
The default is UTF-8. To change the encoding of POSTed content, use the
"Content-Type" HTTP header.
(Uwe Schindler, Shawn Heisey)
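
A minimal client-side sketch of the "ie" parameter, using only the Python
standard library. The helper name select_url is hypothetical; the point is
that the query is percent-encoded in the same encoding that "ie" declares.

```python
from urllib.parse import urlencode


def select_url(query: str, input_encoding: str = "UTF-8") -> str:
    """Build a relative select URL (hypothetical helper).

    The parameters are percent-encoded using `input_encoding`, and the
    same encoding name is passed along in the "ie" parameter.
    """
    params = {"q": query, "ie": input_encoding}
    return "select?" + urlencode(params, encoding=input_encoding)
```

With the query "müller" and input_encoding="ISO-8859-1" this yields the URL
from the entry above, select?q=m%FCller&ie=ISO-8859-1.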

SOLR-4249: UniqFieldsUpdateProcessorFactory now extends
FieldMutatingUpdateProcessorFactory and supports all of its selector options. Use
of the "fields" init param is now deprecated in favor of "fieldName".
(hossman)

SOLR-2548: Allow multiple threads to be specified for faceting. When threading, one
can specify facet.threads to parallelize loading the uninverted fields. In at least
one extreme case this reduced warmup time from 20 seconds to 3 seconds.
(Janne Majaranta, Gun Akkor via Erick Erickson, David Smiley)

SOLR-4679, SOLR-4908, SOLR-5124: Text extracted from HTML or PDF files
using Solr Cell was missing ignorable whitespace, which is inserted by
TIKA for convenience to support plain text extraction without using the
HTML elements. This bug resulted in glued words.
(hossman, Uwe Schindler)

SOLR-5227: Correctly fail schema initialization if a dynamicField is configured to
be required, or have a default value.
(hossman)

SOLR-5231: Fixed a bug with the behavior of BoolField that caused documents w/o
a value for the field to act as if the value were true in functions if no other
documents in the same index segment had a value of true.
(Robert Muir, hossman, yonik)

SOLR-5233: The "deleteshard" collections API doesn't wait for cluster state to update,
can fail if some nodes of the deleted shard were down, and has incorrect logging.
(Christine Poerschke, shalin)

TieredMergePolicy and the various subtypes of LogMergePolicy no longer have
an explicit "setUseCompoundFile" method. Instead the behavior of new
segments is determined by the IndexWriter configuration, and the MergePolicy
is only consulted to determine if merged segments should use the compound
file format (based on the value of "setNoCFSRatio"). If you have explicitly
configured one of these classes using <mergePolicy> and include an init arg
like this...
<bool name="useCompoundFile">true</bool>
...this will now be treated as if you specified...
<useCompoundFile>true</useCompoundFile>
...directly on the <indexConfig> (overriding any value already set using that
syntax) and a warning will be logged to update your configuration. Users
with an explicitly declared <mergePolicy> are encouraged to review the
current javadocs for their MergePolicy subclass and review their configured
options carefully. See SOLR-4941, SOLR-4934 and LUCENE-5038 for more
information.

SOLR-4778: The signature of LogWatcher.registerListener has changed, from
(ListenerConfig, CoreContainer) to (ListenerConfig). Users implementing their
own LogWatcher classes will need to change their code accordingly.

LUCENE-5063: ByteField and ShortField have been deprecated and will be removed
in 5.0. If you are still using these field types, you should migrate your
fields to TrieIntField.

SOLR-3240: Add "spellcheck.collateMaxCollectDocs" option so that when testing
potential Collations against the index, SpellCheckComponent will only collect
n documents, thereby estimating the hit-count. This is a performance optimization
in cases where exact hit-counts are unnecessary. Also, when "collateExtendedResults"
is false, this optimization is always made.
(James Dyer)

SOLR-4893: Extend FieldMutatingUpdateProcessor.ConfigurableFieldNameSelector
to enable checking whether a field matches any schema field. To select field
names that don't match any fields or dynamic fields in the schema, add
<bool name="fieldNameMatchesSchemaField">false</bool> to an update
processor's configuration in solrconfig.xml.
(Steve Rowe, hossman)

SOLR-4916: Add support to write and read Solr index files and transaction log
files to and from HDFS.
(phunt, Mark Miller, Gregory Chanan)

SOLR-4892: Add FieldMutatingUpdateProcessorFactory subclasses
Parse{Date,Integer,Long,Float,Double,Boolean}UpdateProcessorFactory. These
factories have a default selector that matches all fields that either don't
match any schema field, or are in the schema with the corresponding
typeClass. If they see a value that is not a CharSequence, or can't parse
the value, they leave it as is. For multi-valued fields, these processors
will not convert any values unless all are first successfully parsed, or
already are instances of the target class. Ordering the processors, e.g.
[Boolean, Long, Double, Date] will allow e.g. values ["2", "5", "8.6"] to
be left alone by the Boolean and Long processors, but then converted by the
Double processor.
(Steve Rowe, hossman)
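
The all-or-nothing rule and the effect of processor ordering can be simulated
in a few lines of Python. This is a sketch of the behavior described above,
not the Solr implementation: the parsers are Python's own (so accepted formats
differ from Solr's), and every name below is hypothetical.

```python
from datetime import datetime


def parse_boolean(text: str) -> bool:
    lowered = text.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    raise ValueError(text)


def parse_date(text: str) -> datetime:
    # Assumed date format for this sketch only.
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")


# Processor order from the entry above: Boolean, Long, Double, Date.
PARSERS = [
    (bool, parse_boolean),
    (int, int),
    (float, float),
    (datetime, parse_date),
]


def convert_all_or_nothing(values, target_type, parser):
    """Convert every value, or return the input untouched if any one fails."""
    converted = []
    for value in values:
        if type(value) is target_type:
            converted.append(value)          # already the target class
        elif isinstance(value, str):
            try:
                converted.append(parser(value))
            except ValueError:
                return values                # unparseable: leave all as is
        else:
            return values                    # not a string: leave all as is
    return converted


def run_chain(values):
    """Apply each simulated processor in order."""
    for target_type, parser in PARSERS:
        values = convert_all_or_nothing(values, target_type, parser)
    return values
```

Here run_chain(["2", "5", "8.6"]) passes through the Boolean and Long stages
unchanged, because "2" is not a boolean and "8.6" is not an integer, and is
then converted as a whole by the Double stage.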

SOLR-4693: A "deleteshard" collections API that unloads all replicas of a given
shard and then removes it from the cluster state. It will remove only those shards
which are INACTIVE or have no range (created for custom sharding).
(Anshum Gupta, shalin)

SOLR-4943: Add a new system wide info admin handler that exposes the system info
that could previously only be retrieved using a SolrCore.
(Mark Miller)

SOLR-3076: Block joins. Documents and their sub-documents must be indexed
as a block.
{!parent which=<allParents>}<someChildren> takes in a query that matches child
documents and results in matches on their parents.
{!child of=<allParents>}<someParents> takes in a query that matches some parent
documents and results in matches on their children.
(Mikhail Khludnev, Vadim Kirilchuk, Alan Woodward, Tom Burton-West, Mike McCandless,
hossman, yonik)
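
For illustration, the two query forms can be assembled as plain strings. The
Python helpers below are hypothetical, as are the field names in the usage
note; the local-param value is always double-quoted, and values that
themselves contain double quotes are rejected rather than escaped.

```python
def _local_param(value: str) -> str:
    """Double-quote a local-param value (sketch: no escaping support)."""
    if '"' in value:
        raise ValueError("this sketch does not handle embedded double quotes")
    return '"' + value + '"'


def parent_query(all_parents: str, some_children: str) -> str:
    """Query matching child documents, returning their parents."""
    return "{!parent which=" + _local_param(all_parents) + "}" + some_children


def child_query(all_parents: str, some_parents: str) -> str:
    """Query matching parent documents, returning their children."""
    return "{!child of=" + _local_param(all_parents) + "}" + some_parents
```

For example, parent_query("content_type:parent", "comment:solr") produces
{!parent which="content_type:parent"}comment:solr, where content_type and
comment are made-up field names.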

SOLR-4805: SolrCore#reload should not call preRegister and publish a DOWN state to
ZooKeeper.
(Mark Miller, Jared Rodriguez)

SOLR-4899: When reconnecting after ZooKeeper expiration, we need to be willing to wait
forever, not just for 30 seconds.
(Mark Miller)

SOLR-4920: JdbcDataSource incorrectly suppresses exceptions when retrieving a connection from
a JNDI context and falls back to trying to use DriverManager to obtain a connection. Additionally,
if a SQLException is thrown while initializing a connection, such as in setAutoCommit(), the
connection will not be closed.
(Chris Eldredge via shalin)

SOLR-4915: The root cause should be returned to the user when a SolrCore create call fails.
(Mark Miller)

SOLR-4910: persisting solr.xml is broken. More stringent testing of persistence fixed
up a number of issues and several bugs with persistence. Among them are
> don't persist implicit properties
> should persist zkHost in the <solr> tag (user's list)
> reloading a core that has transient="true" returned an error. reload should load
a transient core if it's not yet loaded.
> No longer persisting loadOnStartup or transient core properties if they were not
specified in the original solr.xml
> Testing flushed out the fact that you couldn't swap a core marked transient=true
loadOnStartup=false because it hadn't been loaded yet.
> SOLR-4862, CREATE fails to persist schema, config, and dataDir
> SOLR-4363, not persisting coreLoadThreads in <solr> tag
> SOLR-3900, logWatcher properties not persisted
> SOLR-4850, cores defined as loadOnStartup=true, transient=false can't be searched
(Erick Erickson)

SOLR-4923: Commits to non leaders as part of a request that also contain updates
can execute out of order.
(hossman, Ricardo Merizalde, Mark Miller)

SOLR-4932: persisting solr.xml saves some parameters it shouldn't when they weren't
defined in the original. Benign since the default values are saved, but still incorrect.
(Erick Erickson, thanks Shawn Heisey for helping test!)

SOLR-4982: Creating a core while referencing system properties looks like it loses files.
Actually, instanceDir, config, dataDir and schema are not dereferenced properly
when creating cores that reference sys vars (e.g. &dataDir=${dir}). In the dataDir
case in particular this leads to the index being put in a directory literally named
${dir} but on restart the sysvar will be properly dereferenced.

SOLR-4978: Time is stripped from datetime column when imported into Solr date field
if convertType=true.
(Bill Au, shalin)

SOLR-5019: spurious ConcurrentModificationException when spell check component
was in use with filters.
(yonik)

SOLR-5018: The Overseer should avoid publishing the state for collections that do not
exist under the /collections zk node.
(Mark Miller)

SOLR-5028, SOLR-5029: ShardHandlerFactory was not being created properly when
using new-style solr.xml, and was not being persisted properly when using
old-style.
(Tomás Fernández Löbbe, Ryan Ernst, Alan Woodward)

LUCENE-5107: Properties files written by Solr now use UTF-8 encoding;
Unicode is no longer escaped. Reading of legacy properties files with
\u escapes is still possible.
(Uwe Schindler, Robert Muir)

Detailed Change List

SOLR-4795: Sub shard leader should not accept any updates from parent after
it goes active
(shalin)

SOLR-4798: shard splitting does not respect the router for the collection
when executing the index split. One effect of this is that documents
may be placed in the wrong shard when the default compositeId router
is used in conjunction with IDs containing "!".
(yonik)

SOLR-4797: Shard splitting creates sub shards which have the wrong hash
range in cluster state. This happens when numShards is not a power of two
and router is compositeId.
(shalin)

SOLR-4806: Shard splitting does not abort if WaitForState times out
(shalin)

SOLR-4807: The zkcli script now works with log4j. The zkcli.bat script
was broken on Windows in 4.3.0; it now works.
(Shawn Heisey)

In the schema REST API, the output path for copyFields and dynamicFields
has been changed from all lowercase "copyfields" and "dynamicfields" to
camelCase "copyFields" and "dynamicFields", respectively, to align with all
other schema REST API outputs, which use camelCase. The URL format remains
the same: all resource names are lowercase. See SOLR-4623 for details.

Slf4j/logging jars are no longer included in the Solr webapp. All logging
jars are now in example/lib/ext. Changing logging impls is now as easy as
updating the jars in this folder with those necessary for the logging impl
you would like. If you are using another webapp container, these jars will
need to go in the corresponding location for that container.
In conjunction, the dist-excl-slf4j and dist-war-excl-slf4 build targets
have been removed since they are redundant. See the Slf4j documentation,
SOLR-3706, and SOLR-4651 for more details.

The hardcoded SolrCloud defaults for 'hostContext="solr"' and
'hostPort="8983"' have been deprecated and will be removed in Solr 5.0.
Existing solr.xml files that do not have these options explicitly specified
should be updated accordingly. See SOLR-4622 for more details.

Detailed Change List

SOLR-4648 PreAnalyzedUpdateProcessorFactory allows using the functionality
of PreAnalyzedField with other field types. See javadoc for details and
examples.
(Andrzej Bialecki)

SOLR-4623: Provide REST API read access to all elements of the live schema.
Add a REST API request to return the entire live schema, in JSON, XML, and
schema.xml formats. Move REST API methods from package org.apache.solr.rest
to org.apache.solr.rest.schema, and rename base functionality REST API
classes to remove the current schema focus, to prepare for other non-schema
REST APIs. Change output path for copyFields and dynamicFields from
"copyfields" and "dynamicfields" (all lowercase) to "copyFields" and
"dynamicFields", respectively, to align with all other REST API outputs, which
use camelCase.
(Steve Rowe)

SOLR-4658: In preparation for REST API requests that can modify the schema,
a "managed schema" is introduced.
Add '<schemaFactory class="ManagedSchemaFactory" mutable="true"/>' to solrconfig.xml
in order to use it, and to enable schema modifications via REST API requests.
(Steve Rowe, Robert Muir)

SOLR-4656: Added two new highlight parameters, hl.maxMultiValuedToMatch and
hl.maxMultiValuedToExamine. maxMultiValuedToMatch stops looking for snippets after
finding the specified number of matches, no matter how far into the multivalued field
you've gone. maxMultiValuedToExamine stops looking for matches after the specified
number of multiValued entries have been examined. If both are specified, the limit
hit first stops the loop. Also this patch cuts down on the copying of the document
entries during highlighting. These optimizations are probably unnoticeable unless
there are a large number of entries in the multiValued field. Notably, this will
prevent the "best" match from being found if it appears later in the MV list than the
cutoff specified by either of these params.
(Erick Erickson)
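
How the two limits interact can be pictured with a small Python simulation. It
is only a model of the description above, not the highlighter's code: a
substring test stands in for real matching, None stands in for "no limit", and
the function and argument names are hypothetical.

```python
def scan_values(values, term, max_to_match=None, max_to_examine=None):
    """Return the indexes of the values that would be considered for snippets.

    The loop stops at whichever limit is reached first.
    """
    matched = []
    for examined, value in enumerate(values):
        if max_to_examine is not None and examined >= max_to_examine:
            break                      # examined enough entries
        if max_to_match is not None and len(matched) >= max_to_match:
            break                      # found enough matches
        if term in value:
            matched.append(examined)
    return matched
```

If the strongest match sits in the last entry of a long multiValued field, any
limit that stops the loop earlier means that entry is never looked at, which
is the trade-off the entry above warns about.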

SOLR-4675: Improve PostingsSolrHighlighter to support per-field/query-time overrides
and add additional configuration parameters. See the javadocs for more details and
examples.
(Robert Muir)

SOLR-4662: Discover SolrCores by directory structure rather than defining them
in solr.xml. Also, change the format of solr.xml to be closer to that of solrconfig.xml.
This version of Solr will ship the example in the old style, but you can manually
try the new style. Solr 4.4 will ship with the new style, and Solr 5.0 will remove
support for the old style.
(Erick Erickson, Mark Miller)

SOLR-4710: You cannot delete a collection fully from ZooKeeper unless all nodes are up and
functioning correctly.
(Mark Miller)

SOLR-4487: SolrExceptions thrown by HttpSolrServer will now contain the
proper HTTP status code returned by the remote server, even if that status
code is not something Solr itself returned -- e.g., from the Servlet Container
or an intermediate HTTP Proxy.
(hossman)

SOLR-4661: Admin UI Replication details now correctly displays the current
replicable generation/version of the master.
(hossman)

SOLR-4622: The hardcoded SolrCloud defaults for 'hostContext="solr"' and
'hostPort="8983"' have been deprecated and will be removed in Solr 5.0.
Existing solr.xml files that do not have these options explicitly specified
should be updated accordingly.
(hossman)

SOLR-4672: Requests attempting to use SolrCores which had init failures
(that would be reported by CoreAdmin STATUS requests) now result in 500
error responses with the details about the init failure, instead of 404
error responses.
(hossman)

SOLR-4730: Make the wiki link more prominent in the release documentation.
(Uri Laserson via Robert Muir)

SOLR-4624: CachingDirectoryFactory does not need to support forceNew any
longer and it appears to be causing a missing close directory bug. forceNew
is no longer respected and will be removed in 4.3.
(Mark Miller)

SOLR-4451: SolrJ, and SolrCloud internals, now use SystemDefaultHttpClient
under the covers -- allowing many HTTP connection related properties to be
controlled via 'standard' java system properties.
(hossman)

SOLR-4078: Allow custom naming of SolrCloud nodes so that a new host:port
combination can take over for a previous shard.
(Mark Miller)

SOLR-4210: Requests to a Collection that does not exist on the receiving node
should be proxied to a suitable node.
(Mark Miller, Po Rui, yonik)

SOLR-1365: New SweetSpotSimilarityFactory allows customizable TF/IDF based
Similarity when you know the optimal "Sweet Spot" of values for the field
length and TF scoring factors.
(hossman)

SOLR-4138: CurrencyField fields can now be used in ValueSources to
get the "raw" value (using the default number of fractional digits) in
the default currency of the field type. There is also a new
currency(field,[CODE]) function for generating a ValueSource of the
"natural" value, converted to an optionally specified currency to
override the default for the field type.
(hossman)

SOLR-4511: When a new index is replicated into place, we need
to update the most recent replicatable index point without
doing a commit. This is important for repeater use cases, as
well as when nodes may switch master/slave roles.
(Mark Miller, Raúl Grande)

SOLR-4515: CurrencyField's OpenExchangeRatesOrgProvider now requires
a ratesFileLocation init param, since the previous global default
no longer works
(hossman)

SOLR-4518: Improved CurrencyField error messages when attempting to
use a Currency that is not supported by the current JVM.
(hossman)

SOLR-3798: Fix copyField implementation in IndexSchema to handle
dynamic field references that aren't string-equal to the name of
the referenced dynamic field.
(Steve Rowe)

BaseDistributedSearchTestCase now randomizes the servlet context it uses when
creating Jetty instances. Subclasses that assume a hard coded context of
"/solr" should either be fixed to use the "String context" variable, or should
take advantage of the new BaseDistributedSearchTestCase(String) constructor
to explicitly specify a fixed servlet context path. See SOLR-4136 for details.

Detailed Change List

SOLR-2255: Enhanced pivot faceting to use local-params in the same way that
regular field value faceting can. This means support for excluding a filter
query, using a different output key, and specifying 'threads' to do
facet.method=fcs concurrently. PivotFacetHelper now extends SimpleFacet and
the getFacetImplementation() extension hook was removed.
(dsmiley)

SOLR-3897: A highlighter parameter "hl.preserveMulti" to return all of the
values of a multiValued field in their original order when highlighting.
(Joel Bernstein via yonik)

SOLR-4051: Add <propertyWriter /> element to DIH's data-config.xml file,
allowing the user to specify the location, filename and Locale for
the "data-config.properties" file. Alternatively, users can specify their
own property writer implementation for greater control. This new configuration
element is optional, and defaults mimic prior behavior. The one exception is
that the "root" locale is default. Previously it was the machine's default locale.
(James Dyer)

SOLR-4084: Add FuzzyLookupFactory, which is like AnalyzingSuggester except that
it can tolerate typos in the input.
(Areek Zillur via Robert Muir)

SOLR-4118: Fix replicationFactor to align with industry usage.
replicationFactor now means the total number of copies
of a document stored in the collection (or the total number of
physical indexes for a single logical slice of the collection).
For example if replicationFactor=3 then for a given shard there
will be a total of 3 replicas (one of which will normally be
designated as the leader.)
(yonik)
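
The arithmetic behind the new meaning of replicationFactor can be sketched
as follows. This is only an illustration of the definition above; the function
names are hypothetical and are not part of any Solr API.

```python
def total_cores(num_shards, replication_factor):
    """Total physical indexes when replicationFactor counts every copy,
    including the one normally designated as leader."""
    return num_shards * replication_factor


def non_leader_replicas_per_shard(replication_factor):
    """Copies of a shard other than the one normally designated as leader."""
    return replication_factor - 1


# Example: a 2-shard collection created with replicationFactor=3.
cores = total_cores(2, 3)
followers = non_leader_replicas_per_shard(3)
```

Under this definition, replicationFactor=1 means a single copy of each shard
and no additional replicas.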

SOLR-4124: You should be able to set the update log directory with the
CoreAdmin API the same way as the data directory.
(Mark Miller)

SOLR-4028: When using ZK chroot, it would be nice if Solr would create the
initial path when it doesn't exist.
(Tomás Fernández Löbbe via Mark Miller)

SOLR-1028: The ability to specify "transient" and "loadOnStartup" as new properties of
<core> tags in solr.xml. Can specify "transientCacheSize" in the <cores> tag. Together
these allow cores to be loaded only when needed and only transientCacheSize transient
cores will be loaded at a time, the rest aged out on an LRU basis.
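
The aging-out rule described above can be pictured with a small model. This is
a sketch, not Solr's implementation: the class TransientCoreCache and its
methods are hypothetical, and the "core" values are placeholders. It only
models "at most transientCacheSize transient cores loaded, least recently used
aged out first".

```python
from collections import OrderedDict


class TransientCoreCache:
    """Toy model of an LRU cache of loaded transient cores (hypothetical)."""

    def __init__(self, transient_cache_size):
        self.capacity = transient_cache_size
        self.loaded = OrderedDict()  # core name -> placeholder for a loaded core

    def get(self, name):
        """Return the core, loading it on demand and evicting the LRU core if full."""
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently used
            return self.loaded[name]
        if len(self.loaded) >= self.capacity:
            self.loaded.popitem(last=False)  # age out the least recently used core
        self.loaded[name] = "core:" + name
        return self.loaded[name]


cache = TransientCoreCache(transient_cache_size=2)
for core in ["a", "b", "a", "c"]:
    cache.get(core)
loaded_names = list(cache.loaded)  # "b" was least recently used, so it was aged out
```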

SOLR-4246: When update.distrib is set to skip update processors before
the distributed update processor, always include the log update processor
so forwarded updates will still be logged.
(yonik)

SOLR-4230: The new Solr 4 spatial fields now work with the {!geofilt} and
{!bbox} query parsers. The score local-param works too.
(David Smiley)

SOLR-4255: The new Solr 4 spatial fields now have a 'filter' boolean local-param
that can be set to false to not filter. It's useful when there is already a spatial
filter query but you also need to sort or boost by distance.
(David Smiley)

SOLR-4265, SOLR-4283: Solr now parses request parameters (in URL or sent with POST
using content-type application/x-www-form-urlencoded) in its dispatcher code. It no
longer relies on special configuration settings in Tomcat or other web containers
to enable UTF-8 encoding, which is mandatory for correct Solr behaviour. Query
strings passed in via the URL need to be properly-%-escaped, UTF-8 encoded
bytes, otherwise Solr refuses to handle the request. The maximum length of
x-www-form-urlencoded POST parameters can now be configured through the
requestDispatcher/requestParsers/@formdataUploadLimitInKB setting in
solrconfig.xml (defaults to 2 MiB). Solr now works out of the box with
e.g. Tomcat, JBoss,...
(Uwe Schindler, Dawid Weiss, Alex Rocher)
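
On the client side, "properly %-escaped, UTF-8 encoded bytes" can be produced
with a standard URL-encoding routine. The sketch below uses Python's standard
library purely as an illustration; the parameter name "q" and the sample term
are assumptions chosen for the example.

```python
from urllib.parse import quote, urlencode

term = "caf\u00e9"  # "cafe" with an acute accent on the final letter

# quote() encodes the text as UTF-8 first, then %-escapes each non-ASCII byte.
escaped = quote(term, safe="")

# urlencode() applies the same UTF-8 based escaping to whole parameter maps.
query_string = urlencode({"q": term})
```

The accented character becomes the two UTF-8 bytes C3 A9, so the escaped term
is "caf%C3%A9".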

SOLR-4302: New parameter 'indexInfo' (defaults to true) in CoreAdmin STATUS
command can be used to omit index specific information
(Shahar Davidson via shalin)

SOLR-2592: Collection specific document routing. The "compositeId"
router is the default for collections with hash based routing (i.e. when
numShards=N is specified on collection creation). Documents with ids sharing
the same domain (prefix) will be routed to the same shard, allowing for
efficient querying.

Example:
Two documents whose ids both begin with "customerB!" will be indexed to the
same shard, since they share the same domain.

Collections that do not specify numShards at collection creation time
use custom sharding and default to the "implicit" router. Document updates
received by a shard will be indexed to that shard, unless a "_shard_" parameter
or document field names a different shard.
(Michael Garski, Dan Rosher, yonik)
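
The notion of a shared domain (prefix) can be sketched as below. The document
ids are hypothetical, and the sketch only groups ids by the text up to and
including the "!" separator; it does not model how Solr hashes a domain to a
particular shard.

```python
from collections import defaultdict


def route_domain(doc_id):
    """Return the domain prefix (including '!') of an id, or None if it has none."""
    prefix, sep, _rest = doc_id.partition("!")
    return prefix + sep if sep else None


# Hypothetical ids, for illustration only.
ids = ["customerB!doc1", "customerB!doc2", "customerA!doc7", "plain-id"]

by_domain = defaultdict(list)
for doc_id in ids:
    by_domain[route_domain(doc_id)].append(doc_id)
```

Ids that land in the same group here share a domain and would therefore be
routed together by the "compositeId" router.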

SOLR-4001: In CachingDirectoryFactory#close, if there are still refs for a
Directory outstanding, we need to wait for them to be released before closing.
(Mark Miller)

SOLR-4005: If CoreContainer fails to register a created core, it should close it.
(Mark Miller)

SOLR-4009: OverseerCollectionProcessor is not resilient to many error conditions
and can stop running on errors.
(Raintung Li, milesli, Mark Miller)

SOLR-4019: Log stack traces for 503/Service Unavailable SolrException if not
thrown by PingRequestHandler. Do not log exceptions if a user tries to view a
hidden file using ShowFileRequestHandler.
(Tomás Fernández Löbbe via James Dyer)

SOLR-4031: Upgrade to Jetty 8.1.7 to fix a bug where on very rare occasions
the content of two concurrent requests gets mixed up.
(Per Steffensen, yonik)

SOLR-4060: ReplicationHandler can try and do a snappull and open a new IndexWriter
after shutdown has already occurred, leaving an IndexWriter that is not closed.
(Mark Miller)

SOLR-4055: Fix a thread safety issue with the Collections API that could
cause actions to be targeted at the wrong SolrCores.
(Raintung Li, Per Steffensen via Mark Miller)

SOLR-3993: If multiple SolrCores for a shard coexist on a node, on cluster
restart, leader election would stall until timeout, waiting to see all of
the replicas come up.
(Mark Miller, Alexey Kudinov)

SOLR-2045: Databases that require a commit to be issued before closing the
connection on a non-read-only database leak connections. Also expanded the
SqlEntityProcessor test to sometimes use Derby as well as HSQLDB (Derby is
one db affected by this bug).
(Fenlor Sebastia, James Dyer)

SOLR-4064: When there is an unexpected exception while trying to run the new
leader process, the SolrCore will not correctly rejoin the election.
(Po Rui via Mark Miller)

SOLR-4036: field aliases in fl should not cause properties of target field
to be used.
(Martin Koch, yonik)

SOLR-4003: The SolrZKClient clean method should not try and clear zk paths
that start with /zookeeper, as this can fail and stop the removal of
further nodes.
(Mark Miller)

SOLR-4076: SolrQueryParser should run fuzzy terms through
MultiTermAwareComponents to ensure that (for example) a fuzzy query of
foobar~2 is equivalent to FooBar~2 on a field that includes lowercasing.
(yonik)

SOLR-4081: QueryParsing.toString, used during debugQuery=true, did not
correctly handle ExtendedQueries such as WrappedQuery
(used when cache=false), spatial queries, and frange queries.
(Eirik Lygre, yonik)

SOLR-4178: ReplicationHandler should abort any current pulls and wait for
its executor to stop during core close.
(Mark Miller)

SOLR-3918: Fixed the 'dist-war-excl-slf4j' ant target to exclude all
slf4j jars, so that the resulting war is usable as is provided the servlet
container includes the correct slf4j api and impl jars.
(Shawn Heisey, hossman)

SOLR-4067: ZkStateReader#getLeaderProps should not return props for a leader
that it does not think is live.
(Mark Miller)

SOLR-4086: DIH refactor of VariableResolver and Evaluator. VariableResolver
and each built-in Evaluator are separate concrete classes. DateFormatEvaluator
now defaults with the ROOT Locale. However, users may specify a different
Locale using an optional new third parameter.
(James Dyer)

In order to better support distributed search mode, the TermVectorComponent's
response format has been changed so that if the schema defines a
uniqueKeyField, then that field value is used as the "key" for each document in
its response section, instead of the internal Lucene doc id. Users without a
uniqueKeyField will continue to see the same response format. See SOLR-3229
for more details.

If you are using SolrCloud's distributed update request capabilities and a non
string type id field, you must re-index.

Solr is now much more strict about requiring that the uniqueKeyField feature
(if used) must refer to a field which is not multiValued. If you upgrade from
an earlier version of Solr and see an error that your uniqueKeyField "can not
be configured to be multivalued" please add 'multiValued="false"' to the
<field /> declaration for your uniqueKeyField. See SOLR-3682 for more details.

In addition, please review the notes above about upgrading from 4.0.0-BETA

The Lucene index format has changed and as a result, once you upgrade,
previous versions of Solr will no longer be able to read your indices.
In a master/slave configuration, all searchers/slaves should be upgraded
before the master. If the master were to be updated first, the older
searchers would not be able to read the new index format.

Setting abortOnConfigurationError=false is no longer supported
(since it has never worked properly). Solr will now warn you if
you attempt to set this configuration option at all. (see SOLR-1846)

The default logic for the 'mm' param of the 'dismax' QParser has
been changed. If no 'mm' param is specified (either in the query,
or as a default in solrconfig.xml) then the effective value of the
'q.op' param (either in the query or as a default in solrconfig.xml
or from the 'defaultOperator' option in schema.xml) is used to
influence the behavior. If q.op is effectively "AND" then mm=100%.
If q.op is effectively "OR" then mm=0%. Users who wish to force the
legacy behavior should set a default value for the 'mm' param in
their solrconfig.xml file.
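
The new default can be summarized as a small decision function. This is a
sketch of the rule as stated above, not Solr's code; the function name
effective_mm is hypothetical, and q_op stands for the already-resolved
effective value of 'q.op'.

```python
def effective_mm(mm=None, q_op="OR"):
    """Return the 'mm' value dismax would effectively use under the new default logic."""
    if mm is not None:
        return mm  # an explicit mm (request or solrconfig.xml default) always wins
    # No mm specified: derive it from the effective q.op.
    return "100%" if q_op.upper() == "AND" else "0%"
```

For example, effective_mm(q_op="AND") yields "100%", while an explicitly
configured default such as effective_mm(mm="2", q_op="AND") is returned
unchanged.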

The VelocityResponseWriter is no longer built into the core. Its JAR and
dependencies now need to be added (via <lib> or solr/home lib inclusion),
and it needs to be registered in solrconfig.xml like this:
<queryResponseWriter name="velocity" class="solr.VelocityResponseWriter"/>

The update request parameter to choose Update Request Processor Chain is
renamed from "update.processor" to "update.chain". The old parameter had been
deprecated (but still worked) since Solr 3.2, and is now removed
entirely.

The <indexDefaults> and <mainIndex> sections of solrconfig.xml are discontinued
and replaced with the <indexConfig> section. There are also better defaults.
When migrating, if you don't know what your old settings mean, simply delete
both <indexDefaults> and <mainIndex> sections. If you have customizations,
put them in <indexConfig> section - with same syntax as before.

Two of the SolrServer subclasses in SolrJ were renamed/replaced.
CommonsHttpSolrServer is now HttpSolrServer, and
StreamingUpdateSolrServer is now ConcurrentUpdateSolrServer.

The PingRequestHandler no longer looks for a <healthcheck/> option in the
(legacy) <admin> section of solrconfig.xml. Users who wish to take
advantage of this feature should configure a "healthcheckFile" init param
directly on the PingRequestHandler. As part of this change, relative file
paths have been fixed to be resolved against the data dir. See the example
solrconfig.xml and SOLR-1258 for more details.

Due to low level changes to support SolrCloud, the uniqueKey field can no
longer be populated via <copyField/> or <field default=...> in the
schema.xml. Users wishing to have Solr automatically generate a uniqueKey
value when adding documents should instead use an instance of
solr.UUIDUpdateProcessorFactory in their update processor chain. See
SOLR-2796 for more details.

In addition, please review the notes above about upgrading from 4.0.0-BETA, and 4.0.0-ALPHA

SOLR-3597: seems like a lot of wasted whitespace at the top of the admin screens
(steffkes)

SOLR-3304: Added Solr adapters for Lucene 4's new spatial module. With
SpatialRecursivePrefixTreeFieldType ("location_rpt" in example schema), it is
possible to index a variable number of points per document (and sort on them),
index not just points but any Spatial4j supported shape such as polygons, and
to query on these shapes too. Polygons require adding JTS to the classpath.
(David Smiley)

SOLR-3825: Added optional capability to log what ids are in a response
(Scott Stults via gsingers)

SOLR-3715: improve concurrency of the transaction log by removing
synchronization around log record serialization.
(yonik)

SOLR-3807: Currently during recovery we pause for a number of seconds after
waiting for the leader to see a recovering state so that any previous updates
will have finished before our commit on the leader - we don't need this wait
for peersync.
(Mark Miller)

SOLR-3837: When a leader is elected and asks replicas to sync back to him and
that fails, we should ask those nodes to recover asynchronously rather than
synchronously.
(Mark Miller)

SOLR-3709: Cache the url list created from the ClusterState in CloudSolrServer
on each request.
(Mark Miller)

SOLR-3685: Solr Cloud sometimes skipped peersync attempt and replicated instead due
to tlog flags not being cleared when no updates were buffered during a previous
replication.
(Markus Jelsma, Mark Miller, yonik)

SOLR-3793: UnInvertedField faceting cached big terms in the filter
cache that ignored deletions, leading to duplicate documents in search
later when a filter of the same term was specified.
(Günter Hipler, hossman, yonik)

SOLR-3795: Fixed LukeRequestHandler response to correctly return field name
strings in copyDests and copySources arrays
(hossman)

SOLR-3699: Fixed some Directory leaks when there were errors during SolrCore
or SolrIndexWriter initialization.
(hossman)

SOLR-3518: Include final 'hits' in log information when aggregating a
distributed request
(Markus Jelsma via hossman)

SOLR-3628: SolrInputField and SolrInputDocument are now consistently backed
by Collections passed in to setValue/setField, and defensively copy values
from Collections passed to addValue/addField
(Tom Switzer via hossman)

SOLR-3595: CurrencyField now generates an appropriate error on schema init
if it is configured as multiValued - this has never been properly supported,
but previously failed silently in odd ways.
(hossman)

SOLR-3823: Fix 'bq' parsing in edismax. Please note that this required
reverting the negative boost support added by SOLR-3278.
(hossman)

SOLR-3828: Fixed QueryElevationComponent so that using 'markExcludes' does
not modify the result set or ranking of 'excluded' documents relative to
not using elevation at all.
(Alexey Serba via hossman)

SOLR-3811: Query Form using wrong values for dismax, edismax
(steffkes)

SOLR-3779: DataImportHandler's LineEntityProcessor when used in conjunction
with FileListEntityProcessor would only process the first file.
(Ahmet Arslan via James Dyer)

SOLR-3791: CachedSqlEntityProcessor would throw a NullPointerException when
a query returns a row with a NULL key.
(Steffen Moelter via James Dyer)

SOLR-3833: When an election is started because a leader went down, the new
leader candidate should decline if the last state they published was not
active.
(yonik, Mark Miller)

SOLR-3836: When doing peer sync, we should only count sync attempts that
cannot reach the given host as success when the candidate leader is
syncing with the replicas - not when replicas are syncing to the leader.
(Mark Miller)

SOLR-3835: In our leader election algorithm, if on connection loss we found
we did not create our election node, we should retry, not throw an exception.
(Mark Miller)

SOLR-3834: A new leader on cluster startup should also run the leader sync
process in case there was a bad cluster shutdown.
(Mark Miller)

SOLR-3772: On cluster startup, we should wait until we see all registered
replicas before running the leader process - or if they all do not come up,
N amount of time.
(Mark Miller)

SOLR-3756: If we are elected the leader of a shard, but we fail to publish
this for any reason, we should clean up and re-trigger a leader election.
(Mark Miller)

SOLR-3812: ConnectionLoss during recovery can cause lost updates, leading to
shard inconsistency.
(Mark Miller)

SOLR-3813: When a new leader syncs, we need to ask all shards to sync back,
not just those that are active.
(Mark Miller)

SOLR-3772: Optionally, on cluster startup, we can wait until we see all registered
replicas before running the leader process - or if they all do not come up,
N amount of time.
(Jan Høydahl, Per Steffensen, Mark Miller)

SOLR-3750: Optionally, on session expiration, we can explicitly wait some time before
running the leader sync process so that we are sure every node participates.
(Per Steffensen, Mark Miller)

Solr is now much more strict about requiring that the uniqueKeyField feature
(if used) must refer to a field which is not multiValued. If you upgrade from
an earlier version of Solr and see an error that your uniqueKeyField "can not
be configured to be multivalued" please add 'multiValued="false"' to the
<field /> declaration for your uniqueKeyField. See SOLR-3682 for more details.

Detailed Change List

SOLR-1856: In Solr Cell, literals should override Tika-parsed values.
Patch adds a param "literalsOverride" which defaults to true, but can be set
to "false" to let Tika-parsed values be appended to literal values
(Chris Harris, janhoy)

SOLR-2702: The default directory factory was changed to NRTCachingDirectoryFactory
which wraps the StandardDirectoryFactory and caches small files for improved
Near Real-time (NRT) performance.
(Mark Miller, yonik)

SOLR-1725: StatelessScriptUpdateProcessorFactory allows users to implement
the full ScriptUpdateProcessor API using any scripting language with a
javax.script.ScriptEngineFactory
(Uri Boness, ehatcher, Simon Rosenthal, hossman)

SOLR-139: Change to updateable documents to create the document if it doesn't
already exist. To assert that the document must exist, use the optimistic
concurrency feature by specifying a _version_ of 1.
(yonik)

LUCENE-2510, LUCENE-4044: Migrated Solr's Tokenizer-, TokenFilter-, and
CharFilterFactories to the lucene-analysis module. To add new analysis
modules to Solr (like ICU, SmartChinese, Morfologik,...), just drop in
the JAR files from Lucene's binary distribution into your Solr instance's
lib folder. The factories are automatically made available with SPI.
(Chris Male, Robert Muir, Uwe Schindler)

SOLR-3634, SOLR-3635: CoreContainer and CoreAdminHandler will now remember
and report back information about failures to initialize SolrCores. These
failures will be accessible from the web UI and CoreAdminHandler STATUS
command until they are "reset" by creating/renaming a SolrCore with the
same name.
(hossman, steffkes)

SOLR-3658: Adding thousands of docs with one UpdateProcessorChain instance can briefly create
spikes of threads in the thousands.
(yonik, Mark Miller)

SOLR-3656: A core reload now always uses the same dataDir.
(Mark Miller, yonik)

SOLR-3662: Core reload bugs: a reload always obtained a non-NRT searcher, which
could go back in time with respect to the previous core's NRT searcher. Versioning
did not work correctly across a core reload, and update handler synchronization
was changed to synchronize on core state since more than one update handler
can coexist for a single index during a reload.
(yonik)

SOLR-3663: There are a couple of bugs in the sync process when a leader goes down and a
new leader is elected.
(Mark Miller)

SOLR-3524: Make discarding punctuation configurable in JapaneseTokenizerFactory.
The default is to discard punctuation, but this is overridable as an expert option.
(Kazuaki Hiraga, Jun Ohtani via Christian Moen)

SOLR-3215: Clone SolrInputDocument when distrib indexing so that update processors after
the distrib update process do not process the document twice.
(Mark Miller)

SOLR-3683: Improved error handling if an <analyzer> contains both an
explicit class attribute, as well as nested factories.
(hossman)

SOLR-3682: Fail to parse schema.xml if uniqueKeyField is multivalued
(hossman)

SOLR-2115: DIH no longer requires the "config" parameter to be specified in solrconfig.xml.
Instead, the configuration is loaded and parsed with every import. This allows the use of
a different configuration with each import, and makes correcting configuration errors simpler.
Also, the configuration itself can be passed using the "dataConfig" parameter rather than
using a file (this previously worked in debug mode only). When configuration errors are
encountered, the error message is returned in XML format.
(James Dyer)

SOLR-3439: Make SolrCell easier to use out of the box. Also improves "/browse" to display
rich-text documents correctly, along with facets for author and content_type.
With the new "content" field, highlighting of body is supported. See also SOLR-3672 for
easier posting of a whole directory structure.
(Jack Krupansky, janhoy)

SOLR-3579: SolrCloud view should default to the graph view rather than tree view.
(steffkes, Mark Miller)

The Lucene index format has changed and as a result, once you upgrade,
previous versions of Solr will no longer be able to read your indices.
In a master/slave configuration, all searchers/slaves should be upgraded
before the master. If the master were to be updated first, the older
searchers would not be able to read the new index format.

Setting abortOnConfigurationError=false is no longer supported
(since it has never worked properly). Solr will now warn you if
you attempt to set this configuration option at all. (see SOLR-1846)

The default logic for the 'mm' param of the 'dismax' QParser has
been changed. If no 'mm' param is specified (either in the query,
or as a default in solrconfig.xml) then the effective value of the
'q.op' param (either in the query or as a default in solrconfig.xml
or from the 'defaultOperator' option in schema.xml) is used to
influence the behavior. If q.op is effectively "AND" then mm=100%.
If q.op is effectively "OR" then mm=0%. Users who wish to force the
legacy behavior should set a default value for the 'mm' param in
their solrconfig.xml file.

The VelocityResponseWriter is no longer built into the core. Its JAR and
dependencies now need to be added (via <lib> or solr/home lib inclusion),
and it needs to be registered in solrconfig.xml like this:
<queryResponseWriter name="velocity" class="solr.VelocityResponseWriter"/>

The update request parameter to choose Update Request Processor Chain is
renamed from "update.processor" to "update.chain". The old parameter was
deprecated but still working since Solr 3.2, but is now removed
entirely.

The <indexDefaults> and <mainIndex> sections of solrconfig.xml are discontinued
and replaced with the <indexConfig> section. There are also better defaults.
When migrating, if you don't know what your old settings mean, simply delete
both <indexDefaults> and <mainIndex> sections. If you have customizations,
put them in <indexConfig> section - with same syntax as before.

Two of the SolrServer subclasses in SolrJ were renamed/replaced.
CommonsHttpSolrServer is now HttpSolrServer, and
StreamingUpdateSolrServer is now ConcurrentUpdateSolrServer.

The PingRequestHandler no longer looks for a <healthcheck/> option in the
(legacy) <admin> section of solrconfig.xml. Users who wish to take
advantage of this feature should configure a "healthcheckFile" init param
directly on the PingRequestHandler. As part of this change, relative file
paths have been fixed to be resolved against the data dir. See the example
solrconfig.xml and SOLR-1258 for more details.

Due to low level changes to support SolrCloud, the uniqueKey field can no
longer be populated via <copyField/> or <field default=...> in the
schema.xml. Users wishing to have Solr automatically generate a uniqueKey
value when adding documents should instead use an instance of
solr.UUIDUpdateProcessorFactory in their update processor chain. See
SOLR-2796 for more details.

Detailed Change List

SOLR-571: The autowarmCount for LRUCaches (LRUCache and FastLRUCache) now
supports "percentages" which get evaluated relative to the current size of
the cache when warming happens.
(Tomás Fernández Löbbe and hossman)
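
How a percentage autowarmCount might be resolved at warming time is sketched
below. The function name is hypothetical, and two details are assumptions not
stated in this note: fractional results are rounded down, and an absolute count
is capped at the number of entries actually in the cache.

```python
def resolve_autowarm_count(autowarm_count, current_cache_size):
    """Resolve a setting such as "128" or "50%" to a number of entries to warm."""
    text = str(autowarm_count).strip()
    if text.endswith("%"):
        percent = int(text[:-1])
        # Assumption: round down; the release note does not specify rounding.
        return current_cache_size * percent // 100
    # Assumption: cannot warm more entries than the old cache currently holds.
    return min(int(text), current_cache_size)
```

With a cache currently holding 512 entries, "50%" resolves to 256 entries,
whereas the same setting on a cache holding 10 entries resolves to 5.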

SOLR-2822: Skip update processors already run on other nodes
(hossman)

SOLR-1566: Transforming documents in the ResponseWriters. This will allow
for more complex results in responses and open the door for function queries
as results.
(ryan with patches from grant, noble, cmale, yonik, Jan Høydahl,
Arul Kalaipandian, Luca Cavanna, hossman)

SOLR-2037: Thanks to SOLR-1566, documents boosted by the QueryElevationComponent
can be marked as boosted.
(gsingers, ryan, yonik)

SOLR-2396: Add CollationField, which is much more efficient than
the Solr 3.x CollationKeyFilterFactory, and also supports
Locale-sensitive range queries.
(rmuir)

SOLR-2338: Add support for using <similarity/> in a schema's fieldType,
for customizing scoring on a per-field basis.
(hossman, yonik, rmuir)

SOLR-2335: New 'field("...")' function syntax for referring to complex
field names (containing whitespace or special characters) in functions.

SOLR-2533: Converted ValueSource.ValueSourceSortField over to new rewriteable Lucene
SortFields. ValueSourceSortField instances must be rewritten before they can be used.
This is done by SolrIndexSearcher when necessary.
(Chris Male)

SOLR-2193, SOLR-2565: You may now specify a 'soft' commit when committing. This will
use Lucene's NRT feature to avoid guaranteeing documents are on stable storage in exchange
for faster reopen times. There is also a new 'soft' autocommit tracker that can be
configured.
(Mark Miller, Robert Muir)

SOLR-2399: Updated Solr Admin interface. New look and feel with per core administration
and many new options.
(Stefan Matheis via ryan)

SOLR-2438: Added MultiTermAwareComponent to the various classes to allow automatic lowercasing
for multiterm queries (wildcards, regex, prefix, range, etc). You can now optionally specify a
"multiterm" analyzer in your schema.xml, but Solr should "do the right thing" if you don't
specify <analyzer type="multiterm">.
(Pete Sturge, Erick Erickson, Mentoring from Seeley and Muir)

SOLR-3069: Ability to add openSearcher=false to not open a searcher when doing
a hard commit. commitWithin now only invokes a softCommit.
(yonik)

SOLR-2802: New FieldMutatingUpdateProcessor and Factory to simplify the
development of UpdateProcessors that modify field values of documents as
they are indexed. Also includes several useful new implementations:

SOLR-3221: Added the ability to directly configure aspects of the concurrency
and thread-pooling used within distributed search in solr. This allows for finer
grained control and can be tuned by end users to target their own specific
requirements. This builds on the work of the HttpCommComponent and uses the same configuration
block to configure the thread pool. The default configuration has
the same behaviour as solr 3.5, favouring throughput over latency. More
information can be found on the wiki (http://wiki.apache.org/solr/SolrConfigXml)
(Greg Bowyer)

SOLR-3358: Logging events are captured and available from the /admin/logging
request handler.
(ryan)

SOLR-1535: PreAnalyzedField type provides a functionality to index (and optionally store)
field content that was already processed and split into tokens using some external processing
chain. Serialization format is pluggable, and defaults to JSON.
(ab)

SOLR-3363: Consolidated Exceptions in Analysis Factories so they only throw
InitializationExceptions
(Chris Male)

SOLR-2690: New support for a "TZ" request param which overrides the TimeZone
used when rounding Dates in DateMath expressions for the entire request
(all date range queries and date faceting is affected). The default TZ
is still UTC.
(David Schlotfeldt, hossman)
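
The effect of a time zone on date rounding can be illustrated outside of Solr.
The sketch below is not Solr's DateMath implementation: the function name is
hypothetical, only rounding down to the start of the day is modeled, and a
fixed UTC-8 offset stands in for a named time zone.

```python
from datetime import datetime, timedelta, timezone


def round_down_to_day(instant_utc, tz=timezone.utc):
    """Round an instant down to the start of its day as seen in tz; return it in UTC."""
    local = instant_utc.astimezone(tz)
    local_midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return local_midnight.astimezone(timezone.utc)


instant = datetime(2012, 7, 4, 3, 30, tzinfo=timezone.utc)

# Default behaviour: the day boundary is midnight UTC.
utc_day = round_down_to_day(instant)

# With a zone 8 hours behind UTC, the same instant still belongs to July 3rd.
minus_8 = timezone(timedelta(hours=-8))
shifted_day = round_down_to_day(instant, minus_8)
```

The same instant therefore falls into different day buckets depending on the
zone used for rounding, which is why date range queries and date faceting are
affected.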

SOLR-3402: Analysis Factories are now configured with their Lucene Version
through setLuceneMatchVersion, rather than through the Map passed to init.
Parsing and simple error checking for the Version is now done inside
the code that creates the Analysis Factories.
(Chris Male)

SOLR-3178: Optimistic locking. If a _version_ is provided with an update
that does not match the version in the index, an HTTP 409 error (Conflict)
will result.
(Per Steffensen, yonik)
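
The version check can be pictured as follows. This sketch models only the rule
stated above (a provided _version_ that does not match the indexed version
yields a 409); the function name and the use of 200 for success are
assumptions, and any other special _version_ values are not modeled.

```python
HTTP_OK = 200
HTTP_CONFLICT = 409


def check_version(update_version, indexed_version):
    """Return an HTTP-style status for an update that may carry a _version_."""
    if update_version is not None and update_version != indexed_version:
        return HTTP_CONFLICT  # someone else changed the document in the meantime
    return HTTP_OK
```

A client that receives the conflict status would typically re-read the
document, reapply its change, and retry with the newer version.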

SOLR-139: Updateable documents. JSON Example:
{"id":"mydoc", "f1":{"set":10}, "f2":{"add":20}} will result in field "f1"
being set to 10, "f2" having an additional value of 20 added, and all
other existing fields unchanged. All source fields must be stored for
this feature to work correctly.
(Ryan McKinley, Erik Hatcher, yonik)
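
The semantics of the "set" and "add" operations in the JSON example can be
modeled on a plain dictionary. This is a sketch of the behavior described
above, not Solr's implementation; the function name and the stored document
are hypothetical.

```python
def apply_atomic_update(stored_doc, update):
    """Apply "set"/"add" field operations to a copy of the stored document."""
    result = {k: (list(v) if isinstance(v, list) else v) for k, v in stored_doc.items()}
    for field, value in update.items():
        if isinstance(value, dict) and "set" in value:
            result[field] = value["set"]  # replace the field's value
        elif isinstance(value, dict) and "add" in value:
            existing = result.get(field, [])
            if not isinstance(existing, list):
                existing = [existing]
            result[field] = existing + [value["add"]]  # append an additional value
        # Plain values (such as the "id") only identify the document here.
    return result


stored = {"id": "mydoc", "f1": 1, "f2": [5], "title": "unchanged"}
updated = apply_atomic_update(
    stored, {"id": "mydoc", "f1": {"set": 10}, "f2": {"add": 20}}
)
```

Fields not mentioned in the update (here "title") are carried over from the
stored document, which is why all source fields must be stored.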

SOLR-2857: Support XML,CSV,JSON, and javabin in a single RequestHandler and
choose the correct ContentStreamLoader based on Content-Type header. This
also deprecates the existing [Xml,JSON,CSV,Binary,Xslt]UpdateRequestHandler.
(ryan)

SOLR-2585: Context-Sensitive Spelling Suggestions & Collations. This adds support
for the "spellcheck.alternativeTermCount" & "spellcheck.maxResultsForSuggest"
parameters, letting users receive suggestions even when all the queried terms
exist in the dictionary. This differs from "spellcheck.onlyMorePopular" in
that the suggestions need not consist entirely of terms with a greater document
frequency than the queried terms.
(James Dyer)

SOLR-2058: Edismax query parser to allow "phrase slop" to be specified per-field
on the pf/pf2/pf3 parameters using optional "FieldName~slop^boost" syntax. The
prior "FieldName^boost" syntax is still accepted. In such cases the value on the
"ps" parameter serves as the default slop.
(Ron Mayer via James Dyer)
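
The "FieldName~slop^boost" syntax can be illustrated with a small parser. This
is a sketch, not the edismax parser: the function name is hypothetical, entries
are assumed to be whitespace-separated, and a missing boost is assumed to mean
1.0.

```python
import re

# field, optional "~slop", optional "^boost"
PF_ENTRY = re.compile(
    r"^(?P<field>[^~^\s]+)(?:~(?P<slop>\d+))?(?:\^(?P<boost>\d+(?:\.\d+)?))?$"
)


def parse_pf(pf_value, default_slop=0):
    """Parse a pf-style value into (field, slop, boost) tuples.

    default_slop plays the role of the "ps" parameter for entries without "~slop".
    """
    entries = []
    for token in pf_value.split():
        match = PF_ENTRY.match(token)
        if match is None:
            raise ValueError("unrecognized pf entry: " + token)
        slop = int(match.group("slop")) if match.group("slop") else default_slop
        boost = float(match.group("boost")) if match.group("boost") else 1.0
        entries.append((match.group("field"), slop, boost))
    return entries


parsed = parse_pf("title~2^10 body^5 name", default_slop=1)
```

Here "title" carries its own slop of 2, while "body" and "name" fall back to
the default slop of 1.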

SOLR-3495: New UpdateProcessors have been added to create default values for
configured fields. These work similarly to the <field default="..."/>
option in schema.xml, but are applied in the UpdateProcessorChain, so they
may be used prior to other UpdateProcessors, or to generate a uniqueKey field
value when using the DistributedUpdateProcessor (ie: SolrCloud)
TimestampUpdateProcessorFactory
UUIDUpdateProcessorFactory
DefaultValueUpdateProcessorFactory
(hossman)

SOLR-2993: Add WordBreakSolrSpellChecker to offer suggestions by combining adjacent
query terms and/or breaking terms into multiple words. This spellchecker can be
configured with a traditional checker (ie: DirectSolrSpellChecker). The results
are combined and collations can contain a mix of corrections from both spellcheckers.
(James Dyer)

SOLR-3508: Simplify JSON update format for deletes as well as allow
version specification for optimistic locking. Examples:

{"delete":"myid"}

{"delete":["id1","id2","id3"]}

{"delete":{"id":"myid", "_version_":123456789}}

(yonik)
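
A client could assemble these three delete forms as shown in the sketch below. Only the JSON shapes come from the examples above; the helper names are hypothetical.

```python
import json


def delete_by_id(doc_id):
    """Single id: {"delete": "myid"}"""
    return json.dumps({"delete": doc_id})


def delete_by_ids(doc_ids):
    """Several ids: {"delete": ["id1", "id2", "id3"]}"""
    return json.dumps({"delete": list(doc_ids)})


def delete_with_version(doc_id, version):
    """Id plus _version_ for optimistic locking."""
    return json.dumps({"delete": {"id": doc_id, "_version_": version}})


single = delete_by_id("myid")
several = delete_by_ids(["id1", "id2", "id3"])
versioned = delete_with_version("myid", 123456789)
```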

SOLR-3211: Allow parameter overrides in conjunction with "spellcheck.maxCollationTries".
To do so, use parameters starting with "spellcheck.collateParam." For instance, to
override the "mm" parameter, specify "spellcheck.collateParam.mm". This is helpful
in cases where testing spellcheck collations for result counts should use different
parameters from the main query.
(James Dyer)
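
As a rough illustration of the naming convention, the sketch below adds "spellcheck.collateParam." overrides to an existing parameter map. The helper name with_collate_overrides and the sample values are assumptions; only the parameter prefix and the "mm" example come from this entry.

```python
from urllib.parse import urlencode


def with_collate_overrides(params, overrides):
    """Return a copy of params with spellcheck.collateParam.* entries added."""
    merged = dict(params)
    for name, value in overrides.items():
        merged["spellcheck.collateParam." + name] = value
    return merged


main_query = {
    "q": "solr serch",                     # sample query, illustrative only
    "mm": "100%",                          # used by the main query
    "spellcheck.maxCollationTries": "5",
}
request_params = with_collate_overrides(main_query, {"mm": "1"})
query_string = urlencode(request_params)
```

The main query keeps its own "mm" value, while collation testing would use the overridden one.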

SOLR-2599: CloneFieldUpdateProcessorFactory provides similar functionality
to schema.xml's <copyField/> declaration but as an update processor that can
be combined with other processors in any order.
(Jan Høydahl & hossman)

SOLR-1875: Per-segment field faceting for single valued string fields.
Enable with facet.method=fcs, control the number of threads used with
the "threads" local param on the facet.field param. This algorithm will
only be faster in the presence of rapid index changes.
(yonik)

SOLR-1904: When facet.enum.cache.minDf > 0 and the base doc set is a
SortedIntSet, convert to HashDocSet for better performance.
(yonik)

SOLR-2092: Speed up single-valued and multi-valued "fc" faceting. Typical
improvement is 5%, but can be much greater (up to 10x faster) when facet.offset
is very large (deep paging).
(yonik)

SOLR-2193, SOLR-2565: The default Solr update handler has been improved so
that it uses fewer locks, keeps the IndexWriter open rather than closing it
on each commit (i.e. commits no longer wait for background merges to complete),
works with SolrCore to provide faster 'soft' commits, and has an improved API
that requires less instanceof special casing.
(Mark Miller, Robert Muir)

SOLR-1908: Fixed SignatureUpdateProcessor to fail to initialize on
invalid config. Specifically: a signatureField that does not exist,
or overwriteDupes=true with a signatureField that is not indexed.
(hossman)

SOLR-1824: IndexSchema will now fail to initialize if there is a
problem initializing one of the fields or field types.
(hossman)

SOLR-2108: Fixed false positives when using wildcard queries on fields with reversed
wildcard support. For example, a query of *zemog* would match documents that contain
'gomez'.
(Landon Kuhn via Robert Muir)

SOLR-1962: SolrCore#initIndex should not use a mix of indexPath and newIndexPath
(Mark Miller)

SOLR-2605: fixed tracking of the 'defaultCoreName' in CoreContainer so that
CoreAdminHandler could return consistent information regardless of whether
there is a default core name or not.
(steffkes, hossman)

SOLR-3548: Fixed a bug in the cachability of queries using the {!join}
parser or the strdist() function, as well as some minor improvements to
the hashCode implementation of {!bbox} and {!geofilt} queries.
(hossman)

SOLR-1123: Changed JSONResponseWriter to now use application/json as its Content-Type
by default. However the Content-Type can be overwritten and is set to text/plain in
the example configuration.
(Uri Boness, Chris Male)

SOLR-3032: SolrException.logOnce and all the supporting structure are gone.
abortOnConfigurationError is also gone as it is no longer referenced.
Errors should be caught and logged at the top-most level or logged and NOT propagated up the
chain.
(Erick Erickson)

SOLR-2105: Remove support for deprecated "update.processor" (since 3.2), in favor of
"update.chain"
(janhoy)

SOLR-1893: Refactored some common code from LRUCache and FastLRUCache into
SolrCacheBase
(Tomás Fernández Löbbe via hossman)

SOLR-3403: Deprecated Analysis Factories now log their own deprecation messages.
No logging support is provided by Factory parent classes.
(Chris Male)

SOLR-1258: PingRequestHandler is now directly configured with a
"healthcheckFile" instead of looking for the legacy
<admin><healthcheck/></admin> syntax. Filenames specified as relative
paths have been fixed so that they are resolved against the data dir
instead of the CWD of the java process.
(hossman)

SOLR-2796: Due to low level changes to support SolrCloud, the uniqueKey
field can no longer be populated via <copyField/> or <field default=...>
in the schema.xml.

SOLR-3534: The Dismax and eDismax query parsers will fall back on the 'df' parameter
when 'qf' is absent. If neither parameter is present and the schema has no default
search field, an exception is now thrown.
(dsmiley)

SOLR-2983: As a consequence of moving the code which sets a MergePolicy from SolrIndexWriter to SolrIndexConfig,
(custom) MergePolicies should now have an empty constructor; thus an IndexWriter should not be passed as constructor
parameter but instead set using the setIndexWriter() method.

As the doGet() methods in SimplePostTool were changed to static, client applications of this
class need to be recompiled.

In Solr version 3.5 and earlier, HTMLStripCharFilter had known bugs in the
character offsets it provided, triggering e.g. exceptions in highlighting.
HTMLStripCharFilter has been re-implemented, addressing this and other
issues. See the entry for LUCENE-3690 in the Bug Fixes section below for a
detailed list of changes. For people who depend on the behavior of
HTMLStripCharFilter in Solr version 3.5 and earlier: the old implementation
(bugs and all) is preserved as LegacyHTMLStripCharFilter.

As of Solr 3.6, the <indexDefaults> and <mainIndex> sections of solrconfig.xml are deprecated
and replaced with a new <indexConfig> section. Read more in SOLR-1052 below.

SOLR-3161: <requestDispatcher handleSelect="false"> is now the default. An existing config will
probably work as-is because handleSelect was explicitly enabled in default configs. HandleSelect
makes /select work as well as enables the 'qt' parameter. Instead, consider explicitly
configuring /select as is done in the example solrconfig.xml, and register your other search
handlers with a leading '/' which is a recommended practice.
(David Smiley, Erik Hatcher)

SOLR-3161: Don't use the 'qt' parameter with a leading '/'. It probably won't work in 4.0
and it's now limited in 3.6 to SearchHandler subclasses that aren't lazy-loaded.

SOLR-2724: Specifying <defaultSearchField> and <solrQueryParser defaultOperator="..."/> in
schema.xml is now considered deprecated. Instead you are encouraged to specify these via the "df"
and "q.op" parameters in your request handler definition.
(David Smiley)

Bugs found and fixed in the SignatureUpdateProcessor that previously caused
some documents to produce the same signature even when the configured fields
contained distinct (non-String) values. Users of SignatureUpdateProcessor
are strongly advised that they should re-index as document signatures may
have now changed. (see SOLR-3200 & SOLR-3226 for details)

SOLR-2438 added MultiTermAwareComponent to the various classes to allow automatic lowercasing
for multiterm queries (wildcards, regex, prefix, range, etc). You can now optionally specify a
"multiterm" analyzer in your schema.xml, but Solr should "do the right thing" if you don't
specify <analyzer type="multiterm">
(Pete Sturge, Erick Erickson, Mentoring from Seeley and Muir)

SOLR-1843: A new "rootName" attribute is now available when
configuring <jmx/> in solrconfig.xml. If this attribute is set,
Solr will use it as the root name for all MBeans Solr exposes via
JMX. The default root name is "solr" followed by the core name.
(Constantijn Visinescu, hossman)

SOLR-1729: Evaluation of NOW for date math is done only once per request for
consistency, and is also propagated to shards in distributed search.
Adding a parameter NOW=<time_in_ms> to the request will override the
current time.
(Peter Sturge, yonik, Simon Willnauer)
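
A client that wants repeatable date math can pin the time itself, as in this sketch. The helper name pin_now is hypothetical, and the value is assumed to be milliseconds since the Unix epoch, following the <time_in_ms> wording above.

```python
import time
from urllib.parse import urlencode


def pin_now(params, now_ms=None):
    """Return a copy of params with NOW fixed to a single millisecond value."""
    pinned = dict(params)
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # assumed: epoch milliseconds
    pinned["NOW"] = str(now_ms)
    return pinned


base = {"q": "*:*", "fq": "timestamp:[NOW-1DAY TO NOW]"}
request_params = pin_now(base, now_ms=1300000000000)
query_string = urlencode(request_params)
```

Sending the same NOW value with every request in a batch keeps the date math consistent across them.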

SOLR-3221: Added the ability to directly configure aspects of the concurrency
and thread-pooling used within distributed search in Solr. This allows for
finer-grained control and can be tuned by end users to target their own specific
requirements. This builds on the work of the HttpCommComponent and uses the same configuration
block to configure the thread pool. The default configuration has
the same behaviour as Solr 3.5, favouring throughput over latency. More
information can be found on the wiki (http://wiki.apache.org/solr/SolrConfigXml)
(Greg Bowyer)

SOLR-2001: The query component will substitute an empty query that matches
no documents if the query parser returns null. This also prevents an
exception from being thrown by the default parser if "q" is missing.
(yonik)

SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory.
These can be used to customize range query/sort behavior, for example to
support numeric collation, ignore punctuation/whitespace, ignore accents but
not case, control whether upper/lowercase values are sorted first, etc.
(rmuir)

SOLR-2346: Add a chance to set content encoding explicitly via content type
of stream for extracting request handler. This is convenient when Tika's
auto detector cannot detect encoding, especially when the text file is too short
to detect encoding.
(koji)

SOLR-3190: Minor improvements to SolrEntityProcessor. Add more consistency
between solr parameters and parameters used in SolrEntityProcessor and
ability to specify a custom HttpClient instance.
(Luca Cavanna via Martijn van Groningen)

SOLR-2382: Added pluggable cache support to DIH so that any Entity can be
made cache-able by adding the "cacheImpl" parameter. Include
"SortedMapBackedCache" to provide in-memory caching (as previously this was
the only option when using CachedSqlEntityProcessor). Users can provide
their own implementations of DIHCache for other caching strategies.
Deprecate CachedSqlEntityProcessor in favor of specifying "cacheImpl" with
SqlEntityProcessor. Make SolrWriter implement DIHWriter and allow the
possibility of pluggable Writers (DIH writing to something other than Solr).
(James Dyer, Noble Paul)

LUCENE-3820: Fixed invalid position indexes by reimplementing PatternReplaceCharFilter.
This change also drops real support for boundary characters -- all input is prebuffered
for pattern matching.
(Dawid Weiss)

LUCENE-3690, LUCENE-2208, SOLR-882, SOLR-42: Re-implemented
HTMLStripCharFilter as a JFlex-generated scanner and moved it to
lucene/contrib/analyzers/common/. See below for a list of bug fixes and
other changes. To get the same behavior as HTMLStripCharFilter in Solr
version 3.5 and earlier (including the bugs), use LegacyHTMLStripCharFilter,
which is the previous implementation.

Behavior changes from the previous version:

Known offset bugs are fixed.

The "Mark invalid" exceptions reported in SOLR-1283 are no longer
triggered (the bug is still present in LegacyHTMLStripCharFilter).

The character entity "&apos;" is now always properly decoded.

More cases of <script> tags are now properly stripped.

CDATA sections are now handled properly.

Valid tag name characters now include the supplementary Unicode characters
from Unicode character classes [:ID_Start:] and [:ID_Continue:].

Uppercase character entities "&QUOT;", "&COPY;", "&GT;", "&LT;", "&REG;",
and "&AMP;" are now recognized and handled as if they were in lowercase.

The REPLACEMENT CHARACTER U+FFFD is now used to replace numeric character
entities for unpaired UTF-16 low and high surrogates (in the range
[U+D800-U+DFFF]).

Properly paired numeric character entities for UTF-16 surrogates are now
converted to the corresponding code units.

Opening tags with unbalanced quotation marks are now properly stripped.

Literal "<" and ">" characters in opening tags, regardless of whether they
appear inside quotation marks, now inhibit recognition (and stripping) of
the tags. The only exception to this is for values of event-handler
attributes, e.g. "onClick", "onLoad", "onSelect".

A newline '\n' is substituted instead of a space for stripped HTML markup.

HTMLStripCharFilterFactory now handles HTMLStripCharFilter's "escapedTags"
feature: opening and closing tags with the given names, including any
attributes and their values, are left intact in the output.

(Steve Rowe)

LUCENE-3717: Fixed offset bugs in TrimFilter, WordDelimiterFilter, and
HyphenatedWordsFilter where they would create invalid offsets in
some situations, leading to problems in highlighting.
(Robert Muir)

SOLR-3074: fix SolrPluginUtils.docListToSolrDocumentList to respect the
list of fields specified. This fix also deprecates
DocumentBuilder.loadStoredFields which is not used anywhere in Solr,
and was fundamentally broken/bizarre.
(hossman, Ahmet Arslan)

SOLR-3264: Fix CoreContainer and SolrResourceLoader logging to be more
clear about when SolrCores are being created, and stop misleading people
about SolrCore instanceDir's being the "Solr Home Dir"
(hossman)

SOLR-3200: Fix SignatureUpdateProcessor "all fields" mode to use all
fields of each document instead of the fields specified by the first
document indexed
(Spyros Kapnissis via hossman)

SOLR-3316: Distributed grouping failed when rows parameter was set to 0 and
sometimes returned a wrong hit count as matches.
(Cody Young, Martijn van Groningen)

SOLR-3107: contrib/langid: When using the LangDetect implementation of
langid, set the random seed to 0, so that the same document is detected as
the same language with the same probability every time.
(Christian Moen via rmuir)

SOLR-2937: Configuring the number of contextual snippets used for
search results clustering. The hl.snippets parameter is now respected
by the clustering plugin, and can be overridden by carrot.summarySnippets
if needed.
(Stanislaw Osinski)

SOLR-2938: Clustering on multiple fields. The carrot.title and
carrot.snippet parameters can now take comma- or space-separated lists of
field names to cluster.
(Stanislaw Osinski)

SOLR-2939: Clustering of multilingual search results. The document's
language field can be passed in the carrot.lang parameter, and the carrot.lcmap
parameter enables mapping of language codes to ISO 639.
(Stanislaw Osinski)

SOLR-2940: Passing values for custom Carrot2 fields to Clustering component.
The custom field mappings are defined using the carrot.custom parameter.
(Stanislaw Osinski)

SOLR-2941: NullPointerException on clustering component initialization
when schema does not have a unique key field.
(Stanislaw Osinski)

SOLR-3032: Deprecated SolrException.logOnce. It and all the supporting
structure will disappear in 4.0. Errors should be caught and logged at the
top-most level or logged and NOT propagated up the chain.
(Erick Erickson)

SOLR-2712: expecting fl=score to return all fields is now deprecated.
In Solr 4.0, this will only return the score.
(ryan)

SOLR-3156: Check for Lucene directory locks at startup. In previous versions
this check was only performed when modifying the index (e.g. adding and
deleting documents).
(Luca Cavanna via Martijn van Groningen)

SOLR-1052: Deprecated <indexDefaults> and <mainIndex> in solrconfig.xml.
From now on, all settings go in the new <indexConfig> tag, and some defaults are
changed: useCompoundFile=false, ramBufferSizeMB=32, lockType=native, so that
NOT specifying <indexConfig> at all gives the same result as the
example config used to give in 3.5.
(janhoy, gsingers)

SOLR-3295: netcdf jar is excluded from the binary release (and disabled in
ivy.xml) because it requires Java 6. If you want to parse this content with
the extracting request handler and are willing to use Java 6, just add the jar.
(rmuir)

SOLR-3142: DIH Imports no longer default optimize to true, instead false.
If you want to force all segments to be merged into one, you can specify
this parameter yourself. NOTE: this can be a very expensive operation and
usually does not make sense for delta-imports.
(Robert Muir)

SOLR-3204: The packaged pre-release artifact of Commons CSV used the original
package name (org.apache.commons.csv). This created a compatibility issue as
the Apache Commons team works toward an official release of Commons CSV.
The source of Commons CSV was added under a separate package name to the
Solr source code.
(Uwe Schindler, Chris Male, Emmanuel Bourg)

SOLR-2749: Add boundary scanners for FastVectorHighlighter. <boundaryScanner/>
can be specified with a name in solrconfig.xml, and use hl.boundaryScanner=name
parameter to specify the named <boundaryScanner/>.
(koji)

SOLR-2771: Solr modules' tests should not depend on solr-core test classes;
move BufferingRequestProcessor from solr-core tests to test-framework so that
the Solr Cell module can use it.
(janhoy, Steve Rowe)

The Lucene index format has changed and as a result, once you upgrade,
previous versions of Solr will no longer be able to read your indices.
In a master/slave configuration, all searchers/slaves should be upgraded
before the master. If the master were to be updated first, the older
searchers would not be able to read the new index format.

Previous versions of Solr silently allow and ignore some contradictory
properties specified in schema.xml. For example:

indexed="false" omitNorms="false"

indexed="false" omitTermFreqAndPositions="false"

Field property validation has now been fixed, to ensure that
contradictions like these now generate error messages. If users
have existing schemas that generate one of these new "conflicting
'false' field options for non-indexed field" error messages the
conflicting "omit*" properties can safely be removed, or changed to
"true" for consistent behavior with previous Solr versions. Solr now
reports an error on startup when these contradictory options are
found. See SOLR-2669.
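
Existing schemas can be checked for this kind of contradiction ahead of an upgrade. The sketch below covers only the two combinations listed above and is not Solr's validation code; the function name find_conflicts and the dictionary representation of a field's attributes are assumptions.

```python
# The two "omit*" properties named in the examples above.
OMIT_PROPS = ("omitNorms", "omitTermFreqAndPositions")


def find_conflicts(field_attrs):
    """Return the omit* properties explicitly set to "false" on a non-indexed field."""
    if field_attrs.get("indexed") != "false":
        return []
    return [prop for prop in OMIT_PROPS if field_attrs.get(prop) == "false"]


field = {"name": "notes", "indexed": "false", "omitNorms": "false"}
conflicts = find_conflicts(field)
# Per the note above, each reported property can be removed or changed to "true".
```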

SOLR-2429: Ability to add cache=false to queries and query filters to avoid
using the filterCache or queryCache. A cost may also be specified and is used
to order the evaluation of non-cached filters from least to greatest cost.
For very expensive query filters (cost >= 100) if the query implements
the PostFilter interface, it will be used to obtain a Collector that is
checked only for documents that match the main query and all other filters.
The "frange" query now implements the PostFilter interface.
(yonik)
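
The ordering rules in this entry can be modelled as follows. This is a simplified sketch, not the Solr implementation; the Filter class, its is_post_filter flag (standing in for "implements the PostFilter interface"), and plan_filters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Filter:
    name: str
    cache: bool = True
    cost: int = 0
    is_post_filter: bool = False  # models "implements PostFilter"


def plan_filters(filters):
    """Split filters into (cached, regular non-cached, post filters).

    Non-cached filters are ordered from least to greatest cost; those with
    cost >= 100 that support post filtering are checked last.
    """
    def runs_as_post_filter(f):
        return f.cost >= 100 and f.is_post_filter

    cached = [f for f in filters if f.cache]
    non_cached = sorted((f for f in filters if not f.cache), key=lambda f: f.cost)
    regular = [f for f in non_cached if not runs_as_post_filter(f)]
    post = [f for f in non_cached if runs_as_post_filter(f)]
    return cached, regular, post


cached, regular, post = plan_filters([
    Filter("inStock:true"),
    Filter("expensive-frange", cache=False, cost=200, is_post_filter=True),
    Filter("price:[0 TO 10]", cache=False, cost=50),
    Filter("category:books", cache=False, cost=10),
])
```

Note that a high cost alone is not enough in this model: a filter with cost >= 100 that does not support post filtering stays in the regular group.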

SOLR-2630: Added new XsltUpdateRequestHandler that works like
XmlUpdateRequestHandler but allows the POSTed XML document to be transformed
using XSLT. This makes it possible to POST arbitrary XML documents to the update
handler, as long as you also provide an XSL to transform them to a valid
Solr input document.
(Upayavira, Uwe Schindler)

SOLR-2615: Log individual updates (adds and deletes) at the FINE level
before adding to the index. Fix a null pointer exception in logging
when there was no unique key.
(David Smiley via yonik)

LUCENE-2048: Added omitPositions to the schema, so you can omit position
information while still indexing term frequencies.
(rmuir)

SOLR-2523: Added support in SolrJ to easily interact with range facets.
The range facet response can be parsed and is retrievable from the
QueryResponse class. The SolrQuery class has convenient methods for using
range facets.
(Martijn van Groningen)

SOLR-2637: Added support for group result parsing in SolrJ.
(Tao Cheng, Martijn van Groningen)

SOLR-2665: Added post group faceting. Facet counts are based on the most
relevant document of each group matching the query. This feature has the
same impact on the StatsComponent.
(Martijn van Groningen)

SOLR-2675: CoreAdminHandler now allows arbitrary properties to be
specified when CREATEing a new SolrCore using property.* request
params.
(Yury Kats, hossman)

Created a new Solr-internal module named "core" by moving the java/,
test/, and test-files/ directories from solr/src/ to solr/core/src/.

Merged solr/src/webapp/src/ into solr/core/src/java/.

Eliminated solr/src/ by moving all its directories up one level;
renamed solr/src/site/ to solr/site-src/ because solr/site/ already
exists.

Merged solr/src/common/ into solr/solrj/src/java/.

Moved o.a.s.client.solrj.* and o.a.s.common.* tests from
solr/src/test/ to solr/solrj/src/test/.

Made the solrj tests not depend on the solr core tests by moving
some classes from solr/src/test/ to solr/test-framework/src/java/.

Each internal module (core/, solrj/, test-framework/, and webapp/)
now has its own build.xml, from which it is possible to run
module-specific targets. solr/build.xml delegates all build
tasks (via <ant dir="internal-module-dir"> calls) to these
modules' build.xml files.

(Steve Rowe, Robert Muir)

LUCENE-3406: Add ant target 'package-local-src-tgz' to Lucene and Solr
to package sources from the local working copy.
(Seung-Yeoul Yang via Steve Rowe)

SolrCore's CloseHook API has been changed in a backward-incompatible way. It
has been changed from an interface to an abstract class. Any custom
components which use the SolrCore.addCloseHook method will need to
be modified accordingly. To migrate, put your old CloseHook#close impl into
CloseHook#preClose.

SOLR-2378: A new, automaton-based, implementation of suggest (autocomplete)
component, offering an order of magnitude smaller memory consumption
compared to ternary trees and jaspell and very fast lookups at runtime.
(Dawid Weiss)

SOLR-2400: Field- and DocumentAnalysisRequestHandler now provide a position
history for each token, so you can follow the token through all analysis stages.
The output contains a separate int[] attribute containing all positions from
previous Tokenizers/TokenFilters (called "positionHistory").
(Uwe Schindler)

LUCENE-3204: The maven-ant-tasks jar is now included in the source tree;
users of the generate-maven-artifacts target no longer have to manually
place this jar in the Ant classpath. NOTE: when Ant looks for the
maven-ant-tasks jar, it looks first in its pre-existing classpath, so
any copies it finds will be used instead of the copy included in the
Lucene/Solr source tree. For this reason, it is recommended to remove
any copies of the maven-ant-tasks jar in the Ant classpath, e.g. under
~/.ant/lib/ or under the Ant installation's lib/ directory.
(Steve Rowe)

Detailed Change List

SOLR-2496: Add ability to specify overwrite and commitWithin as request
parameters (e.g. specified in the URL) when using the JSON update format,
and added a simplified format for specifying multiple documents.
Example: [{"id":"doc1"},{"id":"doc2"}]
(yonik)
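
A request using these parameters together with the simplified multi-document format might be assembled as in the sketch below. The helper name build_update_request and the URL http://localhost:8983/solr/update/json are assumptions for illustration; only the parameter names and the body format come from this entry.

```python
import json
from urllib.parse import urlencode


def build_update_request(base_url, docs, overwrite=True, commit_within_ms=None):
    """Return (url, body) for a JSON update carrying several documents."""
    params = {"overwrite": "true" if overwrite else "false"}
    if commit_within_ms is not None:
        params["commitWithin"] = str(commit_within_ms)
    url = base_url + "?" + urlencode(params)
    body = json.dumps(docs)  # simplified format: a bare list of documents
    return url, body


url, body = build_update_request(
    "http://localhost:8983/solr/update/json",  # hypothetical endpoint
    [{"id": "doc1"}, {"id": "doc2"}],
    overwrite=False,
    commit_within_ms=1000,
)
```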

SOLR-2113: Add TermQParserPlugin, registered as "term". This is useful
when generating filter queries from terms returned from field faceting or
the terms component. Example: fq={!term f=weight}1.5
(hossman, yonik)
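
Turning facet terms into filter queries with this parser could look like the following sketch. The helper name term_filter and the sample facet values are hypothetical; the {!term f=...} form is taken from the example above.

```python
from urllib.parse import urlencode


def term_filter(field, term):
    """Build an fq value of the form {!term f=<field>}<term>."""
    return "{!term f=%s}%s" % (field, term)


# Terms as they might come back from field faceting (sample values).
facet_terms = ["1.5", "2.0"]
filters = [term_filter("weight", t) for t in facet_terms]

# One fq parameter per selected term; urlencode handles URL escaping.
query_string = urlencode([("q", "*:*")] + [("fq", f) for f in filters])
```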

SOLR-1915: DebugComponent now supports using a NamedList to model
Explanation objects in its responses instead of
Explanation.toString
(hossman)

SOLR-2464: Fix potential slowness in QueryValueSource (the query() function) when
the query is very sparse and may not match any documents in a segment.
(yonik)

SOLR-2469: When using java replication with replicateAfter=startup, the first
commit point on server startup is never removed.
(yonik)

SOLR-2466: SolrJ's CommonsHttpSolrServer would retry requests on failure, regardless
of the configured maxRetries, due to HttpClient having its own retry mechanism
by default. The retryCount of HttpClient is now set to 0, and SolrJ does
the retry.
(yonik)

SOLR-2409: edismax parser - treat the text of a fielded query as a literal if the
fieldname does not exist. For example Mission: Impossible should not search on
the "Mission" field unless it's a valid field in the schema.
(Ryan McKinley, yonik)

SOLR-2403: facet.sort=index reported incorrect results for distributed search
in a number of scenarios when facet.mincount>0. This patch also adds some
performance/algorithmic improvements when (facet.sort=count && facet.mincount=1
&& facet.limit=-1) and when (facet.sort=index && facet.mincount>0)
(yonik)

SOLR-2333: The "rename" core admin action does not persist the new name to solr.xml
(Rasmus Hahn, Paul R. Brown via Mark Miller)

SOLR-2390: Performance of usePhraseHighlighter is terrible on very large Documents,
regardless of hl.maxDocCharsToAnalyze.
(Mark Miller)

SOLR-2474: The helper TokenStreams in analysis.jsp and AnalysisRequestHandlerBase
did not clear all attributes so they displayed incorrect attribute values for tokens
in later filter stages.
(uschindler, rmuir, yonik)

The Lucene index format has changed and as a result, once you upgrade,
previous versions of Solr will no longer be able to read your indices.
In a master/slave configuration, all searchers/slaves should be upgraded
before the master. If the master were to be updated first, the older
searchers would not be able to read the new index format.

The Solr JavaBin format has changed as of Solr 3.1. If you are using the
JavaBin format, you will need to upgrade your SolrJ client.
(SOLR-2034)

Old syntax of <highlighting> configuration in solrconfig.xml
is deprecated
(SOLR-1696)

The deprecated HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and
HTMLStripStandardTokenizerFactory were removed. To strip HTML tags,
HTMLStripCharFilter should be used instead, and it works with any
Tokenizer of your choice.
(SOLR-1657)

Field compression is no longer supported. Fields that were formerly
compressed will be uncompressed as index segments are merged. For
shorter fields, this may actually be an improvement, as the compression
used was not very good for short text. Some indexes may get larger though.

SOLR-1845: The TermsComponent response format was changed so that the
"terms" container is a map instead of a named list. This affects
response formats like JSON, but not XML.
(yonik)

SOLR-1876: All Analyzers and TokenStreams are now final to enforce
the decorator pattern.
(rmuir, uschindler)

LUCENE-2608: Added the ability to specify the accuracy on a per request basis.
It is recommended that implementations of SolrSpellChecker should change over to the new SolrSpellChecker
methods using the new SpellingOptions class, but are not required to. While this change is
backward compatible, the trunk version of Solr has already dropped support for all but the SpellingOptions method.
(gsingers)

In previous releases, sorting or evaluating function queries on
fields that were "multiValued" (either by explicit declaration in
schema.xml or by implicit behavior because the "version" attribute on
the schema was less than 1.2) did not generally work, but it would
sometimes silently act as if it succeeded and order the docs
arbitrarily. Solr will now fail on any attempt to sort, or apply a
function to, multi-valued fields.

The DataImportHandler jars are no longer included in the solr
WAR and should be added in Solr's lib directory, or referenced
via the <lib> directive in solrconfig.xml.

Detailed Change List

SOLR-1302: Added several new distance based functions, including
Great Circle (haversine), Manhattan, Euclidean and String (using the
StringDistance methods in the Lucene spellchecker).
Also added geohash(), deg() and rad() convenience functions.
See http://wiki.apache.org/solr/FunctionQuery.
(gsingers)
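
For readers unfamiliar with the distance measures named above, here is a sketch of the standard formulas. These are textbook definitions written for illustration, not the Solr function implementations; the mean Earth radius of 6371 km is an assumed constant.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """Straight-line distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


flat = euclidean((0, 0), (3, 4))
grid = manhattan((0, 0), (3, 4))
```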

SOLR-1131: FieldTypes can now output multiple Fields per Type and still be searched. This can be handy for hiding the details of a particular
implementation such as in the spatial case.
(Chris Mattmann, shalin, noble, gsingers, yonik)

SOLR-1857: Synced Solr analysis with Lucene 3.1. Added KeywordMarkerFilterFactory
and StemmerOverrideFilterFactory, which can be used to tune stemming algorithms.
Added factories for Bulgarian, Czech, Hindi, Turkish, and Wikipedia analysis. Improved the
performance of SnowballPorterFilterFactory.
(rmuir)

SOLR-1657: Converted remaining TokenStreams to the Attributes-based API. All Solr
TokenFilters now support custom Attributes, and some have improved performance:
especially WordDelimiterFilter and CommonGramsFilter.
(rmuir, cmale, uschindler)

SOLR-1740: ShingleFilterFactory supports the "minShingleSize" and "tokenSeparator"
parameters for controlling the minimum shingle size produced by the filter, and
the separator string that it uses, respectively.
(Steven Rowe via rmuir)

SOLR-744: ShingleFilterFactory supports the "outputUnigramsIfNoShingles"
parameter, to output unigrams if the number of input tokens is fewer than
minShingleSize, and no shingles can be generated.
(Chris Harris via Steven Rowe)
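
The interaction of "minShingleSize", "tokenSeparator", and "outputUnigramsIfNoShingles" can be sketched as follows. This is a simplified model that emits shingles only (no regular unigram output) and is not the Lucene ShingleFilter; the function and argument names are hypothetical.

```python
def shingles(tokens, min_size=2, max_size=2, separator=" ",
             output_unigrams_if_no_shingles=False):
    """Return word n-grams of min_size..max_size tokens joined by separator."""
    if min_size < 2 or max_size < min_size:
        raise ValueError("expected 2 <= min_size <= max_size")
    out = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                out.append(separator.join(tokens[start:start + size]))
    if not out and output_unigrams_if_no_shingles:
        # Too few tokens to build any shingle: fall back to the tokens themselves.
        return list(tokens)
    return out


bigrams_and_trigrams = shingles(["please", "divide", "this"], min_size=2, max_size=3)
fallback = shingles(["solo"], output_unigrams_if_no_shingles=True)
```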

SOLR-1923: PhoneticFilterFactory now has support for the
Caverphone algorithm.
(rmuir)

SOLR-1966: QueryElevationComponent can now return just the included results in the elevation file
(gsingers, yonik)

SOLR-1556: TermVectorComponent now supports per field overrides. It also now throws an error
if passed-in fields do not exist, and returns warnings for fields whose term vector
options (termVectors, offsets, positions) do not align with the schema declaration.
(gsingers)

SOLR-397: Date Faceting now supports a "facet.date.include" param
for specifying when the upper & lower end points of computed date
ranges should be included in the range. Legal values are: "all",
"lower", "upper", "edge", and "outer". For backwards compatibility
the default value is the set: [lower,upper,edge], so that all ranges
between start and end are inclusive of their endpoints, but the
"before" and "after" ranges are not.

SOLR-2015: Add a boolean attribute autoGeneratePhraseQueries to TextField.
autoGeneratePhraseQueries="true" (the default) causes the query parser to
generate phrase queries if multiple tokens are generated from a single
non-quoted analysis string. For example WordDelimiterFilter splitting text:pdp-11
will cause the parser to generate text:"pdp 11" rather than (text:PDP OR text:11).
Note that autoGeneratePhraseQueries="true" tends to not work well for non whitespace
delimited languages.
(yonik)
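
The difference between the two settings can be modelled with a small sketch. It only builds query strings from tokens that analysis has already produced and is not the Solr query parser; build_query is a hypothetical name, and the lowercase tokens are an illustrative choice.

```python
def build_query(field, tokens, auto_generate_phrase_queries=True):
    """Build a query string from the tokens of one non-quoted chunk of text."""
    if len(tokens) == 1:
        return "%s:%s" % (field, tokens[0])
    if auto_generate_phrase_queries:
        # Multiple tokens from a single chunk become one phrase query.
        return '%s:"%s"' % (field, " ".join(tokens))
    # Otherwise each token becomes its own optional clause.
    return "(" + " OR ".join("%s:%s" % (field, t) for t in tokens) + ")"


# "pdp-11" split into two tokens by analysis
as_phrase = build_query("text", ["pdp", "11"])
as_terms = build_query("text", ["pdp", "11"], auto_generate_phrase_queries=False)
```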

SOLR-1240: "Range Faceting" has been added. This is a generalization
of the existing "Date Faceting" logic so that it now supports
all stock numeric field types that support range queries in addition
to dates. facet.date is now deprecated in favor of this generalized mechanism.
(Gijs Kunze, hossman)

SOLR-1568: Added "native" filtering support for PointType, GeohashField. Added LatLonType with filtering support too. See
http://wiki.apache.org/solr/SpatialSearch and the example. Refactored some items in Lucene spatial.
Removed SpatialTileField as the underlying CartesianTier is broken beyond repair and is going to be moved.
(gsingers)

SOLR-2133: Function query parser can now parse multiple comma separated
value sources. It also now fails if there is extra unexpected text
after parsing the functions, instead of silently ignoring it.
This allows expressions like q=dist(2,vector(1,2),$pt)&pt=3,4
(yonik)

SOLR-1804: Re-enabled clustering component on trunk, updated to latest
version of Carrot2. No more LGPL run-time dependencies. This release of
C2 also does not have a specific Lucene dependency.
(Stanislaw Osinski, gsingers)

SOLR-2211,LUCENE-2763: Added UAX29URLEmailTokenizerFactory, which implements
UAX#29, a unicode algorithm with good results for most languages, as well as
URL and E-mail tokenization according to the relevant RFCs.
(Tom Burton-West via rmuir)

SOLR-1968: speed up initial filter cache population for facet.method=enum and
also big terms for multi-valued facet.method=fc. The resulting speedup
for the first facet request is anywhere from 30% to 32x, depending on how many
terms are in the field and how many documents match per term.
(yonik)

SOLR-2089: Speed up UnInvertedField faceting (facet.method=fc for
multi-valued fields) when facet.limit is both high, and a high enough
percentage of the number of unique terms in the field. Extreme cases
yield speedups over 3x.
(yonik)

SOLR-1577: The example solrconfig.xml defaulted to a solr data dir
relative to the current working directory, even if a different solr home
was being used. The new behavior changes the default to a zero length
string, which is treated the same as if no dataDir had been specified,
hence the "data" directory under the solr home will be used.
(yonik)

SOLR-1593: ReverseWildcardFilter didn't work for surrogate pairs
(i.e. code points outside of the BMP), resulting in incorrect
matching. This change requires reindexing for any content with
such characters.
(Robert Muir, yonik)
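
To see why surrogate pairs matter when reversing text, the sketch below compares reversing a string one UTF-16 code unit at a time with reversing it one code point at a time. It is only an illustration of the general problem, not the filter's actual code; the function names are hypothetical.

```python
def reverse_code_units(text):
    """Reverse one UTF-16 code unit at a time (the naive approach)."""
    data = text.encode("utf-16-le")
    units = [data[i:i + 2] for i in range(0, len(data), 2)]
    return b"".join(reversed(units))  # still UTF-16LE bytes


def reverse_code_points(text):
    """Reverse one code point at a time, keeping surrogate pairs intact."""
    return text[::-1].encode("utf-16-le")


def is_valid_utf16(data):
    """Return True if the bytes decode cleanly as UTF-16LE."""
    try:
        data.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


sample = "a\U0001D11E"  # "a" followed by U+1D11E, which lies outside the BMP

# Naive reversal swaps the two halves of the surrogate pair.
print(is_valid_utf16(reverse_code_units(sample)))
# Code point reversal keeps the pair in order.
print(is_valid_utf16(reverse_code_points(sample)))
```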

SOLR-1596: A rollback operation followed by the shutdown of Solr
or the close of a core resulted in a warning:
"SEVERE: SolrIndexWriter was not closed prior to finalize()" although
there were no other consequences.
(yonik)

SOLR-1595: StreamingUpdateSolrServer used the platform default character
set when streaming updates, rather than using UTF-8 as the HTTP headers
indicated, leading to an encoding mismatch.
(hossman, yonik)
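
The kind of mismatch described above can be reproduced in a few lines. This is a minimal sketch that uses Latin-1 as a stand-in for a non-UTF-8 platform default; the real default depends on the JVM and operating system.

```python
text = "café"

# Stand-in for a platform default charset that is not UTF-8 (an assumption).
sent_bytes = text.encode("latin-1")

# The receiver trusts the HTTP header and decodes the body as UTF-8.
received = sent_bytes.decode("utf-8", errors="replace")
print(received == text)

# Encoding explicitly as UTF-8 round-trips correctly.
round_trip = text.encode("utf-8").decode("utf-8")
print(round_trip == text)
```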

SOLR-1587: A distributed search request with fl=score didn't match
the behavior of a non-distributed request since it only returned
the id,score fields instead of all fields in addition to score.
(yonik)

SOLR-1601: Schema browser does not indicate presence of charFilter.
(koji)

SOLR-1615: Backslash escaping did not work in quoted strings
for local param arguments.
(Wojtek Piaseczny, yonik)

SOLR-1660: CapitalizationFilter crashes if you use the maxWordCountOption
(Robert Muir via shalin)

SOLR-1667: PatternTokenizer does not reset attributes such as positionIncrementGap
(Robert Muir via shalin)

SOLR-1711: SolrJ - StreamingUpdateSolrServer had a race condition that
could halt the streaming of documents. The original patch to fix this
(never officially released) introduced another hanging bug due to
connections not being released.
(Attila Babo, Erik Hetzner, Johannes Tuchscherer via yonik)

SOLR-1870: Indexing documents using the 'javabin' format no longer
fails with a ClassCastException when SolrInputDocuments contain field
values which are Collections or other classes that implement
Iterable.
(noble, hossman)

SOLR-1981: Solr will now fail correctly if solr.xml attempts to
specify multiple cores that have the same name
(hossman)

SOLR-2100: The replication handler backup command didn't save the commit
point and hence could fail when a newer commit caused the older commit point
to be removed before it was finished being copied. This did not affect
normal master/slave replication.
(Peter Sturge via yonik)

SOLR-2111: Change exception handling in distributed faceting to work more
like non-distributed faceting, change facet_counts/exception from a String
to a List<String> to enable listing all exceptions that happened, and
prevent an exception in one facet command from affecting another
facet command.
(yonik)

SOLR-2110: Remove the restriction on names for local params
substitution/dereferencing. Properly encode local params in
distributed faceting.
(yonik)

SOLR-96: Fix XML parsing in XMLUpdateRequestHandler and
DocumentAnalysisRequestHandler to respect charset from XML file and only
use HTTP header's "Content-Type" as a "hint".
(uschindler)

SOLR-2339: Fix sorting to explicitly generate an error if you
attempt to sort on a multiValued field.
(hossman)

SOLR-2348: Fix field types to explicitly generate an error if you
attempt to get a ValueSource for a multiValued field.
(hossman)

SOLR-2380: Distributed faceting could miss values when facet.sort=index
and when facet.offset was greater than 0.
(yonik)

SOLR-1656: XIncludes and other HREFs in XML files loaded by ResourceLoader
are fixed to be resolved using the URI standard (RFC 2396). The system
identifier is no longer a plain filename with path, it gets initialized
using a custom URI scheme "solrres:". This scheme is resolved using a
EntityResolver that utilizes ResourceLoader
(org.apache.solr.common.util.SystemIdResolver). This makes all relative
paths in Solr's config files behave as expected. This change
introduces some backwards breaks in the API: Some config classes
(Config, SolrConfig, IndexSchema) were changed to take
org.xml.sax.InputSource instead of InputStream. There may also be some
backwards breaks in existing config files; it is recommended to check
your config files / XSLTs and replace all XIncludes/HREFs that were
hacked to use absolute paths to use relative ones.
(uschindler)

SOLR-309: Fix FieldType so setting an analyzer on a FieldType that
doesn't expect it will generate an error. Practically speaking this
means that Solr will now correctly generate an error on
initialization if the schema.xml contains an analyzer configuration
for a fieldType that does not use TextField.
(hossman)

SOLR-2192: StreamingUpdateSolrServer.blockUntilFinished was not
thread safe and could throw an exception.
(yonik)

SOLR-1756: The date.format setting for extraction request handler causes
ClassCastException when enabled and the config code that parses this setting
does not properly use the same iterator instance.
(Christoph Brill, Mark Miller)

SOLR-2221: Use StrUtils.parseBool() to get values of boolean options in DIH.
true/on/yes (for TRUE) and false/off/no (for FALSE) can be used for
sub-options (debug, verbose, synchronous, commit, clean, optimize) for
full/delta-import commands.
(koji)
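
The accepted spellings can be summarized with a small lookup. This is a hypothetical helper written for illustration, not the StrUtils.parseBool() implementation; in particular, how Solr treats values outside the six listed ones is not stated here, so the sketch simply rejects them.

```python
# Accepted spellings as listed in the SOLR-2221 entry.
TRUE_VALUES = {"true", "on", "yes"}
FALSE_VALUES = {"false", "off", "no"}


def parse_bool_option(value):
    """Hypothetical helper: map a DIH sub-option value to a bool."""
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    # Assumption: anything else is treated as an error in this sketch.
    raise ValueError("unrecognized boolean value: %r" % value)


print(parse_bool_option("on"), parse_bool_option("no"))
```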

SOLR-1191: resolve DataImportHandler deltaQuery column against pk when pk
has a prefix (e.g. pk="book.id" deltaQuery="select id from ..."). More
useful error reporting when no match found (previously failed with a
NullPointerException in log and no clear user feedback).
(gthb via yonik)

SOLR-1570: Log warnings if uniqueKey is multi-valued or not stored
(hossman, shalin)

SOLR-1558: QueryElevationComponent only works if the uniqueKey field is
implemented using StrField. In previous versions of Solr no warning or
error would be generated if you attempted to use QueryElevationComponent,
it would just fail in unexpected ways. This has been changed so that it
will fail with a clear error message on initialization.
(hossman)

SOLR-1695: Improved error messages when adding a document that does not
contain exactly one value for the uniqueKey field
(hossman)

SOLR-1776: DismaxQParser and ExtendedDismaxQParser now use the schema.xml
"defaultSearchField" as the default value for the "qf" param instead of failing
with an error when "qf" is not specified.
(hossman)

SOLR-1851: luceneAutoCommit no longer has any effect - it has been removed
(Mark Miller)

SOLR-1865: SolrResourceLoader.getLines ignores Byte Order Markers (BOMs) at the
beginning of input files; these are often created by editors such as Windows
Notepad.
(rmuir, hossman)
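
The effect of ignoring a BOM can be sketched as follows. This shows the general technique using Python's "utf-8-sig" codec and a hypothetical helper name; it does not reproduce SolrResourceLoader.getLines itself.

```python
import codecs


def read_lines_ignoring_bom(raw):
    """Decode UTF-8 bytes and split into lines, dropping a leading BOM."""
    # "utf-8-sig" strips a leading U+FEFF and otherwise behaves like UTF-8.
    return raw.decode("utf-8-sig").splitlines()


with_bom = codecs.BOM_UTF8 + b"the\nand\n"
without_bom = b"the\nand\n"

# Both inputs produce the same lines once the BOM is ignored.
print(read_lines_ignoring_bom(with_bom) == read_lines_ignoring_bom(without_bom))
```

Without this step, the first entry of a stopword or synonym file would begin with an invisible U+FEFF character and would not match the intended word.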

SOLR-1938: ElisionFilterFactory will use a default set of French contractions
if you do not supply a custom articles file.
(rmuir)

SOLR-2003: SolrResourceLoader will report any encoding errors, rather than
silently using replacement characters for invalid inputs
(blargy via rmuir)

SOLR-2034: Switch to JavaBin codec version 2. Strings are now serialized
as the number of UTF-8 bytes, followed by the bytes in UTF-8. Previously
Strings were serialized as the number of UTF-16 chars, followed by the
bytes in Modified UTF-8.
(hossman, yonik, rmuir)
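
The two length prefixes differ for any string containing non-ASCII characters. The sketch below only compares the two counts; it does not reproduce the JavaBin wire format, and the function names are illustrative.

```python
def utf16_code_units(text):
    """Number of UTF-16 chars, the length written by the old codec version."""
    return len(text.encode("utf-16-le")) // 2


def utf8_bytes(text):
    """Number of UTF-8 bytes, the length written by codec version 2."""
    return len(text.encode("utf-8"))


for sample in ["solr", "café", "\U0001D11E"]:
    print(repr(sample), utf16_code_units(sample), utf8_bytes(sample))
```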

SOLR-2391: The preferred Content-Type for XML was changed to
application/xml. XMLResponseWriter now only delivers using this
type; updating documents and analyzing documents is still supported
using text/xml as Content-Type, too. If you have clients that are
hardcoded on text/xml as Content-Type, you have to change them.
(uschindler, rmuir)

SOLR-2414: All ResponseWriters now use only ServletOutputStreams
and wrap their own Writer around it when serializing. This fixes
the bug in PHPSerializedResponseWriter that produced wrong string
length if the servlet container had a broken UTF-8 encoding that was
in fact CESU-8 (see SOLR-1091). The system property to enable the
CESU-8 byte counting in PHPSerializedResponseWriter for broken
servlet containers was therefore removed and is now ignored if set.
Output is always UTF-8.
(uschindler, yonik, rmuir)
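
The string length problem arises because PHP's serialized format records a byte length, and CESU-8 and UTF-8 disagree on that length for code points outside the BMP. The following sketch computes both lengths; the helper names are hypothetical and the code is not taken from the response writer.

```python
def utf8_length(text):
    """Byte length of the string in standard UTF-8."""
    return len(text.encode("utf-8"))


def cesu8_length(text):
    """Byte length if every UTF-16 code unit is encoded on its own (CESU-8)."""
    data = text.encode("utf-16-be")
    total = 0
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], "big")
        if unit < 0x80:
            total += 1
        elif unit < 0x800:
            total += 2
        else:
            total += 3  # includes each half of a surrogate pair
    return total


sample = "a\U0001D11E"  # one ASCII char plus one code point outside the BMP
print(utf8_length(sample), cesu8_length(sample))
```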

There is a new default faceting algorithm for multiValued fields that should be
faster for most cases. One can revert to the previous algorithm (which has
also been improved somewhat) by adding facet.method=enum to the request.

Searching and sorting is now done on a per-segment basis, meaning that
the FieldCache entries used for sorting and for function queries are
created and used per-segment and can be reused for segments that don't
change between index updates. While generally beneficial, this can lead
to increased memory usage over 1.3 in certain scenarios:
1) A single valued field that was used for both sorting and faceting
in 1.3 would have used the same top level FieldCache entry. In 1.4,
sorting will use entries at the segment level while faceting will still
use entries at the top reader level, leading to increased memory usage.
2) Certain function queries such as ord() and rord() require a top level
FieldCache instance and can thus lead to increased memory usage. Consider
replacing ord() and rord() with alternatives, such as function queries
based on ms() for date boosting.

If you use custom Tokenizer or TokenFilter components in a chain specified in
schema.xml, they must support reusability. If your Tokenizer or TokenFilter
maintains state, it should implement reset(). If your TokenFilterFactory does
not return a subclass of TokenFilter, then it should implement reset() and call
reset() on its input TokenStream. TokenizerFactory implementations must
now return a Tokenizer rather than a TokenStream.

New users of Solr 1.4 will have omitTermFreqAndPositions enabled for non-text
indexed fields by default, which avoids indexing term frequency, positions, and
payloads, making the index smaller and faster. If you are upgrading from an
earlier Solr release and want to enable omitTermFreqAndPositions by default,
change the schema version from 1.1 to 1.2 in schema.xml. Remove any existing
index and restart Solr to ensure that omitTermFreqAndPositions completely takes
effect.

The default QParserPlugin used by the QueryComponent for parsing the "q" param
has been changed, to remove support for the deprecated use of ";" as a separator
between the query string and the sort options when no "sort" param was used.
Users who wish to continue using the semi-colon based method of specifying the
sort options should explicitly set the defType param to "lucenePlusSort" on all
requests. (The simplest way to do this is by specifying it as a default param
for your request handlers in solrconfig.xml, see the example solrconfig.xml for
sample syntax.)

If spellcheck.extendedResults=true, the response format for suggestions
has changed, see SOLR-1071.

Use of the "charset" option when configuring the following Analysis
Factories has been deprecated and will cause a warning to be logged.
In future versions of Solr attempting to use this option will cause an
error. See SOLR-1410 for more information.

GreekLowerCaseFilterFactory

RussianStemFilterFactory

RussianLowerCaseFilterFactory

RussianLetterTokenizerFactory

DIH: Evaluator API has been changed in a non back-compatible way. Users who
have developed custom Evaluators will need to change their code according to
the new API for it to work. See SOLR-996 for details.

DIH: The formatDate evaluator's syntax has been changed. The new syntax is
formatDate(<variable>, '<format_string>'). For example,
formatDate(x.date, 'yyyy-MM-dd'). In the old syntax, the date string was
written without single quotes. The old syntax has been deprecated and will
be removed in 1.5; until then, using the old syntax will log a warning.

DIH: The Context API has been changed in a non back-compatible way. In
particular, the Context.currentProcess() method now returns a String
describing the type of the current import process instead of an int.
Similarly, the public constants in Context viz. FULL_DUMP, DELTA_DUMP and
FIND_DELTA are changed to a String type. See SOLR-969 for details.

DIH: The EntityProcessor API has been simplified by moving logic for applying
transformers and handling multi-row outputs from Transformers into an
EntityProcessorWrapper class. The EntityProcessor#destroy is now called once
per parent-row at the end of row (end of data). A new method
EntityProcessor#close is added which is called at the end of import.

DIH: In Solr 1.3, if the last_index_time was not available (first import) and
a delta-import was requested, a full-import was run instead. This is no longer
the case. In Solr 1.4 delta import is run with last_index_time as the epoch
date (January 1, 1970, 00:00:00 GMT) if last_index_time is not available.

Detailed Change List

SOLR-560: Use SLF4J logging API rather than JDK logging. The packaged .war file is
shipped with a JDK logging implementation, so logging configuration for the .war should
be identical to Solr 1.3. However, if you are using the .jar file, you can select
which logging implementation to use by dropping in a different binding.
See: http://www.slf4j.org/
(ryan)

SOLR-561: Added Replication implemented in Java as a request handler. Supports index replication
as well as configuration replication and exposes detailed statistics and progress information
on the Admin page. Works on all platforms.
(Noble Paul, yonik, Akshay Ukey, shalin)

SOLR-821: Add support for replication to copy conf file to slave with a different name. This allows replication
of solrconfig.xml
(Noble Paul, Akshay Ukey via shalin)

SOLR-911: Add support for multi-select faceting by allowing filters to be
tagged and facet commands to exclude certain filters. This patch also
added the ability to change the output key for facets in the response, and
optimized distributed faceting refinement by lowering parsing overhead and
by making requests and responses smaller.
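
A request using tagged filters might be assembled as in the sketch below. The field name doctype, the tag dt, and the output key all_doctypes are hypothetical examples; the {!tag=...}, {!ex=...} and key=... local params are the mechanism this entry describes.

```python
from urllib.parse import parse_qsl, urlencode

params = [
    ("q", "solr"),
    # Tag the filter so that facet commands can refer to it.
    ("fq", "{!tag=dt}doctype:pdf"),
    ("facet", "true"),
    # Exclude the tagged filter when counting, and rename the output key.
    ("facet.field", "{!ex=dt key=all_doctypes}doctype"),
]

query_string = urlencode(params)
print(query_string)
```

With the filter excluded, the facet counts for doctype cover all document types matching the main query, which is what a multi-select interface typically needs.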

SOLR-876: WordDelimiterFilter now supports a splitOnNumerics
option, as well as a list of protected terms.
(Dan Rosher via hossman)

SOLR-928: SolrDocument and SolrInputDocument now implement the Map<String,?>
interface. This should make plugging into other standard tools easier.
(ryan)

SOLR-540: Add support for globbing in field names to highlight.
For example, hl.fl=*_text will highlight all fieldnames ending with
_text.
(Lars Kotthoff via yonik)
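
The matching behavior of hl.fl=*_text can be approximated with a shell-style glob. This is only a sketch with hypothetical field names; Python's fnmatch also supports "?" and "[...]" patterns, which may not correspond to what Solr accepts.

```python
from fnmatch import fnmatchcase

# Hypothetical field names in a schema.
fields = ["title_text", "body_text", "price", "text_author"]

# Keep only the field names that end with "_text".
highlighted = [name for name in fields if fnmatchcase(name, "*_text")]
print(highlighted)
```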

SOLR-906: Adding a StreamingUpdateSolrServer that writes update commands to
an open HTTP connection. If you are using solrj for bulk update requests
you should consider switching to this implementation. However, note that
the error handling is not immediate as it is with the standard SolrServer.
(ryan)

SOLR-865: Adding support for document updates in binary format and corresponding support in Solrj client.
(Noble Paul via shalin)

SOLR-1115: <bool>on</bool> and <bool>yes</bool> work as expected in solrconfig.xml.
(koji)

SOLR-1099: A FieldAnalysisRequestHandler which provides the analysis functionality of the web admin page as
a service. The AnalysisRequestHandler is renamed to DocumentAnalysisRequestHandler which is enhanced with
query analysis and showMatch support. AnalysisRequestHandler is now deprecated. Support for both
FieldAnalysisRequestHandler and DocumentAnalysisRequestHandler is also provided in the Solrj client.
(Uri Boness, shalin)

SOLR-1106: Made CoreAdminHandler Actions pluggable so that additional actions may be plugged in or the existing
ones can be overridden if needed.
(Kay Kay, Noble Paul, shalin)

SOLR-1124: Add a top() function query that causes its argument to
have its values derived from the top level IndexReader, even when
invoked from a sub-reader. top() is implicitly used for the
ord() and rord() functions.
(yonik)

SOLR-243: Add configurable IndexReaderFactory so that alternate IndexReader implementations
can be specified via solrconfig.xml. Note that using a custom IndexReader may be incompatible
with ReplicationHandler (see comments in SOLR-1366). This should be treated as an experimental feature.
(Andrzej Bialecki, hossman, Mark Miller, John Wang)

SOLR-1214: Differentiate between solr home and instanceDir. Deprecates the method
SolrResourceLoader#locateInstanceDir(), which is renamed to locateSolrHome.
(noble)

SOLR-1145: Add capability to specify an infoStream log file for the underlying Lucene IndexWriter in solrconfig.xml.
This is an advanced debug log file that can be used to aid developers in fixing IndexWriter bugs. See the commented
out example in the example solrconfig.xml under the indexDefaults section.
(Chris Harris, Mark Miller)

SOLR-1237: firstSearcher and newSearcher can now be identified via the CommonParams.EVENT (evt) parameter
in a request. This allows a RequestHandler or SearchComponent to know when a newSearcher or firstSearcher
event happened. QuerySenderListener is the only implementation in Solr that implements this, but outside
implementations may wish to. See the AbstractSolrEventListener for a helper method.
(gsingers)

SOLR-1343: Added HTMLStripCharFilter and marked HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and
HTMLStripStandardTokenizerFactory deprecated. To strip HTML tags, HTMLStripCharFilter can be used
with an arbitrary Tokenizer.
(koji)

SOLR-1033: Current entity's namespace is made available to all DIH
Transformers. This allows one to use an output field of TemplateTransformer
in other transformers, among other things.
(Fergus McMenemie, Noble Paul via shalin)

SOLR-1062: A DIH LogTransformer which can log data in a given template
format.
(Jon Baer, Noble Paul via shalin)

SOLR-1065: A DIH ContentStreamDataSource which can accept HTTP POST data
in a content stream. This can be used to push data to Solr instead of
just pulling it from DB/Files/URLs.
(Noble Paul via shalin)

SOLR-1059: Special DIH flags introduced for deleting documents by query or
id, skipping rows and stopping further transforms. Use $deleteDocById,
$deleteDocByQuery for deleting by id and query respectively. Use $skipRow
to skip the current row but continue with the document. Use $stopTransform
to stop further transformers. New methods are introduced in Context for
deleting by id and query.
(Noble Paul, Fergus McMenemie, shalin)

SOLR-1076: JdbcDataSource should resolve DIH variables in all its
configuration parameters.
(shalin)

SOLR-1055: Make DIH JdbcDataSource easily extensible by making the
createConnectionFactory method protected and return a
Callable<Connection> object.
(Noble Paul, shalin)

SOLR-1058: DIH: JdbcDataSource can lookup javax.sql.DataSource using JNDI.
Use a jndiName attribute to specify the location of the data source.
(Jason Shepherd, Noble Paul via shalin)

SOLR-1060: A DIH LineEntityProcessor which can stream lines of text from a
given file to be indexed directly or for processing with transformers and
child entities.
(Fergus McMenemie, Noble Paul, shalin)

SOLR-1127: Add support for DIH field name to be templatized.
(Noble Paul, shalin)

SOLR-1092: Added a new DIH command named 'import' which does not
automatically clean the index. This is useful and more appropriate when one
needs to import only some of the entities.
(Noble Paul via shalin)

SOLR-1153: DIH 'deltaImportQuery' is honored on child entities as well
(noble)

SOLR-1230: Enhanced dataimport.jsp to work with all DataImportHandler
request handler configurations, rather than just a hardcoded /dataimport
handler.
(ehatcher)

SOLR-475: New faceting method with better performance and smaller memory usage for
multi-valued fields with many unique values but relatively few values per document.
Controllable via the facet.method parameter - "fc" is the new default method and "enum"
is the original method.
(yonik)

SOLR-970: Use an ArrayList in SolrPluginUtils.parseQueryStrings
since we know exactly how long the List will be in advance.
(Kay Kay via hossman)

SOLR-1169: SortedIntDocSet - a new small set implementation
that saves memory over HashDocSet, is faster to construct,
is ordered for easier implementation of skipTo, and is faster
in the general case.
(yonik)

SOLR-1165: Use Lucene Filters and pass them down to the Lucene
search methods to filter earlier and improve performance.
(yonik)

SOLR-1111: Use per-segment sorting to share fieldcache elements
across unchanged segments. This saves memory and reduces
commit times for incremental updates to the index.
(yonik)

SOLR-879: Enable position increments in the query parser and fix the
example schema to enable position increments for the stop filter in
both the index and query analyzers to fix the bug with phrase queries
with stopwords.
(yonik)

SOLR-807: BinaryResponseWriter writes fieldType.toExternal if it is not a supported type,
otherwise it writes fieldType.toObject. This fixes the bug with encoding/decoding UUIDField.
(koji, Noble Paul, shalin)

SOLR-863: SolrCore.initIndex should close the directory it gets for clearing the lock and
use the DirectoryFactory.
(Mark Miller via shalin)

SOLR-976: deleteByQuery is ignored when deleteById is placed prior to deleteByQuery in a <delete>.
Now both delete by id and delete by query can be specified at the same time as follows.
<delete>
<id>05991</id><id>06000</id>
<query>office:Bridgewater</query><query>office:Osaka</query>
</delete>
(koji)
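
A client could generate a combined delete message like the one above as follows. This is a minimal sketch using Python's standard library; the function name is hypothetical and the ids and queries are the ones from the entry.

```python
import xml.etree.ElementTree as ET


def build_delete(ids, queries):
    """Build a <delete> message containing both ids and queries."""
    root = ET.Element("delete")
    for doc_id in ids:
        ET.SubElement(root, "id").text = doc_id
    for query in queries:
        ET.SubElement(root, "query").text = query
    return ET.tostring(root, encoding="unicode")


message = build_delete(
    ["05991", "06000"],
    ["office:Bridgewater", "office:Osaka"],
)
print(message)
```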

SOLR-1078: Fixes to WordDelimiterFilter to avoid splitting or dropping
international non-letter characters such as non spacing marks.
(yonik)

SOLR-825, SOLR-1221: Enables highlighting for range/wildcard/fuzzy/prefix queries if using hl.usePhraseHighlighter=true
and hl.highlightMultiTerm=true. Also make both options default to true.
(Mark Miller, yonik)

SOLR-1182: Fix bug in OrdFieldSource#equals which could cause a bug with OrdFieldSource caching
on OrdFieldSource#hashcode collisions.
(Mark Miller)

SOLR-1207: equals method should compare this and other of DocList in DocSetBase
(koji)

SOLR-1242: Human readable JVM info from system handler does integer cutoff rounding, even when dealing
with GB. Fixed to round to one decimal place.
(Jay Hill, Mark Miller)
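
The difference between integer cutoff and rounding to one decimal place is easy to see for a value of 1.5 GB. The sketch below is illustrative only; the function names and the exact output format are assumptions, not the system handler's code.

```python
GB = 1024 ** 3


def truncated_gb(num_bytes):
    """Integer cutoff: the fractional part is discarded."""
    return "%d GB" % (num_bytes // GB)


def rounded_gb(num_bytes):
    """Rounded to one decimal place."""
    return "%.1f GB" % (num_bytes / GB)


heap = 1610612736  # exactly 1.5 * 1024**3 bytes
print(truncated_gb(heap), "vs", rounded_gb(heap))
```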

SOLR-1243: Admin RequestHandlers should not be cached over HTTP.
(Mark Miller)

SOLR-1260: Fix implementations of set operations for DocList subclasses
and fix a bug in HashDocSet construction when offset != 0. These bugs
never manifested in normal Solr use and only potentially affect
custom code.
(yonik)

SOLR-1371: LukeRequestHandler/schema.jsp errored if schema had no
uniqueKey field. The new test for this also (hopefully) adds some
future proofing against similar bugs. As a side
effect QueryElevationComponentTest was refactored, and a bug in
that test was found.
(hossman)

SOLR-914: General finalize() improvements. No finalizer delegates
to the respective close/destroy method w/o first checking if it's
already been closed/destroyed; if it hasn't, a SEVERE error is
logged first.
(noble, hossman)

SOLR-1362: WordDelimiterFilter had inconsistent behavior when setting
the position increment of tokens following a token consisting of all
delimiters, and could additionally lose big position increments.
(Robert Muir, yonik)

SOLR-1091: Jetty's use of CESU-8 for code points outside the BMP
resulted in invalid output from the serialized PHP writer.
(yonik)

SOLR-1103: LukeRequestHandler (and schema.jsp) have been fixed to
include the "1" (ie: 2**0) bucket in the term histogram data.
(hossman)

SOLR-1517: Admin pages could stall waiting for localhost name resolution
if reverse DNS wasn't configured; this was changed so the DNS resolution
is attempted only once the first time an admin page is loaded.
(hossman)

SOLR-1529: More than 8 deleteByQuery commands in a single request
caused an error to be returned, although the deletes were
still executed.
(asmodean via yonik)

SOLR-1257: logging.jsp has been removed and now passes through to the
hierarchical log level tool added in Solr 1.3. Users still
hitting "/admin/logging.jsp" should switch to "/admin/logging".
(hossman)

Upgraded to Lucene 2.9-dev r794238. Other changes include:

LUCENE-1614 - Use Lucene's DocIdSetIterator.NO_MORE_DOCS as the sentinel value.

SOLR-1377: The TokenizerFactory API has changed to explicitly return a Tokenizer
rather than a TokenStream (that may be or may not be a Tokenizer). This change
is required to take advantage of the Token reuse improvements in Lucene 2.9.
(ryan)

SOLR-1410: Log a warning if the deprecated charset option is used
on GreekLowerCaseFilterFactory, RussianStemFilterFactory,
RussianLowerCaseFilterFactory or RussianLetterTokenizerFactory.
(Robert Muir via hossman)

SOLR-1423: Due to LUCENE-1906, Solr's tokenizer should use Tokenizer.correctOffset() instead of CharStream.correctOffset().
(Uwe Schindler via koji)

SOLR-1319, SOLR-1345: Upgrade Solr Highlighter classes to new Lucene Highlighter API. This upgrade has
resulted in a back compat break in the DefaultSolrHighlighter class - getQueryScorer is no longer
protected. If you happened to be overriding that method in custom code, override getHighlighter instead.
Also, HighlightingUtils#getQueryScorer has been removed as it was deprecated and backcompat has been
broken with it anyway.
(Mark Miller)

SOLR-782: DIH: Refactored SolrWriter to make it a concrete class and
removed wrappers over SolrInputDocument. Refactored to load Evaluators
lazily. Removed multiple document nodes in the configuration xml. Removed
support for 'default' variables, they are automatically available as
request parameters.
(Noble Paul via shalin)

SOLR-1120: Simplified DIH EntityProcessor API by moving logic for applying
transformers and handling multi-row outputs from Transformers into an
EntityProcessorWrapper class. The behavior of the method
EntityProcessor#destroy has been modified to be called once per parent-row
at the end of row. A new method EntityProcessor#close is added which is
called at the end of import. A new method
Context#getResolvedEntityAttribute is added which returns the resolved
value of an entity's attribute. Introduced a DocWrapper which takes care
of maintaining document level session variables.
(Noble Paul, shalin)

IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves
should be upgraded before the master! If the master were to be updated
first, the older searchers would not be able to read the new index format.

The Porter snowball based stemmers in Lucene were updated (LUCENE-1142),
and are not guaranteed to be backward compatible at the index level
(the stem of certain words may have changed). Re-indexing is recommended.

Older Apache Solr installations can be upgraded by replacing
the relevant war file with the new version. No changes to configuration
files should be needed.

This version of Solr contains a new version of Lucene implementing
an updated index format. This version of Solr/Lucene can still read
and update indexes in the older formats, and will convert them to the new
format on the first index change. Be sure to back up your index before
upgrading in case you need to downgrade.

Solr now recognizes HTTP Request headers related to HTTP Caching (see
RFC 2616 sec13) and will by default respond with "304 Not Modified"
when appropriate. This should only affect users who access Solr via
an HTTP Cache, or via a Web-browser that has an internal cache, but if
you wish to suppress this behavior an '<httpCaching never304="true"/>'
option can be added to your solrconfig.xml. See the wiki (or the
example solrconfig.xml) for more details...
http://wiki.apache.org/solr/SolrConfigXml#HTTPCaching

In Solr 1.2, DateField did not enforce the canonical representation of
the ISO 8601 format when parsing incoming data, and did not generate
the canonical format when generating dates from "Date Math" strings
(particularly as it pertains to milliseconds ending in trailing zeros).
As a result equivalent dates could not always be compared properly.
This problem is corrected in Solr 1.3, but DateField users that might
have been affected by indexing inconsistent formats of equivalent
dates (ie: 1995-12-31T23:59:59Z vs 1995-12-31T23:59:59.000Z) may want
to consider reindexing to correct these inconsistencies. Users who
depend on some of the "broken" behavior of DateField in Solr 1.2
(specifically: accepting any input that ends in a 'Z') should consider
using the LegacyDateField class as a possible alternative. Users that
desire 100% backwards compatibility should consider using the Solr 1.2
version of DateField.
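
The canonical form can be illustrated with a small normalizer that drops trailing zeros from the milliseconds, so that equivalent dates produce identical strings. This is a sketch under the assumption that only UTC timestamps ending in 'Z' are handled; the function name and the regular expression are not from Solr.

```python
import re

_DATE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z$")


def canonical_date(value):
    """Strip trailing zeros (and an empty fraction) from the milliseconds."""
    match = _DATE.match(value)
    if match is None:
        raise ValueError("not an ISO 8601 UTC timestamp: %r" % value)
    seconds, fraction = match.group(1), match.group(2) or ""
    fraction = fraction.rstrip("0")
    return seconds + ("." + fraction if fraction else "") + "Z"


# The two equivalent dates from the text normalize to the same string.
print(canonical_date("1995-12-31T23:59:59Z"))
print(canonical_date("1995-12-31T23:59:59.000Z"))
```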

Due to some changes in the lifecycle of TokenFilterFactories, users of
Solr 1.2 who have written Java code which constructs new instances of
StopFilterFactory, SynonymFilterFactory, or EnglishPorterFilterFactory
will need to modify their code by adding a line like the following
prior to using the factory object...
factory.inform(SolrCore.getSolrCore().getSolrConfig().getResourceLoader());
These lifecycle changes do not affect people who use Solr "out of the
box" or who have developed their own TokenFilterFactory plugins. More
info can be found in SOLR-594.

The python client that used to ship with Solr is no longer included in
the distribution (see client/python/README.txt).

Detailed Change List

SOLR-69: Adding MoreLikeThisHandler to search for similar documents using
lucene contrib/queries MoreLikeThis. MoreLikeThis is also available from
the StandardRequestHandler using ?mlt=true.
(bdelacretaz, ryan)

SOLR-253: Adding KeepWordFilter and KeepWordFilterFactory. A TokenFilter
that keeps tokens with text in the registered keeplist. This behaves like
the inverse of StopFilter.
(ryan)

SOLR-257: WordDelimiterFilter has a new parameter splitOnCaseChange,
which can be set to 0 to disable splitting "PowerShot" => "Power" "Shot".
(klaas)
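
The case-change split that this parameter controls can be approximated with a regular expression. The sketch covers only lower-to-upper ASCII transitions and uses a hypothetical function name; the real filter applies many more rules.

```python
import re


def split_on_case_change(token):
    """Split a token at each lower-to-upper ASCII letter transition."""
    # Insert a space at every boundary between [a-z] and [A-Z], then split.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token).split()


print(split_on_case_change("PowerShot"))
print(split_on_case_change("powershot"))
```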

SOLR-193: Adding SolrDocument and SolrInputDocument to represent documents
outside of the lucene Document infrastructure. This class will be used
by clients and for processing documents.
(ryan)

SOLR-244: Added ModifiableSolrParams - a SolrParams implementation that
helps you change values after initialization.
(ryan)

SOLR-20: Added a java client interface with two implementations. One
implementation uses commons httpclient to connect to solr via HTTP. The
other connects to solr directly. Check client/java/solrj. This addition
also includes tests that start jetty and test a connection using the full
HTTP request cycle.
(Darren Erik Vengroff, Will Johnson, ryan)

SOLR-133: Added StaxUpdateRequestHandler that uses StAX for XML parsing.
This implementation has much better error checking and lets you configure
a custom UpdateRequestProcessor that can selectively process update
requests depending on the request attributes. This class will likely
replace XmlUpdateRequestHandler.
(Thorsten Scherler, ryan)

SOLR-264: Added RandomSortField, a utility field with a random sort order.
The seed is based on a hash of the field name, so a dynamic field
of this type is useful for generating different random sequences.
This field type should only be used for sorting or as a value source
in a FunctionQuery
(ryan, hossman, yonik)

SOLR-266: Adding show=schema to LukeRequestHandler to show the parsed
schema fields and field types.
(ryan)

SOLR-133: The UpdateRequestHandler now accepts multiple delete options
within a single request. For example, sending:
<delete><id>1</id><id>2</id></delete> will delete both 1 and 2.
(ryan)

SOLR-269: Added UpdateRequestProcessor plugin framework. This provides
a reasonable place to process documents after they are parsed and
before they are committed to the index. This is a good place for custom
document manipulation or document based authorization.
(yonik, ryan)

SOLR-260: Converting to a standard PluginLoader framework. This reworks
RequestHandlers, FieldTypes, and QueryResponseWriters to share the same
base code for loading and initializing plugins. This adds a new
configuration option to define the default RequestHandler and
QueryResponseWriter in XML using default="true".
(ryan)

SOLR-273/376/452/516: Added hl.maxAnalyzedChars highlighting parameter, defaulting
to 50k, hl.alternateField, which allows the specification of a backup
field to use as summary if no keywords are matched, and hl.mergeContiguous,
which combines fragments if they are adjacent in the source document.
(klaas, Grant Ingersoll, Koji Sekiguchi via klaas)

SOLR-291: Control maximum number of documents to cache for any entry
in the queryResultCache via queryResultMaxDocsCached solrconfig.xml
entry.
(Koji Sekiguchi via yonik)

SOLR-248: Added CapitalizationFilterFactory that creates tokens with
normalized capitalization. This filter is useful for facet display,
but will not work with a prefix query. (ryan)
SOLR-468: Changed the semantics to keep the original token, not the
token in the Map. Also switched to use Lucene's new reusable token
capabilities.
(gsingers)

SOLR-196: A PHP serialized "phps" response writer that returns a
serialized array that can be used with the PHP function unserialize,
and a PHP response writer "php" that may be used by eval.
(Nick Jenkin, Paul Borgermans, Pieter Berkel via yonik)

SOLR-308: A new UUIDField class which accepts UUID string values,
as well as the special value of "NEW" which triggers generation of
a new random UUID.
(Thomas Peuss via hossman)
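
The behavior described above can be sketched roughly as follows. This is an
illustration only: resolve_uuid is a hypothetical name, and normalizing
supplied values through Python's uuid module is an assumption of the example,
not a statement about how UUIDField stores values.

```python
import uuid


def resolve_uuid(value):
    """Return a UUID string, generating a random one for the value "NEW"."""
    if value == "NEW":
        return str(uuid.uuid4())
    # uuid.UUID raises ValueError for strings that are not valid UUIDs.
    return str(uuid.UUID(value))
```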

SOLR-349: New FunctionQuery functions: sum, product, div, pow, log,
sqrt, abs, scale, map. Constants may now be used as a value source.
(yonik)

SOLR-359: Add field type className to Luke response, and enabled access
to the detailed field information from the solrj client API.
(Grant Ingersoll via ehatcher)

SOLR-334: Pluggable query parsers. Allows specification of query
type and arguments as a prefix on a query string.
(yonik)

SOLR-351: External Value Source. An external file may be used
to specify the values of a field, currently usable as
a ValueSource in a FunctionQuery.
(yonik)

SOLR-395: Many new features for the spell checker implementation, including
an extended response mode with much richer output, multi-word spell checking,
and a bevy of new and renamed options (see the wiki).
(Mike Krimerman, Scott Taber via klaas)

SOLR-408: Added PingRequestHandler and deprecated SolrCore.getPingQueryRequest().
Ping requests should be configured using standard RequestHandler syntax in
solrconfig.xml rather than using the <pingQuery></pingQuery> syntax.
(Karsten Sperling via ryan)

SOLR-350: Support multiple SolrCores running in the same solr instance and allow
runtime management of any running SolrCore. If a solr.xml file exists
in solr.home, this file is used to instantiate multiple cores and enables runtime
core manipulation. For more information see: http://wiki.apache.org/solr/CoreAdmin
(Henri Biestro, ryan)

SOLR-447: Added a single request handler that will automatically register all
standard admin request handlers. This replaces the need to register (and maintain)
the set of admin request handlers. Assuming solrconfig.xml includes:
<requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" />
This will register: Luke/SystemInfo/PluginInfo/ThreadDump/PropertiesRequestHandler.
(ryan)

SOLR-142: Added RawResponseWriter and ShowFileRequestHandler. This returns config
files directly. If AdminHandlers are configured, this will be added automatically.
The jsp files /admin/get-file.jsp and /admin/raw-schema.jsp have been deprecated.
The deprecated <admin><gettableFiles> will be automatically registered with
a ShowFileRequestHandler instance for backwards compatibility.
(ryan)

SOLR-446: TextResponseWriter can write SolrDocuments and SolrDocumentLists the
same way it writes Document and DocList.
(yonik, ryan)

SOLR-418: Adding a query elevation component. This is an optional component to
elevate some documents to the top positions (or exclude them) for a given query.
(ryan)

SOLR-478: Added ability to get back unique key information from the LukeRequestHandler.
(gsingers)

SOLR-127: HTTP Caching awareness. Solr now recognizes HTTP Request
headers related to HTTP Caching (see RFC 2616 sec13) and will respond
with "304 Not Modified" when appropriate. New options have been added
to solrconfig.xml to influence this behavior.
(Thomas Peuss via hossman)
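
The general idea of a conditional request can be sketched as below. This is a
deliberately simplified model that only compares a single ETag against the
If-None-Match header; it ignores If-Modified-Since, weak validators and
comma-separated lists, and cache_status is a hypothetical name rather than
anything in Solr.

```python
def cache_status(request_headers, current_etag):
    """Return 304 if the client's cached copy is current, else 200."""
    client_etag = request_headers.get("If-None-Match")
    if client_etag is not None and client_etag == current_etag:
        # The client already holds the current representation.
        return 304
    return 200
```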

SOLR-536: Add a DocumentObjectBinder to solrj that converts Objects to and
from SolrDocuments.
(Noble Paul via ryan)

SOLR-595: Add support for Field level boosting in the MoreLikeThis Handler.
(Tom Morton, gsingers)

SOLR-572: Added SpellCheckComponent and org.apache.solr.spelling package to support more spell
checking functionality. Also includes ability to add your own SolrSpellChecker implementation that
plugs in. See http://wiki.apache.org/solr/SpellCheckComponent for more details
(Shalin Shekhar Mangar, Bojan Smid, gsingers)

SOLR-469: Added DataImportHandler as a contrib project which makes indexing data from Databases,
XML files and HTTP data sources into Solr quick and easy. Includes API and implementations for
supporting multiple data sources, processors and transformers for importing data. Supports full
data imports as well as incremental (delta) indexing. See http://wiki.apache.org/solr/DataImportHandler
for more details.
(Noble Paul, shalin)

SOLR-559: use Lucene updateDocument, deleteDocuments methods. This
removes the maxBufferedDeletes parameter added by SOLR-310 as Lucene
now manages the deletes. This provides slightly better indexing
performance and makes overwrites atomic, eliminating the possibility of
a crash causing duplicates.
(yonik)

SOLR-689 / SOLR-695: If you have used "MultiCore" functionality in an unreleased
version of 1.3-dev, many classes and configs have been renamed for the official
1.3 release. Specifically, solr.xml has replaced multicore.xml, and uses a slightly
different syntax. The solrj classes: MultiCore{Request/Response/Params} have been
renamed: CoreAdmin{Request/Response/Params}
(hossman, ryan, Henri Biestro)

SOLR-647: Reference count SolrCore usage to prevent a premature
close while a core is still in use.
(Henri Biestro, Noble Paul, yonik)
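
Reference counting of a shared resource can be sketched as follows. The class
and method names here are hypothetical and only model the idea; they are not
SolrCore's API.

```python
class RefCountedCore:
    """Toy resource that is only really closed when nobody uses it."""

    def __init__(self):
        self.ref_count = 1  # the creator holds the first reference
        self.closed = False

    def open(self):
        """Register one more user of the core."""
        if self.closed:
            raise RuntimeError("core already closed")
        self.ref_count += 1
        return self

    def close(self):
        """Release one reference; release resources on the last one."""
        if self.closed:
            raise RuntimeError("core already closed")
        self.ref_count -= 1
        if self.ref_count == 0:
            self.closed = True
```

A request that calls open() before using the core and close() afterwards
cannot have the core closed underneath it by another caller.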

SOLR-737: SolrQueryParser now uses a ConstantScoreQuery for wildcard
queries, which prevents an exception from being thrown when the number
of matching terms exceeds the BooleanQuery clause limit.
(yonik)

SOLR-342: Added support to the SolrIndexWriter for using several new features of the new
Lucene IndexWriter, including: setRAMBufferSizeMB(), setMergePolicy(), setMergeScheduler().
Also, added support to specify Lucene's autoCommit functionality (not to be confused with Solr's
similarly named autoCommit functionality) via the <luceneAutoCommit> config item. See the test
and example solrconfig.xml <indexDefaults> section for usage. Performance during indexing should
be significantly increased by moving up to 2.3 due to Lucene's new indexing capabilities.
Furthermore, the setRAMBufferSizeMB makes it more logical to decide on tuning factors related to
indexing. For best performance, leave the mergePolicy and mergeScheduler as the defaults and set
ramBufferSizeMB instead of maxBufferedDocs. The best value for this depends on the types of
documents in use. 32 should be a good starting point, but reports have shown up to 48 MB provides
good results. Note, it is acceptable to set both ramBufferSizeMB and maxBufferedDocs, and Lucene
will flush based on whichever limit is reached first.
(gsingers)
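
The "whichever limit is reached first" rule can be modeled with a small
sketch. The function below is an assumption-laden illustration of the decision
only; should_flush and its parameter names are hypothetical and do not mirror
Lucene's internals.

```python
def should_flush(buffered_docs, ram_used_mb,
                 max_buffered_docs=None, ram_buffer_size_mb=None):
    """Return True once either configured limit has been reached."""
    docs_hit = (max_buffered_docs is not None
                and buffered_docs >= max_buffered_docs)
    ram_hit = (ram_buffer_size_mb is not None
               and ram_used_mb >= ram_buffer_size_mb)
    return docs_hit or ram_hit
```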

SOLR-330: Converted TokenStreams to use Lucene's new char array based
capabilities.
(gsingers)

SOLR-624: Only take snapshots if there are differences to the index
(Richard Trey Hyde via gsingers)

SOLR-730: Use read-only IndexReaders that don't synchronize
isDeleted(). This will speed up function queries and *:* queries
as well as improve their scalability on multi-CPU systems.
(Mark Miller via yonik)

SOLR-297: Fix bug in RequiredSolrParams where requiring a field
specific param would fail if a general default value had been supplied.
(hossman)

SOLR-331: Fix WordDelimiterFilter handling of offsets for synonyms or
other injected tokens that can break highlighting.
(yonik)

SOLR-282: Snapshooter does not work on Solaris and OS X since the cp command
there does not have the -l option. Also updated commit/optimize related
scripts to handle both old and new response format.
(bill)

SOLR-294: Logging of elapsed time broken on Solaris because the date command
there does not support the %s output format.
(bill)

SOLR-400: SolrExceptionTest should now handle using OpenDNS as a DNS provider
(gsingers)

SOLR-541: Legacy XML update support (provided by SolrUpdateServlet
when no RequestHandler is mapped to "/update") now logs error correctly.
(hossman)

SOLR-267: Changed logging to report number of hits, and also provide a mechanism to add log
messages to be output by the SolrCore via a NamedList toLog member variable.
(Will Johnson, yseeley, gsingers)

SOLR-509: Moved firstSearcher event notification to the end of the SolrCore constructor
(Koji Sekiguchi via gsingers)

SOLR-470, SOLR-552, SOLR-544, SOLR-701: Multiple fixes to DateField
regarding lenient parsing of optional milliseconds, and correct
formatting using the canonical representation. LegacyDateField has
been added for people who have come to depend on the existing
broken behavior.
(hossman, Stefan Oestreicher)

SOLR-539: Fix for non-atomic long counters and a cast fix to avoid divide
by zero.
(Sean Timm via Otis Gospodnetic)

SOLR-749: Allow QParser and ValueSourceParsers to be extended with same name
(hossman, gsingers)

SOLR-704: DIH NumberFormatTransformer can silently ignore part of the
string while parsing. Now it tries to use the complete string for parsing.
Failure to do so will result in an exception.
(Stefan Oestreicher via shalin)

SOLR-135: Moved common classes to org.apache.solr.common and altered the
build scripts to make two jars: apache-solr-1.3.jar and
apache-solr-1.3-common.jar. This common.jar can be used in client code;
it does not have lucene or junit dependencies. The original classes
have been replaced with a @Deprecated extended class and are scheduled
to be removed in a later release. While this change does not affect API
compatibility, it is recommended to update references to these
deprecated classes.
(ryan)

SOLR-268: Tweaks to post.jar so it prints the error message from Solr.
(Brian Whitman via hossman)

Upgraded to Lucene 2.2.0; June 18, 2007.

SOLR-215: Static access to SolrCore.getSolrCore() and SolrConfig.config
have been deprecated in order to support multiple loaded cores.
(Henri Biestro via ryan)

SOLR-367: The create method in all TokenFilter and Tokenizer Factories
provided by Solr now declare their specific return types instead of just
using "TokenStream"
(hossman)

SOLR-396: Hooks added to build system for automatic generation of (stub)
Tokenizer and TokenFilter Factories.
Also: new Factories for all Tokenizers and TokenFilters provided by the
lucene-analyzers-2.2.0.jar -- includes support for German, Chinese,
Russian, Dutch, Greek, Brazilian, Thai, and French.
(hossman)

Upgraded to commons-CSV r609327, which fixes escaping bugs and
introduces new escaping and whitespace handling options to
increase compatibility with different formats.
(yonik)

Upgraded to Lucene 2.3.0; Jan 23, 2008.

SOLR-451: Changed analysis.jsp to use POST instead of GET, also made the input area a
bit bigger
(gsingers)

SOLR-411: Changed the names of the Solr JARs to use the de facto standard JAR names based on
project-name-version.jar. This yields, for example:
apache-solr-common-1.3-dev.jar
apache-solr-solrj-1.3-dev.jar
apache-solr-1.3-dev.jar

SOLR-479: Added clover code coverage targets for committers and the nightly build. Requires
the Clover library, as licensed to Apache and only available privately. To run:
ant -Drun.clover=true clean clover test generate-clover-reports

IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves
should be upgraded before the master! If the master were to be updated
first, the older searchers would not be able to read the new index format.

Older Apache Solr installations can be upgraded by replacing
the relevant war file with the new version. No changes to configuration
files should be needed.

This version of Solr contains a new version of Lucene implementing
an updated index format. This version of Solr/Lucene can still read
and update indexes in the older formats, and will convert them to the new
format on the first index change. One change in the new index format
is that all "norms" are kept in a single file, greatly reducing the number
of files per segment. Users of compound file indexes will want to consider
converting to the non-compound format for faster indexing and slightly better
search concurrency.

The JSON response format for facets has changed to make it easier for
clients to retain sorted order. Use json.nl=map explicitly in clients
to get the old behavior, or add it as a default to the request handler
in solrconfig.xml

The Lucene based Solr query syntax is slightly more strict.
A ':' in a field value must be escaped or the whole value must be quoted.
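
Client code that builds query strings can apply either option. The sketch
below is a minimal illustration; escape_colons and quote_value are
hypothetical helper names, and other special characters of the query syntax
are deliberately not handled.

```python
def escape_colons(value):
    """Escape each ':' in a field value with a backslash."""
    return value.replace(":", "\\:")


def quote_value(value):
    """Wrap the whole value in double quotes, escaping embedded quotes."""
    return '"' + value.replace('"', '\\"') + '"'
```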

The Solr "Request Handler" framework has been updated in two key ways:
First, if a Request Handler is registered in solrconfig.xml with a name
starting with "/" then it can be accessed using path-based URL, instead of
using the legacy "/select?qt=name" URL structure. Second, the Request
Handler framework has been extended making it possible to write Request
Handlers that process streams of data for doing updates, and there is a
new-style Request Handler for XML updates given the name of "/update" in
the example solrconfig.xml. Existing installations without this "/update"
handler will continue to use the old update servlet and should see no
changes in behavior. For new-style update handlers, errors are now
reflected in the HTTP status code, Content-type checking is more strict,
and the response format has changed and is controllable via the wt
parameter.
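
The two URL styles can be compared with a short sketch. The base URL and the
handler names used in the usage note are assumptions for illustration, and
handler_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def handler_url(base, handler_name, params):
    """Build a request URL for a handler registered under handler_name."""
    if handler_name.startswith("/"):
        # Path-based access for handlers whose name starts with "/".
        return base + handler_name + "?" + urlencode(params)
    # Legacy access: select the handler with the qt parameter.
    query = dict(params, qt=handler_name)
    return base + "/select?" + urlencode(query)
```

For example, with a base of http://localhost:8983/solr, a handler named
"/mysearch" is reached at /mysearch?q=solr, while a handler named "dismax" is
reached at /select?q=solr&qt=dismax.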

Detailed Change List

SOLR-82: Default field values can be specified in the schema.xml.
(Ryan McKinley via hossman)

SOLR-89: Two new TokenFilters with corresponding Factories...
* TrimFilter - Trims leading and trailing whitespace from Tokens
* PatternReplaceFilter - applies a Pattern to each token in the
stream, replacing match occurrences with a specified replacement.
(hossman)

SOLR-91: allow configuration of a limit of the number of searchers
that can be warming in the background. This can be used to avoid
out-of-memory errors, or contention caused by more and more searchers
warming in the background. An error is thrown if the limit specified
by maxWarmingSearchers in solrconfig.xml is exceeded.
(yonik)

SOLR-106: New faceting parameters that allow specification of a
minimum count for returned facets (facet.mincount), paging through facets
(facet.offset, facet.limit), and explicit sorting (facet.sort).
facet.zeros is now deprecated.
(yonik)
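
How these parameters interact can be modeled on a plain dictionary of term
counts. This is a sketch of the concept only: page_facets is a hypothetical
name, and the default values and tie-breaking by term are assumptions of the
example rather than documented Solr behavior.

```python
def page_facets(counts, mincount=0, offset=0, limit=100, sort_by_count=True):
    """Filter, order and page a {term: count} mapping."""
    items = [(term, n) for term, n in counts.items() if n >= mincount]
    if sort_by_count:
        # Highest count first; ties broken by term for a stable order.
        items.sort(key=lambda item: (-item[1], item[0]))
    else:
        items.sort(key=lambda item: item[0])
    return items[offset:offset + limit]
```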

SOLR-80: Negative queries are now allowed everywhere. Negative queries
are generated and cached as their positive counterpart, speeding
generation and generally resulting in smaller sets to cache.
Set intersections in SolrIndexSearcher are more efficient,
starting with the smallest positive set, subtracting all negative
sets, then intersecting with all other positive sets.
(yonik)
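
The evaluation order described above can be sketched with ordinary Python
sets. The function name combine is hypothetical, and the real implementation
works on DocSets rather than Python sets.

```python
def combine(positive_sets, negative_sets):
    """Intersect positive sets and subtract negative ones, smallest first."""
    if not positive_sets:
        raise ValueError("at least one positive set is required")
    ordered = sorted(positive_sets, key=len)
    # Start from the smallest positive set so intermediate results stay small.
    result = set(ordered[0])
    for negative in negative_sets:
        result -= negative
    for positive in ordered[1:]:
        result &= positive
    return result
```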

SOLR-117: Limit a field faceting to constraints with a prefix specified
by facet.prefix or f.<field>.facet.prefix.
(yonik)

SOLR-104: Support for "Update Plugins" -- RequestHandlers that want
access to streams of data for doing updates. ContentStreams can come
from the raw POST body, multi-part form data, or remote URLs.
Included in this change is a new SolrDispatchFilter that allows
RequestHandlers registered with names that begin with a "/" to be
accessed using a URL structure based on that name.
(Ryan McKinley via hossman)

SOLR-152: DisMaxRequestHandler now supports configurable alternate
behavior when q is not specified. A "q.alt" param can be specified
using SolrQueryParser syntax as a mechanism for specifying what query
the dismax handler should execute if the main user query (q) is blank.
(Ryan McKinley via hossman)

SOLR-158: new "qs" (Query Slop) param for DisMaxRequestHandler
allows for specifying the amount of default slop to use when parsing
explicit phrase queries from the user.
(Adam Hiatt via hossman)

SOLR-81: SpellCheckerRequestHandler that uses the SpellChecker from
the Lucene contrib.
(Otis Gospodnetic and Adam Hiatt)

SOLR-211: Added a regex PatternTokenizerFactory. This extracts tokens
from the input string using a regex Pattern.
(Ryan McKinley)

SOLR-162: Added a "Luke" request handler and other admin helpers.
This exposes the system status through the standard requestHandler
framework.
(ryan)

SOLR-212: Added a DirectSolrConnection class. This lets you access
solr using the standard request/response formats, but does not require
an HTTP connection. It is designed for embedded applications.
(ryan)

SOLR-204: The request dispatcher (added in SOLR-104) can handle
calls to /select. This offers uniform error handling for /update and
/select. To enable this behavior, you must add:
<requestDispatcher handleSelect="true" > to your solrconfig.xml
See the example solrconfig.xml for details.
(ryan)

SOLR-170: StandardRequestHandler now supports a "sort" parameter.
Using the ';' syntax is still supported, but it is recommended to
transition to the new syntax.
(ryan)
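
Clients migrating from the ';' syntax could split the old query string into
separate parameters. The sketch below assumes the legacy form is
"query; sort spec" and that the query itself contains no ';' character;
split_legacy_sort is a hypothetical helper.

```python
def split_legacy_sort(q):
    """Split 'query; sort spec' into separate q and sort parameters."""
    query, sep, sort = q.partition(";")
    params = {"q": query.strip()}
    if sep and sort.strip():
        params["sort"] = sort.strip()
    return params
```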

SOLR-181: The index schema now supports "required" fields. Attempts
to add a document without a required field will fail, returning a
descriptive error message. By default, the uniqueKey field is
a required field. This can be disabled by setting required=false
in schema.xml.
(Greg Ludington via ryan)

SOLR-217: Fields configured in the schema to be neither indexed nor
stored will now be quietly ignored by Solr when Documents are added.
The example schema has a comment explaining how this can be used to
ignore any "unknown" fields.
(Will Johnson via hossman)

SOLR-227: If schema.xml defines multiple fieldTypes, fields, or
dynamicFields with the same name, a severe error will be logged rather
than quietly continuing. Depending on the <abortOnConfigurationError>
settings, this may halt the server. Likewise, if solrconfig.xml
defines multiple RequestHandlers with the same name it will also add
an error.
(ryan)

SOLR-226: Added support for dynamic field as the destination of a
copyField using glob (*) replacement.
(ryan)
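
Glob replacement can be pictured as carrying the text matched by the source
'*' over into the destination pattern. The sketch below assumes a single
leading or trailing '*' in the source; copy_destination and the example field
names are hypothetical, and Solr's actual matching rules may differ.

```python
def copy_destination(field_name, source_pattern, dest_pattern):
    """Return the destination field name, or None if the source does not match."""
    if source_pattern.startswith("*"):
        suffix = source_pattern[1:]
        if not field_name.endswith(suffix):
            return None
        matched = field_name[:len(field_name) - len(suffix)]
    elif source_pattern.endswith("*"):
        prefix = source_pattern[:-1]
        if not field_name.startswith(prefix):
            return None
        matched = field_name[len(prefix):]
    else:
        # No glob in the source: only an exact name matches.
        return dest_pattern if field_name == source_pattern else None
    # Substitute the matched text for the '*' in the destination.
    return dest_pattern.replace("*", matched)
```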

SOLR-234: TrimFilter can update the Token's startOffset and endOffset
if updateOffsets="true". By default the Token offsets are unchanged.
(ryan)
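
The effect of the option can be illustrated with a small model of a token as
(text, startOffset, endOffset). trim_token is a hypothetical function that
only sketches the described behavior; it is not the filter's code.

```python
def trim_token(text, start, end, update_offsets=False):
    """Trim whitespace from a token, optionally adjusting its offsets."""
    stripped_left = text.lstrip()
    leading = len(text) - len(stripped_left)
    trimmed = stripped_left.rstrip()
    trailing = len(stripped_left) - len(trimmed)
    if update_offsets:
        return trimmed, start + leading, end - trailing
    # Default: offsets still point at the untrimmed source text.
    return trimmed, start, end
```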

SOLR-208: new example_rss.xsl and example_atom.xsl to provide more
examples for people about the Solr XML response format and how they
can transform it to suit different needs.
(Brian Whitman via hossman)

SOLR-249: Deprecated SolrException( int, ... ) constructors in favor
of constructors that take an ErrorCode enum. This will ensure that
all SolrExceptions use a valid HTTP status code.
(ryan)

SOLR-386: Abstracted SolrHighlighter and moved existing implementation
to DefaultSolrHighlighter. Adjusted SolrCore and solrconfig.xml so
that highlighter is configurable via a class attribute. Allows users
to use their own highlighter implementation.
(Tricia Williams via klaas)

A new method "getSolrQueryParser" has been added to the IndexSchema
class for retrieving a new SolrQueryParser instance with all options
specified in the schema.xml's <solrQueryParser> block set. The
documentation for the SolrQueryParser constructor and its use of
IndexSchema has also been clarified.
(Erik Hatcher and hossman)

SOLR-179: By default, solr will abort after any severe initialization
errors. This behavior can be disabled by setting:
<abortOnConfigurationError>false</abortOnConfigurationError>
in solrconfig.xml
(ryan)

The example solrconfig.xml maps /update to XmlUpdateRequestHandler using
the new request dispatcher (SOLR-104). This requires posted content to
have a valid contentType: curl -H 'Content-type:text/xml; charset=utf-8'
The response format matches that of /select and returns standard error
codes. To enable solr1.1 style /update, do not map "/update" to any
handler in solrconfig.xml
(ryan)

SOLR-231: If a charset is not specified in the contentType,
ContentStream.getReader() will use UTF-8 encoding.
(ryan)
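
Picking the charset out of a Content-type value, with UTF-8 as the fallback,
can be sketched as follows. charset_of is a hypothetical helper and the
parsing is intentionally simple.

```python
def charset_of(content_type, default="utf-8"):
    """Return the charset parameter of a Content-type value, or the default."""
    for part in content_type.split(";")[1:]:
        name, sep, value = part.strip().partition("=")
        if sep and name.strip().lower() == "charset":
            return value.strip().strip('"').lower()
    return default
```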

SOLR-230: More options for post.jar to support stdin, xml on the
commandline, and deferring commits. Tutorial modified to take
advantage of these options so there is no need for curl.
(hossman)

SOLR-114: HashDocSet specific implementations of union() and andNot()
for a 20x performance improvement for those set operations, and a new
hash algorithm speeds up exists() by 10% and intersectionSize() by 8%.
(yonik)

SOLR-115: Solr now uses BooleanQuery.clauses() instead of
BooleanQuery.getClauses() in any situation where there is no risk of
modifying the original query.
(hossman)

SOLR-221: Speed up sorted faceting on multivalued fields by ~60%
when the base set consists of a relatively large portion of the
index.
(yonik)

SOLR-221: Added a facet.enum.cache.minDf parameter which avoids
using the filterCache for terms that match few documents, trading
decreased memory usage for increased query time.
(yonik)

SOLR-145: Fix for bug introduced in SOLR-104 where some Exceptions
were being ignored by all "out of the box" RequestHandlers.
(hossman)

SOLR-166: JNDI solr.home code refactoring. SOLR-104 moved
some JNDI related code to the init method of a Servlet Filter -
according to the Servlet Spec, all Filters should be initialized
prior to initializing any Servlets, but this is not the case in at
least one Servlet Container (Resin). This "bug fix" refactors
this JNDI code so that it should be executed the first time any
attempt is made to use the solr.home dir.
(Ryan McKinley via hossman)

SOLR-173: Bug fix to SolrDispatchFilter to reduce "too many open
files" problems: SolrDispatchFilter was not closing requests
when finished. Also modified ResponseWriters to only fetch a Searcher
reference if necessary for writing out DocLists.
(Ryan McKinley via hossman)

SOLR-168: Fix display positioning of multiple tokens at the same
position in analysis.jsp
(yonik)

Changed the SOLR-104 RequestDispatcher so that /select?qt=xxx can not
access handlers that start with "/". This makes path based authentication
possible for path based request handlers.
(ryan)

SOLR-214: Some servlet containers (including Tomcat and Resin) do not
obey the specified charset. Rather than letting the container handle
it, solr now uses the charset from the header contentType to decode posted
content. Using the contentType: "text/xml; charset=utf-8" will force
utf-8 encoding. If you do not specify a contentType, it will use the
platform default.
(Koji Sekiguchi via ryan)

SOLR-241: Undefined system properties used in configuration files now
cause a clear message to be logged rather than an obscure exception thrown.
(Koji Sekiguchi via ehatcher)

Older Apache Solr installations can be upgraded by replacing
the relevant war file with the new version. No changes to configuration
files are needed and the index format has not changed.

The default version of the Solr XML response syntax has been changed to 2.2.
Behavior can be preserved for those clients not explicitly specifying a
version by adding a default to the request handler in solrconfig.xml

By default, Solr will no longer use a searcher that has not fully warmed,
and requests will block in the meantime. To change back to the previous
behavior of using a cold searcher in the event there is no other
warm searcher, see the useColdSearcher config item in solrconfig.xml

The XML response format when adding multiple documents to the collection
in a single <add> command has changed to return a single <result>.

Added a HyphenatedWordsFilter, a text analysis filter used during indexing to rejoin
words that were hyphenated and split by a newline.
(Boris Vitez via yonik, SOLR-41)

Added a CompressableField base class which allows fields of derived types to
be compressed using the compress=true setting. The field type also gains the
ability to specify a size threshold at which field data is compressed.
(klaas, SOLR-45)

Simple faceted search support for fields (enumerating terms)
and arbitrary queries added to both StandardRequestHandler and
DisMaxRequestHandler.
(hossman, SOLR-44)

In addition to specifying default RequestHandler params in the
solrconfig.xml, support has been added for configuring values to be
appended to the multi-val request params, as well as for configuring
invariant params that cannot be overridden in the query.
(hossman, SOLR-46)
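
One way to picture how the three kinds of configured params combine with a
request is the sketch below. It models params as {name: [values]} mappings;
effective_params and the example names are hypothetical, and the exact
precedence inside Solr is not restated here beyond what the entry describes.

```python
def effective_params(request, defaults=None, appends=None, invariants=None):
    """Combine request params with configured defaults, appends, invariants."""
    merged = {k: list(v) for k, v in (defaults or {}).items()}
    for name, values in request.items():
        merged[name] = list(values)  # request values replace defaults
    for name, values in (appends or {}).items():
        merged.setdefault(name, []).extend(values)  # added to multi-valued params
    for name, values in (invariants or {}).items():
        merged[name] = list(values)  # cannot be overridden by the request
    return merged
```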

Default operator for query parsing can now be specified with q.op=AND|OR
from the client request, overriding the schema value.
(ehatcher)

New XSLTResponseWriter does server side XSLT processing of XML Response.
In the process, an init(NamedList) method was added to QueryResponseWriter
which works the same way as SolrRequestHandler.
(Bertrand Delacretaz / SOLR-49 / hossman)

autoCommit can be configured to trigger after a given number of documents have been added
(klaas, SOLR-65)

${solr.home}/lib directory can now be used for specifying "plugin" jars
(hossman, SOLR-68)

Support for "Date Math" relative "NOW" when specifying values of a
DateField in a query -- or when adding a document.
(hossman, SOLR-71)

useColdSearcher control in solrconfig.xml prevents the first searcher
from being used before it's done warming. This can help prevent
thrashing on startup when multiple requests hit a cold searcher.
The default is "false", preventing use before warm.
(yonik, SOLR-77)

classes reorganized into different packages, package names changed to Apache

force read of document stored fields in QuerySenderListener

Solr now looks in ./solr/conf for config and ./solr/data for data;
configurable via the solr.solr.home system property

Highlighter params changed to be prefixed with "hl."; allow fragmentsize
customization and per-field overrides on many options
(Andrew May via klaas, SOLR-37)

Default param values for DisMaxRequestHandler should now be specified
using a '<lst name="defaults">...</lst>' init param. For backwards
compatibility, all init params will be used as defaults if an init param
with that name does not exist.
(hossman, SOLR-43)

The DisMaxRequestHandler now supports multiple occurrences of the "fq"
param.
(hossman, SOLR-44)

FunctionQuery.explain now uses ComplexExplanation to provide more
accurate score explanations when composed in a BooleanQuery.
(hossman, SOLR-25)

Lazy field loading can be enabled via a solrconfig directive. This will be faster when
not all stored fields are needed from a document
(klaas, SOLR-52)

Made admin JSPs return XML and transform them with new XSL stylesheets
(Otis Gospodnetic, SOLR-58)

If the "echoParams=explicit" request parameter is set, request parameters are copied
to the output. In an XML output, they appear in new <lst name="params"> list inside
the new <lst name="responseHeader"> element, which replaces the old <responseHeader>.
Adding a version=2.1 parameter to the request produces the old format, for backwards
compatibility
(bdelacretaz and yonik, SOLR-59).

getDocListAndSet can now generate both a DocList and a DocSet from a
single lucene query.

BitDocSet.intersectionSize(HashDocSet) no longer generates an intermediate
set

OpenBitSet completed, replaces BitSet as the implementation for BitDocSet.
Iteration is faster, and BitDocSet.intersectionSize(BitDocSet) and unionSize
are between 3 and 4 times faster.
(yonik, SOLR-15)

much faster unionSize when one of the sets is a HashDocSet: O(smaller_set_size)
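
A union size can be computed without building the union by counting the
overlap while iterating only the smaller set. The sketch below shows the idea
with Python sets, where membership tests stand in for the hash lookups;
union_size is a hypothetical name.

```python
def union_size(a, b):
    """Return len(a | b) while iterating only over the smaller set."""
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    # Count shared members with one membership test per small-set element.
    common = sum(1 for doc in small if doc in large)
    return len(a) + len(b) - common
```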

Optimized getDocSet() for term queries resulting in a 36% speedup of facet.field
queries where DocSets aren't cached (for example, if the number of terms in the field
is larger than the filter cache.)
(yonik)

Optimized facet.field faceting by as much as 500 times when the field has
a single token per document (not multiValued & not tokenized) by using the
Lucene FieldCache entry for that field to tally term counts. The first request
utilizing the FieldCache will take longer than subsequent ones.
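
The tallying approach can be sketched with plain lists standing in for the
cached per-document data. The names facet_counts, doc_to_ord and ord_to_term
are hypothetical; they model a lookup from document number to a term ordinal,
which is an assumption about the shape of the cached entry.

```python
def facet_counts(doc_to_ord, ord_to_term, matching_docs):
    """Tally how many matching documents carry each term."""
    counts = [0] * len(ord_to_term)
    for doc in matching_docs:
        # One array lookup per document, no per-term set intersection.
        counts[doc_to_ord[doc]] += 1
    return {term: n for term, n in zip(ord_to_term, counts) if n > 0}
```

Building the per-document array is the one-time cost, which fits the note
that the first request takes longer than subsequent ones.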