At the first connection, the Fusion MongoDB connector crawls the entire MongoDB and saves the checkpoint.

If Process Oplog is not selected, the connector recrawls the entire MongoDB each time you restart the datasource.
In this mode, the connector does not support incremental recrawling and does not remove entries that have been deleted from MongoDB.

About reading from the MongoDB oplog

You can configure the Fusion MongoDB connector to read from the MongoDB oplog rather than from the entire MongoDB collection.
In this mode, the connector crawls the full MongoDB collection, saves a checkpoint in ZooKeeper, then continues running indefinitely, grabbing updates from the oplog as they happen in real time.
This way, the connector can also remove indexed documents when they are deleted from MongoDB.

If the connector stops for any reason, it stores a timestamp in ZooKeeper that shows what the latest update was. When the connector restarts, it continues reading from that checkpoint onward.
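
The checkpoint-and-resume behavior described above can be sketched in plain Python. This is a simplified simulation, not the connector's implementation: the oplog is modeled as an in-memory list, timestamps are plain integers, and the index is a dict. The `ts`, `op`, `ns`, `o`, and `o2` keys follow the field names of MongoDB oplog entries; `apply_oplog` is a hypothetical helper introduced only for illustration.

```python
# Simplified simulation of resuming from an oplog checkpoint.
# The index (a dict) and the oplog (a list) are stand-ins for Solr and MongoDB.

def apply_oplog(index, oplog, checkpoint):
    """Apply oplog entries newer than `checkpoint`; return the new checkpoint."""
    for entry in oplog:
        if entry["ts"] <= checkpoint:
            continue  # already processed before the connector last stopped
        if entry["op"] == "i":        # insert
            doc = entry["o"]
            index[doc["_id"]] = doc
        elif entry["op"] == "u":      # update, modeled here as a full replacement
            index[entry["o2"]["_id"]] = entry["o"]
        elif entry["op"] == "d":      # delete
            index.pop(entry["o"]["_id"], None)
        checkpoint = entry["ts"]      # remember the latest update seen
    return checkpoint

oplog = [
    {"ts": 1, "op": "i", "ns": "shop.items", "o": {"_id": "a", "qty": 1}},
    {"ts": 2, "op": "i", "ns": "shop.items", "o": {"_id": "b", "qty": 5}},
    {"ts": 3, "op": "u", "ns": "shop.items", "o2": {"_id": "a"}, "o": {"_id": "a", "qty": 2}},
    {"ts": 4, "op": "d", "ns": "shop.items", "o": {"_id": "b"}},
]

index = {}
checkpoint = apply_oplog(index, oplog[:2], checkpoint=0)  # first run, then a stop
checkpoint = apply_oplog(index, oplog, checkpoint)        # restart: entries 1 and 2 are skipped
```

After the restart, only the entries newer than the stored timestamp are applied, so the deleted document `b` is removed from the index without a full recrawl.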

Make sure Process Oplog is selected in the Fusion MongoDB connector UI.

Configuration

Tip

When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
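
One way to read this tip, assuming the API accepts JSON request bodies: JSON treats the backslash as an escape character, so `\\t` in a request body decodes to the same two characters, `\t`, that you would type in the UI. The sketch below checks this with Python's standard `json` module. The `delimiter` property name is hypothetical and is used only for illustration.

```python
import json

# Value as typed into a UI field: a backslash followed by the letter t.
ui_value = r"\t"

# The same value inside a JSON request body needs the backslash doubled.
# "delimiter" is a hypothetical property name, not one from this connector.
api_body = r'{"delimiter": "\\t"}'
api_value = json.loads(api_body)["delimiter"]

# Without the extra backslash, JSON decodes \t to an actual tab character.
unescaped_value = json.loads(r'{"delimiter": "\t"}')["delimiter"]
```

Here `api_value` equals `ui_value` (two characters), while `unescaped_value` is a single tab character.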

Property

Description

batch_size_solr_commit

Batch size Solr commit

The number of documents to index between Solr commits.

type: integer

default value: '1000'

collection

Collection

The collection to which documents will be indexed.

type: string

pattern: ^[a-zA-Z0-9_-]+$
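
As a quick illustration of what this pattern accepts, the following sketch checks candidate names with Python's `re` module. The `is_valid_collection_name` helper is hypothetical; only the pattern itself comes from the property definition.

```python
import re

# Pattern copied from the property definition above.
COLLECTION_PATTERN = re.compile(r"^[a-zA-Z0-9_-]+$")

def is_valid_collection_name(name):
    """Return True if `name` has only letters, digits, underscores, and hyphens."""
    # fullmatch avoids Python's quirk where `$` also matches before a trailing newline.
    return COLLECTION_PATTERN.fullmatch(name) is not None
```

For example, `products_v2` passes, while `my collection` (contains a space) and `a.b` (contains a dot) do not.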

collections

MongoDB Collections to index

The MongoDB collections to index, in the format 'databaseName.collection'. Multiple collections can be separated by commas. The default '*.*' option crawls all databases (limited by user access) and their related collections.

type: string

default value: '*.*'

minLength: 1
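
The 'databaseName.collection' format can be illustrated with a small parser. This is a sketch under assumptions, not the connector's own parsing logic: `parse_collections` is a hypothetical helper, and tolerating whitespace around commas is a choice made here for convenience.

```python
def parse_collections(spec="*.*"):
    """Split a 'db.collection,db.collection' string into (database, collection) pairs."""
    pairs = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the first dot only: MongoDB database names cannot contain
        # dots, but collection names can.
        database, sep, collection = item.partition(".")
        if not sep or not database or not collection:
            raise ValueError(f"expected 'databaseName.collection', got {item!r}")
        pairs.append((database, collection))
    return pairs
```

With this sketch, `parse_collections("shop.items, shop.orders.archive")` yields the pairs `("shop", "items")` and `("shop", "orders.archive")`, and the default `'*.*'` yields the wildcard pair `("*", "*")`.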

commit_on_finish

Solr commit on finish

Set to true to send a commit request to Solr after the last batch has been fetched, so that the documents are committed to the index.
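
How batch_size_solr_commit and commit_on_finish might interact can be sketched as follows. This is an illustrative model only, not the connector's code: `index_in_batches` is a hypothetical function that records the order of add and commit events instead of talking to Solr, and skipping the final commit when nothing is pending is a choice made for this sketch.

```python
def index_in_batches(docs, batch_size_solr_commit, commit_on_finish):
    """Return the ('add', doc) and ('commit', None) events a batched indexer would emit."""
    events = []
    pending = 0  # documents added since the last commit
    for doc in docs:
        events.append(("add", doc))
        pending += 1
        if pending == batch_size_solr_commit:
            events.append(("commit", None))
            pending = 0
    # Final commit covers documents left over from an incomplete last batch.
    if commit_on_finish and pending:
        events.append(("commit", None))
    return events

events = index_in_batches(["a", "b", "c"], batch_size_solr_commit=2, commit_on_finish=True)
```

In this model, document `c` is only committed because commit_on_finish is true; with it set to false, the last partial batch would remain uncommitted until some later commit.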

}
source (required): {
  display name: Source Field
  type: string
  description: The name of the field to be mapped.
}
target: {
  display name: Target Field
  type: string
  description: The name of the field to be mapped to.
}
}

reservedFieldsMappingAllowed

Allow System Fields Mapping?

type: boolean

default value: 'false'

unmapped

Unmapped Fields

If fields do not match any of the field mapping rules, these rules will apply.

}
source: {
  display name: Source Field
  type: string
  description: The name of the field to be mapped.
}
target: {
  display name: Target Field
  type: string
  description: The name of the field to be mapped to.
}
}

ConnectorDb Configuration

Property

Description

aliases

Process Aliases?

Keep track of the original URIs that resolved to the current URI. This negatively impacts performance and the size of the database.

type: boolean

default value: 'false'

inlinks

Process Inlinks?

Keep track of incoming links. This negatively impacts performance and the size of the database.

type: boolean

default value: 'false'

inv_aliases

Process Inverted Aliases?

Keep track of the target URIs that the current URI resolves to. This negatively impacts performance and the size of the database.