Merging MARC records from different ILS instances

Today I encountered an unexpected result in VuFind 3.1.3 (working on Ubuntu LTS 16.04 with PHP 7.0).

1. Harvested records (MARCXML) from two different installations of Koha (one includes 101 records and the other 112);

the process completed successfully.

2. Indexed the harvested records;

this was also performed by the system without any error.

The surprise came at the next stage. Instead of merging the records (101+112), Solr is actually replacing the first record set (101 records) with the second record set (112 records). As a result, after importing into Solr we are getting 112 records instead of 101+112 records. Initially we thought there was some problem in our process, but we found the same result after performing the entire process three separate times.

Our expectation was that, after harvesting and importing records from different Koha instances (or installations), we would get a union-catalogue kind of thing in the VuFind discovery system. But it is not happening. What are we missing?

Note: we have included a separate section for each Koha instance, with a unique ID based on its IP address, e.g. [Koha-42] and [Koha-41].

Re: Merging MARC records from different ILS instances

I would guess that the problem you are running into here is ID collisions. If you set up two instances of Koha, they're both going to start with record ID 1 and increment from there. If you index those into the same VuFind instance, some of the IDs are going to be the same, and then they're going to overwrite each other.

The solution is to use a different marc_local.properties file for each instance so that you can assign a different prefix to each ID based on which Koha instance it came from; then you can ensure that all IDs are unique. This will cause problems for the Koha ILS driver, though, because it expects the raw ID value, not a prefixed version -- but that's where the MultiBackend driver can come in to proxy appropriate requests to appropriate drivers.
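
To make the collision concrete, here is a minimal sketch in Python. It is not VuFind or Solr code: a plain dictionary stands in for an index with a unique key, the function name is made up for illustration, and the "K41." / "K42." prefixes are only example values. It assumes both Koha instances number their records from 1, as described above.

```python
def index_records(index, record_ids, prefix=""):
    """Add records to a dict keyed by ID, the way a unique-key index would."""
    for record_id in record_ids:
        key = f"{prefix}{record_id}"
        # Writing to an existing key replaces the earlier record.
        index[key] = {"source_id": record_id, "prefix": prefix}
    return index


# Two hypothetical Koha instances, each numbering its records from 1.
koha_41_ids = range(1, 102)  # 101 records
koha_42_ids = range(1, 113)  # 112 records

# Without prefixes, the second import overwrites every colliding ID.
unprefixed = {}
index_records(unprefixed, koha_41_ids)
index_records(unprefixed, koha_42_ids)
unprefixed_count = len(unprefixed)

# With a per-instance prefix, every key stays unique.
prefixed = {}
index_records(prefixed, koha_41_ids, prefix="K41.")
index_records(prefixed, koha_42_ids, prefix="K42.")
prefixed_count = len(prefixed)

print(unprefixed_count, prefixed_count)  # 112 213
```

The unprefixed run ends with 112 entries, matching the symptom reported in this thread, while the prefixed run keeps all 213.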

This wiki page talks about the MultiBackend driver and provides a link to the SolrMarc configuration to achieve ID prefixing:

Re: Merging MARC records from different ILS instances

Thanks Demian. The ID issue you pointed out is probably the reason. In fact, a closer look shows that each harvested record has a different ID (generated by VuFind), but inside the record the ID is the Koha biblio ID stored in the non-MARC tag 999 $c. We specified this in the marc_local.properties file as id=999c, first. As a result, all of the first 101 records (from Koha set 1) got replaced by the first 101 records from Koha set 2.

I'll study the materials you suggested and report back with the outcome of the experiment. Meanwhile, one question: how can VuFind track the real-time status of documents available in different Koha installations (as we can select only one driver in config.ini)?

Re: Merging MARC records from different ILS instances

Will it store the ID in VuFind as K41.1234 for a record in Koha with bib ID 1234?

We have to create two marc_local.properties files (say marc_local-k41.properties and marc_local-k42.properties) for the two instances/installations. How can we then instruct the system to use the respective marc_local file during batch import?

Re: Merging MARC records from different ILS instances

Regarding your question about real time status in different Koha installations, that is what the MultiBackend ILS driver described in the wiki link does: it acts as a proxy in front of other ILS driver objects, allowing VuFind to communicate with multiple systems. It uses your custom ID prefixes to route requests to the most appropriate drivers.
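
As a rough illustration of that routing idea, here is a small Python sketch. It is not the MultiBackend driver's actual code or API: the class names, the get_status method, and the "." separator are all assumptions made up for this example. It only shows the general pattern of picking a driver by ID prefix and handing that driver the raw, unprefixed ID.

```python
class StubDriver:
    """Stand-in for a per-instance ILS driver; not a real VuFind class."""

    def __init__(self, name):
        self.name = name

    def get_status(self, local_id):
        # A real driver would query its own ILS here.
        return {"driver": self.name, "id": local_id, "status": "unknown"}


class PrefixRouter:
    """Route each request to the driver selected by the ID prefix."""

    def __init__(self, drivers, separator="."):
        self.drivers = drivers
        self.separator = separator

    def get_status(self, prefixed_id):
        prefix, sep, local_id = prefixed_id.partition(self.separator)
        if not sep or prefix not in self.drivers:
            raise KeyError(f"No driver configured for ID {prefixed_id!r}")
        # The underlying driver receives the raw ID without the prefix.
        return self.drivers[prefix].get_status(local_id)


router = PrefixRouter(
    {"K41": StubDriver("koha-41"), "K42": StubDriver("koha-42")}
)
result = router.get_status("K41.1234")
print(result)
```

The point of the design is that each underlying driver keeps working with the IDs its own ILS knows, while only the proxy layer needs to understand prefixes.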

Re: Merging MARC records from different ILS instances

The batch-import-marc.sh script accepts a -p switch to specify which properties file to use during indexing. This makes it possible to set up multiple configurations for multiple ILS instances.

One possible point of confusion, though: the -p switch does NOT refer to marc_local.properties or other mapping files. Instead, it refers to the file that by default is called import.properties. This file contains all of the SolrMarc settings, one of which is the list of mapping files to use (which includes marc_local.properties). So you want to do this:

1. Copy import.properties to import-k41.properties, and copy marc_local.properties to marc_local-k41.properties.

2. Edit import-k41.properties to refer to marc_local-k41.properties.

3. Run the batch import with -p import-k41.properties as an additional parameter.
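
The three steps above can be sketched in Python as follows. This is only an illustration run against stand-in files in a scratch directory: the helper name is hypothetical, and the solr.indexer.properties line used as sample content is an assumption about what import.properties contains, so check your own file before relying on it. The final command is built and printed rather than executed, and any other arguments the script needs are omitted.

```python
import re
import shutil
import tempfile
from pathlib import Path


def make_instance_config(config_dir, suffix):
    """Copy both properties files and point the import copy at the mapping copy."""
    config_dir = Path(config_dir)
    import_copy = config_dir / f"import-{suffix}.properties"
    marc_copy = config_dir / f"marc_local-{suffix}.properties"

    # Step 1: copy the two files under instance-specific names.
    shutil.copyfile(config_dir / "import.properties", import_copy)
    shutil.copyfile(config_dir / "marc_local.properties", marc_copy)

    # Step 2: replace the reference to marc_local.properties in the copy.
    text = import_copy.read_text()
    text = re.sub(r"\bmarc_local\.properties\b", marc_copy.name, text)
    import_copy.write_text(text)
    return import_copy


# Demonstration in a scratch directory with stand-in file contents.
scratch = Path(tempfile.mkdtemp())
(scratch / "import.properties").write_text(
    "solr.indexer.properties = marc.properties, marc_local.properties\n"
)
(scratch / "marc_local.properties").write_text("id = 999c, first\n")

import_file = make_instance_config(scratch, "k41")

# Step 3: the -p argument for the import script (printed, not executed).
command = ["batch-import-marc.sh", "-p", str(import_file)]
print(import_file.read_text().strip())
print(" ".join(command))
```

Repeating the call with a different suffix (for example "k42") would give the second instance its own pair of files.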

Re: Merging MARC records from different ILS instances

Dear Demian

I know you are busy and away this week, but I could not stop myself from sharing the news that, with your instructions and blessings, we have been able to set up the MultiBackend driver for different instances of Koha with the MultiILS authentication facility. The ID collision issue in Solr for records from different Koha sources is also solved (of course, after some initial hitches and teething problems). The -p switch worked when we hard-coded the file location ( -p /usr/local/.......). We are extremely grateful to you for showing us the path to solving the issue at hand. A few problems still remain. I'll send a detailed report as a PDF file for your review.
