Berkeley DB replication is log-based. Every time data is changed on the master, the resulting log records are sent to the replicas, which apply them to their own copy of the data to stay current with the master. So, during normal operation, the rate of change at the master determines the cost of replication - the higher the rate of change, the more processing is required.

If you want to add a new replica, you have two options:

First, you can take a recent backup of the data from the master, restore it on the new replica, and then add the replica to the HA group. The master will detect that a new replica has joined and start sending it log records; the replica will catch up (become current) and then come online.

Second, the new replica can request the data from the master over the network. We call this mechanism "internal init". How long the transfer takes depends on the size of the data and the available network bandwidth, among other things.

For a 2 TB file, I'd suspect that the former approach might work better (restore from recent backup).
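
To get a feel for why the network route can be slow at this size, here is a back-of-the-envelope sketch. It is not a BDB API; the function name and the 70% link efficiency are assumptions I picked for illustration, and real throughput will also depend on disk speed and on load at the master.

```python
def transfer_hours(size_bytes: float, link_bits_per_sec: float,
                   efficiency: float = 0.7) -> float:
    """Rough hours needed to copy size_bytes over a network link.

    efficiency is the assumed usable fraction of the rated link speed
    (a hypothetical 0.7 by default, not a measured value).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    usable_bytes_per_sec = link_bits_per_sec * efficiency / 8
    return size_bytes / usable_bytes_per_sec / 3600


TB = 10**12  # decimal terabyte

for name, bits_per_sec in [("1 Gbit/s", 1e9), ("10 Gbit/s", 1e10)]:
    hours = transfer_hours(2 * TB, bits_per_sec)
    print(f"{name}: about {hours:.1f} hours")
```

Under these assumptions, 2 TB over a 1 Gbit/s link works out to a bit over 6 hours, and the master keeps changing while the copy runs, so the replica still has to catch up on log records afterwards.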

Please refer to the BDB documentation for more details on this topic.

If you want to "refresh" the master, you could gracefully shut down the master (BDB HA would automatically transfer mastership to an existing replica), refresh it, and then bring it back online.

In order to do this correctly, you need to make sure that there are enough replicas, so the election of the new master can happen correctly (needs a quorum).
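
As a rough illustration of the quorum point, the sketch below assumes a simple majority rule, i.e. more than half of the configured group must take part in the election. The function names are made up for this example, and the actual behaviour depends on how elections are configured in your BDB HA setup.

```python
def votes_needed(group_size: int) -> int:
    """Smallest number of sites that forms a simple majority of the group."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return group_size // 2 + 1


def can_elect_new_master(group_size: int, sites_down: int = 1) -> bool:
    """True if the sites still running can form a majority.

    sites_down counts the master being shut down plus any replicas
    that happen to be unavailable at the same time.
    """
    if not 0 <= sites_down <= group_size:
        raise ValueError("sites_down must be between 0 and group_size")
    return group_size - sites_down >= votes_needed(group_size)


for n in (2, 3, 5):
    print(f"group of {n}: master down, election possible = "
          f"{can_elect_new_master(n)}")
```

With this rule, a group of two cannot elect a new master once the old master is shut down, while a group of three can, which is why you want at least two replicas in place before you take the master offline.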

So, assuming that you use Berkeley DB HA the way it was designed, I'd venture to say that BDB is a good solution for the use case you've described. Of course, the more detail you provide on your use case, the better we can assist you.

I'd encourage you to read the HA documentation to understand internal init, online addition of a new replica, elections etc.

I have a question on the below statement:
-----------------------------------------------

If you want to "refresh" the master, you could gracefully shut down the master (BDB HA would automatically transfer mastership to an existing replica), refresh it, and then bring it back online.

In order to do this correctly, you need to make sure that there are enough replicas, so the election of the new master can happen correctly (needs a quorum).
----------------------------------------------

So let's say I shut down the master, load it with a brand new BDB file, and bring it back online. Do I need to bring it back as master? And how does this replicate to the slaves? Since this is a brand new BDB file, I believe the whole file has to be replicated to the slaves rather than just the updates. How long will that take, assuming production-standard network bandwidth? During this replication, how are existing reads handled? Are there any published stats on slave catch-up time?