Hi!
I am considering using Cassandra for clustered transaction logging in a
project.
What I need are, in principle, three functions:
1 - Log transaction with a unique (but possibly non-sequential) id
2 - Fetch transaction with a specific id
3 - Fetch X new transactions "after" a specific cursor/transaction
Function 3 must be guaranteed to:
A - Eventually return all known transactions
B - Not return the same transaction more than once
The order in which transactions are fetched does not have to be strictly
time-sorted, but in practice it probably has to be based on some
time-oriented order to be able to support cursors.
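
To make the three functions concrete, here is a minimal in-memory sketch in Python. It is not Cassandra code. The class name TransactionLog, the method names, and the use of a plain integer offset as the cursor are all assumptions made for illustration; the sketch only shows the behaviour I am after on a single node.

```python
from typing import Dict, List, Optional, Tuple


class TransactionLog:
    """Hypothetical single-node stand-in for the clustered log."""

    def __init__(self) -> None:
        self._by_id: Dict[str, bytes] = {}
        self._order: List[str] = []  # arrival order, used only for cursors

    def log(self, tx_id: str, payload: bytes) -> None:
        """Function 1: store a transaction under a unique id."""
        if tx_id in self._by_id:
            raise KeyError(f"duplicate transaction id: {tx_id}")
        self._by_id[tx_id] = payload
        self._order.append(tx_id)

    def fetch(self, tx_id: str) -> Optional[bytes]:
        """Function 2: look up one transaction by id (None if unknown)."""
        return self._by_id.get(tx_id)

    def fetch_after(self, cursor: int, limit: int) -> Tuple[List[str], int]:
        """Function 3: return up to `limit` ids after `cursor`, plus the new cursor.

        The cursor is an offset into the arrival order (an assumption of this
        sketch), so every id is returned exactly once as the cursor advances.
        """
        batch = self._order[cursor:cursor + limit]
        return batch, cursor + len(batch)
```

A reader would start with cursor 0 and keep passing back the cursor it received. The hard part is getting the same two guarantees when several nodes append concurrently.
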
I can see that 1 & 2 are trivial to solve in Cassandra, but is there any
elegant way to solve 3?
Since multiple nodes may be logging transactions, and their clocks might
not be perfectly synchronized (down to the millisecond), sorting on time is
not stable.
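
The following small Python sketch illustrates the problem I am worried about. The timestamps, transaction ids, and the fetch_after helper are made up for illustration and do not correspond to any Cassandra API; the cursor is assumed to be the timestamp of the last transaction returned.

```python
from typing import List, Tuple

Row = Tuple[int, str]  # (timestamp in ms as seen by the writing node, transaction id)


def fetch_after(rows: List[Row], cursor: int, limit: int) -> List[Row]:
    """Return up to `limit` rows with a timestamp strictly greater than `cursor`."""
    newer = sorted(r for r in rows if r[0] > cursor)
    return newer[:limit]


# Node 1 logs two transactions.
rows: List[Row] = [(1000, "tx-a"), (1005, "tx-b")]

# A reader fetches them and remembers the last timestamp as its cursor.
first = fetch_after(rows, cursor=0, limit=10)
cursor = first[-1][0]

# Node 2, whose clock runs a few milliseconds behind, logs afterwards.
rows.append((1002, "tx-c"))

# The next fetch only looks past the cursor, so tx-c is never returned.
second = fetch_after(rows, cursor=cursor, limit=10)
```

Here the second fetch comes back empty even though tx-c was logged after the first fetch, which breaks guarantee A. Moving the cursor back to compensate would instead risk returning tx-b twice, which breaks guarantee B.
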
Creating a synchronized incremental id might be one option, but could that
create a bottleneck in the cluster?
Another alternative might be to use Cassandra for 1 & 2 and then store an
ordered list of ids in a standard DB. This might be a reasonable compromise,
since 3 is less critical from an HA point of view, but maybe someone can
point me to a more elegant solution using Cassandra?