Pavel Yaskevich
added a comment - 27/Jan/12 08:53 I don't think so because CASSANDRA-1983 proposes to change the way we identify SSTables and this one proposes to change the way we identify ColumnFamilies.

Which I interpret as the cf id:s that are embedded in e.g. row mutations (remember all those "invalid cfId..." cases upon schema changes). In general, UUID:s should minimize the chances of accidentally applying something to the wrong CF (which I suppose is a generalization of the simultaneous creation case mentioned in the ticket description).

Peter Schuller
added a comment - 27/Jan/12 08:59 Which I interpret as the cf id:s that are embedded in e.g. row mutations (remember all those "invalid cfId..." cases upon schema changes). In general, UUID:s should minimize the chances of accidentally applying something to the wrong CF (which I suppose is a generalization of the simultaneous creation case mentioned in the ticket description).

Pavel Yaskevich
added a comment - 21/Apr/12 00:18 Sure it is, given that CFs could be created on the disconnected nodes and have the same cfId assigned - after the schema merge system would crash with RTE.

This got me thinking that if we change int to UUID it wouldn't actually change the deal unless we generate UUID based on the CF attributes (so we can merge simultaneously created CF on the different machines) but after we update attributes again we would have to re-generate id. The question is - do we really need ID for CF? We can guarantee uniqueness of CF in the given keyspace by CF's name...

Pavel Yaskevich
added a comment - 15/May/12 15:29 This got me thinking that if we change int to UUID it wouldn't actually change the deal unless we generate UUID based on the CF attributes (so we can merge simultaneously created CF on the different machines) but after we update attributes again we would have to re-generate id. The question is - do we really need ID for CF? We can guarantee uniqueness of CF in the given keyspace by CF's name...

Yeah, I think that this was the main reason to add that. Although change proposed here would only work if CFs simultaneously created on the different machines are not the same CF otherwise we would have hopeless situation upon merge which would break the whole schema. We can't really check what schema mutation would lead to before we apply it and try to merge so I don't really see what we can do about it...

Pavel Yaskevich
added a comment - 16/May/12 00:08 Yeah, I think that this was the main reason to add that. Although change proposed here would only work if CFs simultaneously created on the different machines are not the same CF otherwise we would have hopeless situation upon merge which would break the whole schema. We can't really check what schema mutation would lead to before we apply it and try to merge so I don't really see what we can do about it...

Jonathan Ellis
added a comment - 16/May/12 16:20 Sounds like we should prototype the "just use cfname" approach and see how bad the hit is in practice. If we're lucky it will be negligible and we can move on.

Pavel Yaskevich
added a comment - 18/May/12 20:51 Made a migration from int to uuid, made sure that all SerializationTests pass (and all other tests) and schema is merged correctly in the situations described in previous comments.

For info, changing the messaging service version will break streaming during the upgrade. And last time we tried, there was strong opposition in changing the messaging service version in a minor release. Not that I have a good solution for fixing this without breaking the protocol.

Sylvain Lebresne
added a comment - 19/May/12 09:52 For info, changing the messaging service version will break streaming during the upgrade. And last time we tried, there was strong opposition in changing the messaging service version in a minor release. Not that I have a good solution for fixing this without breaking the protocol.

Jonathan Ellis
added a comment - 19/May/12 16:03 Our options include
break streaming for 1.1.1 as an "emergency fix"
retract claims that 1.1 allows concurrent schema changes, postpone this fix for 1.2
add a mapping between old int IDs and new UUIDs and version streaming accordingly
I'm okay with any of these, although if #3 is feasible that would be preferred.

Previous oppositions weren't mine but in any case I'm personally good going with #3 too (Pavel's patch already includes the id<->uuid mapping, our only problem is the dropping of streaming connection on version mismatch). This fix doesn't affect anything streamed, so it should be fine to bump the messaging version but special case streaming so that it doesn't drop the connection for streams coming from 1.1.0.

Sylvain Lebresne
added a comment - 21/May/12 09:22 Previous oppositions weren't mine but in any case I'm personally good going with #3 too (Pavel's patch already includes the id<->uuid mapping, our only problem is the dropping of streaming connection on version mismatch). This fix doesn't affect anything streamed, so it should be fine to bump the messaging version but special case streaming so that it doesn't drop the connection for streams coming from 1.1.0.

Looking more closely, there is actually two problems with respect to rolling upgrades:

because newly created CF won't have an old format id, it means people shouldn't create any CF in a mixed version cluster. That would clearly be fine for a major upgrade, it's more annoying to roll this in a minor upgrade though imo. I don't think there is anything we can do about that.

as is, streaming won't work in a mixed version cluster (as is the case in major upgrade) by virtue of the following code in IncomingTcpConnection:

We could avoid that, by say adding some isStreamingCompatible(v1, v2) method that would return true for VERSION_11 and VERSION_111, since after all there is no change to the stream format. However, the patch also need to version correctly StreamRequestMessage for it to work correctly.

Overall, this is not a small patch, and it will induces more limited rolling upgrade behavior than is the norm in a minor version, so I'll admit I'm personally growing more in favor of solution #2 above (postpone to 1.2).

That being said, on the patch itself:

In RowCacheKey.compareTo(), == is used intead of equals().

In Schema, we can remove the cfIdGen field && MIN_CF_ID.

nameUUIDFromBytes already does a md5 internally, so we should just pass the concatenation of ksName and cfName bytes (doubling the md5 slightly augments the chance of collisions).

When writing the schema, for the "id" column, the code write a string/UUID (toSchemaNoColumns) but expect an int when reading (fromSchemaNoColumns). The fact is, we don't need to save the new style id in the schema since we can recompute it. So we should keep the "id" column for oldId (if they exist). Also, when writing a CF schema, we should check if it has an associated old cfId and write it if it has (i.e. we should preserve the old ids mapping (when it exists) for now, we'll drop that in a future version).

Schema.addOldCfIdMapping should check for null value for the oldId and ignore it, since in fromSchemaNoColumns, result.getInt("id") will return null for new CF.

ColumnFamilySerializer needs to version the serialize version, when we talk to old node (same in RowMutation serialize method). Of course, when a CF don't have a old id, we'll have to throw an exception instead (that the 'user shouldn't create CF in a mixed cluster').

StreamRequestMessage should version cfId correctly.

In SchemaLoader, not sure we want to always assign an old style id to the CF. Instead, it would probably be better to add a few specific tests (serialization test ?) that validate the old id are correctly handled.

OCD nit: convertOldCFId could be renamed to convertOldCfId for consistency with the rest (i.e. 'F' could be lowercased)

Sylvain Lebresne
added a comment - 25/May/12 10:16 Looking more closely, there is actually two problems with respect to rolling upgrades:
because newly created CF won't have an old format id, it means people shouldn't create any CF in a mixed version cluster. That would clearly be fine for a major upgrade, it's more annoying to roll this in a minor upgrade though imo. I don't think there is anything we can do about that.
as is, streaming won't work in a mixed version cluster (as is the case in major upgrade) by virtue of the following code in IncomingTcpConnection:
if (version == MessagingService.version_)
{
....
}
else
{
// streaming connections are per-session and have a fixed version. we can't do anything with a wrong-version stream connection, so drop it.
logger.error("Received stream using protocol version {} (my version {}). Terminating connection",
version, MessagingService.version_);
}
We could avoid that, by say adding some isStreamingCompatible(v1, v2) method that would return true for VERSION_11 and VERSION_111, since after all there is no change to the stream format. However, the patch also need to version correctly StreamRequestMessage for it to work correctly.
Overall, this is not a small patch, and it will induces more limited rolling upgrade behavior than is the norm in a minor version, so I'll admit I'm personally growing more in favor of solution #2 above (postpone to 1.2).
That being said, on the patch itself:
In RowCacheKey.compareTo(), == is used intead of equals().
In Schema, we can remove the cfIdGen field && MIN_CF_ID.
nameUUIDFromBytes already does a md5 internally, so we should just pass the concatenation of ksName and cfName bytes (doubling the md5 slightly augments the chance of collisions).
When writing the schema, for the "id" column, the code write a string/UUID (toSchemaNoColumns) but expect an int when reading (fromSchemaNoColumns). The fact is, we don't need to save the new style id in the schema since we can recompute it. So we should keep the "id" column for oldId (if they exist). Also, when writing a CF schema, we should check if it has an associated old cfId and write it if it has (i.e. we should preserve the old ids mapping (when it exists) for now, we'll drop that in a future version).
Schema.addOldCfIdMapping should check for null value for the oldId and ignore it, since in fromSchemaNoColumns, result.getInt("id") will return null for new CF.
ColumnFamilySerializer needs to version the serialize version, when we talk to old node (same in RowMutation serialize method). Of course, when a CF don't have a old id, we'll have to throw an exception instead (that the 'user shouldn't create CF in a mixed cluster').
StreamRequestMessage should version cfId correctly.
In SchemaLoader, not sure we want to always assign an old style id to the CF. Instead, it would probably be better to add a few specific tests (serialization test ?) that validate the old id are correctly handled.
OCD nit: convertOldCFId could be renamed to convertOldCfId for consistency with the rest (i.e. 'F' could be lowercased)

In ColumnFamilySerializer, when we serialize for an old version and can't find an oldId, it'll probably be better to use a more user friendly message like "Cannot send column family X to Y as it's version is pre-1.1.1. Please update the whole cluster to 1.1.1 first".

But I think we agreed that it's safer to target this at 1.2 and acknowledge that concurrent table creation will only be supported then, so this will need rebase to trunk

Sylvain Lebresne
added a comment - 25/May/12 16:18 Two small nits while looking at v2 quickly:
In fromSchemaNoColumns, the try catch is not useful.
In ColumnFamilySerializer, when we serialize for an old version and can't find an oldId, it'll probably be better to use a more user friendly message like "Cannot send column family X to Y as it's version is pre-1.1.1. Please update the whole cluster to 1.1.1 first".
But I think we agreed that it's safer to target this at 1.2 and acknowledge that concurrent table creation will only be supported then, so this will need rebase to trunk