Cheolsoo Park
added a comment - 26/Mar/14 00:39 Actually, I introduced a terrible skew to union in my previous patch.
Since I am setting the key to null, every record goes to a single task in the reducer vertex. I will post a fix shortly.
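The skew described above can be illustrated with a small stand-alone sketch. SkewDemo is a hypothetical class, not part of Pig; the partition formula mirrors the (hash & MAX_VALUE) % numPartitions pattern a typical hash partitioner uses, which is an assumption here:

```java
import java.util.Objects;

// Hedged sketch: why a constant (null) key skews a hash-partitioned shuffle.
// SkewDemo is illustrative only; it stands in for the real shuffle path.
public class SkewDemo {

    // Typical hash-partitioner formula (assumed, not Pig's actual code).
    public static int partition(Object key, int numPartitions) {
        return (Objects.hashCode(key) & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int numReducers = 10;
        int[] counts = new int[numReducers];
        // Every record carries the same null key, as in the buggy patch.
        for (int record = 0; record < 1000; record++) {
            counts[partition(null, numReducers)]++;
        }
        for (int i = 0; i < numReducers; i++) {
            System.out.println("reducer " + i + ": " + counts[i] + " records");
        }
        // All 1000 records land on one reducer; the other nine stay idle.
    }
}
```

Because Objects.hashCode(null) is a constant, every record maps to the same partition, which is exactly the single-task skew described.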


Rohini Palaniswamy
added a comment - 26/Mar/14 15:52 - edited Cheolsoo Park,
POLocalRearrangeTez.java
if (isUnion) {
    // Use the entire tuple as both key and value
    key = HDataType.getWritableComparableTypes(result.get(1), keyType);
    val = new NullableTuple((Tuple) result.get(1));
}
This seems very inefficient, as the map output size doubles and the key is simply discarded. Can we use POValueOutputTez instead of POLocalRearrangeTez, together with RoundRobinPartitioner? Until now POValueOutputTez was only used with unordered input. Since the key is always POValueOutputTez.EmptyWritable, can you change it to implement WritableComparable and always return 1 (or -1, depending on how the comparator is called in the reducer) so that no two records compare equal and they are not grouped together? If every key comparing greater than every other causes confusion during ordering and does not work, then we cannot use WritableComparable and will have to accept each reducer seeing one key-value pair with all records grouped together. That might be overhead, but it should still be better than duplicating the value in the key and doubling the memory requirements.
We will also have to clean up the isUnion code in POLocalRearrangeTez if this solution works fine.
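The round-robin idea suggested above can be sketched without any Hadoop dependencies. This stand-in class only illustrates the concept; the real RoundRobinPartitioner would extend Hadoop's Partitioner and its internals are an assumption here:

```java
// Hedged, stdlib-only sketch of the round-robin partitioning idea:
// ignore the key entirely and cycle through partitions, so a constant
// (empty) key cannot cause skew the way a hash partitioner would.
public class RoundRobinPartitionerSketch {
    private int next = 0;

    // The key is ignored; records are spread evenly regardless of its value.
    public int getPartition(Object ignoredKey, int numPartitions) {
        int p = next % numPartitions;
        next = (next + 1) % numPartitions;
        return p;
    }
}
```

With three partitions, successive records go to partitions 0, 1, 2, 0, 1, 2, ... giving an even spread even though every record carries the same empty key.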

Cheolsoo Park
added a comment - 26/Mar/14 15:57 Yeah, I totally agree with you. I was going to put up a patch in a separate jira since that's not related to VertexGroup. That code had been written long before POValueOutputTez was added.

Cheolsoo Park
added a comment - 26/Mar/14 16:14 In addition, can we fix the skew first so that I can benchmark POValueOutputTez vs POLocalRearrangeTez? I'd like to measure the impacts of the two changes separately:
1. VertexGroup change
2. POValueOutputTez
With the skew fix, I can measure #1 in isolation. Overnight I was running the same script that I used before and am seeing a perf regression. I'd like to understand it better. Thanks!


Rohini Palaniswamy
added a comment - 26/Mar/14 16:57 In addition, can we fix the skew first
How bad is the performance regression? Are you asking for committing PIG-3743-fix-skew.patch and then addressing POValueOutputTez in a separate jira? I don't have a problem with that. It can probably be clubbed with PIG-3835.
Also, for POValueOutputTez, instead of implementing WritableComparable it would be better to write a custom Comparator and either always return 0 (if we want all records grouped into one) or always 1/-1 (if that works and does not cause too much reordering), based on what works. Using a WritableComparator together with implementing WritableComparable is slightly more overhead.
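The "never equal" comparator idea can be sketched with the JDK's Comparator interface. This is a stdlib-only stand-in; the real version would be a Hadoop comparator operating on serialized key bytes, which is assumed here:

```java
import java.util.Comparator;

// Hedged sketch of the always-unequal comparator idea: if compare() never
// returns 0, the grouping comparator treats every record as its own group,
// so records are not all accumulated under one giant key group.
public class NeverEqualComparator implements Comparator<byte[]> {
    @Override
    public int compare(byte[] a, byte[] b) {
        // A constant non-zero result means no two keys are ever "equal".
        // Caveat from the discussion: this deliberately violates the
        // Comparator contract (it is not antisymmetric), so it only works
        // if the framework uses it purely for grouping, not for sorting.
        return 1;
    }
}
```

The fallback mentioned above is the mirror image: a comparator that always returns 0 makes every key "equal", collapsing all records into a single group per reducer.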

Cheolsoo Park
added a comment - 26/Mar/14 17:03
I see about a 10% slowdown with the current head + PIG-3743-fix-skew.patch compared to before. I don't have any specific analysis of what contributed to it.
Are you asking for committing PIG-3743-fix-skew.patch and then addressing POValueOutputTez in a separate jira? I don't have a problem with that.
Good. I'll do that. Thanks!
It might be overhead, but still should be better than having duplicating value in key and doubling the memory requirements.
A simple alternative is to set the value to null in POLR and let POGroupInputTez retrieve the key. Do you think this is fine? Here is the implementation:
https://github.com/piaozhexiu/apache-pig/commit/00f671d563ada56ece50493dd1d24aa5587c8171
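The shape of that alternative can be sketched without Pig or Hadoop classes. KeyOnlyShuffle is a hypothetical stand-in for the POLR output side and the POGroupInputTez input side mentioned above:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hedged, stdlib-only sketch of the alternative in the linked commit:
// ship the whole record as the key with a null value, then rebuild the
// value from the key on the input side, so the record is not shipped twice.
public class KeyOnlyShuffle {

    // Sender side (POLocalRearrange-style): key = record, value = null.
    public static Map.Entry<String, String> emit(String record) {
        return new SimpleEntry<>(record, null);
    }

    // Receiver side (POGroupInputTez-style): the value slot is empty,
    // so recover the record from the key instead.
    public static String readValue(Map.Entry<String, String> kv) {
        return kv.getKey();
    }
}
```

Compared with the original isUnion code, only one copy of the tuple crosses the shuffle; the trade-off raised in the next comment is that POLR still does per-tuple processing on the way out.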

Rohini Palaniswamy
added a comment - 26/Mar/14 17:28 A simple alternative is to set the value to null in POLR and let POGroupInputTez retrieve the key.
POLR adds a lot of overhead while processing each tuple. POValueOutputTez does no processing and just passes the record through, which will be more efficient.