Carl Steinbach
added a comment - 12/Jul/11 01:48 This UDF would be generally nice to have, but may also be a requirement if it turns out that the unordered output of functions like map_keys() and map_values() is non-deterministic.

Phabricator
added a comment - 04/Jan/12 02:28 cwsteinbach has requested changes to the revision "HIVE-2279 [jira] Implement sort(array) UDF".
INLINE COMMENTS
ql/src/test/queries/clientpositive/udf_sort_array.q:8 Please add EXPLAIN queries.
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:82 Fix indentation.
ql/src/test/results/clientpositive/udf_sort_array.q.out:13 "sort_array(sort_array(...))"?
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:45 "Sorts the input array in ascending order according to the natural ordering of the array elements."
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:55 Please add a negative testcase that exercises these code paths.
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:60 What happens if I try to sort an array of arrays, or array of structs?
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:67 The return type should be another ARRAY, not a string.
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:85 This method needs to return another ARRAY, not a string containing the concatenated, sorted input elements. Take a look at the array() UDF for hints on how to return an array from a UDF.
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:97 Indentation.
REVISION DETAIL
https://reviews.facebook.net/D1107
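
A minimal sketch of the behaviour the inline comments above ask for: the function hands back a new, sorted array rather than a string of concatenated elements, ordered ascending by the natural ordering of the elements. This is plain Java and not the actual GenericUDFSortArray code; the class name SortArraySketch, the method name sortArray, and the choice to return null for a null input are assumptions made for illustration. The Comparable bound stands in for the question about arrays of arrays or structs: such element types have no natural ordering, so the sketch rejects them at compile time, whereas a real UDF would presumably have to reject them when it checks its argument types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the UDF logic; not Hive code.
final class SortArraySketch {

  // Returns a NEW list sorted in ascending natural order.
  // The input list is left untouched, and the result is a list, not a string.
  public static <T extends Comparable<? super T>> List<T> sortArray(List<T> input) {
    if (input == null) {
      return null; // assumption: a null array yields null
    }
    List<T> sorted = new ArrayList<>(input); // copy so the caller's list is not mutated
    Collections.sort(sorted);                // natural ordering of the elements
    return sorted;
  }

  public static void main(String[] args) {
    List<Integer> input = Arrays.asList(4, 1, 3, 2);
    System.out.println(sortArray(input)); // prints [1, 2, 3, 4]
  }
}
```

A nested call such as sortArray(Arrays.asList(Arrays.asList(1), Arrays.asList(2))) does not compile here, because List does not implement Comparable.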

Phabricator
added a comment - 09/Jan/12 09:08 cwsteinbach has commented on the revision "HIVE-2279 [jira] Implement sort(array) UDF".
INLINE COMMENTS
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:47 This should be
value="_FUNC_(array)"
Otherwise, the output of "describe function sort_array" looks like this:
sort_array(sort_array(obj1, obj2,...)) - Sorts the input...
^^^^^^^^
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:51 Please change the example to something simpler that doesn't involve corporate names, e.g. "array(4, 1, 3, 2)"
ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:449 I think the name should be changed to "sort" and "GenericUDFSort". In the future we may want to extend it so that it can also sort the characters of an input string. Similarly, we may want to implement a "reverse" UDF that reverses the elements of an array as well as the characters in a string.
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:74 "Argument 1" instead of "Argument " + 1 + ...
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:88 There's a single argument, right?
REVISION DETAIL
https://reviews.facebook.net/D1125
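
A rough illustration of the "describe function" remark above, assuming the description text contains a placeholder token (written _FUNC_ here) that is replaced with the registered function name when the description is printed. The class DescribePlaceholderSketch, the method describe, and both template strings are hypothetical and only serve to show how a template that already spells out the function name ends up printing it twice.

```java
// Hypothetical sketch of placeholder substitution; not the Hive implementation.
final class DescribePlaceholderSketch {

  // Assumed placeholder token for the function name.
  static final String PLACEHOLDER = "_FUNC_";

  // Substitutes the registered function name for every placeholder occurrence.
  public static String describe(String registeredName, String valueTemplate) {
    return valueTemplate.replace(PLACEHOLDER, registeredName);
  }

  public static void main(String[] args) {
    // Template that repeats the name around the placeholder: the name appears twice.
    System.out.println(describe("sort_array",
        "sort_array(_FUNC_(obj1, obj2,...)) - Sorts the input..."));
    // Template in the suggested form: the name appears once.
    System.out.println(describe("sort_array",
        "_FUNC_(array) - Sorts the input..."));
  }
}
```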

Carl Steinbach
added a comment - 16/Jan/12 09:49 @Zhenxiao: Please attach a copy of your patch (D1143) and click the box that gives license rights to Apache. This is a prerequisite for getting any patch committed. Thanks.