Samuel García Martínez
added a comment - Aug 23 2012 09:28:19 PM +00:00 I developed a fix for this issue. Is there a process, or are there any prerequisites, for opening a pull request on GitHub with this fix?
To be more precise, the $group _id can also be a superset of the shard key.

Ian Whalen
added a comment - Aug 24 2012 02:07:33 PM +00:00 @samuel, the first step is to fill out the contributor agreement - http://www.10gen.com/contributor - and then open a pull request at https://github.com/mongodb/mongo/pulls

Asya Kamsky
added a comment - Apr 08 2015 06:12:27 PM +00:00 The same optimization can be extended if $group is done on the original document _id field (as would be the case if you $unwind and $group by _id to process some array in the document).
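As an illustration of why grouping by the original _id can stay on the shards, here is a minimal sketch in plain JavaScript. It does not use the server; the `shards` array, the `tags` field, and the `unwindAndCount` helper are hypothetical stand-ins for a sharded collection and for a pipeline like [{$unwind: '$tags'}, {$group: {_id: '$_id', n: {$sum: 1}}}]. It assumes _id values are unique across the cluster, so every unwound copy of a document lives on the same shard.

```javascript
// Hypothetical in-memory model: each inner array is the data held by one shard.
const shards = [
  [{ _id: 1, tags: ["a", "b"] }, { _id: 2, tags: ["c"] }],
  [{ _id: 3, tags: ["a", "d", "e"] }],
];

// Stand-in for $unwind on "tags" followed by $group on the original _id
// with a {$sum: 1} accumulator. Returns a Map of _id -> count.
function unwindAndCount(docs) {
  const counts = new Map();
  for (const doc of docs) {
    for (const _tag of doc.tags) {
      counts.set(doc._id, (counts.get(doc._id) || 0) + 1);
    }
  }
  return counts;
}

// Group on each shard independently and simply concatenate the results.
const perShard = new Map(shards.flatMap((s) => [...unwindAndCount(s)]));

// Group once over all documents, as a merging node would.
const merged = unwindAndCount(shards.flat());
```

Under that uniqueness assumption, `perShard` and `merged` contain the same groups, so no second grouping pass is needed after the shard results are combined.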

Charlie Swanson
added a comment - Sep 16 2015 03:42:10 PM +00:00 We've taken a look at how we might achieve this, and the changes required are non-trivial. After some discussion, we've decided this will not be completed for 3.2, and we will re-prioritize when planning for the next release.
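What complicates this is that an earlier stage, such as a $project stage, could modify the _id or shard key of a document. For example, in the following pipeline the $group stage could not be performed entirely on the shards, because $project replaces every _id with the constant 1, so documents from different shards end up in the same group.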
db.coll.aggregate([
{$project: {_id: {$literal: 1}}},
{$group: {_id: '$_id'}}
])
While this is still a very useful optimization, we believe several of the new array expressions available in the $project stage should reduce the need for unwinding and then regrouping, which in turn reduces the need for this optimization. See, for example, SERVER-9625, SERVER-4589, SERVER-8141, SERVER-10626, and SERVER-14872.
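To make the problem with the pipeline above concrete, here is a small sketch in plain JavaScript that models it without a server. The `shardData` array and the `project` and `group` helpers are hypothetical stand-ins for the sharded collection and for the $project and $group stages. It is only meant to show why shard-local grouping is not sufficient here.

```javascript
// Hypothetical in-memory model: each inner array is the data held by one shard.
const shardData = [[{ _id: 10 }, { _id: 11 }], [{ _id: 12 }]];

// Stand-in for {$project: {_id: {$literal: 1}}}: every document gets _id 1.
const project = (docs) => docs.map(() => ({ _id: 1 }));

// Stand-in for {$group: {_id: '$_id'}}: one output document per distinct _id.
const group = (docs) =>
  [...new Set(docs.map((d) => d._id))].map((_id) => ({ _id }));

// Running both stages on each shard and concatenating the results
// leaves one {_id: 1} group per shard.
const shardLocal = shardData.flatMap((s) => group(project(s)));

// Running both stages over all documents gives the single expected group.
const expected = group(project(shardData.flat()));
```

In this model `shardLocal` holds two documents with the same _id while `expected` holds one, which is why a further merge of the groups is still required after the shards respond.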