This looks great. It would be helpful to include a little more detail here about the testing you did, so that others can understand the issue and how this patch resolves it.

Some background for everyone:

The basic issue, as I recall, is that the data cached by get_pages() can get very large for large hierarchies because the full "post" data is cached for each node - we saw an example where the memcache object cache backend had to split the get_pages() cache into 62 one-megabyte chunks.

The patch addresses this by caching only the IDs, since the post objects are already cached separately.
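To illustrate the difference, here is a minimal Python sketch (the cache dict and post store are hypothetical stand-ins for the WordPress object cache and posts table, not real APIs):

```python
# Hypothetical in-memory object cache standing in for the WordPress
# object cache (wp_cache_get()/wp_cache_set() in PHP).
cache = {}

posts_by_id = {1: {"ID": 1, "post_title": "Home"},
               2: {"ID": 2, "post_title": "About"}}

def get_pages_old(key):
    # Old behaviour: cache the full post objects for every node.
    # For a large hierarchy this one entry can grow to many megabytes.
    cache[key] = list(posts_by_id.values())
    return cache[key]

def get_pages_new(key):
    # Patched behaviour: cache only the IDs; the post objects are
    # already cached individually, so they can be looked up cheaply.
    cache[key] = list(posts_by_id)          # just [1, 2]
    return [posts_by_id[i] for i in cache[key]]
```

The cached entry shrinks from a list of full objects to a list of integers, which is the memory saving the patch is after.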

Some questions for nprasath002:

When we don't have a persistent object cache backend, are we now doing more queries for get_pages() calls? If so, how many more - is it a single extra query, or does the query volume scale with the number of pages? (From memory, the old code fetched all the page info in one query.)

Can you write some unit tests for get_pages() to show that the result returned from the function has not changed in any way from the expected behaviour?

Unlike posts, pages are less likely to have a hot cache. Caching just the IDs and then running get_post() on each of them could result in a lot of extra queries when a persistent cache backend is not in use. We definitely need to investigate that. If we go with caching IDs, we should consider introducing and using wp_cache_get_multi().
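A rough Python sketch of what a hypothetical wp_cache_get_multi() could look like - a single batched lookup instead of one get_post() per cached ID (the name and signature are assumptions, since the function does not exist yet):

```python
# Hypothetical cache contents: one entry per post, keyed by ID.
cache = {"post:1": "Home", "post:2": "About", "post:3": "Contact"}

def cache_get_multi(keys):
    # One round trip to the cache backend for many keys, instead of
    # one lookup (and potential database query) per cached ID.
    # Returns only the keys that were found.
    return {k: cache[k] for k in keys if k in cache}

wanted = ["post:1", "post:3", "post:9"]
found = cache_get_multi(wanted)
# Missing keys would fall back to a single batched database query
# in the real code, rather than one query each.
missing = [k for k in wanted if k not in found]
```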

Looks like caching only IDs will be just fine, since get_pages() calls update_post_cache(). This ensures the pages are cached should we run the same get_pages() query later in the same page load. On the next page load, when there is no persistent cache, the cache is empty, so get_pages() does a full pages query and cache set, since there are no cached IDs to loop over. Query counts are looking fine so far in testing.
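The behaviour described above can be sketched in Python (the query counter and cache dict are stand-ins for WordPress internals, not real APIs):

```python
queries = 0          # counts simulated database queries
object_cache = {}    # stands in for the non-persistent object cache

DB = {1: "Home", 2: "About"}  # stand-in for the posts table

def get_pages(key="pages:all"):
    global queries
    ids = object_cache.get(key)
    if ids is None:
        queries += 1                           # one full pages query
        ids = list(DB)
        object_cache[key] = ids                # cache the IDs
        for i in ids:                          # update_post_cache():
            object_cache[f"post:{i}"] = DB[i]  # prime each post too
    return [object_cache[f"post:{i}"] for i in ids]

get_pages()   # first call: one query, both caches primed
get_pages()   # same query later in the page load: no extra queries
```

With a non-persistent cache, `object_cache` starts empty on the next page load, so the first call there repeats the single full query - matching the behaviour described above.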

In get_pages(), cache each query in its own cache bucket instead of storing them all in one cached array. Also, store post IDs instead of full objects. This reduces overall memory usage as well as the size of the cache buckets. Use incrementor-style passive cache invalidation.
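A Python sketch of incrementor-style passive invalidation (the get_pages:&lt;md5&gt;:&lt;incrementor&gt; key format here is an illustration, not necessarily the exact format used by the patch):

```python
import hashlib
from itertools import count

cache = {}
_tick = count(1)  # stand-in source of fresh incrementor values

def last_changed():
    # The incrementor embedded in every key; it is stored in the
    # cache itself and seeded on first use.
    if "last_changed" not in cache:
        cache["last_changed"] = str(next(_tick))
    return cache["last_changed"]

def pages_key(args):
    # Hypothetical key format: get_pages:<md5 of args>:<incrementor>.
    digest = hashlib.md5(repr(sorted(args.items())).encode()).hexdigest()
    return f"get_pages:{digest}:{last_changed()}"

def bump_last_changed():
    # "Passive" invalidation: old buckets are never deleted; bumping
    # the incrementor means no future key ever matches them.
    cache["last_changed"] = str(next(_tick))
```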

The default cache returns boolean true from wp_cache_set(), which works out since true is cast to the string '1'. Some cache backends return void, which is cast to an empty string. If the same query is run again later in the page load, it then uses a different cache key, resulting in another query instead of a hit on the already cached result. The keys look like this:
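A simplified Python model of the cast problem (the key format and helper names are hypothetical; the point is only how PHP's string cast treats true versus void):

```python
def php_string_cast(value):
    # Mimics PHP's string cast: true -> '1', null/void -> ''.
    if value is True:
        return "1"
    if value is None:
        return ""
    return str(value)

def key_from_set_return(set_return):
    # Bug sketch: the key is built from whatever wp_cache_set()
    # returned, instead of from the value that was actually stored.
    return f"get_pages:abc123:{php_string_cast(set_return)}"

default_backend_key = key_from_set_return(True)  # backend returns true
void_backend_key = key_from_set_return(None)     # backend returns void
```

The two backends produce different keys for the same query, which is why a later run of the same query misses the cache.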

Resetting $last_changed to 1 is wrong. It is very unlikely, but if $last_changed got purged from the cache while the contents of the buckets remained (e.g. it was on a different memcache server, which failed), you might get a key which points to old data still sitting in the cache.
In other words, this way you increase the possibility of a key collision.

IMO it is better to use a random initial seed every time it needs to be calculated.
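A Python sketch of that suggestion - seeding the incrementor with a random value whenever it is missing, so freshly generated keys cannot line up with stale buckets (a sketch, not the actual patch code):

```python
import random

cache = {}

def last_changed():
    # If the incrementor is missing (purged, or a failed memcache
    # node), seed it with a fresh random value instead of resetting
    # to 1. A reset to 1 could reproduce keys that still point at
    # old buckets; a random seed makes that collision improbable.
    if "last_changed" not in cache:
        cache["last_changed"] = str(random.random())
    return cache["last_changed"]
```

Once seeded, the value stays stable for key generation until the next purge or explicit bump.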