Description

Status BaseScalarColumnReader::ReadDataPage() {
  // We're about to move to the next data page. The previous data page is
  // now complete, pass along the memory allocated for it.
  parent_->scratch_batch_->mem_pool()->AcquireData(decompressed_data_pool_.get(), false);

The decompression buffers acquired by the scratch batch's mem pool are in turn passed along with the row batch. This is safe but unnecessary in the many cases where the batch holds no pointers into the decompression buffer, for example when the column has only fixed-length data or when the data page is dictionary-encoded.
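
One possible shape for a fix is to hand the buffer to the batch only when the batch may actually point into it, and to free it right away otherwise. The sketch below is illustrative only and is not Impala code: SketchPool, PageInfo, BatchReferencesPageBuffer and ReleasePreviousPage are hypothetical names, and the rule "only plain-encoded variable-length pages leave pointers into the buffer" is an assumption taken from the two cases listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a memory pool. It only tracks chunk sizes.
struct SketchPool {
  std::vector<std::size_t> chunks;

  void Allocate(std::size_t bytes) { chunks.push_back(bytes); }

  // Moves every chunk from 'src' into this pool, mimicking an ownership transfer.
  void AcquireData(SketchPool* src) {
    assert(src != this);
    chunks.insert(chunks.end(), src->chunks.begin(), src->chunks.end());
    src->chunks.clear();
  }

  // Releases every chunk so the memory can be reused immediately.
  void FreeAll() { chunks.clear(); }
};

// Hypothetical description of the data page that was just completed.
struct PageInfo {
  bool var_len_type;  // e.g. strings, whose values may point into the page buffer
  bool dict_encoded;  // values are resolved through the dictionary instead
};

// Assumption: only plain-encoded variable-length pages leave pointers into the
// decompression buffer inside the batch.
bool BatchReferencesPageBuffer(const PageInfo& page) {
  return page.var_len_type && !page.dict_encoded;
}

// Called when moving to the next data page: transfer the previous page's
// buffer only if the batch may still reference it, otherwise free it.
void ReleasePreviousPage(const PageInfo& prev, SketchPool* decompressed_pool,
                         SketchPool* batch_pool) {
  if (BatchReferencesPageBuffer(prev)) {
    batch_pool->AcquireData(decompressed_pool);
  } else {
    decompressed_pool->FreeAll();
  }
}
```

With this shape, fixed-length and dictionary-encoded pages never attach anything to the batch, so downstream operators would have less memory to free.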

This can make problems like IMPALA-4923 worse than they would be otherwise because extra data is transferred across threads.

Issue Links

is related to

IMPALA-6054: Parquet dictionary pages should be freed on dictionary construction (Resolved)

relates to

IMPALA-4923: Operators running on top of selective Hdfs scan nodes spend a lot of time calling impala::MemPool::FreeAll on empty batches