This limitation is imposed by the Jetty server: the HTTP response header size is hard-coded to 64 KB in PHD 1.0.1, and 64 KB is the default setting in PHD 1.1.0 and later. When a PXF query attempts to read a table with a large number of columns, the HTTP response header grows larger than 64 KB and the Jetty server returns HTTP status code 413.

Fix

The fix is to increase http.header.size for the NameNode HTTP Jetty server to a value higher than the default of 64 KB.

PXF sends an HTTP JSON request to the NameNode. The payload includes all of the column names and data types. The data looks as follows.

The "X-GP-ATTR" prefixes of the three variables sent in the JSON packet are always the same. However, the TYPENAMEx and NAMEx values vary with the table's column names and data types, and the number of entries depends on how many columns the table has. We can safely estimate that PXF needs about 80 bytes per column for the metadata, including TYPECODEx and TYPENAMEx. So when determining the value for the HTTP header size, allow 80 bytes plus the number of bytes in the column name for each column, multiplied by the number of columns. Here is an example to help make sense of this.

Assume you have 1000 columns with names like col1, col2, col3, col4...

The largest column name is col1000 = 7 characters
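
The arithmetic for this example can be sketched in a few lines of Python. This is only a rough estimate under the assumptions stated above: about 80 bytes of metadata per column, with every column name counted at the length of the longest one, which gives an upper bound. The function and constant names are hypothetical and are not part of PXF or Jetty.

```python
# Rough upper-bound estimate of the header size PXF needs for a wide table.
# Assumption (from the text): ~80 bytes of metadata per column, plus the
# column name itself. Names below are illustrative only.
META_BYTES_PER_COLUMN = 80
DEFAULT_HEADER_BYTES = 64 * 1024  # the 64 KB default, in bytes


def estimate_header_bytes(num_columns, longest_name_len,
                          meta_bytes=META_BYTES_PER_COLUMN):
    """Return (metadata + longest name) bytes multiplied by the column count."""
    return num_columns * (meta_bytes + longest_name_len)


num_columns = 1000
longest_name_len = len("col1000")  # 7 characters

estimate = estimate_header_bytes(num_columns, longest_name_len)
print(estimate)                         # 87000
print(estimate > DEFAULT_HEADER_BYTES)  # True
```

With these numbers the estimate exceeds the 64 KB default, which matches the situation where the Jetty server returns status code 413.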

We can then calculate the size of the JSON payload given the following estimates plus the size of the longest column name