Past this point, we observe that the density estimate appears to be washed out by oversmoothing. The BLEU scores continue to improve until k = 500, but only because the generated captions become increasingly shorter.
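
To illustrate how shorter captions alone can raise BLEU, the following is a minimal sketch of a simplified unigram BLEU (BLEU-1) with the standard brevity penalty. The captions, the references, and the helper names (`modified_precision`, `brevity_penalty`, `bleu1`) are hypothetical and chosen only for illustration; they are not taken from the experiments. The sketch shows one plausible mechanism: a short caption contains fewer unmatched words, and the brevity penalty does not apply when at least one reference is equally short.

```python
from collections import Counter
import math


def modified_precision(candidate, references, n=1):
    """Clipped n-gram precision of a candidate against several references."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    if not cand:
        return 0.0
    # For each n-gram, keep the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        ref_counts = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        for gram, count in ref_counts.items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())


def brevity_penalty(cand_len, ref_lens):
    """Penalty relative to the closest reference length (ties go to the shorter)."""
    if cand_len == 0:
        return 0.0
    closest = min(ref_lens, key=lambda r: (abs(r - cand_len), r))
    return 1.0 if cand_len >= closest else math.exp(1 - closest / cand_len)


def bleu1(candidate, references):
    """Simplified BLEU using unigrams only (an assumption made for brevity)."""
    bp = brevity_penalty(len(candidate), [len(r) for r in references])
    return bp * modified_precision(candidate, references, n=1)


# Hypothetical reference captions and two hypothetical generated captions.
references = [
    "a man riding a wave on a surfboard".split(),
    "a surfer rides a wave".split(),
]
long_caption = "a man is surfing on a big blue wave near the beach".split()
short_caption = "a man riding a wave".split()

print("long :", round(bleu1(long_caption, references), 4))
print("short:", round(bleu1(short_caption, references), 4))
```

In this toy setting the short caption scores higher even though it conveys less, because every word it contains is matched by a reference and its length equals that of the shortest reference. Full BLEU also includes higher-order n-grams, which this sketch omits.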
