For a single X we have 784 different thresholds. To find the best threshold with respect to all the training data, we would have to examine 784 x 60000 thresholds, which we clearly can't cover.
Is there a specific way we should find our threshold, or would any logical approach be appropriate?
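
To make the counting concrete, here is a rough sketch of one possible approach, not necessarily the intended one. It assumes the features are 8-bit integer pixel values (0..255) and the labels are in {-1, +1}. In that case many of the 784 x 60000 values repeat, so each feature has at most 256 distinct thresholds, and a coarser grid can be searched instead. The function names `count_candidate_thresholds` and `best_stump` are hypothetical, and the data is a small synthetic stand-in rather than the real training set.

```python
import numpy as np

def count_candidate_thresholds(X):
    """Count distinct (feature, threshold) pairs: one per unique value in each column."""
    return sum(np.unique(X[:, j]).size for j in range(X.shape[1]))

def best_stump(X, y, thresholds):
    """Search every feature and every threshold in the grid for the lowest-error stump.

    A stump predicts polarity * (+1 if X[:, j] > t else -1).
    Labels y are assumed to be in {-1, +1}.
    """
    best = (None, None, 1, 1.0)  # (feature, threshold, polarity, error)
    for j in range(X.shape[1]):
        col = X[:, j]
        for t in thresholds:
            pred = np.where(col > t, 1, -1)
            err = np.mean(pred != y)
            # Flipping the polarity turns an error of err into 1 - err.
            for polarity, e in ((1, err), (-1, 1.0 - err)):
                if e < best[3]:
                    best = (j, t, polarity, e)
    return best

# Synthetic stand-in: 200 samples, 16 "pixels" with integer values 0..255.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(200, 16))
y = np.where(X[:, 3] > 100, 1, -1)  # labels depend only on feature 3

n_candidates = count_candidate_thresholds(X)
# Coarse grid (an assumption): every 8th intensity level instead of every observed value.
feature, threshold, polarity, error = best_stump(X, y, thresholds=range(0, 256, 8))
```

Under these assumptions the search costs (number of features) x (grid size) passes over the data, rather than one pass per observed value. Whether a coarse grid is acceptable here is exactly the open question.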