How does the system handle missing embeddings? Are the embeddings of the missing tokens randomly initialized at the start of training? If so, are both the missing and the non-missing embeddings updated during training?