They asked "why not?" The reason is that when the verb moves to the first position to form a question, the subject must come directly after the verb. So the sentence parses with "es" as the subject and "sie" as the object.