I think the locking/ordering of events in the multiple-queue design you're suggesting must be quite hard:
Say Window A is selected at the beginning.
a - A mouse event selects Window B.
b - A key is pressed.
Depending on whether a or b occurred first, the key press event must be sent to Window A or Window B.
So these queues must sometimes be synchronised carefully.
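
To make the ordering problem concrete, here is a rough sketch of one way it could be handled: every event gets a global sequence number on arrival, and the dispatcher merges the per-device queues by that number before routing. The names (Event, stamp, dispatch) are made up for illustration, and this is not a claim about how any real window server does it.

```python
import heapq
from dataclasses import dataclass
from itertools import count

_seq = count()  # hypothetical global arrival counter shared by all device queues


@dataclass
class Event:
    seq: int      # global arrival order
    kind: str     # "click" or "key"
    payload: str  # window selected by a click, or the key that was pressed


def stamp(kind, payload):
    """Tag an event with the next global sequence number as it arrives."""
    return Event(next(_seq), kind, payload)


def dispatch(mouse_queue, key_queue, focus):
    """Merge both queues in arrival order and route key presses to the focused window.

    Each queue is assumed to already be in arrival order, which holds if
    events are stamped as they are appended.
    """
    delivered = []
    for ev in heapq.merge(mouse_queue, key_queue, key=lambda e: e.seq):
        if ev.kind == "click":
            focus = ev.payload                       # focus changes first...
        else:
            delivered.append((ev.payload, focus))    # ...then the key is routed
    return delivered


# The click arrives before the key press, so the key goes to Window B.
click = stamp("click", "B")
key = stamp("key", "x")
print(dispatch([click], [key], "A"))  # [('x', 'B')]
```

The catch is that the shared counter is itself a synchronisation point between the device threads, so the multiple queues don't remove the coordination, they only move it.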

I wonder how BeOS worked, with one queue or with multiple queues?

While I don't know whether a separate thread and queue for each device is worthwhile, I think a separate thread in the GUI server for each client application makes sense, as it simplifies the multiplexing of clients: currently an application, even a low-priority one, may flood the X server and make higher-priority applications appear less responsive than they ought to be.
With a separate thread in the X server for each client app, the OS scheduler could perhaps solve this issue. Of course, there is still the risk of priority inversion caused by shared resources (and since the video card itself is shared, sharing can't truly be avoided).
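
A minimal sketch of the thread-per-client idea, with a single lock standing in for the shared video card. Everything here (serve_client, run_server, the "flooder" and "editor" clients) is hypothetical, and Python threads carry no priorities, so this only shows the structure and where the shared resource sits, not the scheduling behaviour.

```python
import queue
import threading


def serve_client(name, requests, card_lock, drawn):
    """Server-side thread for one client: drain its own queue until the None sentinel."""
    while True:
        req = requests.get()
        if req is None:
            break
        # The video card is the one resource every client thread shares. If a
        # low-priority thread holds this lock, a high-priority one still has
        # to wait for it, which is where priority inversion can creep in.
        with card_lock:
            drawn.append((name, req))


def run_server(clients):
    """clients maps a client name to the list of requests it sends."""
    card_lock = threading.Lock()
    drawn = []  # stand-in for the framebuffer: records (client, request)
    threads = []
    for name, reqs in clients.items():
        q = queue.Queue()
        for r in reqs:
            q.put(r)
        q.put(None)  # sentinel: this client is done
        t = threading.Thread(target=serve_client, args=(name, q, card_lock, drawn))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return drawn


result = run_server({"flooder": list(range(1000)), "editor": ["a", "b"]})
print(len(result))  # 1002
```

Each client's requests stay in order within its own queue, while the interleaving between clients is left entirely to the scheduler, which is the part a real server would want priorities for.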