The researchers said that the AI system was provided with whole sentences so that it could teach itself which letter corresponded to which lip movement.

To train the AI, the team – from Oxford University’s AI lab – fed it nearly 29,000 videos, labelled with the correct text. Each video was three seconds long and followed a similar grammatical pattern.

While human testers given similar videos had an error rate of 47.7%, the AI's error rate was just 6.6%.

The fact that the AI learned from specialist training videos led some on Twitter to criticise the research.

Writing in OpenReview, Neil Lawrence pointed out that the videos had “limited vocabulary and a single syntax grammar”.

“While it’s promising to perform well on this data, it’s not really groundbreaking. While the model may be able to read my lips better than a human, it can only do so when I say a meaningless list of words from a highly constrained vocabulary in a specific order,” he writes.