Adobe prototypes 'Photoshop for audio'

Project VoCo is a prototype with the ability to synthesise new speech, effectively rewriting what a speaker has said.


Adobe uses its annual MAX event to give a sneak peek at some of the projects its teams are working on in the background: some wild and fanciful, others that could indeed see the light of day at some point in the future.

At Adobe MAX 2016, amongst the sneak previews was Project VoCo, which Adobe developer Zeyu Jin described as doing for audio what Photoshop does for photography – speech editing capabilities that even include adding words that did not originally appear in the audio file.

The prototype can rewrite speech, adding new words in the speaker's voice

Demonstrating the software, Jin took a clip of speech and, simply by typing new text into an edit box, was able to add that text to the speech in the same voice. In other words, he 'redubbed' what the speaker had actually said.

Jin says that the engine needs around 20 minutes of recorded speech to be able to accurately add new words to an audio clip. That is an impressively short time given that the result, in the demo, was a remarkably accurate representation of the recorded speaker's voice.

While this raises alarm bells about the ability to change facts after the event (an ethical minefield already well-trodden by some using Photoshop's photo editing features), it could also be an incredibly useful tool for podcasters and audiobook creators, allowing them to make audio edits in post-production without having to pay a voice actor for re-records.

But let's not get ahead of ourselves – this is an early prototype shown at Adobe Sneaks, and whether it is part of future product plans, only Adobe knows…