
Adobe Voco

Adobe Voco is an unreleased prototype software by Adobe that enables novel editing and generation of audio. Dubbed "Photoshop-for-voice",[1] it was first previewed at the Adobe MAX event in November 2016. The technology shown at Adobe MAX was a preview that could potentially be incorporated into Adobe Creative Cloud. It was later revealed that Voco was never meant to be released and was intended only as a research prototype.[2][3]

In 2023, Adobe introduced the ability to edit video by editing an AI-generated transcript of the video in Premiere Pro, demonstrating similar functionality to Voco.[4]

Technical details

As the demo showed, the software takes approximately 20 minutes of the desired target's speech and generates a sound-alike voice including phonemes that were not present in the target example material. Adobe stated Voco would lower the cost of audio production.[1][3]

Concerns

Ethical and security concerns were raised over the ability to alter an audio recording to include words and phrases the original speaker never spoke, and the potential risk to voiceprint biometrics.[1]

Concerns were also raised that it could be used in conjunction with other technologies.

Alternatives

Adobe's lack of publicized progress opened opportunities for other projects to build alternatives to Voco, such as Resemble AI and 15.ai, a real-time text-to-speech tool using artificial intelligence.

WaveNet is a similar but open-source research project at London-based artificial intelligence firm DeepMind, developed independently around the same time as Adobe Voco.

References

  1. ^ a b c "Adobe Voco 'Photoshop-for-voice' causes concern". BBC.com. BBC. 2016-11-07. Retrieved 2016-07-05.
  2. ^ "Beta Testing #VoCo". 8 November 2016.
  3. ^ a b "Is Adobe VoCo dead ?". Adobe Blog. 2018-01-27. Retrieved 2020-06-17.
  4. ^ "Now in Beta: Introducing Text-Based Editing in Premiere Pro". community.adobe.com. 2023-02-03. Retrieved 2023-04-16.
  5. ^ Rodgers, Julian. "Adobe Voco - Should We Be Afraid?". Production Expert. Pro Tools. Retrieved 14 December 2018.
  6. ^ Thies, Justus (2016). "Face2Face: Real-time Face Capture and Reenactment of RGB Videos". Proc. Computer Vision and Pattern Recognition (CVPR), IEEE. Retrieved 2016-06-18.

