Creepy AI reconstructs your portrait based only on your voice

Apr 5, 2022

Dunja Djudjic

Dunja Djudjic is a multi-talented artist based in Novi Sad, Serbia. With 15 years of experience as a photographer, she specializes in capturing the beauty of nature, travel, and fine art. In addition to her photography, Dunja also expresses her creativity through writing, embroidery, and jewelry making.

Turning speech into text has become so common that it’s a part of almost every smartphone. But have you ever thought about turning your speech into a portrait? Researchers have, and they’ve even made it possible.

Artificial intelligence scientists at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have created an AI that turns short snippets of recorded speech into a human face. As if this weren’t both stunning and creepy enough, the results are actually fairly accurate, too!

The CSAIL researchers published a paper about their invention back in 2019. It’s an algorithm called, not surprisingly, Speech2Face, and the name says it all. In the demo, you can take a peek at how it works and what the results are. At the very top of the page, you’ll hear audio snippets of different people speaking. Their real photos are there just for your reference; Speech2Face recreated each portrait based only on a three-second recording of the person’s voice.

Interestingly enough, the AI seems to be working better when the audio clips are longer. The researchers have shared some examples of faces recreated from three versus six seconds of speech.

Of course, the results are far from perfect, but they’re amazing and eerily accurate nonetheless. Even so, the AI sometimes completely misses the point and mixes up the gender, age, or ethnicity of the subject.

Privacy concerns

Even though the algorithm was created for scientific purposes only, the question of privacy has been raised. The team claims that their method “cannot recover the true identity of a person from their voice,” i.e., it cannot recreate an exact image of their face.

“This is because our model is trained to capture visual features (related to age, gender, etc.) that are common to many individuals, and only in cases where there is strong enough evidence to connect those visual features with vocal/speech attributes in the data (see “voice-face correlations” below). As such, the model will only produce average-looking faces, with characteristic visual features that are correlated with the input speech. It will not produce images of specific individuals.”

However, if the algorithm becomes so sophisticated that it can recreate super-realistic faces, what impact could it have? The first thought that comes to my mind is that technology like this could be of immense help to police officers and detectives… Or I’m just watching too many crime TV shows. On the other hand, it could have a negative impact on YouTube and TikTok stars who try to keep their private lives from their followers by doing only voiceovers and never appearing in front of the camera. But like every technology, I guess this one could be super-useful in good hands, and dangerous in bad ones.

[via PetaPixel]


9 responses to “Creepy AI reconstructs your portrait based only on your voice”

  1. beachmike

    When testing Speech2Face on demented Beijing Biden, the system drew a bowl of mashed potatoes. It was deemed a success!

    1. Austin

      Ha! Stale mashed potatoes

  2. tyretes

    I thought we would be having flying cars..

  3. Robert93

    Another bogus development from the media lab.
    Hyperware that never turns into anything really useful.

  4. Deadpool

    The reconstructed face looks like Luka. Hehe

  5. bgg1

    One of the failures produced the exact picture of one of the successes. Seems to me that they just have a pool of generic faces that they pick from to match, rather than reconstructing the face from some clues that they get from the voice.

  6. John Beatty

    I tried Trump and it showed Putin.

  7. J.J

    This could be beneficial for the police that are trying to solve certain cases. Right away I thought of the Delphi Murders.

  8. Nadya De’Lasoul Davis

    How many people use their voice as their password, like when calling the bank and many other companies? Not so crazy about this.