It’s official: Midjourney used a “hundred million” images without permission to train its AI

Dec 21, 2022

Dunja Djudjic

Dunja Djudjic is a multi-talented artist based in Novi Sad, Serbia. With 15 years of experience as a photographer, she specializes in capturing the beauty of nature, travel, and fine art. In addition to her photography, Dunja also expresses her creativity through writing, embroidery, and jewelry making.


David Holz, the founder of Midjourney, recently admitted something we’ve already assumed: the company’s AI was trained on hundreds of millions of images without consent from their authors. This revelation has sparked outrage among both artists and privacy advocates. It has raised concerns about the ethical implications of such actions, as well as copyright issues that might emerge.

Back in September, Holz gave an interview to Forbes, revealing that his company didn’t seek consent from artists when using their work for AI training. He justified it by saying there is no way to “get a hundred million images and know where they’re coming from.”

“It would be cool if images had metadata embedded in them about the copyright owner or something. But that’s not a thing; there’s not a registry. There’s no way to find a picture on the Internet, and then automatically trace it to an owner and then have any way of doing anything to authenticate it.”

If this makes you furious, wait, there’s more. Forbes asked Holz if artists can opt out of being included in Midjourney’s training data, or of being named in prompts. Can you guess his answers? You’re right, it’s a hard no on both, at least for now. Holz said that the company is “looking at that,” adding that “the challenge now is finding out what the rules are, and how to figure out if a person is really the artist of a particular work or just putting their name on it.” The only thing you can do for now is check whether your art was used to train AI, and the tool for that is a standalone one that has nothing to do with Midjourney.

Forbes discussed the future of art that’s certainly already being impacted by Midjourney and other AI image generators. Will AI art destroy artists’ livelihoods? Holz sees two possible scenarios:

“I think there’s kind of two ways this could go. One way is to try to provide the same level of content that people consume at a lower price, right? And the other way to go about it is to build wildly better content at the prices that we’re already willing to spend. I find that most people, if they’re already spending money, and you have the choice between wildly better content or cheaper content, actually choose wildly better content. The market has already established a price that people are willing to pay.”

I agree up to a point: I’d always rather pay a living being, an artist, to create something of great quality. However, many people would rather pay less for mediocre or below-average work than pay a higher price for high-quality work. There are still those willing to pay for excellent work – but I feel like their number is decreasing. And what are we going to do if AI-generated art becomes better, as I’m sure it will? In this case, there’s double damage.

First, all those artists whose work was used for AI training will go uncompensated for their work’s contribution to the development of these tools. And second, many more artists won’t get gigs even if they undercharge for their work, as there will always be an AI tool ready to do it for even less, or for free.

While I’m excited and intrigued to see text-to-image and other AI technology expand, I’m trying to curb my enthusiasm and stay aware of the potential harm that it could do. I’m not saying there aren’t good sides to Midjourney and other text-to-image generators, and I sure enjoy playing around with them. But using a hundred million images without consent isn’t one of the good sides, I’d say. It opens up a Pandora’s Box of various copyright issues, and we have yet to discover what will come out of it.

[via PetaPixel]


20 responses to “It’s official: Midjourney used a “hundred million” images without permission to train its AI”

  1. Michael Chastain

    “How dare anybody look at the image I specifically posted publicly.”

    I’m sure you explicitly got permission from every artist you’ve learned from over the years before even looking at their work, right?

    1. Markus

      Right because AI is the same as a human being. A program designed to consume millions of images and output similar artwork in an instant to make a profit should be given the same considerations as a person.

      1. Michael Chastain

        “Right because AI is the same as a human being. ”

        Tell me specifically how it’s different in a meaningful way.

        “A program designed to consume millions of images and output similar artwork in an instant”

        Ah… so it’s OK if it takes years to be affected by everything you’ve ever seen in your life. What you’re pissed off about is that computers learn faster than you.

        “to make a profit ”

        So humans have to get permission for everything they’ve ever seen before they can charge for their art? There doesn’t seem to be any consistency in your argument.

        1. Lunatic Lawyer

          Genetic engineering springs to mind.

          How is producing a genetically modified organism that specifically matches certain features using e. g. the CRISPR/Cas method any different to breeding and selection? The latter could – in one or two eternities – achieve the same as the former. So let’s not put any restrictions on genetic engineering, right? Some people indeed think that to be right. Most disagree.

          Discussion necessary on how the implicit copyright of images published on the internet (just because something is freely accessible via a browser doesn’t mean anyone can do anything he likes with it) has to be interpreted.

          Let’s take CC BY 4.0 for example, which is a pretty liberal license. Quote: “Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made”. Good luck with that, Mr. Holz!

          1. Michael Chastain

            “So let’s not put any restrictions on genetic engineering, right? Some people indeed think that to be right. Most disagree.”

            We have all kinds of restrictions on genetic engineering. Meanwhile AI learning is being actively encouraged. A wise person would recognize that experts see massive differences between those two areas, even if they were incapable of seeing them themselves.

            “Discussion necessary on how the implicit copyright of images published on the internet (just because something is freely accessible via a browser doesn’t mean anyone can do anything he likes with it) has to be interpreted”

            We have tons of precedent on copyright law. Nobody is saying you can do anything with it. But learning from images and mimicking style are both explicitly protected.

            “”Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made”. Good luck with that, Mr. Holz!”

            Considering it’s not reusing those original images, I suspect they will have very good luck indeed.

  2. M Hector

    I imagine that the AI algorithm could include a list of all the original filenames or some identifying information, even in the absence of EXIF metadata identifying content authors. This is not an impossible problem to solve.

  3. Bart Ros Fotografie – Fotograaf Deventer & Overijssel

    *it’s AI

  4. Tomasz Staśko

    So what? I also look at many, many images all over the Internet to learn different styles, new techniques and so on. Without any consent. The main difference is that AI does it much faster than I do… and still cannot correctly reproduce, for example, hands or posture.
    To all artists, or maybe.. “artists” – ADAPT OR DIE.

    1. Saša Šijak

      Exactly. What is the difference between an artist looking at other artists’ work, learning the style and techniques, and then selling artwork influenced by others, and AI tools learning by analysing images and abstracting that learning into some vague model?

  5. Balazs Vihari

    Attila Földes

  6. Lunatic Lawyer

    That is very cute but also kinda logical in an ASD way of thinking: “Doing it ethically and legally would be hard and complicated, so let’s just do it secretly and criminally!”. The American Way of Life: Better to ask for forgiveness than permission.

    The financial aspects of the emerging shitstorm could break Midjourney’s/David Holz’s neck. Rightfully so.

    1. Michael Chastain

      “so let’s just do it secretly and criminally!”

      If by secretly you mean clearly stating it in a major national interview. As for “criminally”, feel free to sue. Surely you’ll admit you’re wrong if you lose, right?

      1. Lunatic Lawyer

        Oh the semantics! Admitting to have done something wrong (violating copyright licenses for that matter) without telling anybody until explicitly being asked qualifies as “secretly” in my book. But at least that’s way better than trying to lie about it until there is overwhelming evidence. Kudos to Holz here. It has been done before (see Facebook), it will be done again.

        I have not been affected by Midjourney’s scraping of the Internet and even if I were, I wouldn’t care much. I’m not doing any business with my photography and I’m aware of the fact that the Internet is the new Wild West.

        But people doing business with their images who eventually find out that their license terms have been violated by a company doing business with a model trained on their material might be pissed. They might be tempted to sue.

        Then again losing the lawsuit has no correlation to something being “wrong” whatsoever. Not in technical terms and definitely not from an ethical perspective.

        I know that, you know that, everyone knows that.

        1. Michael Chastain

          “Admitting to have done something wrong”

          Sure, if what he admitted to is ACTUALLY wrong, rather than just your ignorant opinion of what you think SHOULD be wrong, then it will be trivial for somebody to win a lawsuit. When that happens, feel free to come back and take your victory lap. If that doesn’t happen, I’m sure you’ll admit you were wrong. Right?

          1. Lunatic Lawyer
          2. Lunatic Lawyer

            Oh – and let’s not forget about:
            https://arxiv.org/abs/2301.13188
            Extracting (almost) exact training data from the Stable Diffusion AI model.

  7. Martin Gillette

    I read the copyright rules a few years back and as far as I can remember there wouldn’t be any rules being broken here. You can definitely use other people’s works for training purposes. This looks like the computer gets trained to create from many different examples of photos being programmed into it. I try to create something with words. The computer is going to generate an image from millions of examples. It isn’t going to spit out someone else’s photo. I see no permission being needed.

  8. Photography by Joshua McTackett

    Adapt or die

  9. James Husted

    I imagine that there should be a ton of pissed off artists because of Archive.org, because they have whole copies of the artists’ websites, not just the pictures. And also Google and every search engine out there scrapes the web the same way, or you wouldn’t show up in an image search on their site either.
    As for copying the art, until they can show a copy of their art being used in an AI-generated image, they have no arguments. This is not collage, it is a new image. As far as copying styles goes, this has been done throughout the history of art. When you copy an image, it is forgery; when you copy a style, it is being non-creative.

  10. Valhalla

    Sites like Google/Flickr have an option to only show CC images (also many CC sites) and there are billions of CC images when all sources are combined, they’re just not as good as paid art. THAT’S why they illegally scrape high quality art and use the Metadata defense, I just made a meme.