It might be a good time to study law, or at least copyright law. Researchers have released a study showing that AI image generators can, and do, copy existing images that they ‘looked at’ during the machine learning process. This means there is a chance that anything spat out by the software could be an exact replica of a copyrighted image.
This undercuts the popular argument that AI models learn no differently from humans, and that everything they sample is merely ‘inspiration’ for creating something new. According to the study, that is not always the case. Although such copying is relatively rare at the moment, the researchers predict that it could become a greater problem over time.
Models such as Stable Diffusion are trained on copyrighted, trademarked, private, and sensitive images.
Yet, our new paper shows that diffusion models memorize images from their training data and emit them at generation time.
— Eric Wallace (@Eric_Wallace_) January 31, 2023
The study was carried out by researchers at Google and DeepMind, alongside Berkeley, Cornell, and Princeton universities. It found that both Google’s Imagen and Stable Diffusion were capable of directly copying sensitive and copyrighted images used during the training process.
Eric Wallace, a PhD student at UC Berkeley who worked on the study, told Gizmodo that occurrences of image duplication are currently rare, at a rate of just 0.3%. However, the research suggests that as AI systems become bigger and more powerful, that percentage is likely to increase.
“Maybe in next year, whatever new model comes out that’s a lot bigger and a lot more powerful, then potentially these kinds of memorization risks would be a lot higher than they are now,” Wallace says.
We humans take a dim view of people who copy other people’s work verbatim. But according to this study, AI is more than capable of doing just that, and its makers could end up in a bit of a pickle if it leads to plagiarism on a grand scale, especially as these diffusion-based image generators become more ubiquitous.
We are already seeing some of the stock image giants, such as Getty Images, filing lawsuits against the makers of Stable Diffusion and others. If this kind of duplication persists, there may well be grounds for many more lawsuits in the future.