Your public Instagram and Facebook posts trained Meta’s AI assistant
Oct 2, 2023

Meta has revealed that it used public posts from Facebook and Instagram to train its new Meta AI virtual assistant. In a move to retain trust, the company emphasized that it excluded private posts shared only with family and friends to safeguard consumers’ privacy.
Meta also made it clear that private chats on its messaging services were not used as training data for the AI model. The company took additional steps to filter out private details from public datasets used during the training process.
Meta’s President of Global Affairs, Nick Clegg, stated, “We’ve tried to exclude datasets that have a heavy preponderance of personal information.” He highlighted that the “vast majority” of the data used for training was publicly available. He also named LinkedIn as an example of a website Meta deliberately chose not to use due to privacy concerns.
Internet scraping of data
These developments come amid increasing criticism of tech companies for using internet-scraped information without permission to train their AI models. These models process vast amounts of data to summarize information and generate content like images.
Concerns have arisen regarding the use of private or copyrighted materials during this process, potentially leading to copyright infringement lawsuits. Meta’s new AI assistant, Meta AI, was unveiled at the company’s annual Connect conference. This year’s event mostly centred around artificial intelligence.
Meta AI is capable of generating text, audio, and imagery. It has real-time information access through a partnership with Microsoft’s Bing search engine.
Public posts
The training data for Meta AI included public Facebook and Instagram posts encompassing both text and photos. These posts were used to train image generation, while the chat functions were supplemented with publicly available and annotated datasets.
Copyright questions
Regarding copyrighted materials, Clegg acknowledged the potential for litigation to determine whether creative content is covered by the existing fair use doctrine. Fair use permits limited use of protected works for purposes such as commentary, research, and parody. He stated, “We think it is, but I strongly suspect that’s going to play out in litigation.”
Perhaps it would have been better to offer users on the platforms an opt-out option. After all, Meta still does not own the copyright for those images. OpenAI has just this week announced that it is letting artists opt out of training data. However, the process is so convoluted that it may as well not have bothered.
It’s an interesting problem, and one that we will continue to see play out as copyrighted material is used to train AI models.
[Via Reuters]
Alex Baker
Alex Baker is a portrait- and lifestyle-driven photographer based in Valencia, Spain. She works on a range of projects from commercial to fine art and has had work featured in publications such as The Daily Mail, Conde Nast Traveller and El Mundo, and has exhibited work across Europe.





































10 responses to “Your public Instagram and Facebook posts trained Meta’s AI assistant”
Company known for poorly handling sensitive customer data, uses customer data to its own ends. News at 11.
of course it did 😁
So what? 🙄
So Twitter teaching Tesla AI haha
You *did* read the terms you agreed to when signing up?
Good news.
Yet another post that has nothing to do with DIY photography. But, I’ll give the real reason for concern. The problem is the AI will over use Left Wing views to answer questions.
Everything on the internet trains the AI…..also IRL…
Duh
We are all guinea pigs))