Meta reignites plans to train AI using UK users' public Facebook and Instagram posts
Meta has confirmed that it's restarting efforts to train its AI systems using public Facebook and Instagram posts from its U.K. userbase.
The company claims it has "incorporated regulatory feedback" into a revised "opt-out" approach to ensure that it's "even more transparent," as its blog post spins it. It is also seeking to paint the move as enabling its generative AI models to "reflect British culture, history, and idiom." But it's less clear what exactly is different about its latest data grab.
From next week, Meta said U.K. users will start to see in-app notifications explaining what it's doing. The company then plans to start using public content to train its AI in the coming months -- or at least do training on data where a user has not actively objected via the process Meta provides.
The announcement comes three months after Facebook's parent company paused its plans due to regulatory pressure in the U.K., with the Information Commissioner’s Office (ICO) raising concerns over how Meta might use U.K. user data to train its generative AI algorithms -- and how it was going about gaining people's consent. The Irish Data Protection Commission, Meta’s lead privacy regulator in the European Union (EU), also objected to Meta's plans after receiving feedback from several data protection authorities across the bloc -- there is no word yet on when, or if, Meta will restart its AI training efforts in the EU.
For context, Meta has been boosting its AI off user-generated content in markets such as the U.S. for some time but Europe’s comprehensive privacy regulations have created challenges for it -- and for other tech companies -- looking to expand their training datasets in this way.
Despite the existence of EU privacy laws, back in May Meta began notifying users in the region of an upcoming privacy policy change, saying that it would begin using content from comments, interactions with companies, status updates, and photos and their associated captions for AI training. The reasons for doing so, it argued, was that it needed to reflect "the diverse languages, geography and cultural references of the people in Europe."
The changes were due to come into effect on June 26 but Meta's announcement spurred privacy rights nonprofit noyb (aka “none of your business”) to file a dozen complaints with constituent EU countries, arguing that Meta was contravening various aspects of the bloc's General Data Protection Regulation (GDPR) -- the legal framework which underpins EU Member States' national privacy laws (and also, still, the U.K.'s Data Protection Act).
The complaints targeted Meta's use of an opt-in mechanism to authorize the processing versus an opt-out -- arguing users should be asked their permission first, rather than having to take action to refuse a novel use of their information. Meta has said it's relying on a legal basis set out in the GDPR that's called "legitimate interest" (LI). It therefore contends its actions comply with the rules despite privacy experts' doubts that LI is an appropriate basis for such a use of people's data.
Meta has sought to rely on this legal basis before to try to justify processing European users’ information for microtargeted advertising. However, last year the Court of Justice of the European Union ruled it couldn’t be used in that scenario, which raises doubts about Meta's bid to push AI training through the LI keyhole too.
That Meta has elected to kickstart its plans in the U.K., rather than the EU, is telling though, given that the U.K. is no longer part of the European Union. While U.K. data protection law does remain based on the GDPR, the ICO itself is no longer part of the same regulatory enforcement club and often pulls its punches on enforcement. U.K. lawmakers also recently toyed with deregulating the domestic privacy regime.
Opt-out objections
One of the many bones of contention over Meta's approach the first time around was the process it provided for Facebook and Instagram users to "opt-out" of their information being used to train its AIs.
Rather than giving people a straight "opt-in/out" check-box, the company made users jump through hoops to find an objection form hidden behind multiple clicks or taps, at which point they were forced to state why they didn't want their data to be processed. They were also informed that it is entirely at Meta’s discretion as to whether this request would be honored. Although the company claimed publicly that it would honor each request.
This time around, Meta is sticking with the objection form approach, meaning users will still have to formally apply to Meta to let them know that they don't want their data used to improve its AI systems. Those who have previously objected won't have to resubmit their objections, per Meta. But the company says it has made the objection form simpler this time around, incorporating feedback from the ICO. Although it hasn't yet explained how it's simpler. So, for now, all we have is Meta's claim that the process is easier.
Stephen Almond, ICO director of technology and innovation, said that it will "monitor the situation" as Meta moves forward with its plans to use U.K. data for AI model training.
"It is for Meta to ensure and demonstrate ongoing compliance with data protection law," Almond said in a statement. "We have been clear that any organisation using its users’ information to train generative AI models [needs] to be transparent about how people’s data is being used. Organisations should follow our guidance and put effective safeguards in place before they start using personal data for model training, including providing a clear and simple route for users to object to the processing."