AI Watermarking Won't Curb Disinformation

Deeplinks 2024-01-05


Generative AI allows people to produce piles upon piles of images and words very quickly. It would be nice if there were some way to reliably distinguish AI-generated content from human-generated content. It would help people avoid endlessly arguing with bots online, or believing what a fake image purports to show. One common proposal is that big companies should incorporate watermarks into the outputs of their AIs. For instance, this could involve taking an image and subtly changing many pixels in a way that’s undetectable to the eye but detectable to a computer program. Or it could involve swapping words for synonyms in a predictable way so that the meaning is unchanged, but a program could readily determine the text was generated by an AI.

Unfortunately, watermarking schemes are unlikely to work. So far most have proven easy to remove, and it’s likely that future schemes will have similar problems.

One kind of watermark is already common for digital images. Stock image sites often overlay text on an image that renders it mostly useless for publication. This kind of watermark is visible and is slightly challenging to remove since it requires some photo editing skills.

Images can also have metadata attached by a camera or image processing program, including information like the date, time, and location a photograph was taken, the camera settings, or the creator of an image. This metadata is unobtrusive but can be readily viewed with common programs. It’s also easily removed from a file. For instance, social media sites often automatically remove metadata when people upload images, both to prevent people from accidentally revealing their location and simply to save storage space.

A useful watermark for AI images would need two properties: 

  • It would need to continue to be detectable after an image is cropped, rotated, or edited in various ways (robustness). 
  • It couldn’t be conspicuous like the watermark on stock image samples, because the resulting images wouldn’t be of much use to anybody.

One simple technique is to manipulate the least perceptible bits of an image. For instance, to a human viewer these two squares are the same shade:

But to a computer it’s obvious that they are different by a single bit: #93c47d vs 93c57d. Each pixel of an image is represented by a certain number of bits, and some of them make more of a perceptual difference than others. By manipulating those least-important bits, a watermarking program can create a pattern that viewers won’t see, but a watermarking-detecting program will. If that pattern repeats across the whole image, the watermark is even robust to cropping. However, this method has one clear flaw: rotating or resizing the image is likely to accidentally destroy the watermark.

There are more sophisticated watermarking proposals that are robust to a wider variety of common edits. However, proposals for AI watermarking must pass a tougher challenge. They must be robust against someone who knows about the watermark and wants to eliminate it. The person who wants to remove a watermark isn’t limited to common edits, but can directly manipulate the image file. For instance, if a watermark is encoded in the least important bits of an image, someone could remove it by simply setting all the least important bits to 0, or to a random value (1 or 0), or to a value automatically predicted based on neighboring pixels. Just like adding a watermark, removing a watermark this way gives an image that looks basically identical to the original, at least to a human eye.

Coming at the problem from the opposite direction, some companies are working on ways to prove that an image came from a camera (“content authenticity”). Rather than marking AI generated images, they add metadata to camera-


From feeds:

Fair Use Tracker » Deeplinks
CLS / ROC » Deeplinks


& machine learning intelligence artificial


Jacob Hoffman-Andrews

Date tagged:

01/05/2024, 19:31

Date published:

01/05/2024, 13:46