COPPA and Signaling

Tech @ FTC 2013-03-21

As has been widely reported, the FTC recently amended its COPPA Rule enforcing the Children’s Online Privacy Protection Act. There’s a lot to be said about the new amendments to the Rule—indeed, a lot is being said—but as this is the FTC Tech Blog, I’m going to restrict my comments to technical aspects. Today, I’m going to talk about signaling—the way that a website can signal its COPPA status to the operators of other sites who provide it with some of the content that users see.

If you run a simple website, complying with COPPA is reasonably straightforward. If you’re covered, that is, if you have actual knowledge that a child is using your site or if your content is directed towards children younger than 13, you must get parents’ permission before collecting personal information from kids. (N.B. Please see the formal rules to learn who is covered and for the precise definition of a “website or online service directed to children.” The Federal Register notice with the new rules is 167 pages of PDF; I’m not going to try to interpret or even summarize all that text. And ask your lawyers, not your computer scientists.) However, many commercial websites contain content from multiple sources: ad networks, third-party plug-ins, etc. Who should be responsible for their COPPA compliance?

The announcement of the amended Rule makes this very clear: “The definition of an operator has been updated to make clear that the Rule covers a child-directed site or service that integrates outside services, such as plug-ins or advertising networks, that collect personal information from its visitors.” If it’s on your site, you’re responsible—period.

The announcement also says that “the definition of a website or online service directed to children is expanded to include plug-ins or ad networks that have actual knowledge that they are collecting personal information through a child-directed website or online service.” How can a plug-in “have actual knowledge” that it is on a child-oriented site?

To answer that question (and to return to purely technical matters), we have to take a deeper look at how a website is constructed. When a user types a URL into a browser or perhaps clicks on a link on some other site, the browser contacts the site to retrieve an HTML (Hypertext Markup Language) file. That HTML file, in turn, can contain pointers to other content necessary to render the page: style sheets, images, IFRAMEs (mini-webpages embedded in a larger one) and more. The user’s browser, not the website, then fetches these additional resources; those that are themselves HTML, such as IFRAMEs, can in turn contain other embedded content.
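To make the fetching step concrete, here is a minimal Python sketch of the first thing a browser does with an HTML file: find the pointers to embedded content that it will have to fetch itself. The sample page, the hostname ads.example.net, and the short list of tags are my own illustrative assumptions; a real browser handles many more cases.

```python
from html.parser import HTMLParser

# Tags that pull in embedded content, and the attribute holding the pointer.
# This short list is an assumption for illustration, not a complete inventory.
EMBED_ATTRS = {"img": "src", "iframe": "src", "script": "src", "link": "href"}


class EmbedCollector(HTMLParser):
    """Collect (tag, url) pairs for content the browser would fetch."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = EMBED_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append((tag, value))


def embedded_resources(html):
    collector = EmbedCollector()
    collector.feed(html)
    collector.close()
    return collector.urls


# Hypothetical page: one style sheet, one image, one third-party IFRAME.
page = """<html><head><link rel="stylesheet" href="/site.css"></head>
<body><img src="/logo.png">
<iframe src="http://ads.example.net/slot1"></iframe></body></html>"""

for tag, url in embedded_resources(page):
    print(tag, url)
```

Note that the IFRAME points at a different host. The embedding site never contacts that host; the user's browser does, which is why the third party learns about the embedding page only from whatever the browser sends along.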

In many instances, a plug-in or other site or service offering up content will not know everywhere that it has been embedded, nor can it easily control or prevent embedding. A site may receive a Referer: header, but sending those headers is optional. (In fact, some browsers let you disable sending them.) If one is present, it may be from yet another party; often, references in the original HTML file point to, say, an ad network, which in turn points to the actual ad. But let’s assume that there’s a genuine Referer: header that really mentions the COPPA-covered site. Does the embedded site then “have actual knowledge”? That’s not likely without further information.
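As a rough illustration of how little the embedded site has to go on, here is a Python sketch of what it can extract from a request's headers. The function name and the example hostnames are hypothetical. The point is that the answer may be "nothing at all," and that even a present value names only the immediately preceding hop.

```python
from urllib.parse import urlsplit


def referring_host(headers):
    """Return the host named in the Referer header, or None if absent.

    `headers` is assumed to be a plain dict of header name -> value.
    """
    referer = None
    for name, value in headers.items():
        if name.lower() == "referer":  # header names are case-insensitive
            referer = value
            break
    if not referer:
        return None
    return urlsplit(referer).hostname or None


# The browser sent a Referer: we learn one hop back, nothing more.
print(referring_host({"Referer": "http://adnet.example.net/slot?id=7"}))
# The browser (or the user's settings) suppressed it: we learn nothing.
print(referring_host({"User-Agent": "ExampleBrowser/1.0"}))
```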

We can resolve this problem if there is explicit signaling from the embedding web page to the plug-in or other included content. This could be accomplished by a joint effort of industry members. Indeed, such signaling is already in place for other purposes; ad networks generally prescribe how to request ads that are relevant to the page on which they’re being displayed. Here’s a random example I stumbled on recently in a news article about North Korea:

http://ad.doubleclick.net/adj/trb.chicagotribune/news;;ptype=s;slug=sns-rt-us-korea-north-touristbre8bk10z-20121221;rg=ur;ref=chicagotribunecom;pos=T;dcopt=ist;sz=728x90;tile=1;ca=CrimeLawandJustice;en=SeoulSouthKorea;at=CrimeLawandJustice;at=SeoulSouthKorea;at=PhysicalFitnessandExercise;at=CivilRights;at=PyongyangNorthKorea;u=sz%7C728x90%21;ord=84919093?

Quite obviously, DoubleClick is being passed information about the website, the article name, the countries involved, and various keywords relevant to the topic. It would be no stretch at all to include a “COPPA-covered site” flag as well.
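To show how easily a receiving server could pick such a flag out of a request, here is a Python sketch that splits a shortened version of the URL above into its semicolon-separated key/value pairs. The coppa=1 parameter is purely hypothetical: I added it for illustration, and it is not a real DoubleClick parameter as far as this post is concerned.

```python
from urllib.parse import urlsplit, unquote


def parse_ad_request(url):
    """Split an ad URL of the form /zone;key=value;key=value;... into parts.

    Returns (zone, params), where params maps each key to a list of values,
    since keys such as `at` may repeat.
    """
    path = urlsplit(url).path
    zone, *pairs = path.split(";")
    params = {}
    for pair in pairs:
        if "=" not in pair:
            continue  # skip the empty field produced by ";;"
        key, value = pair.split("=", 1)
        params.setdefault(key, []).append(unquote(value))
    return zone, params


# Shortened from the example above, with a HYPOTHETICAL coppa=1 flag added.
url = ("http://ad.doubleclick.net/adj/trb.chicagotribune/news;;ptype=s;"
       "at=CivilRights;at=PyongyangNorthKorea;coppa=1;ord=84919093?")

zone, params = parse_ad_request(url)
covered = params.get("coppa") == ["1"]
print(zone, params["at"], covered)
```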

We could do better than this if there were a formal standard or agreed-upon convention, though. What might one look like? It has to be something in the URL, since that’s the only thing that a browser will understand and be able to pass on. While it’s possible to put the signal into the hostname or username/password sections of the URL, those are awkward. A better idea is to put the signal into the path. Thus, an IFRAME directive from a COPPA-covered site might start something like this:

A more general form, perhaps to be adopted by the W3C, might look something like this:

http://hostname/__RESTRICT:US-COPPA13,EU-PRIVACY,etc/…
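Here is a minimal Python sketch of how a server receiving such a URL might separate the signal from the rest of the path. It assumes the convention sketched above: a first path segment beginning with __RESTRICT: followed by comma-separated flags. Nothing here is a standard; the function name and the host widgets.example.com are made up for the example.

```python
from urllib.parse import urlsplit

PREFIX = "__RESTRICT:"  # assumed marker, taken from the form sketched above


def split_restrictions(url):
    """Return (flags, remaining_path) for a URL using the sketched convention.

    If the first path segment does not carry the marker, the flag set is
    empty and the path is returned unchanged.
    """
    path = urlsplit(url).path
    first, _, rest = path.lstrip("/").partition("/")
    if not first.startswith(PREFIX):
        return frozenset(), path
    flags = frozenset(f for f in first[len(PREFIX):].split(",") if f)
    return flags, "/" + rest


flags, path = split_restrictions(
    "http://widgets.example.com/__RESTRICT:US-COPPA13,EU-PRIVACY/like/button.html"
)
if "US-COPPA13" in flags:
    print("COPPA signal present; serving", path, "without collecting data")
```

A server that has no use for the signal can simply serve the remaining path and discard the flags.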

Furthermore, if the embedded content itself embeds other content from yet other sites or sends HTTP redirects, it would be obligated to pass along the COPPA signal. Note that in the first scenario, one couldn’t, even in principle, rely on the Referer: line, since it only goes back one hop.

Your plug-in doesn’t collect any information that would implicate COPPA? No problem; with the Apache web server (and almost certainly with other popular web servers), configuring the system to ignore such flags if irrelevant is almost trivially simple. The same principle can be applied to links to platforms (e.g., a Facebook “Like” button or a Google “+1” button): the embedding site is the only component that knows at the outset at whom it is directed, and hence must pass along that signal.

This can’t be done today to embed arbitrary content; as I said, there are no current COPPA signaling standards. But there’s an opportunity here for industry action to make such signaling a real option for tomorrow.
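To illustrate the pass-along obligation, here is a Python sketch of what an intermediary (say, an ad network that received the signal) might do before emitting its own embed or redirect URL. It assumes the same hypothetical __RESTRICT: path-prefix convention as above; the function name and hostname are invented for the example.

```python
from urllib.parse import urlsplit, urlunsplit

PREFIX = "__RESTRICT:"  # assumed marker; no such standard exists today


def add_restrictions(url, flags):
    """Return `url` with the received flags prepended as a path segment.

    With no flags to forward, the URL is returned unchanged.
    """
    if not flags:
        return url
    parts = urlsplit(url)
    segment = PREFIX + ",".join(sorted(flags))
    tail = parts.path if parts.path.startswith("/") else "/" + parts.path
    return urlunsplit(parts._replace(path="/" + segment + tail))


# Flags received on the incoming request are forwarded to the next hop.
incoming_flags = {"US-COPPA13"}
next_hop = add_restrictions("http://ads.example.net/creative/42.html?x=1",
                            incoming_flags)
print(next_hop)
```

Because each hop rewrites the URL it hands to the browser, the signal survives any number of hops, which the Referer: line cannot do.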