April 3, 2015

Many publishers don’t know that data collection on their video pages may be governed by the Video Privacy Protection Act of 1988 (VPPA), a federal law originally enacted to protect the privacy of videotape rental records.

Several lawsuits filed in recent years allege unauthorized sharing of consumers’ video viewing habits on pages where social networking tags are present. This code can identify individual users when it runs on publishers’ video player pages, and tying together a user’s identity and the name of a video could violate the VPPA.

However, in California earlier this week, a federal judge dismissed claims against Hulu in a similar class action suit. In her ruling, the judge greatly reduced the risk to publishers by declaring that a VPPA violation could occur only when they acted “knowingly,” meaning they understood that a social network would connect personally identifiable information (PII) with the names of viewed videos. The plaintiffs plan to appeal this summary judgment.

We thought that this was a good time to explain how video metadata is collected by third parties, including social networks, from publishers’ video pages. We particularly looked at information that could be transmitted before a user performed a social sharing action on the page, since such an action provides consent under the VPPA.

When the publisher includes the video name in the page’s URL, this information must be transmitted to the third party whenever a remote call is made. In instances where the video name is only included in HTML tags or JavaScript variables, the third party must consciously collect and transmit this data. We found numerous instances where data collectors other than social networks gathered video information, but this would not pose a threat under VPPA. There were only two instances in which social networks transmitted this information before a user’s sharing action. However, due to the complexity of JavaScript, this does not rule out other similar transmissions.

While developments in the recent Hulu case may reduce the need for publishers to take defensive actions, we outline several changes that publishers can make with regard to including social networking technology on their video pages to reduce their VPPA risk.

 

Background

A number of DCN members have recently become concerned about their exposure regarding the federal Video Privacy Protection Act of 1988 (VPPA) and the use of online video on their websites and mobile apps. As defined in 18 U.S. Code § 2710 and extended in Lane v. Facebook, personally identifiable information (PII) about individuals viewing specific videos (be they online or off-line media) cannot be shared with third parties without the informed written (or digital) consent of the consumer. There is an exemption in the statute that permits disclosure if it’s “incident to the ordinary course of business.”

With regard to online video, the problem arises when a visitor views a video on a publisher’s website. The publisher likely has business relationships with dozens of third parties such as ad-tech companies, analytics providers and social networks. Each one of these companies may embed JavaScript code or pixels (called tags) on the publisher’s website for purposes of collecting data and tracking user browsing behavior. Oftentimes, each third party loads tags from still other third parties, many with no direct relationship with the publisher. Some pages may have more than 100 tags.

When the user visits a publisher’s website and views a video, many of these tags execute and transmit data about that user back to the third party. There is no violation of the VPPA as long as this information does not contain PII, or if it does, the user has provided some form of consent before the transmission occurs. Since a majority of third-party data collection only contains anonymous, non-PII information, this poses minimal potential liability. Some data collectors can transform anonymous data into PII by joining it with other data sets (a process called reidentification), but this would not seem to trigger any VPPA issue for the publisher. Likewise, the publisher’s video service provider may know that a user is watching a specific video, but since the provider doesn’t know the user’s identity—even if that user is observed on other websites—there is no violation.

While a number of these data collectors have no relationship with the visitor, the user may have existing accounts with some third parties, such as social networks. Logging in to a social media account often triggers the creation of cookies that are dropped on the user’s browser, permitting the social network to identify the user when she visits other websites. So when the user goes to the publisher’s site to view a video, the social media tag retrieves this cookie and can associate the user’s identity with the viewing of a specific video before she has liked, shared, tweeted, or provided other means of informed consent. There may be other third parties who could be similarly situated, such as paywall vendors that engage with multiple publishers, e-commerce vendors with affiliate tags or commenting widgets. Also, this issue may arise with some analytics vendors that have other products that collect PII or with browsers that have named user accounts.

The landscape changed on March 31. U.S. Magistrate Judge Laurel Beeler issued a summary judgment in a 2011 class action lawsuit against Hulu alleging that the company’s use of Facebook technology violated VPPA. Specifically, the judge said the plaintiffs did not prove that “Hulu actually knew that Facebook might combine information that identified Hulu users with separate information specifying which video that user was watching.”

So while it was technically possible for Facebook to federate this information that it had in its possession, the claim failed the VPPA’s three-part test, which requires the defendant to have “knowingly disclosed: 1) a consumer’s identity; 2) the identity of ‘specific video materials’; and 3) the fact that the person identified ‘requested or obtained’ that material.” Since Hulu never knew the viewer’s identity, it couldn’t disclose it, let alone connect the user and the video. Interestingly, while Facebook does have identity information, the company likely does not qualify as a “video tape service provider” for videos it does not serve, giving it potential immunity to VPPA violations. Maintaining this separation may be an effective defense for publishers against VPPA actions.

The plaintiffs in this case also did not offer evidence about how Facebook used this data within its systems, let alone what Hulu knew about this. Perhaps not coincidentally, Facebook dropped the inclusion of video information from its Like button in 2012. An earlier court ruling removed the requirement that the plaintiffs show actual injury from the disclosure of information.

This investigation centers on whether video names are routinely gathered by tags and transmitted to third parties prior to a social media sharing action.

 

Investigation

Publishers handle online video in a multitude of ways that expose the name of the video clip. To simplify matters, three cases were investigated:

  1. The video name is embedded in the URL, such as www.website.com/video/name-of-video-clip
  2. The video name is not embedded in the URL, but is contained somewhere on the page in a JavaScript variable or the HTML.
  3. The video name is not contained anywhere outside of the player (either within the Flash application or the video content).

In a cursory search of several dozen publisher video pages, we did not see a page where the video name was not included somewhere in the URL or an HTML or JavaScript data element. Because most video players are Flash-based, it’s entirely possible to create a page that excludes the name; it just appears that this rarely happens.

 

Case 1: Name embedded in URL

Every time the browser transmits data to an external server, the page URL is contained in the HTTP Referer header that accompanies the transmission. So even if a tag doesn’t explicitly collect and send the URL, the third party receives it anyhow. In essence, the second the tag fires, the social network knows the name of the video the user is watching. In fact, any tag provider would be able to easily discern this. Even without the HTTP header, we saw video names explicitly transmitted in some tag parameters ahead of user action. Some tags also transmitted a metadata element indicating the page contained video content, so connecting the page name with a video name could be easily done.
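To make the mechanism concrete, here is a minimal sketch of what a third party's server could do with nothing more than the request headers that arrive when its tag fires. The cookie name `uid`, the function name and the `/video/<name>` URL layout are assumptions for illustration only; they are not taken from any vendor's actual code.

```python
from http.cookies import SimpleCookie
from typing import Optional, Tuple
from urllib.parse import urlparse


def link_user_to_video(headers: dict) -> Optional[Tuple[str, str]]:
    """Return (user_id, video_name) if both can be read from the headers."""
    # The third party's own cookie, set earlier when the user logged in.
    # "uid" is a hypothetical cookie name.
    cookie = SimpleCookie()
    cookie.load(headers.get("Cookie", ""))

    # The Referer header carries the URL of the publisher page.
    referer = headers.get("Referer", "")
    parts = [p for p in urlparse(referer).path.split("/") if p]

    # Assumed URL layout: /video/<name-of-video-clip>
    if "uid" not in cookie or len(parts) < 2 or parts[0] != "video":
        return None
    return cookie["uid"].value, parts[1]


request_headers = {
    "Referer": "http://www.website.com/video/name-of-video-clip",
    "Cookie": "uid=12345",
}
pair = link_user_to_video(request_headers)
```

The tag itself never has to read the page: the browser supplies both pieces of information on every request.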

The easiest mitigation strategy is to exclude the video name from the URL. This may have an adverse effect on SEO, but this could be lessened by having pages with the video name in the URL that immediately redirect to a page without the name. More complex solutions involving proxy services, iframes and referrer spoofing also could mask the URL.
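The redirect approach could look roughly like the sketch below. The catalog, the `/watch/<id>` destination path and the function name are hypothetical; a real site would look the identifier up in its content management system and issue an HTTP redirect to the returned path.

```python
from typing import Optional

# Hypothetical catalog mapping SEO-friendly names to opaque video IDs.
SLUG_TO_ID = {"name-of-video-clip": "v1042"}


def redirect_target(path: str) -> Optional[str]:
    """Map /video/<name> to a name-free /watch/<id> path, or None if unknown."""
    parts = [p for p in path.split("/") if p]
    if len(parts) == 2 and parts[0] == "video":
        video_id = SLUG_TO_ID.get(parts[1])
        if video_id is not None:
            return f"/watch/{video_id}"
    return None


target = redirect_target("/video/name-of-video-clip")
```

Under this assumption, tags would fire only on the destination page, whose URL carries the opaque ID rather than the video name. An ID that third parties can map back to a title through other means would weaken this protection.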

 

Case 2: Name not embedded in URL but contained elsewhere on the page

There are a few ways that publishers can structure pages not to include the name in the URL.

Sometimes, there are playlist pages that cycle through many videos, so the page name is usually something generic such as “News videos” while the video player itself and sometimes HTML labels contain the name of the actual video. Some broadcasters have video pages that contain the name of the series in the URL but not the name of the actual episode, which is displayed by the player or in HTML.

In addition, many publishers created data structures in the page that held a wealth of metadata about the video.
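As an illustration of how exposed such in-page metadata is, the sketch below pulls a title out of page markup. It assumes the publisher uses an Open Graph `og:title` meta tag, which is only one of many possible places for the name, and it uses Python's standard HTML parser as a stand-in for what a tag running in the browser could do through the page's data model.

```python
from html.parser import HTMLParser
from typing import Optional


class VideoTitleFinder(HTMLParser):
    """Records the content of the first og:title meta tag it encounters."""

    def __init__(self) -> None:
        super().__init__()
        self.title: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.title is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("property") == "og:title":
            self.title = attr_map.get("content")


def find_video_title(page_html: str) -> Optional[str]:
    finder = VideoTitleFinder()
    finder.feed(page_html)
    return finder.title


# Hypothetical playlist page: generic URL, but the clip name is in the markup.
sample_page = (
    '<html><head><meta property="og:title" content="Name of Video Clip">'
    "</head><body><h1>News videos</h1></body></html>"
)
title = find_video_title(sample_page)
```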

In examining in-page data transactions, we explicitly saw the name of the video being passed to analytics systems, as the publisher wishes to track this information. We also saw the video name being transmitted to video ad servers. This poses no issue under VPPA due to the lack of federation with the user’s identity.

We only saw one social vendor (a commenting widget) that explicitly passed back the video name in tag parameters. This occurred before the user indicated that she wanted to comment on that particular video. As this widget permits its commenters to remain logged in across multiple sites, this would technically federate a video name with a specific user. Some social providers transmitted a hashed contentID field to a remote server, which could include specific video segment names.
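A hashed identifier does not necessarily hide the video name. We did not determine how the contentID values we observed were derived, so the sketch below simply assumes an unsalted SHA-256 hash of the name. Under that assumption, anyone holding a list of candidate video names can rebuild the mapping.

```python
import hashlib
from typing import Dict, Iterable, Optional


def content_id(video_name: str) -> str:
    """Assumed derivation: unsalted SHA-256 of the video name."""
    return hashlib.sha256(video_name.encode("utf-8")).hexdigest()


def reverse_content_ids(
    observed_ids: Iterable[str], known_names: Iterable[str]
) -> Dict[str, Optional[str]]:
    """Match observed hashes against hashes of a known catalog of names."""
    lookup = {content_id(name): name for name in known_names}
    return {cid: lookup.get(cid) for cid in observed_ids}


observed = [content_id("name-of-video-clip")]
catalog = ["some-other-clip", "name-of-video-clip"]
matches = reverse_content_ids(observed, catalog)
```

A salted or keyed hash known only to the publisher would resist this kind of lookup, although it would also prevent the vendor from using the value as anything more than an opaque key.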

Memory dumps of browser sessions permitted us to examine JavaScript variables and their contents. While we did see some variables that referenced the video name, we did not see any social vendor that sent this information to a third party ahead of the user performing a sharing action.

From a mitigation standpoint, it’s important to understand that nothing prevents any third-party tag provider from extracting the video name at any time, since this data is easily accessible in the browser data model. So the only definitive way to prevent this from happening is to bifurcate social sharing tools into two pieces:

  • Part 1: The visual element, such as Tweet, Like, or +1 button. This would simply be the placement of a graphical element on the page.
  • Part 2: The action performed when the button is pressed. This would trigger a JavaScript tag to execute the social action, at which time sharing video data is acceptable.

This method forces the publisher to do custom development to separate out these two functions, since most social vendors provide integrated widgets that don’t distinguish between them.
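The two-part split described above can be modeled in a few lines. This is a language-neutral sketch written in Python rather than production browser code; the class name, the `send_to_vendor` callback and the payload fields are all hypothetical. On a real page, Part 1 would be static markup and Part 2 a click handler that loads the vendor's script.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DeferredShareButton:
    label: str
    video_name: str
    send_to_vendor: Callable[[Dict[str, str]], None]

    def render(self) -> str:
        # Part 1: a purely visual element. No vendor code runs and
        # nothing is transmitted when the page loads.
        return f'<button class="share">{self.label}</button>'

    def on_click(self) -> None:
        # Part 2: only the user's explicit action triggers the vendor call,
        # at which point sending the video name is acceptable.
        self.send_to_vendor({"action": "share", "video": self.video_name})


transmissions: List[Dict[str, str]] = []
button = DeferredShareButton("Like", "name-of-video-clip", transmissions.append)
markup = button.render()
sent_before_click = list(transmissions)
button.on_click()
```

In this model, rendering the button produces markup but no transmission; the vendor receives data only after `on_click` runs.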

For commenting widgets, this poses an additional complication, in that comments cannot be shown on the page until the script fires. So publishers would have to use a “Display comments” control to trigger this script. This method, along with comment lazy loading, is already used by some publishers due to the latency involved in commenting tools.

Mezzobit’s audience control technology does have the capability to monitor and firewall off certain data elements from third party tags.

 

Case 3: Video name not contained outside the player

We examined more than 30 video pages of various publishers, but were unable to locate one where the video name was only contained within the player. This is likely due to publishers wanting to get SEO lift from including video information in the page HTML, as well as data requirements for sharing with third parties, including social networks.