Study: Facebook Saves And Analyzes Your Unpublished Posts And Comments

I found this study very troubling; it describes extremely slimy corporate behavior. A good article at Slate explains how Facebook wants to know your thoughts by collecting your unpublished posts and status messages.

Yes, your unpublished status messages.

Think about all of the content you started to type in the compose box on Facebook, but then stopped and either erased or changed. After all, we all have been taught to think carefully about what we share online. Unfortunately, Facebook captures this unpublished content: text that you typed but never posted by selecting the "Post" button in the Facebook compose box:

"... the code that powers Facebook still knows what you typed—even if you decide not to publish it. It turns out that the things you explicitly choose not to share aren't entirely private. Facebook calls these unposted thoughts "self-censorship," and... The study examined aborted status updates, posts on other people's timelines, and comments on others' posts. To collect the text you type, Facebook sends code to your browser. That code automatically analyzes what you type into any text box and reports metadata back to Facebook."

If you don't know (or forgot), read this primer about what metadata is and why it is valuable. (It will help you understand what data is attached to your posts, images, and video.) Remember, this is about content you typed in the Facebook compose box and never published. One might expect Facebook to collect versions of content you typed, posted, and later edited, because you selected the "Post" button and shared that content before editing it. I doubt users would expect Facebook to save content they didn't share: content they may have typed but never posted or published. Sadly, the social networking site does save (and analyze) your unpublished content.

And, even if you are an informed online user who diligently reads the policies (e.g., Terms Of Use, Privacy) at websites, you would not have found a clear warning:

"In Facebook’s Data Use Policy, under a section called "Information we receive and how it is used," it’s made clear that the company collects information you choose to share or when you "view or otherwise interact with things.” But nothing suggests that it collects content you explicitly don’t share. Typing and deleting text in a box could be considered a type of interaction, but I suspect very few of us would expect that data to be saved..."

I find this data collection of unpublished posts extremely slimy corporate behavior and a privacy intrusion:

"This may be closer to the recent revelation that the FBI can turn on a computer's webcam without activating the indicator light to monitor criminals. People surveilled through their computers’ cameras aren’t choosing to share video of themselves, just as people who self-censor on Facebook aren’t choosing to share their thoughts. The difference is that the FBI needs a warrant but Facebook can proceed without permission from anyone."

Researchers Adam Kramer, a Facebook data scientist, and Sauvik Das, a Ph.D. student at Carnegie Mellon and summer software engineer intern at Facebook, analyzed 17 days of usage from 3.9 million Facebook users. You can download the researchers' study (Adobe PDF). Some key findings from the study:

"... 71% of the 3.9 million users in our sample self-censored at least one post or comment over the course of 17 days, confirming that self-censorship is common. Posts are censored more than comments (33% vs. 13%)... decisions to self-censor content strongly affected by a user’s perception of audience: Users who target specific audiences self-censor more than users who do not... males censor more posts, but, surprisingly, also that males censor more than females when more of their friends are male... people with more boundaries to regulate censor more posts; older users censor fewer posts but more comments; and, people with more politically and age diverse friends censor fewer posts."

After reading this, I wonder if self-censor rates would have been higher had the study lasted longer than 17 days. The researchers seem to think so (emphasis added in bold):

"Over the 17-days, 71% of all users censored content at least once, with 51% of users censoring at least one post and 44% of users censoring at least one comment. The 51% of users who censored posts censored 4.52 posts on average, while the 44% of users who censored comments censored 3.20 comments on average... While 71% of our users did last-minute self-censor at least once, we suspect, in fact, that all users employ last-minute self-censorship on Facebook at some point. The remaining 29% of users in our sample likely didn’t have a chance to self-censor over the short duration of the study. Surprisingly, however, we found that relative rates of self-censorship were quite high: 33% of all potential posts written by our sample users were censored, and 13% of all comments. These numbers were higher than anticipated..."
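The figures quoted above invite a little arithmetic. Here is a rough sketch in Python, under two assumptions of mine that the study excerpt does not state outright: that the 71% figure is the union of the two groups (users who censored a post or a comment), and that the rounded percentages are close enough to work with. The function names are made up for illustration, and the results are estimates, not numbers reported by the researchers.

```python
# Back-of-the-envelope arithmetic using only the figures quoted from the study.
# Assumption: the 71% is the union of post-censors (51%) and comment-censors (44%).
# All percentages are rounded in the source, so the results are approximate.

USERS = 3_900_000          # sample size quoted in the study
PCT_ANY = 71               # censored at least one post or comment
PCT_POSTS = 51             # censored at least one post
PCT_COMMENTS = 44          # censored at least one comment
MEAN_POSTS = 4.52          # average censored posts among post-censors
MEAN_COMMENTS = 3.20       # average censored comments among comment-censors


def overlap_pct(pct_a, pct_b, pct_union):
    """Inclusion-exclusion: share of users who fall in both groups."""
    return pct_a + pct_b - pct_union


def estimated_items(users, pct, mean_items):
    """Estimated number of censored items for one group of users."""
    return users * pct / 100 * mean_items


both = overlap_pct(PCT_POSTS, PCT_COMMENTS, PCT_ANY)
posts = estimated_items(USERS, PCT_POSTS, MEAN_POSTS)
comments = estimated_items(USERS, PCT_COMMENTS, MEAN_COMMENTS)

print(f"Censored both a post and a comment: about {both}% of users")
print(f"Estimated censored posts: about {posts:,.0f}")
print(f"Estimated censored comments: about {comments:,.0f}")
```

If those assumptions hold, roughly a quarter of the sample censored both a post and a comment, and the 17 days produced on the order of 9 million unpublished posts and 5.5 million unpublished comments.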

The researchers (and probably Facebook managers, too) seemed worried that Facebook's user interface and website features may be inadequate and may encourage more self-censorship than users would otherwise practice. They concluded:

"... we now know that current solutions on Facebook do not effectively prevent self-censorship caused by boundary regulation problems. Users with more boundaries to regulate self-censor more, even controlling for their use of audience selection and privacy tools. One reason for this finding is that users might distrust the available tools to truly restrict the audience of a post; another possibility is that present audience selection tools are too static and content agnostic, rendering them ineffective in allowing users to selectively target groups on the fly..."

The researchers also concluded:

"... the frequency of self-censorship seems to vary by the nature of the content (e.g., post or a comment?) and the context surrounding it (e.g., status update or event post?). The decision to self-censor also seems to be driven by two simple principles: People censor more when their audience is hard to define, and people censor more when the relevance or topicality of a CMC “space” is narrower. For example, posts are unsurprisingly censored more than comments..."

What I make of this:

  • Facebook expects more self-censorship by members who have a large or very diverse set of "friends." If your groups are small, and/or if you have segmented your friends into several smaller groups or circles, then you probably won't self-censor as much.
  • Clearly, Facebook expects its users to self-censor. Otherwise, it wouldn't have built and deployed this capability in its system.
  • As your group of friends changes and/or as your online skills change, Facebook expects your self-censor rates and instances to change.
  • The saving and analyzing of users' unpublished posts reduces consumer rights. Consumers have lost the right to keep control over what they post and publish. Instead, Facebook is essentially saying it knows best/better. Very paternalistic and insulting.
  • This is dangerous, because almost nobody thinks that what they type and then delete or change would still be collected, saved, analyzed, and used to profile them with relevant metadata (e.g., gender, age, political likes, etc.) attached.
  • This is dangerous, because Facebook content is used in courts, and credit reporting agencies want access to your social networking content. Nobody expects to be challenged with stuff they typed and never posted because the social networking site decided to retain it anyway.

So, a word to the wise until online privacy laws catch up: be careful about what you type before posting, and be careful about what you post on Facebook. If this is too complicated for you, then don't post on Facebook or simply stop using Facebook altogether. I know people who only read Facebook.

Next, several related questions immediately come to mind:

  1. How does self-censorship vary by device type? Perhaps users with desktops or laptops self-censor more or less than users with tablets or smart phones. Perhaps the same user's rate of self-censorship varies when switching between devices. Perhaps certain brands of devices have higher self-censor rates. If I worked in Facebook's usability department, or if I were a Facebook business partner, these are answers I'd want to know.
  2. When did Facebook start archiving users' unpublished posts?
  3. How long does Facebook archive users' unpublished posts?
  4. What companies, business partners, and/or affiliates does Facebook share unpublished posts with?
  5. Does Facebook share unpublished posts with the NSA and other spy organizations?


Chanson de Roland

I am a lawyer who, after reading Facebook's Privacy Policy, decided to never use Facebook, except perhaps for business. The reason for rejecting Facebook was that Facebook's Privacy Policy was utterly appalling with respect to privacy. Indeed, Facebook’s Privacy Policy is more aptly called Facebook’s Right to Disclose Policy. At least for adults, Facebook retained for itself the right to collect any interaction that you have with its services and to use it for almost any commercially valuable purpose, notwithstanding its promise to make anonymous the personal information that it shares with others.

And several studies have shown that promise to be worthless in practice. And, of course, Slate now reveals and Mr. Jenkins reports that even that promise of anonymity is a lie or at least misleading, because Facebook must identify and link particular users to their respective published and unpublished posts and comments to be able to use those posts and comments, as it does, to know its particular users. So Facebook, even in promising to anonymize one's personal data, misleads: It really doesn't promise to anonymize your data when it, Facebook, is the one analyzing your data, and even its promise to anonymize the data that it provides to third parties has been shown to be largely ineffective in practice.

So Facebook's use of its users' private information for its gain seems to be without any practical restraint. For the foregoing reasons, for other breaches of privacy that I found outrageous, and for the resulting exposure to unwanted marketing, to my personal information being known and used by others without my consent, to legal jeopardy, and/or to risk of injury to my professional and/or personal reputation, I have hitherto refused to subscribe to Facebook and expect that I always will refuse to subscribe to Facebook.

P.S. If you must post to Facebook and don’t want Facebook to get your unedited comments and/or posts, simply compose your comments and posts in your word processing app and then post only your final edits to Facebook.
