Privacy remains an important topic for social media systems. A big part of the problem is that virtually all of us have a hard time reasoning about what information about us is exposed, to whom, and for how long. As UMBC colleague Zeynep Tufekci points out, our intuitions in such matters come from experiences in the physical world, whose physics differs considerably from that of the cyber world.
Bruce Schneier offered a taxonomy of social networking data in a short article in the July/August issue of IEEE Security & Privacy. A version of the article, A Taxonomy of Social Networking Data, is available on his site.
“Below is my taxonomy of social networking data, which I first presented at the Internet Governance Forum meeting last November, and again — revised — at an OECD workshop on the role of Internet intermediaries in June.
- Service data is the data you give to a social networking site in order to use it. Such data might include your legal name, your age, and your credit-card number.
- Disclosed data is what you post on your own pages: blog entries, photographs, messages, comments, and so on.
- Entrusted data is what you post on other people’s pages. It’s basically the same stuff as disclosed data, but the difference is that you don’t have control over the data once you post it — another user does.
- Incidental data is what other people post about you: a paragraph about you that someone else writes, a picture of you that someone else takes and posts. Again, it’s basically the same stuff as disclosed data, but the difference is that you don’t have control over it, and you didn’t create it in the first place.
- Behavioral data is data the site collects about your habits by recording what you do and who you do it with. It might include games you play, topics you write about, news articles you access (and what that says about your political leanings), and so on.
- Derived data is data about you that is derived from all the other data. For example, if 80 percent of your friends self-identify as gay, you’re likely gay yourself.”
Having a simple ontology for social media data could help us move toward better privacy controls for online social media systems. I like Schneier’s broad categories and wonder what a more complete treatment, defined using Semantic Web languages, might look like.
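As a starting point, here is a rough sketch of how the six categories might be written down as data and then emitted as a tiny RDFS class hierarchy in Turtle. It is only an illustration, not a worked-out ontology. The `snd:` namespace, the class names, the `Origin` values, and the `subject_controls` flag are all my own assumptions. The flag is set only where the quoted text says or clearly implies who controls the data and is left as `None` otherwise, and attributing derived data to the site is my guess.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Origin(Enum):
    """Who produces the data (an illustrative property, not part of the taxonomy)."""
    SUBJECT = "subject"        # the person the data is about
    OTHER_USER = "other_user"  # some other user of the site
    SITE = "site"              # the social networking site itself


@dataclass(frozen=True)
class Category:
    name: str                          # hypothetical class name
    origin: Origin
    subject_controls: Optional[bool]   # None = not stated in the quoted text
    comment: str


TAXONOMY = [
    Category("ServiceData", Origin.SUBJECT, None,
             "Data given to the site in order to use it."),
    Category("DisclosedData", Origin.SUBJECT, True,
             "What the subject posts on their own pages."),
    Category("EntrustedData", Origin.SUBJECT, False,
             "What the subject posts on other people's pages."),
    Category("IncidentalData", Origin.OTHER_USER, False,
             "What other people post about the subject."),
    Category("BehavioralData", Origin.SITE, None,
             "What the site records about the subject's habits."),
    # Assumption: the quoted text does not say who does the deriving.
    Category("DerivedData", Origin.SITE, None,
             "Data about the subject derived from all the other data."),
]

# Hypothetical namespace, used for illustration only.
NS = "http://example.org/sndata#"


def to_turtle(categories: List[Category]) -> str:
    """Emit each category as an rdfs:Class under a common superclass."""
    lines = [
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        f"@prefix snd: <{NS}> .",
        "",
        "snd:SocialNetworkingData a rdfs:Class .",
    ]
    for c in categories:
        lines += [
            "",
            f"snd:{c.name} a rdfs:Class ;",
            "    rdfs:subClassOf snd:SocialNetworkingData ;",
            f'    rdfs:comment "{c.comment}" .',
        ]
    return "\n".join(lines) + "\n"


def outside_subject_control(categories: List[Category]) -> List[str]:
    """Names of categories the quoted text says the subject does not control."""
    return [c.name for c in categories if c.subject_controls is False]


if __name__ == "__main__":
    print(to_turtle(TAXONOMY))
    print(outside_subject_control(TAXONOMY))
```

Even this flat version makes one thing visible: entrusted and incidental data are exactly the categories where the quoted text says control lies with someone else. A fuller treatment would presumably model controllers, audiences, and retention as properties rather than a single flag.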