Mark Smith's Journal

Work related musings of a geek.

protected content security

There's a bunch of talk going around right now about the whole issue of content security, and trusting the people who host your content and have access to it. I wanted to talk on that for a moment as it's something that is really important to me.

The only people that are authorized to view protected content on Dreamwidth that they don't otherwise have access to are denise and myself. We are the only people with the proper access level. On top of that, it's not automatic -- in order to view the protected content, every time one of us visits a URL we have to edit it and add "viewall=1" to the end of it. It's a very manual process (for good reason). It's also logged -- and I don't know about Denise, but I review the logs regularly, just like every other security log we have.
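To make the shape of that check concrete, here is a minimal sketch in Python. It is not Dreamwidth's actual code: the account names, the logger name, and the function `can_view_protected` are all made up for illustration. It only shows the three properties described above: the override has to be requested explicitly on every URL, only a short list of accounts may use it, and every use is logged.

```python
import logging
from urllib.parse import urlparse, parse_qs

# Hypothetical logger name; the real log destination is not described in the post.
audit_log = logging.getLogger("viewall_audit")

# Hypothetical account names standing in for the people with the access level.
VIEWALL_USERS = {"admin_a", "admin_b"}


def can_view_protected(viewer: str, url: str, has_normal_access: bool) -> bool:
    """Decide whether `viewer` may see the protected entry at `url`."""
    if has_normal_access:
        return True  # ordinary access, no override involved, nothing to log

    params = parse_qs(urlparse(url).query)
    # The override is never implicit: it must be on this specific URL.
    if params.get("viewall") != ["1"]:
        return False

    if viewer not in VIEWALL_USERS:
        audit_log.warning("DENIED viewall viewer=%s url=%s", viewer, url)
        return False

    # Every successful use leaves a record that can be reviewed later.
    audit_log.info("GRANTED viewall viewer=%s url=%s", viewer, url)
    return True
```

Logging the denied attempts as well as the granted ones is a design choice of this sketch, on the assumption that a failed attempt is at least as interesting to whoever reviews the log.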

The second level of access is for people who have access to the production servers that run Dreamwidth. When someone has the ability to log in to our servers, they have full access to the data on the databases and could in theory access protected content. The only people with server access are again myself and Denise, plus our two sysadmins: matthew, who used to work for LJ (before and during Six Apart), and alierak, whom I've known for a decade and trust completely.

That's it. The four of us.

At some point, it comes down to trust. We need the ability to work on the servers, so there is always going to be a set of people who have the ability to see private data. This isn't something that we can feasibly get rid of, either. The data exists on the servers (that's how we can show it to you and the people who are authorized to see it) and we need access to those servers to maintain them. The data isn't just sitting around visible to us, though -- it's tucked away in the database and requires a lot of manual effort to dig out, unzip, and connect to a user account. We never see post content accidentally.

In the end, I think that the best that I can offer anybody is to be explicit about who has access (and what kind of access they have) and to personally watch the security logs. I watch to make sure we don't have unauthorized access to our servers, and I look for unauthorized access to private data as well. It's part of the routine, and it's something I take very seriously. Having dealt with some problems related to this in the past (on other projects, with other people) it's not something I want to see Dreamwidth have to go through.

I'm happy to talk about this, if anybody has any thoughts, comments, or questions.

We were talking about something vaguely similar the other week at work, and we decided to trigger an email whenever someone did something with their elevated privs, so that it would be more visible to those who needed to know about it and the person doing it would be accountable.

Thinking along those lines, could you send a notification to the user, as well as logging, when someone with privs views a protected entry? The text of the message could include something like, "This is most likely in relation to a support request you raised" or whatever is necessary to explain why it's happening and reassure the user. It feels, to me, a bit like the email you get that says "Someone, possibly you, has requested a password change."

In theory I agree with you, it'd be nice if people were notified when their content is accessed by someone they haven't explicitly given permission to do so. In practice, though, I think it opens up a lot of cans that I'm not sure we want to get into.

My first thought is: it only addresses half of the issue. On-site access. Right now, that's only Denise and myself, but in the future I can see us adding staff to that list who need that ability. (Think Terms of Service people, senior support administrators, etc.) The idea doesn't address sysadmin level access -- and I'm not even sure how you could, at that.

But even if we accept only the on-site half as being worth addressing, then we run into more problems. The viewall utility is used for many purposes, not all of which I would want to send emails for. (Reports of a credible suicide threat, investigating a ToS violation, me verifying imports/deletes/purges/things, etc.)

Also, I could see us being in a situation where someone makes a report that user X is posting their private details, invading their privacy, all behind a lock! So we viewall, but it turns out to be a false report -- now user X is all "WHY WERE YOU DOING THIS TO ME" and we have to say "someone reported something" and that seems problematic to me. We're not going to release details on who reported what, or why they reported it, so we're pretty tied in what we can say. I expect getting a form letter would get old and not engender trust.

Of course, then another thought is, would you rather know that someone looked, even if they won't tell you why, or just not know at all? I lean towards the latter, but I know other people will lean towards the former. I know that my data on Facebook is visible to their admins, I accept that, and I don't put anything there I don't want them to see.

I dunno. I definitely see where it would be nice to say "look, we email you when your content is viewed by admins" but I can easily see how unscrupulous people will weaponize that, or how it will hinder me in some situations (e.g., if I'm verifying a community import worked, sometimes I viewall to make sure everything worked right -- screened comments, members only posts, etc). The ability to, as a sysadmin, check that things worked properly is valuable to me.

All of those thoughts, plus the fact that it only works for on-site access, make me think it's not really worth doing. What do you think?

One other thing along those lines to worry about is that you could end up in a situation where you're legally not permitted to notify the user that their content is being viewed (a valid search warrant, for instance). While this is exactly the case where I suspect most users would like to be notified, since you're a US business, you simply can't. So saying that you'll notify users when the content is viewed is to some extent promising something that you can't reliably deliver.

My purely personal perspective, admittedly as someone who knows a lot about how this sort of thing works since I'm a professional sysadmin, is that the site users have to trust you and Denise by necessity and, beyond that, I'm not sure there's a lot of utility in trying to enable community audit of your activities. I'm also not sure that there's much utility in notification if there's no option or decision available to the user.

One thing you could consider, in a future world in which you may want to grant more fine-grained access to additional people, is distinguishing between access under the user's control and access that is done by a site admin. For instance, when submitting a support ticket, a user could potentially get a checkbox saying "allow senior support people access to my restricted content" which they could check if the request seemed to warrant it. If they chose not to check it, then their support request might take longer if it required such access, until you or Denise or similar staff had a chance to look at it.

If it's automatic, I suspect people would use it as a form of harassment; sending in a report purely so that the poster would get a notification that it was looked at. That's not a reason to avoid doing this, but I'm not sure that the benefits are worth it.

An alternative could be something like you guys have talked about in terms of sharing TOS investigations; doing it the other way? "In May, x journals were looked at for a, b, and c reasons". Let people know how rare or common it is?

I like this, especially if it's auto-generated by the database records. While it's self-reporting, it's enough to give everyone some feeling for the process, and change over time (so give percentages too).
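As a rough illustration of what "auto-generated by the database records" could look like, here is a small Python sketch. The record layout (day, journal, reason), the reason labels, and the sample rows are all invented for the example; the thread doesn't say what the real audit log actually stores.

```python
from collections import Counter
from datetime import date

# Invented sample audit records: (day, journal, reason).
SAMPLE_RECORDS = [
    (date(2009, 5, 2), "journal_a", "tos_investigation"),
    (date(2009, 5, 9), "journal_b", "tos_investigation"),
    (date(2009, 5, 9), "journal_b", "import_check"),
    (date(2009, 5, 20), "journal_c", "support_request"),
    (date(2009, 6, 1), "journal_a", "import_check"),
]


def monthly_report(records, year, month):
    """Summarize privileged views for one month without naming any journal."""
    rows = [r for r in records if r[0].year == year and r[0].month == month]
    total = len(rows)
    by_reason = Counter(reason for _, _, reason in rows)
    return {
        # Distinct journals, so repeat views of one journal count once here.
        "journals_viewed": len({journal for _, journal, _ in rows}),
        "views": total,
        # Share of views per reason, as a percentage of the month's views.
        "percent_by_reason": {
            reason: round(100 * count / total, 1)
            for reason, count in by_reason.items()
        },
    }


print(monthly_report(SAMPLE_RECORDS, 2009, 5))
```

The report only exposes counts and percentages, so it wouldn't reveal which journals were looked at or who made a report.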

Maybe it would be useful to have a FAQ section or something about this, spelling out under what circumstances viewall gets used, who can use it, and emphasizing what DW does to protect privacy and ensure that the ability to access private/locked posts won't be abused.

Personally I try not to have locked content that is going to be damaging if it gets out because I heard the rumors about [other places] sending restricted-access content to advertisers for keyword mining.

I want to say that I really appreciate the transparency that you have been supporting and are willing to tell us just who has access to our private data. I'm also grateful that only these few have access.

I think I'm trying to have a grateful party right now, but I'm lacking the words to properly express it. Really, truly, thank you.

See, that's part of the reason why I trust you and Denise. You're up front with us. If an issue comes to your attention, and needs to be addressed by top level management, you guys address it publicly. And, even better, you do so clearly, and with as much detail as possible/reasonable.

One interesting feature, if enough people really want more assurance about this, would be to add a "PGP posting" feature to the software itself - it'd be a pseudo-security-level, really just the same as a normal "locked" type of entry, but you would be able to select from some set of PGP public keys that you've uploaded for the content to be encrypted with, and then, using some out of band means, provide the private key half of that PGP key to anyone you want to be able to read it. I can't think of a secure way of Dreamwidth doing the decryption half without it again being possible for you guys to read the content, but there are browser plugins designed for doing PGP work on webmail systems that would work.

It's quite possibly excessive, and not something I personally would necessarily use or care much about, just an idea that popped into my head if you wanted to offer something for those who are really paranoid or really have something they feel strongly that they need to keep confidential.

Having some sort of tag or something that we could say "hey, here's content that is PGP encoded, use the key labeled 'xb95'" and having the browser handle decrypting that particular block of text...

Of course, then I wonder if that's even possible to make secure. I could write some JS that just runs in the DOM and waits for that particular block to be decoded. But then, the plugin could do something more fancy -- or maybe we could just make this an option that shows up to RSS readers. Then it's just, "here's the content, you can do something with it".

Personally, the security conscious and paranoid part of me really likes the idea. Realistically, I don't know that it's a good fit for what we're trying to do with Dreamwidth. But hmm...

There's something to be said for simple symmetric-key encryption (using AES for instance) even if Dreamwidth also stores the key and decrypts on the fly for authorized users. It means that the data is encrypted at rest and no one can see it by accident.

A minor gain probably not worth the complexity, but that sort of server-side encryption is becoming increasingly common in the storage world. (Although there the concern is often about backups, which are frequently sent off-site to unaffiliated storage companies who shouldn't be able to read the stuff they're storing.)

Hmm, interesting. I like the idea of users being able to broadcast stuff securely like this. *But* I think if we do something like this it's got to be really rigorous. It's fine for access-locked stuff to be by-and-large good enough, but things that claim to be crypto have to be done right - giving a false sense of security is much worse than just not offering the service.

If the encryption is done browser-side with another plugin then Dreamwidth never touches the plaintext or the keys and the only thing that needs security auditing is the plugin. All DW has to do is provide a nice way of tagging the text. Or am I missing something?
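For what the tagging side might amount to, here is a minimal Python sketch. The `[encrypted key="..."]` syntax is invented for this example and is not an existing Dreamwidth feature; the point is only that the site would locate the block and its key label and pass the payload along untouched, leaving decryption to whatever runs in the reader's browser.

```python
import re

# Hypothetical tag syntax invented for this sketch.
BLOCK_RE = re.compile(
    r'\[encrypted key="([A-Za-z0-9]+)"\](.*?)\[/encrypted\]',
    re.DOTALL,
)


def find_encrypted_blocks(entry_text):
    """Return (key_label, ciphertext) pairs found in an entry.

    Nothing is decrypted here: the ciphertext is returned as-is, so the
    server never needs the plaintext or the private key.
    """
    return [
        (match.group(1), match.group(2).strip())
        for match in BLOCK_RE.finditer(entry_text)
    ]


entry = 'Public intro.\n[encrypted key="xb95"]\nQUJDREVGRw==\n[/encrypted]\nPublic outro.'
print(find_encrypted_blocks(entry))
```

The security question raised above still sits entirely with the client-side piece; this part is just bookkeeping.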

Mark, thank you so much for addressing this issue publicly! I had emailed you, asking about it, and I can't tell you how happy it makes me to see you address this so promptly. That is awesome, and it really makes me trust you guys more than I already did. It's very reassuring to know how seriously you take this, and that (at least for now) only 4 people have access to locked content.

Thank you so much for clarifying this issue. I am one of those people who like private stuff... well, private *laughs* And to know who could be looking at my locked entries and why, and to know you are actually monitoring for unauthorized access, makes me feel better about the whole thing. Again, thanks!

I would also point out that server access gives us the theoretical ability to read network traffic (even https) off the wire, not just database content, so there is possibly more at stake there than just protected entries. I can't immediately think of a circumstance in which I would be looking at that type of content in order to do my job, unless I already had reason to believe that the traffic or database entries constituted an attack on Dreamwidth's server resources. I do not work support requests unless specifically asked, and will not access private information to do so. I do try to work on difficult-to-diagnose bugs, but thanks to sophie, it's simple enough to troubleshoot things on a Dreamhack instead.