Posted by Soulskill on Sunday July 04, 2010 @12:35PM
from the enjoy-the-holiday-google dept.

Virak writes "Several hours ago, someone found an HTML injection vulnerability in YouTube's comment system, and since then sites such as 4chan have had a field day with popular videos. The bug is triggered by placing a <script> tag at the beginning of a post. The tag itself is escaped, but everything following it is cheerfully placed in the page as is. Blacked out pages with giant red text scrolling across them, shock site redirects, and all sorts of other fun things have been spotted. YouTube has currently blocked such comments from being posted and set the comments section to be hidden by default, and appears to be in the process of removing some of these comments, but the underlying bug does not seem to have been fixed yet."

Really? They're really only removing some of them? When they can just do a simple delete query and wipe everything with a properly escaped script tag at the top of the comment? Wow. Just wow.

The solution to this is for users to be asked if they want to participate in commented sections when signing up. Not just at youtube, but everywhere. And probably not just comments, but any user input area.

The evolution of this bug exploit was quite interesting to follow up close.

At first it simply prevented any further comments from being posted. Then text was added. Then the text was scrolling. Suddenly, the entire page was blacked out except for the added text.

And that's when the more technically minded people realized much, much more was possible. Bam! Popups! Infinite popups that lead to browser crashes! Page redirects to shock sites! The most sophisticated version I saw actually replaced the YouTube video in-place with the 1man1jar video.

And when the exploit was blocked in the comments, it had a small resurgence in video reply titles, before being smacked down once more.

Reminds me of the Slashdot <a onhover=".."> bug. It was a while back (2000-2002 era?) but inline JavaScript wasn't filtered from a tags. The first exploit (that I saw, anyhow) simply used DHTML (as it was then known) to add (paraphrasing) "I can't believe this hasn't been fixed" to the post (which took about 5 minutes given the speed of computers, JavaScript, and DOM manipulation). About 30 seconds later, redirects to porn, last measure, etc. appeared. Slashdot's initial response was to mod them down to -5 and then delete them.

They actually got it fixed a bit after I submitted this story. A shame, lemonparty was a big step up from the usual level of discussion on YouTube videos. More seriously, I'm interested in finding out exactly what happened here. Hopefully Google will post some sort of explanation. YouTube is a massive site and it's somewhat bizarre seeing them make the sort of mistake you'd expect from something put together by a drooling moron with nothing but a "How to learn PHP in 24 hours!" book.

1. It teaches you, over the course of an unspecified period of time, how to learn PHP in 24 hours?
2. It teaches you, over the course of 24 hours, how to learn PHP? or
3. After 24 hours have elapsed, it teaches you how to learn PHP?

Note that it doesn't actually teach you PHP. It just teaches you how to learn it.

It's 1,440 pages of "Wait one minute, then turn the page" which sadly forces one into an inescapable loop for 24 hours. After one has starved, missed sleep and soiled oneself through this excruciating 24 hour period the last page says only this:

Yes, this does seem like the kind of bug I'd expect a halfway competent dev to take into consideration when building a site. A very simple fix is to translate all < and > characters to the &lt; and &gt; versions instead. AFAIK YouTube doesn't even allow HTML in comments anyway...
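For what it's worth, the "translate < and >" fix is a one-liner in most languages. A minimal sketch in Python, assuming comments are plain text destined for HTML element content (`render_comment` is a made-up name for illustration, not anything from YouTube's code):

```python
import html


def render_comment(raw: str) -> str:
    """Escape a plain-text comment before placing it in HTML element content."""
    # html.escape converts &, < and > to entities (and quotes too, by default),
    # so no part of the comment can be interpreted as markup.
    return html.escape(raw)


# A comment shaped like the one in the summary: a script tag first, markup after.
comment = "<script><marquee>hello</marquee>"
print(render_comment(comment))
# prints: &lt;script&gt;&lt;marquee&gt;hello&lt;/marquee&gt;
```

The point is that the whole string goes through the same function, with no special case for anything that looks like a tag.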

I'd also be interested in knowing if this bug had been an issue for a long time. It seems like the sort of exploit that would have been very quickly discovered. I'm not a big YouTube comment reader, but I've noticed some interface/UI tweaks to the way comments can be thumbed up/down in recent weeks. Perhaps this crept in as a result of those.

The fact that educated people make mistakes is not equivalent to whether uneducated people can make educated programming decisions.

Outside of school, do you really think someone will pick up on the math and other concepts necessary to, for just one example, calculate the Big-O of a part of their program? Or understand why they should?

I haven't actually tried the Comment Snob add-on in some time, and it seems that it hasn't been updated to work with the latest changes on YouTube. Maybe someone with a little free time has the passion to fix it.

Wow. You'd think somebody would've figured out something like this a long time ago.

But since merely gazing at YouTube comments lowers your IQ by at least 20 points, I'm actually amazed someone found it. Must have used some kind of proxy who looked at it, got dumber for it, but managed to pass along the code to someone who could look at it without being exposed to the dumb.

A lot of the comments are just troll BS. Most people log on for videos, not to read the ramblings of basement-dwelling trolls. I try to ignore them but they can be really obnoxious. I don't post on YouTube but I have had things pirated and posted just so they could make obnoxious comments. The work posted was just previs stuff that was just done for editing slugs but it was presented as finished work. It caused some trouble with a client so I got a lot more careful about letting development work out there.

Yes, but I blame the comment system for that. A comment system that doesn't allow links, doesn't allow more than a handful of characters, is a complete usability nightmare when you want to browse more than the last ten comments, doesn't allow search and doesn't support threads or replies properly is just useless when you actually want to write something insightful. A comment system should encourage informative posts, not make them impossible like the YouTube system does.

The latest changes, which put the highest rated comments and comments from the video uploader on top, have helped a bit to clean up the mess, but it's still far away from being a comment system where people can actually have a meaningful discussion.

On top of that they need to implement some sort of penalty system for people who regularly post things that are downvoted. If out of 10 posts, the amount of downvotes you've gotten is higher than 80% then implement a week long "cool-off" period in which it resets to 0

If you don't want to spare the bandwidth on your own site (how much data are you pushing, anyway?) then try Vimeo. Cleaner, better optimization, has private (need a password) channels, offers a "pro" service where you get unlimited uploads, etc.

I find it interesting pondering how and why these things fail-- the insight into how the code must have been put together to fail on a particular input.

My initial guess for this one would be that they escape HTML and scripts separately-- scripts do not need greater than, less than, and ampersand escaped-- and that detecting the keyword 'script' switched modes from HTML to script. The fact that the first script tag is properly HTML-escaped suggests that while it was properly detected, the code to switch between HTML and script modes did not take this detection into account and switched anyway. I'm going to further guess that this is due to some support code meant for the programmers' side that inadvertently managed to cross over into user land.
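To make that guess concrete, here is a purely hypothetical sketch in Python of an escaper with that kind of mode switch. It is not YouTube's code and nobody outside Google knows what the real code looks like; `buggy_escape` and `fixed_escape` are invented names, and the only thing the sketch is built to reproduce is the behavior described in the summary (tag escaped, everything after it passed through raw).

```python
import html


def buggy_escape(text: str) -> str:
    """Hypothetical flawed escaper that switches to 'script mode' after a tag."""
    marker = "<script>"
    idx = text.lower().find(marker)
    if idx == -1:
        # No script tag seen: everything is treated as HTML and escaped.
        return html.escape(text)
    end = idx + len(marker)
    # Everything up to and including the tag is escaped as HTML...
    head = html.escape(text[:end])
    # ...but the remainder is wrongly treated as script text and left raw.
    tail = text[end:]
    return head + tail


def fixed_escape(text: str) -> str:
    """Escape the whole comment uniformly, with no mode switch at all."""
    return html.escape(text)


payload = "<script><h1>pwned</h1>"
print(buggy_escape(payload))  # prints: &lt;script&gt;<h1>pwned</h1>
print(fixed_escape(payload))  # prints: &lt;script&gt;&lt;h1&gt;pwned&lt;/h1&gt;
```

If something like this is what happened, the fix is simply to drop the second mode for user-supplied text.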

Exactly, why not just escape the whole thing? Or if you're even more paranoid, why not just strip the script tags and everything in between? That being said, the fact that this exploit exists in the first place shows that they're not doing either one of those things.

This sounds good in theory. In practice, people who read a lot generally cannot help but successfully read entire sentences in their peripheral vision. Nothing short of removing the text from my visual field will prevent the meaning of the words from becoming instantly lodged in my brain the moment they appear anywhere visible.

You're an accomplished speed reader.

I read _a lot_ myself, but never learned the skill to read anything other than what I focus on for the most part. Simply reading a lot doesn't automatically grant you the skill to be able to read like you do. You likely have a genetic advantage... Or perhaps disadvantage, in this particular case.;)

Since this was turned into a massive, YouTube-wide trolling effort, it's being fixed nearly immediately. What if 4chan hadn't gotten a hold of it though? What if some scammers/spammers did? And used it for weeks? It would have been more subtle, and with YouTube's traffic, it could have been massively successful. Who knows what effect that could have had if this wasn't caught quickly. Did 4chan just do a good thing?

But until Google says otherwise, we can't know that this wasn't already the case.

Fortunately, they already have all the data with potential exploits and are reasonably well known for their ability to search for things. Depending on how things are stored, it even might be as simple as doing a first-cut by looking for an unescaped < character.
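A rough sketch of what such a first-cut scan might look like, in Python. It assumes (and this is only an assumption, since nobody here knows how the comments are stored) that comments are kept in their already-escaped form, so a legitimate comment never contains a raw `<`. The function name and the `(id, stored_html)` pair shape are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple


def flag_unescaped(comments: Iterable[Tuple[int, str]]) -> Iterator[int]:
    """Yield ids of stored comments that contain a raw '<' character."""
    for comment_id, stored_html in comments:
        # In already-escaped storage, every legitimate '<' appears as '&lt;',
        # so any literal '<' is worth a closer look.
        if "<" in stored_html:
            yield comment_id


stored = [
    (1, "nice video &lt;3"),
    (2, "&lt;script&gt;<marquee>x</marquee>"),
]
print(list(flag_unescaped(stored)))  # prints: [2]
```

If comments are stored raw and escaped at render time instead, this check would also flag harmless things like "<3", so it would only be a starting point.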

The first bug I found was that a new user could insert script tags in their username (any field, really). My employer's response was "Why would anyone want to hack a website?"... I wouldn't drop the issue, so they dropped me.

It's only bad design / coding / development - who cares! It happens all the time and will happen as long as the subpar designs / development / coding is allowed. Mostly I would blame the design of these systems - it's very difficult to (safely) implement anything which is already broken, as most of the systems today! Or - if you don't agree, list the systems that haven't been broken one time or other? Or - which will not be broken in future?

That goatse.cx is very old news and that there are whole new horrors I never even heard of.

Someone must be looking out for me.

You know you are living a blessed life when you've got no idea what 1man1jar or lemon party are. Reminds me of being a little kid and having no idea what the adults were talking about. Only this time I know the value of ignorance.

Let me see. 1 man 1 jar, must be about a man collecting pennies to buy a gift for his mother.

YouTube is supposed to be a kid-friendly place. Parents could do their best to try to responsibly monitor and guide their kids' surfing habits, but still fail because of this exploit. This is not funny, nor awesome. This is not someone finding a potential exploit and graciously letting Google know so they can patch it up. Just a bunch of 4channers screwing around, and to hell with the consequences. And people like you encouraging that type of behaviour.

Just because this is The Internet(TM), it doesn't mean that common courtesy need not apply.

From what I've seen, there were not only simple insults and racist annoyances, but numerous redirects to the hardest shock site you've probably ever seen. That video makes 2girls1cup, benzin.avi and even the hardest war-porn look like family-friendly softcore entertainment in comparison. It has something to do with 1 man and 1 jar and I dare you to Google that if you doubt this is emotionally scarring material.

This isn't a simple mistake, it's a sign of pure incompetence, since the developer put no forethought into the uses of the tool he was developing and blindly trusted user input from a textarea. User input is dirty, dirty, dirty and any developer who does not clean and sanitize it before consuming it is not doing his/her job.

The summary states that the first script tag was escaped as it should be. It was a bug, not a lack of foresight.

Reactively regexing offending tags such as "script", "object" or "embed" doesn't work if you don't know they exist. As such, it's easier to simply include functions in the programming language API that escape/unescape strings sent in through user input so that junk like that doesn't get echoed into something hazardous.
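A small Python sketch of why the blacklist approach falls short. The regex and the `strip_blacklisted` helper are invented for illustration; the payload uses an `img` tag with an `onerror` handler, which is one well-known way to run script without any of the three blacklisted tags.

```python
import html
import re

# Hypothetical blacklist: strips only the tags the author happened to think of.
BLACKLIST = re.compile(r"</?\s*(script|object|embed)\b[^>]*>", re.IGNORECASE)


def strip_blacklisted(text: str) -> str:
    """Remove opening and closing script, object and embed tags."""
    return BLACKLIST.sub("", text)


payload = '<img src=x onerror="alert(1)">'

# The blacklist has never heard of <img onerror=...>, so it passes untouched.
print(strip_blacklisted(payload) == payload)  # prints: True

# Escaping does not need to know which tags are dangerous.
print(html.escape(payload))
# prints: &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

The escape function works on characters rather than on a list of known-bad tags, which is why it keeps working when a new tag or attribute shows up.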

Does anyone understand what IF_HTML_FUNCTION is supposed to mean in the exploit code? As far as I can tell it's just plain text with no special meaning, it's just copied and pasted blindly from some previous code. Am I wrong?

Yeah, I was wondering about this too. I ran into the exploit last night and noticed that in the page source. Fortunately, all the injected code did was insert a marquee comment asserting the video poster's deviant sexuality while breaking the rest of the page.

Indeed, which is why everyone but Perl programmers uses library functions rather than writing their own regular expressions for working with markup. As a bonus you avoid little bugs like forgetting to escape '&', and it'll probably escape '"' and ''' as well so you can use it for attributes.
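Those two "little bugs" are easy to show side by side. A sketch in Python, where `naive_escape` is a made-up stand-in for the typical hand-rolled version and `html.escape` is the standard library function:

```python
import html


def naive_escape(text: str) -> str:
    """Hand-rolled escaper: handles < and >, forgets & and both quote types."""
    return text.replace("<", "&lt;").replace(">", "&gt;")


# A value meant for an attribute, e.g. title="...". The quote ends the
# attribute early and lets the attacker add an event handler.
value = '" onmouseover="alert(1)'

print(naive_escape(value))             # prints: " onmouseover="alert(1)
print(html.escape(value, quote=True))  # prints: &quot; onmouseover=&quot;alert(1)

# The forgotten ampersand: a user who literally typed "&lt;" gets it passed
# through by the naive version, so the browser shows "<" instead.
print(naive_escape("&lt;"))  # prints: &lt;
print(html.escape("&lt;"))   # prints: &amp;lt;
```

With `quote=True` (the default), `html.escape` also turns a single quote into `&#x27;`, so the result is safe inside either kind of quoted attribute.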