If I want to validate the input of a <textarea> so that it contains, for example, only numerical values, but I also want to let users insert new lines, I can select the wanted characters with a JavaScript regex that also includes the whitespace characters.

/[0-9\s]/

The question is: can a whitespace character be used to perform injections, XSS (even though I think this last option is impossible), or any other type of attack?

thanks

update

Hi Polynomial, thanks for the advice. Along with JavaScript there must be a PHP server-side check, you're right, but my question is about a case in which I just want to retrieve, or write, some file info through JavaScript, for example via ActiveX objects. In that case there could be a saved file with malicious code in it that was not filtered at the moment of input filtering.

It's hard to tell what you are doing. In general, whitespace and numeric input cannot lead to an injection attack... However, you may be missing something far more dangerous. Also, your regex only matches a single character...
– rook, Nov 7 '12 at 22:54

Thank you, yes I know, it was just a sample. Anyway, I just need to run a script locally, not on a web server, which is why I am using JavaScript to validate the input.
– webose, Nov 7 '12 at 23:00

@webose hacking is a very creative act. It's about looking at the application as a whole and using the implemented functionality in a way that the developer never intended. If you are worried about security you should hire a penetration tester.
– rook, Nov 7 '12 at 23:46

2 Answers

However... In theory, there are far-fetched situations where whitespace could lead to an injection attack. For instance, suppose that the program executes a command like the following:

echo Something bad happened: code $n >> logfile

where $n is the request parameter you sanitized above. If the attacker uses a value like 42\n17 (where \n represents a newline), this will turn into two commands:

echo Something bad happened: code 42
17 >> logfile

The first command will execute, but its output will not be saved. The second command will trigger a "command not found" error (there is no program called 17), so it will not execute. The consequence is that no message will be appended to the logfile, contrary to the programmer's intent.
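To make the splitting concrete, here is a minimal sketch in JavaScript. The helper `buildLogCommand` is hypothetical and only mirrors the echo command above by plain string concatenation; it assumes the value is interpolated into the command without any quoting. It shows that a value made only of digits and whitespace passes an anchored digits-and-whitespace filter and still breaks the command into two lines.

```javascript
// Hypothetical helper: builds the shell command from the example above
// by plain string concatenation, with no quoting of the parameter.
function buildLogCommand(n) {
  return "echo Something bad happened: code " + n + " >> logfile";
}

// The attacker's value: only digits and a newline.
const value = "42\n17";

// An anchored digits-and-whitespace filter accepts it, since \s matches "\n".
const passesFilter = /^[0-9\s]*$/.test(value);

// The resulting command text now spans two lines, i.e. two shell commands.
const commandLines = buildLogCommand(value).split("\n");

console.log(passesFilter);  // true
console.log(commandLines);  // [ 'echo Something bad happened: code 42', '17 >> logfile' ]
```

If newlines are not actually needed in a value that ends up in a command, restricting the filter to `[0-9]` (or to digits and plain spaces) avoids this case.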

This is very unlikely in almost any program you are likely to run across, so you can probably ignore it. But in theory, I suppose it could happen.

P.S. Your regexp looks dubious. As @Rook says, it only matches a single character. Also, depending on how it is used, you may need to anchor it. Maybe you meant something like this:
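As a minimal sketch of what an anchored version could look like (the function name `isDigitsAndWhitespace` is hypothetical, and allowing the empty string via `*` is an assumption; use `+` to require at least one character):

```javascript
// Hypothetical helper: true only if EVERY character is a digit or whitespace.
function isDigitsAndWhitespace(input) {
  // ^ and $ anchor the pattern to the whole string; * repeats the class.
  return /^[0-9\s]*$/.test(input);
}

// The unanchored original only asks "is there at least one digit or
// whitespace character somewhere in the input?"
const unanchored = /[0-9\s]/;

console.log(unanchored.test("<script>1</script>"));            // true
console.log(isDigitsAndWhitespace("<script>1</script>"));      // false
console.log(isDigitsAndWhitespace("12\n34"));                  // true
```

The first result is the problem the comments point at: without anchors and a quantifier, a single digit anywhere in the input is enough for the test to succeed.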

Really old versions of SpiderMonkey followed a part of the specification that said that all format control characters (Cf) were removed from program text before parsing started.

This meant that

"\CF\",alert(1337)//"

where CF is an invisible format control character, appeared to be a single string but was actually treated by the parser as equivalent to

"\\",alert(1337)//"

or, with the parsed tokens separated by whitespace,

"\\" , alert ( 1337 ) //"

If a poorly written JSON parser assumed that \ followed by any character was an escape sequence before forwarding the string to eval, then an attacker who controlled the input could get arbitrary JavaScript code passed to eval.
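The mismatch can be sketched without calling eval. In the snippet below, `naiveStringLiteral` is a hypothetical stand-in for such a poorly written check, `stripFormatControls` only simulates the old behaviour described above by deleting Cf characters, and U+200E (LEFT-TO-RIGHT MARK) is picked as an example Cf character. None of these names come from a real parser.

```javascript
// Hypothetical naive check: a double-quoted string in which a backslash
// is allowed to escape ANY following character.
const naiveStringLiteral = /^"(?:[^"\\]|\\[\s\S])*"$/;

// Simulates the old engine behaviour: format-control (Cf) characters are
// removed from the program text before parsing.
function stripFormatControls(source) {
  return source.replace(/\p{Cf}/gu, "");
}

const CF = "\u200E"; // LEFT-TO-RIGHT MARK, an invisible Cf character

// The payload from above: "\CF\",alert(1337)//"
const payload = '"\\' + CF + '\\",alert(1337)//"';

// The naive check sees one string: \CF and \" both look like escapes.
console.log(naiveStringLiteral.test(payload));   // true

// After stripping, the text is "\\",alert(1337)//" and the string ends early.
const parsedAs = stripFormatControls(payload);
console.log(parsedAs === '"\\\\",alert(1337)//"'); // true
console.log(naiveStringLiteral.test(parsedAs));  // false
```

The validator and the parser disagree about where the string literal ends, which is the gap the attack relied on.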

This attack is only of historical interest now, but don't assume similar mistakes won't be made again.

Programmers often write regular expression filters that include . but forget that it doesn't match newlines.
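A short sketch of that pitfall, using a made-up blacklist pattern and input purely for illustration:

```javascript
// Hypothetical blacklist filter that tries to spot a script tag.
const dotFilter = /<script.*>/i;

// Same idea, but [\s\S] matches every character, including newlines.
const anyCharFilter = /<script[\s\S]*>/i;

// A newline inside the tag is enough to slip past the "." version.
const input = "<script\nsrc=x>";

console.log(dotFilter.test(input));      // false: "." stops at the newline
console.log(anyCharFilter.test(input));  // true
```

In engines that support it, the `s` (dotAll) flag also makes `.` match newlines.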

Complicated error recovery modes in parsers (like CSS parsers) often use line breaks as places to restart parsing, so newlines might be useful in causing quoted content to be interpreted as code, for example when a parser restarts parsing because it saw what seems to be an unclosed string literal.

Whitelist, so that any odd uses of invisible characters get thrown away.