Hi,
We are trying to implement security in our application, wherein we need to encode and decode the user inputs.

So can anybody please provide me a list of all the characters that are disallowed or dangerous, that I need to encode?

For example, for the "<" character we use &lt;, and for the ">" character we use &gt;

So can anybody please tell me whether the following characters can be used for XSS and, if so, how to encode them?

1) ! - exclamation mark - characters for additional command execution

2) - hyphen - can be used in database queries, and the creation of negative numbers.

3) /\ = The forward-slash and back-slash are often used for faking paths and queries

4) { } [ ] = Curly brackets and square brackets are often used as script, program or regex expressions.

5) *(asterisk) = Often used in database queries for “all”.

eg. <script>x=""*alert(1)*"";y=42;</script>

6) `(Grave accent) = If you need to use both double and single quotes you can use a grave accent(`) to encapsulate the JavaScript string - this is also useful because lots of cross site scripting filters don't know about grave accents.

<IMG SRC=`javascript:alert("Hello, 'XSS'")`>

7) / (division or forward slash) -

<script>x=""/alert(1)/"";y=42;</script>

8) Bitwise “xor” operator: (^)

<script>x=""^alert(1)^"";y=42;</script>

9) Bitwise Left Shift (<<)

<script>x=""<<alert(1)<<"";y=42;</script>

10) Bitwise Right Shift (>>)

<script>x="">>alert(1)>>"";y=42;</script>

11) Bitwise Right Shift With Zeros

<script>x="">>>alert(1)>>>"";y=42;</script>

12) Ternary Conditional Expression

<script>x=""?alert(1):"";y=42;</script>

Please let me know if I need to encode these characters too. I am using Java for development.

Which inputs are safe depends on the context. It sounds like you're trying to build a blacklist, which is an inherently treacherous approach. The safer and easier option is to use a whitelist: i.e. allow [a-zA-Z0-9] and remove or carefully encode everything else.

Also see:
https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet
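
To illustrate the whitelist idea, here is a minimal Java sketch. The class name WhitelistCheck, the method names and the 64-character limit are made up for this example; adjust the allowed set to whatever the field really needs.

```java
import java.util.regex.Pattern;

class WhitelistCheck {
    // Assumed rule for illustration: 1 to 64 ASCII letters or digits, nothing else.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z0-9]{1,64}");

    /** Returns true only if the whole input consists of allowed characters. */
    static boolean isAllowed(String input) {
        // matches() requires the entire string to fit the pattern.
        return input != null && ALLOWED.matcher(input).matches();
    }

    /** Alternative to rejecting: drop every character outside the whitelist. */
    static String stripDisallowed(String input) {
        return input.replaceAll("[^a-zA-Z0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("user42"));
        System.out.println(isAllowed("<script>"));
        System.out.println(stripDisallowed("a<b>c"));
    }
}
```

Rejecting input that fails isAllowed is usually easier to reason about than silently stripping characters, since stripping changes what the user typed.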

Thanks for the quick reply. Yes, we do want to display the code to the user. In our application, we take the user input and encode certain characters in it, which are listed below. The encoded value is then inserted into the database. There are several places in the application where we display this encoded value, i.e. the user input, to the user.

So if the user enters a malicious string or code, that string will be searched for a list of characters like "<", ">" etc., listed below. If a match is found, the character will be encoded.

We were using AppScan tool to test our application, and found that there are 17 characters which are vulnerable to XSS, so we must encode them. Please see the following link for the same.

http://www.51testing.com/?uid-13997-act ... emid-77651

Now I need to find out whether there are any other disallowed characters that may be vulnerable to XSS. After googling, I found the characters above, but I need to be sure that we really need to encode them.

I know that there are many pre-defined functions available for encoding, like the JavaScript function encodeURIComponent() that you mentioned, but in our application we will be maintaining a whitelist of characters, stored in the database in something like the following form.

[< &lt;],[> &gt;], and so on for other characters.

Here "[]" (square brackets) are used to contain each vulnerable character. Each pair of brackets contains the character, followed by a space, followed by the HTML entity code for that character. So [< &lt;] contains the less-than (<) character followed by its HTML entity code.

So when the application starts, the application will query the Database to get all the characters, and keep them in a map.

So when the user inputs something, the application will check the string against the characters in the map, and if a match is found, then replace the character with its equivalent Html entity code.
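
The lookup described above could be sketched roughly like this in Java. The class EntityMapEncoder and the five sample entries are only placeholders for whatever is actually loaded from the database, and including "&" in the map is an assumption on my part. The string is walked once, character by character, so an entity that has just been written out is never re-scanned and encoded a second time within the same call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EntityMapEncoder {
    private final Map<Character, String> entities;

    EntityMapEncoder(Map<Character, String> entities) {
        this.entities = entities;
    }

    /** Hypothetical stand-in for the rows loaded from the database at startup. */
    static EntityMapEncoder withSampleEntries() {
        Map<Character, String> m = new LinkedHashMap<>();
        m.put('&', "&amp;");
        m.put('<', "&lt;");
        m.put('>', "&gt;");
        m.put('"', "&quot;");
        m.put('\'', "&#39;");
        return new EntityMapEncoder(m);
    }

    /** Replaces each mapped character with its entity in a single pass. */
    String encode(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String entity = entities.get(c);
            if (entity != null) {
                out.append(entity);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(withSampleEntries().encode("<b>Tom & \"Jerry\"</b>"));
    }
}
```

Note that calling encode on text that was already encoded will encode it again ("&lt;" becomes "&amp;lt;"), which is one of the problems that comes up when encoded values are stored and later encoded a second time for display.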

So I have been asked to find out whether any other characters, apart from the 17 listed above, are vulnerable to XSS.

Never encode/convert user data upon _input_ into a database. Escape it, cast it if you want, and encode only upon _output_, or block/deny it when they submit it. Even removing characters can lead you to a whole new ballgame, considering the multitude of issues that can arise.
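
A rough Java sketch of what "encode only upon output" can look like: the raw value is stored unchanged, and a different encoder is applied depending on where it is written. The class OutputEncoding and both method names are invented for this example, and the rules are deliberately simple; a maintained library such as the OWASP Java Encoder is probably the better choice in production. The script examples earlier in the thread, such as x=""*alert(1)*"", only work because a double quote ends the JavaScript string, which is why the second method escapes everything except letters and digits.

```java
class OutputEncoding {
    /** Encodes for an HTML element body or a quoted attribute value. */
    static String forHtml(String raw) {
        StringBuilder out = new StringBuilder();
        for (char c : raw.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Encodes for the inside of a quoted JavaScript string literal. */
    static String forJavaScriptString(String raw) {
        StringBuilder out = new StringBuilder();
        for (char c : raw.toCharArray()) {
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum) {
                out.append(c);
            } else {
                // Everything else becomes a four-digit hex escape.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String stored = "\"*alert(1)*\""; // raw value, exactly as the user typed it
        System.out.println(forHtml(stored));
        System.out.println(forJavaScriptString(stored));
    }
}
```

For the database side, a parameterized query (java.sql.PreparedStatement) keeps the raw value intact without any character replacement.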