I've encountered some strangeness while using RSyntaxTextArea. The document I'm editing sometimes contains hidden newlines. Hidden in the sense that they do not affect how the document is displayed (the line does not visibly break at that point), but they are there anyway.

Sometimes when I delete newlines while editing (Backspace), these hidden newlines remain. If I save such a document to a file on the filesystem and reopen it, the document is formatted the same way as before the change...as if nothing had happened.

I know that these hidden characters are there, since I can navigate over them using the keyboard arrow keys after deleting a newline. There's a "blank" hit when I move the caret using the aforementioned keys (at some point I have to press the left or right arrow twice to move the caret one unit to the left or right respectively). If the actions behind these keys don't ignore these hidden newlines, why does the action that handles deletion ignore them? Note that these actions come bundled with RSTA and were left untouched.

Both the RSTA XML Tokenizer and the third-party parser I'm using also see these newlines. When I copy-pasted some XML from an email that had been mangled by the email client (some tag names and text values were split in two by additional newlines) and attempted to edit it manually to remove those newlines, I actually ended up with a tag that looked like a proper XML tag, let's say <diskUsageApp>, but was colored as if the "App" part were an XMLAttribute token! And the parser of course complained about improper syntax of the "App" attribute...

I suspect it has something to do with those evil null tokens at the end of the lines. Any idea as to what might be causing this?

I get this using 2.0.3, 2.0.2 or 2.0.1 (unsure).

Edit: It is actually deleting with the Delete key that seems inconsistent (not Backspace). I cannot delete the invisible newline with Delete, but I can with Backspace (it is aware of them, like the arrow keys). The thing is... I shouldn't have to bother with deleting something that cannot be seen.

Hmm, my guess would be carriage returns are getting included when you copy-and-paste text into RSTA, or possibly some other special character. If a non-printable character somehow gets into RSTA (or any Swing JTextComponent), you'll see behavior similar to what you have - an effectively 0-pixel width "character" that the caret seems to honor but may be a little funky.

Carriage returns are funky with JTextComponents since JTextComponents use only newlines to represent line breaks internally; typically, when copy/pasting, the new text comes to Swing as a java.lang.String, with e.g. "\r\n" pre-morphed into just "\n", so everything just works. This is what happens when reading text files into JTextComponents via the .read() method, for example. But you may have found some scenario where a non-printable character is getting through somehow.
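To illustrate the read() behavior described above, here is a minimal sketch. It assumes a plain javax.swing.text.DefaultEditorKit and PlainDocument as stand-ins for whatever kit and document RSTA actually installs; the class and method names (ReadNormalizesDemo, load, textOf) are made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultEditorKit;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;

public class ReadNormalizesDemo {

    /** Loads raw text into a fresh document through the editor kit's read(). */
    static Document load(String raw) {
        try {
            Document doc = new PlainDocument();
            new DefaultEditorKit().read(new StringReader(raw), doc, 0);
            return doc;
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the whole document content as a String. */
    static String textOf(Document doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = load("first\r\nsecond\r\n");

        // The document itself holds only '\n' line breaks.
        System.out.println("contains CR: " + textOf(doc).contains("\r"));

        // The kit records the line ending it saw as a document property.
        Object eol = doc.getProperty(DefaultEditorKit.EndOfLineStringProperty);
        System.out.println("recorded CR LF: " + "\r\n".equals(eol));
    }
}
```

As far as I can tell, this is the same path JTextComponent.read() takes, which is why text loaded that way never shows the zero-width character.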

Ah... For some reason I populated RSTA's document using a java.util.Scanner and JTextComponent.setText(String), even though I should always use JTextComponent's dedicated methods to do this. Not sure why I did it this way, but the implementation normalized line endings to whatever System.getProperty("line.separator") returns. On Windows that would be CR LF, but since, like you say, text components only handle LFs internally, all hell broke loose.
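For anyone who still needs to go through setText(String), a rough sketch of the workaround is to collapse every line ending to a bare '\n' first. The helper name toLf is hypothetical, and PlainDocument.insertString() is used here only as a stand-in for what setText() ends up doing to the document.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.PlainDocument;

public class NormalizeBeforeSetText {

    /** Hypothetical helper: turns CR LF and lone CR into a single LF. */
    static String toLf(String raw) {
        return raw.replace("\r\n", "\n").replace('\r', '\n');
    }

    /** Inserts the text into a fresh document and reports its length. */
    static int lengthInDocument(String text) {
        try {
            PlainDocument doc = new PlainDocument();
            doc.insertString(0, text, null);
            return doc.getLength();
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String windowsText = "disk\r\nUsage";

        // The CR is stored verbatim, so the document is one char longer.
        System.out.println(lengthInDocument(windowsText));       // 11
        // After normalizing, only the LF remains.
        System.out.println(lengthInDocument(toLf(windowsText))); // 10
    }
}
```

The extra character in the first case is the zero-width "blank" position the caret stops on.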

I think I remember experiencing difficulties when saving and loading files using RSTA's dedicated methods. Documents got saved, but newlines appeared out of nowhere. The reason for this could be that I've so far considered using System.getProperty("line.separator") everywhere a good practice. It seems that I've been wrong to assume that around text components.
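One mechanism that could produce newlines "out of nowhere" is sketched below. It assumes the document already contains stray CRs and is then written through DefaultEditorKit.write() with a CR LF end-of-line property; whether RSTA's own save path behaves exactly like this is an assumption, and the class and method names (StrayCrOnSave, save) are made up.

```java
import java.io.IOException;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultEditorKit;
import javax.swing.text.PlainDocument;

public class StrayCrOnSave {

    /** Writes the given document text using the given end-of-line string. */
    static String save(String documentText, String eol) {
        try {
            PlainDocument doc = new PlainDocument();
            doc.insertString(0, documentText, null);
            // The kit's write() consults this property for line endings.
            doc.putProperty(DefaultEditorKit.EndOfLineStringProperty, eol);

            StringWriter out = new StringWriter();
            new DefaultEditorKit().write(out, doc, 0, doc.getLength());
            return out.toString();
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The document holds a CR LF that arrived via setText().
        String saved = save("one\r\ntwo", "\r\n");

        // Only the '\n' is expanded, so the stray CR ends up doubled.
        System.out.println(saved.equals("one\r\r\ntwo")); // true
    }
}
```

Some editors show a CR CR LF sequence as an extra blank line, which would match the symptom.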

Thank you.

Edit: BTW, I asked a question on SO that touches on this matter. We'll see if there's a guru out there who knows the answer.

Aye, sounds good. I think it's always best to use the read() and write() methods whenever possible, as it'll handle things like this.
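A small sketch of why read() and write() pair up well, again assuming a plain DefaultEditorKit and PlainDocument rather than RSTA's own classes (ReadWriteRoundTrip and roundTrip are names invented for the example):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultEditorKit;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;

public class ReadWriteRoundTrip {

    /** Reads raw text into a document, then writes it straight back out. */
    static String roundTrip(String raw) {
        try {
            DefaultEditorKit kit = new DefaultEditorKit();
            Document doc = new PlainDocument();
            kit.read(new StringReader(raw), doc, 0);

            StringWriter out = new StringWriter();
            kit.write(out, doc, 0, doc.getLength());
            return out.toString();
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String windowsFile = "a\r\nb\r\n";
        String unixFile = "a\nb\n";

        // Each file comes back with the line ending it was read with.
        System.out.println(roundTrip(windowsFile).equals(windowsFile)); // true
        System.out.println(roundTrip(unixFile).equals(unixFile));       // true
    }
}
```

The document only ever sees '\n' in between; the original line ending is restored on the way out.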

I personally always wished Documents remembered the actual line break char(s) read for each individual line, so even if they aren't uniform in a single file, you could load/save it without them being "normalized" in some way. But oh well.

My personal guess as to why '\n' was used is that it's simple, just a single char, and there doesn't have to be logic to remember the line break char(s) for each single line. When appending via insertString() or using setText(), it's one of those things you simply have to remember: use '\n' for line breaks. Sorry, I know that's not an explanation.