Syntax Highlighting for English?

Do you think it would be easier to read if all nouns were blue, all verbs were red, all adjectives green, and articles, periods, commas, and question marks were bold, etc.? It might take some getting used to, but after that it might make an interesting cognitive experiment.

How would inserting distracting, redundant information make things easier? If anything I think it would make it harder to read. Or are you saying you have trouble identifying whether a word is a noun or a verb without some kind of hint?

Highlighting a natural language (is that the right term?) such as English would not be like highlighting source code. When one reads English, one doesn't need highlighting to tell what is a verb and what isn't. We're so used to reading English that our brains don't need any clues or hints.

Unlike syntax highlighting for source code, I think that syntax highlighting for English would only be useful if you were doing some sort of analysis of text, like counting the average number of verbs per sentence. For ordinary reading it would just be annoying, unless the colours only contrasted slightly (which sort of defeats the purpose).
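To make the "analysis of text" case concrete, here is a rough sketch of counting the average number of verbs per sentence. The verb list and the sample text are made up for illustration, and the sentence splitting is deliberately naive; a real tool would need a proper tagger.

```python
import re

# Hypothetical, hand-picked verb list. A word that can also be a noun
# (e.g. "shoots") would be counted as a verb every time it appears.
VERBS = {"reads", "jumps", "is", "eats", "shoots", "leaves"}

def average_verbs_per_sentence(text):
    """Split on sentence punctuation and count known verbs per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    verb_count = sum(
        1
        for s in sentences
        for w in re.findall(r"[a-z']+", s.lower())
        if w in VERBS
    )
    return verb_count / len(sentences)

sample = "The eye jumps around. He reads code and eats lunch."
print(average_verbs_per_sentence(sample))  # 3 verbs over 2 sentences
```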

Another point: English is meant to be read in a linear fashion. When you read code your eye jumps around.

Besides, it would be a rather massive undertaking. You'd need a dictionary, not just a simple set of rules like you need for highlighting programming languages.

dwk

Besides, it would be a rather massive undertaking. You'd need a dictionary, not just a simple set of rules like you need for highlighting programming languages.

It would require a bit more than just a dictionary. Take this very good example, almost identical to the title of a well-known book:

"He eats shoots and leaves."

Is the word "shoots" a verb or a noun in this sentence? It could refer to shoots as in "bamboo shoots," in which case it is a noun. Or it could refer to shoots as in "shooting a weapon." Without context, there is no way to determine which sense is meant. It rises above the level of grammar into semantics.
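A quick sketch of why lookup alone falls short, assuming a tiny hand-written lexicon (the entries and function names here are invented for this example, not taken from any real dictionary database). A plain dictionary can only report every part of speech a word might be; it has nothing to choose between them.

```python
# Toy lexicon written for this illustration only.
LEXICON = {
    "he": {"pronoun"},
    "eats": {"verb"},
    "shoots": {"noun", "verb"},
    "and": {"conjunction"},
    "leaves": {"noun", "verb"},
}

def candidate_tags(sentence):
    """Return (word, sorted possible tags) per word, using lookup only."""
    words = sentence.lower().replace(",", " ").rstrip(".").split()
    return [(w, sorted(LEXICON.get(w, {"unknown"}))) for w in words]

def ambiguous_words(sentence):
    """Words for which the lexicon offers more than one tag."""
    return [w for w, tags in candidate_tags(sentence) if len(tags) > 1]

print(ambiguous_words("He eats shoots and leaves."))
```

Both "shoots" and "leaves" come back with two candidate tags, and adding or removing the comma does not change what the lookup returns. Picking a colour for either word needs something beyond the dictionary.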

I suppose you could argue that the presence of commas in certain places might make it unambiguous. But commas are not a part of natural English -- they were invented to make ambiguous written statements easier to parse. In spoken English there is no comma (although there is usually a brief pause where one would be).

And what if you're parsing ungrammatical or imperfect English? The highlighting would be in contradiction to the intended meaning and only serve to confuse the reader.

It's kind of like code getting out of sync with comments: whenever you have redundant information, there is a risk of it not being self-consistent. Better to have a single but potentially ambiguous source of information than risk introducing paradoxes.

Do you think it would be easier to read if all nouns were blue, all verbs were red, all adjectives green, and articles, periods, commas, and question marks were bold, etc.? It might take some getting used to, but after that it might make an interesting cognitive experiment.

Considering cboard's recent past, I can't tell if you're serious or simply a master of biting sarcasm. Good job on that.

I think it depends. We don't need it, so for us it would actually be an annoyance. But if someone were to grow up learning English that way, they would become dependent on it and would have a hard time without it. Again, it's all about conditioning. Unique idea, though.

How would inserting distracting, redundant information make things easier? If anything I think it would make it harder to read. Or are you saying you have trouble identifying whether a word is a noun or a verb without some kind of hint?

I think it would make reading easier. I would suggest code as a good example. If many of the words had a color, I could determine what type of word each one was without looking at its letters.

Highlighting a natural language (is that the right term?) such as English would not be like highlighting source code. When one reads English, one doesn't need highlighting to tell what is a verb and what isn't. We're so used to reading English that our brains don't need any clues or hints.

I could say the very same thing about code.

Another point: English is meant to be read in a linear fashion. When you read code your eye jumps around.

Besides, it would be a rather massive undertaking. You'd need a dictionary, not just a simple set of rules like you need for highlighting programming languages.

Actually, my eyes often jump around when I read English, and I often find myself reading code in a somewhat linear fashion. I'm not quite sure that this is relevant, though.

It would not be that large an undertaking. Dictionary databases already exist, and the use of artificial intelligence could make some things easier.
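As a rough idea of the dictionary-driven part, here is a minimal sketch that colours words with ANSI terminal codes using the scheme from the first post (nouns blue, verbs red, adjectives green, articles bold). The TAGS table is a made-up stand-in for a real dictionary database, with exactly one tag per word, so it sidesteps ambiguous words entirely.

```python
# Standard ANSI escape codes for terminal colours.
BLUE, RED, GREEN, BOLD, RESET = "\033[34m", "\033[31m", "\033[32m", "\033[1m", "\033[0m"
STYLE = {"noun": BLUE, "verb": RED, "adjective": GREEN, "article": BOLD}

# Hypothetical stand-in for a dictionary database: one tag per word.
TAGS = {"the": "article", "quick": "adjective", "fox": "noun", "jumps": "verb"}

def highlight(sentence):
    """Wrap each known word in the colour for its part of speech."""
    out = []
    for token in sentence.split():
        word = token.strip(".,?!").lower()
        style = STYLE.get(TAGS.get(word))
        # Words missing from the table are left unstyled.
        out.append(f"{style}{token}{RESET}" if style else token)
    return " ".join(out)

print(highlight("The quick fox jumps over the fence."))
```

Words missing from the table ("over", "fence") simply stay plain, which is about the best a lookup-only approach can do.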

It would require a bit more than just a dictionary. Take this very good example, almost identical to the title of a well-known book:

"He eats shoots and leaves."

Is the word "shoots" a verb or a noun in this sentence? It could refer to shoots as in "bamboo shoots," in which case it is a noun. Or it could refer to shoots as in "shooting a weapon." Without context, there is no way to determine which sense is meant. It rises above the level of grammar into semantics.

I suppose you could argue that the presence of commas in certain places might make it unambiguous. But commas are not a part of natural English -- they were invented to make ambiguous written statements easier to parse. In spoken English there is no comma (although there is usually a brief pause where one would be).

And what if you're parsing ungrammatical or imperfect English? The highlighting would be in contradiction to the intended meaning and only serve to confuse the reader.

It's kind of like code getting out of sync with comments: whenever you have redundant information, there is a risk of it not being self-consistent. Better to have a single but potentially ambiguous source of information than risk introducing paradoxes.

I would say that commas are part of natural written English.

If the English is ungrammatical or imperfect, then the system will not work as well. The same thing can be said about code. Another thing I could point out is that each code editor has a different scheme for syntax highlighting. There is no standard highlighting scheme, yet one has little trouble reading code in an unfamiliar code editor.