Introduction

Text written for a Web page must be formatted according to the rules of HTML or XHTML. Characters that have special meaning in HTML must be encoded and, because browsers collapse most whitespace, tags are required to mark line breaks and paragraphs.

However, sometimes an application needs to display text from a file or database. This text may have been entered by a user, or it may come from another source that did not format it for HTML. In these cases, the text must be converted.

Fortunately, the .NET library provides the HttpUtility.HtmlEncode() method to encode special characters so that they will appear as expected in a browser. However, this method does nothing about line breaks and paragraphs: newlines pass through unchanged, and the browser then collapses them into ordinary whitespace.
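To make that limitation concrete, here is a minimal sketch of what HttpUtility.HtmlEncode() does and does not do. The class name EncodeDemo and the sample string are only illustrations and are not part of the original routine.

```csharp
using System;
using System.Web;

// Hypothetical demo class; the name is an assumption made for illustration.
public static class EncodeDemo
{
    // Thin wrapper so the behavior is easy to inspect.
    public static string Encode(string text)
    {
        return HttpUtility.HtmlEncode(text);
    }
}

// Usage:
//   string result = EncodeDemo.Encode("if (a < b && c > d)\nnext line");
//   // result is "if (a &lt; b &amp;&amp; c &gt; d)\nnext line"
//   // The special characters are encoded, but the newline is still a plain
//   // newline, so a browser would show both lines run together.
```

On newer .NET versions that do not reference System.Web, System.Net.WebUtility.HtmlEncode() is an alternative with similar behavior.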

When a Web application needs to display unformatted text that contains multiple lines and/or paragraphs on a Web page, a little more work is required.

Writing a Text-to-HTML Converter

The ToHtml() method converts blocks of text separated by two or more newlines into paragraphs (using <p></p> tags). It converts single newlines into line breaks (using <br> tags). And it calls HttpUtility.HtmlEncode() to HTML-encode special characters.
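The following is a minimal sketch of the paragraph and line-break handling just described, not the article's full listing. It assumes that line endings can be normalized to "\n" first and that whitespace-only blocks should be skipped. The class name TextToHtmlSketch is hypothetical, and link handling is left out.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

// Hypothetical class name; a simplified sketch of the described behavior.
public static class TextToHtmlSketch
{
    public static string ToHtml(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            return string.Empty;

        // Assumption: treat \r\n and lone \r as \n so one rule covers all inputs.
        string normalized = text.Replace("\r\n", "\n").Replace('\r', '\n').Trim();

        // Two or more consecutive newlines separate paragraphs.
        string[] blocks = Regex.Split(normalized, @"\n{2,}");

        var sb = new StringBuilder();
        foreach (string block in blocks)
        {
            string trimmed = block.Trim();
            if (trimmed.Length == 0)
                continue; // skip blocks that contain only whitespace

            // Encode first, so the tags added below are not encoded themselves.
            string encoded = HttpUtility.HtmlEncode(trimmed);

            // Remaining single newlines become line breaks.
            sb.Append("<p>").Append(encoded.Replace("\n", "<br>")).Append("</p>");
        }
        return sb.ToString();
    }
}

// Usage:
//   TextToHtmlSketch.ToHtml("Line one\nLine two\n\nSecond <para>")
//   // returns "<p>Line one<br>Line two</p><p>Second &lt;para&gt;</p>"
```

Encoding each block before inserting the tags is the important ordering here. Encoding afterwards would turn the generated <p> and <br> tags into literal text.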

In addition, this method supports a special syntax for specifying links. Regular <a> tags would be encoded by this method and displayed as literal text, so users need another way to specify a hyperlink.

This syntax uses double square brackets ([[ and ]]). For example, [[http://www.blackbeltcoder.com]] produces a hyperlink with http://www.blackbeltcoder.com as both the anchor text and the target URL.

You can also specify two text values in the form [[Black Belt Coder][http://www.blackbeltcoder.com]]. This produces a hyperlink with Black Belt Coder as the anchor text and http://www.blackbeltcoder.com as the target URL.
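One way to handle both forms of this syntax is sketched below, using a regular expression. This is an illustration under stated assumptions rather than the article's actual implementation: the class name LinkSyntaxSketch, the method name EncodeWithLinks, the exact pattern, and the decision to accept only http and https URLs are all choices made for this example.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

// Hypothetical class and method names; a sketch of the link syntax only.
public static class LinkSyntaxSketch
{
    // Matches [[url]] or [[text][url]]; neither part may contain brackets.
    private static readonly Regex LinkPattern =
        new Regex(@"\[\[([^\[\]]+)\](?:\[([^\[\]]+)\])?\]");

    public static string EncodeWithLinks(string text)
    {
        if (string.IsNullOrEmpty(text))
            return string.Empty;

        var sb = new StringBuilder();
        int position = 0;
        foreach (Match match in LinkPattern.Matches(text))
        {
            // Encode the plain text that precedes this link.
            sb.Append(HttpUtility.HtmlEncode(
                text.Substring(position, match.Index - position)));

            string anchorText = match.Groups[1].Value.Trim();
            string url = match.Groups[2].Success
                ? match.Groups[2].Value.Trim()
                : anchorText; // single form: the URL doubles as the anchor text

            // Assumption: only http/https targets are turned into links.
            bool allowed =
                url.StartsWith("http://", StringComparison.OrdinalIgnoreCase) ||
                url.StartsWith("https://", StringComparison.OrdinalIgnoreCase);

            if (allowed)
            {
                sb.Append("<a href=\"")
                  .Append(HttpUtility.HtmlAttributeEncode(url))
                  .Append("\">")
                  .Append(HttpUtility.HtmlEncode(anchorText))
                  .Append("</a>");
            }
            else
            {
                // Anything else is shown as encoded literal text.
                sb.Append(HttpUtility.HtmlEncode(match.Value));
            }
            position = match.Index + match.Length;
        }

        // Encode whatever follows the last link.
        sb.Append(HttpUtility.HtmlEncode(text.Substring(position)));
        return sb.ToString();
    }
}

// Usage:
//   LinkSyntaxSketch.EncodeWithLinks(
//       "See [[Black Belt Coder][http://www.blackbeltcoder.com]] now")
//   // returns "See <a href=\"http://www.blackbeltcoder.com\">Black Belt Coder</a> now"
```

Restricting the scheme is a cautious choice for user-submitted text, because it keeps values such as javascript: URLs from becoming clickable links.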

If you are simply taking unformatted text and displaying it on a Web page, then this special link syntax won't come into play (the double square brackets are unlikely to occur naturally). But if you or your users want the ability to submit Web content in plain text, this provides an easy syntax to specify hyperlinks.

The attached source code download includes a slightly more extensive example.

Conclusion

This code is based on a routine that I use all the time. For example, I might use it where users can enter text in a <textarea> box.

I also use it where my software allows me to enter quick comments. Rather than having to write correct HTML code, I can just type plain text and my code generates the HTML for me. And in cases where I want to generate a simple hyperlink, this code supports that as well.

Of course, this is a simple routine and there are much more sophisticated ways of converting text to HTML. For example, Markdown is a sort of hybrid syntax between plain text and HTML, and gives you much more control over HTML formatting while still entering primarily plain text. However, that approach would be overkill for many applications.

For applications that must convert text to HTML and don't need a lot of bells and whistles, the code I've presented in this article should do quite nicely.