An intermediate way to generate Indic PDFs using iText/iTextSharp

All about generating PDF files for Indic scripts using iText/iTextSharp.

Introduction

I work with Indic scripts in my project, and I have run into many problems because libraries, languages, and frameworks offer only partial support for them, or none at all. I have been working through these problems one by one. One of them is generating PDFs dynamically.

iTextSharp is a wonderful library for generating PDF files dynamically. It supports Unicode, but not Indic scripts. iText/iTextSharp has a ready answer for this lack of support, but no solution. I am not criticizing iText/iTextSharp here: because it is open source, we have the option to improve it.

Background and Terms

I assume you have a basic idea of iText/iTextSharp and GIST, and of the terms ISCII, Unicode, and ISFOC.

My Intermediate Solution

My solution goes back to older technology: I use GIST (developed by CDAC) and ISCII.

Step 1: First you have to convert the Unicode HTML string to an ISCIIHtmlString.

Step 2: Convert the ISCII string to ISFOC and send this string to the iText HtmlParser.

The two conversions in steps 1 and 2 can be merged by converting the Unicode string directly to ISFOC. GIST exposes a method for this; refer to the GIST documentation.

Note: The GIST conversion method mentioned above is written in VB.NET, so you have to compile it as a separate library and then consume it from your project. I also tried calling it from C# with ref variables, but did not get useful results; for that route, the method needs to be written in C#.
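The difference between the two-step route and the merged route can be sketched as follows. This is only an illustration of the data flow, not GIST or iText code: the converter and parser arguments (`unicode_to_iscii`, `iscii_to_isfoc`, `unicode_to_isfoc`, `parse_html`) are hypothetical placeholders that you would replace with wrappers around the real GIST conversion calls and the iText HtmlParser.

```python
def render_two_step(html, unicode_to_iscii, iscii_to_isfoc, parse_html):
    """Steps 1 and 2 as written: Unicode -> ISCII -> ISFOC -> parser.

    All three callables are placeholders supplied by the caller;
    none of them is a real GIST or iText function name.
    """
    iscii_html = unicode_to_iscii(html)      # step 1
    isfoc_html = iscii_to_isfoc(iscii_html)  # step 2, conversion part
    return parse_html(isfoc_html)            # step 2, hand-off to the parser


def render_merged(html, unicode_to_isfoc, parse_html):
    """Merged variant: Unicode -> ISFOC in a single call, then the parser."""
    return parse_html(unicode_to_isfoc(html))
```

Passing the converters in as arguments keeps the sketch independent of how the VB.NET library is actually wrapped, so either route can be swapped in without touching the caller.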

Advantages

The converted text is compact compared with Unicode, because ISCII is an 8-bit encoding.

Limitations

It inherits all the limitations of ISCII.

It supports bilingual text only (i.e., English + one Indic language).

It is not recommended, and fails, when the text contains multiple Indic languages, because it cannot tell one Indic language from another. You can overcome this by splitting the string by language before conversion. That is difficult to do at runtime, but it is possible, so it is worth a try.
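One way to attempt the splitting is to cut the Unicode string wherever the Indic script changes, before any conversion to ISCII/ISFOC. The sketch below is a minimal illustration under stated assumptions: it identifies scripts (not languages) from their Unicode block ranges, it treats everything outside those blocks (English letters, digits, spaces, punctuation) as neutral text that stays with the current run, and the names `indic_script` and `split_by_script` are made up for this example. Languages that share a script, such as Hindi and Marathi in Devanagari, are not distinguished.

```python
# Unicode block ranges of the major Indic scripts.
INDIC_BLOCKS = [
    (0x0900, 0x097F, "Devanagari"),
    (0x0980, 0x09FF, "Bengali"),
    (0x0A00, 0x0A7F, "Gurmukhi"),
    (0x0A80, 0x0AFF, "Gujarati"),
    (0x0B00, 0x0B7F, "Oriya"),
    (0x0B80, 0x0BFF, "Tamil"),
    (0x0C00, 0x0C7F, "Telugu"),
    (0x0C80, 0x0CFF, "Kannada"),
    (0x0D00, 0x0D7F, "Malayalam"),
]


def indic_script(ch):
    """Return the Indic script name for one character, or None if it is not Indic."""
    cp = ord(ch)
    for low, high, name in INDIC_BLOCKS:
        if low <= cp <= high:
            return name
    return None


def split_by_script(text):
    """Split text into (script, substring) runs with at most one Indic script each.

    Non-Indic characters never start a new run, so every run is
    "English + one Indic script", which is what the bilingual conversion expects.
    A run with no Indic characters at all is reported with script None.
    """
    runs = []
    script = None  # Indic script of the current run, None until one is seen
    start = 0
    for i, ch in enumerate(text):
        s = indic_script(ch)
        if s is None:
            continue
        if script is None:
            script = s
        elif s != script:
            # A different Indic script begins here: close the current run.
            runs.append((script, text[start:i]))
            start, script = i, s
    if start < len(text):
        runs.append((script, text[start:]))
    return runs
```

Each run can then be converted separately with the font and language setting that matches its script. Joining the substrings of all runs gives back the original string, so no text is lost by the split.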