How to convert Arabic HTML to PDF?

Why are the characters generated as "?" characters?

5th November 2015

admin-marketing

I am having difficulties displaying Arabic characters from HTML content in a PDF. The characters are generated as "?" characters. I am able to display Arabic text from a String value, but not from HTML. I want the PDF to have two columns: English text on the left side and Arabic text on the right side.

Please take a look at the ParseHtml7 and ParseHtml8 examples. They take HTML input with Arabic characters and they create a PDF with the same Arabic text:

A PDF table with HTML content

An HTML table in PDF

Before we look at the code, allow me to explain that it's not a good idea to use non-ASCII characters in source code. For instance, you shouldn't write this:

htmlContentAr = "رقم التعميم";

You never know how a Java file containing these glyphs will be stored. If it's not stored as UTF-8, the characters may end up looking like something completely different. Version control systems are known to have problems with non-ASCII characters, and even compilers can get the encoding wrong. If you really want to store hard-coded String values in your code, use the Unicode escape notation (\uXXXX).
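To make the Unicode notation concrete, here is a minimal sketch that uses only the standard Java library. The class name UnicodeEscapeSketch and its two helper methods are hypothetical names invented for this illustration; they are not part of iText. The first helper rewrites non-ASCII characters as escapes that you can paste into a source file. The second one shows how a charset that can't represent Arabic replaces every character with "?" when the text is encoded, which is one plausible way to end up with question marks.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not an iText API).
final class UnicodeEscapeSketch {

    /** Rewrites every non-ASCII UTF-16 code unit as a backslash-u escape. */
    static String toUnicodeEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                sb.append(c); // plain ASCII is kept as-is
            } else {
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    /** Encodes and decodes with the same charset to show what survives. */
    static String roundTrip(String text, Charset charset) {
        // getBytes replaces characters the charset can't map
        // with the charset's replacement byte.
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // The Arabic phrase from the snippet above, written as pure ASCII source.
        String arabic = "\u0631\u0642\u0645 \u0627\u0644\u062A\u0639\u0645\u064A\u0645";

        // Prints the escaped form, ready to paste into a Java file.
        System.out.println(toUnicodeEscapes(arabic));

        // ISO-8859-1 has no Arabic letters, so each one comes back as '?'.
        System.out.println(roundTrip(arabic, StandardCharsets.ISO_8859_1));
    }
}
```

A round trip through StandardCharsets.UTF_8 returns the original text unchanged, which is why it matters that every file and every read operation in the chain agrees on UTF-8.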

For the examples shown in the screen shots, I saved the following files using UTF-8 encoding:

The second part of your problem concerns the font. It is important that you use a font that knows how to draw Arabic glyphs. It is hard to believe that you have arial.ttf right at the root of your C: drive. That's not a good idea. I would expect you to use C:/windows/fonts/arialuni.ttf which certainly knows Arabic glyphs.

Selecting the font isn't sufficient. Your HTML needs to know which font family to use. Because most of the examples in the documentation use Arial, I decided to use a Noto font. I really like these fonts because they are nice and (almost) every language is supported. For instance, I used NotoNaskhArabic-Regular.ttf, which means that I need to define the font family like this:

style="font-family: Noto Naskh Arabic"

I defined the style in the body tag of my XML, but you can choose where to define it: in an external CSS file, in the styles section of the <head>, at the level of a tag, and so on. That choice is entirely yours, but you have to define somewhere which font to use.

Of course, when XML Worker encounters font-family: Noto Naskh Arabic, iText doesn't know where to find the corresponding NotoNaskhArabic-Regular.ttf unless we register that font. We can do this by creating an instance of the FontProvider interface. I chose to use the XMLWorkerFontProvider, but you're free to write your own FontProvider implementation:

There is one more hurdle to take: Arabic is written from right to left. I see that you want to define the run direction at the level of the PdfPCell and that you add the HTML content to this cell using an ElementList. That's why I first wrote a similar example, named ParseHtml7:

There is less code needed in this example, and when you want to change the layout, it's sufficient to change the HTML. You don't need to change your Java code.

One more example: in ParseHtml9, I create a table with an English name in one column ("Lawrence of Arabia") and the Arabic translation in the other column ("لورانس العرب"). Because I need different fonts for English and Arabic, I define the font at the <td> level:

Lawrence of Arabia

لورانس العرب

For the first column, the default font is used and no special settings are needed to write from left to right. For the second column, I define an Arabic font and I set the run direction to "rtl".
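The table markup described above can be sketched as follows. This is an illustration under assumptions, not the actual ParseHtml9 source: the class TwoColumnHtmlSketch and its methods are hypothetical names, the run direction is assumed to be expressed with a dir attribute on the cell, and the font family is assumed to be the Noto Naskh Arabic font that was registered earlier. The Arabic text is passed in with Unicode escapes so that the Java file stays pure ASCII, and the cell contents are assumed to be already HTML-safe.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not an iText API).
final class TwoColumnHtmlSketch {

    /**
     * Builds one table row: the first cell uses the default font and
     * direction, the second cell selects an Arabic font and runs right to left.
     */
    static String row(String english, String arabic, String arabicFontFamily) {
        return "<tr>"
             + "<td>" + english + "</td>"
             + "<td dir=\"rtl\" style=\"font-family: " + arabicFontFamily + "\">"
             + arabic
             + "</td>"
             + "</tr>";
    }

    /** Wraps the given rows in a table element. */
    static String table(String rows) {
        return "<table>" + rows + "</table>";
    }

    /** Encodes the markup explicitly as UTF-8, e.g. before saving or parsing it. */
    static byte[] utf8(String html) {
        return html.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Lawrence of Arabia" in Arabic, written with escapes (see above).
        String arabicName = "\u0644\u0648\u0631\u0627\u0646\u0633 \u0627\u0644\u0639\u0631\u0628";
        String html = table(row("Lawrence of Arabia", arabicName, "Noto Naskh Arabic"));
        System.out.println(utf8(html).length + " bytes of UTF-8 markup");
    }
}
```

Keeping the font and the direction in the HTML means that a layout change only touches the markup; the Java code that feeds it to the parser can stay the same.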
