Posts in Category "PDFG Generator"

A discussion on the Adobe forums indicates that many LiveCycle users are trying to figure out how to split a PDF file programmatically. Adobe LiveCycle provides a simple way to do this with the LiveCycle Assembler service. You can split PDF files by bookmark or by page number.

You can write a custom DDX document suited to your requirements. Some of the most commonly requested DDX documents are:

DDX for splitting PDF document using bookmarks

In the sample DDX used in the sample program later in this post, the LiveCycle Assembler service generates a single document for each level 1 bookmark in the source document (AssemblerResultPDF.pdf in this example). The Assembler service generates a name for each document that is the concatenation of the following items:

A string specified by the prefix attribute

A 6-digit sequence number (This number could be used to re-create the original order of the pages after the document is disassembled.)

DDX for splitting PDF document using page numbers

In this sample DDX, the LiveCycle Assembler service generates one document for each page number listed from the source document. The Assembler service names each document according to the result attribute specified in the DDX.
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/">
<PDF result="Final.pdf">
<PDF source="PDF1.pdf" pages="1"/>
</PDF>
<PDF result="Final2.pdf">
<PDF source="PDF1.pdf" pages="2"/>
</PDF>
</DDX>

DDX for splitting PDF document using the page range

In this sample DDX, the LiveCycle Assembler service generates a document from the specified range of pages. The Assembler service names the document according to the result attribute specified in the DDX.
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/">
<PDF result="Final.pdf">
<PDF source="PDF1.pdf" pages="1-5"/>
</PDF>
</DDX>

DDX for splitting PDF documents using page range from different PDF documents and creating a single resultant PDF document

In the following sample DDX, the LiveCycle Assembler service extracts pages from multiple documents according to the page ranges specified in the DDX and generates a single output document.
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/">
<PDF result="Final.pdf">
<PDF source="PDF1.pdf" pages="1-3"/>
<PDF source="PDF2.pdf" pages="4-5"/>
</PDF>
</DDX>
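
The page-based DDX samples above all share one shape: each result element wraps one or more source elements that carry a pages attribute. As a rough sketch, that shape can be generated from Java instead of being written by hand. The class and method names below (DdxSketch, Part, resultElement, ddx) are hypothetical helpers for illustration and are not part of the LiveCycle API. The sketch also assumes that file names contain no characters that need XML escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the LiveCycle API:
// builds DDX text shaped like the page-based samples above.
class DdxSketch {
    static final String NS = "http://ns.adobe.com/DDX/1.0/";

    // One source document plus the pages to take from it, for example "1-3".
    static final class Part {
        final String source;
        final String pages;

        Part(String source, String pages) {
            this.source = source;
            this.pages = pages;
        }
    }

    // Builds one <PDF result="..."> element that lists the given parts in order.
    // Assumption: names need no XML escaping.
    static String resultElement(String resultName, List<Part> parts) {
        StringBuilder sb = new StringBuilder();
        sb.append("<PDF result=\"").append(resultName).append("\">\n");
        for (Part p : parts) {
            sb.append("<PDF source=\"").append(p.source)
              .append("\" pages=\"").append(p.pages).append("\"/>\n");
        }
        sb.append("</PDF>\n");
        return sb.toString();
    }

    // Wraps the result elements in the XML declaration and the DDX root element.
    static String ddx(List<String> resultElements) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<DDX xmlns=\"").append(NS).append("\">\n");
        for (String element : resultElements) {
            sb.append(element);
        }
        sb.append("</DDX>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Reproduces the multi-source sample: pages 1-3 of PDF1.pdf, then 4-5 of PDF2.pdf.
        List<Part> parts = new ArrayList<>();
        parts.add(new Part("PDF1.pdf", "1-3"));
        parts.add(new Part("PDF2.pdf", "4-5"));

        List<String> results = new ArrayList<>();
        results.add(resultElement("Final.pdf", parts));
        System.out.print(ddx(results));
    }
}
```

Passing several result elements to ddx gives the page-number style of split, where each result names its own output file.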

Sample program to split a PDF document
Let us write a simple Java program to split a PDF document into multiple documents. To download the resources used in this sample program, click here.

Complete the following steps:

Create a new file and add the following code to the file:
<DDX xmlns="http://ns.adobe.com/DDX/1.0/">
<PDFsFromBookmarks prefix="Readme">
<PDF source="AssemblerResultPDF.pdf"/>
</PDFsFromBookmarks>
</DDX>
For this example, save the XML file as shell_disassemble.xml.

Create a new Java project and add shell_disassemble.xml to the project.

Add the following libraries to your project. These libraries are required to invoke the Assembler service in SOAP mode:

adobe-assembler-client.jar

adobe-livecycle-client.jar

adobe-usermanager-client.jar

adobe-utilities.jar

jbossall-client.jar (use a different JAR file if LiveCycle ES is not deployed on JBoss)

activation.jar

axis.jar

commons-codec-..jar

commons-collections-..jar

commons-discovery.jar

commons-logging.jar

dom-xml-apis-.jar

jaxen-.-beta-jar

jaxrpc.jar

log4j.jar

mail.jar

saaj.jar

wsdl4j.jar

xalan.jar

xbean.jar

xercesImpl.jar

Create a new class named DisassemblePDFSOAP.

Add the source PDF file to the project. I have used AssemblerResultPDF.pdf.

Modify the locations mentioned in the sample code to match the file paths on your machine.

Run the code.

The code splits the file into multiple PDF documents based on the bookmarks or the page numbers specified in the DDX.

This is the first post in a series about programmatically splitting PDF documents. In this post I have shared sample code to split a PDF document using bookmarks. In the follow-up posts, I will include sample code to split PDF documents using:

Page numbers

Page range

Pages from different PDF documents and generate a single output document

LCM Logs

As you might have noticed, LCM logs are found at <LiveCycle Installation Location>/configurationManager/log. The default logging level is INFO. Logging is governed by a properties file kept inside adobe-lcm.jar: \com\adobe\livecycle\lcm\logging\log.properties.

Using this property file, you can:

Change Logging Level

Define file location and file name.

Define rotation policy

If you want to use a copy of this file from a more convenient location instead of the default one, you can do so by modifying <LiveCycle Installation Location>/configurationManager/bin/ConfigurationManager.bat and specifying the following system property:

-Djava.util.logging.config.file=<path to file>
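
Because the file is passed through java.util.logging.config.file, it is reasonable to assume that it follows the standard java.util.logging properties format. Under that assumption, the sketch below writes a replacement file that sets the logging level, the log file location and name, and a rotation policy. The keys are standard java.util.logging keys, but whether the log.properties bundled with LCM uses exactly these handlers is not confirmed here. The class name, the level, the file pattern, and the limits are illustrative values only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: produces a java.util.logging configuration file.
// Assumption: LCM reads the standard java.util.logging properties format.
class LcmLoggingSketch {

    // level:       logging level, for example INFO or FINE
    // filePattern: log file location and name; %h is the user home directory
    //              and %g is the rotation generation number
    // limitBytes:  approximate maximum size of one log file before rotation
    // fileCount:   number of rotated files to keep
    static String config(String level, String filePattern, int limitBytes, int fileCount) {
        return String.join("\n",
            "handlers=java.util.logging.FileHandler",
            ".level=" + level,
            "java.util.logging.FileHandler.pattern=" + filePattern,
            "java.util.logging.FileHandler.limit=" + limitBytes,
            "java.util.logging.FileHandler.count=" + fileCount,
            "java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter",
            "");
    }

    public static void main(String[] args) throws IOException {
        // Write the configuration to a temporary file and print the JVM argument
        // that would point the JVM at it.
        Path file = Files.createTempFile("lcm-log", ".properties");
        String text = config("FINE", "%h/lcm%g.log", 10_000_000, 5);
        Files.write(file, text.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println("-Djava.util.logging.config.file=" + file.toAbsolutePath());
    }
}
```

The printed argument has the same form as the system property shown above, so it can be copied into ConfigurationManager.bat once the generated file has been moved to a permanent location.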

Generating ORB Trace

While working with natives such as XMLForms, you can sometimes run into issues where an application terminates abnormally. The following parameters help generate extra trace information for debugging such issues.

These must be passed as arguments to the native application:

Also, when debugging an issue related to native applications, you can see in the System Out logs that the natives are invoked with a large IOR passed to them as input. This IOR can be analyzed with any of the readily available IOR parsers (a web search will find them). This can be the first step toward debugging a natives-related problem.

Variable Logging

To help you understand and debug an orchestration, LiveCycle offers an excellent process debug feature. Using Workbench, you can trace every step of a process and see the exact value each variable holds. For more information, refer to this blog.

Sometimes, however, this is difficult because of environment constraints and performance overhead. In that case you may want to introduce a step that logs the current state of all variables, either in the System Out log or in a log of your choice.

This can be accomplished using the Variable Logger service, which you add while designing the orchestration. Each time the orchestration runs, the values of the variables are logged when that step executes.

Other Application Logging Locations

Content Services and CMSA Logs

Content Services and CMSA logs are created in the working directory of the application server.

LiveCycle Installer Logs

Installer logs can be found in the following two locations:

<LiveCycle Installation Home>

<LiveCycle Installation Home>/logs

Service Pack Logs

Service pack logs can be found at <LiveCycle Installation Home>/patch/<Patch Name>/log

CRX and Correspondence Management Logs

From ES3 onwards, you will find CRX and CM logs at <CRX Repository Directory>/logs. (More on this will be covered in the next part of this series.)

The following are a few tips and workarounds for LiveCycle PDFG. Please note that the workarounds marked as unsupported are not officially supported by Adobe.

[Unsupported] On UNIX servers, customers can use 64-bit OpenOffice for OpenOffice-based conversions. The obvious benefit is improved performance. To do this, point JAVA_HOME_32 to a 64-bit version of Java. The same can be achieved on Windows, but you may then see immediate conversion failures for other native file formats.

[Unsupported] Any file that Acrobat can open (such as a text file) can be converted to PDF using LiveCycle PDF Generator. You just need to add the file extension (for example, txt for text files) to the comma-separated list in the XPS to PDF file-type setting.

A user or administrator can jump directly to the PDF Generator UI by opening http(s)://<server-name>:<port>/pdfgui. This skips a couple of clicks otherwise needed to reach the PDF Generator user interface.

LiveCycle PDF Generator supports HTML to PDF conversions. The HTML document can be provided in any of the following forms:

Submit an HTML file to be converted to PDF.

Provide the http(s) URL of the HTML page to be converted to PDF.

Submit a ZIP file containing an entire website (the ZIP file should contain index.html at the top level) for creating PDF.
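
Since the ZIP file is expected to contain index.html at the top level, it can be worth checking an archive before submitting it. The following is a minimal sketch of such a check using only the standard java.util.zip classes. The class and method names (SiteZipCheckSketch, hasTopLevelIndex, zipOf) are hypothetical and not part of PDF Generator, and the exact, case-sensitive match on the entry name is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical pre-submission check, not part of PDF Generator.
class SiteZipCheckSketch {

    // Returns true if the archive has a file entry named exactly "index.html",
    // that is, one that is not nested inside any folder.
    static boolean hasTopLevelIndex(InputStream zipStream) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (!entry.isDirectory() && entry.getName().equals("index.html")) {
                    return true;
                }
            }
        }
        return false;
    }

    // Builds a small in-memory ZIP with the given entry names, for demonstration only.
    static byte[] zipOf(String... entryNames) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(out)) {
            for (String name : entryNames) {
                zout.putNextEntry(new ZipEntry(name));
                zout.write("<html></html>".getBytes(StandardCharsets.UTF_8));
                zout.closeEntry();
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] good = zipOf("index.html", "css/site.css");
        byte[] nested = zipOf("site/index.html");
        System.out.println(hasTopLevelIndex(new ByteArrayInputStream(good)));   // prints true
        System.out.println(hasTopLevelIndex(new ByteArrayInputStream(nested))); // prints false
    }
}
```

In practice the input stream would come from the ZIP file on disk, for example through java.io.FileInputStream.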

While submitting an input HTML file, the user can provide a variety of options like:

The level to which spidering will be performed

Whether to get the entire site or not

Stay on same path (in terms of URL), while fetching the HTML document(s)

Stay on same server. This is useful when you have specified a spidering level greater than 1 but do not want to create PDF from HTML documents that are linked from the input HTML and hosted on a different server.

PDF page size and margin options

Add bookmarks

Enable tagging

Set initial view settings, such as which page is shown when the PDF is opened

PDF Generator in LiveCycle ES2 and later lets you specify Adobe Acrobat Professional as the fallback for creating PDF files. A downside of this fallback is that Acrobat-based conversion is single-threaded, whereas LiveCycle PDFG-based conversions are multi-threaded. Also, Acrobat Professional does not honor the options mentioned above. This facility is only available on Windows. A LiveCycle administrator can also configure the Generate PDF service to always prefer the Acrobat route. To do this, navigate to Home > Services > Applications and Services > Service Management > Configure GeneratePDFService and set the “Use Acrobat WebCapture (Windows Only)” option to True.

What’s New in ES3

The HTML to PDF engine creates high-fidelity PDF documents. The time taken to create the best-quality PDF document may seem long to some users. For some users, a lower-quality PDF is acceptable if the conversion is faster.

LiveCycle ES3 introduces a new conversion engine to provide this quicker turnaround. The engine is supported on all platforms that LiveCycle supports, and it honors all the conversion options mentioned above. As a bonus, it lets you specify header and footer text to be placed in the generated PDF document. On UNIX machines, this engine acts as the fallback for the high-quality HTML to PDF engine. To set the new engine as the preferred route, navigate to Home > Services > Applications and Services > Service Management > Configure GeneratePDFService and set the “Use ICEBrowser based Html to PDF” option to True.

Indexing of content in LiveCycle Content Services 9 depends on different LiveCycle ES2 components and services. Here are a few important prerequisites:

Indexing of PDF files (except for dynamic PDF forms) requires the Assembler service, which is part of all LiveCycle ES2 installations.

Indexing of dynamic PDF files requires LiveCycle Output 9. If Output is not installed, the FormDataIntegration service, available on all LiveCycle ES2 installations, is used instead. However, in such cases, for dynamic PDFs created in Acrobat, only the form data is indexed. The form design is left unindexed.

Adobe LiveCycle PDF Generator ES Update 1 (8.2) introduced a new feature called the PDF Generator ES IPP Client, which allows you to generate a PDF from any application that supports printing. The feature is essentially a print driver that prints to PDF Generator ES. After the print driver is installed on a user’s computer, “Adobe LiveCycle PDF Generator ES” appears in the user’s list of available printers. Printing to that printer from any application sends the document (in PostScript format) to PDF Generator ES. LiveCycle PDF Generator ES then converts the PostScript file to PDF and sends the PDF file to the user as an attachment to an email message.

Here are the steps required to get this feature working:
1. Install and configure LiveCycle PDF Generator ES.
2. Log into LiveCycle Administration Console, click Services > Applications and Services > Service Management, and find provider.email_sendmail_service. Click the service name and ensure that the Configuration tab is filled correctly. This is where you specify the information that LiveCycle uses to send the email messages.
3. Ensure that your users are configured with a valid email address in the LiveCycle database and assign the PDFGUserPermission to each user. (See Managing Users and Groups and Managing Roles in the LiveCycle User Management Help.)
4. Install and configure the print driver on your users’ computers. For instructions on installing the print driver, see “Installing the IPP client” in your LiveCycle Installing and Deploying guide (such as Installing and Deploying LiveCycle ES for JBoss).