Posts Tagged document generation

If your database serves an information system, there is a pretty good chance that system generates some written documents to end users – predefined reports or simply letters or some kind of official documents to be printed. In the case of relatively simple documents, i.e. not complicated multi-level detailed reports, application has to generate it using a template or some other meta-description of it, and fill it with a case-specific data. If there is a need to generate same documents from different apps (e.g. a web app and a desktop app), a solution to generate it and serve it right from the database might come in handy.

There are several ways to do this, but probably the simplest is to call the stored procedure (or a function) with the right parameters, which will return the document. Document templates which will be used should be stored somewhere in the database. The document generation function takes the appropriate template, searches for predefined tokens and replaces them with the correct data, depending on the input parameters. That’s why the template needs to be in some simple text format, perhaps RTF (rich text format) which allows neat formatting, could be prepared in most of the WYSIWYG text editors, can contain pictures and other advanced formatting stuff, and yet could be viewed as a plain text file, so tokens could be found and replaced easily.

Tokens should be predefined, agreed upon, and unique in a way that they could never be a part of a normal text. Some examples of tokens might be: REPLACEME_CURRENT_DATE, ##CUSTOMER_NAME## or %customer.id.number%. I suggest to define a single notation and stick to it. Just make sure the token format does not interfere with the special characters your chosen document format uses. After this, templates should be prepared in any rich text editor, like LibreOffice Writer or MS Word and stored in a database table with templates. Next, the document generating function has to be written. Depending on the size of the template, you might need to use some character data types like CLOB or LONGLVARCHAR (a new, still undocumented type), which make things more complicated. Main reason for this is the REPLACE function which doesn’t support bigger character types. Since we want to replace all tokens of the same kind in the document at once, we may store the file on the server, iterate through all the token types, replace them with proper values (using sed, for example), pull it back from the file system and return it to the user. Here is the outline of the function which might do that:

-- get the appropriate template from the table
-- and store it on the file system using LOTOFILE function

-- fetch the values which will replace tokens in the template

FOR token IN (list of tokens)

-- replace each token in the file using SYSTEM 'sed...'

END FOR

-- pull the file from the file system using
-- FILETOCLOB and return it

END FUNCTION;

This solution requires a smart blob space and file system privileges for the user executing it, but no other special demands exist. If you’re a fan of datablades, there is also a possibility of using a string manipulating blade functions which would make the replacements without meddling into a file system and leave the file system out.