This blog is about the Linux Command Line Interface (CLI), with an occasional foray into GUI territory.
Instead of just giving you information like some man page, I hope to illustrate each command in real-life scenarios.

Monday, February 17, 2014

Most - if not all - new electronic devices don't come with printed manuals. For the Sony laptop SVF15A1 I bought, I had to go to the Sony web site to access the user guide. The user guide is composed of many individual HTML pages. This posed a problem for me because I wanted to convert some HTML pages to PDF documents for easier off-line access.

Modern web browsers, such as Chrome and Firefox, have a built-in print-to-PDF feature.

In Chrome:

Navigate to the HTML page, right-click, and select Print.

Select Save as PDF as the Destination.

Click Save, and select the output directory.

Note that the output PDF file is automatically named User Guide _ Using the Touch Pad.pdf. I did not have to make up the file name manually because the default name was extracted from the HTML page content. This saves users so much time that it is a major advantage of using the browser print function to do the conversion.

If you prefer a command-line solution, wkhtmltopdf is a simple yet powerful tool to convert HTML to PDF.

wkhtmltopdf can convert HTML files stored on the local hard drive or served over the Internet. In my case, the input was a URL on the Sony support web site.

By default, the pages in the output PDF file are A4 size. North American users may specify the Letter page size using the -s parameter.
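As a minimal sketch of the page-size option, the function below wraps a wkhtmltopdf call that requests Letter-sized pages. The function name html_to_letter_pdf and the file names in the usage comment are made up for illustration; only the -s option and the argument order (input first, output last) come from wkhtmltopdf itself.

```shell
# Convert one HTML page (a local file or a URL) to a Letter-sized PDF.
# html_to_letter_pdf is a hypothetical helper name, not part of wkhtmltopdf.
html_to_letter_pdf() {
    src=$1   # local HTML file or URL
    out=$2   # output PDF file name (the last argument to wkhtmltopdf)
    wkhtmltopdf -s Letter "$src" "$out"
}

# Usage (hypothetical file names):
#   html_to_letter_pdf touchpad.html touchpad.pdf
```

Drop the -s Letter part if you are happy with the A4 default.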

The last parameter to wkhtmltopdf is the mandatory output file name.
Despite its rich set of parameters, it lacks a feature to automatically name the output file.
To harness the full power of wkhtmltopdf, please refer to the man page.
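
One way to work around the missing auto-naming is to derive the output name from the page's title yourself. The sketch below is my own workaround, not a wkhtmltopdf feature: title_to_pdf_name is a hypothetical helper that reads HTML on standard input, pulls out the text of a lowercase title element, and replaces characters that are awkward in file names with underscores. It assumes the title contains no nested tags and falls back to output.pdf when no title is found.

```shell
# Print a PDF file name derived from the <title> of the HTML read on stdin.
# title_to_pdf_name is a hypothetical helper, not part of wkhtmltopdf.
title_to_pdf_name() {
    # Join all lines so a title split across lines is still matched,
    # then extract the text between <title> and </title>.
    title=$(tr '\n' ' ' | sed -n 's/.*<title[^>]*>\([^<]*\)<\/title>.*/\1/p')
    # Trim surrounding spaces and replace characters that are unsafe in file names.
    title=$(printf '%s' "$title" | sed 's/^ *//; s/ *$//' | tr '/\\:*?"<>|' '_')
    # Fall back to a fixed name if the page has no usable title.
    printf '%s.pdf\n' "${title:-output}"
}

# Usage with a local file (page.html is a placeholder name):
#   wkhtmltopdf page.html "$(title_to_pdf_name < page.html)"
```

For a page on the Internet, you could pipe the output of curl -s into the helper first to get the name, and then pass the same URL to wkhtmltopdf.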

If you want to merge individual PDF files, please see my posts on pdftk and ImageMagick.

Saturday, February 8, 2014

If you administer a Linux computer, you may occasionally ask when a software package was last installed or updated on your system.

For a Red Hat-based operating system - CentOS, Fedora, RHEL, etc. - getting the answer is a simple matter of querying the RPM database. The RPM database stores, among other things, the last install date of RPM packages.

Note that the Install Date for curl is Sun 11 Aug 2013 03:55:52 PM PDT. If curl was installed and subsequently updated, the stored Install Date is the update date, not the first install date.
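
For reference, here is a small sketch of two ways to query that date. The wrapper names rpm_info and rpm_install_date are hypothetical; the rpm options themselves (-qi, and -q with a --queryformat string using the INSTALLTIME tag) are standard.

```shell
# Show the full package information, which includes the "Install Date" field.
# rpm_info is a hypothetical wrapper name.
rpm_info() {
    rpm -qi "$1"
}

# Show only the package name and its install time, formatted as a date.
# rpm_install_date is a hypothetical wrapper name.
rpm_install_date() {
    rpm -q --queryformat '%{NAME}: %{INSTALLTIME:date}\n' "$1"
}

# Usage:
#   rpm_info curl
#   rpm_install_date curl
```

The second form is handy in scripts because it prints just the one field you are after.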

If you run a Debian-based OS - Debian, Ubuntu, Mint, etc. - you have to work harder to get the answer. The Debian package manager (dpkg) does not actually store the install date of packages in its database. However, you can still find it using either of the following procedures.

Search the dpkg log files

Searching the dpkg logs for the curl package reveals its last update time (2014-02-04 11:40:55).

There is a catch with this approach. You cannot search merely the current dpkg log. Depending on when the package was installed or last updated, the dpkg log may already have been rotated out. Hence, the asterisk in dpkg.log*. It matches dpkg.log, dpkg.log.1, dpkg.log.2, etc. However, if the package was installed a long time ago, the log file you are looking for may already have been deleted automatically.
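
The log search can be sketched as follows. The function name dpkg_log_history is hypothetical, and I am assuming the usual log location of /var/log (overridable with a second argument). I use zgrep rather than plain grep on the assumption that older rotated logs may be gzip-compressed on your system; zgrep reads both plain and compressed files.

```shell
# List the install and upgrade entries for a package across all dpkg logs.
# dpkg_log_history is a hypothetical helper name.
dpkg_log_history() {
    pkg=$1
    log_dir=${2:-/var/log}   # assumed default location of the dpkg logs
    # -h drops the file name prefix; the pattern matches lines such as
    # "<date> <time> install <pkg>..." or "<date> <time> upgrade <pkg>...".
    # The (:| ) after the name keeps "curl" from also matching "curlftpfs".
    zgrep -hE "^[0-9-]+ [0-9:]+ (install|upgrade) ${pkg}(:| )" "$log_dir"/dpkg.log* | sort
}

# Usage:
#   dpkg_log_history curl
```

Because each entry starts with the date and time, sorting the output puts the most recent install or upgrade on the last line.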

Look for the last modified timestamp of the package's file list

When a package is installed or updated, its corresponding file list in /var/lib/dpkg/info/ is overwritten with the latest information. For instance, /var/lib/dpkg/info/curl.list contains a list of file names that are installed by the curl package.

The last modified timestamp of curl.list gives you a fairly accurate indication of when curl was last installed/updated.
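
A minimal sketch of checking that timestamp is shown below. The function name pkg_list_mtime is hypothetical, and it relies on GNU stat (the -c option). I am also assuming that some multi-arch packages store their list as name:architecture.list, so the function checks both forms.

```shell
# Print the last modified time of a package's dpkg file list.
# pkg_list_mtime is a hypothetical helper name.
pkg_list_mtime() {
    pkg=$1
    info_dir=${2:-/var/lib/dpkg/info}
    # Try <pkg>.list first, then any <pkg>:<arch>.list variants.
    for list in "$info_dir/$pkg.list" "$info_dir/$pkg":*.list; do
        if [ -e "$list" ]; then
            # %y is the modification time, %n is the file name (GNU stat).
            stat -c '%y %n' "$list"
        fi
    done
}

# Usage:
#   pkg_list_mtime curl
```

A plain ls -l on the file gives you the same information if you prefer to look it up by hand.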

Monday, February 3, 2014

This post is about combining PDF files. It complements two earlier posts on splitting PDF files - part 1 and part 2.

I am using a tool named SimpleScan to scan multiple pages of a document. Each page is scanned into a separate PDF file. I must find a way to stitch the PDF files together into one single PDF.

You can use either the pdftk command or the gs command to merge PDF files. You may recall that they are the same commands you would use to split PDF files.
In the examples below, the input files are T4a.pdf, T4b.pdf, and T4c.pdf; the merged output file is combined.pdf.
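
Here is a sketch of both approaches, wrapped in two small functions. The function names merge_with_pdftk and merge_with_gs are hypothetical; the pdftk cat operation and the Ghostscript pdfwrite device with its -sOutputFile option are the standard ways to concatenate PDF files with these tools.

```shell
# Merge PDF files with pdftk: input files first, then "cat output <file>".
# merge_with_pdftk is a hypothetical helper name.
merge_with_pdftk() {
    out=$1
    shift   # the remaining arguments are the input files, in page order
    pdftk "$@" cat output "$out"
}

# Merge PDF files with Ghostscript's pdfwrite device.
# -q quiets the banner; -dNOPAUSE and -dBATCH make gs run non-interactively.
# merge_with_gs is a hypothetical helper name.
merge_with_gs() {
    out=$1
    shift
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$out" "$@"
}

# Usage:
#   merge_with_pdftk combined.pdf T4a.pdf T4b.pdf T4c.pdf
#   merge_with_gs    combined.pdf T4a.pdf T4b.pdf T4c.pdf
```

In both cases, the pages appear in the merged file in the order the input files are listed.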