This blog is about the Linux Command Line Interface (CLI), with an occasional foray into GUI territory.
Instead of just giving you information like some man page, I hope to illustrate each command in real-life scenarios.

Friday, November 13, 2015

To a technically savvy person, hosting your own website on a VPS can bring a tremendous sense of accomplishment and a wealth of learning opportunities. If WordPress is what you fancy, you can host a WordPress website on the LAMP platform (Linux, Apache, MySQL, PHP) with a minimal monthly financial commitment. For example, the entry-level, $5 per month plan offered by DigitalOcean, of which I am an affiliate, will give you a 512MB RAM, single-core VPS.

With such a small RAM capacity, you will need to optimize how your Apache webserver is configured to run PHP applications such as WordPress and Drupal. The goal is to maximize the number of concurrent web connections.

This tutorial details the Apache/PHP setup procedure on Debian 8.2, aka Jessie. The procedure assumes that Apache has not yet been installed. However, if Apache2 is already installed, you will find practical information below on how to reconfigure it to run a different multi-processing module.

Background knowledge

According to a recent Netcraft webserver survey, Apache powers 46.91% of the top million busiest websites on the Internet. Busy websites mean many concurrent web connections.

Concurrent connection requests to Apache are handled by its Multi-Processing Modules (MPMs). MPMs can be loosely classified as threaded or non-threaded. Older Apache releases default to an MPM named Prefork. This MPM is non-threaded: each connection request is handled by a dedicated, self-contained Apache process.

Newer Apache releases default to a threaded MPM, either Worker or Event. The Worker MPM uses one worker thread per connection. One issue with this approach is that a thread remains tied up as long as its connection is kept alive, even when the connection is inactive.

The Event MPM, a variant of Worker, addresses the aforesaid keep-alive issue. A main thread is used as the traffic controller that listens for requests and passes requests to worker threads on demand. In this scenario, an inactive but kept-alive connection does not tie up a worker thread.

Note that MPMs are mutually exclusive: only 1 MPM can be active at any given time.

For the Prefork MPM, each spawned Apache process embeds its own copy of the PHP handler (mod_php). Concurrency in this model is limited by the number of processes that Apache can spawn given the available memory.

For both the Worker and Event MPMs, PHP requests are passed to an external FastCGI process, PHP5-FPM (PHP-FastCGI Process Manager). Essentially, the webserver and the PHP handler are split into separate processes. Apache communicates with PHP5-FPM through an Apache module, either mod_fastcgi or mod_fcgid. Optimizing concurrency in this model means configuring both the MPM and the PHP handler (PHP5-FPM) to have pools of processes and threads to handle requests.

The rest of this tutorial covers the cases of installing the Event MPM from scratch as well as migrating to Event from the Prefork MPM.

Installing Apache2

This tutorial starts with the installation of Apache2. If Apache is already installed, you should find out which MPM is currently running using the command apache2ctl -V, and proceed to the next section.
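For example (output abridged and illustrative; run the command on your own server to see your actual version and MPM):

```
$ sudo apache2ctl -V
Server version: Apache/2.4.10 (Debian)
Server MPM:     event
...
```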

The above output tells us that we are running Apache release 2.4. Beginning with 2.4, Apache runs the Event MPM by default.
If you are running an older version of Apache, the default MPM is either Prefork or Worker.

Configuring Apache2

Back up the Apache configuration file, /etc/apache2/apache2.conf.

$ sudo cp /etc/apache2/apache2.conf{,.orig}

Edit the configuration file.

Below is a subset of configuration parameters belonging to the Apache core module. You should adjust their values in order to optimize concurrency. The corresponding values are what I use for an entry-level VPS.
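A tuning block along these lines is what I mean (the directives are standard Apache core ones; the exact values are illustrative picks for a 512MB, single-core VPS, not canonical):

```
Timeout 100
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5
```

A short KeepAliveTimeout frees idle connections quickly, which matters most under keep-alive-heavy traffic.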

While the mod_rewrite module is not strictly relevant to optimizing concurrency, I've included it here as a reminder to install the module. It is an important module for running WordPress.

$ sudo a2enmod rewrite

Installing Event MPM

If you are already running the Event MPM, skip to the next section, 'Configuring Event MPM'. Otherwise, follow the procedure below.

Install the Event MPM.

$ sudo apt-get install apache2-mpm-event

Disable existing MPM.

Recall that only 1 of Prefork, Worker or Event MPM can be running at any given time. Therefore, if you were previously running Prefork or Worker, you must first disable it, and then enable Event.

To disable the Prefork MPM, run this command:

$ sudo a2dismod mpm_prefork

To disable the Worker MPM, run this:

$ sudo a2dismod mpm_worker

Enable the Event MPM.

$ sudo a2enmod mpm_event

Note that the above enable and disable commands are quite 'forgiving'. If you attempt to enable an MPM that is already enabled, or disable an MPM that is already disabled, the command simply prints a harmless informational message.

Configuring Event MPM

To configure the Event MPM, modify its configuration file, /etc/apache2/mods-available/mpm_event.conf.
Before making any changes, back it up using the following command:

$ sudo cp /etc/apache2/mods-available/mpm_event.conf{,.orig}
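As an example, here is an mpm_event block sized for a 512MB, single-core VPS (the directive names are the standard Event MPM ones; treat the values as a starting point to adjust under real load):

```
<IfModule mpm_event_module>
    StartServers             1
    MinSpareThreads         10
    MaxSpareThreads         25
    ThreadLimit             64
    ThreadsPerChild         25
    MaxRequestWorkers      100
    MaxConnectionsPerChild 500
</IfModule>
```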

The above configuration is what I recommend for an entry-level VPS (512MB RAM, single-core). You need to adjust the parameters to satisfy your own system requirements. For a detailed explanation of the above parameters, click here. Note that the Event MPM shares the same parameters as the Worker MPM.

Installing PHP5 handler

To execute PHP code, Apache requires a PHP handler. PHP5-FPM is the PHP handler to use with the Event MPM.

For a new PHP installation, install php5-fpm followed by the meta-package php5.

$ sudo apt-get install php5-fpm php5

In addition to the above packages, I also installed other PHP5 packages which WordPress requires.
While they are not strictly relevant to optimizing concurrency, I've included them here for completeness.

$ sudo apt-get install php5-mysql php5-gd php5-curl

Configuring virtual host

Suppose your WordPress website has the domain name example.com. To set up a virtual host with that domain name, follow the steps below:

Create the Apache configuration file for example.com.

Instead of creating the file from scratch, use the default site as a template.
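A sketch of those steps (paths follow Debian conventions; adjust the ServerName and DocumentRoot to your own site):

```
$ sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/example.com.conf
$ sudo vi /etc/apache2/sites-available/example.com.conf   # set ServerName, ServerAlias, DocumentRoot
$ sudo a2ensite example.com
$ sudo systemctl reload apache2
```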

Installing FastCGI

Apache requires a FastCGI module to interface with the external PHP5-FPM processes.
You can use 1 of 2 FastCGI modules: mod_fastcgi or mod_fcgid.
Click here for a discussion of their differences.
This tutorial uses mod_fastcgi.

Before you install mod_fastcgi, you must:

Enable non-free.

Debian pre-packages the mod_fastcgi module in the non-free archive area of its repositories. Make sure that non-free is included in the /etc/apt/sources.list file.

Disable mod_php.

If Apache2 was previously installed with the Prefork MPM, it is most likely configured to execute PHP using the mod_php module. In this case, you must disable mod_php before you install mod_fastcgi. Otherwise, the install will fail with the error message, 'Apache is running a threaded MPM, but your PHP Module is not compiled to be threadsafe. You need to recompile PHP.'

To disable mod_php, run this command:

$ sudo a2dismod php5

To install mod_fastcgi, execute the following command:

$ sudo apt-get install libapache2-mod-fastcgi

Configuring FastCGI

Back up the configuration file.

Before you edit the configuration file /etc/apache2/mods-available/fastcgi.conf, back it up using the following command.

$ sudo cp /etc/apache2/mods-available/fastcgi.conf{,.orig}

To access the website, you need to grant the proper permission explicitly using the Require all granted statement. Without it, access to the website will be denied with the error message 'You don't have permission to access /php5-fcgi/index.php on this server.'
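Putting it together, the fastcgi.conf commonly ends up looking like this (the socket path and alias are assumptions; match them to your PHP5-FPM setup):

```
<IfModule mod_fastcgi.c>
    AddHandler php5-fcgi .php
    Action php5-fcgi /php5-fcgi
    Alias /php5-fcgi /usr/lib/cgi-bin/php5-fcgi
    FastCgiExternalServer /usr/lib/cgi-bin/php5-fcgi -socket /var/run/php5-fpm.sock -pass-header Authorization
    <Directory /usr/lib/cgi-bin>
        Require all granted
    </Directory>
</IfModule>
```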

Enable additional modules.

$ sudo a2enmod actions fastcgi alias

Restart Apache.

The final step is to restart Apache to make all the above changes go live.

$ sudo systemctl restart apache2

Threads in action

Concurrency for WordPress occurs at both the webserver (Apache2) and the PHP handler (PHP-FPM) levels. You can use the ps -efL command to monitor the processes and threads at either level.

To monitor Apache processes and threads, execute the following ps command.
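For example (the bracketed first letter is a grep trick that keeps the grep command itself out of the listing):

```
$ ps -efL | grep '[a]pache2'
$ ps -efL | grep '[p]hp5-fpm'
```

The NLWP column reports the number of threads in each process.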

Conclusion

When traffic to your website increases over time, your webserver must scale up to handle the increase in traffic. This tutorial explains how to configure Apache2 and PHP to optimize the number of concurrent connections. After you try it out, if you still find that your website cannot keep up with the traffic, you should consider upgrading your VPS plan to have more RAM.

Friday, October 30, 2015

Introduction

WordPress is the most popular content management system (CMS) on the planet today.
You can customize the look and feel of a WordPress website using third-party themes.
If you want a functionality not offered by the WordPress core, you will most likely find a third-party plugin that satisfies your requirement. With the plethora of themes and plugins comes a major challenge in assuring their quality. Intruders can potentially exploit the vulnerabilities in poorly designed themes and plugins to gain unauthorized access to a WordPress website.

WPScan is a WordPress vulnerability scanner that is free for non-commercial use.
It scans your WordPress website and reveals any known vulnerabilities in the installed plugins and themes.

The rest of this post explains how to install and run WPScan.

Installation

WPScan comes pre-installed on only a handful of lesser-known Linux distributions. If you run Debian, Ubuntu, CentOS or Fedora, you must download the WPScan source and build it yourself. Because WPScan is written in Ruby, building it requires the Ruby development environment.

Your first decision is to select a machine on which to build WPScan. This is the machine you use to launch WPScan later. Note that you can (and should) run WPScan on a different machine than the WordPress host. The examples in this post are based on a Debian 8.2 machine, aka Jessie.

Your next decision is how you will install the Ruby development environment, including the supporting development libraries.
The WPScan website outlines 2 ways to install the necessary development environment on a Debian server: the Debian package management system and the Ruby Version Manager (RVM).

RVM is the WPScan-recommended method. It allows you to install multiple, self-contained Ruby environments on the same system.
RVM puts a dedicated Ruby environment under your Home directory (e.g., /home/peter/.rvm). You can find the RVM procedure on the WPScan home page. I've followed the steps, and it works as advertised.

I opted instead for the Debian package manager method because it is a shorter procedure and I did not need the versatility (and the complexity) that RVM offers.

Below are the steps to install WPScan using the Debian package manager. The procedure is largely based on what is on the WPScan home page. I've added a couple of missing packages that are actually required.
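The gist of the procedure is sketched below (package names reflect Debian Jessie and WPScan's documentation at the time; verify against the current WPScan instructions before copying):

```
$ sudo apt-get install git ruby ruby-dev make gcc libcurl4-openssl-dev zlib1g-dev
$ git clone https://github.com/wpscanteam/wpscan.git
$ cd wpscan
$ sudo gem install bundler
$ bundle install --without test
```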

Conclusion

WPScan is an important tool in your defense against possible attacks on your WordPress websites. It is recommended that you schedule WPScan to run regularly to detect known WordPress vulnerabilities. Yet, running WPScan is only half of your job. You remain vulnerable until you patch the vulnerabilities.

In general, the WordPress community fixes most known vulnerabilities and distributes the fixes quickly after the vulnerabilities are first reported. It is important that you keep your WordPress core and the third-party themes and plugins up-to-date. If your WordPress platform is up-to-date, WPScan will most likely return a clean report, and you can stop feeling vulnerable about your WordPress website.

Wednesday, July 29, 2015

This is part 3 of the series on using MySQLTuner to optimize MySQL database performance and stability. Part 1 explains how to install and run MySQLTuner. Part 2 addresses the area of database defragmentation. This post illustrates how to manage MySQL memory footprint.

MySQLTuner output

MySQLTuner was used to analyze a WordPress database deployed on the LAMP platform (Linux, Apache, MySQL, PHP). The host was a VPS server with only 512 MB of memory.

$ perl mysqltuner.pl

If you scroll down to the Recommendations section of the above report, it is hard to miss the eye-catching message:
'MySQL's maximum memory usage is dangerously high. Add RAM before increasing MySQL buffer variables.'

Indeed, adding more RAM is often the cheapest and simplest solution to out-of-memory problems. By spending an extra $5 per month, I can upgrade my VPS server to have 1 GB of RAM. But, before you go spend your hard-earned money on RAM, let's explore some other ways to reduce MySQL's memory footprint.

Maximum number of database connections

Lines that begin with two exclamation marks ('!!') are warnings. Note the following lines in the above Performance Metrics section:
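The lines in question look roughly like this (reconstructed from the figures discussed below, not copied from a live report; your numbers will differ):

```
[--] Total buffers: 192.0M global + 2.7M per thread (151 max threads)
[!!] Maximum possible memory usage: 597.8M (116% of installed RAM)
```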

According to the above warning, MySQL could potentially use up to 597.8 MB of RAM. Where did the number come from?

The number was derived from the preceding line. MySQL required 192 MB globally and 2.7 MB per connection to the database. By default, the maximum number of connections was 150+1.
(The 151st connection would be restricted to database administration only.) Hence, the maximum memory usage was 192 + 150 * 2.7, roughly the 597.8 MB quoted above (the small difference comes from rounding in the per-connection figure).
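You can double-check that arithmetic with a one-liner:

```shell
$ awk 'BEGIN { printf "%.1f MB\n", 192 + 150 * 2.7 }'
597.0 MB
```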

Should you allow for 150 connections? Keep in mind that each connection, even in the idle state, will take up some memory.
MySQLTuner can help you answer the question with confidence.

MySQLTuner reports the highest number of concurrent connections since the last MySQL restart (13 in the above example). The database should be up for a minimum of 24 hours before you run MySQLTuner. In fact, the longer the time elapsed since the last restart, the more trustworthy the statistic.

You can find out from the MySQLTuner report how long MySQL has been up. Go back to the first line under the Performance Metrics section. In the above example, MySQL had been up for 36 days since the last restart.

Although MySQL was configured to accept 150 connections, the highest number of concurrent connections made in the past 36 days was only 13 (8% of the maximum). In light of that knowledge, we could lower the maximum number of connections allowed, thereby reducing the total memory footprint of MySQL.

Before we go ahead to reconfigure MySQL, we will consider the wait-timeout threshold which affects how long idle connections stay alive before timing out.

Wait timeout

One of the General recommendations in the above example was:

'Your applications are not closing MySQL connections properly.'

In other words, database connections were opened but not properly closed after queries or updates had completed.
These idle connections would hang around until a predefined timeout threshold was reached. The default timeout threshold was 8 hours. So, if a query completed in 2 seconds but the connection was not closed properly, the connection would live for another 28,798 seconds before timing out. In the meantime, the idle connections continued to consume resources, including counting toward the maximum number of open connections.

The culprit was easily identified in the above case: the database was used exclusively for WordPress, an application written in PHP. However, solving the problem can be out of your reach, unless you are a PHP developer.

The good news is that you can reduce the timeout interval by adjusting a MySQL configuration parameter. By making idle connections time out faster, there will be fewer concurrent connections. For WordPress/PHP applications, I set the wait timeout to 60 seconds.

It is also worth mentioning that because quicker timeouts leave fewer idle connections, you can further reduce the maximum number of connections.

Re-configuring MySQL

To change the maximum number of connections or the wait timeout threshold, edit the MySQL configuration file as follows.

$ sudo vi /etc/mysql/my.cnf

The configuration variables of interest are max_connections and wait_timeout. Enter a value for each variable using the following syntax:

max_connections = 50
wait_timeout = 60

For the above configuration changes to take effect, a restart of the MySQL daemon is needed.

For non-systemd systems, run the following command:

$ sudo service mysql restart

For systemd-enabled systems, run:

$ sudo systemctl restart mysql

Alternatively, you can dynamically change the configuration variables, thereby avoiding the database restart. To do that, issue the following commands in a MySQL client session:

mysql> SET GLOBAL max_connections = 50;
mysql> SET GLOBAL wait_timeout = 60;

Note that modifying the MySQL configuration file is still required if you want the changes to persist after future system restarts.

What's next?

MySQLTuner is not something you run once and then forget. Your web traffic pattern changes over time. You should schedule it to run regularly and examine the output. Please refer back to Part 1 of this series for instructions on how to schedule a run.

The more knowledgeable you are about database optimization, the more effective you become at using the information provided by MySQLTuner. I recommend the following videos if you want to learn more about MySQL optimization:

Tuesday, July 7, 2015

Part 1 of this series spells out how to install and run MySQLTuner, a script which recommends MySQL configuration changes. The goal is to optimize database performance and stability. This post describes how to interpret and use MySQLTuner output, specifically in the area of database defragmentation.

Proceed with caution

A word of caution is warranted before I plunge into the details of implementing MySQLTuner's suggestions. MySQLTuner does not excuse you from learning the basic database optimization principles and following industry best practices. Following a MySQLTuner recommendation without researching and understanding its ramifications is a gamble that may end up worsening your database performance and reliability.

Optimizing MySQL configuration is not a trivial matter, and must be done in a controlled manner. You should change only one MySQL configuration variable at a time. After every change, monitor the system to verify that the expected outcome is achieved without any negative side effect.

General comments

MySQLTuner is a Perl script which you can invoke like this:

$ perl mysqltuner.pl

The following is the MySQLTuner output for a low-memory VPS server running on the LAMP platform (Linux, Apache, MySQL, PHP). The VPS is dedicated for running a WordPress blog.

One is often tempted to bypass the first several sections of the report on database metrics, and head straight to the Recommendations section. But, the metrics provide the crucial context for the recommendations that follow, and should be read carefully.

Storage engine statistics

The Storage engine statistics section of the report summarizes the total number and size of InnoDB and MyISAM tables in your database.

In the above example, 18 InnoDB and 4 MyISAM tables were detected. But the report does not identify the tables. If you want to list all InnoDB tables, execute the command below.

$ mysql -u root -p -e "SELECT table_name FROM information_schema.tables WHERE engine = 'InnoDB' AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');"

To list all MyISAM tables, replace InnoDB with MyISAM in the above command.

The key actionable statistic in this section is the total number of fragmented tables (20 in the example). Fragmentation occurs during normal database operations when records are inserted and deleted, leaving behind 'gaps' in the database.

MySQLTuner does not report the size of the 'gaps' or unused space in the fragmented tables. You can find out by running the following MySQL statement.

$ mysql -u root -p -e "SELECT table_name, data_length, index_length, data_free FROM information_schema.tables WHERE data_free > 0;"

The DATA_LENGTH and INDEX_LENGTH variables contain, respectively, the size of the data and the index for a table.
DATA_FREE is the size of the unused space in a table. The fragmentation ratio is the ratio of unused space to the sum of the used data and index space.

If your tables are large, you can round the output length variables to megabytes (MB) by using the following SQL statement:

$ mysql -u root -p -e "SELECT table_name, ROUND(data_length/1024/1024) AS data_mb, ROUND(index_length/1024/1024) AS index_mb, ROUND(data_free/1024/1024) AS free_mb FROM information_schema.tables WHERE data_free > 0;"

Database Defragmentation

If you scroll down to the Recommendations section of the report, you will see that the first general recommendation is 'Run OPTIMIZE TABLE to defragment tables for better performance'. You may execute the OPTIMIZE TABLE SQL statement for each of the 22 tables. Alternatively, you can run the mysqlcheck command as follows:

$ mysqlcheck -Aos --auto-repair -u root -p

Notes:

Optimizing a table locks it up. In other words, no update to the table is allowed while the operation is being performed. For a large production table, the potential downtime is something that the database administrator should weigh before deciding to optimize it.

Optimizing a table does not necessarily reclaim its free space. This is especially true for InnoDB tables. Prior to MySQL version 5.6, all InnoDB tables were by default stored in a single shared file. This behavior is controlled by the MySQL configuration variable innodb_file_per_table. Optimizing InnoDB tables stored together in a single file may inadvertently produce the undesirable effect of increasing the file size.

InnoDB tables fragment differently than the legacy MyISAM tables. mysqlcheck optimizes an InnoDB table by recreating it. For each InnoDB table that it optimizes, mysqlcheck generates the following informational message: 'Note : Table does not support optimize, doing recreate + analyze instead'. You can safely ignore those messages.

The mysqld server process must be running for mysqlcheck to execute.

-A (--all-databases)

With -A specified, all tables of all databases are optimized.

If you want to defragment only a specific table of a specific database, customize the following command.

$ mysqlcheck -os <database> <table> -u root -p

-o (--optimize)

This option specifies that the optimize operation is to be performed.

-s (--silent)

-s enables silent mode: only error messages are displayed.

--auto-repair

If mysqlcheck finds a target table that is corrupted, it will try to repair it.

What's next?

Part 3 of this series continues the discussion on MySQLTuner output, specifically about the management of database memory footprint.

Friday, June 19, 2015

MySQL is the database engine behind many web applications on the Internet today. While it is relatively straightforward to install, configuring MySQL to best support your particular application requires expertise and the right tools. This post introduces MySQLTuner, a command-line program which offers suggestions to optimize MySQL performance and stability.

MySQLTuner is a read-only script: it won't actually write to the MySQL configuration file. Based on your database's past usage, it recommends new values to assign to specific MySQL configuration variables. It is your responsibility to understand each recommended change and its possible ramifications, select the changes you want to make, and to make them in a controlled manner.

Installing MySQLTuner

Before you install MySQLTuner, make sure that it supports your MySQL version. You can find the up-to-date compatibility information on its website.

To identify the MySQL version on your database server, run this command:

$ mysql --version

MySQLTuner is a Perl script that you can install from the standard Debian and Ubuntu repositories:

$ sudo apt-get install mysqltuner

The prepackaged MySQLTuner may not be the latest release available. If you want the latest, or you run a Linux distro other than Debian/Ubuntu, you can install the up-to-date version by downloading it directly. Simply download the Perl script to a directory of your choice using the command:

$ wget http://mysqltuner.pl/ -O mysqltuner.pl

Running MySQLTuner

Your database should be up longer than 1 day before you run MySQLTuner. This is because MySQLTuner bases its recommendations on past database usage. The more data it has to analyze, the more accurate its recommendations. If MySQLTuner is run on a database that has been restarted in the last day, you will get a warning message: 'MySQL started within last 24 hours - recommendations may be inaccurate'.

To run the script, enter the following:

$ perl mysqltuner.pl

Analyzing output

MySQLTuner reports statistics about the database, and makes tuning recommendations. The top section of the report gives you useful database metrics, many of them actionable. The bottom section provides tuning suggestions for the MySQL configuration file.

You should thoroughly research a suggested configuration change before deciding to implement it. To change a configuration variable, edit the file /etc/mysql/my.cnf.

After you make a MySQL configuration change, restart the MySQL service.

$ sudo service mysql restart

Scheduling runs

Database tuning is not a 'once and done' type of task. Conditions change over time. A good practice is to schedule regular MySQLTuner runs using crontab.

By default, MySQLTuner prompts the user for the database login credentials. For a cronjob to run MySQLTuner, you may provide the database account and password in the user-specific MySQL configuration file.
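For example, a minimal user-specific configuration file, ~/.my.cnf (the credentials are placeholders; restrict the file with chmod 600):

```
[client]
user = root
password = YourPasswordHere
```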

Tuesday, June 9, 2015

If you are a Linux command-line user, most likely you are familiar with the use of the single asterisk ('*') in pathname expansion (aka globbing). How the asterisk behaves is standardized across all shells (bash, zsh, tcsh, etc). For example, the ls * command lists the files in the current directory as well as the contents of its immediate sub-directories.

$ ls *

The single asterisk, however, is not recursive: it does not traverse beyond the target directory. You may use the find command to generate a recursive listing of pathnames. A simpler solution is the use of the double asterisk ('**').

Unlike the single asterisk, the double asterisk is not standardized. Different shells introduced the feature at different times with slightly different behavior. This post focuses on the use of '**' for the bash shell.

The double asterisk feature first appeared in bash 4. To find out which bash version you are running, execute the following command:

$ echo $BASH_VERSION
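One wrinkle that is easy to miss: bash ships with '**' disabled. Enable the globstar shell option first, typically in your ~/.bashrc:

```shell
# '**' is inert until globstar is enabled
shopt -s globstar

# now '**' matches files and directories at any depth
ls -- **
```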

When you do a pathname expansion using '*' or '**', you run the risk that a returned filename is the same as a command-line flag, e.g., -r. To mitigate that risk, precede '**' with '--' as below. The double dash marks the spot where command-line flags end, and positional parameters begin.

$ ls -- **

Under bash, '**' expands to follow symbolic links. This behavior, however, is shell-specific. For zsh, expanding the double asterisk does not follow a symbolic link.

The double asterisk, together with the double-dash safeguard, is a useful addition to your everyday command-line usage.

Tuesday, May 26, 2015

A typical software installation goes like this. You install the software using apt-get install or yum install.
Then, you manually edit the software's configuration file in order to satisfy your requirements. If you have to repeat the install on multiple machines, this quickly becomes tedious.

Instead of manually editing the file, I run a text manipulation command such as sed or awk to make the required changes. Then, I script the procedure by inserting the commands in a bash script file.

The scripting of configuration changes serves multiple purposes:

It is a permanent record of the configuration changes.

It is readily repeatable on the same or a different machine.

Below, I illustrate 2 sed tricks to make configuration changes to the Apache webserver. The target configuration file is /etc/apache2/apache2.conf.

Before you make any change, first back up the original configuration file.

$ sudo cp /etc/apache2/apache2.conf /etc/apache2/apache2.conf.orig

Replacing first occurrence of a string

The default apache2.conf file contains the following line:

Timeout 300

Below is the sed command to change the first occurrence of Timeout in the file to 100.

$ sudo sed -i '0,/^Timeout\s/ s/^Timeout\s.*/Timeout 100/' /etc/apache2/apache2.conf

The -i parameter tells sed to edit the file in place - that is, directly in apache2.conf.

0,/^Timeout\s/ specifies the range of lines over which the sed command is to be executed. In this example, the starting line is the first line (line 0). The finishing line is the line returned by a search for the word Timeout which appears at the beginning of a line (^) and followed by a whitespace (\s).

The line range parameter limits the change to only the first occurrence of Timeout in the file. If you leave out the line range, each occurrence of Timeout in the file will be modified. In many scenarios, leaving it out is OK because the parameter occurs only once in the configuration file.

For some configuration files, a parameter can occur multiple times, in different sections. Next, I illustrate how to limit the change to a particular section of the configuration file.

Replacing a string within a target section

The MaxClients parameter occurs in 3 sections within the apache2.conf file:

mpm_prefork_module

mpm_worker_module

mpm_event_module

I want to change the MaxClients parameter within the mpm_prefork_module section only.

$ sudo sed -i '/<IfModule mpm_prefork_module>/,\@</IfModule>@ s/MaxClients.*/MaxClients 18/' /etc/apache2/apache2.conf

The line range is defined by the /<IfModule ... >/,\@</IfModule>@ clause in the above statement. The opening line in the line range is specified by a search for the <IfModule ... > pattern. The closing line is specified by the search pattern \@</IfModule>@.

An explanation of the closing line pattern is warranted. The slash (/) character is part of the search pattern for the closing line (</IfModule>). However, the slash is also the default delimiter for sed. Therefore, we must use a different delimiter (@) for the closing-line search pattern. Note that the first @ is escaped (\@).

The s/MaxClients.../MaxClients 18/ clause changes the value of MaxClients to 18.
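To convince yourself that the range restriction works, you can rehearse the command on a scratch file (file name and contents invented for the demo):

```shell
# a scratch file with MaxClients in two different sections
cat > /tmp/demo.conf <<'EOF'
<IfModule mpm_prefork_module>
    MaxClients 150
</IfModule>
<IfModule mpm_worker_module>
    MaxClients 150
</IfModule>
EOF

# limit the substitution to the mpm_prefork_module section
sed -i '/<IfModule mpm_prefork_module>/,\@</IfModule>@ s/MaxClients.*/MaxClients 18/' /tmp/demo.conf

# only the prefork section is modified
grep MaxClients /tmp/demo.conf
```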

Conclusion

The above are examples of how you can use sed to script common scenarios of changing configuration files. You can achieve the same result using other tools such as awk or perl. Please use the comment system to let us know your own examples.

If you are interested to learn more about sed, please read my earlier posts on the tool:

To read the Apache logs, you need root permissions. However, there is a shortcut that does not require you to run sudo. Note that adm - the admin group for Debian-based systems - is the group owner of the log files. So, if you become a member of adm, you don't need to sudo to read the log files.

To add peter to the adm group, execute either of the following commands:

$ sudo usermod -aG adm peter

$ sudo gpasswd -a peter adm

To verify that peter is now a member of the adm group, execute any of the following commands:

$ id -nG peter
peter adm www-data

You may be tempted, as I was, to not specify peter in the above command. Don't skip the parameter. Without the user parameter, you won't see the effect of the change in group membership unless you log out and log back in. If you are running X, that means logging out of X, not just opening a new command shell window within the same X session.

$ groups peter
peter : peter adm www-data

Again, specify peter in the command. Otherwise, you must log out and then log back in before executing the command.

$ grep adm /etc/group
adm:x:4:peter

If you have made a mistake, and now want to remove peter from the adm group, run either of the following commands:

$ sudo gpasswd -d peter adm

$ sudo deluser peter adm

Besides the adm group, you should consider adding yourself to the www-data group. The Apache web server runs under the www-data user account on Debian systems. As a member of the www-data group, you can more easily modify web server files and directories.

Wednesday, April 29, 2015

The cron daemon is a great user tool to automate tasks that don't require human intervention. Users pre-specify jobs to run in the background at particular times, for example, every Monday, Wednesday and Friday at 2am.

To use cron, each user creates his own crontab ('cron table') file. The command to examine one's crontab file is crontab -l.
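A minimal crontab looks like this (the email address and script path are placeholders):

```
MAILTO=peter@example.com
0 2 * * 1,3,5 /home/peter/backupWP.sh
```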

The MAILTO line specifies the email address to which cron sends the output of command execution. Please refer to my earlier post on how to set up an SMTP server to forward your emails.

The second crontab line specifies that the backupWP.sh script should be executed at 2am every Monday, Wednesday and Friday. The syntax may look complicated. Fortunately, you can use the on-line Crontab Generator to craft the crontab statements. If you want to learn the syntax, click here instead.

Create crontab

Your crontab file is initially empty. To create the file from scratch, run the crontab command without arguments, type in the crontab statements, and press Ctrl-D to finish.

$ crontab

Alternatively, put the statements into a temporary file, say /tmp/cron, and run this command:

$ cat /tmp/cron | crontab -

Edit crontab

If you want to modify crontab contents after they are created, run this command:

$ crontab -e

The command opens the crontab file in your default text editor. It is the most versatile way to modify crontab. You can use it to create, modify, and delete crontab statements.
Don't forget to save the file after you finish editing.

The downside of this edit command is the time and overhead of starting the text editor. To append a new statement directly, use the command in the next section.

Add to crontab

When I was new to crontab, I made the mistake of trying to append a statement by running crontab without any argument. That actually replaced everything in the crontab file with the new input.

The trick is to run 2 commands in a subshell, grouped by round brackets. The first command, crontab -l, fetches the existing crontab statements. The echo command outputs the new statement to be appended. The collective output from both commands is piped to crontab's standard input.
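The append command described above can be sketched as follows. The crontab line and script path are placeholders; the second half demonstrates the same subshell-grouping pattern with ordinary files, which you can try safely:

```shell
# Likely shape of the append command (crontab line and path are assumptions):
#   (crontab -l 2>/dev/null; echo '00 02 * * 1,3,5 /home/peter/backupWP.sh') | crontab -

# The same subshell-grouping pattern, demonstrated with ordinary files:
printf 'existing line\n' > /tmp/cron-demo
(cat /tmp/cron-demo; echo 'appended line') > /tmp/cron-demo.new
cat /tmp/cron-demo.new
```

The subshell's combined output, old contents plus the new line, becomes the new file, which is exactly what piping into crontab - achieves.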

Empty crontab

To erase all crontab contents, execute the following command:

$ crontab -r

Conclusion

You may use crontab to schedule regular maintenance and backup tasks. Once it is set up, the crontab file tends to be static. But, if you ever need to add another task, or change the scheduled times, the commands introduced in this post will come in handy.

This is part 1 of a series of Java courses, and will take 5 weeks to complete on-line.

I have never taken a formal programming course on-line. So, I can't advise on the effectiveness of such a course. But, I've taken non-programming-related edX courses before, and the experience was positive.

Thursday, April 23, 2015

Why Monit?

One morning, I went on-line to check my WordPress website. Lo and behold, I saw this error: 'Error establishing a database connection.' My website had been down for 4 hours, luckily in the middle of the night.

I used a free website monitoring service called StatusCake. Sure enough, it sent me an email alerting me to the problem. But an email sent at 2am was not helpful in solving the problem. What I really needed was a tool that not only detected when the database process went down, but also restarted the process without human intervention. Monit is such a tool.

For the rest of this post, I assume you want Monit to monitor a LAMP server (Linux, Apache2, MySQL, PHP).

Specify a mail server for Monit to send email alerts. I set up exim4 as an SMTP server on the localhost. For instructions, refer to my previous post.

set mailserver localhost

Email format

Hopefully, you won't receive many alert emails, but when you do, you want the maximum information about the potential problem. The default email format contains all the information known to Monit, but you may customize the format in which the information is delivered. To customize, use the set mail-format statement.
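A sketch of such a statement, using Monit's built-in substitution variables (the exact layout shown here is my own, not the post's original):

```
set mail-format {
  from: monit@$HOST
  subject: monit alert -- $EVENT $SERVICE
  message: $EVENT on $HOST
           Date:        $DATE
           Action:      $ACTION
           Description: $DESCRIPTION
}
```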

If any actionable event occurs, Monit sends an email alert to a predefined address list. Each email address is defined using the set alert statement.

set alert root@localhost not on { instance, action }

In the above example, root@localhost is the email recipient. Please refer to my earlier post about redirecting local emails to a remote email account.

Note that an event filter is defined (not on { instance, action }). root@localhost will receive an email alert on every event unless it is of the instance or action type. An instance event is triggered by the starting or stopping of the Monit process. An action event is triggered by certain explicit user commands, e.g., to unmonitor or monitor a service. Click here for the complete list of event types that you can use for filtering.

By default, Monit sends an email alert when a service fails and another when it recovers. It does not repeat failure alerts after the initial detection. You can change this default behavior by specifying the reminder option in the set alert statement. The following example sends a reminder email on every fifth test cycle if the target service remains failed:

set alert root@localhost with reminder on 5 cycles

Enabling reporting and service management

You can dynamically manage Monit service monitors, and request status reports. These capabilities are delivered by an embedded web server. By default, this web server is disabled. To enable it, include the set httpd statement.

set httpd port 2812 and
use address localhost
allow localhost

Note: I've only allowed local access to the embedded web server. The Useful Commands section below explains the commands to request reporting and management services.

Resource monitor settings

The following are the key resources to monitor on a LAMP server.

System performance

You can configure Monit to send an alert when system performance drops below certain thresholds. The system resources that can be monitored include load averages, memory, swap, and CPU usage.
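A sketch of such a check, modeled on the examples in the Monit documentation (the threshold values are assumptions; tune them to your server):

```
check system localhost
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    if swap usage > 25% then alert
    if cpu usage (user) > 70% then alert
```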

You can create a monitor which is triggered when the percentage of disk space used is greater than an upper threshold.

check filesystem rootfs with path /
if space usage > 90% then alert

You may have more than 1 filesystem created on your server. Run the df command to identify the filesystem name (rootfs) and the path it was mounted on (/).

MySQL

Instead of putting the MySQL-specific statements in the main configuration file, I elect to put them in /etc/monit/conf.d/mysql.conf. This is a personal preference. I like a more compact main configuration file. All files inside the /etc/monit/conf.d/ directory are automatically included in Monit configuration.

If the MySQL process dies, Monit needs to know how to restart it. The command to start the MySQL process is specified by the start program clause. The command to stop MySQL is specified by the stop program clause.
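A sketch of the mysql.conf check consistent with the description (the pidfile path, socket path, and init commands are assumptions for Debian 8, not the post's original file):

```
check process mysql with pidfile /var/run/mysqld/mysqld.pid
    start program = "/etc/init.d/mysql start"
    stop program = "/etc/init.d/mysql stop"
    if failed unixsocket /var/run/mysqld/mysqld.sock then restart
    if 5 restarts within 5 cycles then timeout
```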

A timeout event is triggered if MySQL is restarted 5 times in a span of 5 consecutive test cycles. In the event of a timeout, an alert email is sent, and the MySQL process will no longer be monitored. To resume monitoring, execute this command:

$ sudo monit monitor mysql

Apache

I put the following Apache-specific statements in the file /etc/monit/conf.d/apache.conf.

At every test cycle, Monit attempts to retrieve http://example.com/monit/token. This URL points to a dummy file created on the webserver specifically for this test. You need to create the file by executing the following commands:
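For example (the document root /var/www/html is an assumption; adjust it to your Apache configuration, and make sure http://example.com/monit/token resolves to the file):

```
$ sudo mkdir -p /var/www/html/monit
$ echo 'monit test' | sudo tee /var/www/html/monit/token
```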

Besides testing web access, the above configuration also monitors resource usage. The Apache process is restarted if it spawns more than 250 child processes. Apache is also restarted if the server's load average is greater than 10 for 8 cycles.
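For reference, an apache.conf sketch consistent with the description above (the paths, the example.com URL, and the init commands are assumptions, not the post's original file):

```
check process apache2 with pidfile /var/run/apache2/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program = "/etc/init.d/apache2 stop"
    if failed url http://example.com/monit/token then restart
    if children > 250 then restart
    if loadavg (5min) > 10 for 8 cycles then restart
```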

Useful commands

To print a status summary of all services being monitored, execute the command below:

$ sudo monit summary

For a detailed report on each service, run sudo monit status instead.

Conclusion

I'd recommend that you run Monit on your server in addition to signing up for a remote website monitoring service such as StatusCake. While the 2 services do overlap, they also complement each other. Monit runs locally on your server, and can restart processes when a problem is detected. However, a networking problem may go undetected by Monit. That is where a remote monitoring service shines. In the event of a network failure, the remote monitor fails to connect to your server, and will therefore report a problem that may otherwise go unnoticed.

Tuesday, April 14, 2015

Is your web app slow? Is network bandwidth the problem? To diagnose the problem, begin by measuring the network bandwidth. Many users run the popular, web-based speedtest.net to capture speed performance data. This is a good solution if the X Window System is installed on the webserver. However, I have a Linux VPS server without an X graphical environment. Command line is the only viable way to perform a speed test on that server.

Power Linux users may want to use the iperf program to measure network bandwidth. To use iperf effectively, you need some basic knowledge of TCP/IP. In addition, you need to set up iperf on 2 machines: the 'client' and the 'server'. However, if you like the simplicity of speedtest.net, you will be happy to know about the following command-line tool for accessing speedtest.net servers.

speedtest-cli is a command-line Python program for testing Internet bandwidth using speedtest.net.

To capture the upload and download speeds of a local machine, you can simply run speedtest-cli without any parameter. The program automatically selects the 'best' speedtest.net server to test bidirectional transmission from the local machine.

In the above example, the program selected a test server located only 3 kilometers away from the local machine.
Most of my web visitors, however, come from the east coast of the United States.
The speed tests are more useful to me if the test server is located, say, in New York City.

To list the available test servers, run speedtest-cli --list. Then, select one from the list to specify as the test server with the --server option, say 2947 (Atlantic Metro in New York City):

$ speedtest-cli --server 2947

To track network speed performance more consistently over time, you can designate the same test server in your subsequent tests.

Monday, April 6, 2015

Suppose you downloaded a PowerPoint or a PDF file from slideshare. You liked it so much that you wanted to print it out.
But, alas, it was 50 pages long.

This tutorial introduces the command-line tools to n-up a PPT or PDF file, i.e., batch multiple pages of the input file onto a single page of the output file. The output file is in PDF format.

To 2-up a file, you place 2 original pages on a single output page. Similarly, to 4-up a file, you place 4 original pages on a single output page. By n-upping a file, you drastically reduce the number of pages for printing.

Convert to PDF

If the original file is a PowerPoint file (PPT, PPTX, PPS, PPSX), you need to first convert it to PDF. The tool I use is unoconv.

To install unoconv on Debian,

$ sudo apt-get install unoconv

To convert input.ppt to input.pdf,

$ unoconv -f pdf input.ppt

N-up PDF

Now that you have a PDF file, use the pdfnup program to n-up the file.

To install pdfnup,

$ sudo apt-get install pdfjam

Behind the scenes, pdfnup uses the TeX typesetting system to do the n-up conversion. So, you also need to install some LaTeX-related packages.

$ sudo apt-get install texlive-latex-base texlive-latex-recommended

Now, you are ready to execute the following command to n-up input.pdf:

$ pdfnup --nup 2x3 --paper letter --frame true --no-landscape input.pdf

--nup 2x3: 2x3 means 2 columns and 3 rows. This houses a total of 6 input pages on each output page.

--paper letter: The default paper size is A4. For North Americans, specify --paper letter for the US letter size.

--frame: By default, the subpages on the output page are not framed, i.e., there are no borders around each subpage. To specify that a frame should be drawn around each subpage, specify --frame true.

--no-landscape: The default page orientation is landscape. If you want the portrait orientation, specify --no-landscape.

The output PDF filename for the above example is input-nup.pdf. The output filename is constructed by appending the default suffix -nup to the input filename.

The above method is not the only way to n-up a PDF file. An alternative is to first convert the PDF file to PostScript, n-up the PostScript file, and then convert it back to PDF, for example with pdf2ps, psnup (from the psutils package), and ps2pdf:

$ pdf2ps input.pdf input.ps
$ psnup -2 input.ps output.ps
$ ps2pdf output.ps output.pdf

You can choose either method to do the n-up conversion. I generally avoid the PostScript method because it involves an extra conversion step. Regardless of which method you choose, the environment will thank you for using less paper.

Friday, March 20, 2015

Updating a Debian system is as easy as executing the following command as root:

# apt-get update && apt-get upgrade

If you have a sudo account, run the command like this:

$ sudo apt-get update && sudo apt-get upgrade

Instead of running the command interactively, you can automate the manual update process with a cron job. Below, I assume you log in as root.

Run the following command to create or edit your cron jobs. Note that the default text editor is opened automatically for you to enter the cron jobs.

# crontab -e

As an example, I will schedule the update to happen daily at 2am. I entered the following line as my first (failed) attempt.

00 02 * * * apt-get update 2>&1 && apt-get -y upgrade 2>&1

A typical upgrade usually prompts you to confirm a transaction before it is executed. Because the cron upgrade is non-interactive, I specify the -y parameter to tell apt-get to assume yes for all prompts.

At 2am, the above command executed, and failed with the following errors:

debconf: unable to initialize frontend: Dialog
debconf: (TERM is not set, so the dialog frontend is not usable.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
dpkg-preconfigure: unable to re-open stdin:
Fetched 49.5 MB in 17s (2,840 kB/s)
dpkg: warning: 'ldconfig' not found in PATH or not executable
dpkg: warning: 'start-stop-daemon' not found in PATH or not executable
dpkg: error: 2 expected programs not found in PATH or not executable
Note: root's PATH should usually contain /usr/local/sbin, /usr/sbin and /sbin
E: Sub-process /usr/bin/dpkg returned an error code (2)

There were 2 problems. First, debconf expected an interactive front-end, but cron jobs run without a controlling terminal. Second, the PATH for locating commands was not set up correctly.

To correct the problems, re-run crontab -e, and insert the following lines to properly set up the run-time environment.
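For example (a sketch: the PATH value mirrors the note in dpkg's error output, and DEBIAN_FRONTEND=noninteractive silences the debconf prompts for subsequent jobs):

```
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
DEBIAN_FRONTEND=noninteractive
00 02 * * * apt-get update 2>&1 && apt-get -y upgrade 2>&1
```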

Automating the system update process saves you time, and keeps your system up-to-date as a protection against potential cyber attacks. If you are interested in Debian system administration, please see What to do after spinning up a Debian VPS.

Wednesday, March 4, 2015

convert is a member of the ImageMagick software suite for image manipulation.
Two of my earlier posts dealt with using convert to slice and
resize an image.
It is a lesser-known fact that convert also works with pdf files.
I'd previously explained how to merge and split up pdf files using tools such as pdftk and gs.
In this post, I'll illustrate how to do the same using the convert program.

First, you need to install convert which is packaged in the ImageMagick suite.

$ sudo apt-get install imagemagick

Merging 2 pdf files (file1 and file2) into a new file (output) is as simple as executing:

$ convert file1.pdf file2.pdf output.pdf

You can merge a subset of pages instead of the entire input files.
To accomplish that, use the angle brackets to specify the target subset of pages.
For example, to merge page 1 of file1 with pages 1, 2 and 4 of file2,
run the following command:

$ convert file1.pdf[0] file2.pdf[0-1,3] output.pdf

Note that page numbers are zero-based.
Therefore, [0] is page 1, and [0-1] are the pages ranging from page 1 to page 2.

Finally, the following example splits up input.pdf into 2 files: first2output.pdf and next2output.pdf.
The former output file contains pages 1 and 2 from the original file; the latter, pages 3 and 4.

$ convert input.pdf[0-1] first2output.pdf
$ convert input.pdf[2-3] next2output.pdf

The above solution may look like a hack to some of us. There may even be other solutions, perhaps using complex joins and unions.
But I like the above approach because it is simple, both conceptually and syntactically.

Friday, February 20, 2015

Recently, I opened an unmanaged VPS hosting account with Digital Ocean (of which I am an affiliate), and created a barebones Debian virtual server. Because the hosting was unmanaged, I had to apply all the system changes to the machine myself. Below are the steps I took to set up the Debian virtual machine after spinning it up for the first time.

I assume that you want to set up the LAMP stack (Linux, Apache, MySQL, PHP) on your machine.

SSH into the virtual machine.
Before you can login, you need to know the public IP address of the new machine. Digital Ocean sends it to you in an email.

$ ssh root@your.IP.address

Change root password.
When you first log in to the Digital Ocean server, you are automatically prompted to enter a new password. If your VPS is with another provider, change the password with the following command.

# passwd

Run the date command to verify the current time. If the time looks wrong, your machine may be pre-configured to the wrong timezone.

# date
Thu Dec 4 00:07:37 UTC 2014

The current time reported was in the UTC timezone. But, I live in Vancouver, Canada, which is in the Pacific timezone. To change the timezone, write the proper time zone string to the file /etc/timezone, and run dpkg-reconfigure:

# echo 'America/Vancouver' > /etc/timezone
# dpkg-reconfigure -f noninteractive tzdata

My VPS hosting plan with Digital Ocean gives me 512 MB of RAM and 20 GB in SSD disk space. While RAM is limited in that configuration, the virtual server has plenty of unused disk space. The extra disk space can be utilized to boost the available virtual memory. Follow the steps below to create the swap file /var/swap.img.
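The steps follow the standard swap-file recipe; the 1 GB size is an assumption, so adjust it to your needs. Run the commands as root:

```
# dd if=/dev/zero of=/var/swap.img bs=1M count=1024
# chmod 600 /var/swap.img
# mkswap /var/swap.img
# swapon /var/swap.img
# echo '/var/swap.img none swap sw 0 0' >> /etc/fstab
```

The last line makes the swap file persistent across reboots. Verify the new swap space with free -m or swapon -s.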

You run the risk of ignoring important system alerts if emails to root are not read regularly. Follow the steps below to forward all root emails to an email address that you actually monitor.

Configure exim4 to redirect all outbound emails to the Google Gmail SMTP server.

You need a Gmail account in order to use Gmail SMTP. Refer to my earlier post for more details.

# hostname --fqdn > /etc/mailname
The /etc/mailname file should contain the fully-qualified domain name to use for outgoing mail messages. The Sender email address of a message uses this value as its domain name.

# echo '*.google.com:yourAccount@gmail.com:yourPassword' >> /etc/exim4/passwd.client
Customize the above with your own Gmail account and password. If you are not using Gmail, replace google.com with the proper domain name.

# echo -e 'root: peter\npeter: yourEmailAddress@somedomain.com' >> /etc/aliases
The above command inserts 2 lines into /etc/aliases. The first line (root: peter) specifies that all emails to root are forwarded to peter, a new user we added earlier.
The second line (peter: yourEmailAddress@somedomain.com) specifies that all emails to peter are redirected to the external email address.

# newaliases
Run newaliases to rebuild the email aliases database.

Note which essential packages are still uninstalled.

My yet-to-install package list comprises:

chkconfig

MySQL

Apache

PHP5

You can use the following command to find out if a package, say PHP5, is installed:

# dpkg -s php5

If the package is not installed, dpkg reports that it is not installed and no information is available.

For further details on how to install the above packages, refer to my earlier post.

After you put in all the hard work to get this far, your machine is finally in a usable state. It is wise (albeit optional) to save the current machine state. Digital Ocean allows you to create a system snapshot of your server. You can restore the server to that particular state at any time in the future.

When you login to the virtual server via SSH, don't forget to use the new user and port number.