Friday, January 18. 2008

What is PgAgent?

PgAgent is a basic scheduling agent that comes packaged with PgAdmin III (since pre-8.0 or so) and that can be managed by PgAdmin III.
PgAdmin III is the database administration tool that comes packaged with PostgreSQL.
For those familiar with
unix/linux cronjobs and crontab structure, PgAgent's scheduling structure should look very familiar.
For those familiar with using Microsoft SQL Server Scheduling Agent or Windows Scheduling Tasks, but not used to crontab structure,
the PgAdmin III Job Agent interface to PgAgent should look very welcoming, but the schedule tab may look a little unfamiliar.

PgAgent can run both PostgreSQL stored functions and sql statements as well as OS shell commands and batch tasks.

Why use PgAgent over other agents such as cronjob, Microsoft Windows Scheduled Tasks, or Microsoft SQL Server Agent?

For one thing, since PgAgent runs off of standard Postgres tables,
you can probably more easily programmatically change jobs from it from within PostgreSQL sql calls
that insert right into the respective PgAgent pga_job, pga_jobstep, pga_jobagent, pga_schedule tables to roll your own App integrated scheduler.

Compared to CronTab, PgAgent has the following advantages:

You can have multiple steps for a job without having to resort to a batch script.

You can have multiple schedules for a job without having to repeat the line.

Is cross platform

For running PostgreSQL specific jobs such as stored function calls or adhoc sql update statements etc. it is a bit easier granted the PostgreSQL account used is a super user or has sufficient rights to the dbs.

Compared to Windows Scheduled Tasks - PgAgent has the following advantages:

You can go down to the minute level

Have several steps per job

Have multiple schedules per job

Is cross platform

For running PostgreSQL specific jobs such as stored function calls it is easier than using windows scheduled tasks.

Compared to SQL Server Agent - PgAgent has the following advantages:

SQL Server Agent comes only with Microsoft SQL Server Workgroup and above so not an option say for people running SQL Server Express editions or no SQL Server install.

Is cross platform

Some missing features in PgAgent which would be nice to see in later versions would be some sort of notification system similar
to what SQL Server Agent has that can notify you by email when things fail and a maintenance wizard type complement tool similar to what SQL Server 2005 Maintenace Wizard provides that allows
users to walk thru a set of steps to build automated backup/DB Maintenance tasks. This is a bit tricky since it would need to be cross-platform.
Granted the job history display in PgAdmin that provides success and time taken to perform task is a nice touch and makes up for some of this lack and you can always roll your own by running some monitor to check the job event logs.

How to install PgAgent

Note the docs describe how to install PgAgent: http://www.pgadmin.org/docs/1.8/pgagent-install.html,
but the example to install it in a db called PgAdmin seems to send people off in the wrong direction. We shall highlight the areas where people most commonly screw up in installation, but for master
reference, refer to the official PgAgent install docs listed above.

While you can install PgAgent in any database, to our knowledge, you can only administer it via PgAdmin III if it is installed in the maintenance database which is usually the database called postgres. For ISPs, having the ability to install it in any db and rolling your own agent interface may be a useful feature.

Other note that is not explicitly stated, but is useful to know: PgAgent need not be installed on the same Server/Computer as your PostgreSQL server. It just needs to have the pgAgent files, which you can get by installing PgAdmin III or copying over the necessary files. PgAgent service/daemon also needs necessary access to the PostgreSQL database housing the job tables. If you are using it to backup databases to a remote server, the account it runs under will also need network file access or ftp access to the remote server. You can also have multiple PgAgent's running on different servers that use the same schedule tables.

To install PgAgent, there are basically three steps

Make sure you have plpgsql language installed in the postgres database. Which you do with the sql command runin postgres database.

Run the PgAgent.sql using PgAdmin III or psql and run it in the db postgres - found in /path/to/PgAdmin III/1.8/scripts (on windows this is usually in "C:/Program Files/PgAdmin III/1.8/scripts"). This creates a schema catalog in the postgres database called pgAgent with the helper pgagent tables and functions.

Install the PgAgent server service/Daemon process: On windows - you run a command something like below - the -u user is not the PostgreSQL user but the computer user that the PgAgent will be running under."C:\Program Files\PostgreSQL\8.2\bin\pgAgent" INSTALL pgAgent -u postgres -p somepassword hostaddr=127.0.0.1 dbname=postgres user=postgres

After you install on Windows - you should go into Control Panel -> Administrative Tools -> Services - "PostgreSQL Scheduling Agent - pgAgent" -> and start the service.
If the service doesn't start - most likely you typed the postgres computer account password in wrong. Simply switch to the Log On tab and retype the password or change to use a different account.

Keep in mind - if you wish PgAgent to run scripts that require File Network access (e.g. copying files to network servers, you need to have the service run under a network account that has
network access to those servers.

On Unix/Linux systems - it varies how its installed. It is usually run under the root account and the line is added to startupscripts usually /etc/init.d or I think on MacOSX its /etc/xinetd.d/path/to/pgagent hostaddr=127.0.0.1 dbname=postgres user=postgres

Note: as the docs say - its probably best not to specify the password. Instead - you can set the postgres account to be trusted from server you have PgAgent installed on or use the ~pgpass approach.

Once you have PgAgent installed, and open/refresh PgAdmin III, you should see another section called Jobs that looks like below:

If per chance, you do not see the new Jobs icon, make sure that you have PgAgent jobs checked by going to File->Options->Display

Creating Backup Jobs

Creating backup jobs is done with a shell script of some sort. In Windows this can be done with a .bat file and specifying the file in the PgAgent job or by writing the command
directly in the PgAgent job. In Linux/Unix - this is done with a .sh file and specifying that in the PgAgent job or writing the command directly in the PgAgent job.

Generally we go with a .bat or .sh file, because using a shell script allows you more granular control - such as backing up multiple databases or having
a separately date named file for each daily backup.

Below is a sample batch script for Windows that backs up selected databases and then does a full Pg_dumpall as well

@echo off
REM - backup directory can be a file server share that the PgAgent windows service account has access to
set BACKUPDIR="/path/to/backup/"
set PGHOST="localhost"
set PGUSER="postgres"
set PGBIN="C:/Program Files/PostgreSQL/8.2/bin/"
for /f "tokens=1-4 delims=/ " %%i in ("%date%") do (
set dow=%%i
set month=%%j
set day=%%k
set year=%%l
)
for /f "tokens=1-3 delims=: " %%i in ("%time%") do (
set hh=%%i
set nn=%%j
)
REM - It would be nice to use gzip in the pg_dumpall call (or if pg_dumpall supported compression as does the pg_dump)
REM here as we do on the linux/unix script
REM - but gzip is not prepackaged with windows so requires a separate install/download.
REM Our favorite all purpose compression/uncompression util for Windows is 7Zip which does have a command-line
%PGBIN%pg_dumpall -h %PGHOST% -U %PGUSER% -f %BACKUPDIR%fullpgbackup-%year%%month%.sql
%PGBIN%pg_dump -i -h %PGHOST% -U %PGUSER% -F c -b -v -f "%BACKUPDIR%db1-%year%%month%%day%%hh%.compressed" db1
%PGBIN%pg_dump -i -h %PGHOST% -U %PGUSER% -F c -b -v -f "%BACKUPDIR%db2-%year%%month%%day%%hh%.compressed" db2

Save the respective above scripts in a (dailybackup.bat for windows pgagent) or (dailybackup.sh for Linux/Unix pgagent) file.

For bash unix scripts make sure it has unix line breaks (not windows) -
you may use dos2unix available on most linux/unix boxes to convert windows line breaks to unix linebreaks.
When saving as .sh make sure to give the .sh file execute rights using chmod on linux/unix.
Also change the db1, db2 and add additional lines for other databases you wish to backup to the respective names of your databases and add additional as needed.

771 permissions gives execute rights to public and all rights (read,write,execute) to owner and group. Alternatively you could do 640 instead which would remove all rights from public, but then you will need to do a Change owner chown
to change ownership to account you are running PgAgent under. Note the above script and commands we tested on a CentOS box so commands and script may vary if you are running on MacOSX or another Linux variant.

A couple of notes about the above which are more preferences than anything.

We like to create a dump all backup which would contain all the
databases and just overwrite it daily but keep one for each month. This is more for major disaster recovery than anything else.

We prefer the Postgres Native Compressed format for our date stamped backups.
The reason for that is with the pg_dump compressed format, it takes up less space, deals with binary objects well, and has the benefit that you can restore individual database objects for it. This is very useful
in cases where someone screws up and they come back to you days or months later.

You will note that the date stamp format we have included includes the Hour and would create a file something of the form - dbname-2008010102.compressed
- the reason for that is that it sorts nicely by name and date of backup and if disk space was an issue, you could easily include a line that deletes say backups older than a month. Going down to the hour level
allows us to quickly create emergency backups by clicking the Run Now on PgAdmin Jobs interface that wouldn't overwrite the current days backup.

In practice we also like to have at least one of the backups ftped to a remote location and include that as part of the
script and/or backed up to a remote server that has good connectivity with the pgagent server. This helps in cases of complete server failure. This step is not included
here since its too OS and install specific to get into.

Next to create the PgAgent backup job follow the following steps.

Open up PgAdmin - navigate to jobs section, right mouse click and click New Job -

Fill in the properties tab as shown in this snapshot -

Switch to the Steps tab and select Batch and fill in details as shown -

Switch to the Definition tab and type in the path to the batch or sh file. Keep in mind the path is in context of the PgAgent service. So if you have PgAgent installed on a server that is different from the PostgreSQL server, then
make sure the paths in your script and path to the file is set as if you were the PgAgent account on PgAgent server. As show here
and then click the OK button.

Next switch to the Schedules tab and click to add a Schedule.

Next Switch to Times tab. The reason we are skipping the Days tab is that anything you do not fill in is assumed to be All since we want all days, we leave that tab blank.
This diagram shows setting the backup time to be 02:15 AM every day -

Once the job is saved, the hierarchy in PgAgmin looks like the below snapshots

Clicking on the Daily Schedule Icon

Clicking on the respective objects in the Job Hierarchy such as a Step or schedule gives you detailed information about each of those.
The statistics tab gives you details such as how long a step took, whether or not it succeeded or failed and when it was run.

Keep in mind that while PgAgent is closely related to PostgreSQL and uses PostgreSQL for scheduling and logging, there isn't any
reason you can not use it as an all-purpose scheduling agent. In fact we use it to backup MySQL as well as PostgreSQL databases, do automated web crawls, download
remote backups etc. Using the SQL Job Type option, you can use it to run postgresql functions that rebuild materialized views, do other standard postgresql specific sql maintenance tasks, etc. On top
of that PgAdmin provides a nice interface to it that you can use on any computer (not just the one running PgAgent).

I just missed some information about setting up the daily backup job with the local authorization set to password on Windows. Passing the password to pgAgent with password=**** is not sufficient (and not save), as pg_dump requires a password too!
This is not easy to detect, as pgAgent just waits for infinity.

So setting the password in %APPDATA%\postgresql\pgpass.conf finally solved the issue.Now my db will get its daily backup :)

in the Linux script in this topic, when you want to backup more or less database/schema you must reedit the script, it's so inconvenient. I find out the way easier to manage what you backup. I hope it will be usefull. See my Script:

#!/bin/bash
#backup directory can be a file server share that the PgAgent daemon account has access to

Help!.
I did every thing you say and still can't run any job. Untill de job is shown in the panel, when I press "Run Now" nothing happens. Here is what I did:

1- create plpgsql languaje.
2- Run the script to create the schema
3- set the password in the user profile
4- install de agent services.
5- go to pgadmin
6- create new job as is shown
7- press "run now"
8- get confused when nothing happens.

Here is yet another script solution, using Python. I have this on a Windows machine tucked on a server. I just need to double-click it when I think it is a good time to do a backup (typically once in the morning and once in the afternoon).

Host - Windows XP SP2
PostgreSQL 8.2.9
PgAgent is installed and running as service
Problem - in PGAdminIII, i setup the job as listed and when i right-click on jobs, there's no Run Now.
i only get Refresh, New Job, and Object List Report.

able to configure everything and able to get a file in the specified output directory also. but the file size shows as 0kb and in statistis section of pgAgent job in pgAdmin III status shows as Successful.

Is pgAgent really meant to run on "postgres" database?? If not, why is that when I select other database on which the step will run, it says, "Couldn't connect to the database!", but when i select the postgres database, it is running smoothly. If pgAgent can run in other database, can you teach me how can i do that?

Depends what account you have the PgAgent service running under. For our purposes we setup a separate account with admin rights so we wouldn't have to worry about that stuff and have the service running under that account. That also allows us have it write the backups to our network file server.

which server are you running? I've had some stupid issues with Windows 2008 R2, though can't recall how I overcame them.

Which platform are you running on? Unfortuantely the pgagent success for non-sql jobs is not ahtat trustworthy (or at least not on windows anyway). But you can usually see the log results which often shows the error that happened.

First you want to make sure whatever script you have runs fine as is. Standalone. If it does, then you want ot make sure the account your have pgAgent running under can also run the process. That is suually the probelm that the pgAgent daemon (/service) account doesn't ahve enough rights to run your batch job.

You wrote: "In practice we also like to have at least one of the backups ftped to a remote location and include that as part of the script ...".
It would have been very interesting to see how you did this with pgagent on your CentOS test box, because I have some trouble getting both ftp and sftp performing a set of commands I try to feed them with, by several different mechanisms. Ftp/sftp do transfer the files when I run the script manually, but not by pgagent. I suspect it might have something to do with user or env settings, but have not found the solution yet.

I just installed pgAgent 3.0 on MS Win Server 2008 using pgAdmin 1.14.1 following the above to do batch jobs, and the only gotcha I had was that pgAgent kept failing to actually run the batch file.

By running pgAgent in DEBUG mode at the command prompt, I saw the batch file pgAgent was trying to run was not my batch file as per above, but a temp created batch file. After putting the above code directly into the definition, instead of using batch file path and filename, the job ran perfectly.

Paul,
Great. Come to think of it I think I've always run the scripts for windows by pasting the script directly in the batch window. For Unix/Linux I usually do the same too, though I think on linux I have done with path to file without issues. Maybe that's why I assumed it would work on windows as well.

I spoke to soon, and didn't do my "due diligence". I don't want to lead anyone down the wrong path. pgAgent ran while in debug mode from command prompt, but not in service mode. Also, I was able to run the batch file in debug mode. It just needed an extra "\" for the path:
set BACKUPDIR="C:\pgjobs\backup\\". This is odd, because PostgreSQL runs as the same user as pgAgent with no problem.

E-Mail addresses will not be displayed and will only be used for E-Mail notifications.

To prevent automated Bots from commentspamming, please enter the string you see in the image below in the appropriate input box. Your comment will only be submitted if the strings match. Please ensure that your browser supports and accepts cookies, or your comment cannot be verified correctly.Enter the string from the spam-prevention image above: