Check volume groups

Choose a volume group with sufficient free space. Because we are using thin provisioning, we can allocate less space than we would with normal provisioning.

Then check the existing logical volumes as well:

lvs

Creating thin volume pool

Next, we create a thin pool in the chosen volume group (in this example, vgdata):

lvcreate -L 50G --thinpool globalthinpool vgdata

Print the resulting volumes using lvs:

We can see that globalthinpool has been created with a logical size of 50 GB.

Creating thinly provisioned volume

Now we create a thinly provisioned volume using the previously created pool:

lvcreate -V100G -T vgdata/globalthinpool -n dockervol

The command creates a 100 GB logical volume using thin provisioning. Note that the volume is 100 GB, which is larger than the actual 50 GB thin pool. Beware that we must monitor actual volume usage, because if the 50 GB pool runs out of space, the programs writing to the volume will freeze. See http://unix.stackexchange.com/questions/197412/thin-lvm-pool-frozen-due-to-lack-of-free-space-what-to-do if you encounter such a condition (hint: something like lvresize -L +100g vgdata/globalthinpool).
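To keep an eye on pool usage, the Data% column reported by lvs can be watched. A sketch (run as root; the volume group name follows the example above):

```shell
# Show how full the thin pool really is: Data% is the percentage of the
# 50G pool actually allocated, Meta% the metadata usage
lvs -o lv_name,lv_size,data_percent,metadata_percent vgdata

# Grow the pool well before Data% approaches 100, e.g. by 20G:
# lvextend -L +20G vgdata/globalthinpool
```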

The result is shown in the picture below:

Complications

If we get errors such as /usr/sbin/thin_check: execvp failed, it usually means thin_check is not installed yet. On Ubuntu, install it with:

apt-get install thin-provisioning-tools

Formatting the new volume

Before the new volume can be used, format it using mkfs.ext4:

mkfs.ext4 /dev/mapper/vgdata-dockervol

Now we can mount it:

mkdir /mnt/dockervol

mount /dev/mapper/vgdata-dockervol /mnt/dockervol
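To make the mount persist across reboots, an fstab entry could be added. A sketch using the device and mount point from the example above:

```
# /etc/fstab
/dev/mapper/vgdata-dockervol  /mnt/dockervol  ext4  defaults  0  2
```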

Conclusion

With LVM thin provisioning, only the blocks that are actually used are allocated in the volume group.

Sunday, October 30, 2016

Background

Sometimes I need to run X Window-based applications inside Docker containers, and running the X server locally is impractical, either for latency reasons or because the working laptop has no X server. First I tried to create a VirtualBox-based VNC server, and it worked fine albeit a little slowly, but Docker containers seem to have a smaller memory and disk footprint. So I tried to create a VNC server running X inside a Docker container. I had already tried suchja/x11server (ref), but it has a strange problem of ignoring the cursor keys of my MacBook on WebKit pages (such as Pentaho Data Integration's Formula page).

After comparing the container to my VirtualBox VM (which already worked), I found that the VM had xfonts-100dpi installed, because xfce4 recommends xorg, which in turn requires xfonts-100dpi. The apt line in the Dockerfile has the standard no-install-recommends clause in order to keep images small. So the first step is to change the Dockerfile to:

After making the change, the next problem occurred:

Setting up libfontconfig1:amd64 (2.11.0-6.3+deb8u1) ...
Setting up fontconfig (2.11.0-6.3+deb8u1) ...
Regenerating fonts cache... done.
Setting up keyboard-configuration (1.123) ...
debconf: unable to initialize frontend: Dialog
debconf: (TERM is not set, so the dialog frontend is not usable.)
debconf: falling back to frontend: Readline
Configuring keyboard-configuration
----------------------------------
Please select the layout matching the keyboard for this machine.
1. English (US)
2. English (US) - Cherokee
3. English (US) - English (Colemak)
4. English (US) - English (Dvorak alternative international no dead keys)
5. English (US) - English (Dvorak)
6. English (US) - English (Dvorak, international with dead keys)
7. English (US) - English (Macintosh)
8. English (US) - English (US, alternative international)
9. English (US) - English (US, international with dead keys)
10. English (US) - English (US, with euro on 5)
11. English (US) - English (Workman)
12. English (US) - English (Workman, international with dead keys)
13. English (US) - English (classic Dvorak)
14. English (US) - English (international AltGr dead keys)
15. English (US) - English (left handed Dvorak)
16. English (US) - English (programmer Dvorak)
17. English (US) - English (right handed Dvorak)
18. English (US) - English (the divide/multiply keys toggle the layout)
19. English (US) - Russian (US, phonetic)
20. English (US) - Serbo-Croatian (US)
21. Other
Keyboard layout:

During the image build, there is a prompt asking about the keyboard layout, and typing an answer results in a hung process. This error is similar to this Stack Overflow question. I tried the suggestions there (copying to /etc/default/keyboard), but it still hangs. After struggling through many experiments, I finally used this Dockerfile:
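The key ingredient for suppressing debconf prompts during a build is the noninteractive frontend. A minimal sketch of such a Dockerfile (not necessarily the exact file used here; the base image and the preseed key are assumptions):

```dockerfile
FROM debian:jessie

# Keep debconf from asking questions during any apt operation
ENV DEBIAN_FRONTEND=noninteractive

# Preseed the keyboard layout so keyboard-configuration never prompts
RUN echo 'keyboard-configuration keyboard-configuration/layoutcode string us' \
      | debconf-set-selections \
 && apt-get update \
 && apt-get install -y xfonts-100dpi keyboard-configuration \
 && rm -rf /var/lib/apt/lists/*
```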

How to use vnc server container

First, you build the image:

docker-compose build

Then, create the containers:

docker-compose up -d

Check the IP of the container (replace vncserver1 with the container_name in your docker-compose.yml):

docker inspect vncserver1 | grep 172

Tunnel VNC by ssh port forwarding to the container IP (for example, 172.17.0.9) and port 5903:

ssh -L 5903:172.17.0.9:5903 user@servername

Now, to view the screen, you can connect to the VNC server using VNC Viewer or by opening the vnc://localhost:5903 URL in Safari.

For X Window-based applications, you first must grab the X authority magic cookie:
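Cookie handling typically starts by listing what the server issued. A sketch (display numbers are examples):

```shell
# On the machine where the X session was authorized: print the magic
# cookie associated with the current display
xauth list "$DISPLAY"
```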

Variations

This Dockerfile uses LXDE:

This one uses only the Openbox window manager:

Conclusion

We are able to run an X Window server inside a Docker container. The resulting image is about 625 MB, which could be made much smaller by removing Firefox (Iceweasel) and using only the Openbox window manager.

Friday, October 28, 2016

Background

This post describes notes resulting from my initial exploration of Docker. Docker could be described as a thin VM. Essentially, Docker runs processes on a Linux host in a semi-isolated environment. It is a brilliant technical accomplishment that exploits several characteristics of running applications on a Linux-based OS. First, the result of package installation is the distribution of package files into certain directories, plus changes to certain files. Second, an executable from one Linux distribution can run on another Linux distribution, provided that all the required shared libraries and configuration files are in place.

Basic characteristic of Docker images

Docker images are essentially similar to zip archives, organized as layers upon layers. Each additional layer provides new or changed files.

Docker images should be portable, meaning they can be used for different instances of an application on different hosts.

Docker images are built using a Dockerfile and a docker entry script, where:

a. The Dockerfile shows the steps to install the application packages or files. After executing each RUN command in the Dockerfile, Docker creates a layer that stores the files added or updated by that command.

b. The docker entry script shows what command will be executed when the image is run. This could be a line running an existing program, but it could also point to a provided shell script.

The Dockerfile is written in a domain-specific language that shows how to build the layers composing the Docker image. Some of the Dockerfile instructions are:

ADD : adds a file to the Docker image filesystem. If the file is a tar archive, Docker extracts it first; the source may be a file from the local filesystem or a URL on the internet.

FROM : refers to a parent Docker image, so the files from the parent image are available in the current image.

RUN : executes a program/command inside the Docker environment; Docker captures the file changes resulting from the execution.

CMD : sets the docker entry command; this could point to a shell script already inside the image filesystem or to an existing program.

Basic Docker usage

There are a few basic docker commands that you need to know when using docker.

docker inspect : shows various information about a docker container. Combine it with grep for shorter output, e.g. docker inspect <container> | grep 172 filters the output to show IP addresses. A container can be referred to by its name or its id.

docker history : shows the layers composing an image, including no-operation layers such as MAINTAINER.

docker run -d --name [name] [image] [command] : creates a new container from an image and runs it in the background using the entry point. There are other useful parameters, such as:

-v hostdir:containerdir -> mounts a host directory inside the container; this also works for a single file.

--link -> puts the IP of another container in /etc/hosts, so this container can reach the other container by name

-e VAR=value -> sets environment variable

docker start : starts a container using its entry point

docker stop : terminates the process inside a container

docker rm : deletes a container; note that named data volumes are not deleted

Docker images from the internet can be referenced when creating new containers, or inside a Dockerfile. For example, to create a container from the official mariadb image, with the data directory mounted from the host and using the TokuDB storage engine, use this command:
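The command itself is not shown above; a sketch of what it might look like (the image tag, host path, password, and the TokuDB plugin flag are all assumptions, not the author's exact command):

```shell
docker run -d --name maria \
  -v /srv/mariadb/data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=secret \
  mariadb:10.1 --plugin-load=ha_tokudb
```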

docker-compose start : starts the processes in the containers referred to by docker-compose.yml (similar to docker start)

Docker-compose uses docker-compose.yml as the blueprint for the containers.

This example docker-compose.yml shows the various parts of the file (note that this is the 1.x docker-compose format):

pdi:
  volumes:
    - ~/pentahorepo:/home/pentaho/pentahorepo
    - /tmp/.X11-unix:/tmp/.X11-unix
    - ~/pdidocker/dotkettle:/home/pentaho/data-integration/.kettle
    - ~/phpdocker/public/isms/comm:/home/pentaho/comm
    - ~/phpdocker/public/isms/mobility:/home/pentaho/mobility
  image: y_widyatama:pdi
  container_name: pdiisms
  environment:
    - DISPLAY=172.17.0.1:10.0
  external_links:
    - maria

The volumes clause specifies volumes that are mounted inside the container, like the -v parameter. A path before the colon (:) refers to a directory on the host; a simple name instead is interpreted as a Docker data volume, which is created if it doesn't exist yet. The path after the colon is the mount directory inside the container.

The image clause specifies the base image of the container. Alternatively, a dockerfile clause could be specified instead of an image clause.

The environment clause lists additional environment variables that will be set inside the container.

The external_links clause refers to the names of existing running containers that will be added to the container's /etc/hosts.

This example shows two containers with dependency links. The first, ismsdash-webserver, links to the second container using the links clause. The second, ismsdash-php-fpm, refers to two existing containers outside this docker-compose.yml file.

The first container is created from a docker image. The second container requires the image to be built first using a specified Dockerfile.

The ports clause specifies port forwarding from the host port to the container port.

Conclusion

Knowledge of several basic commands is necessary in order to use Docker. This post described some of these commands, in the hope that the reader will be able to start using Docker with them.

Wednesday, October 19, 2016

Background

Docker is a fast-growing trend that I can no longer ignore, so I tried Docker on a Linux server machine. Running a server app inside Docker is a breeze, but I need to run Pentaho Data Integration on the server, which uses an X11 display. There are several references about forwarding an X11 connection to a Docker container, but none works for my setup, which has the Quartz X server running on a Mac OS X laptop and the Docker service running on a remote Linux server.

I had already tried such steps with no results, because I needed to add another step before those two: forwarding the X11 connection through the ssh connection to the server (not the container). From an OS X laptop, the step is as follows:

ssh -X username@servername.domain

Complications

We now have DISPLAY set to something like localhost:10.0, and an empty /tmp/.X11-unix directory. The reason is that forwarding through ssh results in a TCP-based X11 service, as opposed to a unix-socket-based X11 service.

Inside the container, we are also unable to connect to localhost:10.0, because in the container localhost refers to the container's loopback interface, which is different from the host's loopback. In an X11 display address, the 10.0 means TCP port 6000+10, which as a result of X11 forwarding only listens on the 127.0.0.1 address.

Solutions

So we need to ensure the container is able to connect to the forwarded X11 port. One solution is to use socat:
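The socat invocation is not shown above; one way to relay the forwarded display to the docker bridge interface might be the following sketch (display :10 maps to TCP port 6010; the bridge IP 172.17.0.1 follows the text):

```shell
# On the host: accept X11 connections on the docker bridge address and
# relay them to the ssh-forwarded X11 port on loopback
socat TCP-LISTEN:6010,fork,bind=172.17.0.1 TCP:127.0.0.1:6010 &
```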

Then the DISPLAY variable should be altered to point to the host's IP on the docker network (172.17.0.1).

Unfortunately, as you can see, there are authentication problems. I tried to copy the .Xauthority file from the host into the container, but it still failed. We could only understand why after getting it to work:

It seems that because a different IP is being used, the Xauthority file needs a row with 172.17.0.1:10 as the key.
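Inside the container, that row can be added with xauth. A sketch (the cookie value is whatever `xauth list` prints on the host, copied by hand):

```shell
# Register the host's cookie under the key the container actually uses,
# then point DISPLAY at the relayed address
xauth add 172.17.0.1:10 MIT-MAGIC-COOKIE-1 <cookie-from-host>
export DISPLAY=172.17.0.1:10.0
```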

With xclock working, we can also try something bigger, like Pentaho's Spoon UI:

Additional notes

In order to get the Pentaho Spoon UI working, I needed to add the GTK library as one layer in the Dockerfile:

USER root

RUN apt-get update && apt-get install -y libgtk2.0-0 xauth x11-apps

Caveat: for the formula editor to work in the Spoon UI, we need to install Firefox and the libraries bridging Firefox and SWT, which is a topic for another post.

Conclusion

We have found out how to run X11-based apps in a Docker container with the X server running on a laptop. What is this good for? In some cases, we need to run a GUI-based app inside a Docker container without installing VNC and an X server inside the container. However, we still have to provide the X11 libraries inside the container.

Thursday, June 9, 2016

Background

Earlier this year I was part of a team that did a system copy of a 20-terabyte-plus SAP ERP RM-CA system. And just now I have been involved in two system copies in just over one week, for a much smaller amount of data. I think I should note some lessons learned from the experience in this blog. For the record, we were migrating from HP-UX and AIX to the Linux x86 platform.

Things that go wrong

First, following the System Copy guide carefully is quite a lot of work, mainly because some important items are hidden in references within the guide. And reading a SAP Note that is referenced in another SAP Note, which is referenced in the Installation Guide... is a bit too much. Let me describe what went wrong.

VM Time drift

The Oracle RAC cluster had a time drift problem, killing one instance when the other was shutting down. The cure for our VMware-based Linux database server is hidden in SAP Note 989963 "Linux VMWARE Timing", which basically says to add tinker panic 0 to ntp.conf and remove the local undisciplined time source. And I think there are additional kernel parameters if your kernel is not new enough.
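The resulting ntp.conf changes look roughly like this sketch of the two edits just described (not the full file; consult the note itself for your kernel version):

```
# /etc/ntp.conf
tinker panic 0

# remove (or comment out) the local undisciplined clock:
# server 127.127.1.0
# fudge  127.127.1.0 stratum 10
```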

Memory usage grows beyond whats being used

Memory usage was very high, given that our production cluster hosts so many servers, and thus so many concurrent database connections. The page table overhead was so large that the server was drowning in it. The cure is to enable hugepages in the kernel, allocate enough pages for the Oracle SGA as hugepages, and ensure Oracle can use the huge pages. See SAP Note 1672954 "Hugepages for Oracle".

Temporary Sequential (TemSe) Objects

For the QA system copy, the step of deleting inconsistencies in the TemSe consistency check took forever: a whole day, and it still wasn't finished. It seems the SAP daily job to purge job logs had never been run on that server. Our solution was to use report RSBTCDEL2 to delete job logs in time-period intervals, e.g. first deleting every finished job log older than 1000 days, then those older than 500 days. See SAP Note 48400 for information about the kinds of objects stored in TemSe and the transaction codes to purge each kind of them, for example report RSBTCDEL2 for job logs and RSBDCREO for BDC.

The lesson is: please do purge your job logs in the QA and DEV servers. In our PROD system we have 14 days' retention for jobs; in QA and DEV, maybe 100 days of retention is enough.

NFS problems

If it is possible to avoid NFS access, avoid it, for example when transporting datafiles from source to target. NFS problems delayed our process by about 8 hours, which we resolved by obtaining terabytes of storage and copying the data files to local disk via SCP.

NFS problems occur mostly when the NFS server is restarted. Be prepared to restart the NFS service on each client first, and restart the client OS if things still don't work.

Out of disk space in non-database partitions

We forgot to deactivate an app server during the system copy process; its dev_w? log files were filled with endless database connection errors after the system copy recreated the database.

We also used the wrong directory as the export destination.

These two should be obvious, but they happen. The lessons are: check your running processes, and make sure the export destination has enough free space.

Out of disk space in sapdata partition

Our database space usage estimate was very much off from what was really needed. The mistake was that we estimated the size from the source system's database size, whereas sapinst creates datafiles with additional space added as a safety margin.

In order to estimate space correctly, we need to be aware that:

The sizes in DBSIZE.XML are only for estimating SAP's core tablespaces, and for SYSTEM and PSAPTEMP the estimate is a bit off. For SYSTEM we needed 800 MB more than DBSIZE.XML indicated in order to pass the ABAP Import phase successfully.

The PSAPTEMP size recommended by SAP in Note 936441 is 20% of the total data size used, which is very different from the estimate in DBSIZE.XML. 20% might be a bit too much for large (>500 GB) databases, but if it is too small, some large indexes might fail during import. In our case, DBSIZE.XML estimated 10 GB for PSAPTEMP; we needed 20 GB to pass the ABAP Import phase (by Note 936441 it should have been 100 GB, though).

Sapinst will create SAP core tablespace datafiles that might consume too much disk space. Be ready to provide larger storage capacity in order to have a smooth system copy operation. In our case, we were forced to shrink the PSAPSR3 datafiles in order to make room for the SYSTEM and PSAPTEMP tablespaces.

The default reserved space for root is 5% of a partition. This is a significant amount of wasted space (5% of a 600 GB sapdata partition is 30 GB, for example). Some references say that ext2fs performance drops after 95% disk usage, but because sapdata mainly stores datafiles with autoextend off, this should not be an issue. For sapdata I change the reserved space to 10000 blocks: tune2fs -r 10000 /x/y/sapdata

Permission issues

We used local Linux users for OS authentication and authorization, and the uid/gid need to match across the servers in order to make transport via NFS work. We found only minimal clues about this issue in the System Copy guide. Currently our lessons here are:

you need to restart sapinst after changing groups

saproot.sh is a tool to fix permissions AFTER the gid/uid are already fixed

double-check that you don't have duplicate entries in /etc/group or /etc/passwd as a result of gid/uid update efforts

in SUSE there is an nscd service that caches groups; restart the service to update the cache, and you don't need to restart the OS

ls -l can sometimes be confusing; ls -nl is better, because it shows numerical uids/gids.

Performance issues

Migrating to a new OS and hardware environment, we faced some performance issues. The SAP Notes that provide some insight on this are Note 1817553 "What to do in case of general performance issues" and Note 853576 "Performance analysis w ASH and Oracle Advisors". Currently our standard operating procedure is like this:

For the AWR, choose a begin snapshot and an end snapshot spanning the half hour in which the performance issue occurs

In the AWR result, check SQLs with high elapsed time or IO

Check the indexes and tables involved for stale statistics

Check index storage quality with report RSORAISQN (see Note 970538)

Run Oracle SQL Tuning on the SQL id (sqltrpt.sql)

Apply the recommendations

Summary

If you only do one dry run before the real system copy, be prepared to handle unexpected things during the real process, such as obtaining additional storage. We might have avoided some problems by reading the System Copy guide carefully and reading between the lines.

Saturday, April 16, 2016

It was my understanding that the free memory in a Linux operating system is shown in the second line of the output of "free -m":

The first line shows memory that is really, truly free. The second line shows free memory combined with buffers and cache. The reason, I was told, is that buffer and cache memory can be converted to free memory whenever there is a need. The cache memory is filled with the filesystem cache of the Linux operating system.

The problem is, I was wrong. There are several cases where I found that cache memory was not reduced when an application needed more memory. Instead, part of the application memory was sent to swap, increasing swap usage and causing pauses in the system (while the memory pages were written to disk). In one case an Oracle database instance restarted, and the team thought it was because the memory demand was too high (I think this is a bug).

The cache memory is supposed to be reduced when we issue this command (ref: how-do-you-empty-the-buffers-and-cache-on-a-linux-system):
# echo 1 > /proc/sys/vm/drop_caches
But on our Oracle database instances, running the command only got a small reduction in the cached column. I also tried echo with 2 and 3 as the value, with the same results.

The truth

First, the 'cached' part does not only contain the file cache. It also contains memory-mapped files and anonymous mappings. Shared memory also falls under 'anonymous mappings'. In Oracle systems without hugepages enabled, the SGA (System Global Area) is created as shared memory (see pythian-goodies-free-memory-swap-oracle-and-everything). Of course, the SGA cannot be freed from memory... otherwise the database would be offline!

Another way to get better understanding of the memory usage is to use /proc/meminfo :

You might want to check the Shmem usage there. On this example server, the 0 values in the Hugepages section show that hugepages are not being used. For an Oracle database with a large amount of RAM (say, 64 GB) and a large number of processes (500 or so), this can become a problem, mainly because PageTables will become exceedingly large (that's another story). The memory in PageTables cannot be used for anything else, so it is counted as 'used' in the 'free -m' output.
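The relevant lines can be pulled out of /proc/meminfo directly, for example:

```shell
# Shared memory, page table overhead and hugepages usage at a glance
grep -E 'Shmem|PageTables|HugePages' /proc/meminfo
```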

Second, some people said in their blogs that the page cache (file cache) competes with application memory for portions of real memory. The SUSE documentation confirms this:

The kernel swaps out rarely accessed memory pages to use freed memory pages as cache to speed up file system operations, for example during backup operations.

Limiting page cache usage

SUSE developed additional kernel parameters to control this behavior: vm.pagecache_limit_mb and vm.pagecache_limit_ignore_dirty. These two parameters can be used to limit the size of the page cache (= file cache) that competes with ordinary memory. When the page cache is below this limit, it competes directly with application memory for memory usage, allowing the kernel to swap either one when a block of memory (file cache or application memory) has not been accessed recently. Page cache blocks above this limit are deemed less important than application memory, so application memory will not get swapped out while there is a large amount of page cache that could be freed.
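On SUSE this could be persisted in /etc/sysctl.conf, for instance limiting the page cache to 4 GB (the limit value here is only an illustration; tune it to your workload):

```
# /etc/sysctl.conf (SUSE kernels only)
vm.pagecache_limit_mb = 4096
vm.pagecache_limit_ignore_dirty = 1
```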

In Red Hat Enterprise Linux 5, there is a kernel parameter vm.pagecache that is similar to SUSE's parameter, but it takes a percentage value instead. The default value is 100, meaning that the whole memory is available for use as page cache (see Memory_Usage_and_Page_Cache-Tuning_the_Page_Cache). I believe this is also the case with CentOS.

Conclusion

You might want to check your Linux distribution's documentation about the page cache. Each distribution has some non-standard parameters that enable us to contain the page cache usage of the Linux operating system.

Thursday, March 3, 2016

A writer once said that in the new world, programmers would be free to choose any programming language to do their job. They would be able to use the language most productive for themselves and for the task at hand. On this occasion, I am feeling a bit nostalgic, and found that there is an open source compiler called Free Pascal Compiler. In the past I learned programming using Pascal as my second language, specifically Turbo Pascal.

Background

I needed to write a simple program to verify that a CSV file has the specified number of columns. It could be done using awk etc., but I need the program to be fast, because the file is large and the number of rows is in the hundreds of thousands.

The program

Explanation

At first the field counter returned strange results, very different from the expected ones. I was baffled, until I remembered that Pascal strings have a maximum width of 255 characters (the file has more characters on each line). It turns out there is a switch ($H+) to enable null-terminated AnsiString in FPC instead of the normal 255-character strings (aka ShortString). See http://wiki.freepascal.org/Character_and_string_types :

The type String may refer to ShortString or AnsiString, depending on the {$H} switch. If the switch is off ({$H-}), then any string declaration will define a ShortString. Its size will be 255 chars, if not otherwise specified. If it is on ({$H+}), a string without a length specifier will define an AnsiString, otherwise a ShortString with the specified length. In mode delphiunicode, String is UnicodeString.

Compiling

On my Ubuntu system, I installed fpc using apt-get:

apt-get install fp-compiler-2.6.2

Then all you need to do is fpc filename.pas :

The warning 'contains output sections' is fixed in later versions of fpc; for this version it is harmless.

Conclusion

You don't need to seek out Turbo Pascal anymore to do Pascal programming; now you can use Free Pascal.

Monday, February 1, 2016

This post describes how we deploy a Yii application on OpenShift Origin. It should also work on OpenShift Enterprise and OpenShift Online.

Challenges

A PHP-based application running in a gear must not write to the application directory, because it is not writable. OpenShift provides a data directory for each gear, which we can use for writing assets and the application runtime log.

In a load-balanced application, error messages written to the application log are also spread across multiple gears, making troubleshooting more complex than it already is.

Solution

Use a deploy action hook to create directories in the data directory and symbolic links to the application. Change the deploy script to be executable; on Windows systems without TortoiseGit we need to do some git magic.
Create this file as .openshift/action_hooks/deploy in the application source code.

If your application is hosted in the 'php' directory of the source code:

If your application is hosted in the root of the application source code:

Then make it executable and commit it:
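On Windows, the "git magic" is telling the index about the execute bit directly, something like:

```shell
# git tracks the execute bit in its index even when the Windows
# filesystem does not; flip it there, then commit and push
git update-index --chmod=+x .openshift/action_hooks/deploy
git commit -m "make deploy hook executable"
git push
```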

To check application.log, do this in the Ruby command prompt (a trick I learned from this blog):

Another way to troubleshoot is to do port forwarding to the different gears:

After port forwarding is set up, you can access the Apache instance in the gear by going to http://127.0.0.1:8080 in your browser (adjust to the actual port shown by rhc port-forward).

OpenShift Origin is a Platform-as-a-Service software platform that enables us to horizontally scale applications and manage multiple applications in one cluster. One OpenShift node can contain many applications; the default settings allow for 100 gears (which could be 100 different applications, or maybe only 4 applications with 25 gears each). Each gear contains a separate Apache instance. This post describes adjustments that I have made on an OpenShift M4 cluster deployed following the definitive guide. Maybe I really should upgrade the cluster to a newer version, but we are currently running production load on it.

The Node architecture

Load balancing in an OpenShift application is done by haproxy. The general application architecture is shown below (replace Java with PHP for PHP-based applications) (ref: OpenShift Blog: How haproxy scales apps).

The gears shown running code, for PHP applications, each consist of one Apache HTTPD instance.

What is not shown above is that the user actually connects to another Apache instance running on the node host, working as a reverse proxy; this can be seen in the image below (taken from Red Hat Access: OpenShift Node Architecture):

Using haproxy as a load balancer means the 'Gear 1' in the picture above is replaced by a haproxy instance that distributes load to the other gears. I tried to draw this using yEd, with this diagram as the result:

Haproxy limits

First, the haproxy cartridge specifies 128 as the session limit. What if I plan to have 1000 concurrent users? If the users are on IE, each could open 3 concurrent connections, totaling 3000 connections. So this needs to be changed.

I read somewhere that haproxy is capable of proxying tens of thousands of connections, so I changed the maxconn values from 256 and 128 to 25600 and 12800, respectively. The file that needs to be changed is haproxy/conf/haproxy.cfg in the application's primary (front) gear, lines 28 and 53. A normal user is allowed to make this change (rhc ssh to your gear first).

It seems that the open files limit also needs to be changed. The indication is error messages like these in app-root/logs/haproxy.log:

Such error messages are related to bug https://bugzilla.redhat.com/show_bug.cgi?id=1085115, which gives us a clue about what really happened. On my system, mod_rewrite is used as the reverse proxy module, which has the shortcoming of not reusing socket connections to backend gears. Opening and closing sockets too often keeps many sockets in the TIME_WAIT state, in which the port number is by default not reused for 2 minutes. So one workaround is to enable net.ipv4.tcp_tw_reuse; add this line to /etc/sysctl.conf as root:

net.ipv4.tcp_tw_reuse = 1

And to force in effect immediately, do this in the shell prompt :

sysctl -w net.ipv4.tcp_tw_reuse=1

Conclusion

These changes, done properly, adjust the software limits on the OpenShift node, enabling it to receive a high number of requests and/or concurrent connections. Of course, you still need to design your database queries to be quick and keep track of memory usage, but these software limits, if left unchanged, might become a bottleneck for your OpenShift application's performance.

Difficulties in long running PHP scripts

I encountered some problems when trying to make PHP perform long-running tasks:

PHP script timeouts. This can be solved by calling set_time_limit(0); before the long-running task.

Memory leaks. The framework I normally use has some memory issues; this can be solved either by patching the framework (OK, it is a bit difficult to do, but I did something similar in the past) or by splitting the data to be processed into several batches. And if you are going to loop over the batches in one PHP run, make sure that after each batch there are no dangling references to the processed objects.

Browser disconnects in an Apache-PHP environment terminate the PHP script. During my explorations I found that:

Difficulties in running Pentaho transformations, because the PHP module runs as www-data and is unable to access the Kettle repository stored in another user's home directory.

Workarounds

I have experience using these workarounds to force PHP to serve long-running web pages :

Workaround 1 : use set_time_limit(0); and ignore_user_abort(true); to ensure the script keeps running even after the client disconnects. Unfortunately, the user will no longer see the result of the script.

Workaround 2 : use HTTPS so the firewall is unable to do layer 7 processing and doesn't dare disconnect the connection. If the user closes the browser, the script would still terminate, unless you also apply workaround 1.

I haven't tried detaching a child process yet, but my other solutions involve a separate process for background processing, with similar benefits.

Solution A - Polling task tables using cron

It is better to separate the user interface part (the PHP web script) from the background processing part. My first solution is to create a cron task that runs every 3 minutes, executing a PHP CLI script which checks a background-task table for tasks in the 'SUBMITTED' state. Upon picking up a task, the script updates its state to 'PROCESSING'.

So the user interface / front end only reads the background-task table and, when the user orders it, inserts a task there with the specification required by the task, setting the state to 'SUBMITTED'.

When cron runs the PHP CLI script, it checks for tasks; if there are any, it changes the first task's state to PROCESSING and begins processing. When processing completes, the script changes the state to COMPLETED.
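The same SUBMITTED → PROCESSING → COMPLETED state machine can be sketched with a file-per-task spool directory instead of a database table. This is an illustrative analogy, not my actual schema; the spool path and the placeholder work step are assumptions:

```shell
#!/bin/sh
# Sketch of the cron-driven worker: each task is a file, and its
# directory encodes its state. The front end drops files into
# $SPOOL/SUBMITTED; cron runs this script every few minutes.
SPOOL="${SPOOL:-/tmp/taskspool}"
mkdir -p "$SPOOL/SUBMITTED" "$SPOOL/PROCESSING" "$SPOOL/COMPLETED"

process_one() {
  # Pick the first submitted task, if any.
  task=$(ls -1 "$SPOOL/SUBMITTED" 2>/dev/null | head -n 1)
  [ -n "$task" ] || return 1
  # SUBMITTED -> PROCESSING (mv is atomic within one filesystem).
  mv "$SPOOL/SUBMITTED/$task" "$SPOOL/PROCESSING/$task"
  # Placeholder for the real long-running work, e.g. "php task.php".
  cat "$SPOOL/PROCESSING/$task" > /dev/null
  # PROCESSING -> COMPLETED.
  mv "$SPOOL/PROCESSING/$task" "$SPOOL/COMPLETED/$task"
}

if [ "${1:-}" = "run" ]; then
  process_one
fi
```

A crontab entry along the lines of `*/3 * * * * /usr/local/bin/process_tasks.sh run` (path assumed) would poll it every 3 minutes, mirroring the database-backed version.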

Complications happen, so we will need to manage risk by :

logging phases of the process in some database table, including warnings that might be issued during processing.

recording error rows, if there are any, in another database table, so the user could view problematic rows

Currently this solution works, but recently I came across another solution that might be a better fit for running a Linux process.

Solution B - Using inotifywait and control files

In this solution, I created a control file containing only one line of CSV. I prepared a PHP CLI script that parses the CSV and executes a long-running process, and a PHP web page that writes to the control file. inotifywait, from inotify-tools, listens for file-system notifications from the Linux kernel about changes to the control file.
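A minimal sketch of the watcher side follows. The control-file path, the CSV field layout (task name plus two arguments), and the worker command are assumptions for illustration, and the CSV parsing is separated from the inotifywait loop so it can be exercised on its own:

```shell
#!/bin/sh
CONTROL="${CONTROL:-/tmp/background.ctl}"

# Parse one CSV control line ("task,arg1,arg2") into shell variables.
parse_control_line() {
  line="$1"
  task=$(printf '%s' "$line" | cut -d, -f1)
  arg1=$(printf '%s' "$line" | cut -d, -f2)
  arg2=$(printf '%s' "$line" | cut -d, -f3)
}

run_task() {
  parse_control_line "$(head -n 1 "$CONTROL")"
  # Placeholder for the real worker, e.g. "php longtask.php".
  echo "running $task with $arg1 $arg2"
}

# Watch loop: re-run the task every time the control file is rewritten.
# Requires inotify-tools; start with: sh watcher.sh --watch
if [ "${1:-}" = "--watch" ]; then
  while inotifywait -e close_write "$CONTROL" >/dev/null 2>&1; do
    run_task
  done
fi
```

The PHP web page then only has to write one sanitized CSV line to the control file; the kernel notification does the rest.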

Conclusion

By using Linux file system notifications, we can trigger task execution, with parameters, from a PHP web page. The task can run as a different Linux user, namely the user running the shell script. Data sanitization is done by PHP, so no strange commands can be passed to the background task.

These solutions are built entirely from open source components. I saw that Azure has WebJobs, which might fulfill similar requirements, but it is on the Azure platform, which I have never used.

Background

Sometimes I need to run a long-running background process on the server, and I need to know when the CPU usage returns to (almost) zero, indicating the process has finished. I know there are other options, like sending myself an email when the process finishes, but currently I am satisfied with monitoring the CPU usage.
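One way such monitoring can be done is by sampling the aggregate "cpu" line of /proc/stat twice and computing the busy percentage over the interval. This is a sketch rather than my actual monitor (the function name and one-second interval are my own choices); a real monitor would loop and alert when the value stays near zero:

```shell
#!/bin/sh
# Print total CPU busy percentage over a 1-second interval, from the
# aggregate "cpu" line of /proc/stat. Field layout after "cpu":
# user nice system idle iowait irq softirq steal ...
cpu_busy_percent() {
  read_cpu() {
    # Print "busy idle" jiffy totals (idle + iowait count as idle).
    awk '/^cpu / { print $2+$3+$4+$7+$8+$9, $5+$6 }' /proc/stat
  }
  set -- $(read_cpu); busy1=$1; idle1=$2
  sleep 1
  set -- $(read_cpu); busy2=$1; idle2=$2
  total=$(( (busy2 - busy1) + (idle2 - idle1) ))
  [ "$total" -gt 0 ] || total=1   # avoid division by zero
  echo $(( 100 * (busy2 - busy1) / total ))
}

cpu_busy_percent
```

When the long-running process finishes, repeated samples should drop back toward zero.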

Saturday, January 2, 2016

Background

In this post I will tell the story of installing MariaDB on Ubuntu Trusty, and the process I went through to enable the TokuDB engine. I need to experiment with the engine as an alternative to the Archive engine, to store compressed table rows. It has better performance than compressed InnoDB tables (row_format=compressed), and it was recommended in some blog posts (this post and this).

Packages for Ubuntu Trusty

In order to be able to use TokuDB, I searched the documentation and found that Ubuntu 12.10 and newer on 64-bit platforms requires the mariadb-tokudb-engine-5.5 package. Despite the existence of mariadb-5.5 packages, I found no package containing the tokudb keyword in the official Ubuntu Trusty repositories. The MariaDB 5.5 server package also doesn't contain ha_tokudb.so (see file list).