Friday, November 23, 2012

Background: I mentioned earlier that we were baffled by Amazon's Elastic Load Balancer (ELB). We tested Nginx as a reverse proxy, and it seems to be hands down better than ELB. (I will cover Nginx performance in another post.) To scale this setup horizontally, and to take some load off a single Nginx server, we configured multiple Nginx instances behind a Weighted Round Robin (WRR) DNS configuration.

We later decided to have a redundant Nginx server in an N+1 setup, so that if one server goes down the redundant one takes over. I stumbled on Heartbeat. Unfortunately, there is no good tutorial on how this can be done on Amazon EC2.

Warning: I will keep this tutorial brief and limited to Heartbeat configuration. As the wiki mentions:

In order to be useful to users, the Heartbeat daemon needs to be combined with a cluster resource manager (CRM) which has the task of starting and stopping the services (IP addresses, web servers, etc.) that the cluster will make highly available. Pacemaker is the preferred cluster resource manager for clusters based on Heartbeat.

Goal: We need to make sure the auxiliary node takes over as soon as the main node dies. This is what should happen:

This configuration is the same on both Nginx servers except for one thing: the add_header X-Whom node_1; part. This identifies which Nginx load balancer is serving, which will help us debug later. On the second Nginx, we have add_header X-Whom node_2;. This line tells Nginx to inject a header, X-Whom, into each response.
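For context, here is roughly where that line sits in a server block. Everything apart from the add_header line is an assumed skeleton; your listener and upstream will differ:

server {
    listen 80;

    # tag each response so we can tell which LB served it
    add_header X-Whom node_1;          # node_2 on the second load balancer

    location / {
        proxy_pass http://app_backend;   # assumed upstream name
    }
}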

Configure Heartbeat: There are three things you need to configure to get things working with Heartbeat.

ha.cf: /etc/ha.d/ha.cf is the main configuration file. It holds the list of nodes and the features to be enabled. The directives in it are order sensitive.

authkeys: It maintains the security of the cluster; it provides a means to authenticate the nodes in a cluster.

haresources: The list of resources to be managed by Heartbeat. It looks like this: preferredHost service1 service2, where preferredHost is the hostname where you prefer the subsequent services to be executed. service1 and service2 are init scripts that live in /etc/init.d/ and stop and start the service when called as /etc/init.d/service1 start|stop.

When a node is brought up, the services are called from left to right: service1 first, and then service2. When a node is brought down, service2 is stopped first, and then service1.

---------
For the sake of simplicity, let's call one node the 'main' node and the other the 'aux' node.
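The ha.cf listing is not reproduced here; a minimal sketch of what /etc/ha.d/ha.cf could look like on both machines follows. The timings, interface, peer IP, and the second hostname are assumptions (the first hostname matches the haresources entry below); EC2 does not support broadcast, hence ucast:

# /etc/ha.d/ha.cf -- sketch; remember the directives are order sensitive
logfile /var/log/ha-log
debugfile /var/log/ha-debug
keepalive 2
deadtime 15
warntime 10
initdead 60
udpport 694
ucast eth0 10.0.0.2         # the other node's private IP (assumed)
auto_failback on
node ip-10-144-75-85        # main, as reported by uname -n
node ip-10-144-75-86        # aux (assumed hostname)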

where,

deadtime: seconds after which a host is considered dead if not responding.

warntime: seconds after which the late-Heartbeat warning will be issued.

initdead: seconds to wait for the other host after starting Heartbeat, before it is considered dead.

udpport: the port number used for bcast/ucast communication. The default value is 694.

bcast/ucast: the interface on which to broadcast/unicast.

auto_failback: if 'on', resources are automatically failed back to their primary node.

node: the nodes in the HA set-up, identified by uname -n.
----

/etc/ha.d/authkeys must be the same on both machines. You may generate a random auth secret key using: date | md5sum

authkeys on both the machines:

auth 1
1 sha1 1e8d28a4627ed7f83faf1d57f5b11645

----

/etc/ha.d/haresources

haresources on both the machines:

ip-10-144-75-85 updateEIP nginx

updateEIP and nginx are shell scripts that I wrote and stored under /etc/init.d/.
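The updateEIP script itself is not shown in the post; a sketch of what it could look like follows. This version uses the AWS CLI and an instance-metadata lookup; the allocation ID is a placeholder, and the original script may well have used the older EC2 API tools instead:

#!/bin/sh
# /etc/init.d/updateEIP -- sketch; eipalloc-XXXXXXXX is a placeholder
EIP_ALLOC="eipalloc-XXXXXXXX"
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

case "$1" in
  start)
    # grab the Elastic IP for this node (steals it from the peer)
    aws ec2 associate-address --allocation-id "$EIP_ALLOC" \
        --instance-id "$INSTANCE_ID" --allow-reassociation
    ;;
  stop)
    : # nothing to do; the peer's start will take the EIP anyway
    ;;
esac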

Test: Time to test. Note that we are just watching nodes, not the Nginx service. So we need to make the Heartbeat service on 'aux' behave as if the 'main' server were down.

Keep tailing the debug file on the 'aux' machine. It will show you the transition. Now it's time to kill the 'main' node. On main, do this:

service heartbeat stop
/etc/init.d/nginx stop

You can see in the debug file on 'aux' that 'aux' is taking over from 'main'. Now, to see how service switches back to the main node when it comes up, start just the heartbeat service. You will see: a. aux: the EIP gets detached (not really detached; it is reassociated away), b. aux: Nginx stops, c. main: the EIP is assigned, d. main: Nginx is started.

Note: While we flip EIPs, you may get disconnected from your SSH session. So keep watching the AWS web console to get the newly assigned public DNS of the node the EIP was revoked from, and use the EIP to connect to the node it was assigned to.

Conclusion: So we have the basic HA configuration ready. We need to add toppings to this setup to be able to monitor services and respond to their failures; you need to add a cluster resource manager (CRM) layer on top for it to be truly useful.

Configure slave: Keep the configuration the same as master #1, except give it a different server-id. You may omit log-bin in the slave configuration, but it is a good idea to keep it: it will be useful in case you want to make this slave the master.

Restart the slave and log in to it. Drop the database you have taken the dump of, if it exists, and recreate it. Then load the DB dump.
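A sketch of both steps, with placeholder names (the server-id value, database name, and dump file are assumptions):

# my.cnf on the slave -- only the lines that differ from the master
[mysqld]
server-id = 2            # must differ from the master's
log-bin   = mysql-bin    # optional on a slave; handy if it is ever promoted

# then restart, recreate the database, and load the dump
service mysql restart
mysql -u root -p -e "DROP DATABASE IF EXISTS mydb; CREATE DATABASE mydb;"
mysql -u root -p mydb < mydb-dump.sql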

Tuesday, October 23, 2012

"A service outage for Amazon’s Web Service (AWS) affected websites like
Reddit, Flipboard, FastCompany, Heroku, Airbnb, Imgur, Pinterest and
others late last night. It also seems that the outage affected some of
Amazon’s storage units on the US East coast, and thus these websites
which are hosted by AWS went down." - FirstPost

This was the pretty exciting night of the US Prez debate on October 22, 2012. We had all our infrastructure and code base made of solid steel. At least that's what we thought, until we realized our infrastructure wasn't as solid as Amazon claimed. About six hours before the debate, at 3PM EDT, we started to see small glitches. Then we realized that we couldn't reboot one of our EBS-backed servers. We quickly launched another, but then the ELB (Elastic Load Balancer) was paralyzed. The complete infrastructure started to fall like dominoes.

What went down? Some of AWS's most marketed offerings, claimed to be the best alternatives available, failed.

ELB: The Elastic Load Balancers were hit hard. This is the gateway to our 20+ servers, and it was hosed. It was unpredictably throwing instances in-service/out-of-service, with complete blackouts in between. We launched new ELBs, to no avail.

On one of the ELBs, we had to continuously re-save the instances to get them active once the ELB marked them as unhealthy and threw them out. (They were healthy.)

If you think the last para was bad, prepare for worse. Another load balancer just went inactive, showing "registering instances". Forever. Nothing worked. Then we launched another ELB, in the healthy zone us-east-1a; same problem for the next 4 hours.

EBS-Backed Instances: They were complete zombies. A couple of EBS-backed instances would just not do anything: SSH access, Nagios' NRPE checks, all gone.

EBS Volumes: Talk about tall claims, talk about EBS. Amazon markets it as a better alternative to instance-store-backed storage. And this is not the first time EBS has given us a pain in the neck. It's slow for very fast I/O like we do on Cassandra. It's expensive. And contrary to what Amazon says, it's not rock solid yet. The only guarantee you get is that your EBS volume will not be terminated unexpectedly, so your data is safe. But, as we found, it may be unavailable/inaccessible for long enough to spoil the game. In fact, one of the EBS volumes attached to our MySQL server could not be detached until now (about 10 hours). And I do not know what's going to happen. We terminated the instance, tried force-detach. No luck.

How did we survive? We were a little overcautious.

Copy of a copy of a copy: We just needed one MySQL server, and things were cached efficiently in Memcached. So we initially thought that having one instance-store-backed AMI with the MySQL data directory on an external EBS volume was safe enough, since AWS says EBS won't lose data. Later, our CTO suggested having a slave, created in much the same way. It would just be an extra layer of security.

It turns out that we could not use the MySQL master. No access. Pending queries. The EBS volume containing the data was inaccessible; CloudWatch was showing no read/write activity while we were putting decent load on MySQL. After 90 minutes of exercise, we realized that the shit had hit the fan. We turned our slave into the new master. One front fixed.

We were lucky that the slave was not screwed the same way the master was. It was quite likely that the slave would fail too: it had the same configuration and lived in the same availability zone.

Never break a protocol: One very crucial server had an EBS root device (not an instance store with an extra EBS volume, like the last case). This machine went into a coma. It was a complete setup of Wordpress, with all the Wordpress tweaking, plug-ins, custom code, Apache mod_rewrite, mod_proxy et al., and MySQL.

The survival trick lies in one consistent behavior: whenever you make any change to an instance, create an AMI of it. So we relaunched the server from the latest AMI. But the data? Our daily backup script was made for exactly this situation. We launched a new instance, took the latest MySQL dump file, loaded it, updated the DNS records. Back up.

ELB Outage: The ELB outage made us helpless; there is not much that one can do about ELB. We were lucky that the ELB in front of our main application was not completely dead. It was just going crazy and marking instances out of service -- randomly. So I sat at the ELB screen and refreshed it regularly. If instances went out of service, I would re-save them, and the ELB would come back up.

You must be wondering whether our instances were sick, or the health check was failing. Neither. These were instance-store machines, unaffected by the outage. Direct access to the machines was fast and correct. There are two parts to it: 1. In the past, I have observed that ELB does not scale with a surge of traffic; it ramps up slowly. I observed that during load tests, along with higher latency of connections via ELB compared to direct access. 2. ELBs were sick during the outage, as per Amazon's status page.

Key Takeaways:

EBS ain't that great. Stick to instance store and fault-tolerant design.

Backup. Everything that matters should be backed up in such a way that it can be reloaded really fast when required.

Spread out. If you can, scatter instances across different availability zones. The best design is one where, even if an availability zone goes completely berserk, the user wouldn't feel the impact.

ELB can be a SPOF. If ELB is the entry point to your application, it's a single point of failure. Even when there is no downtime, we have seen poor scaling/slow ramp-up on surge traffic. I am not sure I have an alternative, but I am actively looking.

Monday, October 22, 2012

It certainly does not say what the problem is. It's actually a three-way partitioning problem.

Definition: An array A[1..n] needs to be rearranged in such a way that moving from left to right, we see three subgroups in this order: (-∞, low), [low, high], (high, ∞).

Algorithm: I came to this solution while working on a StackOverflow question about rearranging an array so that all the elements less than zero are to the left of zero, and all the elements greater than zero are to its right. My initial intuition was that I could do it by improvising on the partition step of quicksort. It turns out that while partitioning works for a single pivot element, it may not work if there are duplicates of the pivot.
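The code I ended up with is not reproduced here; below is a Python sketch of the three-way partition (Dutch national flag style), assuming low <= high:

def three_way_partition(a, low, high):
    """Rearrange a in place into (-inf, low), [low, high], (high, inf)."""
    lt, i, gt = 0, 0, len(a) - 1
    while i <= gt:
        if a[i] < low:                 # belongs to the left group
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > high:              # belongs to the right group
            a[i], a[gt] = a[gt], a[i]
            gt -= 1                    # a[i] is still unexamined; don't advance i
        else:                          # already in the middle group
            i += 1
    return a

print(three_way_partition([5, 0, 9, 3, 7], 3, 6))   # [0, 5, 3, 7, 9]: <3 | [3,6] | >6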

Monday, October 15, 2012

If this is your first attempt to understand red-black trees, please watch this video lecture. It will give you a solid foundation. Next, before you start this tutorial, make sure:

You are not in a hurry. (In that case, just use the code from the link in footnote)

You do not get impatient.

You do not bother why it works. It works. Once you get the algorithm and code it, it will be much simpler to visualize how it works. But, for 'why' -- you may need to research.

So, here we go

One Aim: Create an approximately balanced tree. From root to leaf, no (simple) path is more than twice as long as any other.

Two Rules: (CLRS lists five, but the others are trivial)

Color Constraint: Red nodes must have both of their children black. Black nodes can have either.

Number Constraint: From each node, every simple path to the leaves has the same number of black nodes.

Three Situations: A couple of points to be noted before we see the situations:

Wherever I refer to an Uncle, it means (same as in common life) the sibling of the parent node: "the other child of the node's parent's parent".

We assume the NIL children of all the leaves are black nodes.

We always insert a red node. So we are not breaking #2, the Number Constraint, by adding a black node; but we may be breaking constraint #1, the Color Constraint.

Keep the root node black. You will see, it makes life easier. It does not break any of the constraints.

Situations:

A red uncle is a good uncle: If the newly inserted node has a red uncle, we can fix the situation just by fixing colors. Invert the colors of the parent, grandparent, and uncle. Now we need to check whether the grandparent satisfies the Color Constraint.

A black uncle on the opposite side wants one rotation: If the uncle is black, we've got to rotate. The slightly better case is when the newly inserted node is the left child of its parent and the uncle is the right child of its parent, or vice versa. In that case, you make a single left-rotation or right-rotation (rotations are discussed later in this post), and then fix colors.

A black uncle on the same side is doubly dangerous: When the newly inserted node is a right child and the black uncle is also a right child (or both are left children), you need to make a left-rotation if the inserted node is a right child, or a right-rotation if it is a left child. This reduces it to situation 2.

The good thing about finding a black uncle is that your quest ends there; no need to look further up the tree.
Conceptually, you are done here. All that's left is some minor plumbing around rotation and a couple of fine points in the algorithm, like how a rotation may sometimes cause root-node reassignment, how we treat a NIL node as a black node, etc.

Rotation: A rotation basically swaps a parent and a child such that the BST property still holds. So, why do we call it a rotation? Well, the process mutates the tree in such a way that it looks, to a casual observer, as if the tree just did a Toe Loop Jump. (Well, this is the best I can explain it.)

People claim there are two types of rotations: right-rotation, when a left child swaps with its parent (i.e. moves to the right); and left-rotation, when a right child swaps with its parent. Basically, if you start with a tree, right-rotate a child, and then left-rotate the original parent, the tree stays unchanged. (Figure out how and why the nodes are snipped and pasted the way they are.)
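A Python sketch of left-rotation (nodes with left, right, and parent fields and a tree object with a root field are assumed; right-rotation is the mirror image):

def left_rotate(tree, x):
    """Lift x's right child y into x's place; BST order is preserved."""
    y = x.right
    x.right = y.left                 # y's left subtree becomes x's right subtree
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent              # hook y onto x's old parent
    if x.parent is None:
        tree.root = y                # rotation can reassign the root
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x                       # x becomes y's left child
    x.parent = y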

The algorithm: (Please note the below is not valid Python code; I have used Python-esque pseudocode so that the code highlighter shows key areas.)
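The original listing is not reproduced here. Below is a runnable Python sketch of the insert fixup, following the three situations above; it assumes nodes with color, parent, left, and right fields, the left_rotate sketch from the rotation section, and a symmetric right_rotate:

def insert_fixup(tree, z):
    """Restore the two constraints after inserting the red node z."""
    while z.parent and z.parent.color == 'red':
        gp = z.parent.parent
        uncle = gp.right if z.parent is gp.left else gp.left
        if uncle and uncle.color == 'red':       # situation 1: a red uncle
            z.parent.color = uncle.color = 'black'
            gp.color = 'red'
            z = gp                               # re-check from the grandparent
        elif z.parent is gp.left:
            if z is z.parent.right:              # situation 3: same side as uncle
                z = z.parent
                left_rotate(tree, z)             # reduce it to situation 2
            z.parent.color = 'black'             # situation 2: one rotation
            gp.color = 'red'
            right_rotate(tree, gp)
        else:                                    # mirror image of the above
            if z is z.parent.left:
                z = z.parent
                right_rotate(tree, z)
            z.parent.color = 'black'
            gp.color = 'red'
            left_rotate(tree, gp)
    tree.root.color = 'black'                    # keep the root black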

Saturday, October 13, 2012

I have been asked this a couple of times by people learning or ramping up on sorting algorithms: why does the merge operation in merge sort take O(n)? Most of the time it turns out that people think that, since it's two sorted arrays, you can just append 'em. It's intuitive -- wrong intuition.

The idea is that we have two sorted arrays, not two arrays where one's smallest element is larger than the other's largest. What I mean to say is: it's not [1, 3, 9, 12] and [13, 17, 19]. It's more like [3, 12, 13, 17] and [1, 9, 19].

You see, in the latter case you can't just append them. So, how do you combine them? You follow this algorithm:

Let's say the first array has m elements, and the second has n. You make an auxiliary array of size m+n. This will store the final combined and sorted array.

Assign two cursors -- i and j, pointing to the heads of the arrays.

Compare the elements at i and j and choose the smaller one. Copy its value to the auxiliary array, and move that cursor ahead.

Repeat #3 till one of the arrays is exhausted; then copy the rest of the elements from the other array to the auxiliary array. (A sketch follows.)
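A Python sketch of these steps:

def merge(left, right):
    """Merge two sorted lists in O(m+n)."""
    aux = []                           # the auxiliary array of size m+n
    i = j = 0                          # the two cursors
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:        # pick the smaller head, move that cursor
            aux.append(left[i])
            i += 1
        else:
            aux.append(right[j])
            j += 1
    aux.extend(left[i:])               # one of these two is already empty
    aux.extend(right[j:])
    return aux

print(merge([3, 12, 13, 17], [1, 9, 19]))   # [1, 3, 9, 12, 13, 17, 19]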

In the best case, you will have a sorted array as input. In such a case, both subarrays to be merged just require joining. But we still apply the generic algorithm above: you will make min(m,n) comparisons and then copy the remaining max(m,n) elements to the auxiliary array. Either way, you are going to have (m+n) operations in total.

Discussion: Binary search dictates that in order to find an element in a sorted array, we test the middle[1] element (call it the pivot) for equality. If the key is smaller than the pivot, we search the subarray before the pivot. If the key is bigger, we search the subarray after the pivot. If equal, we have found the element. Divide and conquer.

This is a slight variant of it. We need to keep searching left until we ensure that it's the first occurrence.

Strategy: The first idea that comes to mind is to use regular binary search to find the element; it may not necessarily be the first occurrence. Then we creep left in the sorted array till we find an element less than the search key, or we hit the boundary of the array. The complexity of this mechanism is O(log(n)) + O(k), where n is the array length and k is the number of repetitions of the key.

The other idea is to modify binary search slightly: keep looking for the key in the subarray preceding the 'pivot' whenever the pivot equals the key but the element just before the pivot is also the key (and the array boundary is not yet hit). This is O(log n). Here is the algorithm.
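The listing itself is not shown here; a Python sketch of the modified binary search:

def first_occurrence(a, key):
    """Index of the first occurrence of key in sorted list a, or -1."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == key and (mid == 0 or a[mid - 1] != key):
            return mid                   # first occurrence found
        if a[mid] < key:
            lo = mid + 1                 # key lies after the pivot
        else:                            # a[mid] > key, or a[mid-1] is key too
            hi = mid - 1                 # keep looking before the pivot
    return -1

print(first_occurrence([1, 2, 2, 2, 3], 2))   # 1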

The elements are called nodes. Each node can have at most two children. The top element is called the root node, or just root. A binary search tree (BST) is one where the elements in a node's left subtree are less than the node, and the elements in its right subtree are greater[1]. If you remember quicksort from earlier, any node in a BST is basically a pivot. The expected height[2] of a randomly built binary search tree with n elements is lg(n), where lg is log2.

Most operations in a BST are simple and recursive. I will list the algorithms here.
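The original listings are not shown; a Python sketch of two of the easy ones, search and insert:

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def search(node, key):
    """Return the node holding key, or None."""
    if node is None or node.key == key:
        return node
    # smaller keys live to the left, larger keys to the right
    return search(node.left if key < node.key else node.right, key)

def insert(node, key):
    """Insert key under node; return the (possibly new) subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node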

So, these were the easier ones. With a little effort one can understand these. I will now discuss, some of the harder ones.

Successor and Predecessor: The successor of a node is the node that comes next if we sort the tree in increasing order. The predecessor of a node is the node that comes immediately before it, if we sort the tree in increasing order. That is it. In a tree containing the values 1 3 5 6 7 9, the successor and predecessor of 5 are 6 and 3, respectively.

To find the successor of a node we need to find the immediately larger node. So, here is the algorithm:

The right subtree of the node contains the values larger than the node. (Right?) If we find the lowest node in the right subtree, we have the immediately larger node, and we get our successor.

What if the node does not have a right subtree? We will have to go up in the tree somewhere. See the image below.

If the node is a left child, the parent will be the immediate next.

If the node is a right child, we need to keep moving up until we get a node whose left child is one of the node's ancestors. (See the sketch below.)
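A Python sketch of the successor logic, covering both cases above (nodes with a parent pointer are assumed):

def tree_minimum(node):
    while node.left is not None:
        node = node.left
    return node

def successor(node):
    if node.right is not None:
        return tree_minimum(node.right)        # lowest node in the right subtree
    parent = node.parent
    while parent is not None and node is parent.right:
        node, parent = parent, parent.parent   # climb until we leave a left child
    return parent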

Once you get the idea behind the successor, predecessor is easy.

The predecessor is the highest node in the left subtree.

If there exists no left subtree, then:

If the node is a right child, the immediate parent is the predecessor.

If the node is a left child, move up till you get a node whose right child is one of the node's ancestors.

Deletion: You must be wondering why I am covering delete so late. That's because it's so damn hard without understanding the concepts above. So, here is the algorithm to delete a node:

If the node has no child, well, just knock it off.

If it has one child, yank the node and connect the child to the node's parent.

If it has two children, then we need to get a replacement for this node. What would be a good replacement? A good replacement is one that, on replacement, maintains the BST property. (At this moment I want you to pause and think: if you had to replace this node, who could be the best candidate?) You can choose either the successor or the predecessor. The common practice is to use the successor (further discussion assumes we replaced with the successor). Then delete the successor from the subtree.

Now, if you read the successor code properly, you will observe that if the successor is chosen from the right subtree, the successor will have at most one child. So it is NOT going to recurse as in "delete a node with two children, then delete the successor with two children, and so on". NO. Just place the successor's value in the node, then clip the successor using either step 1 or step 2.
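A Python sketch of delete following the three cases (written recursively, so no parent pointers are needed; the Node class from the earlier sketch is assumed):

def delete(node, key):
    """Delete key from the subtree rooted at node; return the new root."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    else:
        if node.left is None:                  # cases 1 and 2: at most one child
            return node.right
        if node.right is None:
            return node.left
        succ = node.right                      # case 3: two children
        while succ.left is not None:           # successor = min of right subtree
            succ = succ.left
        node.key = succ.key                    # place successor's value in node
        node.right = delete(node.right, succ.key)  # clip the successor
    return node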

Footnotes:
[1] The basic idea is to have elements less than the parent on one side and greater than the parent, on the other. And, whatever criteria you choose, it must apply to all the nodes. The most common is to have smaller in left, but if your rebel inner self does not want to conform to the standard go ahead, make a tree with higher in left.

Wednesday, October 3, 2012

New Generation of Sorting Approach? The general idea behind sorting is to compare the elements with one another and move them into the correct relative positions. This covers O(n^2) algorithms like Bubble Sort, Selection Sort, and Insertion Sort; and O(n*log(n)) procedures like Heapsort, Merge Sort, and Quicksort. Counting sort is a linear-order sort, and it does NOT depend on how elements compare to each other; rather, it depends on their absolute values.

Can you explain it to my 5-year-old brother? Say you have a bag of coins, with values 1 to 5. Your mom asked you to arrange the coins in increasing order, making a queue. One mechanism is to pick a coin, search for the right location in the sorted queue, and insert it at the right place. (Do you see insertion sort here?) Another idea: you pick five empty buckets and mark them 1, 2, ... 5. Now you pick coins from the bag, putting each coin with value 1 in the bucket marked 1, each coin with value 2 in bucket number 2, and so on.

Buckets in Counting Sort, start adding coins :)

Once we are done with this, we empty bucket 1, making a queue of coins of value 1. Then we empty bucket 2, append it to the queue, and continue till we have emptied all the buckets. You now have a sorted queue. This is the idea behind counting sort. And with a little variation, the same concept applies to Bucket sort and Radix sort.
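A Python sketch of counting sort, with the buckets as a counts array (non-negative integer keys assumed):

def counting_sort(a):
    """Sort small non-negative integers in O(n + k) time, O(k) extra space."""
    counts = [0] * (max(a) + 1)        # one bucket per possible value
    for x in a:
        counts[x] += 1                 # drop each coin into its bucket
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)    # empty the buckets in order
    return out

print(counting_sort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]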

Wow! I will use it everywhere! So, why do we not always use linear sorting? The caveat is that it needs extra space: something like an array of length at least (Amax - Amin). Just to give you an idea, if you have an array containing the number 134,217,728 (about 134Mn; let's say you are sorting rich people by their wealth), you will have to preallocate an int array of length 134217728. The math of it: 134217728*4 bytes = 536870912 bytes = 512MB of space!! So you can't just apply it to any input. It is good as long as you know the bounds of the input and the auxiliary array can be accommodated in your machine's RAM without twitching it.

The quickSort routine is pretty clear. Let me explain how partition works:

We select the pivot as the last element.

We declare a cursor, leftIndex. This cursor is our handle to where the chunk of elements having values less than the pivot ends; basically, it points to the last element of the left sub-array. It starts just before the start of the sub-array.

We start walking the sub-array from left to right. If we encounter an element higher than or equal to the pivot, we keep moving. If we find an element that is smaller than the pivot, we increase leftIndex (now leftIndex points to an element that should be in the right sub-array) and swap the current element (which is less than the pivot) with the element at leftIndex. Either way, leftIndex keeps pointing to the last element of the left sub-array.

When the loop ends, we will have leftIndex pointing to the last element of the left part of the sub-array. We need to place our pivot next to it, as sketched below.
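A Python sketch of this partition scheme (Lomuto style: pivot as the last element, leftIndex starting just before the sub-array):

def partition(a, lo, hi):
    """Partition a[lo..hi] around pivot a[hi]; return the pivot's final index."""
    pivot = a[hi]
    left_index = lo - 1                        # just before the start
    for i in range(lo, hi):
        if a[i] < pivot:                       # grow the "less than pivot" chunk
            left_index += 1
            a[left_index], a[i] = a[i], a[left_index]
    a[left_index + 1], a[hi] = a[hi], a[left_index + 1]   # place pivot next to it
    return left_index + 1

def quick_sort(a, lo, hi):
    if lo < hi:
        p = partition(a, lo, hi)
        quick_sort(a, lo, p - 1)
        quick_sort(a, p + 1, hi)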

Complexity: The worst-case complexity of quicksort is O(n^2). However, in most practical cases it shows O(n*lg(n)) behavior. Let's see how:

WORST CASE
Assume our array is sorted in decreasing order: A[1..n] for all 1 <= i < j <= n, A[i] > A[j]
Our partition will split this into left sub-array of size 0, and right sub-array of size n-1, so,
T(n) = T(n-1) + T(0) + Θ(n)
which is the same as picking the lowest from the remaining array, repeated (n-1) times. Much like selection sort.
The above equation yields: O(n^2)
---------
TAKE A RANDOM CASE
Let's assume our pivot divides the array in a 9:1 ratio; we have
T(n) = T(9*n/10) + T(n/10) + Θ(n)
If you draw the recursion tree, its height is determined by the subtree with the higher number of children. So,
the height of the recursion tree will be log(10/9)(n) [log n base 10/9].
At each level we walk the sub-array, which costs O(n); let's assume c*n is the exact cost.
So, we perform c*n operations at most log(10/9)(n) times.
So, the order: O(n*log(n))
The above order holds even if, for a long array, the partition is 99:1.
The best case would be the one where pivot divides the array into exact halves.

Heap: A heap is a nearly complete binary tree, meaning the tree is filled at all levels except possibly the last one. The interesting thing about heaps is that, with a small set of rules for determining the children and parent of a node, they can easily be stored in an array. There are two types of heaps: max-heaps and min-heaps. Any node of a max-heap is larger than its children; thus the largest element stays on top. A min-heap is one whose nodes are smaller than their children; it has the lowest value on top. In all discussions here, heap implies max-heap.

So, with the knowledge that (a) a max-heap has the larger values at parent nodes, and (b) it's a nearly complete binary tree, we can visualize a tree that looks like the image above.

Heap Representation: If we index the nodes from top to bottom, left to right, we would come up with numbers (the blue digits in the image) that can, perhaps, work as indexes into an array. The only problem is: how would one know the children of a node, or the parent of a node?

Look closely, you'd see the following relationship:

Left child of node with index i, has index 2*i

Right child of node with index i, has index 2*i + 1

Parent of any node with index i, has index floor(i/2)
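These relations are one-liners in code; a sketch (1-indexed, as in the image):

def parent(i):
    return i // 2

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1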

Well, now we can just forget the tree and all the baggage (pointer maintenance, recursive traversal, and extra space) associated with a tree. We'd use an array to represent the heap, with the ground rules above.

Creating a Heap: Building a heap is a two-step process: append the new element to the array, and then check whether all of its ancestors (parent, grandparent, great-grandparent, ...) satisfy the heap property. If not, fix it. Here is the process:

Append the element to the array

Check if its parent satisfies the heap property (that each of its children is smaller than the parent); if not, swap them. Now check whether the newly swapped location satisfies the heap property. Continue until you either get a parent that is OK with it or reach the top. This process is called Heap-Increase-Key, or Bubble Up, or Percolate Up. (A sketch follows.)
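A Python sketch of this bubble-up, storing the heap 1-indexed in a list whose slot 0 is unused:

def heap_insert(heap, key):
    """heap is a list with heap[0] unused, e.g. start with [None]."""
    heap.append(key)
    i = len(heap) - 1
    while i > 1 and heap[i // 2] < heap[i]:   # parent breaks the max-heap property
        heap[i // 2], heap[i] = heap[i], heap[i // 2]
        i //= 2                               # percolate up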

Complexity: parent, left, and right are simple O(1) operations. Bubble-up is an O(lg(n)) operation, because in the worst case we have T(n) = T(floor(n/2)) + O(1) for n >= 2, the O(1) being the swap.

Heapify an Array: To heapify means to toss and turn the array to make sure all the nodes satisfy the heap property. If you have an array A[1..N], we need to make sure all parents have children that are smaller than them. We should heapify from the last parent up to the root. So, who is the last parent? It's the parent of the last child: A[floor(N/2)]. We will visit all parents, starting from floor(N/2) down to 1, and make sure they satisfy the heap property. The process is similar to bubbleUp, except here we are adjusting parents, fixing the tree under each one. This process of moving a node to a lower position is called to Heapify, or to Bubble Down, or to Percolate Down.
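A Python sketch of bubble-down and of heapifying a whole array, in the same 1-indexed layout:

def bubble_down(heap, i, size):
    """Sink heap[i] until both children are smaller (max-heapify)."""
    l, r, largest = 2 * i, 2 * i + 1, i
    if l <= size and heap[l] > heap[largest]:
        largest = l
    if r <= size and heap[r] > heap[largest]:
        largest = r
    if largest != i:
        heap[i], heap[largest] = heap[largest], heap[i]
        bubble_down(heap, largest, size)      # keep fixing the subtree below

def build_heap(heap, size):
    for i in range(size // 2, 0, -1):         # from the last parent up to the root
        bubble_down(heap, i, size)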

Complexity: Bubble-down is similar to bubble-up. In the worst case you recurse all the way down to a leaf, and a child's subtree can hold at most 2*n/3 of the elements (when the bottom level of the tree is exactly half full). So we have T(n) <= T(2*n/3) + O(1), which gives O(lg(n)).

Heap Sort: Now that we know how to convert an array into a heap, sorting is very simple.

Heapify the array, making it a max-heap. This makes sure the largest element is at index 1.

Swap the first element with the element at heapSize and decrease heapSize by 1, effectively sending the top element to its correct location, outside the heap. Then bubble down the new root and repeat this step until the heap is down to one element, as sketched below.
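Putting the pieces together (using build_heap and bubble_down from the sketch above):

def heap_sort(values):
    heap = [None] + list(values)               # shift to the 1-indexed layout
    size = len(values)
    build_heap(heap, size)
    while size > 1:
        heap[1], heap[size] = heap[size], heap[1]  # move the max past the heap end
        size -= 1
        bubble_down(heap, 1, size)             # restore the heap property
    return heap[1:]

print(heap_sort([4, 1, 3, 9, 7]))              # [1, 3, 4, 7, 9]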

Sunday, September 30, 2012

Definition: Suppose you are given an array A[1..n]. You need to find a p and a q such that A[q] - A[p] is maximum, where 1 <= p < q <= n.

Discussion: A couple of common approaches that do not really work:

look for the global min (A[p]) and the global max (A[q]). This may not necessarily satisfy the p < q constraint.

find global min index (a) and global max index (b). Walk from right till a for max difference (X), and left till b for max difference (Y). The higher of the two will give our p and q.

Both of these theories looked promising initially, until I came to a counterexample:

12  10  11  20   2   6  19   1   3
            ^    ^       ^   ^
           max   p       q  min

So, what failed us? The underlying thought that we need to have the global max or global min in our equation to be able to find the max change is fundamentally flawed. What we are looking for is the maximum difference between two elements when going left to right. With that knowledge, we are left in the bad zone of brute force: compare each element with all elements to its right, and store the max. For an n-length array, the number of comparisons would be (n-1) + (n-2) + ... + 1 = n*(n-1)/2, i.e. O(n^2).

Divide and Conquer: O(n^2) is as bad as it could be. We can improve performance by not looking into all possible combinations, but just the ones that matter. We can divide the problem into smaller identical subproblems and take the winners of the smaller subproblems.

Another thing we observe is that it's the sum of consecutive differences that matters. Let me explain. Take our case:

Original:     12   10   11   20    2    6   19    1    3
Difference:    0   -2    1    9  -18    4   13  -18    2

All we need to do is find a subarray with the maximum sum. Divide and conquer suggests dividing the array into two halves, then taking the higher of the maximum subarray of the left half and the maximum subarray of the right half. But wait: these two cases assume that the maximum subarray lies completely in either the left or the right half. What if there exists a subarray that crosses the midpoint? We should take that into consideration too. So, here is what we come up with:
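The listing itself is not reproduced here; a Python sketch of the divide-and-conquer solution. It works on the difference array without the leading 0 shown above, so diff[k] = A[k+1] - A[k] and the subarray diff[i..j] sums to A[j+1] - A[i]:

def max_crossing(diff, lo, mid, hi):
    """Best sum of a subarray forced to cross the midpoint."""
    left_sum, total, best_l = float('-inf'), 0, mid
    for i in range(mid, lo - 1, -1):           # grow leftwards from mid
        total += diff[i]
        if total > left_sum:
            left_sum, best_l = total, i
    right_sum, total, best_r = float('-inf'), 0, mid + 1
    for j in range(mid + 1, hi + 1):           # grow rightwards from mid+1
        total += diff[j]
        if total > right_sum:
            right_sum, best_r = total, j
    return best_l, best_r, left_sum + right_sum

def max_subarray(diff, lo, hi):
    """Indices and sum of the maximum-sum subarray of diff[lo..hi]."""
    if lo == hi:
        return lo, hi, diff[lo]
    mid = (lo + hi) // 2
    candidates = [max_subarray(diff, lo, mid),
                  max_subarray(diff, mid + 1, hi),
                  max_crossing(diff, lo, mid, hi)]
    return max(candidates, key=lambda t: t[2])

a = [12, 10, 11, 20, 2, 6, 19, 1, 3]
diff = [a[k] - a[k - 1] for k in range(1, len(a))]
i, j, best = max_subarray(diff, 0, len(diff) - 1)
p, q = i, j + 1                                # map diff indices back to a
print(p, q, best)                              # 4 6 17 -> A[6] - A[4] = 19 - 2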

Wednesday, September 19, 2012

Motivation: Minification of JS and CSS is a common best practice when developing web applications. It's done to save bandwidth and to keep a single unified JS and CSS file which, once loaded, will be used for subsequent requests.

Prerequisites: Introductory knowledge of Maven.

Steps: I will use the Samaxes minifier plugin. Read details about this plugin here. I will follow an example, assuming your directory structure is similar to this:
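The structure listing itself is not shown here; a typical layout (an assumption, yours may differ) would be:

pom.xml
src/main/webapp/
    css/
        style1.css
        style2.css
    js/
        script1.js
        script2.js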

Monday, August 27, 2012

Motivation: Sometimes, Eclipse on Ubuntu (specifically, 11.10) hangs beyond recovery. It eventually crashes, or I have to kill -9 it. When I start it again, the splash screen gets stuck at about 70%.

The only solution till now was to create a new workspace which leads to reconfiguring the plugins, downloading code from repository -- a complete waste of time.

Requisites: Access to corrupted workspace directory.

Steps: Since changing the workspace fixed the problem, I knew that something had got corrupted in the workspace. With a little poking around in $WORKSPACE/.metadata, I came to realize that at least one plugin's (with .ui in its name) metadata had gone bad. So I started trial and error: moving files out of $WORKSPACE/.metadata/.plugins to another location and starting Eclipse. (Starting Eclipse causes recreation of the moved plugin metadata.)

Long story short,

Look into the Eclipse log file in the workspace and find out which plugin is doing the mischief.
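For example (the plugin directory name below is illustrative; move whichever one the log implicates):

mkdir -p /tmp/metadata-backup
mv "$WORKSPACE/.metadata/.plugins/org.eclipse.ui.workbench" /tmp/metadata-backup/
# start Eclipse; the moved metadata gets recreated fresh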

Wednesday, August 15, 2012

Motivation: Something weird happened today: I was asked to set up a remote debug mechanism on our test server, so that developers can step through code on their machines while execution takes place in a test environment sitting in another city.

I never imagined needing remote debug to actually debug code on a remote machine.

Prerequisite:

Standalone Jetty installation

Eclipse (or your favorite Java IDE that has remote debug facility)

Steps: $JETTY_HOME denotes the directory where Jetty is installed/copied/extracted.

Edit the ini file: Open $JETTY_HOME/start.ini in your favorite text editor. Uncomment --exec and, after that, add the debug setup. Here is a snippet of that file:
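The snippet itself is not reproduced here; the relevant lines could look like this (the port 4000 comes from the post; suspend=n is an assumption that lets Jetty start without waiting for a debugger to attach):

--exec
-Xdebug
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=4000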

Connecting to the Remote Debugger: In Eclipse, go to the Run > Debug Configurations... menu. Double-click Remote Java Application, give it an appropriate name, link it to the relevant project, provide the host name, and mention the port you started your remote debugger on. From step #1, in my case this is 4000.

Here is the screen shot:

Apply and start debugging.

A word about the firewall: Since the debugger makes a TCP connection to your Jetty, make sure the debug port (4000) is open on the host machine. In AWS EC2, you can allow port 4000 in your security group.

Friday, August 3, 2012

Motivation: Most of the time, running a command or a group of commands on a remote machine is a thoughtless process -- ssh username@remoteIP.or.dns "command1 param1 param2; command2 param3;". What throws you off is when you have 40 lines of convoluted script to be run on 37 machines! You won't use this approach. You'd probably scp the script to the remote machine, then call it using ssh as mentioned earlier, and then probably delete the file. It's a three-step process. If you haven't automated it, it's a pain.

A better way is to automate this process and do it in a single connection.

Prerequisites:

Executable access to the remote machine.

Some knowledge of shell script.

Steps: It's just one line command. I will write the breakdown.

Keep the shell script that you want to execute remotely on your local machine. cat this file.

The cated file is piped to the ssh command, which...

Writes this file to a location of your choice on remote machine, say /tmp/remote.sh and...

Changes its mode (chmod) to executable, then...

Calls this script with parameters, if any, and finally...

Deletes this file from remote location.

The most interesting part is that all of this is done in a single connection. Here is the code. The \ is the line-continuation character; you may want to keep the whole command on a single line.
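The exact one-liner did not survive here; a sketch of what it could look like (paths and parameters are placeholders):

cat local_script.sh | ssh user@remote.host \
  'cat > /tmp/remote.sh && \
   chmod +x /tmp/remote.sh && \
   /tmp/remote.sh param1 param2; \
   rm -f /tmp/remote.sh'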

--add-locks: Surround each table dump with LOCK TABLES and UNLOCK TABLES statements. This results in faster inserts when the dump file is reloaded.

--flush-privileges: Send a FLUSH PRIVILEGES statement to the server after dumping the mysql database. This option should be used any time the dump contains the mysql database and any other database that depends on the data in the mysql database for proper restoration.
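Combined, a dump command using both options might look like this (credentials and output file are placeholders):

mysqldump -u root -p --add-locks --flush-privileges \
    --all-databases > all-databases.sql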

The tricky part is: how would you know what file was uploaded X days ago? With richer (scripting) languages like Python or Groovy, it's a lot easier to enumerate the files in the backup folder, and if there are more than X files, delete the excess. To me, it's tricky to do this in a shell script. So I ended up having smart filenames instead.

I name the files as www.mywebsite.com.YYYY-MM-DD.sql.tar. So, to delete the file that was created X days ago, all I have to do is generate the date of X days ago, place it in the same filename structure, and then call s3cmd del s3://BUCKETNAME/SUB/DIRECTORY/OLD_FILENAME
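A sketch of that deletion step in shell (the bucket, path, and 7-day retention are placeholders; date --date is GNU date):

DAYS=7
OLD_DATE=$(date --date="${DAYS} days ago" +%Y-%m-%d)
OLD_FILE="www.mywebsite.com.${OLD_DATE}.sql.tar"
s3cmd del "s3://BUCKETNAME/SUB/DIRECTORY/${OLD_FILE}"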

Thursday, July 19, 2012

Sure, you can do it manually. And it worked flawlessly until now. Wait until you have a multi-module project and still dare to do it manually. Sooner or later you will see you have made a typo -- you renamed the version to 1.2.3-SHAPSHOT.

Anyway, the most common Google search result is mvn release:update-versions, which requires the pom.xml to be at a SNAPSHOT revision and is part of a release; you may not want to perform a release.

A better (and correct) alternative is to use the Maven Versions plug-in. It updates the versions of submodules too. Here is how to use it:
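For example (the new version number is a placeholder):

mvn versions:set -DnewVersion=1.2.4
mvn versions:commit    # accept the change (or versions:revert to undo it)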

Saturday, July 7, 2012

My last install of Nagios on CentOS went all fine, except the Nagios home page was not loading the main container with the welcome page; it was blank. I had to change the default home page to something meaningful, so I changed it to the "tactical overview" page.

In /usr/local/nagios/share/index.php or /usr/share/nagios/htdocs/index.php, change this line:

$corewindow="main.php";

to:

$corewindow="cgi-bin/tac.cgi";

If you do not find your index.php in the above two locations, try using this command to locate it on your disk:
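For instance (a sketch; any file-search command works):

find / -name index.php -path "*nagios*" 2>/dev/null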