I run a single-server (call it 'server A') IRC 'network', and thanks to the generosity of some friends, I have been given a second server ('server B') that I can run an IRCd on to provide redundancy in case server A crashes. This is fine: I can set up round-robin DNS with the servers linked. The problem I have is what to do about services. Does anyone know of a way to get the services to 'fail over' in case of a server failure? E.g., server A starts off running the services, but suddenly crashes. Server B detects this and starts its own copy of the services (ideally with the same configuration and data as the services on server A).

One solution that comes to mind is to write a bot that each server runs, which sits in a channel and periodically checks whether the bot from the other server is in the channel. If it is, then all is well. If not, then fail over. I would prefer not to have to code this myself, though.

3 Answers

Long Answer:
Your idea is a good one, and you may not have to code it from scratch: you can probably start from existing template code for simple IRC bots (e.g. in Python). The bots also wouldn't necessarily have to poll each other regularly; they could just process join/part/quit messages. You would, of course, have to deal with various race conditions in some way (e.g. netsplits, database access issues, etc.).
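As a rough illustration of the join/part/quit approach, here is a minimal sketch of the decision logic only, with no socket handling. The nick `failover-b`, the channel `#failover`, and the `PeerWatcher` class are hypothetical names made up for this example, and it assumes the bot is fed raw IRC lines by whatever client library you use.

```python
from typing import List, Optional, Tuple

PEER_NICK = "failover-b"     # hypothetical nick of the other server's bot
WATCH_CHANNEL = "#failover"  # hypothetical monitoring channel


def parse_line(line: str) -> Tuple[str, str, List[str]]:
    """Split a raw IRC line into (sender nick, command, params)."""
    prefix = ""
    if line.startswith(":"):
        parts = line[1:].split(" ", 1)
        if len(parts) < 2:
            return "", "", []
        prefix, line = parts
    if " :" in line:
        # Everything after " :" is a single trailing parameter.
        head, trailing = line.split(" :", 1)
        params = head.split() + [trailing]
    else:
        params = line.split()
    if not params:
        return "", "", []
    command = params.pop(0).upper()
    nick = prefix.split("!", 1)[0]
    return nick, command, params


class PeerWatcher:
    """Tracks whether the peer bot is in the watch channel."""

    def __init__(self, peer_nick: str = PEER_NICK,
                 channel: str = WATCH_CHANNEL) -> None:
        self.peer_nick = peer_nick.lower()
        self.channel = channel.lower()
        self.peer_present = False

    def handle(self, line: str) -> Optional[str]:
        """Return 'peer_up', 'peer_down', or None for unrelated lines."""
        nick, command, params = parse_line(line.rstrip("\r\n"))
        if nick.lower() != self.peer_nick:
            return None
        in_channel = bool(params) and params[0].lower() == self.channel
        if command == "JOIN" and in_channel:
            self.peer_present = True
            return "peer_up"
        if command == "QUIT" or (command == "PART" and in_channel):
            self.peer_present = False
            return "peer_down"  # caller decides whether to start services
        return None
```

Note that a netsplit and a real crash both show up as a QUIT here, so in practice you would probably wait a while and re-check before starting a second copy of services.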

I was afraid you were going to say that :P. I can't believe that none of the larger IRC networks have come up with a solution. Looks like I'll have to brush up on my Python then.
– insertjokehere Feb 14 '11 at 23:30

I know some networks have the ops manually start a backup when they're sure the services are going to be down for a while; I haven't personally asked on the likes of Freenode, though.
– jonodlo Feb 14 '11 at 23:34

The closest thing to failover you can get without coding custom services (not a trivial task) is to install services on a few other accounts that already run an IRCd on your network, and use crontab to run an rsync script that distributes the services database to those machines. This way, if the box services is on dies, you can start services from another machine and still have relatively current databases.

Neither crontab nor rsync is hard to learn, and setting them up is much faster than coding a custom solution.

Best of all, this method works with all existing services packages that use a flat database.
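A minimal sketch of the sync step, written in Python so that cron can run it as a script. The directory and host names in the comments are placeholders, not anything from a real services package; it assumes rsync over SSH with key-based login already set up.

```python
import subprocess
from typing import List


def build_rsync_command(db_dir: str, standby_host: str,
                        remote_dir: str) -> List[str]:
    """Build the rsync argument list for one standby machine."""
    # Trailing slash: copy the directory's contents, not the directory itself.
    src = db_dir.rstrip("/") + "/"
    # -a preserves permissions and timestamps, -z compresses in transit.
    return ["rsync", "-az", src, f"{standby_host}:{remote_dir}"]


def sync(db_dir: str, standby_hosts: List[str], remote_dir: str,
         dry_run: bool = False) -> List[List[str]]:
    """Push the services database to every standby; return the commands."""
    commands = [build_rsync_command(db_dir, host, remote_dir)
                for host in standby_hosts]
    if not dry_run:
        for cmd in commands:
            # check=True raises if rsync exits non-zero, so cron can mail it.
            subprocess.run(cmd, check=True)
    return commands


# Example call (placeholder paths and hosts):
# sync("/home/irc/services/data", ["irc@serverb.example.net"],
#      "/home/irc/services/data/")
```

A crontab entry that runs such a script every ten minutes or so would keep the standby copy reasonably fresh. Copying a flat file while services is in the middle of writing it could produce a partial copy, so it may be worth syncing shortly after the package's own database save interval.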

Building on what @IRCGuru stated: one could also set up DNS failover so that each server is accessed via region.irc.yourirc.net. Both methods work well in the environment you are describing, including when things are running within a virtual machine environment.

With the VM approach, you can simply spin up a second instance of your server and redirect the inbound port from one to the other, or even load balance between the two, with both of them connecting to each other internally to prevent netsplits in your channels.