My Life as a Sys Admin

Monthly Archives: April 2015

It’s been almost two months since I started working with Ansible full time. Like most sysadmins, I’ve mostly used Ansible from the CLI. Unlike Salt or Puppet, Ansible is agentless, so everything has to be invoked from a box that has Ansible and the relevant playbooks installed. If you want to use Ansible with EC2 features like auto-scaling, you either need to buy Ansible Tower or use ansible-pull along with a user data script. I’ve also seen people use custom scripts that fetch their repo and run the Ansible playbook locally to bootstrap.

Being a big fan of Flask, I’ve used it to build many backend APIs that automate a bunch of my tasks. So this time I decided to write a simple Flask API for executing Ansible playbooks, ad hoc commands and so on. Ansible also provides a Python API, which made my work easier. Like most Ansible users, I use roles for all my playbooks. We could expose an API that calls Ansible directly and executes playbooks while the client waits. But there are cases where a playbook run takes more than 5 minutes, and of course any network latency will slow down package downloads and the like. I don’t want to force my HTTP clients to wait for the playbook to finish before they get a response.

So I decided to go with a job queue. Each time a request comes in to my API, the job is queued in Redis and the job ID is returned to the client. My job workers then pick jobs from the Redis queue and execute them in the background, updating the job status as they go. So I need to expose two API endpoints: one for receiving jobs and one for reporting job status. For Redis-backed queues there is an awesome library called rq, which I’ve been using for all my queuing tasks.
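To make the flow concrete, here is a minimal sketch of the enqueue / work / status cycle. It is not the rq-based code: it deliberately swaps Redis and rq for an in-memory queue from the standard library so that it runs anywhere, and the names `JobQueue`, `work_one` and `fake_playbook_run` are made up for the illustration.

```python
import queue
import uuid


class JobQueue:
    """In-memory stand-in for a Redis-backed job queue (illustration only)."""

    def __init__(self):
        self._pending = queue.Queue()
        self._jobs = {}  # job_id -> {"status": ..., "result": ...}

    def enqueue(self, func, *args):
        """Queue a job and return its ID immediately, without running it."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = {"status": "queued", "result": None}
        self._pending.put((job_id, func, args))
        return job_id

    def status(self, job_id):
        """What the job-status endpoint would report for this ID."""
        return self._jobs[job_id]["status"]

    def result(self, job_id):
        """Return value of the job, or the error text if it failed."""
        return self._jobs[job_id]["result"]

    def work_one(self):
        """What a worker does: take one job, run it, record the outcome."""
        job_id, func, args = self._pending.get_nowait()
        self._jobs[job_id]["status"] = "started"
        try:
            self._jobs[job_id]["result"] = func(*args)
            self._jobs[job_id]["status"] = "finished"
        except Exception as exc:
            self._jobs[job_id]["result"] = repr(exc)
            self._jobs[job_id]["status"] = "failed"


def fake_playbook_run(host, role):
    # Placeholder for the real (slow) playbook execution.
    return "applied {} to {}".format(role, host)
```

The important property is that `enqueue` returns right away with an ID, so the HTTP client never waits for the playbook; it polls the status endpoint with that ID instead.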

Flask API

The job accepts a bunch of parameters like host, role and env via an HTTP POST. Since the role, host and so on have to be taken from the HTTP request, my playbook YAML file has to be dynamic. So I decided to use Jinja templating to create the playbook YAML file dynamically. Below is my sample API for role-based playbook execution.
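A minimal sketch of just the templating step, under a few assumptions: it uses the standard library’s `string.Template` as a dependency-free stand-in for Jinja (so placeholders are `$host` rather than `{{ host }}`), and the playbook layout and the `render_playbook` helper are illustrative rather than the code from my app.

```python
from string import Template

# Illustrative playbook skeleton; a Jinja template would use {{ host }} etc.
PLAYBOOK_TEMPLATE = Template("""\
---
- hosts: $host
  roles:
    - $role
  vars:
    env: $env
""")

REQUIRED_FIELDS = ("host", "role", "env")


def render_playbook(params):
    """Build playbook YAML text from the parameters of a POST request."""
    missing = [field for field in REQUIRED_FIELDS if not params.get(field)]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    values = {field: params[field] for field in REQUIRED_FIELDS}
    return PLAYBOOK_TEMPLATE.substitute(values)
```

In a real service the values come straight from an HTTP request, so they should be validated (for example against a list of known roles) before being written into a playbook.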

Once the playbook file is ready, we need to invoke Ansible’s API to perform the bootstrapping. This is done by the job workers. Below is a sample function which invokes the playbook API from Ansible core.
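Since the Python API differs between Ansible versions, the sketch below takes a different route and shells out to the `ansible-playbook` CLI instead, using only its `-i` (inventory) and `-e` (extra vars) options. The helper names `build_command` and `run_playbook` are hypothetical; treat this as one possible shape for the worker function, not the original one.

```python
import subprocess


def build_command(playbook_path, inventory_path, extra_vars=None):
    """Assemble the ansible-playbook command line as an argument list."""
    cmd = ["ansible-playbook", "-i", inventory_path, playbook_path]
    for key, value in sorted((extra_vars or {}).items()):
        cmd += ["-e", "{}={}".format(key, value)]
    return cmd


def run_playbook(playbook_path, inventory_path, extra_vars=None):
    """Run the playbook; return (succeeded, combined stdout/stderr text)."""
    proc = subprocess.run(
        build_command(playbook_path, inventory_path, extra_vars),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    # ansible-playbook exits non-zero when a host fails or is unreachable.
    return proc.returncode == 0, proc.stdout
```

A worker would call `run_playbook` on the rendered file and store the boolean and the output as the job’s status and result.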

Now we have a fully fledged API server for executing role-based playbooks. This API can also be used with user data scripts in auto-scaling: the new instance makes an HTTP POST request to the API server, and the API server starts the bootstrapping. I’ve tested this app locally with various scenarios and the results are promising. As a next step, I’m planning to extend the API to do more jobs, like automating code pushes and running ad hoc commands via the API. With applications like Ansible, Redis and Flask, I’m sure sysadmins can attain DevOps nirvana :). I’ll be pushing the latest working code to my GitHub account soon…

Nowadays CI, or Continuous Integration, is being implemented in almost all IT companies, and much of the DevOps work is related to CI. The common scenario: developers push code to the Git/SVN repo, which triggers Jenkins to run tests and sometimes packaging, and if it’s a fully automated system the new changes are deployed to staging, where the QA team takes over the testing. But in a small team, all of this has to be achieved with minimal people. So before a new change is pushed all the way to staging, I decided to run a quick, simple test of all the components. I’ve read blogs where DevOps engineers spin up new instances as a full replica of their entire architecture, deploy the new code there and load test the new cluster; if all the components behave properly with the code change, it is then deployed to staging for the next level of full-scale QA.

Though the above approach seems interesting, I didn’t want to waste resources by spinning up a new set of instances each time. Being a hardcore Docker fan, I decided to replace the instance launch with Docker containers. So instead of launching new instances, Jenkins launches new Docker containers with SDN (Software Defined Networking). Below is a simple architecture diagram of my new design.

The workflow goes like this:

1) Developers push the new code changes, along with a new tag, to the corresponding repositories.

2) A GitHub webhook then triggers Jenkins to start the build jobs.

3) Jenkins performs the build and, if the build succeeds, triggers Debian packaging for the application.

4) Once the packaging is completed, Jenkins triggers Docker image creation for the corresponding application using the newly built packages.

5) Once the image build is completed, Jenkins uses Docker Compose to build our virtual cluster, which is an exact replica of our prod/staging.

6) Once the cluster is up, we run automated tests of all our components and make sure they behave normally with the new code changes.

Once the test results are normal, we can initiate the code deployment to staging and start the full-scale QA.
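The gating logic of this workflow (each stage runs only if the previous one succeeded) can be sketched in a few lines. This is only an illustration of the control flow, not my Jenkins configuration: `run_pipeline` is a hypothetical helper, and each stage is assumed to be a callable that returns True on success.

```python
def run_pipeline(stages):
    """Run (name, step) pairs in order and stop at the first failure.

    Returns the list of (name, succeeded) for the stages that were attempted.
    """
    results = []
    for name, step in stages:
        succeeded = bool(step())
        results.append((name, succeeded))
        if not succeeded:
            break  # later stages depend on this one, so skip them
    return results


# Example wiring, with stage names taken from the workflow above and
# placeholder callables standing in for the real Jenkins jobs:
#   run_pipeline([
#       ("build", do_build),
#       ("package", build_deb),
#       ("image", build_docker_image),
#       ("cluster", compose_up),
#       ("test", run_component_tests),
#   ])
```

Deployment to staging would be triggered only when every entry in the returned list reports success and all stages were attempted.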

By using Docker, I was able to reduce resource usage: all these containers run on a single m3.medium box. Since I’m concentrating on whether the components work together rather than on load testing, this smaller box was enough to get the results I needed.

A bit about docker-compose: I’m using it to manage the Docker cluster. Compose is a tool for defining and running complex applications with Docker. With Compose, we can define a multi-container application in a single file, then spin the application up with a single command that does everything needed to get it running. Below is the content of my docker-compose yml file.

pkgr is a tool for building deb/rpm packages for Python/Ruby/Node/Go applications. It uses Heroku buildpacks and embeds all the dependencies of the application runtime within the package. It also gives us a nice executable that closely replicates the Heroku toolbelt utility. There are only two requirements for pkgr: 1) the application must have a Procfile, and 2) it should be Heroku compatible.

By default, pkgr supports packaging Ruby/Go/Node apps. But it also supports custom buildpacks, so we can use the heroku-python buildpack to package Python apps too.

MongoDB is one of the most commonly used NoSQL document stores. For smaller use cases we might not need a full-scale replica set; instead we can run MongoDB in the traditional master-slave architecture. In this post I’m going to explain how to convert a standalone MongoDB server to a master-slave setup, and how to promote a slave instance to master if the master crashes.

Standalone to Master-slave Model.

First, on the master node, add master=true to the MongoDB config file and restart the mongo service. On the new mongo node, which is going to be the slave, add the below config options to the MongoDB configuration file.
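As a rough sketch of what those options look like, the helper below generates the lines for each role. It assumes the legacy key=value config format and the legacy master-slave options `master`, `slave` and `source`; the function names and the example hostname are made up, and 27017 is simply MongoDB’s default port.

```python
def master_options():
    """Config lines that turn a standalone mongod into a master."""
    return ["master=true"]


def slave_options(master_host, master_port=27017):
    """Config lines for a slave that replicates from the given master."""
    return [
        "slave=true",
        "source={}:{}".format(master_host, master_port),
    ]


def append_options(config_text, options):
    """Return the config text with the extra option lines added at the end."""
    base = config_text.rstrip("\n")
    return "\n".join([base] + list(options)) + "\n"
```

For example, `append_options(existing_config, slave_options("mongo-master.example.com"))` yields the text to write back to the slave’s config file before starting the mongo service there.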

We can also check the replication status from the mongo shell on the master via rs.printReplicationInfo() or db.serverStatus( { repl: 1 } ). We can run the same checks on the slave nodes, but by default read queries are not allowed on a slave and will throw an error. Running db.getMongo().setSlaveOk() in the slave’s mongo shell lifts this restriction for that connection, after which rs.printReplicationInfo() or db.serverStatus( { repl: 1 } ) will show the replication status.

Promoting a Slave node to Master

This is one of the reasons we keep slave nodes: if the master crashes, we can easily promote a slave and minimize the interruption. To promote a slave node to master, follow the steps below.

1) Stop the mongo service on the slave.
2) Remove all the local database files from the mongo data directory:
$ cd <mongo_data_directory> && rm -rvf local*
3) Remove the slave configuration from the mongo config file and set `master=true` (this is required if we have more than one slave, so that the remaining slaves can connect to the new master).
4) Restart the mongo service. The new master is now ready to accept writes.
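If you want to script step 2, a cautious version with a dry-run default might look like the sketch below. `remove_local_db` is a hypothetical helper; it assumes the mongo service is already stopped and that everything matching `local*` in the data directory belongs to the local database, exactly as the `rm` command above does.

```python
import glob
import os
import shutil


def remove_local_db(data_dir, dry_run=True):
    """Delete local* files/directories in data_dir; return the matched paths.

    With dry_run=True (the default) nothing is deleted, so the list can be
    reviewed first.
    """
    targets = sorted(glob.glob(os.path.join(data_dir, "local*")))
    if not dry_run:
        for path in targets:
            if os.path.isdir(path):
                shutil.rmtree(path)  # layout with one directory per database
            else:
                os.remove(path)      # files such as local.0, local.ns
    return targets
```

Running it once with the default `dry_run=True` and checking the returned list before deleting anything is a cheap safeguard on a production data directory.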

If we have multiple slaves, we need to change the source IP on each of them so that they connect to the new master. But even if they connect to the new master, replication will fail. So we have two options: either remove the data and perform a fresh replication, or force a complete resync on the slaves using the command below.

#On each slave’s mongo shell, run
$ use admin
$ db.runCommand( { resync: 1 } ) # This forces a complete resync of that slave.

This procedure is useful if you are running a standalone or master-slave setup. For a real HA/fault-tolerant design, a replica set is the better choice: a new primary is elected automatically if the current primary crashes, keeping downtime to a minimum.

It’s been quite a while since my last blog post. This time I’m coming with a bunch of topics to write about, starting with Kannel. After moving to my new role, the first task I got was to set up an SMPP server with one of our carriers. After some digging on the internet I found a project called Kannel, which is a perfect fit for me. So in this post, I’ll explain how to set up an SMPP SMS gateway locally.

Now we have Kannel installed in our custom prefix folder. Let’s go ahead and configure the Kannel application.

Setting up Kannel

Kannel comprises two processes, smsbox and bearerbox. bearerbox is the service that talks to the carrier gateways and is responsible for sending and receiving SMS. smsbox is the service that sits between our application and bearerbox, i.e. it receives incoming SMS from bearerbox and passes them to our application, and vice versa. The Kannel config consists of multiple parts, which are explained below.

Add all the above configuration to the kannel.conf file as required. A sample init script for Debian/Ubuntu is available here

Once the SMPP service is started, check the bearerbox logs for connectivity with the carrier’s SMPP gateway. Once the connection is up, we can start sending and receiving SMS. For incoming SMS, smsbox makes an HTTP request based on our configuration. For example, if we are using the POST method, SMS details like From and To can be retrieved from the POST headers and the SMS text from the request body. Below are some of the headers that come along with the POST requests.

X-Kannel-From => sender id
X-Kannel-To => recipient id
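On the application side, pulling these fields out of the request is straightforward. The sketch below is framework-agnostic: it takes the headers as a plain dict and the body as bytes or text, and it assumes the message text is UTF-8, which depends on how the service is configured. `parse_incoming_sms` is a name I made up for the example.

```python
def parse_incoming_sms(headers, body):
    """Extract sender, recipient and text from a Kannel POST request."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if isinstance(body, bytes):
        # Assumption: UTF-8 payload; undecodable bytes are replaced.
        body = body.decode("utf-8", errors="replace")
    return {
        "from": lowered.get("x-kannel-from"),
        "to": lowered.get("x-kannel-to"),
        "text": body,
    }
```

In a Flask view, for instance, this would be called with `request.headers` and `request.data`.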

Similarly, for outbound SMS, our application makes an HTTP GET request to the smsbox URL; smsbox hands the message to bearerbox, which then passes it on to the carrier for delivery.
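A small sketch of building that GET request, assuming smsbox’s standard `/cgi-bin/sendsms` endpoint with the `username`, `password`, `from`, `to` and `text` query parameters. The base URL, the credentials (which must match a sendsms-user defined in kannel.conf) and the helper name are placeholders.

```python
from urllib.parse import urlencode


def build_sendsms_url(base_url, username, password, sender, recipient, text):
    """Return the smsbox URL that submits one outbound SMS."""
    query = urlencode({
        "username": username,
        "password": password,
        "from": sender,
        "to": recipient,
        "text": text,  # urlencode takes care of spaces and special characters
    })
    return "{}/cgi-bin/sendsms?{}".format(base_url.rstrip("/"), query)
```

The resulting URL can be fetched with any HTTP client, for example `urllib.request.urlopen(url)`; smsbox’s response indicates whether the message was accepted for delivery.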