With Drupal 8 reaching its maturity and coupling/decoupling from other services — including itself — we have an increasing demand for Drupal sites to shine and make engaged teams thrive with good DevOps practices and resilient Infrastructure. All that done in the biggest Distributed System ever created by humans: the Internet. The biggest challenges of any distributed system are heterogeneity of systems and clients, transparency to the end user, openness to other systems, concurrency to support many users simultaneously, security, scalability on the fly and failure handling in a graceful way. Are we there yet?

We envision, in the DevOps + Infrastructure track, to see solutions from the smallest containers that can grow to millions of services to best practices in the DevOps world that accomplish very specific tasks to support Drupal and teams working on it and save precious human time, by reducing repetitive and automatable tasks.

Questions about container orchestration, virtualization and cloud infrastructure arise every day and we expect answers to come in the track sessions to deal with automation and scaling faster — maybe using applied machine learning or some other forms of prediction or self management. See? We’re really into saving time, by using technology to assist us.

We clearly don’t manage our sites in the same way we did years ago, due to increased complexity of what we manage and how we are managing change in process and culture, therefore it’s our goal at Drupal Europe to bring the best ideas, stories and lessons learned from each industry into the room and share them with the community.

What’s your story?

How is your platform scaling? How do you solve automated testing and continuous integrations? How do you keep your team’s happiness with feature velocity and still maintain a healthy platform? How do you make your website’s perceived performance even faster? What chain of tooling is running behind the scenes and what is controlling this chain? Are you using agentless configuration management or are you resorting to an agent. Are you triggering events based on system changes or do you work with command and control.

Be ready to raise, receive and answer some hard questions and but most of all, inspire people to think from a different angle. What works for a high-high traffic website might not be applicable for maintaining a massive amount of smaller sites. We want operations to inspire development on reliability and for development to inspire operations on any kind of automation. We want security to be always top of mind while still have an impact on business value rapidly and efficiently. And that is just the beginning…

About industry tracks

Drupal Europe’s 2018 program is focused on industry verticals, which means there are tons of subjects to discuss therefore when you submit your session be sure to choose the correct industry track in order to increase the chance of your session being selected.

Please help us to spread the word about this awesome conference. Our hashtag is #drupaleurope.

About the Drupal Europe Conference

Drupal is one of the leading open source technologies empowering digital solutions in the government space around the world.

Drupal Europe 2018 brings over 2,000 creators, innovators, and users of digital technologies from all over Europe and the rest of the world together for three days of intense and inspiring interaction.

Location & Dates

Drupal Europe will be held in Darmstadtium in Darmstadt, Germany — which has a direct connection to Frankfurt International Airport. Drupal Europe will take place 10–14 September 2018 with Drupal contribution opportunities every day. Keynotes, sessions, workshops and BoFs will be from Tuesday to Thursday.

Computers are finicky. As stable and reliable as we would like to believe they have become, the average server can cease to function for hundreds of different reasons. Some of the common problems that cause websites or services to crash can’t really be avoided. If you suddenly find your site suffering from a DDOS attack or a hardware failure, all you can do is react to the situation.

But there are many simple things that are totally preventable that can be addressed proactively to ensure optimal uptime. To keep an eye on the more preventable issues, setting up monitoring for your entire stack (both the server as well as the individual applications) is helpful. At Zivtech, we use a tool called Sensu to monitor potential issues on everything we host and run.

Sensu is a Ruby project that operates by running small scripts to determine the health of a particular application or server metric. The core project contains a number of such scripts called “checks.” It’s also very easy to write custom checks and they can be written in any language, thus allowing developers to easily monitor new services or applications. Sensu can also be run via a client server model and issue alerts to members of the team when things aren’t behaving properly.

Server checks

As a general place to start, you should set up basic health checks for the server itself. The following list gives you a good set of metrics to keep an eye on and why it is in your best interest to do so.

RAM

What to check

Monitor the RAM usage of the server versus the total amount of RAM on the server.

Potential problem monitored

Running out of RAM indicates that the server is under severe load and application performance will almost certainly be noticeable to end users.

Actions to take

Running low on RAM may not be a problem if it happens once or twice for a short time. Sometimes there are tasks that require more resources and this may not cause problems, but if the RAM is perpetually running at maximum capacity, then your server is probably going to be moving data to swap space (see swap usage below) which is much slower than RAM.

Running near the limits of RAM constantly is also a sign that crashes are eminent since a spike in traffic or usage is surely going to require allocating resources that the server simply doesn’t have. Additionally, seeing spikes in RAM usage may indicate that a rogue process or poorly optimized code is running, which helps developers address problems before your users become aware of them.

Linux swap usage

What to check

Check swap usage as a percentage of the total swap space available on a given server.

Potential problem monitored

When the amount of available RAM is running short or the RAM is totally maxed out, Linux moves data from RAM to the hard drive (usually in a dedicated partition). This hard drive space is known as swap space.

Generally, you don’t want to see too much swap space being used because it means that the available RAM isn’t enough to handle all the tasks the server needs to perform. If the swap space is filled up completely, then it means that RAM is totally allocated and there isn’t even a place on disk to dump extra data that the system needs. When this happens, the system is probably close to a crash and some services are probably unresponsive. It can also be very hard to even connect to a server that is out of swap space as all memory is being used completely at this point and new tasks must wait to run.

Actions to take

If swap is continually running at near 100% allocation, it probably means the system needs more RAM, and you’d want to increase the swap storage space as part of this maintenance. Keeping an eye on this will help ensure you aren’t under-allocating resources for certain machines or tasks.

Disk space

What to check

Track current disk space used versus the total disk space on the server’s hard drives, as well as the total inodes available on the drives.

Potential problem monitored

Running out of disk space is a quick way to kill an application. Unless you have painstakingly designed your partitions to prevent such problems (and even then you may not be totally safe), when a disk fills up some things will cease working.

Many applications write files to disk and use the drive to store temporary data. Backup tasks rely on disk space as do logs. Many tasks will cease functioning properly when a drive or partition is full. On a website running Drupal, a full drive will prevent file uploads and can even cause CSS and JavaScript to stop working properly as well as data not being persisted to the database.

Actions to take

If a server is running low on space, it is relatively easy to add more. Cloud hosting providers usually allow you to attach large storage services to your running instance and if you use traditional hardware, drives are easy to upgrade.

You might also discover that you’ve been storing data you don’t need or forgot to rotate some logs which are now filling up the drive. More often than not, if a server is running out of space, it is not due to the application actually requiring that space but an error or rogue backup job that can be easily rectified.

CPU

What to check

Track the CPU usage across all cores on the server.

Potential problem monitored

If the CPU usage goes to 100% for all cores, then your server is thinking too hard about something. Usually when this happens for an extended period of time, your end users will notice poor performance and response times. Sites hosted on the server might become unresponsive or extremely slow.

Action to take

In some cases, an over-allocated CPU may be caused by a runaway processes but if your application does a lot of heavy data manipulation or cryptography, it might be an indication that you need more processing power.

When sites are being scraped by search providers or attacked by bots in some coordinated way, you might also see associated CPU spikes. So this metric can tip you off to a host of issues including the early stages of a DDoS attack on the server. You can respond by quickly restarting processes that are misbehaving and blocking potentially harmful IPs, or identify other performance bottlenecks.

Reboot required

What to check

Linux servers often provide an indication that they should be rebooted, usually related to security upgrades.

Potential problem monitored

Often after updating software, a server requires a reboot to ensure critical services are reloaded. Until this is done, the security updates are often not in full effect.

Action to take

Knowing that a server requires a reboot allows your team to schedule downtime and reduce problems for your end users.

Drupal specific checks

Zivtech runs many Drupal websites. Over time we have identified some metrics to help us ensure that we are always in the best state for security, performance, search indexes, and content caching. Like most Drupal developers, we rely on drush to help us keep our sites running. We have taken this further and integrated drush commands with our Sensu checks to provide Drupal specific monitoring.

Drupal cron

What to check

Drupal’s cron function is essential to the health of a site. It provides cleanup functions, starts long running tasks, processes data, and many other processes. Untold numbers of Drupal’s contributed modules rely on cron to be running as well.

Potential problem monitored

When a single cron job fails, it may not be a huge problem. But the longer a site exists without a successful cron run, the more problems you are likely to encounter. Services start failing without cron. Garbage cleanup, email notifications, content updates, and indexing of search content all need cron runs to complete reliably.

Action to take

When a cron job fails, you’ll want to find out if it is caused by bad data, a poorly developed module, permissions issues, or some other issue. Having a notification about these problems will ensure you can take proactive measures to keep your site running smoothly.

Drupal security

What to check

Drupal has an excellent security team and security processes in place. Drush can be used to get a list of modules or themes that require updates for security reasons. Generally, you want to deploy updates as soon as you find out about them.

Potential problem monitored

By the time you’ve been hacked, it’s too late for preventative maintenance. You need to take the security of your site’s core and contributed modules seriously. Drupal can alert site administrators via email about security alerts, but moving these checks into an overarching alerting system with company wide guidelines about actions to take and a playbook for how to handle them will result in shorter times that a given site is vulnerable.

Action to take

Test your updates and deploy them as quickly as you can.

Be Proactive

It’s difficult to estimate the value of knowing about problems before they occur. As your organization grows, it becomes more and more disruptive to be dealing with emergency issues on servers. The stress of getting a site back online is exponentially more than the stress of planned downtime.

With a little bit of effort, you can detect issues before they become problems and allocate time to address these without risking the deadlines for your other projects. You may not be able to avoid every crash, but monitoring will enable you to tackle certain issues before they disrupt your day and will help you keep your clients happy.

You might have heard about high availability before but didn’t think your site was large enough to handle the extra architecture or overhead. I would like to encourage you to think again and be creative.

Background

Digital Ocean has a concept they call a floating IPs. A Floating IP is an IP address that can be instantly moved from one Droplet to another Droplet in the same data center. This idea is great, it allows you to keep your site running in the event of failure.

Credit

I have to give credit to BlackMesh for handling this process quite well. The only thing I had to do was create the tickets to change the architecture and BlackMesh implemented it.

Exact Problem

One of our support clients had the need for a complete site relaunch due to a major overhaul in the underlying architecture of their code. Specifically, they had the following changes:

Change in the site docroot

Migration from a single site architecture to a multisite architecture based on domain access

Upgrade of PHP version that required a server replacement/upgrade in linux distribution version

Any of these individually could have benefited from this approach. We just bundled all of the changes together to delivering minimal downtime to the sites users.

Solution

So, what is the right solution for a data migration that takes well over 3 hours to run? Site downtime for hours during peak traffic is unacceptable. So, the answer we came up with was to use a floating IP that can easily change the backend server when we are ready to flip the switch. This allows us to migrate our data on a new separate server using it’s own database (essentially having two live servers at the same time).

Benefits

Notice that we won’t need to change the DNS records here which meant we didn’t have to wait for DNS records to propagate all over the internet. The new site was live instantly.

Additional Details

Some other notes during the transition that may lead to separate blog posts:

We created a shell script to handle the actual deployment and tested it before the actual “go live” date to minimize surprises.

A private network was created to allow the servers to communicate to each other directly and behind the scenes.

To complicate this process, during development (prelaunch) the user base grew so much we had to off load the Solr server on to another machine to reduce server CPU usage. This means that additional backend servers were also involved in this transition.

Go-Live (Migration Complete)

After you have completed your deployment process, you are ready to switch the floating ip to the new server. In our case we were using “keepalived” which responds to a health check on the server. Our health check was a simple php file that responded with the text true or false. So, when we were ready to switch we just changed the health checks response to false. Then we got an instant switch from the old server to the new server with minimal interruption.

Acceptable Losses

There were a few things we couldn’t get around:

The need for a content freeze

The need for a user registration freeze

The reason for this was that our database was the database updates required the site to be in maintenance mode while being performed.

A problem worth mentioning:

The database did have a few tables that would have to have acceptable losses. The users sessions table and cache_form table both were out of sync when we switched over. So, any new sessions and saved forms were unfortunately lost during this process. The result is that users would have to log in again and fill out forms that weren’t submitted. In the rare event that a user changed their name or other fields on their preferences page those changes would be lost.

Additional Considerations

Our mail preferences are handled by third parties

Comments aren’t allowed on this site

Recommended Posts

Engineers find solving complex problems exciting, but as I’ve matured as an engineer, I’ve learned that complexity isn’t always as compelling as simplicity.

Cloudflare Bug May Have Created Security Leak Cloudflare, a major internet host, had some unusual circumstances that caused their servers to output information that contained private information such as HTTP…

When you already have a design and are working with finalized content, high fidelity wireframes might be just what the team needs to make decisions quickly.

Chris Martin

Chris Martin is a junior engineer at Four Kitchens. When not maintaining websites he can be found building drones, computers, robots, and occasionally traveling to China.

Startups and products can move faster than agencies that serve clients as there is no feedback loops and manual QA steps by an external authority that can halt a build going live.

One of the roundtable discussions that popped up this week while we’re all in Minsk is that agencies which practice Agile transparently as SystemSeed do see a common trade-off. CI/CD (Continuous Integration / Continuous Deployment) isn’t quite possible as long as you have manual QA and that lead time baked-in.

Non-Agile (or “Waterfall”) agencies can potentially supply work faster but without any insight by the client, inevitably then needing change requests which I’ve always visualised as the false economy of Waterfall as demonstrated here:

Would the client prefer Waterfall+change requests and being kept in the dark throughout the development but all work is potentially delivered faster (and never in the final state), or would they prefer full transparency, having to check all story details, QA and sign off as well as multi-stakeholder oversight… in short - it can get complicated.

CI and CD isn’t truly possible when a manual review step is mandatory. Today we maintain a thorough manual QA by ourselves and our clients before deploy using a “standard” (feature branch -> dev -> stage -> production) devops process, where manual QA and automated test suites occur both at the feature branch level and just before deployment (Stage). Pantheon provides this hosting infrastructure and makes this simple as visualised below:

This week we brainstormed Blue & Green live environments which may allow for full Continuous Integration whereby deploys are automated whenever scripted tests pass, specifically without manual client sign off. What this does is add a fully live clone of the Production environment to the chain whereby new changes are always deployed out to the clone of live and at any time the system can be switched from pointing at the “Green” production environment, to the “Blue” clone or back again.

Assuming typical rollbacks are simple and databases are either in sync or both Green and Blue codebases link to a single DB, then this theory is well supported and could well be the future of devops. Especially when deploys are best made “immediately” and not the next morning or in times of low traffic.

In this case clients would be approving work already deployed to a production-ready environment which will be switched to as soon as their manual QA step is completed.

One argument made was that our Pantheon standard model allows for this in Stage already, we just need an automated process to push from Stage to Live once QA is passed. We’ll write more on this if our own processes move in this direction.

DrupalCon Dublin is just around the corner. Earlier today I started my journey to Dublin. This week I'll be in Mumbai for some work meetings before heading to Dublin.

On Tuesday 27 September at 1pm I will be presenting my session Let the Machines do the Work. This lighthearted presentation provides some practical examples of how teams can start to introduce automation into their Drupal workflows. All of the code used in the examples will be available after my session. You'll need to attend my talk to get the link.

As part of my preparation for Dublin I've been road testing my session. Over the last few weeks I delivered early versions of the talk to the Drupal Sydney and Drupal Melbourne meetups. Last weekend I presented the talk at Global Training Days Chennai, DrupalCamp Ghent and DrupalCamp St Louis. It was exhausting presenting three times in less than 8 hours, but it was definitely worth the effort. The 3 sessions were presented using hangouts, so they were recorded. I gained valuable feedback from attendees and became aware of some bits of my talk needed some attention.

Just as I encourage teams to iterate on their automation, I've been iterating on my presentation. Over the next week or so I will be recutting my demos and polishing the presentation. If you have a spare 40 minutes I would really appreciate it if you watch one of the session recording below and leave a comment here with any feedback.

Global Training Days Chennai

DrupalCamp Ghent

Note: I recorded the audience not my slides.

DrupalCamp St Louis

Note: There was an issue with the mic in St Louis, so there is no audio from their side.

Continuous Integration is often sold as overall process improvement through continued learning. It serves to end error-prone manual processes and achieve the long-standing DevOps goal of consistency and automation.

“Don’t solve the same problem more than once.”

The practice of Continuous Integration has already yielded technical breakthroughs, and can be summarized by the moniker “don’t solve the same problem more than once”. Tools like Jenkins provide a common platform for development teams to create commands to automate their deployments, run code quality checking, facilitate security scans, or analyze the performance of systems. Free and open source tools are being shared through tools like GitHub to build community, create robust tools, and evolve the entire DevOps movement. Innovations in collaborative problem solving make Continuous Integration a reality for any development team.

But, this innovation is often only accessible by technical audiences to solve technical problems. There still is a significant missing piece to this practice. So, what’s next?

It’s time to evolve Continuous Integration beyond technical problem solving into a practice based on serving those we build the tools for. Continuous Integration is ready to undergo the transformation needed for broader adoption, that it may reach its full potential.

Accessible Continuous Integration

“Accessible” isthe process of creating products that areusableby people with the widest possible range of abilities, operating within the widest possible range of situations.

What exactly do we mean when we talk about “Accessible Continuous Integration?” It is the adoption of any tool, innovation, or practice that eliminates silos between teams and knocks down barriers to entry.

We believe (Accessible) Continuous Integration can be for everyone — and we’d like to see a mind shift in how this practice is promoted and pursued. Bridges must not only be built between the systems we use, but also between technical and non-technical team members. We need to study this problem space, standardize conventions, and help communities recognize the influence of Accessible Continuous Integration. Technology must mature to a point where it is capable of serving a broad population in order to become truly powerful and effective.

There is evidence that these ideas are catching on in the wild. Conceptually, Accessible Continuous Integration seems to be exemplified by the following tenets:

Simple – There is an emphasis on making tools and practices streamlined and unencumbered.

Useful – Problems identified are solved comprehensively and inclusively with consideration for the needs of all parties.

Flexible – Tools have robust sets of features, integrate easily with other adopted tools, and are configurable for many use cases now and in the future.

Transparent – Concise communication is pursued to promote broad visibility by the ongoing operations of the continuous integration practice.

Some of today’s most innovative tools are models of Accessible Continuous Integration case studies. They demonstrate some or all of the aforementioned tenets.

Slack – Messaging platform used to reduce barriers between technical and non-technical collaborators with a wide variety of plugins and extendable behaviors for integrating other systems and processes (GitHub, Jenkins, JIRA, etc.)

GitHub – Development tool used to foster technical collaboration between developers with a set of available plugins for integrating continuous integration tools

Pantheon – Cloud-based hosting framework with an on-demand container-based environment that lowers the barrier of entry for non-technical staff to test development work in a fluid manner, in addition to APIs that allow developers to extend hosting operations

Trello – An easy-to-use and customizable backlog management tool that has integrations for Slack, GitHub, and many other systems

This conversation of Accessible Continuous Learning is just getting started. These tools and practices, among others, have the potential for tremendous growth in this space. At CivicActions, we’re committed to exploring Accessible Continuous Integration to its furthest possibilities. It resonates with who we are and our vision of giving back to our open source roots, strengthening our non-technical stakeholders, and delivering quality solutions.

The idea was recently presented at the 2016 Stanford Drupal Camp. The recording can be found here.

We want to jump-start this discussion and drive toward elegant and lasting solutions. Our clients deserve the best we can offer. Accessible Continuous Integration is a significant step toward our vision of digital empowerment for all.

At this year’s php[world] hackathon, I spent my time getting a Vagrant machine configured to run Drupal 8. I know there are other options, like Acquia’s own Dev Desktop, or even Zend Server. However, I like using Vagrant to run my LAMP stacks, especially on OS X. I’ve never been able to easily run xAMP on non-Linux machines. Installing MySQL can be a pain, system updates can change the version of PHP you’re running, and some PHP extensions are really difficult to build—even with Homebrew. Vagrant simplifies getting a working development environment running by automating the provision of a virtual machine for you, usually with a tool like Chef, Puppet, or Ansible. I used the hackathon as an opportunity to check out the shell script provisioner. If you know you’re way around the shell, this is a very straight-forward method since it uses the same commands you’d type in yourself at the terminal. There are a number of other benefits to using Vagrant:

Match your development environment exactly to your production server, so there’s no more “works on my machine”.

No need to install stuff you don’t need on your main operating system just for one client site.

Each site you work on gets its own machine, that you can destroy when the project is done.

Save your provisioning scripts in Version Control, to track how they change.

Share your Vagrantfile and scripts with colleagues, so everyone works in an identical environment.

Easily test code in new environments. How will your codebase run in PHP 5.6? Fire up a new machine and test it.

In this post, I’ll walk through setting up a Vagrantfile and shell script to install Apache, MySQL, PHP on a basic Debian machine; automatically download and extract Drupal 8, and create a database for the site. At the end, we’ll just have to walk through the installation steps in a browser.

Configuring the Virtual Machine

The Vagrantfile describes the machine for a project and how to configure and provision it. It controls things like how to setup networking, map files between the host and guest OS, and run the provisioner we chose. Let’s take a look at some key configuration settings. First, we have to specify the box to build on. These are box images, many provided by the community, that have the base OS and some tools. There are boxes for CentOS, Redhat, and more. The line below tells vagrant to use a Debian 7.4 box that has chef installed (even though we won’t be using chef).

config.vm.box = "chef/debian-7.4"

Next, we configure networking so that our box is accessible from the host OS. Here, I’m using the private_network option, so the box and specifying an internal IP address. If you need your virtual machine to be accessible from other devices on your local network, look into using a public_network.

config.vm.network "private_network", ip: "192.168.33.10"

For this config, I’m just using the default shared folders setting. This maps your project directory to /vagrant/ on the guest OS. If you haven’t used Vagrant before, shared folders map one or more directories from your host machine to paths on the virtual machine. In effect, this lets you work and edit files in your favorite IDE on your machine while the guest OS sees and uses the same files. So, for example, you change sites/default/default.settings.php in PHPStorm in your home OS, and Drupal in the VM will process the changes. If you need to map more directories, see the comments in the default Vagrantfile. For example, to map the web directory to /var/www/drupal8, you’d do something like the following:

config.vm.synced_folder "web", "/var/www/drupal8"

Finally, we have to tell vagrant what do once the machine is booted up in order to configure it. This is done with a provisioner, and there are many to choose from. As I mentioned earlier, I’m using a simple shell script for this machine; specified with the config.vm.provision setting.

config.vm.provision "shell", path: "provision/setup.sh"

Setting Up Additional Components

When you start your box for the first time with vagrant up, the provisioner will take care of installing additional components and configuring them. My first pass at this was a bit naive, though it worked. I later came back and augmented it to check if something was already available (you can run vagrant up --provision to re-provision an existing machine). For example, the following part of the script will setup Apache. First, we test if the Apache configuration is present. If it is, we assume it’s already configured and continue. If it’s not present:

It’s installed from the apt repositories,

The Rewrite module is enabled,

We copy some custom environment settings and change directory ownership so that Apache runs as the vagrant user. This is necessary to allow the web server to copy uploaded files into a folder at sites/default/files.

Take a look at the complete script to familiarize yourself with what it does.

Provisioning with Ansible

At the same hackathon, Sandy Smith worked on setting up a similar machine with Ansible. We had a friendly competition going over who’s method would work first, and we both hit a few snags. Learn more about provisioning with Ansible in the September 2014 issue In the end, there wasn’t much of a difference. However, the Ansible provisioner reuses existing “roles” coupled with variables you specify. This leads to much smaller “playbooks” to configure a new machine. It also takes care of tracking what changes have been made if you re-provision a box.

Using the Box

Before using the box, you have to install Vagrant and Virtualbox on your machine. Then clone or download the files from from github: https://github.com/omerida/drupal8-vm. In the directory where the files are use vagrant up to start the machine. The default URL is http://drupal8.dev/ To access it, add the following line to your hosts file:

192.168.33.10 drupal8.dev

Once the box is booted, open a browser and go to “http://drupal8.dev” and you should see the Drupal 8 installation wizard.

What’s next?

This gives you a very basic machine to work on. From here, you could do a number of things:

The Acquia Cloud API makes it easy to manage sites on the platform. The API allows you to perform many administrative tasks including creating, destroying and copying databases, deploying code, managing domains and copying files.

Acquia offers 2 official clients. The primary client is a drush plugin which can only be downloaded from Acquia Insight. The other is a PHP library which states in the README that it is "[n]ot ready for production usage".

On a recent project using WF Tools we needed some pretty advanced deployment scripts for sites hosted on Acquia Cloud. We had tried using a mix of bash and PHP, but that created a maintenance nightmare, so we switched to Python.

I was unable to find a high quality Python library, so I wrote a python client for the Acquia Cloud API. The library implements all of the features that we needed, so there are a few things missing.

Chaining complex commands together is easy because the library implements a fluent interface. An extreme example of what is possible is below:

import acapi
# Instantiate the client
c = acapi.Client('[email protected]', 'acquia-token')
# Copy the prod db to dev, make a backup of the dev db and download it to /tmp
c.site('mysite').environment('prod').db('mysite').copy('dev').backups().create().download('/tmp/backup.sql.gz')

Some of the code is "borrowed" from the Python client for Twilio. The library is licensed under the terms of the MIT license.

I am continuing to develop the library. Consider this a working alpha. Improving error handling, creating a comprehensive test suite and implementing the missing API calls are all on the roadmap. Pull requests are welcome.

This is a repost of an article I wrote for the Acquia Blog some time ago.

As mentioned before, devops can be summarized by talking about culture, automation, monitoring metrics and sharing. Although devops is not about tooling, there are a number of open source tools out there that will be able to help you achieve your goals. Some of those tools will also enable better communication between your development and operations teams.

When we talk about Continuous Integration and Continuous Deployment we need a number of tools to help us there. We need to be able to build reproducible artifacts which we can test. And we need a reproducible infrastructure which we can manage in a fast and sane way. To do that we need a Continuous Integration framework like Jenkins.

Formerly known as Hudson, Jenkins has been around for a while. The open source project was initially very popular in the Java community but has now gained popularity in different environments. Jenkins allows you to create reproducible Build and Test scenarios and perform reporting on those. It will provide you with a uniform and managed way to , Build, Test, Release and Trigger the deployment of new Artifacts, both traditional software and infrastructure as code-based projects. Jenkins has a vibrant community that builds new plugins for the tool in different kinds of languages. People use it to build their deployment pipelines, automatically check out new versions of the source code, syntax test it and style test it. If needed, users can compile the software, triggering unit tests, uploading a tested artifact into a repository so it is ready to be deployed on a new platform level.

Jenkins then can trigger an automated way to deploy the tested software on its new target platform. Whether that be development, testing, user acceptance or production is just a parameter. Deployment should not be something we try first in production, it should be done the same on all platforms. The deltas between these platforms should be managed using a configuration management tool such as Puppet, Chef or friends.

In a way this means that Infrastructure as code is a testing dependency, as you also want to be able to deploy a platform to exactly the same state as it was before you ran your tests, so that you can compare the test results of your test runs and make sure they are correct. This means you need to be able to control the starting point of your test and tools like Puppet and Chef can help you here. Which tool you use is the least important part of the discussion, as the important part is that you adopt one of the tools and start treating your infrastructure the same way as you treat your code base: as a tested, stable, reproducible piece of software that you can deploy over and over in a predictable fashion.

Configuration management tools such as Puppet, Chef, CFengine are just a part of the ecosystem and integration with Orchestration and monitoring tools is needed as you want feedback on how your platform is behaving after the changes have been introduced. Lots of people measure the impact of a new deploy, and then we obviously move to the M part of CAMS.

There, Graphite is one of the most popular tools to store metrics. Plenty of other tools in the same area tried to go where Graphite is going , but both on flexibility, scalability and ease of use, not many tools allow developers and operations people to build dashboards for any metric they can think of in a matter of seconds.

Just sending a keyword, a timestamp and a value to the Graphite platform provides you with a large choice of actions that can be done with that metric. You can graph it, transform it, or even set an alert on it. Graphite takes out the complexity of similar tools together with an easy to use API for developers so they can integrate their own self service metrics into dashboards to be used by everyone.

One last tool that deserves our attention is Logstash. Initially just a tool to aggregate, index and search the log files of our platform, it is sometimes a huge missed source of relevant information about how our applications behave.. Logstash and it's Kibana+ElasticSearch ecosystem are now quickly evolving into a real time analytics platform. Implementing the Collect, Ship+Transform, Store and Display pattern we see emerge a lot in the #monitoringlove community. Logstash now allows us to turn boring old logfiles that people only started searching upon failure into valuable information that is being used by product owners and business manager to learn from on the behavior of their users.

Together with the Graphite-based dashboards we mentioned above, these tools help people start sharing their information and communicate better. When thinking about these tools, think about what you are doing, what goals you are trying to reach and where you need to improve. Because after all, devops is not solving a technical problem, it's trying to solve a business problem and bringing better value to the end user at a more sustainable pace. And in that way the biggest tool we need to use is YOU, as the person who enables communication.

This is a repost of an article I wrote for the Acquia Blog some time ago.

People often ask, why does DevOps matter?

The honest answer to that question is...because having the development and operations team work together is the only way IT is successful.

Over the past few decades I've worked in different environments that include: small web start ups, big pharmaceutical companies, hardware engineering shops and large software companies and banks. All were trying different approaches to deliver quality software to their end users, customers, but most of them were failing badly.

Operations people were being pulled in at the last minute. A marketing campaign needed to go live at 5 p.m. because that's when the first radio commercial was scheduled to be broadcasted. At 11 a.m., the operations people still didn't know the campaign existed.

It was always the other person’s fault. Waterfall projects and large PID documents were the solution to all the problems. But people learned; they figured out that we can't expect humans to predict how long it would take to implement something they have never done before. Unfortunately, even today, only a small set of people understand the value of being agile and that we cannot break a project down to its granular details without factoring in the “unpredictable.” The key element here is the “uncertainty” of the many project pieces.

So on came the agile movement and software development became much smoother.People agreed on time boxing a reasonable set of work that would result in delivering useful functionality in frequent batches. Yet, on the day of deployment, all hell breaks loose because someone forgot to loop in the Ops team.

This is where my personal experience differs from a lot of others, because I was part of a development team building a product where the developers were sitting right next to the system administration team. Within sprints, our DevOps team was building both system features and application features, making the application highly available was a story on the board next to an actual end user feature.

In the old days, a new feature that was scheduled for Friday couldn't be brought online for a couple of days because it couldn't be deployed to production. In the new setup, deploying to production was a no brainer as we had already tested the automated deployment to the acceptance platform.

This brings us to the first benefit : Actually being able to go live.

The next problem came on a Wednesday evening. A major security issue had popped up in Drupal and an upgrade needed to be performed, however nobody dared to perform the upgrade as they were afraid of breaking the site. Some people had made changes, they hadn't put their config back in code base, and thus the site didn't get updated. This is the typical state of the majority of any type of website where people build something, deploy it and never look back. This is the case until disaster strikes and it hits the evening news.

Teams then learn that not only do they need to implement features and put their config changes in code, but also do continuous integration testing on their sites.

From doing continuous integration, they go to continuous delivery and continuous deployment, where an upgrade isn't a risk anymore but a normal event which happens automatically when all the tests are green. By implementing infrastructure as code, they now have achieved 2 goals. By implementing tests, we build the confidence that the code was working, but also made sure that the number of defects in that code base went down so the number of times people needed to dig back into old code to fix issue also came down.

By delivering better software in a much more regular way, it enables the security issues to be fixed faster, but also brings new features to market faster. With faster, we often mean that there is an change from releasing software on a bi-yearly basis to a release each sprint, to a release whenever a commit has passed a number of test criteria.

Because they started to involve other stakeholders, the value of their application grew as they had faster feedback and better usage statistics. The faster feedback meant that they weren't spending as much time on features nobody used, but focusing their efforts on things that mattered.

Having other stakeholders like systems and security teams involved with early metrics and taking in the non functional requirements into the backlog planning meant that the stability of the platform was growing. Rather than people spending hours and nights fixing production problems, Potential issues are now being tackled upfront because of thecommunication between devs and ops. Also, scale and high availability have been built into the application upfront, rather than afterwards -- when it is too late.

So, in the end it comes down to the most important part, which is that devops creates more happiness. It creates more happy customers, developers, operations teams, managers, and investors and for a lot of people it improves not only application quality, but also their life quality.

This is a repost of an article I wrote for the Acquia Blog some time ago.

DevOps, DevOps, DevOps … the whole world is talking about DevOps, but what is DevOps?

Since Munich 2012, DrupalCon had a dedicated devops track. After talking toa lot of people in Prague last month, I realized that the concept of DevOps is still very unclear to a lot of developers. To a large part of the development community, DevOps development still means folks working on 'the infrastructure part' of the development life cycle and for some it just means simply deploying Drupal, being concerned about purely keeping the site alive etc.

Obviously that's not what DevOps is about, so let's take a step back and find out how it all started.

Like all good things, Drupal included, DevOps is a Belgian thing!

Back in 2009 DevopsDays Europe was created because a group of people met over and over again at different conferences throughout the world and didn’t have a common devops conference to go to. These individuals would talk about software delivery, deployment, build, scale, clustering, management, failure, monitoring and all the important things one needs to think about when running a modern web operation. These folks included Patrick Debois, Julian Simpson, Gildas Le Nadan, Jezz Humble, Chris Read, Matt Rechenburg , John Willis, Lindsay Holmswood and me - Kris Buytaert.

O’Reilly created a conference called, “Velocity,” and that sounded interesting to a bunch of us Europeans, but on our side of the ocean we had to resort to the existing Open Source, Unix, and Agile conferences. We didn't really have a common meeting ground yet. At CloudCamp Antwerp, in the Antwerp Zoo, I started talking to Patrick Debois about ways to fill this gap.

Many different events and activities like John Allspaw and Paul Hammond’s talk at “Velocity”, multiple twitter discussions influenced Patrick to create a DevOps specific event in Gent, which became the very first ‘DevopsDays'. DevopsDays Gent was not your traditional conference, it was a mix between a couple of formal presentations in the morning and open spaces in the afternoon. And those open spaces were where people got most value. The opportunity to talk to people with the same complex problems, with actual experiences in solving them, with stories both about success and failure etc. How do you deal with that oldskool system admin that doesn’t understand what configuration management can bring him? How do you do Kanban for operations while the developers are working in 2 week sprints? What tools do you use to monitor a highly volatile and expanding infrastructure?

From that very first DevopsDays in Gent several people spread out to organize other events John Willis and Damon Edwards started organizing DevopsDays Mountain View, and the European Edition started touring Europe. It wasn’t until this year that different local communities started organizing their own local DevopsDays, e.g in Atlanta, Portland, Austin, Berlin, Paris, Amsterdam, London, Barcelona and many more.

From this group of events a community has grown of people that care about bridging the gap between development and operations, a community of people that cares about delivering holistic business value to their organization.

As a community, we have realized that there needs to be more communication between the different stakeholders in an IT project lifecycle - business owners, developers, operations, network engineers, security engineers – everybody needs to be involved as soon as possible in the project in order to help each other and talk about solving potential pitfalls ages before the application goes live. And when it goes live the communication needs to stay alive too.. We need to talk about maintaining the application, scaling it, keeping it secure . Just think about how many Drupal sites are out there vulnerable to attackers because the required security updates have never been implemented. Why does this happen? It could be because many developers don't try to touch the site anymore..because they are afraid of breaking it.

And this is where automation will help.. if we can do automatic deployments and upgrades of a site because it is automatically tested when developers push their code, upgrading won't be that difficult of a task. Typically when people only update once in 6 months, its a painful and difficult process but when its automated and done regularly, it makes life so much easier.

This ultimately comes down to the idea that the involvement of developers doesn’t end at their last commit. Collaboration is key which allows every developer to play a key role in keeping the site up and running, for more happy users. After all software with no users has no value. The involvement of the developers in the ongoing operations of their software shouldn't end before the last end user stops using their applications.

In order to keep users happy we need to get feedback and metrics, starting from the very first phases of development all the way up to production. It means we need to monitor both our application and infrastructure and get metrics from all possible aspects, with that feedback we can learn about potential problems but also about successes.

Finally, summarizing this in an acronym coined by John Willis and Damon Edwards- CAMS. CAMS says Devops is about Culture, Automation, Measurement and Sharing.Getting the discussion going on how to do all of that, more specifically in a Drupal environment, is the sharing part .

Alias Directory

You can place the aliases.drushrc file either in the 'sites/all/drush' directory or your global environment drush folder (eg. /home/username/.drush) and the naming convention is 'group.aliases.drushrc.php' so I normally use 'project.aliases.drushrc.php' or 'client.aliases.drushrc.php' to group related sites.

Over the past few years we've moved away from using Subversion (SVN) for version control and we're now using Git for all of our projects. Git brings us a lot more power, but because of its different approach there are some challenges as well.

Git has powerful branching and this opens up new opportunities to start a new branch for each new ticket/feature/client-request/bug-fix. There are several different branching strategies: Git Flow is common for large ongoing projects, or we use a more streamlined workflow. This is great for client flexibility — a new feature can be released to production immediately after it's been approved, or you can choose to bundle several features together in an effort to reduce the time spent running deployments. Regardless of what branching model you choose you will run into the issue where stakeholders need to review and approve a branch (and maybe send it back to developers for refinement) before it gets merged in. If you've got several branches open at once that means you need several different dev sites for this review process to happen. For simple tasks on simple sites you might be able to get away with just one dev site and manually check out different branches at different times, but for any new feature that requires database additions or changes that won't work.

Another trend in web development over the past few years has been to automate as many of the boring and repetitive tasks as possible. So we've created a Drush command called Site Clone that can do it all with just a few keystrokes:

Copies the codebase (excluding the files directory) with rsync to a new location.

Creates a new git branch (optional).

Creates a new /sites directory and settings.php file.

Creates a new files directory.

Copies the database.

Writes database connection info to a global config file.

It also does thorough validation on the input parameters (about 20 different validations for everything from checking that the destination directory is writable, to ensuring that the name of the new database is valid, to ensuring that the new domain can be resolved).

There are a few other things that we needed to put in place in order to get this working smoothly. We've set up DNS wildcards so that requests to a third-level subdomain end up where we want them to. We've configured Apache with a VirtualDocumentRoot so that requests to new subdomains get routed to the appropriate webroot. Finally we've also made some changes to our project management tool so that everyone knows which dev site to look at for each ticket.

Once you've got all the pieces of the puzzle you'll be able to have a workflow something like:

Stakeholder requests a new feature (let's call it foo) for their site (let's call it bar.com).

Developer clones an existing dev site (bar.advomatic.com) into a new dev site (foo.bar.advomatic.com) and creates a new branch (foo).

Developer implements the request.

Stakeholder reviews on the branch dev site (foo.bar.advomatic.com). Return to #3 if necessary.

Merge branch foo + deploy.

@todo decommission the branch site (foo.bar.advomatic.com).

Currently that last step has to be done manually. But we should create a corresponding script to clean-up the codebase/files/database/branches.

Most experienced Drupal developers have worked on sites with security issues. Keeping modules up to date is a pretty straight forward process, especially if you pay attention to the security advisories. Coder can find security issues in your code.

Recently I needed to perform a mass security audit to ensure a collection of sites were properly configured. After searching and failing to find a module that would do what I needed, I decided to write my own. The Security Check module for Drupal checks basic configuration options to ensure a site configuration doesn't have any obvious security flaws. This module isn't designed to find all flaws in your site.

Security Check works by checking a list of installed modules, settings for variables and permission assignments. I hope that others in the community will have suggestions for other generic tests that can be implemented in the module. If you have any ideas (or patches), please submit them as issues in the queue.

This module isn't a substitute for a full security audit, which can be conducted in house or by a third party such as Acquia's Professional Service team. Security Check is designed to be run as part of an automated site audit to catch low hanging fruit.

To use Security Checker install it in ~/.drush and then cd to any docroot and run "drush security-check" or if you prefer to use an alias run "drush @example secchk" from anywhere.

Until we can override the enabled module/theme/library list dynamically in Drupal 8 via configuration, we can bundle up our environment-specific development modules and Strongarm variables in a feature and enable the feature on a per-environment basis in settings.php (or even better, local.settings.php) via a project called Environment Modules. For example, one could create a feature for the development environment, enable the devel module, and set environment-specific Strongarm variables such as those that leave Drupal core caching disabled or environment-specific settings for Domain Access domains. Additionally, whenever refreshing the development, integration or staging environment databases from the production database as part of a release cycle, doing a simple Features revert reenables the correct environment-specific modules and sets the right environment-specific variables.

As the Environment Modules project states, $conf[‘environment_modules’] should not be set on your production site in its settings.php, however the module itself can still remain enabled on production without a performance hit — it just doesn’t do anything in that particular environment.

My last post has been a while ... in that I announced that there would be another event right before FOSDEM ... I totally forgot to announce it here but I`m sure that most of you already know. Yes. PuppetCamp Europe is coming back to it's roots... it's coming back to the city where we hosted it for the first time on this side of the ocean.. Gent. (that's 31/1 and 1/2 )

There is still time to register for the event http://puppetcampghent2013.eventbrite.com/ The schedule for the event will be published soonish (given that the selection was done on Friday evening and the speakers already received their feedback)

Co-located with PuppetCamp there will another Build and Open Source cloud dayBuild a Cloud day with interesting topics such as Cloudstack, Ceph, devops and a really interesting talk on how the Spotify crowd is using Cloudstack.

So after those 2 days in Ghent, a lot of people will be warmed up for the open source event of the year FOSDEM.

And right after FOSDEM a bunch of people will gather at the Inuits office for 2 days of discussing, hacking and evangelizing around #monitoringlove (see previous post)

I almost forgot but even before the FOSDEM week-long there is the 2013 PHP Benelux Conference where I`ll be running a fresh version of the 7 Tools for your devops stack

There is a ****load of #DevopsDays events being planned this year .... the 2012 edition of New York will be taking place next week .Austin and London have been announced and have opened up their CFP and Registration but different groups are organizing themselves to host events in Berlin, Mountain View, Tokyo, Barcelona, Paris, Amsterdam , Australia , Atlanta and many more ..

And there's even more to come .. April 6 and 7 will be the dates for the Linux Open Administration Days (Loadays 2013) in Antwerp again ... a nice small conference where people gather to discuss different interesting Linux topics .... Call For Presentations is still open ..Submit here

On the other side of the ocean there's DrupalCon Portland which once again is featuring a #devops track , and also the folks over at Agile 2013 (Nashville)have a #devops track now. Both events are still looking for speakers ..

So if by the end of this year you still don't know what devops is all about .. you probably don't care and shouldn't be in the IT industry anyhow.

And those are only the events I`m somehow involved in for the next couple of months

While heading back home from DrupalCon Munich after 4 days of good interaction with lots of Drupal folks.I realized to my big suprise that there are a lot of people using Vagrant to make sure that developers are not working on platforms they invented their own. Lots of people have realized that "It works on my computer" is not something they want to hear from a developer and are reaching out to give them viable solutions to work on shared and reproducible solutions.

During the Vagrant BOF , I briefly ran over @patrickdebois old slides on Vagrant after which people started discussing their use cases.. 2 other projects came up

First is Project Oscar which aims at providing developers with a default Drupal development environment in a Jiffy. they do this by providing a bunch of puppetmanifests that sets up a working environment.

And the second one is Ariadne which is a standardized virtual machine development evironment for easily developing Drupal sites in a local sandbox that is essentially identical to a fully-configured hosted solution. It attempts to emulate a dedicated Acquia/Pantheon server as closely as possible, with added development tools. Project Ariadne is just like the examples from Jochen Lillich based on Chef

With all of these tools and examples around , there should be no excuses anymore for Drupal developers to hack on their own machine and tell the systems people "It works on my machine" (let alone to hack in production).

3+ months is probably the biggest timeout I've taken from blogging in a while..Not that I didn't have anything to write ..but more that I was prioritizing writing different content overover writing blogposts.

Blogging tech snippets and contributing documentation used to be one now all of that has evolved.Anyhow ..

So to get things going here's my preliminary Conference schedule for the next couple of months.

Next up .. content ... on how monitoring tools still suck .. and I`m still not sure wether a certification program is relevant for open source consultants ..

There’s a new kid on the block in the configuration management world that claims to be lean, simple and easy to understand. We’re using it internally for some exciting new projects, and have found that it lives up to it’s promise.

Ansible’s goal is to unify two similar but traditionally separate tools: configuration management and deployment. Instead of using a combination of Puppet + Capistrano or Chef + Fabric, you can now use a single tool to update your configuration, deploy new code or execute ad-hoc tasks against groups of servers. There is no requirement to setup any daemons or any software at all on the target servers - as long as you can reach them over SSH, you’re good. There is also very little setup on the machine running Ansible, since it only relies on a few standard Python libraries. There’s a detailed comparison page in the FAQ that compares Ansible to these various other tools.

To illustrate some of the features, we’ve created a playbook that sets up an Ubuntu 12.04 server as a multi-user LAMP developer environment. Our old method of creating a server would be to perform these steps once, document them on a wiki, and then anytime we want another similar server, clone the first one or repeat the steps manually. This procedure is fragile over time and in the case of cloning, virtualization-platform specific. It’s far better to create a reusable, automated set of steps that can be tweaked and improved, and executed against any fresh Ubuntu 12.04 server, running anywhere. The example is documented and fully functional, although it’s not 100% secured so be careful.

Playbooks

The tasks that need to be performed are recorded in YAML formatted Ansible playbooks, which are then executed against the target host. The main components of a ‘play’ are:

Hosts against which the actions are performed.

Variables that can be used in the play or in file templates.

Actions that will be performed when the play runs.

Handlers that respond to change events from the actions.

Here is an abbreviated example from our server build playbook file, setup.yml, that will apply php.ini and restart Apache if the file has changed:

The first two directives, ‘host’ and ‘user’, are easy - they specify that the play should by default run against all hosts, as the root user. In the following sections we’ll cover what the others do.

Variables

The vars_files section lists files that will be imported into the current play. The variables specified in these files are then available to use as value substitution or logic control in both the play and in the templates.

In our case, the settings-default.yml file contains a list of configuration variables for the services like Apache, MySQL, etc. They’re all organized into their respective sections, but looking at the file as a whole we get a great birds-eye view of how the play will be modifying the default server configuration. Here is an excerpt:

Any value that will be modified from it’s default out-the-box state is recorded in this one place. This makes changing settings really easy - no more grepping through thousands of lines of configuration in the master template file.

Tasks & Handlers

These are the meat of the playbook. Each task is given a descriptive name and an action. In the example above, the action is ‘template’ and the parameters ‘src’ and ‘dest’ point to a source template on the local host, and a destination location on the target host, respectively.

To execute this step, the Ansible ‘template’ module will be invoked, passing the etc-php5-apache2-php-ini.j2 file through the Jinja2 templating engine, which performs variable substitutions in the appropriate place. For example, in this template we can insert the value of the variable php_max_execution_time (sourced from settings-default.yml above) in it’s correct place:

The great thing about Ansible modules is that they’re idempotent, meaning no changes will be made if the new file is identical to the old one. And since we have this data available, we can trigger events when changes are actually made. That’s what the ‘notify’ section does - when Ansible detects this file has changed, it will call ‘Restart Apache’, which is defined at the end of the play and does exactly what you might think it does.

Modules

When Ansible executes a task, it creates an SSH connection to the target server and copies the required module over. The module is then executed on the target with the correct parameters, and all the module has to do is return a JSON object containing pass/fail and other optional status information.

This has several advantages: for one, modules can be written in any scripting language, as long as the target can execute that code from the shell. It also means your modules can be quite sophisticated, and since they’re running locally on the target server as opposed to sending individual commands over the wire, they run fast.

There are git, apt, yum and service modules, just to name a few. Developing extra modules is easy and there is a growing collection of ‘contrib’ modules that come from the community but are not part of core Ansible.

Command mode

In addition to creating playbooks, you can also execute ad-hoc tasks against your servers using the exact same modules and syntax. For example, I could execute from the command line:

ansible webservers -m shell -a '/sbin/reboot now'

That would reboot all servers defined in the ‘webservers’ pool. This shared syntax and configuration is one of Ansible’s strengths, we can reuse a single server cluster specification for all tasks.

Project Status

Michael DeHaan is the project lead, and he has a lot of experience in this area, having worked at Red Hat & Puppet Labs, co-created Func and developed Cobbler.

Ansible is still very new, having only been released this year. However, it’s already used in production and there is a strong community forming. Michael announced recently that he’d written almost zero code in the upcoming 0.5 release, a sure sign of good community momentum.

Resources

Check out the pedantically commented playbook example, that covers almost all the features in one place.
The Ansible mailing list is active and friendly.

Devopsdays Mountainview sold out in a short 3 hours .. but there's other events that will breath devops this summer.DrupalCon in Munich will be one of them ..

Some of you might have noticed that I`m cochairing the devops track for DrupalCon Munich,The CFP is open till the 11th of this month and we are still actively looking for speakers.

We're trying to bridge the gap between drupal developers and the people that put their code to production, at scale.But also enhancing the knowledge of infrastructure components Drupal developers depend on.

We're looking for talks both on culture (both success stories and failure) , automation,specifically looking for people talking about drupal deployments , eg using tools like Capistrano, Chef, Puppet,We want to hear where Continuous Integration fits in your deployment , do you do Continuous Delivery of a drupal environment.And how do you test ... yes we like to hear a lot about testing , performance tests, security tests, application tests and so on.... Or have you solved the content vs code vs config deployment problem yet ?

How are you measuring and monitoring these deployments and adding metrics to them so you can get good visibility on bothsystem and user actions of your platform. Have you build fancy dashboards showing your whole organisation the current state of your deployment ?

We're also looking for people talking about introducing different data backends, nosql, scaling different search backends , building your own cdn using smart filesystem setups.Or making smart use of existing backends, such as tuning and scaling MySQL, memcached and others.

So lets make it clear to the community that drupal people do care about their code after they committed it in source control !

I`m parsing the responses of the Deploying Drupal survey I started a couple of months ago (more on that later)

One of the questions in the survey is "What is devops" , apparently when you ask a zillion people (ok ok, just a large bunch of Tweeps..), you get a large amount of different answers ranging from totally wrong to spot on.

So let's go over them and see what we can learn from them ..

The most Wrong definition one can give is probably :

A buzzword

I think we've long passed the buzzword phase, definitely since it's not new, it's a new term we put to an existing practice. A new term that gives a lot of people that were already doing devops , a common word to dicuss about it. Also lots of people still seem to think that devops is a specific role, a job description , that it points to a specific group of people doing a certain job, it's not . Yes you'll see a lot of organisations looing for devops people, and giving them a devops job title. But it's kinda hard to be the only one doing devops in an organisation.

I described one of my current roles as Devops Kickstarter, it pretty much describes what I`m doing and it does contain devops :)

But devops also isn't

The connection between operations and development.

people that keep it running

crazy little fellows who find beauty in black/white letters( aka code) rather than a view like that of Taj in a full moon light.

the combination of developer and operations into one overall functionality

The perfect mixture between a developer and a system engineer. Someone who can optimize and simplify certain flows that are required by developers and system engineers, but sometimes are just outside of the scope for both of them.

Proxy between developer and management

The people in charge of the build/release cycle and planning.

A creature, made from 8-bit cells, with the knowledge of a seasoned developer, the skillset of a trained systems engineer and the perseverence of a true hacker.

The people filling the gap between the developer world and the sysadmin world. They understand dev. issues and system issues as well. They use tools from both world to solve them.

Or

Developers looking at the operations of the company and how we can save the company time and money

And it's definitely not

Someone who mixes both a sysop and dev duties

developers who know how to deploy and manage sites, including content and configuration.

I believe there's a thin line line between Ops and Devs where we need to do parts of each others jobs (or at least try) to reach our common goal..

A developer that creates and maintains environments tools to help other developers be more successful in building and releasing new products

Developers who also do IT operations, or visa versa.

Software developers that support development teams and assist with infrastructure systems

So no, developers that take on systems roles next to their own role and want to go for NoOps isn't feasable at all ..you really want collaboration, you want people with different skillsets that (try to) understand eachoter and (try to) work together towards a common goal.

Devops is also not just infrastructure as code

Writing software to manage operations

system administrators with a development culture.

Bring code management to operations, automating system admin tasks.

The melding of the art of Systems Administration and the skill of development with a focus on automation. A side effect of devops is the tearing down of the virtual wall that has existed between SA's and developers.

Infrastructure as code.

Applying some of the development worlds techniques (eg source control, builds, testing etc) to the operations world.

Code for infrastructure

Sure infastructure as code is a big part of the Automation part listed in CAMS, but just because you are doing puppet/chef doesn't mean you are doing devops.Devops is also not just continous delivery

A way to let operations deploy sites in regular intervals to enable developers to interact on the systems earlier and make deployments easier.

Devops is the process of how you go from development to release.

Obviously lots of people doing devops also often try to achieve Continuous delivery, but just like Infrastructure as Code it devops is not limited to that :)

But I guess the truth is somewhere in the definitions below ...

That sweet spot between "operating system" or platform stack and the application layer. It is wanting sys admins who are willing to go beyond the normal package installers, and developers who know how to make their platform hum with their application.

Breaking the wall between dev and ops in the same way agile breaks the wall between business and dev e.g. coming to terms with changing requirements, iterative cycles

Not being an arsehole!

Sysadmin best-practise, using configuration as code, and facilitating communication between sysadmins and developers, with each understanding and participating in the activities of the other.

Devops is both the process of developers and system operators working closer together, as well as people who know (or who have worked in) both development and system operations.

Culture collaboration, tool-chains

Removing barriers to communication and efficiency through shared vocabulary, ideals, and business objectives to to deliver value.

A set of principles and good practices to improve the interactions between Operations and Development.

Collaboration between developers and sysadmins to work towards more reliable platforms

Building a bridge between development and operations

The systematic process of building, deploying, managing, and using an application or group of applications such as a drupal site.

Devops is collaboration and Integration between Software Development and System Administration.

Devops is an emerging set of principles, methods and practices for communication, collaboration and integration between software development (application/software engineering) and IT operations (systems administration/infrastructure) professionals.[1] It has developed in response to the emerging understanding of the interdependence and importance of both the development and operations disciplines in meeting an organization's goal of rapidly producing software products and services.

The cultural extension of agile to bring operations into development teams.

Tight collaboration of developers, operations team (sys admins) and QA-team.

But I can only conclude that there is a huge amount of evangelisation that still needs to be done, Lots of people still don't understand what devops is , or have a totally different view on it.

A number of technology conferences are and have taken up devops as a part of their conference program, inviting experienced people from outside of their focus field to talk about how they improve the quality of life !

There is still a large number of devops related problems to solve, so that's what I`ll be doing in 2012

Devops is gaining momentum, the idea that developers and operations should work much closer together , the idea that one should automate as much as possible in both their infrastructure and their release process brings along a lot of questions, ideas and tools that need to be integrated in your daily way of working.

Drupal has one of the biggest development communities in the open source world, being part of both communities We are trying to bridge the gap,

At Inuits we are building tools and writing best practices to close the gap, but we are not alone in this world and we would like to gather some feedback on how other people are deploying, and managing their Drupal environments

Working with Drupal, build with Drupal in mind .. how do you release your sites .. That's what we are trying to figure out ... for everybody else to learn from

Oh and you can win some items of our brand new fashion line !

The survey is here , please spend a bit of your time helping us to better understand the needs of the community

For those who haven't noticed yet .. I`m into devops .. I`m also a little bit into Drupal, (blame my last name..) , so one of the frustrations I've been having with Drupal (an much other software) is the automation of deployment and upgrades of Drupal sites ...

So for the past couple of days I've been trying to catch up to the ongoing discussion regarding the results of the configuration mgmt sprint , I've been looking at it mainly from a systems point of view , being with the use of Puppet/ Chef or similar tools in mind .. I know I`m late to the discussion but hey , some people take holidays in this season :) So below you can read a bunch of my comments ... and thoughts on the topic ..

First of all , to me JSON looks like a valid option.
Initially there was the plan to wrap the JSON in a PHP header for "security" reasons, but that seems to be gone even while nobody mentioned the problems that would have been caused for external configuration management tools.
When thinking about external tools that should be capable of mangling the file plenty of them support JSON but won't be able to recognize a JSON file with a weird header ( thinking e.g about Augeas (augeas.net) , I`m not talking about IDE's , GUI's etc here, I`m talking about system level tools and libraries that are designed to mangle standard files. For Augeas we could create a separate lens to manage these files , but other tools might have bigger problems with the concept.

As catch suggest a clean .htaccess should be capable of preventing people to access the .json files There's other methods to figure out if files have been tampered with , not sure if this even fits within Drupal (I`m thinking about reusing existing CA setups rather than having yet another security setup to manage) ,

In general to me tools such as puppet should be capable of modifying config files , and then activating that config with no human interaction required , obviously drush is a good candidate here to trigger the system after the config files have been change, but unlike some people think having to browse to a web page to confirm the changes is not an acceptable solution. Just think about having to do this on multiple environments ... manual actions are error prone..

Apart from that I also think the storing of the certificates should not be part of the file. What about a meta file with the appropriate checksums ? (Also if I`m using Puppet or any other tool to manage my config files then the security , preventing to tamper these files, is already covered by the configuration management tools, I do understand that people want to build Drupal in the most secure way possible, but I don't think this belongs in any web application.

When I look at other similar discussions that wanted to provide a similar secure setup they ran into a lot of end user problems with these kind of setups, an alternative approach is to make this configurable and or plugable. The default approach should be to have it enable, but the more experienced users should have the opportunity to disable this, or replace it with another framework. Making it plugable upfront solves a lot of hassle later.

Someone in the discussion noted :
"One simple suggestion for enhancing security might be to make it possible to omit the secret key file and require the user to enter the key into the UI or drush in order to load configuration from disk."

Requiring the user to enter a key in the UI or drush would be counterproductive in the goal one wants to achieve, the last thing you want as a requirement is manual/human interaction when automating setups. therefore a feature like this should never be implemented

Luckily there seems to be new idea around that doesn't plan on using a raped json fileinstead of storing the config files in a standard place, we store them in a directory that is named using a hash of your site's private key, like sites/default/config_723fd490de3fb7203c3a408abee8c0bf3c2d302392. The files in this directory would still be protected via .htaccess/web.config, but if that protection failed then the files would still be essentially impossible to find. This means we could store pure, native .json files everywhere instead, to still bring the benefits of JSON (human editable, syntax checkable, interoperability with external configuration management tools, native + speedy encoding/decoding functions), without the confusing and controversial PHP wrapper.

Figuring out the directory name for the configs from a configuration mgmt tool then could be done by something similar to

cd sites/default/conf/$(ls sites/default/conf|head -1)

In general I think the proposed setup looks acceptable , it definitely goes in the right direction of providing systems people with a way to automate the deployment of Drupal sites and applications at scale.

I`ll be keeping a eye on both the direction they are heading into and the evolution of the code !

A couple of weeks ago I noticed a weird drop in web usage stats on the site you are browsing now. Kinda weird as the drop was right around Fosdem when usually there is a spike in traffic.

So before you start.. no I don't preach on practice on my own blog, it's a blog dammit, so I do the occasional upgrades on the actual platform , with backups available, do some sanity tests and move on, yes I break the theme pretty often but ya'll reading this trough RSS anyhow.

My backups showed me that drush had made a copy of the Piwik module somewhere early february, exactly when this drop started showing. I verified the module , I verified my Piwik , - Oh Piwik you say .. yes Piwik, if you want a free alternative to Google Analytics , Piwik rocks .. - I even checked other sites using the same piwik setup and they were all still functional happily humming and being analyzed.... everything fine ... but traffic stayed low ..

So that brings me to the point I`m actually wanting to make...
as according to @patrickdebois in his chapter on Monitoring "Quis custodiet ipsos custodes?" who's monitoring the monitoring tools, who's monitoring the analytics tools,

So not only should you monitor the availability of yor monitoring tools, you should also monitor if their api hasn't changed in some way or another.
Just like when you are monitoring an web app you shoulnd't just see if you can connect to the appropriate http port, but you should be checking if you get sensible results back from it , no gibberish.

Next weekend saturday I`ll be giving a talk about devops at StartUp Weekend Brussels, from what I've read so far it promises to be an audience that needs the talk,

The week after I`ll be speaking at the DrupalDevDays, again about devops , however this time with a touch of Drupal , giving a devops talk at Devoxx last year to a Java audience learned me that the devops evangelist need to go outside of their usual conference audiences and als talk to the people that are usually in the other silos.

If you are around at one of these confs and you want to talk Devops, Clustering, sipx or just have a beer .. don't hesitate ! There's already plenty of people promising me beers , and some even sushi :)

When I first started out giving talks about devops , I realized that I was preaching to the choir, some Barcamps, the Keynote at Loadays , the Dutch Unix Usergroup etc .. lots of people in the audiences knew about the pains we were trying to solve, lots of them already knew some of the tools we use and lots of them already talk a lot with their developers or are part of the deveoplment teams

With our Devoxx talk, Patrick and I started to talk to a different audience , the Java devs , and it was great, we all learned from it. With that experience in mind I submitted a variation of the talk to an audience that is also very important to me ... the Drupal Community .

Devops is gaining importance , while we been practicing devops methodologies since ever, now even the big analyst companies etc are writing and talking about the movement, the drupal community really should also get involved.

So if you care about devops, about devs and ops working together, about continuous integration, continuous deployment, configuration mangement, automation, monitoring and scale, if you've heard about all of the above but have no clue what Puppet, Hudson or Fabric can do for you , vote here for my proposed talk at Drupalcon Chicago,

So John wrote down his experiences on deploying Drupal sites with Puppet .

It's not a secret that I've been thinking about similar stuff and how I could get to the best possible setup.

John starts of with using Puppet to download Drush... while I want to use rpm for that ...

I want my core infrastructure to be fully packaged... not downloaded and untarred. I want to be able to reproduce my platform in a couple of months , with the exact same versions I`m using now .. not with the version that happens to be on ftp.drupal.org at that point in time, or with ftp.drupal.org being down.

Now the next question off course is what's the core infrastructure.
Where does the infrastructure end and does the application start. There's little discussion about having a puppet created vhost , an apache conf.d file, a matching .htaccess file if wanted , and the appropriate settings.php for a multisite drupal config.

There's also little doubt to me on using drush to run the updates, manage the drupal site etc . Reading John's article made me think some further about what and when I want things packaged.

John's post lead to a discussion on #infra-talk on getting all drupal modules packaged for Centos with Karan and some others

In a development environment I probably want to have periodic drush updates getting the latest modules from the interwebs and potentially breaking my devs code. But making sure that when you put a site in production it will be on a fairly up to date platform, and not on the platform you started developing on 24 months ago.

In a production environment however you only want tested updates of your modules as indeed they will break code.

It's probably going to be a mix and match setup having a local rpm/deb repo with packaged modules that have been tested and validated in your setup and using drush to enable or configure them for that production setup.

But also having a CI environment wher Drush will get the new modules from the interwebs when needed. and package them for you.

To me that sounds beter than getting all the available Drupal modules and packaging them, even automated, and preparing a repository of those modules of which only a small percentage will actually be used by people.