wait 20 minutes (at best) to find out if the tests pass or fail with your patch

If the tests fail, then you get to do it all over again, including making possible revisions to the issue summary.

Ever wanted to run the tests while iterating on a change, but are reluctant due to:

insufficient local machine resources or configuration to run tests efficiently

inconsistent results compared to the official testbot

disruption of your 'creative coding moments' due to the patch submission workflow

long test runtime on qa.d.o

Well you may be in luck.

What if you could type "drush testbot" to have the changes in your working branch submitted and tested as a patch against Drupal? And be able to view the results in five minutes or less? If this sounds interesting, then install the "drush testbot" command file (see instructions) and take it for a spin.

Suggested usage

If your changes only affects a single module (or a few), then an assessment of your changes can often be had by simply running the tests for that module (or those modules). Once you know that your changes do not break the immediate context, then the entire test suite can be run before posting the patch on the issue queue.

You can do so by including a list of test classes to run (see examples); in so doing you might reduce your response time to a matter of seconds.

Drupal 8 examples

Run the command from the working directory with your code changes. The default test environment for Drupal 8 is mysql 5.5 and php 5.4. In lieu of setting properties on the command line, add them to a '.testbot' file in the repository root directory (i.e. the same directory that contains the .git directory).

Drupal 7 examples

The default test environment for Drupal 7 is mysql 5.5 and php 5.3. The command syntax matches that shown above with the addition of the 'branch:7.x' parameter to the properties. An example of testing two classes is:

Instructions

Caveat

If our test queue is empty when you submit a patch (and do not specify a list of classes to test), the actual response time for a Drupal 8 patch will be under ten minutes based on existing test suite and available hardware at the time of this writing. If the queue is full, we may not run your patch and will let you know.

Final thoughts

If you have questions, would like to offer constructive comments or suggestions, or want to let us know that you found the tool useful, please post in the github issue queue.

The testbot is provided as a complimentary service to the Drupal community. No financial assistance is received from the Drupal Association (to defray even the hardware costs). Unlike the current testbot which runs on the most powerful compute instances offered by Amazon Web Services, these tests are run on hardware with roughly 25% the processing power yet return results in under half the time.

Tags:

For those not familiar with me, a little research should make it clear that I am the person behind the testbot deployed in 2008 that has revolutionized Drupal core development, stability, etc. and that has been running tens of thousands of assertions with each patch submitted against core and many contributed modules for 6 years.

My intimate involvement with the testbot came to a rather abrupt and unintended end several years ago due to a number of factors (which only a select few members of this community are clearly aware). After several potholes, detours, and bumps in the road, it became clear to me the impossibility of maintaining and enhancing the testbot under the policies and constraints imposed upon me.

Five years ago we finished writing an entirely new testing system, designed to overcome the technical obstacles of the current testbot and to introduce new features that would enable an enormous improvement in resource utilization that could then be used for new and more frequent QA.

Five years ago we submitted a proposal to the Drupal Association and key members of the community for taking the testbot to the next level, built atop the new testing system. This proposal was ignored by the Association and never evaluated by the community. The latter is quite puzzling to me given:

the importance of the testbot

the pride this open source community has in openly evaluating and debating literally everything (a healthy sentiment especially in the software development world)

I had already freely dedicated years of my life to the project.

The remainder of this read will:

list some of the items included in our proposal that were dismissed with prejudice five years ago, but since have been adopted and implemented

compare the technical merits of the new system (ReviewDriven) with the current testbot and a recent proposal regarding "modernizing" the testbot

provide an indication of where the community will be in five years if it does nothing or attempts to implement the recent proposal.

This read will not cover the rude and in some cases seemingly unethical behavior that led to the original proposal being overlooked. Nor will this cover the roller coaster of events that led up to the proposal. The intent is to focus on a technical comparison and to draw attention to the obvious disparity between the systems.

About Face

Things mentioned in our proposal that have subsequently been adopted include:

paying for development primarily benefiting drupal.org instead of clinging to the obvious falacy of "open source it and they will come"

paying for machine time (for workers) as EC2 is regularly utilized

utilizing proprietary SaaS solutions (Mollom on groups.drupal.org)

automatically spinning up more servers to handle load (e.g. during code sprints) which has been included in the "modernize" proposal

Comparison

The following is a rough, high-level comparison of the three systems that makes clear the superior choice. Obviously, this comparison does not cover everything.

Will butcher a variety of systems from their intended purpose and attempt to have them all communicate

Adds a number of extra levels of communication and points of failure

Minimal custom PHP code and Drupal

Uses commonly understood contrib code like Views

Maintainability

Learning curve but all PHP

Languages and tools not common to Drupal site building or maintenance

Vast array of systems to learn and the unique ways in which they are hacked

Less code to maintain and all familiar to Drupal contributors

Speed

Known; gets slower as test suite grows due to serial execution

Still serial execution and probably slower than current as each separate system will add additional communication delay

An order of magnitude faster thanks to concurrent execution

Limited by the slowest test case

*See below

Extensibility (Plugins)

Moderately easy, does not utilize contrib code so requires knowledge of current system

Several components, one on each system used

New plugins will have to be able to pass data or tweak any of the layers involved which means writing a plugin may involve a variety of languages and systems and thus include a much wider breadth of required knowledge

Capable of displaying partial results as they are concurrently streamed in from the various workers

Speed and Resource Utilization

Arguably one of the most important advantages of the ReviewDriven system is concurrency. Interestingly, after seeing inside Google I can say this approach is far more similar to the system Google has in place than Jenkins or anything else.

Systems like Jenkins and especially travis-ci, which for the purpose of being generic and simpler, do not attempt to understand the workload being performed. For example Travis simply asks for commands to execute inside a VM and presents the output log as the result. Contrast that with the Drupal testbot which knows the tests being run and what they are being run against. Why is this useful? Concurrency.

Instead of running all the test cases for a single patch on one machine, the test cases for a patch may be split out into separate chunks. Each chunk is processed on a different machine and the results are returned to the system. Because the system understands the results it can reassemble the chunked results in a useful way. Instead of an endlessly growing wait time as more tests are added and instead of having nine machines sitting idle while one machine runs the entire test suite all ten can be used on every patch. The wait time effectively becomes the time required to run the slowest test case. Instead of waiting 45 minutes one would only wait perhaps 1 minute. The difference becomes more exaggerated over time as more tests are added.

In addition to the enormous improvement in turnaround time which enables the development workflow to process much faster you can now find new ways to use those machine resources. Like testing contrib projects against core commits, or compatibility tests between contrib modules, or retesting all patches on commit to related project, or checking what other patches a patch will break (to name a few). Can you even imagine? A Drupal sprint where the queue builds up an order of magnitude more slowly and runs through the queue 40x faster?

Now imagine having additional resources automatically started when the need arises. No need to imagine...it works (and did so 5 years ago). Dynamic spinning up of EC2 resources which could obviously be applied to other services that provide an API.

This single advantage and the world of possibility it makes available should be enough to justify the system, but there are plenty more items to consider which were all implemented and will not be present in the proposed initiative solution.

Five Years Later

Five years after the original proposal, Drupal is left with a testbot that has languished and received no feature development. Contrast that with Drupal having continued to lead the way in automated testing with a system that shares many of the successful facets of travis-ci (which was developed later) and is superior in other aspects.

As was evident five years ago the testbot cannot be supported in the way much of Drupal development is funded since the testbot is not a site building component placed in a production site. This fact drove the development of a business model that could support the testbot and has proven to be accurate since the current efforts continue to be plagued by under-resourcing. One could argue the situation is even more dire since Drupal got a "freebie" so to speak with me donating nearly full-time for a couple of years versus the two spare time contributors that exist now.

On top of lack of resources the current initiative, whose stated goal is to "modernize" the testbot, is needlessly recreating the entire system instead of just adding Docker to the existing system. None of the other components being used can be described as "modern" since most pre-date the current system. Overall, this appears to be nothing more than code churn.

Assuming the code churn is completed some time far in the future; a migration plan is created, developed, and performed; and everything goes swimmingly, Drupal will have exactly what it has now. Perhaps some of the plugins already built in the ReviewDriven system will be ported and provide a few small improvements, but nothing overarching or worth the decade it took to get there. In fact the system will needlessly require a much rarer skill set, far more interactions between disparate components, and complexity to be understood just to be maintained.

Contrast that with an existing system that can run the entire test suite against a patch across a multitude of machines, seamlessly stitch the results together, and post back the result in under a minute. Contrast that with having that system in place five years ago. Contrast that with the whole slew of improvements that could have also been completed in the four years hence by a passionate, full-time team. Contrast that with at the very least deploying that system today. Does this not bother anyone else?

Contrast that with Drupal being the envy of the open source world, having deployed a solution superior to travis-ci and years earlier.

Tags:

Since posting Drupal on Google App Engine the App Engine team has been working hard to identify and improve many of the troublesome areas identified in the original post, other external sources, and further internal discussions. In addition, I have developed a Drupal integration module for Google App Engine. The combination of the integration module and the work done internally provides a more compelling Drupal experience on App Engine.

For those of you who are not sure what features App Engine provides or why to consider App Engine have a look at the reference material. Getting started is also easier than ever as the whitelist has been removed and the SDK comes bundled with PHP. The rest of this post will focus on what improvements have been made, what the integration provides, and how to make use of it.

To see a functioning Drupal site making use of the App Engine module and the Memcache module see my demo site. The demo site includes an example of a file field served out of Google Cloud Storage.

If you are just interested in making use of the integration visit the Google App Engine Drupal project, have a look at the included README and/or the Getting started section of this post. The second half covers some technical details on the implementation for those who are interested.

Getting started

The preferred method of developing for App Engine is to work locally using the SDK and the included development server. Once the site is ready it can be uploaded to App Engine.

You can choose to skip the local development setup and work directly on App Engine, but you will still need the SDK setup for uploading the app.
Regardless, there are a number of ways to get a hold of Drupal and the integration module.

all-in-one download

drush make

manual

All-in-one download

Drush make

The App Engine module includes Drush make scripts. There are two profiles: minimal using drupal.make; and full (to include more going forward) using drupal-full.make. Only the latter contains the memcache module.

Depending on which profile is desired, invoke the appropriate command.

Manual

Obviously, the components can be downloaded and manually assembled as well.

Drupal installation

Follow the normal process for installing Drupal. Keep in mind that the Drupal files will not be writable on App Engine so any changes to settings.php or any other modules configuration will need to be made prior to upload. See the SETTINGS.PHP section of the README for details on setting up the settings.php file for development against the local server and production App Engine.

The gist of the comments is to use the following for database credentials, filling in the {} sections.

Additionally, a patch should be applied to define MEMCACHE_COMPRESSED which is missing from the App Engine implementation (fix upcoming).

App Engine module

To make use of the integration enable the App Engine module. In order to use Google Cloud Storage be sure to configure the default storage bucket by visiting admin/config/media/file-system in your Drupal site.

If you choose to enable CSS/JS aggregation be sure to read through the serving options on admin/config/development/performance and choose the one that best suites your workflow.

Importing

If you are looking to import an existing site into Google App engine take a look at the following documentation links.

App Engine mail service

Implements Drupal MailSystemInterface to make use of the App Engine mail service. The system email address will be used as the default from address and must be authorized to send mail. To configure the address, visit admin/config/system/site-information. For details on App Engine mail service, read this document.

Cloud Storage

The GAE team has provided a PHP stream wrapper which allows the use of standard PHP file handling functions for interacting with GCS. The current implementation requires a storage bucket in the file path which means applications must be altered to not only make use of the stream wrapper, but include a bucket in all file paths. Additionally, Drupal requires the implementation of an additional set of methods (DrupalStreamWrapperInterface) on top of the default set required for all PHP stream wrappers.

Instead of attempting to provide a format that allows the bucket to be optionally included, the best route forward is to provide a bucketless stream wrapper that always assumes no bucket is included and instead uses a default bucket. The new stream wrapper would sit atop the default stream wrapper and add the default bucket to all paths before handing off to the parent implementation. For lack of anything more descriptive the letter b (for bucketless) was appended to the gs stream wrapper. The examples below demonstrate the usage.

gs://defaultbucket/dir1/dir2/file

gsb://dir1/dir2/file (assumed defaultbucket and thus equivalent to first example)

The additional stream wrapper solution means that paths can be identical to those used with a local file-system, but applications wishing to utilize more than one bucket can still do so.

An additional implication of removing the bucket is that it allows for staging sites in different environments with the same set of files and corresponding data since a different bucket may be configured for the entire site instead of duplicated in each path and requiring changing. Obviously, those applications that choose to use more than one bucket will need to handle the cases themselves. This also aids in site migration as file paths stored in database do not need to be changed.

In lieu of an upstream GAE PHP runtime user-space setting the Drupal module will provide a typical Drupal setting and make use of it in the bucketless stream wrapper. In the future, it would make sense for the Drupal setting to merely set the upstream user-space setting.

Implementation

The following is the class hierarchy used for implementing the bucketless and Drupal specific stream wrappers discussed in details below.

In order to facilitate a clean implementation and the possibility for moving upstream, while working within the restriction that the current stream wrapper is a final class, a rough facsimile, that acts as a proxy, of the base implementation is provided in order to allow for extension. The bucketless stream wrapper is built on top of the facsimile. This provides two basic stream wrapper implementations without any Drupal specific additions. The facsimile is implemented as a PHP Trait in order to allow for multiple inheritance as needed later for the Drupal wrappers.

There are two levels of integration with Drupal that make sense to allow GCS to be used as comprehensively and easily as possible. The first is providing stream wrappers that implement the additional functionality defined by the DrupalStreamWrapperInterface and the second is overriding the default provided local file-system stream wrappers to use GCS. The first is required for the second, but also allows for the use of GCS in a specific manner vs catch-all local file-system.

The three core stream wrappers (private://, public://, temporary://) are overridden via hook_stream_wrappers_alter() to use the GCS wrappers. The storage bucket must be configured in order for the GCS integration to function properly. The standard mechanisms for controlling the file system setup (admin/config/media/file-system) can be used and file fields can be stored within one of the default stream wrappers.

Drupal core patch

In order to have Drupal run properly on Google App Engine a few changes need to be made to Drupal core. Those changes can be found in root/core.patch which is managed in the 7.x-appengine branch and rebased on top of Drupal core updates. The patch creates three other files within the appengine root directory that need to be placed in the Drupal root.

Alters drupal_move_uploaded_file() in includes/file.inc to support newly uploaded files from the $_FILES array being referenced via a stream wrapper. In the case of App Engine all uploaded files are uploaded through the GCS proxy, hosted on GCS, and thus start with gs://. The change should be generally useful and has been rolled as a core patch.

Alters file_upload_max_size() in includes/file.inc to only check PHP ini setting 'upload_max_filesize' instead of also checking 'post_max_size' which is normally relevant, but in the case of App Engine is not since all uploads are sent through GCS proxy and are thus not affected by app instance post limits.

Alters drupal_tempnam() in includes/file.inc to manually simulate tempnam() since it is currently not supported by App Engine.

Alters system_file_system_settings() in modules/system/system.admin.inc to include #wrapper_scheme property to be picked up by system_check_directory() in modules/system/system.module. Given that the current code voids using the stream wrappers this is technically a bug and is a candidate for being fixed in Drupal core as well.

Alters system_requirements() in modules/system/system.install to skip the directory check since the GCS integration will not be loaded until the App Engine module is enabled.

A number of the changes included in the patch are being looked at and will hopefully becoming unnecessary in the future. The following are also included in the patch as a convenience, but they do not alter Drupal core.

Add app.yaml to root which provides basic information about the app to Google App Engine so that it can invoke Drupal properly.

Add php.ini to root which enables some php functions used by Drupal and turns on output buffering.

Aggregate CSS/JS

Since a local writable file-system is not available on Google App Engine for various reasons, the ability for Drupal to aggregate CSS and JS into combined files is restricted. There are three choices.

Directly from static files (recommended, but requires proper setup)

Serving from static files requires that the aggregate files be uploaded with the app. There are a couple of ways to achieve this some of which are better than others.

Build site locally using the development server and generate the files locally. During upload the aggregate files will be present and included with app.

Upload app and generate the files while running on App Engine and written to GCS. Download the files locally into the app and re-upload. This method means that your app may serve with out-of-date CSS or JS until you re-upload which can cause all sort of issues.

By default, aggregate files are served via a Drupal router which acts as a GCS proxy. The proxy should always work without any additional configuration, but this will consume instance hours for serving static aggregate resources.

Directly from GCS

Serving directly from GCS does not require uploading static files with the app, but can cause difficulties since resources referenced from CSS will need to be uploaded to GCS as well (or referenced using an absolute URL). Also note that that the CSS and JS files will be served from a different domain which may cause complications.

Closing

There are areas that could be improved and I plan to continue working so stay tuned. If you have any ideas or want to pitch in you may do so in the Google App Engine issue queue. As always I look forward to your feedback.

Today Google announced PHP support for Google App Engine! I have been one of the lucky folks who had early access and so of course I worked on getting Drupal up and running on GAE. There are a few things that still need to be worked out which I will continue to discuss with the app engine team, but I have a working Drupal setup which I will detail below. Note that much of this may also apply to other PHP frameworks.

Getting up and running

I will cover the steps specific to getting Drupal 7 (notes for Drupal 6 along with branches in repository) up and running on App Engine and not how to use the SDK and development flow which is detailed in the documentation. For an example (minimal profile from core) of Drupal running on Google App Engine see boombatower-drupal.appspot.com.

Sign up to be whitelisted for PHP runtime

Currently, the PHP runtime requires you to sign up specifically for access. Assuming you have access you should be able to follow along with the steps below. Otherwise, the following steps will give you a feel for what it takes to get Drupal running on GAE.

Create a Cloud SQL Instance

Once the instance has been created select the SQL Prompt tab and create a database for your Drupal site as follows.

CREATE DATABASE drupal;

Download Drupal

There are a few tweaks that need to be made to get Drupal to run properly on GAE which are explained below, but for the purposes of this walk-through one can simply download my branch containing all the changes from github.

Install

Visit your-app.appspot.com/install.php and follow the installation steps just as you would normally except that the database information will already be filled in. Go ahead and ignore the mbstring warning and note that the GAE team is looking into supporting mbstring.

Explanation of changes

If you are interested in what changes/additions were made and the reasons for them continue reading, otherwise you should have a working Drupal install ready to explore! There are a few basic things that do not work perfectly out of the box on GAE. The changes can be seen by diffing the 7.x-appengine branch against the 7.x branch in my repository.

File directory during installation

The Drupal installer requires that the files directory be writeable, but GAE does not allow for local write access thus the requirement must be bypassed in order for the installation to complete.

Clean URLs

In order to take advantage of clean urls, of which most sites take advantage, mod_rewrite is required for Apache environments. Since GAE does not use Apache it does not support mod_rewrite and thus another solution is needed. The app.yaml can configure handlers which allow for wildcard matching which means multiple paths can easily be routed to a single script. Taking that one step further we can alter the <?php $_GET['q']?> variable just as mod_rewrite would so that Drupal functions properly. Rather than modify core this can be done via a wrapper script as show below (this should work well for other PHP applications).

// Provide mod_rewrite like functionality by using the path which excludes // any other part of the request query (ie. ignores ?foo=bar).$_GET['q'] = $path;}

// Override the script name to simulate the behavior without wrapper.php.// Ensure that $_SERVER['SCRIPT_NAME'] always begins with a / to be consistent// with HTTP request and the value that is normally provided (not what GAE// currently provides).$_SERVER['SCRIPT_NAME'] = '/' . $file;require $file;?>

PHP $_SERVER['SCRIPT_NAME'] variable

The <?php $_SERVER['SCRIPT_NAME'] ?> implementation differs from Apache mod_php implementation which can cause issues with a variety of PHP applications. The variable matches the HTTP spec and not the filesystem when called through Apache.

Drupal derives the URL from <?php dirname() ?> of <?php $_SERVER['SCRIPT_NAME'] ?> which will return . if no slashes or just / for something like /index.php.

The wrapper script above solves this by ensuring that the SCRIPT_NAME variable alway starts with a leading slash.

HTTP requests

GAE does not yet support support outbound sockets for PHP (although supported for Python and Java) and if/when it does the preferred way will continue to be streams due to automatic caching of outbound requests using urlfetch. I have included a small change to provide basic HTTP requests through drupal_http_request(). A proper solution would be to override the drupal_http_request_function variable and provide a fully functional alternative using streams. Drupal 8 has converted drupal_http_request() to use Guzzle which supports streams. Making a similar conversion for Drupal 7 seems like the cleanest way forward rather than reinventing the change.

php.ini

GAE disables a number of functions for security reasons, but only softly disables some functions which may then be enabled. Drupal provides access to phpinfo() from admin/reports/status and uses output buffering, both of which are disabled by default. The included php.ini enables both functions in addition to getmypid which is used by drupal_random_bytes().

Future

I plan to continue working with the GAE team to ensure that support for Drupal can be provided in a clean and simple manner. Once current discussions have been resolved I hope to provide more formal documentation and support for Drupal.

File handling

I worked on file support, but there were a number of upcoming changes that would make things much cleaner so I decided to wait. GAE provides a stream wrapper for Google Cloud Storage which makes using the service very simple. Assuming you have completed the prerequisites files on GCS may be accessed using standard PHP file handling functions as shown in the documentation.

Unfortunately, the wrapper does not currently support directories nor does file_exists() work properly. Keep in mind that the filesystem is flat so a file may be written to any path without explicitly creating the directory. Meaning one can write to gs://bucket/foo/bar.txt without creating the directory foo. With that being the case it is possible to get some hacky support by simply disabling all the directory code in Drupal, but not really usable. It should be possible to hack support in through the stream wrapper since directories are simply specially name files, but the app engine team has indicated they will look into the matter so hopefully this will be solved cleanly.

Assuming the stream wrappers are fixed up then support can be added in much the same way as that Amazon S3 support is added except that no additional library will be needed.

Additionally, the documentation also notes the following.

Direct file uploads to your POST handler, without using the App Engine upload agent, are not supported and will fail.

In order to support file uploads the form must be submitted to the url provided by CloudStorageTools::createUploadUrl() and the forwarded result handled by Drupal. A benefit of proxying requests through uploader service is that uploaded files may be up to 100TB in size.

Other

There are a number of additional services provided as part of GAE of which Drupal could take advantage.

Closing

Hopefully this will be useful in getting folks up and running quickly on GAE with Drupal and understanding the caveats of the environment. Obviously there is a lot more to cover and I look forward to seeing what others publish on the matter.

Working for Google is a great opportunity for numerous reasons not the least of which...it's Google. The software, languages, and tools that Google uses are quite different from the tools to which I am accustomed, but have lots of similarities. It has been encouraging to see the similarities to Drupal in the processes and reasons for the ways things are done. The story behind Google's testing efforts follows a similar path to that of getting testing into Drupal core, the difficulties, processes setup, and the systems put in place. Hearing the same reasons for decisions as we came up with in the Drupal project has been encouraging. On the flip side looking at the differences in development processes and tools has been rather enlightening.

The aspect that I am most excited about for Drupal is using my Google "20% time" to contribute to Drupal. Having the ability to spend more regular time working on Drupal projects such as qa.drupal.org is something that I believe will make a big difference. On my first 20% day, Friday (today), I will be working on the next generation testing system to replace the current qa.drupal.org. The system is the open sourced portion of the platform that powers ReviewDriven. The code can be found on github at drupalorg_qa, drupalorg_qa_worker, and also through a number of projects on Drupal.org which will become the primary hub. The plan is to run the new system alongside the existing system and to demonstrate it to the community in the near future.

Another big change that came with the job was the move to Mountain View, California from Omaha, Nebraska. Quite a big change both in environment/living style and leaving friends/family to which I am still getting used to, but I expect it to be a rewarding opportunity. I would like to thank those in the community whose encouragement has meant a lot to me, specifically Angie Byron (webchick) and Kieran Lal (amazon).

Tags:

For those who just want the result please scroll to the bottom, otherwise you can read the story.

I started looking into how to backport commits since I know how easily git forward ports commits and figured there had to be a better way then the tricks and manual methods recommended in documentation and various results from google. My original plan was to make a plugin that would re-roll patches that need forward porting automatically on drupal.org and if possible backport as well. Over the course of an hour and half I found a variety of snippets that got me part of the way there, developed a basic idea of how to accomplish backporting in git, and slowly refined it to built-in commands. Huge thanks to cbreak and FauxFaux in #git on freenode!

The basic idea I started from was to simply remove all the commits between the commit you want to patch on top of and the last commit in the tree. To test my approach I started by creating a simple situation representing 7.x -> 8.x using the following.

The next step was to figure out how to remove all the commits between the one to be backported and the last point at which 7.x and 8.x "last overlap" or at the point 8.x was branched (those are not necessarily the same and in Drupal's case they are not). I first envisioned dumping git rebase -i to a file, editing with script to remove commits and then rebasing. To that end it was suggested to change the GIT_EDITOR environment variable to a grep command and then use git rebase -i which would work, but the better approach is to use --onto which does exactly that.

$ git rebase --onto=A B

In simplest terms: remove all commits between A and B non-inclusively.

Now I simply needed a way to find the best point to merge onto. There are several snippets in stackexchange and elsewhere that find the "oldest ancestor" which would work, but a) are not optimal, and b) require long snippets that are not easy to understand. When I dug around in #git I was pointed to git merge-base which finds the best point for such an operation.

$ git merge-base 7.x 8.x

Finds the last common commit between the 7.x and 8.x branches. Using this we get the following.

$ git rebase --onto=`git merge-base 7.x 8.x` HEAD~1

Backport the latest commit onto the merge-base for 7.x and 8.x. Finally, we just need to forward port the patch to the end of the 7.x branch which is easy.

$ git rebase 7.x

Result

So putting it all together we get the following powerful one-liner that will backport the last commit in the 8.x branch to the 7.x branch.

$ git rebase --onto=`git merge-base 7.x 8.x` HEAD~1 && git rebase 7.x

If you want to backport something other than the last commit simply checkout a branch with that commit as the HEAD.

$ git checkout -b backport-branch commit-to-backport

This can also be extended to backport a patch straight on drupal.org by downloading and committing the patch first.

Using the module you can get coverage reports for individual page executions or any combination of tests. Code coverage reports can be extremely useful in determining areas of code that are not tested at all or provide a snapshot of the code involved in generating a page.

Pages

Code coverage can be recorded by adding ?code_coverage=true to the end of a page URL. After the page has completed execution a linked will be placed at the bottom of the HTML which will display outside the page style.

The link will open the coverage report generated for that page request. The report will include all the files that where loaded during the execution of the page.

Tests

Similarly, a link is provided for the code coverage recorded during a test run.

Reports

The report includes two parts: 1) the summary or index, and 2) the line-by-line coverage information. The links described above point to the summary of the coverage information.

The links in the summary point to the line-by-line coverage information overlayed on the corresponding code. The colors indicate the following.

green = executed

red = not executed

gray = ignored (or non-executable)

Filters

The coverage scope may be filtered to focus on improving coverage for a particular module/file/directory. Reducing the scope will also improve the coverage recording performance which may be useful when when dealing with large tests.

Future

I have already integrated the Code coverage project with Conduit (the open source ReviewDriven platform) which will be replacing the current system running qa.drupal.org. The plan is to get the new platform up and running in parallel with the current system at which time regular coverage runs against core (and contrib projects) can be made publicly available.

Tags:

The purpose of this post is to describe the solution that, after careful consideration, seems best suited to alleviating the situation described in the previous post. Other solutions may exist that we have not considered and that will effectively solve the problem. We are open to discussing alternatives and welcome constructive comments on our proposal. At the same time, we discourage negative comments that do not offer a positive alternative. As it is clear the current situation needs improvement, simply dismissing our proposal without offering a better alternative is not useful.

Proposal

ReviewDriven (RD) is a distributed quality assurance platform built to provide a simple yet powerful interface that makes it easy to apply the best practices of continuous integration, test driven development, and automated quality reviews to your development life cycle. The ReviewDriven stack provides a completely rebuilt system designed to take advantage of Drupal 7 and contributed modules that will allow Drupal.org, Drupal development shops, and site owners to take advantage of automated quality assurance. In Drupal terms, RD is the next generation of the testbot (qa.drupal.org).

We would like to see DO and other interested parties take advantage of automated QA tools. Towards that end, we propose that DO engage RD to assume the role of the testbot and provide those same services to the Drupal community.

Advantages

One of the limitations of the current system and one of the primary concerns we addressed with RD is the lack of control over the testing workflow. For example, the current workflow settings apply globally instead of on a more granular basis. In contrast, the RD platform will allow Drupal.org full control to define the workflow and settings to be used with each review. The integration between the testbot (RD) and Drupal.org will continue to be maintained as an open source module which will allow anyone to contribute ideas and changes to the QA workflow on Drupal.org. Since the ReviewDriven stack provides a versioned API, the Drupal.org integration may be maintained and updated independent of RD and on it's own schedule. This approach leaves all the control and flexibility in the hands of the Drupal community and shifts the burden of the testbot to RD.

Other challenges faced by the current system were also taken into consideration when building ReviewDriven. The ReviewDriven stack is extremely flexible which in itself solves a number of the current issues and opens up a variety of new options. This opens up the possibility of reviews for things like Coder and code coverage.

The RD stack as a whole is much more maintainable since it is built on as many contributed modules as possible. This keeps the actual codebase much smaller then the current system (PIFR). Depending on contributed modules has led us to suggest and contribute a number of improvements to those modules, and to create other contributed modules. This also implies the system is easily maintainable by the Drupal community in the event we open source the code (since the majority of the code is already maintained by others). The RD architecture is analogous to any other Drupal site (sponsored by a good citizen) in that we are maintaining the code specific to our site while contributing back to the community modifications to existing modules and new features designed as generic modules.

Putting QA in context

Our proposal hinges on the fact that the testbot (and most of Drupal.org) provides a direct benefit to everyone but, just like roads and other infrastructure, the cost needs to be shared. Core and contributed modules provide a direct benefit in an indirect way by reducing the amount of and time spent writing custom code, ensuring the base system works as intended, and allowing you to leverage with confidence that code base while building your site. Core and contributed modules represent the sure foundation upon which you build your house.

There has been plenty of discussion about the important role the testing infrastructure and testing as a whole played in Drupal 7 development. The benefit of QA is also evident by the fact that a number of very large Drupal sites launched on Drupal 7 before it was officially released. The stability and dependability offered by quality assurance testing is something everyone wants.

Drupal.org is one of very few open source projects, much less projects in general, to adopt quality assurance and testing. The Drupal core development process requires new features to have tests and bug fixes to include tests. This workflow is encouraged for contributed projects and has been adopted by many of the more used projects among others. Not only does this help ensure the stability and quality of Drupal and its contributed projects, but in turn serves as a selling point and differentiator for Drupal adoption. Given that QA is both an adoption point and a vital tool for improving Drupal, does it not follow that it makes sense to provide funding for a full-time effort towards its improvement?

Just as many Linux developers are full-time employees paid to work on improving Linux, we seek to work full-time on improving the Drupal ecosystem through quality assurance. We are not the first to be funded full-time to work on Drupal or to be paid to improve Drupal.org. A perfect example is Angie Byron (webchick) who was hired by Acquia to work full time on Drupal. Just as Linux was started by hobbyists and grew into a profession, so too Drupal appears to have outgown the ability to be maintained entirely by volunteers.

Funding

We see two separate areas that need funding. The first focuses on taking advantage of the ReviewDriven platform by updating the Drupal.org integration with the new testbot (RD). The second area is the ongoing fee for use of the platform (which includes infrastructure costs). RD will use the ongoing fees to improve and maintain the platform (like any other business).

Harnessing the flexibility provided by ReviewDriven will require a large overhaul of the current Drupal.org QA integration module (PIFT). We envision Drupal taking advantage of the granular settings supported by RD to provide per project, per release, per issue, and per patch settings to control the reviews made. Granular settings will ensure that the various workflows, coding standards, and environments that exist in "contrib" can be handled properly. Many projects have different requirements or adhere to different standards between their various releases. The integration with the new testbot would remain open source as Drupal.org integration and can be funded just like any other Drupal.org project.

We would also like to see a QA status advertised on each project page, possibly even some sort of ranking based on a number of quality assurance metrics. These metrics would help people select between similarly featured modules, advertise that we do QA, and help motivate developers to adopt QA. We have many other ideas for improvements and anticipate suggestions from the community.

The ongoing service also requires ongoing funding to handle infrastructure costs, feature improvements, updates for Drupal core changes, and requests from the Drupal community would ensure that things do not stagnate. We have a vision for new features that would significantly improve the Drupal ecosystem, some of which we have discussed with a few community members.

We envision either the Drupal Association or a group of businesses and other organizations with an interest in Drupal to hire RD as the logical successor to the current testbot. Our business will be to develop and maintain the testbot for use by Drupal.org and other organizations. The same approach can then be applied to other critical peices of infrastructure such as the improvement of Drupal.org and its maintenance. We would like to pioneer this effort for Drupal to further enhance the process and tools available to Drupal contributors and the community.

Further details about the specifics of the arrangement, details of the improvements, and plans for the future can be found in our formal proposal and addendum.

What to expect during the transition

With both the short-term upgrades and improvements to the Drupal.org integration and the ongoing RD services funded, we see the transition to RD taking shape as follows.

The first stage of the transition will require the update of Drupal.org's integration with the testbot to provide basic connectivity with ReviewDriven. Supporting RD will require a number of changes, both user-facing and behind the scenes. In addition, just using the ReviewDriven platform will enable a number of features and workflows.

Once the initial integration is ready to be tested in production, we suggest both ReviewDriven and the predecessor be run in parallel. Running both systems in parallel will provide the community with a preview of what is to come and an opportunity for feedback. Results from the two systems can be compared to provide a final round of human checks and give people time to adjust to the new system. After the completion of the parallel phase the old testbot will be deactivated and the new system will be given priority.

The second phase involves the larger changes necessary to take advantage of ReviewDriven's features and flexibility. We will start discussions and work on this phase as the initial integration stabilizes.

Some of the exciting features we will expose to DO in one or both of the stages include:

Much improved turnaround time with the ability to scale as the test suite grows

Code coverage reports from test runs [picture]

Testing of sandboxes (both core forks and modules)

Support for the developer application process

Drush make scripts for retrieving third party libraries.

Drush make files in lieu of parsing the project dependencies

Execution of arbitrary commands during various stages of the worker processing

Automatic enabling of issue retesting

Completely automated site reviews (reviews run against a configured site)

Reroll of a patch (using git --rebase) on one issue after a commit on another issue (for example, this large core change)

Display of quality metrics

Visible branch test results

Forcing a patch to run when a branch is broken (in order to fix the branch)

Determine the disruption to other patches that would be caused by another patch (e.g. the patch to move all core files)

Run Selenium tests

Provide separation between the testbot and the Drupal installer by writing a special script maintained by the community

Conclusion

We look forward to feedback about our proposal and encourage you to voice your opinion. Please be sure to be constructive. In case its not obvious, we are extremely passionate about doing this. So let's make this happen.

Tags:

The intent of this series of posts is not to blame people, but rather to point out the testbot needs full-time attention. Integral to this story are the decisions and circumstances that led me to stop working on SimpleTest in core and the "testbot" which runs on qa.drupal.org. I intend to follow-up this post with others dealing with rejuvenation of the testbot and improvements to SimpleTest. I understand some will not agree with my position, but I would like everyone to understand my reasons and intentions, and how we find ourselves in the current state of affairs. After everything is out in the open, my hope is that a useful discussion will ensue and meaningful progress will result.

Factors

Four factors led me to stop working on SimpleTest in core and the testbot:

I no longer had gratuitious amounts of free time.

I now had a need to make a living (and working on the testbot does not generate any income).

The core development process being what it is led to burnout and lack of desire.

The request to stop working on the testbot in conjunction with the Drupal 7 code freeze.

With me out of the picture, it magnified the fact that noone else worked on the testbot and, going forward, noone stepped up to take my place.

Background

Lets start off with some background about my involvement with the Drupal testing story.

SimpleTest's journey to core

Rewind the clock back to early 2008. I had gotten involved in Drupal through GHOP and became maintainer of SimpleTest. I proceeded to perform a large-scale refactoring and cleaning up of SimpleTest. This, combined with other community efforts, resulted in SimpleTest being added to Drupal 7 core during the Paris Coding Sprint. The rapid pace at which I was able to develop SimpleTest quickly slowed as I no longer had the ability to commit changes nor make design decisions. Instead, even the most trivial changes took days or weeks to get committed. In spite of these additional challenges, I continued to diligently work on SimpleTest in core. To my dismay I discovered on multiple occasions that large changes were virtually impossible to push through the core queue, and I spent countless hours rerolling patches and refactoring code at various developers' whims. In the end, the patches simply died, but not for lack of quality or merit.

The chart shows 37 commits to the SimpleTest project before and after it was added to Drupal core. It is clear the pace of development slowed immediately and lessened further with time.

Changing course I focused on small changes to SimpleTest in core, but ran into similar throughput issues. For all intents and purposes, my ability to make contributions to SimpleTest had ground to a halt. This led me to write a blog post detailing the problem and possible solutions. I was not alone in my conclusions and many would still like to see the problem resolved. I continued to contribute to core now and then, but I was completely burned out. I even took month long breaks from Drupal as it literally burned me out to try to make any contribution to core. My burnout was not caused by overwork but was due to frustration with the exaggerated length of time to accomplish a minor commit.

Following up SimpleTest with the testbot

On a parallel track, getting SimpleTest into core turned out to be only half of the battle. Actually seeing the tests adopted and maintained remained a challenge. I led the charge to keep the tests in sync (initially doing it almost alone). The effort to create an automated system for running the tests had been underway for quite some time, but lacked the necessary volunteers and commitment to really get it off the ground. I was then asked to take over the project at which point I evaluated its status and decided to start over. I created PIFR, a plan for realizing the goal, and proceeded to rapidly make progress. Testing.drupal.org launched shortly afterward and testing became an integral part of the Drupal core workflow.

Seeking sponsors

After graduation from high school I was no longer financially able to devote large portions of my time to the testing system or core development so I sought sponsors to enable me to continue my work. Acquia provided an internship that allowed me to focus on testing again. After successfully completing the internship I found a job with Examiner.com that allowed me to spend a portion of my time improving and maintaining the automated testing system and roll out the initial work for contributed project testing and a number of other improvements in ATS (PIFR and PIFT) 2.2. The contributed project testing with dependencies was labeled beta because it did not support specific versions and had known issues. The plan was to make a followup release to solve the issues.

Code freeze and the request to stop

After deploying PIFR 2.2, I was asked to stop making changes to the testbot to ensure stability of the testing system during the final stages of Drupal 7 development. I continued to make improvements that I planned to deploy once the freeze was lifted, but the short freeze turned into months and more months. This delay ultimately forced me to stop development before the codebase diverged too much from the active testbot.

The chart shows my combined commit activity for PIFR and PIFT and indicates the dramatic slowdown that occurred as a result of the freeze placed on the testbot.

During this time I was the only person who worked on the testbot in any significant capacity (or virtually at all). My availability for working on testing dwindled when my time with Examiner ended. This, combined with the stagnation forced upon the testbot, meant things simply ceased moving forward. The complete stagnation is seen in the long period of time between the 2.2 release and the 2.3 release of PIFR on January 28, 2010 and March 28, 2011, respectively. During that entire period of more than a year no changes were made to the testbot. When changes were finally made, they were done merely out of necessity to accommodate the git migration.

Post-freeze undeployed features

Shortly after the 2.2 release I completed a number of improvements before things came to a stand-still. Some of the recent deployments have included functionality that I had completed, most notably:

Version control system abstraction and plugins for bazaar, cvs, git, and svn

Coder reviews in addition to testing

Beta support for contributed project testing with dependencies

Recent changes

As mentioned above, I had already abstracted the version control handling in the testbot and had four plugins (bazaar, cvs, git, and svn). Unfortunately, there were a number of assumptions that had to be made due to limitations with the project module's VCS integration. These assumptions had to be updated for the shiny new version control API. The changes required were very minor and did not represent any feature improvements, but were simply part of the changes necessary to complete the git migration. Randy Fay made the necessary changes and the testbot saw its first update in a very long time. A few small followups were released as part of the planned phasing out of the old patch format and such. It is interesting to note the other major components of the Drupal.org migration were contracted by the Drupal Association except the automated testing system.

Jeremy Thorson has recently been working on using the testbot's ability to perform coder reviews to help solve the woefully broken project application process which he describes in several blog posts. Again we see change coming to the testbot out of necessity rather than a focused plan for improvement. For those not aware of it, the project application queue has several hundred applications and it takes months to even receive a review. Jeremy has worked hard on improving the application process, at the heart of which is the ability to perform automated coder reviews. Providing automated reviews has been held back on multiple fronts not the least of which is finding people to get things done. This is a definite hurdle considering that only three people have every worked on the testbot code itself not to mention there is an average of less than one active maintainer at any give time.

As mentioned above, I had deployed the first stage of contributed project testing over a year ago, but was forced to shelve the follow-up deployments. The code to properly handle module dependencies fell into disarray with the git migration and required refactoring to work with the version control API. Derek Wright and I spent a lot of time hashing out the details to ensure things were properly abstracted for the project module. I completed the code, but it was never committed and thus was not maintained through the migration. Randy took it upon himself to update the code, but deviated from the agreed upon design. This choice meant the code would not be included in the project module and has a number of other ramifications. The feature was rebuilt in a drupal.org specific manner that precludes others from taking advantage of the code and eliminates the possibility of exposing the data through the update XML information. Exposing the data in that fashion would mean projects like drush, drush make, Aegir and others could discard code written to recreate this data or would now be able to support proper dependency handling. In addition, the recent deployment of dependency handling has led to large delays and instability in the testbot.

Conclusion

The decision to freeze the testbot in conjunction with the Drupal 7 code freeze made sense at the time. However, the extended freeze of the testbot (due to the extended Drupal 7 code freeze) along with moving SimpleTest into core had the unintended and disappointing side effect of causing the effective stagnation of the testing system. The only changes to the testbot in the past 20 months have been made out of necessity and annoyance (the git migration and the unfinished testbot integration with the project application process for new developers). During my tenure with Examiner.com, a fair number of changes were made to the testing system but not deployed on drupal.org. The module dependency code had been written over a year ago and finalized shortly thereafter but languished and was never deployed. Recently, some of these changes were finally deployed along with the git migration. All the while, I had set forth a detailed roadmap for the testing system.

The testing system had been stable and running for 3 years. Recent changes (implemented by others) have resulted in the ups and downs of the testing system. The importance of testing to Drupal development coupled with the recent instability strongly suggests the testing system requires full-time attention. The lack of feature changes since the 2.2 release of PIFR in January, 2010 is a direct result of a lack of financial testing resources, the lock-down of the testing system components, the burnout caused by extreme difficulty to make changes, and the extended freeze placed on the testbot.

Various solutions were tried to enable the continuation of work on the testbot. None represented a viable long-term solution. In the end, my father and I decided the solution was to establish a business to advance testing for the Drupal community and to create an environment where we no longer have our hands tied behind our back. In the next post, I will share the vision and passion we have for testing along with several features that could be made available to the community immediately.

Tags:

The following technique and many of the tools were developed/improved by ReviewDriven.

Problems

The field API in Drupal 7 combined with Views is very powerful combination. Yet there are certain data structures that are difficult or inefficient to work with using the two tools. One such structure is tables of data with multiple columns which can be stored using the fields API, but have a number of issues that arise.

Field data is loaded every time the entity is loaded which can be catastrophic with large datasets.

For example, when editing a node with a large dataset attached each value has a corresponding field element generated which eats up memory and processing power quickly and can easily cause white screens.

When viewing a node even if the fields are hidden the data is loaded and again is a tax on memory.

Relationship between columns is not exported to Views, since it has no way to know, which limits the way values may be displayed.

Aggregating columns in Views cannot be done across entities.

Views loads field data through the field API which is inefficient for large datasets, but ensures that display formatters and such are respected.

Solutions

Thankfully, there are a number of tools that can be utilized to solve each of these problems and in combination provide a very powerful way to handle tables of data and aggregation across entities.

Suppress field data from being loaded during entity_load(). Since field data will not be loaded it will not be displayed nor editable through the interface. This can be handy if you are using an alternate means to display or edit data and/or if you have a large amount of data in fields which will cause the node edit (or similar) interface to use a huge amount of memory and take a very long time to build.

Field suppress solves the first problem, but data will then need to be edited directly through code and manually displayed (ex. Views).

The next problem requires an interface/API for defining relationships between fields (or columns). Relationships can be defined using Field group views which is a plugin for Field group which provides both a UI and exportables for defining groups of fields. A display group can be defined and the plugin set to Views which will then export the proper relationships to Views and generate a stub view which can then be customized. The view will automatically replace the fields when displaying the entity. A requirement of Field group views is Views field which actually solves the last two problems.

Views field exposes the field tables and revision field tables to Views as base tables. Using the field base tables means that field data can be loaded directly instead of going through the field API. Loading data directly is much more efficient (especially for large datasets) and allows for aggregation across multiple entities, but losses the formatting capabilities of the fields API (could possibly add support for formatting to Views field). Formatting can also be added to the exposed field base table through hook_views_data_alter() in the following manor.

Please note, there are two bugs that cause annoyances due to changes in the Views API, but do no prevent Views field from working. Feel free to submit patches.

Example

I have utilized these tools in combination on a number of projects with quite satisfying results, but I will attempt to provide a few generic examples to provide a clearer picture of how these tools can be used.

Tallying summary results

If you have a collection of entities and you want to be able to group the overall field data and perform SQL operations like COUNT() or SUM() Views field makes it easy. The core poll module could be rewritten using this technique so we will use poll as an example (could be done many ways).

Say you have a node type for "Foo" poll entries that looks something like the following.

poll_foo_entry

poll_foo_value: customizable field capable of storing the value for a poll, in this case lets go with a text field

Results can be calculated using a view with the base table poll_foo_value and aggregation enabled. The poll_foo_value column can be used as the group by column and additional columns can be added for determining the COUNT(). You could even then display the results using a chart plugin for views.

I used this technique to create http://survey.reviewdriven.com/results. The tallied results are display on the left.

Table of data

Another powerful usecase is storing and displaying a table of data. Lets use a simple example of storing temperature data over time on nodes (possibly a node per region). A possible node structure is as follows.

The date and temperature fields can then be related using Field group views by placing them in the same group and displayed using a view. The fields are related based on each fields delta. In other words the data is stored using the field API in the following manor.

Improvements

Storing related field data in a single table allows for the removal of overlapping columns and removes the need to join multiple tables. A field group could then be placed in a separate bundle or otherwise exposed to PBS and stored together in a single table. The database scheme would then be much more manageable and easier to query manually.

Please keep in mind that PBS is not currently functional.

Editing

Being able to edit the dataset using a view would also be a major plus. There a number of modules that provide functionality of this sort, but not in such a flexible manor. Something like Editview would complete Field group views functionality. Large datasets could then be paginated for editing to prevent overload.

Exciting usage

Given that Views is plugable virtually an data structure could be displayed using this technique. The advantage over writing a one off field is that the structure is then easy to extend and modify, and requires little to no code to create. Simply export the fields, and field group definitions.

Fields such as the Name field could be turned into a collection of fields displayed using a view and editable (assuming that gets implemented) using a similar structure.

The possibilities opened up by this technique are quite exciting and I look forward to seeing what people come up with.