Category Archives: Analytics

We have already discussed in previous posts how important it is to speedup the feedback loop in your Software Development Lifecycle. Having early feedbacks gives you the chance of evaluating your hypothesis and eventually change direction if needed.

The more information you have, the smarter can be your decisions.

We recently added in our Gerrit DevOps Analytics the possibility of extracting data coming from Code Reviews’ metadata to extend the knowledge we can get out of Gerrit.

Furthermore, it is possible to extract meta-data from repositories not necessarily hosted on the Gerrit instance running the analytics processing. This is a big improvement since it allows to fully analyse repositories coming from any Gerrit server.

Hashtags aggregation

One important type of meta-data contained in the Code Reviews is the hashtag.

Hashtags are freeform strings associated with a change, like on social media platforms. In Gerrit, you explicitly associate hashtags with changes using a dedicated area of the UI; they are not parsed from commit messages or comments.

Similar to topics, hashtags can be used to group related changes together and to search using the hashtag: operator. Unlike topics, a change can have multiple hashtags, and they are only used for informational grouping; changes with the same hashtags are not necessarily submitted together.

You can use them, for example, to mark and easily search all the changes blocking a particular release:

Hashtags can also be used to aggregate all the changes people have been working on during a particular event, for example, the Gerrit User Summit 2019 hackathon:

The latest version of the Gerrit Analytics plugin exposes the hashtags attached to their respecting Git commit data. Let’s explore together some use cases:

The most popular Gerrit Code Review hashtags over the last 12 months

Throughput of changes created during an event

see for example the Palo alto hackathon (#palo-alto-2018). We can see at the end of the week the spike of changes to release Gerrit 2.16.

The extend of time for a feature

Removing GWT was an extensive effort which started in 2017 and ended in 2019. It took several hackathons to tackle the removal as shown by the hashtags distribution. Some changes were started in one hackathon and finalised in the next one.

Those were some example of useful information on how to leverage the power of GDA.

The above examples are taken from the GDA dashboardprovided and hosted by GerritForge on https://analytics.gerrithub.io which mirror commits and reviews on a regular basis from the Gerrit project and its plugin ecosystem.

How to setup GDA on Gerrit

Hashtag extraction is currently available from Gerrit 3.1 onwards. You can download the latest version released from the Gerrit CI.

To enable hashtag extraction you need to enable the feature extraction in the plugin config file as follow:

# analitycs.config[contributors] extract-hashtags =true

For more information on how to configure and run the plugin, look at the analytics plugin documentation.

Conclusion

Data is the goldmine of your company. You need more and more of it for making smarter decision. The latest version of the GDA allows you to leverage even more data produced during the code review process.

You can explore the potential of the information held in Gerrit on the analytics dashboard provided by GerritForge on analytics.gerrithub.io.

Accelerating your time to market while delivering high-quality products is vital for any company of any size. This fast pacing and always evolving world relies on getting quicker and better in the production pipeline of the products. The whole DevOps and Lean methodologies help to achieve the speed and quality needed by continuously improving the process in a so-called feedback loop. The faster the cycle, the quicker is the ability to achieve the competitive advantage to outperform and beat the competition.

It is fundamental to have a scientific approach and put metrics in place to measure and monitor the progress of the different actors in the whole software lifecycle and delivery pipeline.

Gerrit DevOps Analytics (GDA) to the rescue

We need data to build metrics to design our continuous improvement lifecycle around it. We need to juice information from all the components we use, directly or indirectly, on a daily basis:

SCM/VCS (Source and Configuration Management, Version Control System)how many commits are going through the pipeline?

Code Review
what’s the lead time for a piece of code to get validated?
How are people interacting and cooperating around the code?

Issue tracker (e.g. Jira)
how long does it take the end-to-end lifecycle outside the development, from idea to production?

Getting logs from these sources and understanding what they are telling us is fundamental to anticipate delays in deliveries, evaluate the risk of a product release and make changes in the organization to accelerate the teams’ productivity. That is not an easy task.

Gerrit DevOps Analytics (aka GDA) is an OpenSource solution for collecting data, aggregating them based on different dimensions and expose meaningful metrics in a timely fashion.

GDA is part of the Gerrit Code Review ecosystem and has been presented during the last Gerrit User Summit 2018 at Cloudera HQ in Palo Alto. However, GDA is not limited to Gerrit and is aiming at integrating and processing any information coming from other version control and code-review systems, including GitLab, GitHub and BitBucket.

Case study: GDA applied to the Gerrit Code Review project

One of the golden rules of Lean and DevOps is continuous improvement: “eating your dog food” is the perfect way to measure the progress of the solution by using its outcome in our daily life of developing GDA.

As part of the Gerrit project, I have been working with GerritForge to create Open Source tools to develop the GDA dashboards. These are based on events coming from Gerrit and Git, but we also extract data coming from the CI system, the Issue tracker. These tools include the ETL, for the data extraction and the presentation of the data.

As you will see in the examples Gerrit is not just the code review tool itself, but also its plugins ecosystem, hence you might want to include them as well into any collection and processing of analytics data.

Wanna try GDA? You are just one click away.

We made the GDA more accessible to everybody, so more people can play with it and understand its potentials. We create the Gerrit Analytics Wizardplugin so you can have some insights in your data with just one click.

What you can do

With the Gerrit Analytics Wizard you can get started quickly and with only one click you can get:

Initial setup with an Analytics playground with some defaults charts

Populate the Dashboard with data coming from one or more projects of your choice

The full GDA experience

When using the full GDA experience, you have the full control of your data:

Schedule recurring data imports. It is just meant to run a one-off import of the data

Create a production ready environment. It is meant to build a playground to explore the potentials of GDA

One click to crunch loads of data

Once you have Gerrit and the GDA Analytics and Wizard plugins installed, chose the top menu item Analytics Wizard > Configure Dashboard.

You land on the Analytics Wizard and can configure the following parameters:

Dashboard name (mandatory): name of the dashboard to create

Projects prefix (optional): prefix of the projects to import, i.e.: “gerrit” will match all the projects that are starting with the prefix “gerrit”. NOTE: The prefix does not support wildcards or regular expressions.

Date time-frame (optional): date and time interval of the data to import. If not specified the whole history will be imported without restrictions of date or time.

Username/Password (optional): credentials for Gerrit API, if basic auth is needed to access the project’s data.

Sample dashboard analytics wizard page:

Once you are done with the configuration, press the “Create Dashboard” button and wait for the Dashboard, tailored to your data, to be created (beware this operation will take a while since it requires to download several Docker images and run an ETL job to collect and aggregate the data).

At the end of the data crunching you will be presented with a Dashboard with some initial Analytics graphs like the one below:

You can now navigate among the different charts from different dimensions, through time, projects, people and Teams, uncovering the potentials of your data thanks to GDA!

What has just happened behind the scenes?

When you press the “Create Dashboard” button, loads of magic happens behind the scenes. Several Docker images will be downloaded to run an ElasticSearch and Kibana instance locally, to set up the Dashboard and run the ETL job to import the data. Here a sequence workflow to illustrate the chain of events is happening:

Conclusion

Getting insights into your data is so important and has never been so simple. GDA is an OpenSource and SaaS (Software as a Service) solution designed, implemented and operated by GerritForge. GDA allows setting up the extraction flows and gives you the “out-of-the-box” solution for accelerating your company’s business right now.

Contact usif you need any help with setting upa Data Analytics pipeline or if you have any feedback about Gerrit DevOps Analytics.

The Gerrit User Summit 2018 has ended. It has been a truly memorable event and with strong emotions, gratitude and celebration of the success of the Gerrit OpenSource project.

During the hackathon, some major events happened:

The release of Gerrit v2.16

Support for Kubernetes

New plugins contributed by Qualcomm

The announcement of a new maintainer, Marco Miller from Ericsson, congrats!

Hackathon and Gerrit v2.16

The hackathon took place at SAP (2 days) and Cloudera (1 day) in Palo Alto with the main focus of completing the migration to PolyGerrit and cut the v2.16 release. Three major milestones of the Gerrit project:

With regards to the achievement #1, during the hackathon, the commit for removing all the GWT related code from Gerrit master has been finally merged.

PolyGerrit as default Gerrit Code Review UI

The PolyGerrit Team has reached the summit of the long and troubled journey of getting rid of the outdated and Gerrit GWT UI. Kasper and Logan (Google) burned every single bit of their development energy to fix, implement and fill all the remaining gaps in the user journeys.

On Gerrit v2.16, PolyGerrit UI is the default and standard way of interacting with Gerrit Code Review. GWT is still there for allowing people to smoothly adapt to the new interface. However, the next forthcoming version of Gerrit v3.0 will not contain anymore any reference to the old UI. People will still have a few more months to adapt and, from the feedback received on GerritHub.io, they seem to not miss the GWT interface at all.

Well done to the entire PolyGerrit Team, starting from Andrew who kicked off the project back in 2015 to Wyatt, Becky, Victor, Kasper and Logan who brought it up to the usability and portability to multiple devices as it is today. Thanks to their amazing effort, we can all interact with Gerrit Code Review wherever and whenever we want, without losing speed and effectiveness.

Migration of Gerrit groups to NoteDb

One more step towards the complete elimination of ReviewDb has been achieved in Gerrit v2.16: there isn’t anymore any mutable data in the leftover of the (in)famous Gerrit DB. The new version includes a new git repository (All-Groups) which contains all the information related to the groups create through the Gerrit UI.

What’s left in ReviewDb? Not much: only the schema version. When you install Gerrit it is populated with a single row and a single column with the value ‘170’ and will never change. There is no need to share it across master or slaves: just set database.type = h2 and forget about it.

Git protocol v2

It is a major achievement of Gerrit v2.16 as the Git wire protocol v2 solves the performance problems of large repositories: the refs filtering during the advertisement phase.

In a nutshell, if you were fetching a single from a repository with 500k refs, the server would have sent a huge payload of all refs with the associated SHA1s even if at the end of the day you wanted only to fetch a single one. The delay could have been of several seconds if not minutes, which is quite relevant if you are fetching all the times, like in a CI/CD pipeline.

Git protocol v2 reduced the refs advertisement phase by over 70%, which makes Gerrit v2.16 so appealing that you may want to place in the top 5 tasks of your backlog.

Farewell to GWT UI code

Even though Gerrit v2.16 keeps the ability to display the old-fashioned GWT Gerrit UI, the code has been definitely removed on the Gerrit master branch, which will the base for Gerrit v3.0 that will be released in Spring 2019.

This is a memorable moment because represents the final act of making PolyGerrit UI the unified user-experience of Gerrit moving forward.

Data and Insights of the Hackathon

This year the event has been truly focused as well on the analytics side of the Code Review and the entire CI/CD pipeline: was it possibly the location, Cloudera, the world leader in OpenSource BigData and Analytics, giving the right vibrations? Possibly, yes.

Numbers

Over 20 people have attended the Hackathon on-site and, for the first time, remotely via video-conference, from companies and countries all around the world:

Google (USA and Germany)

CollabNet (Japan and Germany)

GerritForge (UK, Ireland, Italy, and Germany)

SAP (Germany)

Qualcomm (USA)

Ericsson (Canada)

Wikimedia Foundation (UK)

Over 300+ changes have been uploaded and 244 of them have been already merged.

They have been working on 44 projects, showing how diverse is today the universe of Gerrit Code Review and its associated plugins and integrations.

Gerrit DevOps Analytics

GerritForge has been hacking on improving the platform that powers the Gerrit Code Review Analytics Platform which has been serving the community for many years.

For the first time, thanks to the introduction of branch-level analytics, the community had data automatically crunched in near-real time of what was happening on the master branch and on the on-going stable 2.16 release.

As a matter of fact, the numbers mentioned before are directly coming from Gerrit DevOps Analytics. Thanks again GerritForge for innovating and improving the Gerrit Code Review platform.

Gerrit Analytics Wizard

For the first time, two new contributors from GerritForge, Tony and Ponch, finalized and demoed a new plugin called ‘analytics-wizard’ that allows anyone with a Gerrit server to set up a mini-version of the Gerrit DevOps Analytics, using Docker, ElasticSearch, and Kibana.

Two new plugins: batch and tasks

Qualcomm has released two new plugins based on their experience of running complex pipelines on a multi-master Gerrit setup. The majority of the code was developed in-house before the hackathon. However, the final release and announced happened this week and I am sure it will allow many other companies to benefit from their experience of validating complex multi-repo changes on a large scale CI system.

Batch

This plugin provides a mechanism for building and previewing sets of proposed updates to multiple projects/refs that should be applied in a batch. These updates are built with Gerrit changes.

Task

The plugin provides a mechanism to manage tasks which need to be performed on changes along with a way to expose and query this information. Tasks are organized hierarchically, and task definitions use Gerrit queries to define which changes each task applies to, and how to define the status criteria for each task. An important use case of the task plugin is to have a common place for CI systems to define which changes they will operate on, and when they will do so.

Gerrit on Kubernetes from SAP and GerritForge

Luke (GerritForge), Matthias (SAP) and Thomas (SAP remotely from Germany) worked on a PoC for supporting Kubernetes deployments.

Last year SAP started using containerized slaves and decided to go cloud native and base our future Gerrit deployments on Kubernetes and leverage project Gardener to support multiple cloud providers. Matthias worked with Thomas to prepare a PoC and a demo the current work in progress and discuss plans moving forward.

The code has been published to the new k8s-gerrit repository and new commits will be subject to the standard Gerrit Code Review process.

Luke prepared an example of Gerrit v2.15 deployment in Kubernetes (single master at the moment) with shared storage and showed how to upgrade easily by leveraging Helm Charts deployments through Tiler.

Even though the two projects started in parallel, they agreed to cooperate and work together with the community to have a unified Kubernetes deployment code-base for installing and upgrading multi-master Gerrit setups. This is definitely an area where we will see very interesting developments very soon.

What about the Summit?

The talks of the summit have been very interesting this year and have covered mostly three main topics: Gerrit roadmap and scalability, sharing of real-life migrations to Gerrit v2.14 and v2.15 and, as previously mentioned, Data Analytics & Insights including research work on using machine-learning for supporting the Code Review process.

More blog posts will follow with more background on each talk, stay tuned and watch this space to know more about what has been presented and announced at the Summit.

Proposals for the next Gerrit User Summit 2019

The overall feedback on the summit has been very positive, including the Birthday Party organized and sponsored by GerritForge for the 10th anniversary of the Gerrit Code Review project.

During the final closing keynote, many interesting proposals have been made:

Gerrit users’ wish-list session.
Any Gerrit user can feed the backlog of new feature proposing what they will like to see implemented

Gerrit plugin hackathon.
Learn how to implement Gerrit plugin and trying to put together in one or two days something useful using any of the programming languages they wish, including Java, Groovy, Scala or anything else.

More details will come as soon as the closing keynote session recording will be published.

Remembering Shawn Pearce and his legacy

Dave Borowitz, the leader of the Gerrit Code Review project, made a thoughtful and touching speech about Shawn Pearce, the father of the Gerrit 2, who started the project exactly 10 years ago. The cake, the hackathon, the summit, and the v2.16 release have been fully dedicated to his memory as appreciation for his humanity and passion for OpenSource.

Thanks, Shawn. We will continue on your legacy and improve the Gerrit Code Review platform as you would have wanted us to do.