November 24, 2016

If you're using CentOS, you probably noticed that we have a CR repository containing all the built packages for the next minor release, so that people can "opt-in" and already use those packages, before they are released with the full installable tree and iso images.

Using those packages on a subset of your nodes can be interesting, as it permits you to catch some errors/issues/conflicts before the official release (that is, before the symlink on the mirrors is switched to the new major.minor version).

For example, I tested some roles myself and found an issue with zabbix-agent refusing to start on a node fully updated/rebooted with CR pkgs (so what will become the 7.3.1611 release). The issue was due to selinux denying something that was allowed in the previous policy.

You can now use your configuration management platform to distribute that built .pp policy (you don't need to build it on every node). I won't dive into details, but I wrote some slides around this (for Ansible and Puppet) for a talk I gave some time ago, so feel free to read those, especially the last slides (with examples).

November 15, 2016

Official Vagrant images for CentOS Linux 6.8 and CentOS Linux 7.2.1511 for x86_64 are now available for download, featuring updated packages to 30 October 2016, as well as the following user-visible changes:

* several optimisations to make the images smaller and faster:
  * do not install most firmware packages
  * do not install microcode_ctl
  * do not build a rescue initramfs (resulting in significantly faster kernel updates)
  * do not load the floppy module on centos/7 (this reduces boot time by ca. 5s)
* [security]: do not allow regular users to use su to become root or vagrant – see issue #76
* set the SELinux type of /etc/sudoers.d/vagrant to etc_t

Known Issues

The centos/7 image is based on CentOS Linux 7.2.1511, since CentOS Linux 7.3 is not available yet.

The VirtualBox Guest Additions are not preinstalled; if you need them for shared folders, please install the vagrant-vbguest plugin and add the following line to your Vagrantfile:

config.vm.synced_folder ".", "/vagrant", type: "virtualbox"

We recommend using NFS instead of VirtualBox shared folders if possible.

Since the Guest Additions are missing, our images are preconfigured to use rsync for synced folders. Windows users can either use SMB for synced folders, or disable the sync directory by adding the line

config.vm.synced_folder ".", "/vagrant", disabled: true

to your Vagrantfile.

Please use Vagrant 1.8.6 (version 1.8.5 is unable to create new Linux boxes due to Vagrant bug #7610, while version 1.8.7 is unable to download or update boxes due to Vagrant bug #7969).

If you are using CentOS Linux on the host, we recommend installing Vagrant from SCL and using the libvirt images. In general, the Vagrant packages provided by your Linux distribution are preferable, since they usually backport fixes for some upstream bugs. If you are using Vagrant on other operating systems, please use Vagrant 1.8.6 (see Known issues, item 4).

Verifying the integrity of the images

The SHA256 checksums of the images are signed with the CentOS 7 Official Signing Key. First, download and verify the checksum file:

November 10, 2016

Over the past few months, we’ve been working on the CentOS Community Container Pipeline, which aims to help developers focus on what they love doing most – writing awesome code – and to give sysadmins insight into the image by providing metadata about it! The project code has been hosted on GitHub since its inception. The hosted service that runs off this code is available to the community at large, and delivers content to registry.centos.org.

What is CentOS Community Container Pipeline?

CentOS Community Container Pipeline enables developers and sysadmins to have container images built, tested and scanned on the CentOS Project’s infrastructure right after a developer pushes code to the git repository!

Container Pipeline Flow

Once the developer pushes code to the git repo, Container Pipeline fetches the changes and container images are built using OpenShift, which provides an enterprise distribution of the Kubernetes project. Once the image is built, it gets scanned using atomic scanners (more on this soon!). The results of these scanners are combined into a mail and sent to the author of the container image. Container images can also be tested using user-provided test scripts to ensure that a container can be spun up from the image on platforms like CentOS Linux, CentOS Atomic Host and OpenShift.

Why scan images?

Building container images and spinning up containers is rather simple. Having more information (a.k.a. metadata) about the container images before running them in one’s production environment is of paramount value! Of course, the kind of information is what makes it of paramount or negligible value. That’s what we aim to provide with CentOS Community Container Pipeline.

Scanners in CentOS Community Container Pipeline

At this point we have two scanners operational: one that checks your CentOS Linux based container images for package updates, and another that verifies the files installed in them. Both scanners are based on the atomic tool developed by the Project Atomic folks. We are working on rolling out more scanners in the near future!

Atomic Scanner

The scanners based on atomic are run automatically by the Pipeline after successful completion of the image building process. These scanners can be run stand-alone as well! That is, you can install the scanner on your CentOS Linux based systems and run it against a container image built on a CentOS Linux base image. And it does this without bringing up or executing the container itself.

In the pipeline, upon completion of the scan process, the user is notified about issues with the image that need to be addressed. Addressing these issues should instill more confidence in deploying the resulting container image in a production environment.

Besides scanning an image after it is built, in the near future scanners will also run periodically and provide developers with actionable information.

yum update scanner

This scanner provides the user with information about RPM packages that need to be updated in the container image. If you’re a developer, this information helps ensure you’re running the latest packages with bug and security fixes, to avoid surprises in production.

The Package Updates key in the scanner's output lists the packages that need to be updated in the scanned container image.

RPM verify scanner

As its name suggests, the RPM verify scanner verifies all files (libraries and binaries) installed via RPM packages in a given container image. It reports any modified or tampered libraries and binaries in that image. This is useful to ensure that a given container image is not shipped with any tainted libraries or binaries.

It’s simple! Add an entry for your open source project under the index.d directory on the CentOS Container Index. You can see a few files representing projects or individual developers under this directory already. You also need to have a cccp.yml file in your project that has information useful for the Container Pipeline. You can refer to the respective GitHub repos for more information, or get in touch with us on the #centos-devel IRC channel on the FreeNode network.

October 27, 2016

If you find any security issue in a CentOS.org website or service, please let us know; the same goes for any issues in CentOS Linux as well as the SIG content on centos.org. The best way to get in touch is to email security@centos.org – and if the content is sensitive, please use the corresponding gpg key to encrypt it. For example, for an issue specific to CentOS Linux 7, please encrypt the content with the CentOS Linux 7 key. Similarly, for any content specific to the Virt SIG, please use the CentOS SIG Virt key.

October 26, 2016

The typical workflow for most ci.centos.org ( cico ) jobs is :
* Call Duffy's API endpoint with node/get and grab some machines
* Setup the machines environment for the ci job to come
* Push content to nodes
* Run the tests
* Clear out / tear down
* Call Duffy's API endpoint with node/done to return the machines
* Report status via Jenkins

Machines handed out in this manner to the CI jobs are available for up to 6 hours at a time, at which point they are reaped back into the available pool for other jobs to consume. What this also means is that if, for any reason, the job gets stuck, it could be up to six hours before the developer/user gets any feedback about the tests failing.

The usual way to resolve this situation is to set up a timeout in the Jenkins job. That allows Jenkins to watch the run and, on timeout, kill the job and report failure. However, if your job is set up with a single build step that also includes requesting the machines and returning them when done, Jenkins killing the job will mean your machines won't get returned for up to 6 hours. Given that most projects are set up with a quota of 10 deployed machines, not returning them when done would mean your jobs get put into a queue that isn't clearing out in a rush.

One way to work around this would be to split the machine request and machine return functions into a pre-build and post-build step, and then pass the session-id for the deployed machines between the build steps. That way, you could trap and report specific conditions. A variation of this would be to set up conditional build steps, and have them execute different functions as needed.

An easy and simple workaround however, is to just wrap the test commands in a /usr/bin/timeout call. timeout is delivered as a binary from the coreutils package on CentOS Linux 7 and would be available on all machines, including the jenkins worker instances. Take a look at https://github.com/almighty/almighty-jobs/blob/master/devtools-ci-index.yaml#L64 for a quick example of how this would work in a JJB template. This way we can timeout on the job, and yet be able to return nodes or handle any other content we need, in the same ci job script. A script that then does not have or need any Jenkins specific content, making it possible to run from developer laptops or as child jobs on its own.

/usr/bin/timeout ( man 1 timeout ) also allows you to preserve the sub-command's exit status, if you need to track and report different statuses from your ci jobs. And of course, there are many other uses for /usr/bin/timeout as well!
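As a rough sketch of that pattern (not the linked JJB template), the same idea can be written as a small Python wrapper. The `request_nodes` and `return_nodes` callables are hypothetical placeholders for whatever your job uses to talk to Duffy; only the `timeout` binary from coreutils is assumed to be on the PATH.

```python
import subprocess


def run_with_timeout(command, seconds):
    """Run `command` under timeout(1) and return the resulting exit status.

    timeout exits with status 124 when the time limit is hit; otherwise it
    passes through the exit status of the wrapped command.
    """
    wrapped = ["timeout", str(seconds)] + list(command)
    return subprocess.run(wrapped).returncode


def ci_job(test_command, limit_seconds, request_nodes, return_nodes):
    """Request nodes, run the tests with a time limit, always return nodes.

    request_nodes and return_nodes are hypothetical callables standing in
    for the node/get and node/done calls of the workflow.
    """
    session = request_nodes()
    try:
        return run_with_timeout(test_command, limit_seconds)
    finally:
        # Runs on success, failure and timeout alike, so the machines go
        # back to the pool instead of waiting to be reaped.
        return_nodes(session)
```

Because the time limit is enforced inside the script rather than by Jenkins, the clean-up step still runs after a timeout, and the script's exit status (124 on timeout) can be reported as the job result.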

October 20, 2016

It's not a secret that we use Zabbix to monitor the CentOS.org infra. That's even a reason why we (re)build it for some other architectures, including aarch64, ppc64, ppc64le on CBS and also armhfp.

There are really cool things in Zabbix, including Low-Level Discovery. With such discovery, you can create items/prototypes/triggers that will be applied "automagically" for each discovered network interface, or mounted filesystem. For example, the default template (if you still use it) has such item prototypes and also a graph for each discovered network interface, showing you the bandwidth usage on those network interfaces.

But what happens if you suddenly want, for example, to create some calculated item on top of those ? Well, the issue is that from one node to the other, the interface name can be eth0, or sometimes eth1, and with CentOS 7 things started to also move to the new naming scheme, so you can have something like enp4s0f0. I wanted to create a template that would fit-them-all, so I had a look at calculated items and thought "well, easy : let's have that calculated item use a user macro that would define the name of the interface we really want to gather stats from ..." .. but it seems I was wrong. Zabbix user macros can be used in multiple places, but not everywhere. (It seems that I wasn't the only one not understanding the doc coverage for this, but at least that bug report will have an effect on the doc to clarify this.)

It was when I discussed this in #zabbix (on irc.freenode.net) that RichLV pointed me to something that could be interesting for my case : Alias. I must admit that it was the first time I heard about it, and I don't even know when it landed in Zabbix (or if I just overlooked it at first sight).

So cool, now I can just have our config mgmt pushing for example a /etc/zabbix/zabbix_agentd.d/interface-alias.conf file that looks like this and reload zabbix-agent :

That means that now, whatever the interface name is (as puppet, in our case, will create that file for us), we'll be able to get values from the net.if.default.out and net.if.default.in keys, automatically. Cool.
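To illustrate what a config management tool might render for such a file, here is a minimal sketch in Python. It assumes the agent's `Alias=<alias>:<key>` parameter and the standard net.if.in / net.if.out item keys; the function name is hypothetical, and the output is an illustration of the idea rather than a copy of the file used on the CentOS infra.

```python
def render_interface_aliases(interface):
    """Return zabbix_agentd Alias lines mapping fixed keys to one interface.

    Assumption: the agent accepts `Alias=<alias>:<key>` lines, and the keys
    being aliased are net.if.in[...] and net.if.out[...].
    """
    lines = []
    for direction in ("in", "out"):
        lines.append(
            "Alias=net.if.default.{0}:net.if.{0}[{1}]".format(direction, interface)
        )
    return "\n".join(lines) + "\n"
```

Calling `render_interface_aliases("enp4s0f0")` on one node and `render_interface_aliases("eth0")` on another yields files that differ only in the interface name, while the template keeps referencing the same two alias keys.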

That also means that if you want to aggregate all this into a single key for a group of nodes (and so graph that too), you can do something always referencing those new keys (example for the total outgoing bandwidth for a group of hosts) :

grpsum["Your group name","net.if.default.out",last,0]

And from that point, you can easily also configure triggers, and graphs too.
Now going back to work on some other calculated items for total bandwidth usage for a period of time and triggers based on some max_bw_usage user macro.

October 11, 2016

An updated version of CentOS Atomic Host (tree version 7.20161006), is now available, featuring the option of substituting the host’s default docker 1.10 container engine with a more recent, docker 1.12-based version, provided via the docker-latest package.

CentOS Atomic Host is a lean operating system designed to run Docker containers, built from standard CentOS 7 RPMs, and tracking the component versions included in Red Hat Enterprise Linux Atomic Host.

CentOS Atomic Host is available as a VirtualBox or libvirt-formatted Vagrant box, or as an installable ISO, qcow2 or Amazon Machine image. These images are available for download at cloud.centos.org. The backing ostree repo is published to mirror.centos.org.

Images

Vagrant

The easiest way to consume these images is via the Atlas / Vagrant Cloud setup (see https://atlas.hashicorp.com/centos/boxes/atomic-host). For example, getting the VirtualBox instance up would involve running the following two commands on a machine with vagrant installed:

$ vagrant init centos/atomic-host && vagrant up --provider virtualbox

ISO

The installer ISO (776 MB) can be used via regular install methods (PXE, CD, USB image, etc.) and uses the Anaconda installer to deliver the CentOS Atomic Host. This image allows users to control the install using kickstarts and to define custom storage, networking and user accounts. This is the recommended option for getting CentOS Atomic Host onto bare metal machines, or for generating your own image sets for custom environments.

QCOW2

The CentOS-Atomic-Host-7-GenericCloud.qcow2 (1.2 GB) image is suitable for use in on-premise and local virtualized environments. We test this on OpenStack, AWS and local Libvirt installs. If your virtualization platform does not provide its own cloud-init metadata source, you can create your own NoCloud iso image.

Release Cycle

The CentOS Atomic Host image follows the upstream Red Hat Enterprise Linux Atomic Host cadence. After sources are released, they’re rebuilt and included in new images. After the images are tested by the SIG and deemed ready, we announce them.

Getting Involved

CentOS Atomic Host is produced by the CentOS Atomic SIG, based on upstream work from Project Atomic. If you’d like to work on testing images, help with packaging, documentation — join us!

The SIG meets weekly on Thursdays at 16:00 UTC in the #centos-devel channel, and you’ll often find us in #atomic and/or #centos-devel if you have questions. You can also join the atomic-devel mailing list if you’d like to discuss the direction of Project Atomic, its components, or have other questions.

Getting Help

If you run into any problems with the images or components, feel free to ask on the centos-devel mailing list. Have questions about using Atomic? See the atomic mailing list or find us in the #atomic channel on Freenode.

Known Issues

The VirtualBox Guest Additions are not preinstalled; if you need them for shared folders, please install the vagrant-vbguest plugin. We recommend using NFS instead of VirtualBox shared folders if possible.

Since the Guest Additions are missing, our images are preconfigured to use rsync for synced folders. Windows users can either use SMB for synced folders, or disable the sync directory by adding the line

config.vm.synced_folder ".", "/vagrant", disabled: true

to your Vagrantfile.

Vagrant 1.8.5 is unable to create new Linux boxes due to Vagrant bug #7610. Please upgrade to Vagrant 1.8.6.

October 01, 2016

Rolling ISOs

The CentOS Linux team produces rolling CentOS-7 isos, normally on a monthly basis.

The most recently completed version of those ISOs is version 1609 (16 is for 2016, 09 is for September).

The team usually creates all our ISO and cloud images based on all updates through the 28th of the month in question, so 1609 means these ISOs contain all updates for CentOS-7 through September 28th, 2016.
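The naming scheme above can be decoded mechanically. The following sketch (the function name is made up for illustration) turns a version string such as 1609 into the usual update cutoff date, assuming the "28th of the month" convention holds for that release.

```python
import datetime


def rolling_iso_cutoff(version):
    """Return the usual update cutoff date for a rolling ISO version.

    `version` is a four-digit YYMM string such as "1609". The 28th is the
    usual cutoff described above, not a guarantee for every release.
    """
    if len(version) != 4 or not version.isdigit():
        raise ValueError("expected a YYMM string such as '1609'")
    year = 2000 + int(version[:2])
    month = int(version[2:])
    # datetime.date raises ValueError for an invalid month such as 13.
    return datetime.date(year, month, 28)
```

For example, `rolling_iso_cutoff("1609").isoformat()` gives `"2016-09-28"`.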

These rolling ISOs have the same installer as the most recent CentOS-7 point release (currently 7.2.1511) so that they install on the same hardware as our original ISOs, while the packages installed are the latest updates.

This means that the actual kernel that boots up on the ISO is the 7.2.1511 default kernel (kernel-3.10.0-327.el7.x86_64.rpm), but that the kernel installed is the latest kernel package (kernel-3.10.0-327.36.1.el7.x86_64.rpm for the 1609 ISOs).

You can verify the sha256sum of your downloaded ISO following these instructions prior to install.
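If you prefer to script that check, a minimal Python sketch could look like the following. The function names are illustrative, and the expected checksum has to come from the published (and signature-verified) checksum list; this sketch only compares digests and does not verify any GPG signature.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte ISOs do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch means the download is corrupt or was altered, and the ISO should not be used for an install.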

The DVD ISO contains everything needed to do an install, but still fits on one 4.3 GB DVD. This is the most versatile install that will fit on a single DVD, and if you are new to CentOS this is likely the installer you want. If you pick Minimum Install in this installer, you can do an install that is identical to the Minimal ISO. You can also install many different Workstation and Server installs from this ISO, including both GNOME and KDE.

The Everything ISO has all packages, even those not used by the installer. You usually do not need this ISO unless you do not have access to the internet and want to install, later on, packages that are not offered by the graphical installer. Most users will not need this ISO; it is > 7 GB, but it can do installs from a USB key that is big enough to hold it (currently an 8 GB key).

The LiveGNOME ISO is a Basic GNOME Workstation install, but there is no modification or personalization allowed during the install. It is a much easier install to do, but any extra packages must be installed from the internet later.

The LiveKDE ISO is a Basic KDE Workstation install. It also does not allow modification or personalization until after the install has finished.

The Minimal ISO is a very small and quick install that boots to the command console and has network connectivity and a firewall. It is used by System Administrators for the minimal install that they can then add functionality to. You need to know what you are doing to use this ISO.

Newer Hardware Support

As explained above, the normal rolling ISOs boot from the Point Release installer. Sometimes there is newer hardware that might not be supported in the point release installer, but could be supported with a newer kernel. This installer is much less tested and is only recommended if you can not get one of the normal installers to work for you.

There are only 2 ISOs in this family, here are the links and sha256sums:

September 21, 2016

As soon as you're running some IT services, there is one thing that you already know : you'll have downtimes, despite all your efforts to avoid those...

As the old joke says : "What's up ?" asked the Boss. "Hopefully everything !" answered the SysAdmin guy ....

You probably know that the CentOS infra is itself widespread, and subject to quick moves too. Recently we had to announce an important DC relocation that impacts some of our crucial and publicly facing services. That one falls in the "scheduled and known outages" category, and can be prepared. For such "downtime" we always announce it through several channels, like sending a mail to the centos-announce and centos-devel (and in this case, also the ci-users) mailing lists. But even when we announce that in advance, some people forget about it, or people using (sometimes "indirectly") the concerned service are surprised and then ask about it (usually in #centos or #centos-devel on irc.freenode.net).

In parallel to those "scheduled outages", we also have the worst ones : the unscheduled ones. For those, depending on the impact/criticality of the affected service, and also the estimated RTO, we also send a mail to the concerned mailing lists (or not).

So we just decided to show a very simple and public dashboard for the CentOS Infra, but only covering the publicly facing services, to have a quick overview of that part of the Infra. It's now live and hosted on https://status.centos.org.

We use Zabbix to monitor our Infra (so we build it for multiple arches, like x86_64,i386,ppc64,ppc64le,aarch64 and also armhfp) , including through remote zabbix proxies (because of our "distributed" network setup right now, with machines all around the world).
For some of those services listed on status.centos.org, we can "manually" announce a downtime/maintenance period, but Zabbix also updates on its own that dashboard.
The simple way to link those together was to use zabbix custom alertscripts and you can even customize those to send specific macros and have that alertscript just parsing and then updating the dashboard.

We hope to enhance that dashboard in the future, but it's a good start, and I have to thank again Patrick Uiterwijk who wrote that tool for Fedora initially (and that we adapted to our needs).

September 20, 2016

The CentOS Infrastructure team will be moving the machines hosting cbs.centos.org, ci.centos.org and accounts.centos.org on October 10th, 2016. We expect a downtime of 48hrs. Contact us in #centos-devel on freenode at any time during that period for questions, or watch the centos-devel mailing list for the latest updates.

The servers, switches, PDUs, and even the racks themselves hosting CBS, ci.centos.org, accounts.centos.org and registry.centos.org are all stored in a datacenter in Raleigh, North Carolina, USA and will be moved to a new space in the datacenter on Monday October 10th. This new space provides a little bit of expansion room for the future of these services and consolidates networks that were previously separate (namely the CICO cloud with the rest of the CI infrastructure). During this window, all services related to the listed CentOS properties will be down.

We blocked out 2 days (48hrs) to do the move, but we will do our best to restore services as soon as it is possible to do so.

September 07, 2016

UPDATE 2016-09-08: Due to additional checks, we had to retire v1608.01 from Atlas and release it again as v1608.02. The two versions are identical.

Official Vagrant images for CentOS Linux 6 and CentOS Linux 7 for x86_64 are now available for download, featuring updated packages to 31 August 2016, as well as a new image for VMware Fusion.

Known Issues

The VirtualBox Guest Additions are not preinstalled; if you need them for shared folders, please install the vagrant-vbguest plugin. We recommend using NFS instead of VirtualBox shared folders if possible.

Since the Guest Additions are missing, our images are preconfigured to use rsync for synced folders. Windows users can either use SMB for synced folders, or disable the sync directory by adding the line config.vm.synced_folder ".", "/vagrant", disabled: true to the Vagrantfile.

The VMware Tools installer fails to generate a new initramfs due to a dracut configuration error in both our image and VMware Tools. As a workaround, change the add_drivers line in /etc/dracut.conf.d/vmware-fusion-drivers.conf to

add_drivers+=" mptspi "

(add spaces directly before and after mptspi) before trying to install VMware Tools or open-vm-tools.

Downloads

The official images can be downloaded from Hashicorp’s Atlas. We provide images for libvirt, VirtualBox and VMware.

September 01, 2016

Since yesterday, we have production-ready automated tests for our Vagrant images on ci.centos.org, fully integrated with GitHub. We were only able to build and test scratch images manually until now, which was time consuming and had the disadvantage that, due to hardware limitations on my side, only the images for VirtualBox were actually tested.

A pull request to the CentOS/sig-cloud-instance-build repository on GitHub will trigger the cloudinstance-vagrant-build Jenkins job on ci.centos.org, which builds all Vagrant images in CBS. If the build process completes without errors, the cloudinstance-vagrant-test job will test the Vagrant images for both CentOS Linux 6 and CentOS Linux 7, using the libvirt and virtualbox Vagrant providers. If everything is ok, you can see the test result directly below the pull request on GitHub (please note that a full test currently needs almost two hours to complete, most of the time being spent building the images):

Most of the code for the test is in my cloudinstance-vagrant-cico-util repository on GitHub, with a few additional snippets in the Jenkins configuration for each job. We are using the latest Vagrant provided by the Software Collections SIG, and VirtualBox 5.0.26 from virtualbox.org (at the time of writing this post, Vagrant refuses to start if it detects VirtualBox 5.1). Feedback is of course welcome.

We also will be glad to discuss the new things happening within the project, including a number of operational Special Interest Groups (SIGs) that are producing add on software for CentOS including The Xen Hypervisor, OpenStack (via RDO), Storage (GlusterFS and Ceph), Software Collections, Cloud Images (AWS, Azure, Oracle, Vagrant Boxes, KVM), Containers (Docker and Project Atomic).

So, if you have been using CentOS for the past 12 years, all that is happening just like it always has (long lived standard Linux distro with LTS), as well as all the new hypervisor, container and cloud capabilities.

May 02, 2016

Recently I was discussing with some people about TLS everywhere, and we then started to discuss about the Letsencrypt initiative.
I had to admit that I just tested it some time ago (just for "fun") but I suddenly looked at it from a different angle : while the most used case is when you install/run the letsencrypt client on your node to directly configure it, I have to admit that it's something I didn't want to have to deal with. I still think that proper web server configuration has to happen through cfgmgmt, and not through another process. (and same for the key/cert distribution, something for a different blog post maybe).

So if you're (pushing|pulling) your web servers configuration automatically from $cfgmgmt, but you want to use/deploy TLS certificates signed by letsencrypt, what can you do ? Well, the good news is that you aren't forced to let the letsencrypt client touch your configuration at all : you can use the "certonly" option to just generate the private key locally, send the csr and get the signed cert back (and the whole chain too).
One thing to know about letsencrypt is that the validation/verification process isn't the one that you see at most of the companies providing CA/signing capabilities : as there is no ID/paper verification (or something else), the only validation for the domain/sub-domain that you want to generate a certificate for happens over an http request (basically creating a file with a challenge, processing a request from their "ACME" server[s] to retrieve that file back, and validating the content).

So what are our options then ? The letsencrypt documentation mentions several plugins like manual (which requires you to put the file with the challenge answer on the webserver yourself, then launch the validation process), or standalone (doesn't work if you already have a httpd/nginx process, as there will be a port conflict), or even webroot (working fine, as it will then just write the file itself under /.well-known/ under the DocumentRoot).

The webroot seems easy, but as said, we don't want to even install letsencrypt on the web server[s]. Even worse, suppose (and that's the case I had in mind) that you have multiple web nodes configured in a kind of CDN way : you don't want to distribute that file on all the nodes for validation/verification (when using the "manual" plugin) and you'd have to do it on all the nodes (as you don't know in advance which one will be verified by the ACME server)

So what about something centralized (where you'd run the letsencrypt client locally) for all your certs (including some with SANs) in a transparent way ? So I thought about something like this :

So now, once in place everywhere, you can generate the cert for that domain on the central letsencrypt node (assuming that httpd is running on that node, and reachable from the "frontend" nodes, and that /var/www/html is indeed the DocumentRoot (default) for httpd on that node):

Transparent, smart, easy to do and even something you can deploy when you need to renew, and then remove to be back with initial config files too (if you don't want to have those ProxyPass directives active all the time)

The only other thing you have to know is that once you have proper TLS in place, it's usually better to transparently redirect all requests to your http server to the https version. Most people will do that (next example for httpd/apache) like this :

It's good, but when you renew the certificate, you'll probably want to be sure that the GET request for /.well-known/* continues to work over http (from the ACME server), so we can tune those rules a little bit (RewriteCond directives are cumulative, so the request will not be redirected if the URL starts with .well-known):
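As a sketch of the decision those tuned rules need to make, modelled in Python rather than as httpd directives (the function name and host are illustrative only): every plain-http request gets redirected to https, except the ones for the ACME challenge path.

```python
def redirect_target(host, path):
    """Return the https URL to redirect to, or None to keep serving over http.

    Requests under /.well-known/ stay on http so that the ACME server can
    still fetch the challenge file during a renewal.
    """
    if path.startswith("/.well-known/"):
        return None
    return "https://{0}{1}".format(host, path)
```

With this logic, `redirect_target("www.example.org", "/index.html")` returns the https URL, while a request for `/.well-known/acme-challenge/<token>` returns `None` and is answered over plain http.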

Hope that you'll have found that useful, especially if you don't want to deploy letsencrypt everywhere but still use it to generate locally your keys/certs. Once done, you can then distribute/push/pull (depending on your cfgmgmt) those files and don't forget to also implement proper monitoring for cert validity and automation around that too (consider that your homework)

April 28, 2016

Recently, some people started to ask for proper IPv6/AAAA records for some of our public mirror infrastructure, like mirror.centos.org, and also msync.centos.org.

Reason is that a lot of people are now using IPv6 wherever possible and from a CentOS point of view, we should ensure that everybody can have content over (legacy) ipv4 and ipv6. Funny that I call ipv4 "legacy" as we still have to admit that it's still the default everywhere, even in 2016 with the available pools now exhausted.

While we had already some AAAA records for some of our public nodes (like www.centos.org as an example), I started to "chase" after proper and native ipv6 connectivity for our nodes.
That's where I had to get in contact with all our valuable sponsors. First thing to say is that we'd like to thank them all for their support for the CentOS Project over the years : it wouldn't have been possible to deliver multiple terabytes of data per month without their sponsorship !

WRT ipv6 connectivity, that's where the results of my quest were really different : while some DCs support ipv6 natively, and even answer you in 5 minutes when asking for a /64 subnet to be allocated, some others still aren't ipv6 ready : in the worst case the answer was "nothing ready and no plan for that", and sometimes the answer was something like "it's on the roadmap for 2018/2019".

The good news is that ~30% of our nodes behind msync.centos.org have now ipv6 connectivity, so the next step is now to test our various configurations (distributed by puppet) and then also our GeoIP redirection (done at the PowerDNS level for such records, for which we'll also then add proper AAAA record)

Hopefully we'll have that tested and then announced soon, and also for other public services that we're providing to you.

If you are an EPEL user (for whatever operating system), a packager, an upstream project member who wants to see your software in EPEL, a hardware enthusiast wanting to see builds for your favorite architecture, etc. … you are welcome to join us. We’ll have plenty of time for questions and issues from the audience.

The trick is that EPEL is useful or crucial for a number of the projects now releasing on top of CentOS via the special interest group process (SIGs provide their community newer software on the slow-and-steady CentOS Linux.) This means EPEL is essential for work happening inside of the CentOS Project, but it remains a third-party repository. Figuring out all of the details of working together across the Fedora and CentOS projects is important for both communities.

Hope to see you there!

December 14, 2015

As CentOS 7 (1511) was released, I thought it would be a good idea to update several of my home machines (including kids' workstations) to that version, and also to a newer kernel.
Usually that's just a smooth operation, but sometimes backported or new features, especially in the kernel, can lead to some strange issues.
That's what happened with my older Thinkpad Edge : it's a cheap/small thinkpad that Lenovo made several years ago (circa 2011), and that I used a lot when travelling, even though it only has an AMD Athlon(tm) II Neo K345 Dual-Core Processor.
So basically not a lot of horse power, but still something convenient just to read your mails, remotely connect through ssh, or browse the web.
When rebooting on the newer kernel, it panics directly.

Two bug reports are open for this: one on the CentOS Bug tracker, which is also linked to the upstream one. Current status is that there is no kernel update that will fix this, but there is an easy-to-implement workaround :

November 30, 2015

Last Friday, while working on something else (the "CentOS 7 userland" release for Armv7hl boards), I got notifications from our Zabbix monitoring instance complaining about web scenarios failing (errors due to timeouts), and then also about "Disk I/O is overloaded" triggers (which check the cpu iowait time). Usually you'd verify what happens in the Virtual Machine itself, but even connecting to the VM was difficult and slow. Once connected, though, nothing looked strange : there was no real activity, not even on the disk (plenty of tools exist for this, but iotop is helpful to see which process is reading/writing to the disk in that case), yet iowait was almost at 100%.

As said, it was happening suddenly for all Virtual Machines on the same hypervisor (CentOS 6 x86_64 KVM host), and even the hypervisor was suddenly complaining about iowait too (but less than the VMs). So obviously it wasn't something badly tuned at the hypervisor/VM level, but something else. That rang a bell : if you have a raid controller and its battery, for example, needs to be replaced, the controller can decide to disable all read/write cache, slowing down all IOs going to the disk.

At first sight, there was no HDD issue, and array/logical volume was working fine (no failed HDD in that RAID10 volume), so it was time to dive deeper into analysis.

A quick MegaCli64 -ShowSummary -a0 showed me that the underlying disks were indeed active, but my attention was caught by the fact that there was a "Patrol Read" operation in progress on a disk. I then discovered a useful (bookmarked, as it's a gold mine) page explaining the issue with the default settings and the "Patrol Read" operation.
While it seems a good idea to scan the disks in the background to discover disk errors in advance (PFA), the default setting is really not optimized : (from that website) : "will take up to 30% of IO resources"

I decided to stop the currently running patrol read process with MegaCli64 -AdpPR -Stop -aALL and I immediately saw the Virtual Machines' (and hypervisor's) iowait going back to normal.
Here is the Zabbix graph for one of the impacted VM, and it's easy to guess when I stopped the underlying "Patrol read" process :

That "patrol read" operation is scheduled to run by default once a week (168h), so your real options are either to disable it completely (through MegaCli64 -AdpPR -Dsbl -aALL) or at least (advised) to lower its IO impact (for example to 5% : MegaCli64 -AdpSetProp PatrolReadRate 5 -aALL)
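The three MegaCli64 invocations quoted in this post can be grouped in a small helper. This is only a sketch: the function name patrol_read_cmd is made up for illustration, and it deliberately prints the command instead of running it, so you can review it before touching a production controller.

```shell
# Sketch: print (not run) the MegaCli64 command for a given patrol read action.
# patrol_read_cmd is a hypothetical helper name; the MegaCli64 arguments are
# the ones quoted in this post.
patrol_read_cmd() {
    action="$1"
    rate="${2:-5}"   # IO share in percent, only used by "throttle"
    case "$action" in
        stop)     echo "MegaCli64 -AdpPR -Stop -aALL" ;;
        disable)  echo "MegaCli64 -AdpPR -Dsbl -aALL" ;;
        throttle) echo "MegaCli64 -AdpSetProp PatrolReadRate ${rate} -aALL" ;;
        *)        echo "usage: patrol_read_cmd stop|disable|throttle [rate]" >&2
                  return 1 ;;
    esac
}
```

Calling patrol_read_cmd throttle 5 prints the "5%" variant; once you're happy with the output you can run the printed command as root on the hypervisor.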

Never underestimate the power of hardware settings (in the BIOS or, in this case, in the hardware raid controller).

Hope it can help others too

September 23, 2015

Recently I had (from an Infra side) to start deploying KVM guests for the ppc64 and ppc64le arches, so that AltArch SIGs contributors could start bootstrapping CentOS 7 rebuild for those arches. I'll probably write a tech review about Power8 and the fact you can just use libvirt/virt-install to quickly provision new VMs on PowerKVM , but I'll do that in a separate post.

In parallel to ppc64/ppc64le, armv7hl interested some Community members, and the discussion/activity about that arch takes place on the dedicated mailing list. It's slowly coming, and some users already reported having used it on some boards (but packages are still unsigned and there are no updates -yet- )

Last (but not least) in this AltArch list is i686 : Johnny built all the packages and they are already publicly available on buildlogs.centos.org , each time in parallel to the x86_64 version. It seems that respinning the ISO for that arch and some last tests would be the only things left to do.

If you're interested in participating in AltArch (and have a special interest in a specific arch/platform), feel free to discuss that on the centos-devel list !

September 16, 2015

So, thanks to the folks from Opennebula, we'll have another CentOS Dojo in Barcelona on Tuesday 20th October 2015. That event will be colocated with the Opennebulaconf happening the days after that Dojo. If you're attending the OpennebulaConf, or if you're just in the area and would like to attend the CentOS Dojo, feel free to register

Regarding the Dojo content, I'll be myself giving a presentation about Selinux : covering a little bit of intro (still needed for some folks afraid of using it , don't know why but we'll change that ...) about selinux itself, how to run it on bare-metal, virtual machines and there will be some slides for the mandatory container hype thing.
But we'll also cover managing selinux booleans/contexts, etc through your config management solution. (We'll cover puppet and ansible as those are the two I'm using on a daily basis) and also how to build and deploy custom selinux policies with your config management solution.

On the other hand, if you're a CentOS user and would like to give a talk yourself during that Dojo, feel free to submit one ! More information about the Dojo can be found on the dedicated wiki page

See you there !

September 09, 2015

In the last few days, I encountered a strange issue^Wlimitation with Ext4 that I wouldn't have thought of. I've used ext2/ext3/ext4 for quite some time, so I've been used to resizing the filesystem "online" (while "mounted"). In the past you had to use ext2online for that, then it was integrated into resize2fs itself.

The logic is simple and always the same : extend your underlying block device (or add another one), then modify the LVM Volume Group (if needed), then the Logical Volume, and finally run the resize2fs operation.
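A minimal sketch of that sequence, for the "add another block device" case. The function name, the device /dev/vdb and the vg_data/lv_data names are hypothetical placeholders, not the ones from the affected system. By default the helper only prints what it would run; set DRY_RUN=0 (as root) to actually execute.

```shell
# Sketch only: names passed to grow_ext4_online are hypothetical examples.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# usage: grow_ext4_online <new_pv> <vg> <lv> <extra_size>
grow_ext4_online() {
    new_pv="$1"; vg="$2"; lv="$3"; extra="$4"
    run pvcreate "$new_pv" &&                        # initialise the new block device
    run vgextend "$vg" "$new_pv" &&                  # add it to the Volume Group
    run lvextend -L "+${extra}" "/dev/${vg}/${lv}" &&  # grow the Logical Volume
    run resize2fs "/dev/${vg}/${lv}"                 # grow the mounted filesystem
}
```

For example, grow_ext4_online /dev/vdb vg_data lv_data 10G prints the four commands in order. If you extended an existing device instead of adding one, you would replace the first two steps with pvresize on that device.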

The limitation is that when the initial Ext4 filesystem is created, the number of GDT blocks reserved/calculated for that filesystem only allows it to be grown online by a factor of 1000.

Ouch, that system (CentOS 6.7) I was working on had been provisioned in the past for a certain role, and that particular fs/mount point was set to 2G (installed like this through the Kickstart setup). But eventually the role changed, so the filesystem was extended/resized several times, until I tried to extend it to more than 2TiB, which then caused resize2fs to complain ...
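The arithmetic behind that ceiling is easy to check. The sketch below uses a hypothetical helper name and assumes a growth factor of 1024 (my recollection of the mke2fs default; the post rounds it to 1000), so treat the exact factor as an assumption and pass your own as the second argument if needed.

```shell
# Sketch: largest size (in GiB) a filesystem can reach through online resize,
# given its initial size in GiB. The default factor of 1024 is an assumption;
# pass 1000 as second argument to match the rounded figure used in the post.
max_online_size_gib() {
    initial_gib="$1"
    factor="${2:-1024}"
    echo $(( initial_gib * factor ))
}
```

With the 2G filesystem from the Kickstart, max_online_size_gib 2 gives 2048 GiB, i.e. 2 TiB, which matches the point where resize2fs started to complain.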

So two choices :

you do it "offline" through umount, e2fsck, resize2fs, e2fsck, mount (but that's time-consuming)

you still have plenty of space in the VG, and you just want to create another volume with correct size, format it, rsync content, umount old one and mount the new one.
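The first ("offline") choice can be sketched as below, assuming the Logical Volume itself has already been extended. The function name, device and mount point are hypothetical; the helper prints the commands by default and only executes them with DRY_RUN=0.

```shell
# Sketch only: device and mount point are hypothetical examples.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# usage: resize_ext4_offline <device> <mountpoint>
# The && chain stops at the first failing step, so a filesystem that e2fsck
# is unhappy with is never resized or remounted automatically.
resize_ext4_offline() {
    dev="$1"; mnt="$2"
    run umount "$mnt" &&
    run e2fsck -f "$dev" &&      # forced check before resizing
    run resize2fs "$dev" &&      # grow to the full size of the device
    run e2fsck -f "$dev" &&      # check again after the resize
    run mount "$dev" "$mnt"
}
```

For example, resize_ext4_offline /dev/vg_data/lv_data /srv/data prints the five steps in the order listed above.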

That means that I learned something new (one learns something new every day !), and also that you need to keep that limitation in mind when using a kickstart that sets a fixed size for the filesystem (instead of using the --grow option).

Hope that it can help

September 02, 2015

As some initiatives (like Let's Encrypt as one example) try to force TLS usage everywhere, we thought about doing the same for the CentOS.org infra. Obviously we already had some x509 certificates, but not for every httpd server that was serving content for CentOS users. So we decided to enforce TLS usage on those servers. But TLS can obviously be used for other things than a web server.

That's why we considered implementing something for our Postfix nodes. The interesting part is that it's really easy (depending of course on the security level one may want to reach/use). There are two parts in the postfix main.cf that can be configured :

Let's start with the client/outgoing part : just adding a few lines to your main.cf will automatically configure it to use TLS when possible, but otherwise fall back to clear text if the remote server doesn't support TLS.
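A minimal sketch of what the client side can look like. Only smtp_tls_security_level = may is confirmed by this post; the log level line is my own addition (an assumption about what you may want, so that TLS sessions show up in the maillog), not necessarily what we deployed.

```
# /etc/postfix/main.cf, client/outgoing side (sketch)
# Opportunistic TLS: try TLS, fall back to clear text if the remote has none
smtp_tls_security_level = may
# Assumption: log a summary line for each TLS handshake
smtp_tls_loglevel = 1
```

Reload Postfix afterwards (for example with postfix reload) so that the change is picked up.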

The interesting part is the smtp_tls_security_level option : we decided to set it to may. That's what the official Postfix TLS documentation calls "Opportunistic TLS" : in short, it will try TLS (even with untrusted remote certs !) and will only fall back to clear text if no remote TLS support is available. That's the option we decided to use as it doesn't break anything, and even if the remote server has a self-signed cert, it's still better to use TLS with a self-signed cert than clear text, right ?

Once you have reloaded your postfix configuration, you'll directly see in your maillog that it will start trying TLS and deliver mails to servers configured for it :

Still easy, but here we also add our key/cert to the config. If you decide to use a cert signed by a trusted CA (like we do for centos.org infra), be sure that the cert is the concatenated/bundled version of both your cert and the CA chain cert. That's also documented in the Postfix TLS guide, and if you're already using Nginx, you already know what I'm talking about, as you have to do it there too.
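A minimal sketch of the server/incoming side, under stated assumptions: the file paths and the mail.example.org name are hypothetical placeholders, and choosing may for smtpd_tls_security_level mirrors the opportunistic choice made on the client side rather than quoting our actual config.

```
# /etc/postfix/main.cf, server/incoming side (sketch, hypothetical paths)
# Announce STARTTLS to remote clients, but don't require it
smtpd_tls_security_level = may
# Bundle: your own cert first, followed by the CA chain cert(s)
smtpd_tls_cert_file = /etc/pki/tls/certs/mail.example.org-bundle.crt
smtpd_tls_key_file = /etc/pki/tls/private/mail.example.org.key
# Assumption: log a summary line for each TLS handshake
smtpd_tls_loglevel = 1
```

Keep the comments on their own lines as shown, since main.cf does not treat a # after a value as a comment.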

If you've correctly configured your cert/keys and reloaded your postfix config, now remote SMTPD servers will also (if configured to do so) deliver mails to your server through TLS. Bonus point if you're using a cert signed by a trusted CA, as from a client side you'll see this :

This might be of interest to the Fedora Project community, so I’m pushing my own reference here to appear on the Fedora Planet. Much of the work happening in the CentOS GSoC effort may be useful as-is or as elements within Fedora work. (In at least one case, the RootFS build factory for Arm, the work is also happening partially in Fedora, so it’s a triple-win.)

May 20, 2015

As more and more people were showing interest in CentOS on the ARM platform, we thought that it would be a good idea to start trying building CentOS 7 for that platform. Jim started with arm64/aarch64 and got an alpha build ready and installable.

On my end, I configured some armv7hl nodes, "donated" to the project by Scaleway. The first goal was to init some Plague builders to distribute the jobs on those nodes, which is now done. Then working on a "self-contained" buildroot , so that all other packages can be rebuilt only against that buildroot. So building first gcc from CentOS 7 (latest release, better arm support), then glibc, etc, etc ... That buildroot is now done and is available here.

Now the fun started (meaning that 4 armv7hl nodes are currently (re)building a bunch of SRPMS) and you can follow the status on the Arm-dev List if you're interested, or even better, if you're willing to join the party and have a look at the build logs for packages that failed to rebuild. The first target would be to have a "minimal" install working, so basically having sshd/yum working. Then try other things like GUI environment.

As plague-server required mod_python (deprecated now) we don't have any Web UI people can have a look at. But I created a "quick-and-dirty" script that gathers information from the mysql DB, and outputs that here :