Post navigation

The AWS Elastic Filesystem (EFS) gives you an NFSv4-mountable file system with almost unlimited storage capacity. The filesystem I just created to write this article reports 9,007,199,254,739,968 bytes free. In human-readable format df -kh reports 8.0E (Exabytes) of available disk space. In the year 2019, that’s a lot of storage space.

In past articles I’ve shown how to create EFS resources manually, but this week I wanted to programmatically create EFS resources with Terraform so that I could easily create, test, and tear-down EFS and VM resources on AWS.

I also wanted to make sure that my EFS resources are secure, that only VMs within my Virtual Private Cloud (VPC) could access the EFS data, so that no one outside of my VPC could mount or otherwise access the data.

The file_system_id is automatically set to the efs-example resource’s ID, which ties the mount target to the EFS file system.

The subnet_id for subnet-efs is a separate /24 subnet I created from my VPC just for EFS. The ingress-efs security group is a separate security group I created for EFS. Let’s cover each one of these separately.

A separate EFS subnet

First off I’ve allocated a /16 subnet for my VPC and I carve out individual /24 subnets from that VPC for each cluster of VMs and/or EFS resources that I add to an AWS availability zone. Here’s how I’ve defined my test environment VPC and EFS subnet:

The EFS security group

Finally, I need a security group that only allows traffic between my test environment VMs and my test environment EFS volume. I already have a security group called ingress-test-env that is used to control security for my VMs. For EFS I create another security group that allows inbound traffic on port 2049 (the NFSv4 port), allows egress traffic on any port.

By setting the ingress-efs-test resource’s security_groups attribute to ingress-test-env this only allows network traffic to and from VMs in the ingress-test-env security group to talk to the EFS volume. If you use security_groups like this, you really lock down the EFS volume and you don’t need to set the cidr_blocks attribute at all.

Ceph is a distributed storage system that provides object, file, and block storage. On each storage node you’ll find a file system where Ceph stores objects and a Ceph OSD (Object storage daemon) process. On a Ceph cluster you’ll also find Ceph MON (monitoring) daemons, which ensure that the Ceph cluster remains highly available.

Rook acts as a Kubernetes orchestration layer for Ceph, deploying the OSD and MON processes as POD replica sets. From the Rook README file:

Rook deploys the PODs in two namespaces, rook-ceph-system and rook-ceph. On my cluster it took about 2 minutes for the PODs to deploy, initialize, and get to a running state. While I was waiting for everything to finish I checked the POD status with:

Final tasks

Now I need to do two more things before I can install Prometheus and Grafana:

I need to make Rook the default storage provider for my cluster.

Since the Prometheus Helm chart requests volumes formatted with the XFS filesystem, I need to install XFS tools on all of my Ubuntu Kubernetes nodes. (XFS is not yet installed by Kubespray by default, although there’s currently a PR up that addresses that issue.)

Make Rook the default storage provider

To make Rook the default storage provider I just run a kubectl command:

That updates the rook-ceph-block storage class and makes it the default for storage on the cluster. Any applications that I install will use Rook+Ceph for their data storage if they don’t specify a specific storage class.

Install XFS tools

Normally I would not recommend running one-off commands on a cluster. If you want to make a change to a cluster, you should encode the change in a playbook so it’s applied every time you update the cluster or add a new node. That’s why I submitted a PR to Kubespray to address this problem.

However, since my Kubespray PR has not yet merged, and I built the cluster using Kubespray, and Kubespray uses Ansible, one of the easiest ways to install XFS tools on all hosts is by using the Ansible “run a single command on all hosts” feature:

I’ve been setting up and tearing down Kubernetes clusters for testing various things for the past year, mostly using Vagrant/Virtualbox but also some VMware vSphere and OpenStack deployments.

I wanted to set something a little more permanent up at my home lab — a cluster where I could add and remove nodes, run nodes on multiple physical machines, and use different types of compute hardware.

Set up the virtual machines

To get started I used a desktop System76 Wild Dog Pro Linux box (4.5 GHz i7-7700K, 64GB DDR4) and my create-vm script to create six Ubuntu 18.04 “Bionic Beaver” VMs for the cluster:

The inventory.py script generates an Ansible hosts inventory file in inventory/mycluster/hosts.ini with all of your VM IP addresses.

I like to add one variable override to the bottom of hosts.ini which copies the kubectl credentials over to my host machine. That way I can run kubectl commands directly from my desktop. The extra lines to add to the bottom of hosts.ini are:

[all:vars]kubectl_localhost=true

Install Kubernetes

To install Kubernetes on the VMs I run the Kubespray cluster.yaml playbook:

Once the playbooks have finished, you should have a fully-operational Kubernetes cluster running on your desktop.

At this point you should be able to query the cluster from your desktop using kubectl. For example:

$ kubectl cluster-infoKubernetes master is running at https://192.168.122.251:6443coredns is running at https://192.168.122.251:6443/api/v1/namespaces/kube-system/services/coredns:dns/proxykubernetes-dashboard is running at https://192.168.122.251:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxyTo further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

I was looking for a way to automate the creation of VMs for testing various distributed system / cluster software packages. I’ve used Vagrant in the past but I wanted something that would:

Allow me to use raw ISO files as the basis for guest VMs.

Guest VMs should be set up with bridged IPs that are routable from the host.

Guest VMs should be able to reach the Internet.

Other hosts on the local network should be able to reach guest VMs. (Setting up additional routes is OK).

VM creation should work with any distro that supports Kickstart files.

Scripts should be able to create and delete VMs in a scripted, fully-automatic manner.

Guest VMs should be set up to allow passwordless ssh access from the “ansible” user.

I’ve previously used virsh’s virt-install tool to create VMs and I like how easy it is to set up things like extra network interfaces and attach existing disk images. The scripts in this repo fully automate the virsh VM creation process.

Sample Kickstart file

The Ansible user: Although I’d prefer to create the “ansible” user as a locked account,with no password just an ssh public key, Kickstart on Ubuntu does not allow this, so I do set up an encrypted password.

To set up your own password, use the encrypt-pw script to create a SHA512-hashed password that you can copy and paste into the Kickstart file. After a VM is created you can use this password if you need to log into the VM via the console.

To use your own ssh key, replace the ssh key in the %post section with your own public key.

The %post section at the bottom of the Kickstart file does a couple of things:

It updates all packages with the latest versions.

To configure a VM with Ansible, you just need ssh access to a VM and Python installed. on the VM. So I use %post to install an ssh-server and Python.

I start the serial console, so that virsh console $vmname works.

I add a public key for Ansible, so I can configure the servers with Ansible without entering a password.

Despite the name, the commands in the %post section are not the last commands executed by Kickstart on an Ubuntu 18.10 server. The “ansible” user is added after the %post commands are executed. This means that the Ansible ssh public key gets added before the ansible user is created.

To make key-based logins work I set the UID:GID of authorized_keys to 1000:1000. The user is later created with UID=1000, GID=1000, which means that the authorized_keys file ends up being owned by the ansible user by the time the VM creation is complete.

Create an Ubuntu 18.10 server

This creates a VM using Ubuntu’s text-based installer. Since the `-d` parameter is used,progress of the install is shown on screen.

create-vm script

# Copyright 2018 Earl C. Ruby III## Licensed under the Apache License, Version 2.0 (the "License");# you may not use this file except in compliance with the License.# You may obtain a copy of the License at## http://www.apache.org/licenses/LICENSE-2.0## Unless required by applicable law or agreed to in writing, software# distributed under the License is distributed on an "AS IS" BASIS,# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.# See the License for the specific language governing permissions and# limitations under the License.

This script will take an .iso file created by revisor and generate a VM from it.

OPTIONS: -h Show this message -n Host name (required) -i Full path and name of the .iso file to use (required) -k Full path and name of the Kickstart file to use (required) -r RAM in MB (defaults to ${RAM}) -c Number of VCPUs (defaults to ${VCPUS}) -s Amount of storage to allocate in GB (defaults to ${STORAGE}) -b Bridge interface to use (defaults to ${BRIDGE}) -m MAC address to use (default is to use a randomly-generated MAC) -v Verbose -d Debug modeEOF}

My Ubuntu 18.04 box has Thunderbird installed as the default mail client. I was a Thunderbird user for years, but I currently spend most of my time using GMail, and when I click on a email mailto: link on a web page Ubuntu will load Thunderbird.

The documented fix is to go to Settings > Details > Default Applications and pick a different mail client. However, I don’t want a mail client at all, I want mail links to go to my default browser (Firefox, on this machine), load GMail, and open a to email “to” the name in the link.

The next time you click a mailto: link Desktop Webmail will ask you what web mail service you want to use. Desktop Webmail currently supports Gmail, Hotmail, Yahoo and Zoho. Select Gmail and it’ll pop up a new email message using GMail, set the “to” address to the mailto: link, using your preferred browser.

I just upgraded from Ubuntu 17.04 to 17.10 and one of the first things I noticed was all of the disk volumes that are mounted under my home directory appeared on my desktop. In Ubuntu 17.10, all volumes that are mounted under /home or /media appear on your desktop, and none of the switches in the Settings tool will make them go away.

The names of the folders aren’t even useful. They’re names like 10GB Volume and 20GB Volume. If you have two volumes the same size they’ll both have the same useless name. No hint of where the volume is mounted appears.

I have files, documents, databases, and email going back 20 years, much of it archival data that I want to be able to search but which never gets updated, so I keep these archive directories on separate read-only logical volumes. If my home directory’s file system gets corrupted beyond repair, the archives will still be intact. Since the volumes are read-only a misbehaving program or command-line oops won’t destroy the data.

But I don’t want to see them all over my desktop.

Tweak tool to the rescue! Install the tool and run it:

sudo apt install gnome-tweak-tool
gnome-tweak-tool

Then:

Desktop > Mounted Volumes > Off

No more volume icons on the desktop!

gnome-tweak-tool has other useful settings that are absent from the Settings tool, such as giving you the ability to move the window buttons to the upper left side of your windows.

Want to make the icons on your desktop smaller? Open up the File Manager, browse to Desktop, and select the icon size you want by moving the slider bar. The size of the icons on your Desktop and the size in the File Manager’s Desktop folder both use the same setting.

The latest version of IOS for iPhone, iPad, and iWatch devices really, really wants you to set up Apple Wallet / Apple Pay. Your devices want you to embed Apple in the middle of every purchase, and they’ll pester you every time you pick up your device with full-screen dialogs insisting that you need to finish setting up Apple Wallet. Right. Now.

I try to limit the number of things that can access my money, and I have no reason or desire to set up Apple Wallet, so I turned the notifications off. They are embedded in three separate places on the iPhone.

CockroachDB is new distributed database which, like its namesake, is really hard to kill.

CockroachDB implements SQL DML commands for creating schemas, tables, and indexes using the same syntax as PostgreSQL, and it supports the PostgreSQL wire protocol, meaning that any PostgreSQL database driver or client can be used to connect to a CockroachDB database. If you’re currently using PostgresSQL and you want an easier scale-out, highly-available way to deploy a database, you should take a look at CockroachDB. In many cases you can just repoint your application at a CockroachDB server and your application will run the same as it did using PostgreSQL.

The first day I tried using CockroachDB I got a six-node system up and running using CockroachDB’s Docker image on my Apcera cluster using AWS EFS as a backing store in less than an hour. This is what I did to get it working.

Create a namespace and a private network

“roachnet” is a private VxLAN created by the Apcera platform that only containers that I’ve joined to the network can see.

Create the first CockroachDB node

Next I create a container instance called “roach1” from the latest Docker image, open ports 8080 and 26257, tell it to use the EFS provider for storage, and to advertise itself to other CockroachDB nodes so they can find it and join the DB cluster.

I added the “sleep 3” command because when I originally tested this (on CockroachDB 1.1.0) the platform started the containers so fast that the DB got confused and didn’t add all of them to the cluster. All nodes started, but only some joined the cluster. After I added the delay all nodes joined the cluster.

Verify that the containers are all running.:

After that the cluster was up and running. I could connect to the database, create schemas, create tables, add, update, and delete records. I’m pretty happy with the initial results. Next step is automatically generating secure certificates so I’m not operating in insecure mode, then I’m going to run actual applications against the cluster.

Hope you found this useful.

CockroachDB overview screen

CockroachDB Storage Screen

CockroachDB Queues

I spin up a lot of VMs using VMware Fusion. I generally keep “clean” generic copies of a few different distros and versions of Linux servers ready to go with my login, an sshd server, ssh keys, and basic settings that I use already set up. When I need to quickly test something manually — usually some new, multi-VM distributed container orchestration or database system — I just make as many copies of the server’s *.vmwarevm file as I need, fire up the VM copies on my laptop, test whatever I need to test, then shut them down. Eventually I delete the copies and recover the disk space.

Depending on where my laptop is running I’ll get a completely random IP address for the VM from the local DHCP server. I would log into the consoles, get the IPs, then log into the various VMs from a terminal. (Cut and paste just works a whole lot better on a terminal than on the VMware console.)

However, since the console screens are up, and I repeat this pattern several times a week, I figured why not save a step and make the ephemeral VMs just show me their IP address on their consoles without having to login, so I added an “on reboot” file called /etc/cron.d/welcome on the master image which updates the /etc/issue file.

When a new VM boots, it writes the hostname, kernel info, and the ethernet config to the /etc/issue file. /etc/issue is displayed on the screen before the login prompt, so now I can just glance at the console, see the IP address, and ssh to the new VM.

Although you’d never want to do this on a production system, it works great for ephemeral, throw-away test VMs.