Amazon announced the development of the Amazon Elastic File System (AWS EFS) in 2015. EFS was designed to provide multiple EC2 instances with shared, low-latency access to a fully-managed file system. On June 28, 2016, Amazon announced that EFS was available for production use in the US East (Northern Virginia), US West (Oregon), and Europe (Ireland) Regions.

Apcera’s NFS Service Gateway can be used to access AWS EFS storage volumes within containers. You can use EFS to provide persistent storage to your containers running on AWS-hosted clouds in regions where EFS is available.

Gathering information

Before you begin you will need to know:

The name of the AWS Region where your Apcera Platform is running

The name/ID of the AWS VPC where your Apcera Platform is running

The name/ID of the AWS security group for your Apcera Platform

Setting up an EFS volume

Log into your AWS console.

Select the name of the AWS Region where your Apcera Platform is running on the upper right side of the screen.

Select Elastic File System.

Click Create File System.

Configure the file system access:

Select the name of the VPC.

The availability zone and subnet should be selected for you automatically.

If your VPC has more than one subnet (unusual) then select the subnet containing the Instance Managers that will be connecting to the EFS volume.

Leave IP address set to Automatic.

The first EFS volume you create will create a new security group. Use that security group for this and all future EFS volumes. Write down the name of the new EFS security group – we’ll configure it in the next few steps.

You should see a “Success!” message and a new EFS volume with “Life Cycle State” = “Creating”.

Write down the IP address of the EFS volume.

Update the EFS security group

Go back to the main console menu and select EC2.

Click Security Groups in the left hand nav menu.

Type the name of the new EFS security group into the search filter list.

On the bottom half of the screen delete the default inbound and outbound rules.

Add one inbound rule to allow all TCP traffic on port 2049 from the source “name/ID of the AWS security group for your Apcera Platform”

Add one outbound rule to allow all TCP traffic on port 2049 to the destination “name/ID of the AWS security group for your Apcera Platform”

This allows all VMs within your Apcera Platform security group to connect to your EFS volume on port 2049 (NFS).

No other traffic from any other source or to any other destination is allowed.

Create an NFS Provider for the EFS volume

We’re going to create a single provider for the EFS volume. Each time you have a container or set of containers that need a persistent file system, just create a new service from the same provider. Each new service will carve out a new namespace on the EFS volume, keeping the files associated with that service separate from the files in all other services that use the same provider.

According to the EFS FAQ, when you create a file system, you create endpoints in your VPC called “mount targets.” Each mount target provides an IP address and a DNS name, and you use this IP address or DNS name in your mount command. Only resources that can access a mount target can access your file system. Since the Apcera Platform isn’t using Amazon DNS services internally, we’ll use the IP address to connect to the EFS volume.

To create the provider, you need to construct a URL describing the volume. In this case, we’ll use the internal IP address of the EFS volume as the hostname and / as the exported volume name. All EFS volumes use the NFS v4.1 protocol. If the IP address of the EFS volume is 10.0.0.112 we’d construct a provider using:

You can bind this service to any container that needs a shared, persistent file system. Each time you need a new shared, persistent file system for a container or group of containers just create a new service using the same provider and bind the service to your job or jobs.

Persistence for Docker

Now that we have a provider that can carve out EFS storage for containers, let’s try spinning up some Docker images.

On the Apcera Platform, if the specification for a Docker image (Dockerfile) specifies that the app requires persisted volumes, you must do one of the following when creating the job:

Include the --provider flag when you create or run the Docker job. You must include this flag if you include the --volume flag when creating or running the Docker job.

Include the --ignore-volumes flag when you create or run the Docker job.

Here is an example of running NGINX inside a Docker container on the Apcera platform, where the content for the site is stored on an EFS volume:

I’m using the Apcera “apc” command-line tool to build the container, pulling the nginx image directly off hub.docker.com, telling it to use the awsefs EFS volume provider I created earlier for persistence, and to mount the EFS volume at the mount point “/usr/share/nginx/html”.

Now connect to the container:

/proc/mounts contains a list of all of the container’s mount points. I can verify that the container does indeed have an EFS volume by grepping /proc/mounts for the mount point:

Grepping for “/usr/share/nginx/html” shows the IP address 10.0.0.112, which is the IP of the EFS volume; the long directory name after it, which is the unique namespace for the service; the mount point “/usr/share/nginx/html”; and the mount type “nfs4”.
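The same check can be scripted. Here is a minimal sketch, assuming a POSIX shell with awk available in the container. The function name mount_info and its optional second argument (a file to read instead of /proc/mounts) are my own inventions for illustration, not part of apc or EFS.

```shell
#!/bin/sh
# mount_info MOUNTPOINT [MOUNTS_FILE]
# Prints "source fstype" for the given mount point, reading /proc/mounts
# by default. Returns non-zero if the mount point is not listed.
# (mount_info and the MOUNTS_FILE argument are hypothetical helpers.)
mount_info() {
    awk -v mp="$1" '
        $2 == mp { print $1, $3; found = 1 }   # field 2 is the mount point
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}
```

Inside the container, running mount_info /usr/share/nginx/html should print the NFS source (the EFS IP address plus the namespace directory) followed by the file system type.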

There is no content in the directory, so I add some by echoing some HTML code to an index.html file. My container will proclaim to the world “NGINX in a Docker container on Apcera with content stored on EFS” in an H3 typeface!

Now that I have some content I need to add a route to the content. Right now the NGINX container is running, and listening on ports 80 and 443, but it’s completely isolated from the outside world — no one can connect to those ports unless there’s a route (a URL) set up.

My cluster is running on the domain earlruby.apcera-platform.io, so I add a route like so:

I have successfully added the http route http://nginx.earlruby.apcera-platform.io/ to my NGINX container. This is a real public DNS entry. To verify that it works I point my browser at the route I just added:

Success!

Such an amazing app is bound to go viral, and a single NGINX container may not be able to keep up with the load. I want to ensure that my app can keep up and remain highly-available, and that it keeps running even if one or more VMs in my cluster get killed off, so I add more NGINX containers:

Now I’ve got 20 containers running my NGINX app, all serving up the same content, running on multiple VMs across my cluster, all load-balanced under the single URL http://nginx.earlruby.apcera-platform.io/. If any container gets killed off, the Apcera platform will spin up a new one. If any VM in the cluster dies, any containers running on it will automatically be migrated to new hosts. If I want to scale up the app to 100 or 1000 containers, or back down to 1, it’s a one-line command to make the change.

In terms of resources, I’m using slightly less than 45 MiB to run those 20 containers. That’s not a typo — 45MiB! Containers are much more efficient users of RAM than VMs.

Of all of the social networks, engagement on Twitter is dismally low. Even the people who like the app don’t spend nearly as much time on Twitter as they do on other social media. There are some obvious problems with the app that Twitter could fix, but they don’t.

Which one will you click?

Until a few months ago, Twitter’s iPhone app didn’t support badge notifications. A badge is the small red number that appears on the app’s icon letting you know that you have Notifications. Twitter’s iPhone app didn’t have them. You could look at your phone’s screen and see that Facebook, LinkedIn, MeetUp, and NextDoor had messages waiting for you, but not Twitter. A glance at your screen and the small red numbers taunt you – Check Facebook! Check your email! Check Messages! Badges are a simple way to get you to start up that app and engage.

1. Fix Notifications

With a recent update, Twitter finally added badge notifications. Only problem is, they don’t actually work. The badge will appear with a “2” on it, I’ll start the Twitter app, and the “Notifications” icon indicates that there’s something new. I click it, and there are no updates. I can check the “Me” link and see that I have 2 more followers, but they’re not listed on the Notifications screen. If I log into the TweetDeck web app I can see who the new followers are, but the iPhone mobile app pretends they don’t exist.

2. Make it easy to engage with friends

Ever have a conversation with a friend on Twitter? It’s next to impossible to follow replies, comments, or have any sort of conversation using the tools they provide on their mobile app. If Twitter wants to increase user engagement, they should get rid of the Messages tab and make it a “Mentions” tab that shows private messages and allows for threaded conversations. Alternately they could add a “swipe left” feature or even a “view replies” icon to view the replies to a message, in threaded order. Make it possible for people to have a conversation about the things that they’re posting, and they’ll stay engaged for longer.

3. Show me what posts are trending

The Home button shows me the latest messages from everyone I’m following in order by time posted. What if I want to see the posts with the most retweets? Or the most hearts? Or the most replies? You know, the messages from the people I’m following that are the most interesting/funny/relevant? Get rid of the “Moments” section and give me a “Trending” section that shows the items from the people I follow with the most retweets, likes, and replies. I guarantee I’ll spend more time in that section than I do looking at “Moments”.

4. Load more items in the “Home” feed

I use an iPhone with “service” from AT&T. I also ride BART, which means that I spend about half my commute with no data service. (Newsflash to AT&T: People on trains spend most of their time on their phones. If you cared about your customers you’d send a tech to ride a train with a signal strength meter a couple of times a year and fix the dead zones.)

Since mobile data service is spotty, you’d think the Twitter app would start downloading items for my “Home” feed whenever I have a signal, so I’d never run out of items to read. Unfortunately that’s not the case, and I routinely hit the end of the list of things on “Home” to read just as BART enters another AT&T dead zone. I sit there watching the spinner for a few seconds, then quit Twitter and load another app – one that was smart enough to download content in the background so it’s ready for me to view.

5. Cache some damn profile pictures

I only follow 361 people on Twitter. Each one of them has a small profile picture that rarely gets updated, so why does Twitter download and render the same profile photos every time I open the app? I’ll be scrolling along, and I can see it download and render the photos one by one. If I’m in an AT&T dead zone, I’ll just see a bunch of empty boxes instead of profile pictures in my feed. How hard could it be to cache a copy of the photos on my phone? The app can always check for new photos and update them if one is available, so why is it downloading them every time I open the app? If I don’t have a connection at the moment it’s OK to show me someone’s 24-hour-old profile picture – it’s better than showing me an empty box.

That’s it. 5 simple things that Twitter engineers could fix this week to increase the amount of time people spend using their mobile app.

Full disclosure: I own shares of Twitter stock. If someone at Twitter fixed these problems I might be making less of a loss on those shares. In addition, the Twitter stockholders meeting is this week. I won’t be there, but if you are, feel free to share this article with the people in attendance.

This is a talk I gave last week at the SF Microservices Meetup titled Policy-based Cloud Storage, Persisting Data in a Multi-Site, Multi-Cloud World. In it I cover Apcera’s approach to storage for containers and how to use policy to manage very large scale application deployments.

I have an Ubuntu 15.04 “Vivid” workstation already set up with LUKS full disk encryption, and I have a Synology DS414 NAS with 12TB raw storage on my home network. I wanted to add a disk volume on the Synology DS414 that I could mount on the Ubuntu server, but NFS doesn’t support “at rest” encrypted file systems, and using EncFS over NFS seemed like the wrong way to go about it, so I decided to try setting up an iSCSI volume and encrypting it with LUKS. Using this type of setup, all data is encrypted both “on the wire” and “at rest”.

Log into the Synology Admin Panel and select Main Menu > Storage Manager:

I needed to add some sudo access rights for support personnel on about a hundred CentOS 6.6 servers. Until now no one on these hosts had sudo rights, so the /etc/sudoers file was the default file. I’m using Ansible to maintain these hosts, but rather than modify the default /etc/sudoers file using Ansible’s lineinfile: command, I decided to create a support.conf file and use Ansible’s copy: command to copy that file into /etc/sudoers.d/. That way, if a future version of CentOS changes the /etc/sudoers file, I’m leaving that file untouched, so my changes should always work.

The support.conf file I created copied over just fine, and the validation step of running “visudo -cf” on the file before moving it into place claimed that the file was error-free and should work just fine as a sudoers file.

I logged in as the support user and it didn’t work:

[support@c1n1 ~]$ sudo /bin/ls /var/log/*
support is not in the sudoers file. This incident will be reported.

Not only did it not work, it was telling me that the support user wasn’t even in the file, which they clearly were.

After Googling around a bit and not finding much I saw this in the Sudoers Manual:

sudo will read each file in /etc/sudoers.d, skipping file names that end in ‘~’ or contain a ‘.’ character to avoid causing problems with package manager or editor temporary/backup files.

sudo was skipping the file because the file name contained a period!

I changed the name of the file from support.conf to support and it worked.
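A quick pre-flight check can catch this before a file is deployed. Below is a minimal sketch in POSIX shell that encodes only the rule quoted from the manual above (names ending in ‘~’ or containing a ‘.’ are skipped). The function name sudoers_d_skipped is hypothetical, and this is not a substitute for testing sudo itself.

```shell
#!/bin/sh
# sudoers_d_skipped NAME
# Returns 0 (true) if sudo would skip a file with this name in
# /etc/sudoers.d, per the manual text quoted above: names ending in '~'
# or containing a '.' are ignored. Only the file name is checked, not
# the directory part. (Function name is hypothetical.)
sudoers_d_skipped() {
    case "$(basename "$1")" in
        *"~" | *.*) return 0 ;;
        *)          return 1 ;;
    esac
}
```

For example, sudoers_d_skipped support.conf succeeds (the file would be skipped), while sudoers_d_skipped support fails (the file would be read).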

Word to travelers: do not book hotel rooms through TripAdvisor. They will funnel you through sketchy third-party sites (“Amoma” is the one who burned me) who advertise made-up rates, take your money, and then get back in touch two weeks later to tell you oopsie, they can’t make a reservation at that hotel after all.

I guess it’s a nice scam while it lasts, but in this age of networked, instant word-of-mouth reviews, that kind of business model won’t hold up long.

I suggested Shannon try installing the Web of Trust (WOT) plug-in for her browser. I use it in all of mine, and it’s stopped scam sites from being loaded into my browser.

WOT displays a colored traffic light next to website links to show you which sites people trust for safe searching, surfing and shopping online: green for good, red for bad, and yellow as a warning to be cautious. The icons are shown in popular search engine results, social media, online email, shortened URL’s, and lots of other sites.

The cool part is, the rating is based on the aggregate ratings of all of the people who use the plug-in. Get burned by a site? Click the WOT icon and rate the site as untrustworthy. Have an excellent experience? Click the WOT icon and rate the site as trustworthy. The more that people use it, the more accurate and reliable the ratings become.

If a site is really untrustworthy, WOT will stop your browser from loading the site unless you tell it that you really want to go to that site. You can still go anywhere you want, but you’ll be warned about sites that others have had problems with.

I’m using Ansible to set up the network interface cards of multiple racks of storage servers running CentOS 6.6. Each server has four network interfaces to configure: a public 1GbE interface, a private 1GbE interface, and two 10GbE interfaces that are set up as a bonded 20GbE interface with two VLANs assigned to the bond.

If Ansible changes an interface on a server it calls a handler to restart the network interfaces so the changes go into effect. However, I don’t want the network interfaces of every single server in a cluster to restart at the same time, so at the beginning of my network.yml playbook I set:

serial: 1

That way Ansible just updates the network config of one server at a time.

Also, if there are any failures I want Ansible to stop immediately, so if I screwed something up I don’t take out the networking to every computer in the cluster. For this reason I also set:

max_fail_percentage: 1

If a change is made to an interface I’ve been using the following handler to restart the interface:

- name: Restart Network
  service: name=network state=restarted

That works, but about half the time Ansible detects a failure and drops out with an error, even though the network restarted just fine. Checking the server immediately after Ansible says that there’s an error shows that the server is running and its network interfaces were configured correctly.

This behavior is annoying since I have to restart the entire playbook after one server fails. When configuring many racks of servers with the network setup updating just one server at a time, I’d end up having to restart the playbook a half dozen times to get through it, even though nothing was actually wrong.

At first I thought that maybe the ssh connection was dropping (I was restarting the network after all) but you can log in via ssh and restart the network and never lose the connection, so that wasn’t the problem.

The connection does pause as the interface that you’re ssh-ing in over resets, but the connection comes right back.

I wrote a short script to repeatedly restart the network interfaces and check the exit code returned, but the exit code was always 0, “no errors”, so network restart wasn’t reporting an error, but for some reason Ansible thought there was a failure.
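A test loop of that kind might look like the following. This is a minimal sketch in POSIX shell, not the original script. The function name count_failures is hypothetical.

```shell
#!/bin/sh
# count_failures N COMMAND [ARGS...]
# Runs COMMAND N times and prints how many runs returned a non-zero
# exit code. (count_failures is a hypothetical helper, not the
# author's original script.)
count_failures() {
    n=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$n" ]; do
        # Discard the command's output; only the exit code matters here.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}
```

Run as root, count_failures 20 service network restart would restart the network 20 times and print the number of restarts that reported an error.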

There’s obviously some sort of timing issue causing a problem, where Ansible is checking to see if all is well, but since the network is being reset the check times out.

I initially came up with this workaround:

- name: Restart Network
  shell: service network restart; sleep 3

That fixes the problem. However, since “sleep 3” will always exit with a 0 exit code (success), Ansible will always think this worked even when the network restart failed. (Ansible takes the last exit code returned as the success/failure of the entire shell operation.) If “service network restart” actually does fail, I want Ansible to stop processing.

In order to preserve the exit code, I wrote a one-line Perl script that restarts the network, sleeps 3 seconds, then exits with the same exit code returned by “service network restart”.
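The same idea can be expressed in plain shell. The sketch below is a shell equivalent of the approach, not the Perl one-liner itself: it saves the command’s exit code, sleeps, and then returns the saved code. The function name run_then_pause is hypothetical.

```shell
#!/bin/sh
# run_then_pause SECONDS COMMAND [ARGS...]
# Runs COMMAND, sleeps for SECONDS, then returns COMMAND's exit code
# instead of sleep's. (run_then_pause is a hypothetical helper.)
run_then_pause() {
    delay=$1
    shift
    "$@"
    rc=$?            # capture the exit code before sleep overwrites $?
    sleep "$delay"
    return "$rc"
}
```

In a handler, the same pattern could presumably be written inline as: service network restart; rc=$?; sleep 3; exit $rc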

A new company called Peerio is promising secure, easy messaging and file sharing for everyone. They’re building apps that encrypt everything you send or share, making the code for these apps open source, and paying for security audits to peer-review the source code, looking for security weaknesses.

They’ve put together a short video to explain the basics of what they offer. I thought I’d give it a try and see how it works.

I went to Peerio.com using the Chrome browser, so the home page automatically offered to install Peerio on Chrome.

I clicked the install button and Peerio popped up as a new Chrome app.

Clicking the app brought up the new account screen, with the word “beta” displayed in small type just under the company logo, so they’re letting me know up front that this is going to be a little rough.

I clicked Sign Up, added a user name and email address, and was prompted for a pass phrase.

I have a couple of pass phrases I use. I typed one in, but apparently it wasn’t long enough. I tried another and another. Not long enough. The words “ALMOST THERE. JUST A FEW MORE LETTERS…” appeared on screen. One phrase I typed in had 40+ letters in it, but still the words “ALMOST THERE. JUST A FEW MORE LETTERS…” persisted. Tried again, this time putting spaces between the words. Phrase accepted! Maybe the check is trying to verify the number of space-separated words, not the total number of characters? Anyhow, got past that hurdle.

Next it sends you an email with a confirmation code and gives you 10 minutes (with a second by second countdown) to enter the confirmation code. I guess if you don’t enter it within 10 minutes your account is toast?

Once past that step I was prompted to create a shorter PIN code that can be used to log in to the site. The long pass phrase is only needed to log in the first time you use a new device; after that your PIN can be used. I tried entering a few short number sequences. All were rejected as “too weak” so I used a strong, unique password with a mix of upper and lowercase letters, numbers, and special characters. The screen hid what I was typing and only asked for the PIN once, so if I thumb-fingered it, my account was going to be rendered useless pretty quickly. Hopefully I typed what I thought I typed.

Of course to use the service to send messages to people you have to load your contacts in. I added a friend’s email and Peerio sent him an invite. Tried adding another email address and the “Add Contact” form cut me off at the “.c” in “.com” — looks like the folks at Peerio only let you have friends with email addresses that are less than 16 characters long. My friends at monkeybots.com, you’re out of luck.

The Contacts tab has sub-tabs for “All Contacts”, “Confirmed Contacts”, and “Pending Contacts”, but the one email address I entered that was less than 16 characters long didn’t show up anywhere (I expected to see it under “Pending Contacts”). With my entries disappearing or truncated, I stopped trying to use the system.

Before the developers pay for another security audit, they really ought to try doing some basic usability testing — set up a new user in front of a laptop, and make two videos — one of the keyboard and screen and one of the user’s face, and then watch them try to log in and set up an account. I think they’d find the experience invaluable.

Anyhow, if you’re interested and feel like trying out their very BETA (feels like ALPHA) release, head over to Peerio.com and sign up. If you want to send me a message, you can reach me on Peerio as “earl”.

Google “How do I mount an ISO image in Linux” and most of the links still say to use “-t iso9660”. For example:

mount -t iso9660 -o loop,ro diskimage.iso /mnt/iso

That worked fine 10 years ago, but these days not all ISOs use ISO9660 file systems. Many use the UDF (Universal Disk Format) file system, and if you specify ISO9660 when mounting a UDF ISO file, subtle problems can occur. For instance, file names that contain upper case letters on a UDF file system will appear in lower case when that ISO is mounted using ISO9660.

On any modern Linux distro mount is smart enough to figure out what type of file system to use when mounting an ISO file, so it’s perfectly fine to let mount infer the type, e.g.:

mount -o loop,ro diskimage.iso /mnt/iso

Here’s an example of what happens when you try to mount a type UDF ISO as type ISO9660. Note that the case of the file names changes to all lower case when mounting as iso9660, which in this case causes subtle errors to occur within the software.

I just recently heard about CCMixter.org on FLOSS Weekly. CCMixter.org is a resource and collaborative space for musicians and remixers. They have thousands of music tracks which can be downloaded, remixed, sampled, or streamed.

I recently did a fresh install of Ubuntu on the computer I was using, and clicking on any of CCMixter’s streaming links caused a window to pop up asking me if I wanted to play the stream using Rhythmbox or “Other”. Selecting Rhythmbox popped up Rhythmbox, but it wouldn’t play the stream. Googling around a bit led me to discussions of Rhythmbox brokenness going back to 2008, so I took a different tack.