
This is the first webinar I did for BigML. I’m pretty sure this is the one where one of the people attending left a comment that they expected the webinar to be live and not recorded. Of course, it *was* live! I had just practiced it over and over until it sounded like a perfected script. It’s entertaining for me to watch now and see my early attempts at making nice Keynote animations as well as the somewhat overly dramatic tone.

Still, the content is good, although these days I tend to just go straight into the UI and do an actual demo without all the intro slides.

Since I currently work at a Machine Learning company, it may surprise some to find out that I am currently enrolled in Andrew Ng’s Machine Learning class thru Coursera. I am taking the class because I want to be able to…

The Basics

Flip the switches to turn off the lights – it is harder than you think!

The goal of De-Lite is to turn off all of the light bulbs. Unfortunately, the electrician who set up the light switches had far too much free time and has made a bit of a mess of the wiring.

Each switch on the left is connected to only some of the light bulbs. The connected bulbs for a switch are indicated by an X in the switch's row. When the switch is flipped, each connected bulb will change state: if it was on, it will turn off; if it was off, it will turn on. The right combination of switch states will turn all the bulbs off.

Each game is guaranteed to be solvable.

Each game has a unique string that identifies the setup, so they can be replayed or shared.

Log in to Facebook and share your wins!

Background texture is customizable.

Game Sharing

If you beat a hard puzzle and you want to let your friends try it, just share your win on your Facebook wall. The post will embed a link that other users can click on from their mobile device to start the same puzzle.

UPDATE

Startcom is now at this URL. And although the certs are still free, they no longer work with things like Chrome, or Safari, or Firefox… which doesn’t leave much 😉

I’m still investigating other free SSL services…

Why do you need an SSL certificate?

Short answer: to enable encryption! For example, if you are running a website and *any* part of it requires authentication, then you should enable SSL. For the whole site. No, you can’t just enable SSL for the authentication piece and then use cookies, because the session can be hijacked. And no, session hijacking is not hard; here is a Firefox plugin that will do it: Firesheep

Why you need a valid certificate – or – why you can’t use a self-signed certificate

An SSL certificate serves two important functions. The first, which you already know, is that it enables encrypted communication. But the second, and often overlooked, function is that the SSL certificate verifies the identity of the other party you intend to communicate with. This is easily as important as the encryption itself. Without the verification step, shady people could fool your computer into passing data thru a third party in an attack vector referred to as a man-in-the-middle attack.

But how does this verification work? The short answer is digital signatures and trust. When you connect to an encrypted web site, the server sends you a certificate that is signed by a third party. Your browser then verifies this signature against a local database of certificate identities, all of which your browser trusts. (Note: Why does your browser trust them? The author of the browser trusts the certificates and so included them with the install package). If the certificate has a valid signature from a third party that your browser trusts, then it will trust the remote server.

If you imagined this exchange as a friendly conversation, it might go like this:

you: “Hey are you Bob.com?”
Bob.com: “I sure am – you can see here that Joe.com trusts me – I have his signature here.”
you: “Let me check – yup, that’s Joe.com’s signature all right. If he trusts you, you must be ok”.
Bob.com: “Great – let’s go off the record!”

That’s how a valid certificate works. But when you want to run your own site, you have to get Joe.com to trust you, which usually involves money. After all, Joe.com wants to get paid for going to the trouble of verifying that you really are Bob.com.

Of course, you don’t have to use a valid certificate; you can self-sign the certificate instead. This just means that you create your very own certificate authority and then add your own signature. Now the conversation goes like this:

you: “Hey are you Bob.com?”
Bob.com: “I sure am – you can see here that I trust myself – I have my signature here.”
you: “Uhhh. Let me throw an error up on my user’s screen. Well, he clicked thru the warnings, so… I guess you must be ok.”
Bob.com: “Great – let’s go off the record!”

This does enable encryption, but it’s going to throw a lot of your users off, since they will probably not fully understand why they are getting the error. Of course, if you are just running a site for yourself or some friends, this is probably not a problem; in fact, in that case you can install the certificate (by trusting it permanently) and then you won’t get errors anymore.

This is what I did for years for my own sites. But it always bothered me. And it causes weird problems that can waste a lot of time to troubleshoot. For example, the Flash uploader for WordPress just does not work with a self-signed certificate (See: No-SSL-Flash-Upload). But the inconvenience was never serious enough to pony up the $15/yr (or more!) for the privilege of having a valid certificate.
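For the record, going the self-signed route takes only a couple of commands. A sketch using openssl, where the file names and the CN (example.com) are placeholder values:

```shell
# Generate a self-signed certificate, valid for one year.
# File names and the CN (example.com) are placeholders.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout selfsigned.key -out selfsigned.crt \
  -subj "/CN=example.com"

# Inspect the result: the issuer and subject are the same party.
openssl x509 -in selfsigned.crt -noout -subject -issuer
```

Because the issuer and subject are identical, browsers have no trusted third party to check the signature against, which is exactly why they throw the warning described above.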

Enough already – Where do I get one?

So, about once a year I got annoyed with self-signed certificates and asked Google to help me out. And this time, there was a hit! Here it is:

And why is it free? I love this: “Because we believe in the right to protect and secure information between two entities without discrimination of race, origin and financial capabilities.” Yes! I love these guys already!

What you get for free is a basic SSL certificate. In the case of a certificate for a web server, you get protection on the base domain name plus one hostname, for example “alleft.com” and “www.alleft.com”. This is a nice feature because you can run SSL on the domain name and do a redirect to www without breaking encryption. You can also get an S/MIME cert for email, a certificate for XMPP, and an Object Code Signing cert. Did I mention it’s free?

The setup process was a little bit convoluted, but not impossible to follow. Basically, after setting up your account the website will install an Authentication Certificate into your browser so you won’t need to remember a username and password. However, if you want to login from another computer, you’ll need to export this certificate and copy it to the new computer.

Once your account is set up, you then need to prove you own the domain you want a certificate for using the “Validation Wizard” – the standard tests are available: email address, domain name, personal identity, etc. Then you can go thru the “Certificates Wizard” and get your key and cert set up.
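If you would rather generate the private key and certificate request yourself instead of letting the website do it, the openssl side looks something like this (a sketch; the domain name is just the example domain from above):

```shell
# Create a 2048-bit private key and a certificate signing request (CSR).
# "alleft.com" is a placeholder domain.
openssl genrsa -out alleft.com.key 2048
openssl req -new -key alleft.com.key -out alleft.com.csr -subj "/CN=alleft.com"

# Sanity-check the CSR before submitting it.
openssl req -in alleft.com.csr -noout -verify
```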

I can confirm that the certificate is real and has been working great – and you can try it for yourself by hitting this site with https if you like. It seems that the intermediate cert is not recognized by my Android phone, which is unfortunate, but will likely be resolved in future versions of the built-in browser.

I have a lot of accounts, and I’m willing to bet that you do as well. Banking websites, social websites, work accounts, wordpress accounts, DNS registrars, the list goes on and on. Each of these accounts typically requires a password, which makes for a lot of passwords. I could use the same password everywhere, or even just a small set of passwords, but the risk of having all my accounts compromised would increase with each account added that uses the same password. In order to reduce this risk, I keep my passwords unique to each account as much as possible.

Surprisingly, I manage to remember quite a few of them. But occasionally, especially for accounts that I don’t access often, I forget them. Since I’m not about to write my passwords on sticky notes and put them on my monitor or in a drawer, I use a program that stores my library of passwords in an encrypted file. This way I have one “master password” that gives me access to the hundreds (yes, hundreds) of passwords I have.

Many years ago I used a program called STRIP (Secure Tool for Recalling Important Passwords) on my Palm device. This program worked great and I used it for about 9 years. However, a year ago I switched to an Android phone, and STRIP is not supported on Android. I tried several different apps and eventually started using Callpod’s Keeper. At first, I really liked this program. I was using the free version, which at the time would allow you to back up the password file to the SD card, which was all I needed. However, after a few updates, the SD card backup feature disappeared from the free version in favor of their paid version, which syncs to a cloud service at $9.99/yr. This, combined with the fact that the program nags you constantly with a popup window to back up your data and then reminds you that you can’t because you have the free version, convinced me to start looking for a new password program.

After playing with several different free and paid versions, I re-discovered KeePassDroid. I remembered playing with it the first time I researched Android password programs and dismissing it because it didn’t have a backup capability. However, I had recently been playing with Dropbox and realized that I could put the KeePassDroid data file into Dropbox. Would this work? The short answer is yes! And it’s a delectable combination. Here’s how it works:

1) Install: KeePassDroid, Dropbox, and OI File Manager
You need KeePassDroid to manage your passwords, Dropbox to synchronize them off your phone, and the OI File Manager to make it easier to locate the files to upload in step 3.

2) Run KeePassDroid and create the default database in /mnt/sdcard/keepass/keepass.kdb
This sets up an empty database which you can now upload into Dropbox.

3) Run Dropbox and Menu/Upload/Any File/OI File Manager -> select the /mnt/sdcard/keepass/keepass.kdb
This will copy the keepass.kdb file into /mnt/sdcard/dropbox/keepass.kdb where Dropbox can manage it.

4) Run KeePassDroid and from the main screen (where you select the database file and enter the password) click on the Folder icon and navigate to home/mnt/sdcard/dropbox and select keepass.kdb

5) If you want to clean up, you can remove the /mnt/sdcard/keepass/keepass.kdb file – it won’t be needed anymore since you will only access the file in the dropbox folder.

Once this is setup, you can launch KeePassDroid by running Dropbox and then clicking on the “keepass.kdb” file. Opening the file this way will ensure that Dropbox uploads any changes you make to the database file.

So, why go to all this extra work? Two big reasons: First, if you ever lose your phone or run it over with a car, you have an up to date version of your password database “in the cloud” that you can reload on a new phone. Second, you can now manage your passwords from other devices! I’m running Dropbox on my OSX desktop along with KeePassX, and I can now access or update my password database from my phone and my desktop. That’s delicious.

Of course, there is no reason you have to stop there. I haven’t tried this yet, but since KeePassDroid can manage several database files, you could create a second one, maybe “family.kdb” and then in a shared dropbox folder you could share the database file with other members of your family. Or perhaps a “work.kdb” and share passwords with co-workers, etc.

In summary: KeePassDroid works great, it’s free, and when combined with Dropbox it has great syncing, backup and possible sharing capability.

And mobile developers please take note: I *hate* nagware, and I suspect other people do as well. I hate it so much that I would never consider buying software that resorts to it – it’s an instant sale killer for me. When implementing the free/paid mobile app model, it’s important to give users a fully functional free version that works and doesn’t nag. This way, people are inclined to install it and to keep using it because it’s free and works great. And then if the paid version has a few really cool, but not functionally important features then I think people will be more likely to upgrade.

I used to think that every server needed to have something like a DRAC (Dell Remote Access Card) or IBM RSA (Remote Supervisor Adapter) to be really manageable remotely. But the problem is that they are expensive and barely functional. In the case of the DRAC, you have to use a browser with ActiveX controls. And although the IBM RSA utilizes Java, the consoles have an annoying habit of typinggggggggggggg a lot of eeeeeeeeeeextra characters, especially towards the end of a really long command.

Technological annoyances aside, I really prefer serial consoles. They work great, and there are even inexpensive ways to set up networks of remote serial consoles (Project Hydra). But what happens when you need remote access to the BIOS, or to use pxeboot to rescue boot because a kernel upgrade went wrong, or just to reinstall your system? Do you really need that $200 remote access card? The answer is no, provided you have IPMI 2.0 in your baseboard controller, which is *very* common in server class hardware.

What is IPMI?

Just close that google tab, I’ll save you the effort: IPMI stands for “Intelligent Platform Management Interface” and is a standard for monitoring and controlling a machine remotely and independently from the operating system. This system management is typically handled by a BMC (Baseboard Management Controller), which is like a second computer inside your server with access to things like fan speed information, power control, and system event logs, as well as SOL (Serial Over LAN), which, when combined with a serial console, gives you a fully BIOS-capable remote console. IPMI is thus a protocol for interfacing with the BMC, and SOL is the remote access gravy we are after.

How to setup IPMI

The first step is to configure your BMC so that it is remotely accessible. If you have a BMC, you will probably see an option during the POST to enter the BIOS configuration utility. On a Dell SC1435, hitting Ctrl-E in the allotted time will take you into the BMC menu.

The options in this menu are:

IPMI over LAN: On

This enables the IPMI protocol over the network.

NIC Selection: Shared

In “Shared” mode, the BMC network interface and the eth0 interface will share the same network port. In this configuration it is correct to think of your eth0 network port as a little hub with two computers connected to it since you will see two separate MAC addresses on this port.

LAN Parameters

This popup menu is where you configure the network address (static or DHCP) for the BMC interface. One caveat – I have never been able to get the VLAN ID settings to work correctly.

LAN User Configuration

You need to setup a user and password in here in order to enable remote access.

How to setup IPMI without console access

If you already have an OS installed and have remote shell access, you can configure the BMC for remote access from the command line. In Debian, this is done using the ipmitool package and loading a few kernel modules:
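The commands themselves were lost from this post; a sketch of the procedure, assuming Debian, LAN channel 1, and user ID slot 2 – the addresses and names below are placeholders, so check your `ipmitool lan print` output for your hardware:

```
apt-get install ipmitool
modprobe ipmi_msghandler
modprobe ipmi_devintf
modprobe ipmi_si

# Configure the BMC's network settings on LAN channel 1
ipmitool lan set 1 ipsrc static
ipmitool lan set 1 ipaddr 192.168.1.50
ipmitool lan set 1 netmask 255.255.255.0
ipmitool lan set 1 defgw ipaddr 192.168.1.1
ipmitool lan set 1 access on

# Create a remote user in slot 2
ipmitool user set name 2 admin
ipmitool user set password 2 {YOUR PASSWORD}
ipmitool user enable 2

# Verify the settings
ipmitool lan print 1
```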

NOTE: When you are setting remote access in shared mode on a running system, it is possible to wedge your eth0 connection, especially if you are trying to make the BMC work with VLANs. To avoid the hassle, I usually put the eth0 interface and the BMC interface in the same untagged VLAN.

Accessing the IPMI interface remotely

Now that you have enabled the network interface for your BMC, you can use ipmitool to do all sorts of things. Here are some particularly useful commands:

ipmitool -H {YOUR BMC IP} -U {YOUR USER} power {status|on|off|reset|cycle}

This connects to the remote BMC and issues the specified power control. Very useful for hard resetting crashed machines.

ipmitool -H {YOUR BMC IP} -U {YOUR USER} shell

Start an IPMI shell with interactive help – this allows you to issue multiple commands without having to type the password each time.

ipmitool -H {YOUR BMC IP} -U {YOUR USER} sel list

List the contents of the “System Event Log” – this is where you catch things like ECC errors that are causing your server to reboot, etc.

How to setup the serial console redirection

So now you are probably wondering how you access the console thru IPMI, but we are missing a step. The IPMI interface will give us the ability to redirect a serial port over the network, but we need to attach a console to that serial port. This needs to be done in three places:

The BIOS

Reboot your system and go into the BIOS setup. Look for a section like “Serial Communication” or “Console Redirection”. Here is what the option looks like on a Dell SC1435:

Serial Communication: On with Console Redirection via COM2

This sets up a serial console using a virtual COM2 port.

External Serial Connector: COM1

This machine has a physical serial port, which we set to COM1 so it won’t conflict with the console port.

Failsafe Baud Rate: 57600

Setup the baud rate you desire

Remote Terminal Type: VT100/VT220

Funny how the VAX terminal type keeps hanging around…

Redirection after Boot: Enabled

The bootloader and kernel init

The BIOS setup will give us remote access to the BIOS and POST before the bootloader. To get the bootloader and kernel init to use the serial console so we can choose a kernel and see initialization messages, we need to edit the bootloader configuration. For grub on Debian, this can be done with an entry in /boot/grub/menu.lst similar to the following (note that the “# kopt=” line is commented, but update-grub really does parse it):

# kopt=root=/dev/mapper/sys-root ro console=ttyS1,57600n8
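Note that the `console=ttyS1,57600n8` kernel option only covers kernel messages. For the GRUB menu itself to show up over the serial line, menu.lst also needs serial terminal directives; a sketch, assuming GRUB legacy and COM2 (unit 1) at 57600 baud:

```
# /boot/grub/menu.lst – put GRUB's own menu on the serial port too
serial --unit=1 --speed=57600 --word=8 --parity=no --stop=1
terminal --timeout=5 serial console
```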

The OS

And finally, we want the booted up system to start a getty on the serial port as well. This can be done with an entry in /etc/inittab similar to T1:23:respawn:/sbin/getty -L ttyS1 57600 vt102.

Connecting to the IPMI SOL console

At this point, we have a serial console set up on COM2 (ttyS1) and we have the BMC set up to make the serial port remotely accessible via IPMI SOL. Now we can connect to the console remotely using ipmitool. The catch is that you must use the lanplus interface in order to have access to SOL:

ipmitool -H {YOUR BMC IP} -U {YOUR USER} -I lanplus sol activate

If all goes well, you should now have full remote console access – you can reboot the machine and watch the entire POST, kernel init, and system login remotely. To get out of the console, just type “~.” and it will either close you out of ipmitool or drop you back into the ipmitool shell depending on where you started.

Summary

Once you get IPMI+SOL working, you will have everything you need to remotely manage the machine. I currently use this setup with PXEBoot to allow remote re-installation and rescue booting, which is really nice. Some of the older servers we have don’t do well with higher baud rates, and some require an occasional Ctrl-L to redraw the screen, but for the most part they work great.

If you are new to the world of *nix operating systems, you might still be wrapping your head around the concept of pipes. But if you spend any time at the command line, it won’t be long before you are throwing commands together with more pipes than a high rise plumbing job. This is because the ability to easily piece together the input and output of the literally thousands of command line tools is extremely useful.

And as you work with pipes some more, it won’t be long before you run up against a more complicated problem, specifically “how do I pipe between sets of commands?”. The answer is to group the sets of commands within subshells. I remember the first pipe and subshell solution I used was a quick (and very common) way to tar-copy a directory like so:

(cd /the/source && tar cf - .) | (cd /the/target && tar xf -)

This is a nice, reliable, cross-platform way of copying an entire directory from one path to another, and it works because the parentheses spawn two subshells that run the “cd && tar” commands separately, and then the subshells are connected by the pipe. But what if you wanted to make a second copy of /the/source in one command? Or maybe you want to copy this directory to three targets all in one shot? The answer is Process Substitution.

Process substitution works by running a command in a subshell and returning a file-handle that you can connect to STDIN and STDOUT of other processes. So in my example of tar-copy to multiple destinations, you could do it this way:

(cd /the/source && tar cf - .) | tee >(cd /the/target1 && tar xf -) | (cd /the/target2 && tar xf -)

The “tee” command duplicates STDIN to STDOUT and the named file. In this case, the named file is a process substitution, the bit in the ‘>()’, which runs a tar extracting the stream into /the/target1.

(cd /the/target2 && tar xf -)

This is a regular subshell that extracts the tar stream duplicated by “tee” into /the/target2.

Of course you could expand this to as many targets as you wanted by chaining more “tee” commands together. By the way, I have actually used the above command to make two backups of a single drive at one time, so it’s not a completely contrived example! Here are some other examples of when I have found process substitution handy.

Erasing multiple hard drives

I wanted to erase a bunch of identical hard drives using /dev/urandom at the same time without having to start multiple reads of the urandom device. This was the command I used:
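The command itself was lost from this post, but the shape of it is one read of /dev/urandom duplicated to every drive with tee and process substitution. A runnable sketch, with ordinary files standing in for the real device paths (think /dev/sdb, /dev/sdc):

```shell
# Read /dev/urandom once and write the same stream to several targets.
# /tmp/disk1 and /tmp/disk2 are placeholders for real devices.
dd if=/dev/urandom bs=64k count=16 iflag=fullblock 2>/dev/null \
  | tee >(dd of=/tmp/disk1 bs=64k 2>/dev/null) \
  | dd of=/tmp/disk2 bs=64k 2>/dev/null
sleep 1  # let the process substitution finish flushing
```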

Comparing Stickies databases

OSX has a nifty little program called Stickies that is really handy for jotting down little notes. I found myself relying on these so much that I wrote a pair of scripts to sync the sticky database between my two primary OSX machines. However, it is periodically useful to compare two versions of the database to make sure the latest changes are in the right file. Here is how I do it:

diff -urN \
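The rest of that diff command was cut off; the trick is that each database goes through a process substitution, so no temporary files are needed. A runnable sketch, with hypothetical stand-in files for the two sticky databases:

```shell
# Diff the readable strings of two binary-ish files in one shot.
# The two files below are stand-ins for the real StickiesDatabase copies.
printf 'note one\nnote two\n' > /tmp/stickies.local
printf 'note one\nnote three\n' > /tmp/stickies.remote
diff -urN <(strings /tmp/stickies.local) <(strings /tmp/stickies.remote) || true
```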

Restoring multiple MySQL slaves

I wanted to restore two MySQL replication slaves at the same time, without copying the dump file to the slaves and without copying the dump file more than once from the backup server:
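The command is missing here as well, but the pattern would be something along the lines of `ssh backup 'cat dump.sql' | tee >(mysql -h slave1) | mysql -h slave2`, with hypothetical hostnames. A runnable sketch, with local files standing in for the two slaves:

```shell
# Stream one dump to two consumers at once. 'cat > /tmp/slaveN.sql'
# stands in for 'mysql -h slaveN' so the sketch runs anywhere.
printf 'CREATE TABLE t (id INT);\n' > /tmp/dump.sql
cat /tmp/dump.sql | tee >(cat > /tmp/slave1.sql) > /tmp/slave2.sql
sleep 1  # let the process substitution finish writing
```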

The mongosniff command allows you to decode a mongodb connection in realtime, but the command does *not* allow you to filter by hostname. This means that when you run mongosniff on the primary of a replica set, you are inundated with replica traffic and cannot easily separate the data from a specific client. While mongosniff does allow you to read from a file, this would require two steps: one to capture packets and then a second to decode them. With process substitution you can make mongosniff work in realtime by reading from a tcpdump subprocess. For example:

mongosniff --source FILE <(tcpdump -w - host {CLIENT HOST} and port 27017)

It always feels satisfying to use process substitution. What problems have you solved with it?

I’ve been working with personal computers for awhile. I’m not quite from the punch card era, but I do fondly remember saving my prime number sieve program, written in BASIC, to an audio cassette attached to a Commodore VIC-20. Somewhere I still have the 24-pin dot matrix printout of all the prime numbers less than 1,000,000.

Anyways, I’ve also been using cron for quite some time, and so it frustrates me how often I seem to repeat the same mistakes when working with crontab files. So, read on and let me know if you’ve ever caught yourself on any of these:

It’s not a shell script

The first important observation about the crontab file is this: It’s not a shell script. Sure, it looks a bit like a shell script since you can set some variables, but it’s not. It’s best to think of the crontab as an interpreted file that happens to support some variable declarations that look a lot like a shell script. But they are simple name = value pairs NOT shell variables. The “value” must be a simple value. For example, you can not de-reference a crontab “variable” in a value, so this won’t work:

#Does not work because you can't de-reference PATH in the value.
PATH=/my/foo/path:$PATH
#Does not work because sub-shells have no meaning in a crontab file.
PATH=`path_script`:/bin

In fact, you can not set arbitrary “variables” in crontab, so this won’t work:

#Does not work - you can not set arbitrary "variables"
COMMON_CMD=/my/home/bin/doit.sh

In fact, it’s really wrong to think about these as variables at all. Really, they are cron options that derive their default values from your environment variables. More importantly, there are only a limited number of “variables”, er options, that can be defined in a crontab. Here is the list pulled from “man 5 crontab” on Debian Squeeze:

PATH

Works just like the shell PATH, but it does *not* inherit from your environment. Typically set to a very short list of path elements, often just “/usr/bin:/bin”

MAILTO

A comma-delimited list of mail addresses to which to send the output of cron jobs. Applies to all cron entries after the MAILTO declaration. You can define it multiple times.

HOME

The path to the crontab owner’s home directory.

SHELL

The shell to use when invoking cron jobs – the default is /bin/sh.

LOGNAME

Set from /etc/passwd

CONTENT_TYPE

The content-type to use for cron output emails.

CONTENT_TRANSFER_ENCODING

The transfer encoding (e.g. 8bit) to use for cron output emails.

And that’s all.

Percent signs need to be escaped

I forget this one more often than I would like to admit. Usually, I’m trying to do something like:

#Does not do what you might expect because of the '%' sign
0 23 * * * (EPOCH=`date '+%s'`; echo $EPOCH >> /tmp/foo)

If you try this, the cron mail’s Subject line will show that the command that actually got executed stopped at the ‘%’ sign. This is because crontab treats the ‘%’ sign as a newline – all content after the ‘%’ will be passed into STDIN of the command. In order to use the ‘%’ sign in a command you have to escape it with a backslash:

0 23 * * * (EPOCH=`date '+\%s'`; echo $EPOCH >> /tmp/foo)

No linebreaks

In a shell script, a single command can span multiple lines by using a trailing ‘\’ at the end of each line. It is so tempting to do this in a crontab file, but you can not. Instead, if my crontab command starts getting long enough that I start thinking about backslashes, I move the command into a shell script and invoke that instead. Much easier to read and maintain.

Comments

You can not put comments at the end of a “variable” declaration, nor in a command. They will be treated as part of the value or the command.

#This comment is ok, but the next one will cause you grief
PATH=/bin:/usr/bin #Why did I put this here?
#This will append to the file "foo#Store"
* * * * * echo `date '+\%s'` >> /tmp/foo#Store useless timestamps!

Summary

So, those are the big crontab pitfalls that I re-discover on occasion. Hopefully after reading this I’ll save you some crontab grief.

Here is a fun proof I saw during a stay at MSRI (Mathematical Sciences Research Institute) – unfortunately, I can not remember the reference (or none was given). If you happen to know the originator of this work, please let me know and I will attribute it appropriately. In any event, it is a “fun” problem, so I typed up the following summary:

Prop: There exists precisely one pair of numerically non-symmetric six-sided dice with no blank sides such that the sum-roll probability distribution is equivalent to normal symmetric six-sided dice. (Where by sum-roll it is meant the probability of two dice rolling a sum of x, etc.)

Proof: Let a die be represented by a polynomial in the following fashion. Powers represent the number on a side of a die, and the coefficient of a specific power represents the number of sides with that number of dots. For example, a normal symmetric die has the following representation:

$P(x) = x + x^2 + x^3 + x^4 + x^5 + x^6$

And a four-sided die with three sevens and one four would be:

$3x^7 + x^4$

Notice that with this representation, $P(1)$ is the number of sides on the die. Also notice that this representation encapsulates the probability of a specific roll, i.e., the probability of a die rolling an $n$ is the coefficient of $x^n$ divided by $P(1)$.

Finally, the action of rolling two dice is equivalent to forming the product of the polynomial representations. For example, when two normal symmetric six-sided dice are rolled, the probability of rolling a one (in sum) is zero, the probability of rolling a two is 1/36, the probability of rolling a seven is 1/6, etc. Notice that if we form the product

$P(x)^2 = (x + x^2 + x^3 + x^4 + x^5 + x^6)^2$

then the coefficient of $x^1$ is zero, the coefficient of $x^2$ is one, and the coefficient of $x^7$ is six (0/36, 1/36, and 6/36).

Now, to answer the original question, we find two polynomials, $A(x)$ and $B(x)$, such that:

1) $A(x) \cdot B(x) = P(x)^2$
2) $A(1) = B(1) = 6$
3) $A(x) \neq B(x)$
4) $A(0) = B(0) = 0$

Condition #1 gives us the same sum-roll distribution as normal symmetric dice.
Condition #2 gives us two six-sided dice.
Condition #3 assures us of the non-symmetry of $A(x)$ and $B(x)$.
Condition #4 is equivalent to saying that the polynomials have a zero coefficient for the $x^0$ term, which is the same as saying that the dice have no blank sides.

So, to proceed, we simply need to factor the polynomial in condition #1. Here it is:

$P(x)^2 = x^2 (x+1)^2 (x^2+x+1)^2 (x^2-x+1)^2$

Which I’ll write as:

$P(x)^2 = x^2\, q(x)^2\, r(x)^2\, s(x)^2$

where:

$q(x) = x+1, \quad r(x) = x^2+x+1, \quad s(x) = x^2-x+1$

Notice that $s(1) = 1$, so multiplying by $s(x)$ will not change the number of sides on the die. Also notice that if we put both $x$ terms in one of $A(x)$ or $B(x)$, then the other die would have a blank side (since the product of any of the remaining terms $q(x)$, $r(x)$, or $s(x)$ would incur a constant monomial). Thus, both $A(x)$ and $B(x)$ must have $x$ as a factor.

Now, notice that $q(1) = 2$ and $r(1) = 3$, whose product is six. Also notice that $s(1) = 1$. Thus, if we want $A(x)$ and $B(x)$ to represent six-sided dice, then they must both have $q(x)$ and $r(x)$ as factors, since multiplying by $s(x)$ will not change the number of sides. So far we have the following:

$A(x) = x\, q(x)\, r(x)\, (\cdots) \qquad B(x) = x\, q(x)\, r(x)\, (\cdots)$

The only factor we have left is the $s(x)^2$. If we put an $s(x)$ in both of $A(x)$ and $B(x)$, then we will have created symmetric dice, since $A(x)$ will be equal to $B(x)$ (in violation of condition #3). The only other option is to put the entire $s(x)^2$ term into either $A(x)$ or $B(x)$. By symmetry, it is irrelevant which polynomial gets the factor, so we’ll try:

$A(x) = x\, q(x)\, r(x) \qquad B(x) = x\, q(x)\, r(x)\, s(x)^2$

Now let’s check our work! Condition #1 is satisfied since $A(x) \cdot B(x) = x^2\, q(x)^2\, r(x)^2\, s(x)^2 = P(x)^2$. How about condition #2? Well, $A(1) = 1 \cdot 2 \cdot 3 = 6$ and $B(1) = 1 \cdot 2 \cdot 3 \cdot 1^2 = 6$, so that looks good. Clearly $A(x) \neq B(x)$, so condition #3 is ok. Finally, neither polynomial has a constant term, so $A(0) = B(0) = 0$ and neither die has a blank side.

So, now you have all the information you need to find these dice. I didn’t want to spoil it by writing out the actual sides, so I’ll leave you the final step.

“You have a bunch of servers to administer, not enough consoles and a small budget. Roll your own serial console server with USB.”

This is a paper I wrote for Linux Journal after playing around with using USB serial adaptors to build a scalable serial console solution. It should *NOT* be confused with the Rogue Wave Hydra product. In fact, I used the name first!