24 Sep 2013

The Problem

You’re on Mac OS X (somewhere around 10.7.5) and you’re using the sed command to replace characters from the Latin-1 or Windows-1252 character encoding with their UTF-8 equivalents. Unfortunately, you get an error like the following:

This happened to me while working on HamDecks, a small project that creates Mnemosyne decks to help you study for the Amateur Radio Operator exams using questions from the official ARRL question pools. The source question pool files (Technician, General, Extra) have some problems, though: they contain a lot of characters with strange or exotic encodings that could not be imported into Mnemosyne. That’s how I got myself into this whole mess in the first place.

Options

The Stack Overflow link above makes two suggestions:

Use the iconv utility

Use a Perl one-liner

Your Mileage May Vary, but neither of those suggestions worked for me. So what did work then?
