Converting YouTube WordPress Links to iframe Embeds

Yesterday’s DIY involved converting posts from dreamwidth to hugo (actually jekyll). Most of the heavy lifting was done by Thomas Frössman’s exitwp. Once everything was exported into jekyll-flavored markdown, one minor gap was YouTube embeds. First, a skeleton for wrapping the iframe around a YouTube video ID.
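A minimal sketch of that kind of wrapper, written here in Python rather than whatever the original skeleton used. The URL pattern, the iframe dimensions, and the names `YOUTUBE_URL`, `IFRAME_TEMPLATE`, and `embed_youtube_links` are all assumptions for illustration. It only handles bare URLs, not markdown links of the form `[text](url)`.

```python
import re

# Matches the 11-character video ID in a few common YouTube URL shapes
# (watch?v=ID, embed/ID, youtu.be/ID). This pattern is an assumption,
# not an exhaustive list of every URL form YouTube accepts.
YOUTUBE_URL = re.compile(
    r"https?://(?:www\.)?(?:youtube\.com/(?:watch\?v=|embed/)|youtu\.be/)"
    r"([A-Za-z0-9_-]{11})[^\s)\]]*"
)

# Hypothetical embed markup; adjust width, height, and attributes to taste.
IFRAME_TEMPLATE = (
    '<iframe width="560" height="315" '
    'src="https://www.youtube.com/embed/{video_id}" '
    'frameborder="0" allowfullscreen></iframe>'
)


def embed_youtube_links(markdown_text: str) -> str:
    """Replace bare YouTube URLs in the text with iframe embeds."""
    return YOUTUBE_URL.sub(
        lambda match: IFRAME_TEMPLATE.format(video_id=match.group(1)),
        markdown_text,
    )
```

Running `embed_youtube_links` over each exported post swaps every matching URL (including any trailing query parameters) for the iframe, and leaves text without YouTube links untouched.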

I’ve been thinking a bit about moving my blog yet again. This time to hugo for a variety of reasons:

I don’t use more than 10% of the features offered by WordPress.

I have very mixed feelings about blog-hosted discussions these days.

WordPress seems to be an easy target for hacks these days.

Most of my personal writing gets composed in emacs as markdown first anyway, so the first set of functions is intended to ease conversion by generating yaml front matter (title and date) for a post from the filename and the first heading. My filenames tend to start with a YYYYMMDD date stamp for ease of searching.
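The idea can be sketched outside of emacs too. The following Python version is an illustration only, not the emacs functions themselves: the name `front_matter`, the fallback to the filename when no heading exists, and the choice to quote the title are all assumptions.

```python
import re
from pathlib import Path


def front_matter(filename: str, text: str) -> str:
    """Build yaml front matter from a YYYYMMDD-prefixed filename and the first heading."""
    stamp = re.match(r"(\d{4})(\d{2})(\d{2})", Path(filename).name)
    if stamp is None:
        raise ValueError(f"no YYYYMMDD date stamp at start of {filename!r}")
    date = "-".join(stamp.groups())

    # First markdown ATX heading ("# Title", "## Title", ...), if any.
    heading = re.search(r"^#+[ \t]+(.+?)[ \t]*$", text, flags=re.MULTILINE)
    # Assumed fallback: use the filename without extension as the title.
    title = heading.group(1) if heading else Path(filename).stem

    # Quote the title so colons and other yaml-significant characters stay literal.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    return f'---\ntitle: "{safe_title}"\ndate: {date}\n---\n'
```

For a file named `20150901-moving-blog.md` whose first heading is `# Moving the Blog`, this yields a title of "Moving the Blog" and a date of 2015-09-01.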

One of my hobbies is to try out scripts in multiple programming languages, so I found myself dusting off the passphrase generation script. Mostly it’s an exercise in figuring out how languages do things differently. C++ loves iterators. Racket loves lists. Common lisp makes some things easy and some things really hard. Python loves list comprehensions. I’m still figuring out go. So I write the script, and then try to tweak it to find out which parts of the language are best optimized. Some lessons on this round.

The problem

read a literary text file

create a list of unique words

randomly select 7 words from the list

count the number of unique words.
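A rough Python sketch of those four steps follows. It is not the script linked from this post. The function names, the letters-only definition of a word, and the choice to pick distinct words with the operating system's random source are assumptions.

```python
import re
import secrets


def unique_words(text: str) -> list[str]:
    """Return the sorted unique lowercase words in the text."""
    # Assumption: a "word" is a run of ASCII letters, so "don't" splits in two.
    return sorted(set(re.findall(r"[a-z]+", text.lower())))


def passphrase(words: list[str], count: int = 7) -> str:
    """Pick `count` distinct words at random and join them with spaces."""
    rng = secrets.SystemRandom()  # backed by the OS random source
    return " ".join(rng.sample(words, count))


def passphrase_from_file(path: str, count: int = 7) -> tuple[str, int]:
    """Read a text file; return a passphrase and the number of unique words."""
    with open(path, encoding="utf-8") as handle:
        words = unique_words(handle.read())
    return passphrase(words, count), len(words)
```

Calling `passphrase_from_file` on any plain-text novel returns a seven-word passphrase plus the size of the word list it was drawn from.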

Lesson 1: If you just want it done, language doesn’t matter.

Timing differences among various implementations ranged from 0.35 seconds (python) to 4 seconds (c++ no optimization). Not that big of a deal except as an object lesson in techniques.

Lesson 2: Python is very good.

My python implementation was consistently the fastest of the bunch. Some of that is due to use of highly optimized C-based functions and list comprehensions. Here is the passphrase creation script (Pastebin.com).

Lesson 3: Imperative design is sometimes faster in racket.

Declaring a variable and updating it with an anti-functional for loop is sometimes faster for this sort of thing, but not by a lot. Racket came in the middle of the pack.

Lesson 4: C++ isn’t necessarily faster.

This one surprised me a bit. My C++ implementation read from stdin, and it was much slower than I expected (5 seconds). Using the optimization flag -O2 brought it down to python speed on linux. Likely I could get that down further, but I’m a novice at C++.

Lesson 5: Go concurrency isn’t as easy as it claims to be, and isn’t necessarily faster.

A single-threaded go script performed on par with racket. Trying to multi-thread it took half the night and ended up with a slower program. Again, I’m a novice at go.

Lesson 6: Using common lisp outside of the REPL is harder than it needs to be.

About half of the job was figuring out how to get the script to run outside of SLIME. Problem 1 was ensuring that the cl-ppcre regular expression library was loaded (sbcl used for the following):

Initial reports of the Ashley Madison hack suggested they did one thing right: hashing passwords with bcrypt. If I’m reading this report from Ars Technica correctly, they managed to screw even that up by having a second password table with MD5-hashed passwords.

Why is that bad? Cryptographic hash functions create unique “signatures” from electronic data. They come in two different varieties. “Fast” algorithms are used to verify the authenticity of gigabytes of data. They’re used to check the integrity of almost everything sent over the internet. They’re designed to be run millions of times a second with minimal memory.

Standard practice for storing passwords is to store a hash “signature” instead of the raw (plaintext) password. You log into a site, it runs the hash function, and compares the signature with the signature stored in its database.

While fast hash algorithms like MD5 are great for checking things like Windows 10 downloads or streaming video, they’re bad for storing passwords. The state of the art in breaking passwords involves making millions of guesses. With MD5 and a graphics card, a password cracker can try over a billion guesses a second.

“Slow” hash functions such as bcrypt or PBKDF2 are designed to take a tunable amount of time per guess. Instead of a billion guesses per second, a cracker is limited to a few hundred.
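As an illustration of storing a slow hash “signature” instead of the password, here is a minimal sketch using PBKDF2 from Python’s standard library (bcrypt would need a third-party package). The iteration count, the 16-byte salt, and the function names are assumptions chosen for the example, not a recommendation for any particular site.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for illustration; raise it until one hash takes
# a noticeable fraction of a second on your own hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the plaintext is never stored."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-run the slow hash at login and compare against the stored digest."""
    _, digest = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected)
```

Every guess a cracker makes has to pay for all of those iterations, which is what drags the guess rate down from billions per second.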