Wed, 29 Mar 2006

What to do with a few extra hours in a boring motel with no net access?
How about digging into fixing one of Emacs' more annoying misfeatures?

Whenever I edit an html file using emacs, I find I have to stay away
from double dashes -- I can't add a phrase such as this one.
If I forget and type a phrase with a double dash, then as soon
as I get to the end of that line and emacs decides it's time to wrap
to the next line, it "helpfully" treats the double dashes as a
comment, and indents the next line to the level where the dashes were,
adding another set of dashes. I've googled, I've asked on emacs IRC
help channels, but there doesn't seem to be any way out. (I guess no
one else ever uses double dashes in html files?)

It's frustrating: I like using double dashes now and then. And aside
from the occasional boneheaded misfeature like this one, I like using
emacs. But the dash problem been driving me nuts for a long time
now. So I finally dug into the code to cure it.

First, the file is sgml-mode.el, so don't bother searching anything
with html in the name. On my system it's
/usr/share/emacs/21.4/lisp/textmodes/sgml-model.el.
Edit that file and search for "--" and the first
thing you'll find (well, after the file's preamble comments) is a
comment in the definition of "sgml-specials" saying that if you
include ?- in the list of specials, it will hork the typing of double
dashes, so that's normally left out.

A clue! Perhaps some Debian or Ubuntu site file has changed
sgml-specials for me, and all I need to do is change it back!
So I typed

M-x describe-variable sgml-specials

to see the current setting.

Um ... it's set to "34". That's not very helpful. I haven't a clue how
that translates to the list of characters I see in sgml-mode.el.
Forget about that approach for now.

Searching through the file for the string "comment" got me a few more
hits, and I tried commenting out various comment handling lines until
the evil behavior went away. (I had to remove sgml-mode.elc first,
otherwise emacs wouldn't see any changes I made to sgml-mode.el.
If you haven't done much elisp hacking, the .el is the lisp source,
while the .elc is a byte-compiled version which loads quicker but
isn't intended to be edited by humans. For Java programmers, the .elc
is sort of like a .class file.)

To regenerate the .elc file so sgml-mode will load faster, I ran emacs
as root from the directory sgml-mode.el was in, and typed:

M-x byte-compile-file sgml-mode.el

All better! And now I know where to find documentation for all those
useful-looking, but seemingly undocumented, keyboard shortcuts that
go along with emacs' html mode. Just search in the file for
html-mode-map, and you'll find all sorts of useful stuff.

For instance, that typing Ctrl-C Ctrl-C followed by various letters: u
gets you an unordered list, h gets you an href tag, i an image tag,
and so on, with the cursor positioned where you want to type next.

It doesn't seem to offer any basic inline formatting (like
<i> or <em>), alas; but of course that's easy to add
by editing the file (or maybe even in .emacs). To add an <em>
tag, add this line to html-mode-map:

(define-key map "\C-c\C-ce" 'html-em)

then add this function somewhere near where html-headline-1 and
friends are defined:

(define-skeleton html-em
"HTML emphasis tags."
nil
"" _ "")

Of course, you can define any set of tags you use often, not just
<em>.

HTML mode in emacs should be much more fun and less painful now!

Update: If you don't want to modify the files as root, it also
works fine to copy sgml-mode.el to wherever you keep personal
elisp files. For instance, put them in a directory called
~/.emacs-lisp then add this to your .emacs:
(setq load-path (cons "~/.emacs-lisp/" load-path))