super secret hq

A False Forth Scanner

June 10, 2013

Two languages that I keep returning to, like a moth to the flame, are LISP and Forth. I have the SwiftForth compiler on my system, along with Clozure CL. I like both, but neither of them is embeddable in a C program, so the next set of posts will be on writing a very simple Forth interpreter that I’ll call “False Forth” because it is not a “true” Forth.

Our scanner will also be simple. It will skip whitespace and return words only.

Whitespace includes blanks, tabs, linefeeds, and most non-printing characters between words.
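As a rough sketch, the whitespace test might look like this in C. The name ff_is_space is made up for illustration, and the exact character set is an assumption: the rule above only says "most" non-printing characters, so this version treats every control character, the blank, and DEL as whitespace, and leaves NUL out so it can mark the end of the input.

```c
#include <stdbool.h>

/*
 * Hypothetical whitespace test for the scanner.
 * Assumption: bytes 0x01 through 0x20 (control characters and the blank)
 * plus DEL (0x7F) count as whitespace. NUL is excluded because it
 * terminates the input string.
 */
static bool ff_is_space(unsigned char c)
{
    return (c >= 1 && c <= ' ') || c == 0x7F;
}
```

Taking an unsigned char avoids surprises on platforms where plain char is signed and bytes above 0x7F would compare as negative.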

Quoted text is anything between two quotation marks, using either single quote or double quote marks as the starting and ending delimiters. Inside a quoted text, two consecutive delimiters are considered an escape and are treated as a single occurrence of the delimiter within the text. All other characters are kept as-is. This includes both printing and non-printing characters.

Unquoted text is everything else. There is no escaping within unquoted text.
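The three rules can be sketched as one C function. This is a minimal illustration rather than the scanner built later in the series, and it makes several assumptions the rules leave open: the names ff_is_space and ff_next_word are invented, a quote mark starts quoted text only when it is the first character of a word, an unterminated quote runs to the end of the input, and words that do not fit the output buffer are truncated.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper. Assumption: bytes 0x01..0x20 and DEL are whitespace. */
static bool ff_is_space(unsigned char c)
{
    return (c >= 1 && c <= ' ') || c == 0x7F;
}

/*
 * Copy the next word from *src into out (capacity cap, NUL-terminated when
 * cap > 0) and advance *src past it. Returns false when only whitespace
 * remains. Words longer than cap - 1 bytes are silently truncated.
 */
static bool ff_next_word(const char **src, char *out, size_t cap)
{
    const char *p = *src;
    size_t n = 0;

    while (ff_is_space((unsigned char)*p))
        p++;                              /* skip leading whitespace */

    if (*p == '\0') {                     /* nothing left to scan */
        *src = p;
        if (cap > 0)
            out[0] = '\0';
        return false;
    }

    if (*p == '"' || *p == '\'') {        /* quoted text */
        char q = *p++;                    /* remember which delimiter opened it */
        while (*p != '\0') {
            if (*p == q) {
                if (p[1] != q) {          /* lone delimiter ends the text */
                    p++;
                    break;
                }
                p++;                      /* doubled delimiter: keep one copy */
            }
            if (n + 1 < cap)
                out[n++] = *p;
            p++;
        }
    } else {                              /* unquoted text, no escaping */
        while (*p != '\0' && !ff_is_space((unsigned char)*p)) {
            if (n + 1 < cap)
                out[n++] = *p;
            p++;
        }
    }

    if (cap > 0)
        out[n] = '\0';
    *src = p;
    return true;
}
```

A caller would point a const char * at the source text and call ff_next_word in a loop until it returns false. Copying into a caller-supplied buffer keeps the sketch free of allocation, at the cost of a fixed maximum word length.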

Here’s an example showing both words and spaces, along with the equivalent using C-style strings.

I wrote that
"Fred yelled, ""Are you ready?""
while the band marched on."
Dave's pet" "trick."