The technique I am using is to have the graphical front-end emit a
text-language representation, then use the usual techniques to
compile that to executable code.

The graphics-to-text emitter uses Prolog (actually Common Lisp with a
Prolog-like library) to perform the "parse" and semantic analysis.

The graphics are stored as a set of simple "facts", e.g. a line from
(x1,y1) to (x2,y2), or text at (x,y). The Prolog "parser" infers
structure from these - e.g. a "box" consists of 4 lines that touch
end-to-end and form a closed polygon; a named box is a box that
contains a chunk of text; and so on.
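
As a sketch of the inference step in plain Prolog (the real system
embeds a Prolog-like engine in Common Lisp, and every name below -
line/2, text/2, named_box/5, the axis-aligned inside/5 test - is
illustrative, not the actual code):

    %% Fact base for a tiny drawing: one rectangle with a label.
    line(p(0,0),  p(10,0)).
    line(p(10,0), p(10,5)).
    line(p(10,5), p(0,5)).
    line(p(0,5),  p(0,0)).
    text(p(4,2), 'Parser').

    %% A box is four lines that chain end-to-end and close up.
    %% The @< guards break rotational symmetry so each box is
    %% enumerated exactly once.
    box(A, B, C, D) :-
        line(A, B), line(B, C), line(C, D), line(D, A),
        A @< B, A @< C, A @< D.

    %% A named box is a box whose interior contains a text item.
    named_box(Name, A, B, C, D) :-
        box(A, B, C, D),
        text(P, Name),
        inside(P, A, B, C, D).

    %% Point-in-box test, assuming axis-aligned boxes with A and C
    %% as opposite corners; a real test would be more general.
    inside(p(X,Y), p(X0,Y0), _, p(X2,Y2), _) :-
        Xmin is min(X0,X2), Xmax is max(X0,X2),
        Ymin is min(Y0,Y2), Ymax is max(Y0,Y2),
        X > Xmin, X < Xmax, Y > Ymin, Y < Ymax.

Querying ?- named_box(N, _, _, _, _). then yields N = 'Parser'.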

The "semantic" analysis uses prolog to map these uber-facts into a
structured text language (invented for this purpose), e.g. a named box
maps into a "software component" with the given name, a group of
ellipses joined by lines map onto state machines, etc.
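
Continuing the same hypothetical sketch, the mapping step is roughly
one rule per shape-to-construct correspondence; the emitted concrete
syntax ("component <name>") is invented here purely for illustration:

    %% Every named box in the drawing becomes a component
    %% declaration in the emitted text language.
    emit_components :-
        forall(named_box(Name, _, _, _, _),
               format("component ~w~n", [Name])).

The state-machine case would follow the same pattern: a rule that
recognizes a connected group of ellipses as states and transitions,
plus a matching emitter clause.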