Herr Sommer und Frau Sonne (macrons forever)

After initiating what turned out to be an animated debate on the TEI list, we had still not settled how to handle the Geminationsstriche (gemination strokes) on the m and the n, which still appear in the form m[m] and n[n] in the edition as it stands.

All in all, there were two valid options:

– encoding it as a macron (the combining macron, U+0304), in the form Som& # 772;er for Sommer and Son& # 772;e for Sonne (I had to add a few spaces so that HTML does not turn the character reference into an actual macron…);

– encoding it as a glyph, in the form So<g ref="#mgem"/>er for Sommer and So<g ref="#ngem"/>e for Sonne.

The first option is not fully satisfactory, as we would have had to encode the diplomatic version and the reading version differently, which would have amounted to something like:

So<choice><abbr>m& # 772;</abbr><expan>mm</expan></choice>er (or perhaps a <sic>/<corr> combination within the <choice> element, though it does not make much difference). This would make encoding Geminationsstriche somewhat tedious.

In the second option, it would be possible to define "mgem"/"ngem" in such a way that we could spare ourselves the <choice> element while encoding. But this option is not really satisfactory either, considering that the Geminationsstrich is not a glyph but a macron. And if anyone were to reuse our XML file for whatever purpose, we would have to add this definition to the header, which would add a layer of work and complexity, increase the potential for error, etc.

So Alexander Meyer came up with the idea that we encode in the "light" form, Som& # 772;er for Sommer and Son& # 772;e for Sonne, and that he write a script (for internal use only) that performs all the necessary transformations to produce the different versions. The XML file is encoded properly, we have our two versions, and the encoding is not massively more complex.
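To make the idea concrete, here is a minimal sketch of what such a transformation could look like. It is not the actual internal script: the function names are hypothetical, and it assumes the text has already been read by an XML parser, so that the character reference has become the combining macron U+0304. It also only handles lowercase m and n, as in the examples above.

```python
import re

# Combining macron, U+0304 (decimal 772), as used in the "light" encoding.
COMBINING_MACRON = "\u0304"

# An m or n immediately followed by the combining macron.
GEMINATION = re.compile("([mn])" + COMBINING_MACRON)


def diplomatic_version(text: str) -> str:
    """Hypothetical helper: keep the Geminationsstrich exactly as encoded."""
    return text


def reading_version(text: str) -> str:
    """Hypothetical helper: expand each macron-marked m/n to a double letter."""
    return GEMINATION.sub(lambda match: match.group(1) * 2, text)


def with_choice(text: str) -> str:
    """Hypothetical helper: generate the explicit <choice> markup."""
    def expand(match: re.Match) -> str:
        letter = match.group(1)
        return (
            "<choice><abbr>" + letter + COMBINING_MACRON + "</abbr>"
            "<expan>" + letter * 2 + "</expan></choice>"
        )
    return GEMINATION.sub(expand, text)


if __name__ == "__main__":
    for word in ("Som\u0304er", "Son\u0304e"):
        print(diplomatic_version(word), reading_version(word), with_choice(word))
```

With this kind of approach, the verbose <choice> form never has to be typed by hand: it can be generated from the light encoding whenever it is needed.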

I can imagine that you could object to this or that, but honestly, considering how our project works, this seems to me the best trade-off we can reach in terms of scholarly reliability, work economy and sustainability.

Anne Baillot

I studied German Studies and Philosophy in Paris where I got my PhD in 2002. I then moved to Berlin, where I have been living & doing research ever since. My areas of specialty include German literature, Digital Humanities, textual scholarship and intellectual history. I am currently working at the Centre Marc Bloch in Berlin as an expert in digital technologies for the humanities.

2 Responses

<g> doesn’t have to be an empty element. You can encode a Geminationsstrich like this: So<g ref="#mgem">mm</g>er. This way, a reader may choose not to follow the reference and ignore the <g> tags altogether and still get meaningful content.
