Hi. On the English, I would make the following changes:
<verse osisID="Gen.1.9"><w src="1" lemma="strong:G02532">And</w>
<w src="3 4" lemma="strong:G03588 strong:G02316">God</w>
<w src="2" lemma="strong:G02036">said</w>,
<w src="5" lemma="strong:G04863">Let</w>
<w src="6" lemma="strong:G03588">the</w>
<w src="7" lemma="strong:G05204">water</w>
<w src="8" lemma="strong:G03588"></w><w src="9" lemma="strong:G05270">underneath</w>
<w src="10" lemma="strong:G03588">the</w>
<w src="11" lemma="strong:G03772">heaven</w>
<w src="5" lemma="strong:G04863">come together</w>
<w src="12" lemma="strong:G01519">into</w>
<w src="14" lemma="strong:G01520">one</w>
<w src="13" lemma="strong:G04864">gathering</w>,
<w src="15" lemma="strong:G02532">and</w>
<w src="16" lemma="strong:G03708">let</w>
<w src="17" lemma="strong:G03588">the</w>
<w src="18" lemma="strong:G03584">dry</w>
<transChange type="added">land</transChange>
<w src="16" lemma="strong:G03708">appear</w>!
<w src="19" lemma="strong:G02532">And</w>
<w src="20" lemma="strong:G01096">it was</w>
<w src="21" lemma="strong:G03779">so</w>.
<w src="22" lemma="strong:G02532">And</w>
<w src="24" lemma="strong:G03588">the</w>
<w src="25" lemma="strong:G05204">water</w>
<w src="26" lemma="strong:G03588"></w><w src="27" lemma="strong:G05270">underneath</w>
<w src="28" lemma="strong:G03588">the</w>
<w src="29" lemma="strong:G03772">heaven</w>
<w src="23" lemma="strong:G04863">gathered together</w>
<w src="30" lemma="strong:G01519">into</w>
<w src="31" lemma="strong:G03588"></w><w src="32 33" lemma="strong:G04864 strong:G01473">their gatherings</w>,
<w src="34" lemma="strong:G02532">and</w>
<w src="36" lemma="strong:G03588">the</w>
<w src="37" lemma="strong:G03584">dry</w>
<transChange type="added">land</transChange>
<w src="35" lemma="strong:G03708">appeared</w>.
</verse>
I don’t know the exact rule, but when a Greek or Hebrew word is split, it has been common practice to mark the split by adding a G0 or H0 to the lemma attribute, like this:
<w src="5" lemma="strong:G04863 G0">Let</w>
However, that extra designation can break programs that are not expecting it. The following should be sufficient, leaving it up to the program to show the relationship, since the src is the same:
<w src="5" lemma="strong:G04863">Let</w>
<w src="5" lemma="strong:G04863">come together</w>
Perhaps someone else can comment.
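To illustrate what "leaving it up to the program" could look like, here is a minimal Python sketch. It assumes a verse fragment with no XML namespace declared (a real OSIS file declares one), and the names FRAGMENT and find_split_words are made up for the example. It simply collects the <w> elements that share a src value:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical fragment: both halves of G4863 carry src="5".
FRAGMENT = """<verse osisID="Gen.1.9">
<w src="5" lemma="strong:G04863">Let</w>
<w src="6" lemma="strong:G03588">the</w>
<w src="7" lemma="strong:G05204">water</w>
<w src="5" lemma="strong:G04863">come together</w>
</verse>"""


def find_split_words(verse_xml):
    """Return {src: [English pieces]} for every src used by more than one <w>."""
    pieces = defaultdict(list)
    for w in ET.fromstring(verse_xml).iter("w"):
        # A src attribute may list several positions, e.g. "3 4".
        for src in w.get("src", "").split():
            pieces[src].append(w.text or "")
    # Keep only the positions that appear on two or more elements.
    return {src: texts for src, texts in pieces.items() if len(texts) > 1}


split_words = find_split_words(FRAGMENT)
print(split_words)
```

With this approach no G0 or H0 marker is needed; the repeated src is the only signal the program relies on.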
Also, it is a practice to keep the punctuation outside of the <w> element, except where the punctuation is nested between two or more words in a single <w> element.
I am not sure which is best, but I personally keep the G3588 element separate from the word that follows it, except when it is nested in a preposition. This is the way I would do it:
<w src="8" lemma="strong:G03588"></w><w src="9" lemma="strong:G05270">underneath</w>
As opposed to:
<w src="8 9" lemma="strong:G03588 strong:G05270">underneath</w>
The reason I prefer the first form is that if I ever try to compile a dictionary from the OSIS source, it is easier for the program to determine which Strong’s word is translated UNDERNEATH. The other word does not get translated, and therefore it is an empty element. That is just the way I would personally do it. Therefore:
<w src="3 4" lemma="strong:G03588 strong:G02316">God</w>
<w src="2" lemma="strong:G02036">said</w>,
Would be:
<w src="3" lemma="strong:G03588"></w><w src="4" lemma="strong:G02316">God</w>
<w src="2" lemma="strong:G02036">said</w>,
Notice that I don’t have any whitespace between the two elements. That is a personal preference. I can see how one might prefer grouping them. Consider how the text might be displayed by a program that reads the empty element:
AndG2532 GodG2316 said,G2036
AndG2532 *G3588 GodG2316 said,G2036
AndG2532 G3588GodG2316 said,G2036
Or
AndG2532 GodG3588 G2316 said,G2036
I don’t know… I guess it just depends on how you want the program to interpret the empty element. I can see it both ways.
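On the dictionary point, here is a rough Python sketch of one way a program could interpret the empty element, namely as "this Strong’s word is not translated here". It assumes a fragment without an XML namespace, and FRAGMENT and build_glossary are hypothetical names for the example only. Elements that carry more than one lemma are skipped, because the program cannot tell which lemma the English belongs to:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical fragment using the separate, empty G3588 elements.
FRAGMENT = """<verse osisID="Gen.1.9">
<w src="3" lemma="strong:G03588"></w><w src="4" lemma="strong:G02316">God</w>
<w src="2" lemma="strong:G02036">said</w>,
<w src="8" lemma="strong:G03588"></w><w src="9" lemma="strong:G05270">underneath</w>
<w src="10" lemma="strong:G03588">the</w>
</verse>"""


def build_glossary(verse_xml):
    """Return (glosses, untranslated) keyed by Strong's number."""
    glosses = defaultdict(set)      # Strong's number -> English renderings
    untranslated = defaultdict(int)  # Strong's number -> count of empty <w>
    for w in ET.fromstring(verse_xml).iter("w"):
        lemmas = [l.split(":", 1)[-1] for l in w.get("lemma", "").split()]
        text = (w.text or "").strip()
        if not text:
            # Empty element: the word is present in the source but not translated.
            for lemma in lemmas:
                untranslated[lemma] += 1
        elif len(lemmas) == 1:
            glosses[lemmas[0]].add(text.lower())
        # Several lemmas on one element are ambiguous, so they are skipped.
    return dict(glosses), dict(untranslated)


glosses, untranslated = build_glossary(FRAGMENT)
print(glosses)
print(untranslated)
```

Had the article been grouped as src="8 9", the sketch would have had to skip UNDERNEATH entirely, which is the difficulty described above.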
Funny… I like the OSIS XML to keep the punctuation outside the word element, but in the display I like the punctuation to be right next to the English word, rather than after the Strong’s number.
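For anyone who wants to try the display options, here is a small Python sketch, not taken from any existing front end. It assumes a fragment without an XML namespace; the names FRAGMENT, strongs, render and the mode names "skip", "star" and "bare" are invented for the example. Punctuation that follows a <w> element is moved up against the English word, ahead of the Strong’s number:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical fragment with the article as a separate, empty element.
FRAGMENT = """<verse osisID="Gen.1.9"><w src="1" lemma="strong:G02532">And</w>
<w src="3" lemma="strong:G03588"></w><w src="4" lemma="strong:G02316">God</w>
<w src="2" lemma="strong:G02036">said</w>,
</verse>"""


def strongs(w):
    """Turn 'strong:G02036' into 'G2036' (drop the prefix and zero padding)."""
    out = []
    for lemma in w.get("lemma", "").split():
        key = lemma.split(":", 1)[-1]
        out.append(key[0] + key[1:].lstrip("0"))
    return " ".join(out)


def render(verse_xml, empty="skip"):
    """Render word + punctuation + Strong's number; `empty` picks the empty-<w> style."""
    parts = []
    for w in ET.fromstring(verse_xml).iter("w"):
        text = w.text or ""
        tail = w.tail or ""
        # Punctuation sits in the tail, outside the <w> element.
        punct = re.match(r"[^\w\s]*", tail).group(0)
        rest = tail[len(punct):]
        if text:
            parts.append(text + punct + strongs(w) + rest)
        elif empty == "star":
            parts.append("*" + strongs(w) + " " + rest)
        elif empty == "bare":
            parts.append(strongs(w) + rest)
        else:
            # "skip": show nothing for the empty element, keep its whitespace.
            parts.append(rest)
    # Collapse line breaks and repeated spaces into single spaces.
    return " ".join("".join(parts).split())


for mode in ("skip", "star", "bare"):
    print(render(FRAGMENT, empty=mode))
```

Because the two elements have no whitespace between them, the "bare" mode glues the article’s number onto the following word, which is one reason a program might prefer "skip" or "star".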
The src values are what keep the link to the original Greek or Hebrew words, and the position of each <w> element in the markup is sufficient to declare the English word order. Therefore, it would be best to remove the extra numbering found in the interlinear.
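As a quick illustration that both orders are recoverable without the printed numbering, here is a Python sketch under the same assumptions as before (no XML namespace on the fragment; FRAGMENT and word_orders are hypothetical names). Document order gives the English reading, and sorting on src gives the Greek order:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: English order in the markup, Greek position in src.
FRAGMENT = """<verse osisID="Gen.1.9">
<w src="12" lemma="strong:G01519">into</w>
<w src="14" lemma="strong:G01520">one</w>
<w src="13" lemma="strong:G04864">gathering</w>,
</verse>"""


def word_orders(verse_xml):
    """Return (english_order, greek_order) as lists of English glosses."""
    words = list(ET.fromstring(verse_xml).iter("w"))

    def first_src(w):
        # An element may cover several positions ("3 4"); sort on the lowest.
        return min(int(n) for n in (w.get("src") or "0").split())

    english = [w.text or "" for w in words]
    greek = [w.text or "" for w in sorted(words, key=first_src)]
    return english, greek


english, greek = word_orders(FRAGMENT)
print(" ".join(english))
print(" ".join(greek))
```

The second line reproduces the sequence that the interlinear marks as [2gathering 1one], so the digits in the text carry no information that src does not already hold.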
Anyway… those are my thoughts for what they are worth.
Wade
From: Daniel Bearden [mailto:dpbearden at gmail.com]
Sent: Friday, August 21, 2009 3:32 PM
To: sword-devel at crosswire.org
Subject: [sword-devel] Apostolic Bible Polyglot Greek-English interlinear updated
Greetings, I've updated the ABP modules to hopefully address any errors including the double markup.
> What are all the numerical digits within the text for? Example: (Gen 1:9)
>> And God said, Let [6come together 1the 2water 3underneath 4the
> 5heaven] into [2gathering 1one], and let [3appear 1the 2dry land ]! And it
> was so. And [6gathered together 1the 2water 3underneath 4the 5heaven] into
> their gatherings, and [3appeared 1the 2dry land ].
The words are arranged according to the order of the Greek text. This is how the text appears in the printed edition, viewable in pdf format here: http://www.apostolicbible.com/
I suppose with some effort I could rearrange the words, which would involve modifying the Python script I use to generate the OSIS modules from the original LaTeX files.
> If all these numbers are some form of semantic markup, should they not be
> inside XML tags, rather than outside them?
How could this be done?
> Belatedly, I notice that the Descriptions of the two modules are
> identical, which will cause confusion when modules are listed just by
> Description, not by [module name]. Daniel, would you have any objection
> if we added "(English)" and "(Greek)" respectively to each?
I've included the updated .conf files.
Thanks,
Daniel