The divs may contain other tags, and may not have the spans in that order, quantity or positioning. I do not want to capture any tags, except the spans with the class mentioned.

any idea how I might go about this?

I was thinking of writing a recursive function that gathers the part before the first span and the first span, then feeds the rest into the same function, but I can't figure out how to feed only that section of the DOM to the recursive function.
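
One way to sketch that recursive idea is to work on an array of the div's child nodes rather than on the DOM itself, so "the rest" is just a `slice` of the array. The helper names `isWordGroup`, `nodeToString` and `splitNodes` are made up for illustration, and the sketch assumes the spans are direct children of the div (nested spans would need a deeper walk). Text nodes are taken as raw text, not HTML-escaped.

```javascript
// Hypothetical helper: true for <span class="wordGroup"> element nodes.
function isWordGroup(node) {
  return node.nodeType === 1 &&                 // 1 === element node
    node.nodeName === "SPAN" &&
    (" " + node.className + " ").indexOf(" wordGroup ") !== -1;
}

// Hypothetical helper: elements as markup, text nodes as text, anything else dropped.
function nodeToString(node) {
  if (node.nodeType === 1) return node.outerHTML;
  if (node.nodeType === 3) return node.textContent; // 3 === text node
  return "";
}

// Gathers everything before the first wordGroup span, then the span itself,
// then feeds the remaining siblings back into the same function.
function splitNodes(nodes) {
  if (nodes.length === 0) return [];
  var i = 0;
  while (i < nodes.length && !isWordGroup(nodes[i])) i++;
  var result = [];
  var before = nodes.slice(0, i).map(nodeToString).join("").trim();
  if (before) result.push({kind: "ungrouped", value: before});
  if (i < nodes.length) {
    result.push({kind: "grouped", value: nodes[i].outerHTML});
    return result.concat(splitNodes(nodes.slice(i + 1)));
  }
  return result;
}

// In a browser (assuming the div has id "problematicDiv"):
// var kids = Array.prototype.slice.call(document.getElementById("problematicDiv").childNodes);
// console.log(splitNodes(kids));
```

Because the slots come back in document order, the position in the returned array is the ordering index between grouped and ungrouped pieces.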

Can you explain how that code works? I'm having a little trouble wrapping my head around it...

also, it seems like it does a good job of splitting (although I'm getting an error, initially), but I need to retain an index of ordering between the two types, so I know that ungrouped[1] comes before grouped[1] and that ungrouped[2] is between grouped[1] and grouped[2].

you see, once i have identified the grouped and ungrouped sections, they have to be processed and then reassembled into a similar div.

ok, that makes sense, and it's doable. since we need indexed output, i'll move both into a single array and push objects instead of strings, so each slot indicates whether it is of the grouped or ungrouped kind. the position in the array is then the ordering index.

i'll add some comments to help illustrate how it works, let me know if you want me to zoom-in on any particular part.

Code:

<div id="problematicDiv">
This is a test
<span class="wordGroup">word group 1</span>
more text
<span class="wordGroup">word group 2</span>
last bit of text that stretches on and on
</div>
<script>
var source=document.getElementById("problematicDiv"); // the div to split (IN was never defined in this snippet, so read the div directly)
var div=source.cloneNode(true); // deep clone that we can mess-up without affecting the page
var output=[], // a stack for the extractions
a; // the "current" span within the loop
// note: "<br>" is used as the boundary marker, so this assumes the div has no <br> tags of its own
while((a=div.querySelector("span.wordGroup"))){ // keeps grabbing the first remaining span.wordGroup until none are left:
var boundary=document.createElement("br"); // make a boundary to stand-in where the span was
a.parentNode.insertBefore(boundary, a); // inject boundary just before span
a.parentNode.removeChild(a); // detach the span; "a" still refers to it for the push below
var texts=div.innerHTML.split("<br>"); // texts[0] is everything to the left of the extracted span
output.push({kind: "ungrouped", value: texts[0].trim() }); // push the preceding text to the stack
div.innerHTML=texts.slice(1).join("<br>"); // removes the content before the boundary and the boundary itself
output.push({kind: "grouped", value: a.outerHTML}); // push the span code to the stack
}// next span
output.push({kind: "ungrouped", value: div.innerHTML.trim() }); // push the last remaining text to the stack
console.log(output); // view the output structure
</script>
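
For the "process and reassemble" step, a rough sketch follows. The `slots` literal is what the sample div above should produce, written out by hand here so the snippet runs on its own. `processUngrouped`, `processGrouped` and `reassemble` are placeholder names, and the processing they do is made up purely for illustration, so swap in whatever each kind really needs.

```javascript
// Slots in the same shape as the output array (values assumed from the sample div).
var slots = [
  {kind: "ungrouped", value: "This is a test"},
  {kind: "grouped",   value: '<span class="wordGroup">word group 1</span>'},
  {kind: "ungrouped", value: "more text"},
  {kind: "grouped",   value: '<span class="wordGroup">word group 2</span>'},
  {kind: "ungrouped", value: "last bit of text that stretches on and on"}
];

// Placeholder processing for loose text: upper-case it.
function processUngrouped(text) {
  return text.toUpperCase();
}

// Placeholder processing for spans: tag them with an extra class.
function processGrouped(html) {
  return html.replace('class="wordGroup"', 'class="wordGroup done"');
}

// Walks the slots in array order, so the original ordering is preserved,
// and joins the processed pieces back into one HTML string.
function reassemble(list) {
  return list.map(function (slot) {
    return slot.kind === "grouped" ? processGrouped(slot.value)
                                   : processUngrouped(slot.value);
  }).join("\n");
}

var html = reassemble(slots);
console.log(html);

// In a browser, the string could then go into a fresh div:
// var rebuilt = document.createElement("div");
// rebuilt.innerHTML = html;
```

Since every slot carries its `kind` and the array keeps document order, there is no need for separate grouped/ungrouped indexes when putting things back together.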