Comments

edited

Issue details

We have an automatic test in which two users are collaborating on a document. User 1 is typing. User 2 waits until the document has a size of at least 34. Then user 2 selects from characters 25 to 30, hits backspace and then undo. User 1 continues typing the entire time. Sometimes this leads to a range error for user 2:

Uncaught RangeError: Position 36 out of range, 55, RangeError: Position 36 out of range

It is very difficult to isolate this issue, as I have been unable to trigger it on my own laptop. Only on travis I can see it happening, and only about 25-30% of the times I am running the tests. I have added some debug statements to a version of prosemirror-history to be output in travis to get more of an idea of what is going on.

So far I have found this:

The error seems to be triggered by two things happening at the same time:

The selection here in most cases has an anchor of 25 and head 30. This is what one would expect as those correspond to the selection initially set for user 2. However, sometimes, the anchor and head are offset by around 5 - so they are 30 and 36 instead of 25 and 30.

Sometimes, the document size much longer than the selection head. If the selection head is 36 and the document size is 37, the range error occurs.

I can see that the selection anchor/head are already 30/36 here, before the mapping in the following line.

The numbers are fairly consistent - while the length of the document varies, the anchor is always either 25 (as it should be) or 30. And the head is either 30 (as it should be) or 36. Seems like something about the mapping of the selection is not working in case of incoming changes from another source sometimes. Notice that the difference of 5 between the correct anchor and actual one happens to be the length of the piece of content that was deleted.

Steps to reproduce

Very difficult. But one can fork the Fidus Writer repository, make some changes and connect it to travis and make the test run a few times. My prosemirror-history with debugging additions can be found here.

ProseMirror version

Latest

Affected platforms

Chrome

Firefox

Internet Explorer

Other

This comment has been minimized.

edited

Case 1 (working correctly):

At the time when undo is executed by user 2, user 2 has a selection from anchor 25 to head 25. There are no sendable steps and the document size is 32. The last messages sent and received from the server are all by user 2:

a selection change from 25 to 30.

a replace step from 25 to 30.

a confirmation from the server of the replace step having been accepted.

At the time when undo is executed by user 2, user 2 has a selection from anchor 25 to head 25. There are no sendable steps and the document size is 33. The last messages sent and received from the server are a mix of changes from user 1 and 2:

A replace step with slice at 31/31 by user 1.

A selection change from 25 to 30 by user 2.

A replace step from 25 to 30 by user 2.

A confirmation to user 2 from the server of the replace step in 3 having been accepted.

At the time when undo is executed by user 2, user 2 has a selection from anchor 25 to head 25. There are no sendable steps and the document size is 32. The last messages sent and received from the server are a mix of changes from user 1 and 2:

A replace step with slice at 30/30 by user 1.

A selection change from 25 to 30 by user 2.

A replace step with slice at 31/31 by user 1.

A replace step from 25 to 30 by user 2.

A confirmation to user 2 from the server of the replace step in 4 having been accepted.

This comment has been minimized.

The processes inside of ProseMirror are entirely deterministic. As such, it should be possible to record a trace of what, exactly, happens in what order when this error occurs. If looking at that doesn't show the issue to be on your side, please use it to create a (deterministically failing) test case on top of core ProseMirror and I'll take a look.

This comment has been minimized.

Are there timeouts in there? That's the only explanation I can find for why it would fail on travis sometimes and never on my own laptop. Previously I thought it was because travis was running the tests remotely via saucelabs, but for the past half year I have been running tests on travis with headless chrome and the error keeps showing up.

This comment has been minimized.

Until yesterday, all our replace steps were accompanied by an addmark step that would cover the inserted content (for tracking changes). Yesterday I changed this to just a single replace step which now includes the mark by replacing the original tr rather than using appendTransaction to add the mark in a second step. This change has done that so far I have not been able to trigger this issue again. I cannot say it's gone for good, but at least for now with the new setup I am not likely able to collect enough data to create test cases, etc. . I will close this issue and reopen if I discover it shows up in production usage.

This comment has been minimized.

Not in prosemirror-model, transform, or collab. Since you're testing your own integration with collab, which I guess involves some kind of communication with a server, I'd look for the problem there first.

This comment has been minimized.

I spent a few days looking at my own code first before contacting you about this last year. I spent another few days last week.

But in the end it comes down to the prosemirror undo function being triggered leading to a range error within prosemirror code. One part of prosemirror history wants to set the selection outside of where it can go and another part is creating a range error (see numbers in the answer here with the examples). Even with other bugs in other parts of my code, it seems like prosemirror's undo function should not under any circumstances return a doc and a selection that is outside of that doc.

If the error is in my code, it must be my code inputting incorrect values into prosemirror in some way -- calling functions with incorrect variables, etc. . It's not impossible, but at this stage, after looking through my own code for a while, the next place I would look first is in the history plugin and what causes it to determine this strange selection position. And I did look in there. It's just that a lot of that reads like assembler to me and I have a hard time getting my head around what the variables mean. I was just thinking that maybe you have more of an idea of what could be causing mapping issues with the selection in the undo function.

I tried recreating the same steps manually, but the same issue doesn't seem to happen then. I will reopen and try to find more details should I run into this again.

This comment has been minimized.

edited

The way you just recently mentioned setting one transaction's meta property to that of another transaction seems like it might be worth investigating (as you noted, that is not safe).

I know it's not safe which is why I am looking for alternatives. But this issue showed up before that, when I was feeding values into ProseMirror according to official recommendations (as far as I know). Since I have switched to the new solution (which involves setting the meta in this unsafe way) I have not been able to trigger this error again.

This comment has been minimized.

Maybe this is a better approach: Where in the history plugin could I add logging statements to see what mappings it's trying to map the selection through? Even if the bug is in my code, I think I'll need to log those in order to find it.

This comment has been minimized.

I'm sorry, the history implementation is pretty complicated, and I can't answer that question in a simple way. But I think the first point of order is to try and isolate and reduce the failure case, because debugging a non-reproduceable issue is going to be very painful. For that, I think you'll have to log a lot of information about the transactions as they are applied in your test, and maybe try to mechanically diff failing and succeeding ones.