*[RFC PATCH 0/7] merge requirement: index matches head@ 2018-06-03 6:58 Elijah Newren
2018-06-03 6:58 ` [RFC PATCH 1/7] t6044: verify that merges expected to abort actually abort Elijah Newren
` (7 more replies)0 siblings, 8 replies; 30+ messages in thread
From: Elijah Newren @ 2018-06-03 6:58 UTC (permalink / raw)
To: git; +Cc: jrnieder, Elijah Newren
Between working on some other things, I happened to be reading
Documentation/git-merge.txt and ran across the part that says:
...[merge will] abort if there are any changes registered in the
index relative to the `HEAD` commit. (One exception is when the
changed index entries are in the state that would result from the
merge already.)
I was pretty sure this statement was wrong, but did some digging to
uncover the details and the history. What I thought would turn into a
simple three-line documentation fix, ballooned into this patch series.
This series might be best read in a different order; I'm not yet sure
the right way to structure it. But:
* Patch 5 demonstrates one of the ways that the parenthetical
sentence is wrong (desirable perhaps, but not what is implemented)
* Patch 7 explains the history, the trade-offs, the three ways the
parenthetical sentence is wrong, and the many pitfalls we've run
into trying to allow for such an exception. Very small
documentation fix with a huge commit message.
* Patch 6 fixes the breakage demonstrated in patch 5, but if I only
submitted patches 5-7, then the testsuite wouldn't pass because
this fix uncovered multiple other bugs. That's where patches 1-4
came in. This fix is also kind of opinionated; it takes the stance
that allowing the exceptions isn't worth it.
Elijah Newren (7):
t6044: verify that merges expected to abort actually abort
t6044: add a testcase for index matching head, when head doesn't match HEAD
merge-recursive: make sure when we say we abort that we actually abort
merge-recursive: fix assumption that head tree being merged is HEAD
t6044: add more testcases with staged changes before a merge is invoked
merge-recursive: enforce rule that index matches head before merging
merge: fix misleading pre-merge check documentation
Documentation/git-merge.txt | 6 +-
builtin/am.c | 6 +-
cache.h | 8 --
merge-recursive.c | 14 +--
merge.c | 10 +-
t/t6044-merge-unrelated-index-changes.sh | 67 +++++++++++--
t/t7504-commit-msg-hook.sh | 4 +-
t/t7611-merge-abort.sh | 118 -----------------------
tree.h | 8 ++
9 files changed, 87 insertions(+), 154 deletions(-)
--
2.18.0.rc0.49.g3c08dc0fef
^permalinkrawreply [flat|nested] 30+ messages in thread

*[RFC PATCH 7/7] merge: fix misleading pre-merge check documentation
2018-06-03 6:58 [RFC PATCH 0/7] merge requirement: index matches head Elijah Newren
` (5 preceding siblings ...)
2018-06-03 6:58 ` [RFC PATCH 6/7] merge-recursive: enforce rule that index matches head before merging Elijah Newren
@ 2018-06-03 6:58 ` Elijah Newren
2018-06-07 5:27 ` Elijah Newren
2018-07-01 1:24 ` [PATCH v2 0/9] Fix merge issues with index not matching HEAD Elijah Newren
7 siblings, 1 reply; 30+ messages in thread
From: Elijah Newren @ 2018-06-03 6:58 UTC (permalink / raw)
To: git; +Cc: jrnieder, Elijah Newren
builtin/merge.c contains this important requirement for merge strategies:
...the index must be in sync with the head commit. The strategies are
responsible to ensure this.
However, Documentation/git-merge.txt says:
...[merge will] abort if there are any changes registered in the index
relative to the `HEAD` commit. (One exception is when the changed
index entries are in the state that would result from the merge
already.)
Interestingly, prior to commit c0be8aa06b85 ("Documentation/git-merge.txt:
Partial rewrite of How Merge Works", 2008-07-19),
Documentation/git-merge.txt said much more:
...the index file must match the tree of `HEAD` commit...
[NOTE]
This is a bit of a lite. In certain special cases [explained
in detail]...
Otherwise, merge will refuse to do any harm to your repository
(that is...your working tree...and index are left intact).
So, this suggests that the exceptions existed because there were special
cases where it would case no harm, and potentially be slightly more
convenient for the user. While the current text in git-merge.txt does
list a condition under which it would be safe to proceed despite the index
not matching HEAD, it does not match what is actually implemented, in
three different ways:
* The exception is written to describe what unpack-trees allows. Not
all merge strategies allow such an exception, though, making this
description misleading. 'ours' and 'octopus' merges have strictly
enforced index==HEAD for a while, and the commit previous to this
one made 'recursive' do so as well.
* If someone did a three-way content merge on a specific file using
versions from the relevant commits and staged it prior to running
merge, then that path would technically satisfy the exception listed
in git-merge.txt. unpack-trees.c would still error out on the path,
though, because it defers the three-way content merge logic to other
parts of the code (resolve, octopus, or recursive) and has no way of
checking whether the index entry from before the merge will match
the end result of the merge.
* The exception as implemented in unpack-trees actually only checked
that the index matched the MERGE_HEAD version of the file and that
HEAD matched the merge base. Assuming no renames, that would indeed
provide cases where the index matches the end result we'd get from a
merge. But renames means unpack-trees is checking that it instead
matches something other than what the final result will be, risking
either erroring out when we shouldn't need to, or not erroring out
when we should and overwriting the user's staged changes.
In addition to the wording behind this exception being misleading, it is
also somewhat surprising to see how many times the code for the special
cases were wrong or the check to make sure the index matched head was
forgotten altogether:
* Prior to commit ee6566e8d70d ("[PATCH] Rewrite read-tree", 2005-09-05),
there were many cases where an unclean index entry was allowed (look for
merged_entry_allow_dirty()); it appears that in those cases, the merge
would have simply overwritten staged changes with the result of the
merge. Thus, the merge result would have been correct, but the user's
uncommitted changes could be thrown away without warning.
* Prior to commit 160252f81626 ("git-merge-ours: make sure our index
matches HEAD", 2005-11-03), the 'ours' merge strategy did not check
whether the index matched HEAD. If it didn't, the resulting merge
would include all the staged changes, and thus wasn't really an 'ours'
strategy.
* Prior to commit 3ec62ad9ffba ("merge-octopus: abort if index does not
match HEAD", 2016-04-09), 'octopus' merges did not check whether the
index matched HEAD, also resulting in any staged changes from before
the commit silently being folded into the resulting merge. commit
a6ee883b8eb5 ("t6044: new merge testcases for when index doesn't match
HEAD", 2016-04-09) was also added at the same time to try to test to
make sure all strategies did the necessary checking for the requirement
that the index match HEAD. Sadly, it didn't catch all the cases, as
evidenced by the remainder of this list...
* Prior to commit 65170c07d466 ("merge-recursive: avoid incorporating
uncommitted changes in a merge", 2017-12-21), merge-recursive simply
relied on unpack_trees() to do the necessary check, but in one special
case it avoided calling unpack_trees() entirely and accidentally ended
up silently including any staged changes from before the merge in the
resulting merge commit.
* The commit immediately before this one in this series noted that the
exceptions were written in a way that assumed no renames, making it
unsafe for merge-recursive to use. merge-recursive was modified to
use its own check to enforce that index==HEAD.
This history makes it very tempting to go into builtin/merge.c and replace
the comment that strategies must enforce that index matches HEAD with code
that just enforces it. At this point, that would only affect the
'resolve' strategy; all other strategies have each been modified to
manually enforce it. (However, note that index==HEAD is not strictly
enforced for fast-forward merges, as those are not considered a merge
strategy and they trigger in builtin/merge.c before the section in the
code where the relevant comment is found.)
But, even if we don't take the step of just fixing these problems by
enforcing index==HEAD for all strategies, we at least need to update this
misleading documentation in git-merge.txt. For now, just modify the claim
in Documentation/git-merge.txt to fix the error. The precise details
around combination of merges strategies and special cases probably is not
relevant to most users, so simply state that exceptions may exist but are
narrow and vary depending upon which merge strategy is in use.
Signed-off-by: Elijah Newren <newren@gmail.com>
---
Documentation/git-merge.txt | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/Documentation/git-merge.txt b/Documentation/git-merge.txt
index d5dfd8430f..141bd72284 100644
--- a/Documentation/git-merge.txt
+++ b/Documentation/git-merge.txt
@@ -122,9+122,9 @@ merge' may need to update.
To avoid recording unrelated changes in the merge commit,
'git pull' and 'git merge' will also abort if there are any changes
-registered in the index relative to the `HEAD` commit. (One
-exception is when the changed index entries are in the state that
-would result from the merge already.)
+registered in the index relative to the `HEAD` commit. (Special
+narrow exceptions to this rule may exist depending on which merge
+strategy is in use, but generally, the index must match HEAD.)
If all named commits are already ancestors of `HEAD`, 'git merge'
will exit early with the message "Already up to date."
--
2.18.0.rc0.49.g3c08dc0fef
^permalinkrawreply [flat|nested] 30+ messages in thread

*Re: [RFC PATCH 4/7] merge-recursive: fix assumption that head tree being merged is HEAD
2018-06-03 6:58 ` [RFC PATCH 4/7] merge-recursive: fix assumption that head tree being merged is HEAD Elijah Newren
@ 2018-06-03 13:52 ` Ramsay Jones
2018-06-03 23:37 ` brian m. carlson
2018-06-04 3:19 ` Junio C Hamano1 sibling, 1 reply; 30+ messages in thread
From: Ramsay Jones @ 2018-06-03 13:52 UTC (permalink / raw)
To: Elijah Newren, git; +Cc: jrnieder
On 03/06/18 07:58, Elijah Newren wrote:
> `git merge-recursive` does a three-way merge between user-specified trees
> base, head, and remote. Since the user is allowed to specify head, we can
> not necesarily assume that head == HEAD.
>
> We modify index_has_changes() to take an extra argument specifying the
> tree to compare the index to. If NULL, it will compare to HEAD. We then
> use this from merge-recursive to make sure we compare to the
> user-specified head.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>
> I'm really unsure where the index_has_changes() declaration should go;
> I stuck it in tree.h, but is there a better spot?
Err, leave it where it is and '#include "tree.h"' ? :-D
ATB,
Ramsay Jones
^permalinkrawreply [flat|nested] 30+ messages in thread

*Re: [RFC PATCH 4/7] merge-recursive: fix assumption that head tree being merged is HEAD
2018-06-03 23:37 ` brian m. carlson@ 2018-06-04 2:26 ` Ramsay Jones0 siblings, 0 replies; 30+ messages in thread
From: Ramsay Jones @ 2018-06-04 2:26 UTC (permalink / raw)
To: brian m. carlson, Elijah Newren, git, jrnieder
On 04/06/18 00:37, brian m. carlson wrote:
> On Sun, Jun 03, 2018 at 02:52:12PM +0100, Ramsay Jones wrote:
>> On 03/06/18 07:58, Elijah Newren wrote:
>>> I'm really unsure where the index_has_changes() declaration should go;
>>> I stuck it in tree.h, but is there a better spot?
>>
>> Err, leave it where it is and '#include "tree.h"' ? :-D
>
> Or leave it where it is and use a forward structure declaration?
Indeed, I had intended to mention that possibility as well.
[Note: the "merge-recursive.h" header file references several
'struct tree *' parameters, but does not itself include a
declaration/definition from any source. So, in all of the six
files that #include it, it relies on a previous #include to
provide such a declaration/definition. I haven't checked, but
I think that it is usually provided by the "commit.h" header (even
on the single occasion that "tree.h" was included!).]
ATB,
Ramsay Jones
^permalinkrawreply [flat|nested] 30+ messages in thread

*Re: [RFC PATCH 4/7] merge-recursive: fix assumption that head tree being merged is HEAD
2018-06-04 3:19 ` Junio C Hamano@ 2018-06-05 7:14 ` Elijah Newren
2018-06-11 16:15 ` Elijah Newren0 siblings, 1 reply; 30+ messages in thread
From: Elijah Newren @ 2018-06-05 7:14 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List, Jonathan Nieder
On Sun, Jun 3, 2018 at 8:19 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Elijah Newren <newren@gmail.com> writes:
>
>> `git merge-recursive` does a three-way merge between user-specified trees
>> base, head, and remote. Since the user is allowed to specify head, we can
>> not necesarily assume that head == HEAD.
>>
>> We modify index_has_changes() to take an extra argument specifying the
>> tree to compare the index to. If NULL, it will compare to HEAD. We then
>> use this from merge-recursive to make sure we compare to the
>> user-specified head.
>>
>> Signed-off-by: Elijah Newren <newren@gmail.com>
>> ---
>>
>> I'm really unsure where the index_has_changes() declaration should go;
>> I stuck it in tree.h, but is there a better spot?
>
> I think I saw you tried to lift an assumption that we're always
> working on the_index in a separate patch recently. Should that
> logic apply also to this part of the codebase? IOW, shouldn't
> index_has_changes() take a pointer to istate (as opposed to a
> function that uses the implicit the_index that should be named as
> "cache_has_changes()" or something?)
>
> I tend to think this function as part of the larger read-cache.c
> family whose definitions are in cache.h and accompanied by macros
> that are protected by NO_THE_INDEX_COMPATIBILITY_MACROS so if we
> were to move it elsewhere, I'd keep the header part as-is and
> implementation to read-cache.c to keep it together with the family,
> but I do not see a huge issue with the current placement, either.
That's good point; the goal to lift assumptions on the_index should
probably also apply here. I'll make the change.
(And it was actually Duy's patch that I was reviewing, but close
enough.) I'll take a look at moving it to read-cache.c as well.
^permalinkrawreply [flat|nested] 30+ messages in thread

*Re: [RFC PATCH 7/7] merge: fix misleading pre-merge check documentation
2018-06-03 6:58 ` [RFC PATCH 7/7] merge: fix misleading pre-merge check documentation Elijah Newren
@ 2018-06-07 5:27 ` Elijah Newren0 siblings, 0 replies; 30+ messages in thread
From: Elijah Newren @ 2018-06-07 5:27 UTC (permalink / raw)
To: Git Mailing List; +Cc: Jonathan Nieder, Elijah Newren
On Sat, Jun 2, 2018 at 11:58 PM, Elijah Newren <newren@gmail.com> wrote:
> builtin/merge.c contains this important requirement for merge strategies:
>
> ...the index must be in sync with the head commit. The strategies are
> responsible to ensure this.
>
> However, Documentation/git-merge.txt says:
>
> ...[merge will] abort if there are any changes registered in the index
> relative to the `HEAD` commit. (One exception is when the changed
> index entries are in the state that would result from the merge
> already.)
>
> Interestingly, prior to commit c0be8aa06b85 ("Documentation/git-merge.txt:
> Partial rewrite of How Merge Works", 2008-07-19),
> Documentation/git-merge.txt said much more:
>
> ...the index file must match the tree of `HEAD` commit...
> [NOTE]
> This is a bit of a lite. In certain special cases [explained
> in detail]...
> Otherwise, merge will refuse to do any harm to your repository
> (that is...your working tree...and index are left intact).
>
> So, this suggests that the exceptions existed because there were special
> cases where it would case no harm, and potentially be slightly more
> convenient for the user. While the current text in git-merge.txt does
> list a condition under which it would be safe to proceed despite the index
> not matching HEAD, it does not match what is actually implemented, in
> three different ways:
>
> * The exception is written to describe what unpack-trees allows. Not
> all merge strategies allow such an exception, though, making this
> description misleading. 'ours' and 'octopus' merges have strictly
> enforced index==HEAD for a while, and the commit previous to this
> one made 'recursive' do so as well.
>
> * If someone did a three-way content merge on a specific file using
> versions from the relevant commits and staged it prior to running
> merge, then that path would technically satisfy the exception listed
> in git-merge.txt. unpack-trees.c would still error out on the path,
> though, because it defers the three-way content merge logic to other
> parts of the code (resolve, octopus, or recursive) and has no way of
> checking whether the index entry from before the merge will match
> the end result of the merge.
>
> * The exception as implemented in unpack-trees actually only checked
> that the index matched the MERGE_HEAD version of the file and that
> HEAD matched the merge base. Assuming no renames, that would indeed
> provide cases where the index matches the end result we'd get from a
> merge. But renames means unpack-trees is checking that it instead
> matches something other than what the final result will be, risking
> either erroring out when we shouldn't need to, or not erroring out
> when we should and overwriting the user's staged changes.
>
> In addition to the wording behind this exception being misleading, it is
> also somewhat surprising to see how many times the code for the special
> cases were wrong or the check to make sure the index matched head was
> forgotten altogether:
>
> * Prior to commit ee6566e8d70d ("[PATCH] Rewrite read-tree", 2005-09-05),
> there were many cases where an unclean index entry was allowed (look for
> merged_entry_allow_dirty()); it appears that in those cases, the merge
> would have simply overwritten staged changes with the result of the
> merge. Thus, the merge result would have been correct, but the user's
> uncommitted changes could be thrown away without warning.
>
> * Prior to commit 160252f81626 ("git-merge-ours: make sure our index
> matches HEAD", 2005-11-03), the 'ours' merge strategy did not check
> whether the index matched HEAD. If it didn't, the resulting merge
> would include all the staged changes, and thus wasn't really an 'ours'
> strategy.
>
> * Prior to commit 3ec62ad9ffba ("merge-octopus: abort if index does not
> match HEAD", 2016-04-09), 'octopus' merges did not check whether the
> index matched HEAD, also resulting in any staged changes from before
> the commit silently being folded into the resulting merge. commit
> a6ee883b8eb5 ("t6044: new merge testcases for when index doesn't match
> HEAD", 2016-04-09) was also added at the same time to try to test to
> make sure all strategies did the necessary checking for the requirement
> that the index match HEAD. Sadly, it didn't catch all the cases, as
> evidenced by the remainder of this list...
>
> * Prior to commit 65170c07d466 ("merge-recursive: avoid incorporating
> uncommitted changes in a merge", 2017-12-21), merge-recursive simply
> relied on unpack_trees() to do the necessary check, but in one special
> case it avoided calling unpack_trees() entirely and accidentally ended
> up silently including any staged changes from before the merge in the
> resulting merge commit.
>
> * The commit immediately before this one in this series noted that the
> exceptions were written in a way that assumed no renames, making it
> unsafe for merge-recursive to use. merge-recursive was modified to
> use its own check to enforce that index==HEAD.
>
> This history makes it very tempting to go into builtin/merge.c and replace
> the comment that strategies must enforce that index matches HEAD with code
> that just enforces it. At this point, that would only affect the
> 'resolve' strategy; all other strategies have each been modified to
> manually enforce it.
I'm curious if anyone has comments on this last paragraph above.
Would anyone object to strictly enforcing index matches HEAD before
all types of merges?
^permalinkrawreply [flat|nested] 30+ messages in thread

*Re: [RFC PATCH 4/7] merge-recursive: fix assumption that head tree being merged is HEAD
2018-06-05 7:14 ` Elijah Newren@ 2018-06-11 16:15 ` Elijah Newren0 siblings, 0 replies; 30+ messages in thread
From: Elijah Newren @ 2018-06-11 16:15 UTC (permalink / raw)
To: Junio C Hamano
Cc: Git Mailing List, Jonathan Nieder, Nguyễn Thái Ngọc
On Tue, Jun 5, 2018 at 12:14 AM, Elijah Newren <newren@gmail.com> wrote:
> On Sun, Jun 3, 2018 at 8:19 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Elijah Newren <newren@gmail.com> writes:
>>
>>> `git merge-recursive` does a three-way merge between user-specified trees
>>> base, head, and remote. Since the user is allowed to specify head, we can
>>> not necesarily assume that head == HEAD.
>>>
>>> We modify index_has_changes() to take an extra argument specifying the
>>> tree to compare the index to. If NULL, it will compare to HEAD. We then
>>> use this from merge-recursive to make sure we compare to the
>>> user-specified head.
>>>
>>> Signed-off-by: Elijah Newren <newren@gmail.com>
>>> ---
>>>
>>> I'm really unsure where the index_has_changes() declaration should go;
>>> I stuck it in tree.h, but is there a better spot?
>>
>> I think I saw you tried to lift an assumption that we're always
>> working on the_index in a separate patch recently. Should that
>> logic apply also to this part of the codebase? IOW, shouldn't
>> index_has_changes() take a pointer to istate (as opposed to a
>> function that uses the implicit the_index that should be named as
>> "cache_has_changes()" or something?)
>>
>> I tend to think this function as part of the larger read-cache.c
>> family whose definitions are in cache.h and accompanied by macros
>> that are protected by NO_THE_INDEX_COMPATIBILITY_MACROS so if we
>> were to move it elsewhere, I'd keep the header part as-is and
>> implementation to read-cache.c to keep it together with the family,
>> but I do not see a huge issue with the current placement, either.
>
> That's good point; the goal to lift assumptions on the_index should
> probably also apply here. I'll make the change.
> (And it was actually Duy's patch that I was reviewing, but close
> enough.) I'll take a look at moving it to read-cache.c as well.
Making it not depend on the_index will require changes to make
diff-lib.c not depend on the_index first, so this is going to have to
wait for Duy's changes mentioned at
https://public-inbox.org/git/CACsJy8Ba74iSPf4_zFxuV=_uNJgL6Z2QunOvAvi3qab-6EWi5g@mail.gmail.com/.
I'll re-roll this series on top of Duy's when it comes out.
^permalinkrawreply [flat|nested] 30+ messages in thread

*[PATCH v2 0/9] Fix merge issues with index not matching HEAD
2018-06-03 6:58 [RFC PATCH 0/7] merge requirement: index matches head Elijah Newren
` (6 preceding siblings ...)
2018-06-03 6:58 ` [RFC PATCH 7/7] merge: fix misleading pre-merge check documentation Elijah Newren
@ 2018-07-01 1:24 ` Elijah Newren
2018-07-01 1:24 ` [PATCH v2 1/9] read-cache.c: move index_has_changes() from merge.c Elijah Newren
` (8 more replies)7 siblings, 9 replies; 30+ messages in thread
From: Elijah Newren @ 2018-07-01 1:24 UTC (permalink / raw)
To: git; +Cc: gitster, pclouds, Elijah Newren
This series exists to fix problems for merges when the index doesn't
match HEAD. We've had an almost comical number of these types of
problems in our history, as thoroughly documented in the commit
message for the final patch.
v1 can be found here:
https://public-inbox.org/git/20180603065810.23841-1-newren@gmail.com/
Changes since v1 (full branch-diff below):
* Minor wording tweaks to a few commit messages (fixing typos,
rewrapping, etc.)
* Move index_has_changes() to read-cache.c, and (partially) lift the
assumption that we're always operating on the_index -- as
suggested by Junio. A full lift of the assumption would conflict
with and duplicate work Duy is doing, so there is a simple BUG()
check in place for now.
* Add two new patches to the _beginning_ of the series, to implement
the last point. Reason: Since this series will probably conflict
slightly with Duy's not-yet-submitted work to remove assumption of
the_index throughout the codebase, I figured this would make it
the clearest and easiest for him to fix up small conflicts.
(Alternatively, if folks would rather that my series wait for his
to go through, it should make it much easier for me to rebase my
series on top of his work by having these placeholders at the
beginning.)
Questions I'm particularly interested in reviewers addressing:
* Do my two patches at the beginning make sense? In particular, is
the BUG() a reasonable way to limit conflicts with Duy for now,
while he works on more thoroughly stamping out assumed the_index
usage?
* Does the series flow well? I was curious if I should reorder the
series when I submitted v1, but no one commented on that question.
* The second to last patch points out that current git incorrectly
implements what would be a safe exception to the index matching
HEAD before a merge rule. Would it be more desireable to
correctly implement the safe exception (even though it would be
somewhat difficult), rather than just disallow the exception for
merge-recursive as I did in that patch?
* Given the large number of problems we've had in this area -- as
documented in the final commit message -- should we be more
defensive and disallow all merge strategies from having even
so-called safe exceptions? We could do this in a single place,
which would have prevented all the current and most if not all
historical problems in this area, by just enforcing that the index
match HEAD in builtin/merge.c. (See the second to last paragraph
of the last commit message for more details.)
Elijah Newren (9):
read-cache.c: move index_has_changes() from merge.c
index_has_changes(): avoid assuming operating on the_index
t6044: verify that merges expected to abort actually abort
t6044: add a testcase for index matching head, when head doesn't match
HEAD
merge-recursive: make sure when we say we abort that we actually abort
merge-recursive: fix assumption that head tree being merged is HEAD
t6044: add more testcases with staged changes before a merge is
invoked
merge-recursive: enforce rule that index matches head before merging
merge: fix misleading pre-merge check documentation
Documentation/git-merge.txt | 6 +-
builtin/am.c | 7 +-
cache.h | 16 +--
merge-recursive.c | 14 +--
merge.c | 31 ------
read-cache.c | 40 ++++++++
t/t6044-merge-unrelated-index-changes.sh | 67 +++++++++++--
t/t7504-commit-msg-hook.sh | 4 +-
t/t7611-merge-abort.sh | 118 -----------------------
9 files changed, 124 insertions(+), 179 deletions(-)
-: --------- > 1: ff2501ac4 read-cache.c: move index_has_changes() from merge.c
-: --------- > 2: 5813ca722 index_has_changes(): avoid assuming operating on the_index
1: 730e5e483 = 3: ca11503bd t6044: verify that merges expected to abort actually abort
2: 36f7cc0a3 = 4: 386390899 t6044: add a testcase for index matching head, when head doesn't match HEAD
3: 70899afa3 ! 5: 8a900d2ee merge-recursive: make sure when we say we abort that we actually abort
@@ -19,7 +19,7 @@
@@
struct strbuf sb = STRBUF_INIT;
- if (!o->call_depth && index_has_changes(&sb)) {
+ if (!o->call_depth && index_has_changes(&the_index, &sb)) {
- err(o, _("Dirty index: cannot merge (dirty: %s)"),
+ err(o, _("Your local changes to the following files would be overwritten by merge:\n %s"),
sb.buf);
4: eab2f36a4 ! 6: 2564b29e9 merge-recursive: fix assumption that head tree being merged is HEAD
@@ -6,10 +6,9 @@
base, head, and remote. Since the user is allowed to specify head, we can
not necesarily assume that head == HEAD.
- We modify index_has_changes() to take an extra argument specifying the
- tree to compare the index to. If NULL, it will compare to HEAD. We then
- use this from merge-recursive to make sure we compare to the
- user-specified head.
+ Modify index_has_changes() to take an extra argument specifying the tree
+ to compare against. If NULL, it will compare to HEAD. We then use this
+ from merge-recursive to make sure we compare to the user-specified head.
Signed-off-by: Elijah Newren <newren@gmail.com>
@@ -20,8 +19,8 @@
refresh_and_write_cache();
-- if (index_has_changes(&sb)) {
-+ if (index_has_changes(&sb, NULL)) {
+- if (index_has_changes(&the_index, &sb)) {
++ if (index_has_changes(&the_index, NULL, &sb)) {
write_state_bool(state, "dirtyindex", 1);
die(_("Dirty index: cannot apply patches (dirty: %s)"), sb.buf);
}
@@ -29,8 +28,9 @@
* Applying the patch to an earlier tree and merging
* the result may have produced the same tree as ours.
*/
-- if (!apply_status && !index_has_changes(NULL)) {
-+ if (!apply_status && !index_has_changes(NULL, NULL)) {
+- if (!apply_status && !index_has_changes(&the_index, NULL)) {
++ if (!apply_status &&
++ !index_has_changes(&the_index, NULL, NULL)) {
say(state, stdout, _("No changes -- Patch already applied."));
goto next;
}
@@ -38,8 +38,8 @@
say(state, stdout, _("Applying: %.*s"), linelen(state->msg), state->msg);
-- if (!index_has_changes(NULL)) {
-+ if (!index_has_changes(NULL, NULL)) {
+- if (!index_has_changes(&the_index, NULL)) {
++ if (!index_has_changes(&the_index, NULL, NULL)) {
printf_ln(_("No changes - did you forget to use 'git add'?\n"
"If there is nothing left to stage, chances are that something else\n"
"already introduced the same changes; you might want to skip this patch."));
@@ -48,20 +48,32 @@
--- a/cache.h
+++ b/cache.h
@@
- extern void move_index_extensions(struct index_state *dst, struct index_state *src);
+ /* Forward structure decls */
+ struct pathspec;
+ struct child_process;
++struct tree;
+
+ /*
+ * Copy the sha1 and stat state of a cache entry from one to
+@@
extern int unmerged_index(const struct index_state *);
--/**
-- * Returns 1 if the index differs from HEAD, 0 otherwise. When on an unborn
-- * branch, returns 1 if there are entries in the index, 0 otherwise. If an
-- * strbuf is provided, the space-separated list of files that differ will be
-- * appended to it.
-- */
--extern int index_has_changes(struct strbuf *sb);
--
+ /**
+- * Returns 1 if istate differs from HEAD, 0 otherwise. When on an unborn
+- * branch, returns 1 if there are entries in istate, 0 otherwise. If an
+- * strbuf is provided, the space-separated list of files that differ will
+- * be appended to it.
++ * Returns 1 if istate differs from tree, 0 otherwise. If tree is NULL,
++ * compares istate to HEAD. If tree is NULL and on an unborn branch,
++ * returns 1 if there are entries in istate, 0 otherwise. If an strbuf is
++ * provided, the space-separated list of files that differ will be appended
++ * to it.
+ */
+ extern int index_has_changes(const struct index_state *istate,
++ struct tree *tree,
+ struct strbuf *sb);
+
extern int verify_path(const char *path, unsigned mode);
- extern int strcmp_offset(const char *s1, const char *s2, size_t *first_change);
- extern int index_dir_exists(struct index_state *istate, const char *name, int namelen);
diff --git a/merge-recursive.c b/merge-recursive.c
--- a/merge-recursive.c
@@ -70,26 +82,31 @@
if (oid_eq(&common->object.oid, &merge->object.oid)) {
struct strbuf sb = STRBUF_INIT;
-- if (!o->call_depth && index_has_changes(&sb)) {
-+ if (!o->call_depth && index_has_changes(&sb, head)) {
+- if (!o->call_depth && index_has_changes(&the_index, &sb)) {
++ if (!o->call_depth && index_has_changes(&the_index, head, &sb)) {
err(o, _("Your local changes to the following files would be overwritten by merge:\n %s"),
sb.buf);
return -1;
-diff --git a/merge.c b/merge.c
---- a/merge.c
-+++ b/merge.c
+diff --git a/read-cache.c b/read-cache.c
+--- a/read-cache.c
++++ b/read-cache.c
@@
- return oid_to_hex(commit ? &commit->object.oid : the_hash_algo->empty_tree);
+ return 0;
}
--int index_has_changes(struct strbuf *sb)
-+int index_has_changes(struct strbuf *sb, struct tree *tree)
+-int index_has_changes(const struct index_state *istate, struct strbuf *sb)
++int index_has_changes(const struct index_state *istate,
++ struct tree *tree,
++ struct strbuf *sb)
{
- struct object_id head;
+ struct object_id cmp;
int i;
+ if (istate != &the_index) {
+ BUG("index_has_changes cannot yet accept istate != &the_index; do_diff_cache needs updating first.");
+ }
- if (!get_oid_tree("HEAD", &head)) {
+ if (tree)
+ cmp = tree->object.oid;
@@ -118,20 +135,3 @@
git reset --hard &&
git checkout C^0 &&
-
-diff --git a/tree.h b/tree.h
---- a/tree.h
-+++ b/tree.h
-@@
- extern int read_tree(struct tree *tree, int stage, struct pathspec *pathspec,
- struct index_state *istate);
-
-+/**
-+ * Returns 1 if the index differs from tree, 0 otherwise. If tree is NULL,
-+ * compares to HEAD. If tree is NULL and on an unborn branch, returns 1 if
-+ * there are entries in the index, 0 otherwise. If an strbuf is provided,
-+ * the space-separated list of files that differ will be appended to it.
-+ */
-+extern int index_has_changes(struct strbuf *sb, struct tree *tree);
-+
- #endif /* TREE_H */
5: 4aa0684c0 ! 7: 88a8e44a2 t6044: add more testcases with staged changes before a merge is invoked
@@ -5,8 +5,9 @@
According to Documentation/git-merge.txt,
...[merge will] abort if there are any changes registered in the index
- relative to the `HEAD` commit. (One exception is when the changed index
- entries are in the state that would result from the merge already.)
+ relative to the `HEAD` commit. (One exception is when the changed
+ index entries are in the state that would result from the merge
+ already.)
Add some tests showing that this exception, while it does accurately state
what would be a safe condition under which we could allow the merge to
6: 905f2683f ! 8: c0049b788 merge-recursive: enforce rule that index matches head before merging
@@ -48,16 +48,16 @@
commit, just like the 'ours' and 'octopus' strategies do.
Some testcase fixups were in order:
- t6044: We no longer expect stray staged changes to sometimes result
- in the merge continuing. Also, fixes a case where a merge
- didn't abort but should have.
- t7504: had a few tests that had stray staged changes that were not
- actually part of the test under consideration
t7611: had many tests designed to show that `git merge --abort` could
not always restore the index and working tree to the state they
were in before the merge started. The tests that were associated
with having changes in the index before the merge started are no
longer applicable, so they have been removed.
+ t7504: had a few tests that had stray staged changes that were not
+ actually part of the test under consideration
+ t6044: We no longer expect stray staged changes to sometimes result
+ in the merge continuing. Also, fix a case where a merge
+ didn't abort but should have.
Signed-off-by: Elijah Newren <newren@gmail.com>
@@ -70,7 +70,7 @@
int code, clean;
+ struct strbuf sb = STRBUF_INIT;
+
-+ if (!o->call_depth && index_has_changes(&sb, head)) {
++ if (!o->call_depth && index_has_changes(&the_index, head, &sb)) {
+ err(o, _("Your local changes to the following files would be overwritten by merge:\n %s"),
+ sb.buf);
+ return -1;
@@ -84,7 +84,7 @@
if (oid_eq(&common->object.oid, &merge->object.oid)) {
- struct strbuf sb = STRBUF_INIT;
-
-- if (!o->call_depth && index_has_changes(&sb, head)) {
+- if (!o->call_depth && index_has_changes(&the_index, head, &sb)) {
- err(o, _("Your local changes to the following files would be overwritten by merge:\n %s"),
- sb.buf);
- return -1;
7: 69f6fe8b1 ! 9: 557c5d94c merge: fix misleading pre-merge check documentation
@@ -20,7 +20,7 @@
...the index file must match the tree of `HEAD` commit...
[NOTE]
- This is a bit of a lite. In certain special cases [explained
+ This is a bit of a lie. In certain special cases [explained
in detail]...
Otherwise, merge will refuse to do any harm to your repository
(that is...your working tree...and index are left intact).
--
2.18.0.137.g2a11d05a6.dirty
^permalinkrawreply [flat|nested] 30+ messages in thread

*[PATCH v2 9/9] merge: fix misleading pre-merge check documentation
2018-07-01 1:24 ` [PATCH v2 0/9] Fix merge issues with index not matching HEAD Elijah Newren
` (7 preceding siblings ...)
2018-07-01 1:25 ` [PATCH v2 8/9] merge-recursive: enforce rule that index matches head before merging Elijah Newren
@ 2018-07-01 1:25 ` Elijah Newren8 siblings, 0 replies; 30+ messages in thread
From: Elijah Newren @ 2018-07-01 1:25 UTC (permalink / raw)
To: git; +Cc: gitster, pclouds, Elijah Newren
builtin/merge.c contains this important requirement for merge strategies:
...the index must be in sync with the head commit. The strategies are
responsible to ensure this.
However, Documentation/git-merge.txt says:
...[merge will] abort if there are any changes registered in the index
relative to the `HEAD` commit. (One exception is when the changed
index entries are in the state that would result from the merge
already.)
Interestingly, prior to commit c0be8aa06b85 ("Documentation/git-merge.txt:
Partial rewrite of How Merge Works", 2008-07-19),
Documentation/git-merge.txt said much more:
...the index file must match the tree of `HEAD` commit...
[NOTE]
This is a bit of a lie. In certain special cases [explained
in detail]...
Otherwise, merge will refuse to do any harm to your repository
(that is...your working tree...and index are left intact).
So, this suggests that the exceptions existed because there were special
cases where it would case no harm, and potentially be slightly more
convenient for the user. While the current text in git-merge.txt does
list a condition under which it would be safe to proceed despite the index
not matching HEAD, it does not match what is actually implemented, in
three different ways:
* The exception is written to describe what unpack-trees allows. Not
all merge strategies allow such an exception, though, making this
description misleading. 'ours' and 'octopus' merges have strictly
enforced index==HEAD for a while, and the commit previous to this
one made 'recursive' do so as well.
* If someone did a three-way content merge on a specific file using
versions from the relevant commits and staged it prior to running
merge, then that path would technically satisfy the exception listed
in git-merge.txt. unpack-trees.c would still error out on the path,
though, because it defers the three-way content merge logic to other
parts of the code (resolve, octopus, or recursive) and has no way of
checking whether the index entry from before the merge will match
the end result of the merge.
* The exception as implemented in unpack-trees actually only checked
that the index matched the MERGE_HEAD version of the file and that
HEAD matched the merge base. Assuming no renames, that would indeed
provide cases where the index matches the end result we'd get from a
merge. But renames means unpack-trees is checking that it instead
matches something other than what the final result will be, risking
either erroring out when we shouldn't need to, or not erroring out
when we should and overwriting the user's staged changes.
In addition to the wording behind this exception being misleading, it is
also somewhat surprising to see how many times the code for the special
cases were wrong or the check to make sure the index matched head was
forgotten altogether:
* Prior to commit ee6566e8d70d ("[PATCH] Rewrite read-tree", 2005-09-05),
there were many cases where an unclean index entry was allowed (look for
merged_entry_allow_dirty()); it appears that in those cases, the merge
would have simply overwritten staged changes with the result of the
merge. Thus, the merge result would have been correct, but the user's
uncommitted changes could be thrown away without warning.
* Prior to commit 160252f81626 ("git-merge-ours: make sure our index
matches HEAD", 2005-11-03), the 'ours' merge strategy did not check
whether the index matched HEAD. If it didn't, the resulting merge
would include all the staged changes, and thus wasn't really an 'ours'
strategy.
* Prior to commit 3ec62ad9ffba ("merge-octopus: abort if index does not
match HEAD", 2016-04-09), 'octopus' merges did not check whether the
index matched HEAD, also resulting in any staged changes from before
the commit silently being folded into the resulting merge. commit
a6ee883b8eb5 ("t6044: new merge testcases for when index doesn't match
HEAD", 2016-04-09) was also added at the same time to try to test to
make sure all strategies did the necessary checking for the requirement
that the index match HEAD. Sadly, it didn't catch all the cases, as
evidenced by the remainder of this list...
* Prior to commit 65170c07d466 ("merge-recursive: avoid incorporating
uncommitted changes in a merge", 2017-12-21), merge-recursive simply
relied on unpack_trees() to do the necessary check, but in one special
case it avoided calling unpack_trees() entirely and accidentally ended
up silently including any staged changes from before the merge in the
resulting merge commit.
* The commit immediately before this one in this series noted that the
exceptions were written in a way that assumed no renames, making it
unsafe for merge-recursive to use. merge-recursive was modified to
use its own check to enforce that index==HEAD.
This history makes it very tempting to go into builtin/merge.c and replace
the comment that strategies must enforce that index matches HEAD with code
that just enforces it. At this point, that would only affect the
'resolve' strategy; all other strategies have each been modified to
manually enforce it. (However, note that index==HEAD is not strictly
enforced for fast-forward merges, as those are not considered a merge
strategy and they trigger in builtin/merge.c before the section in the
code where the relevant comment is found.)
But, even if we don't take the step of just fixing these problems by
enforcing index==HEAD for all strategies, we at least need to update this
misleading documentation in git-merge.txt. For now, just modify the claim
in Documentation/git-merge.txt to fix the error. The precise details
around combination of merges strategies and special cases probably is not
relevant to most users, so simply state that exceptions may exist but are
narrow and vary depending upon which merge strategy is in use.
Signed-off-by: Elijah Newren <newren@gmail.com>
---
Documentation/git-merge.txt | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/Documentation/git-merge.txt b/Documentation/git-merge.txt
index 6a5c00e2c..b050c8c2e 100644
--- a/Documentation/git-merge.txt
+++ b/Documentation/git-merge.txt
@@ -122,9+122,9 @@ merge' may need to update.
To avoid recording unrelated changes in the merge commit,
'git pull' and 'git merge' will also abort if there are any changes
-registered in the index relative to the `HEAD` commit. (One
-exception is when the changed index entries are in the state that
-would result from the merge already.)
+registered in the index relative to the `HEAD` commit. (Special
+narrow exceptions to this rule may exist depending on which merge
+strategy is in use, but generally, the index must match HEAD.)
If all named commits are already ancestors of `HEAD`, 'git merge'
will exit early with the message "Already up to date."
--
2.18.0.137.g2a11d05a6.dirty
^permalinkrawreply [flat|nested] 30+ messages in thread

*Re: [PATCH v2 2/9] index_has_changes(): avoid assuming operating on the_index
2018-07-01 1:24 ` [PATCH v2 2/9] index_has_changes(): avoid assuming operating on the_index Elijah Newren
@ 2018-07-03 19:44 ` Junio C Hamano0 siblings, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2018-07-03 19:44 UTC (permalink / raw)
To: Elijah Newren; +Cc: git, pclouds
Elijah Newren <newren@gmail.com> writes:
> Modify index_has_changes() to take a struct istate* instead of just
> operating on the_index. This is only a partial conversion, though,
> because we call do_diff_cache() which implicitly assumes work is to be
> done on the_index. Ongoing work is being done elsewhere to do the
> remainder of the conversion, and thus is not duplicated here. Instead,
> a simple check is put in place until that work is complete.
Yeah, that is an unfortunate but necessary compromise until we
create do_diff_index() that can take an istate, and optionally turn
do_diff_cache() into
#define do_diff_cache(...) do_diff_index(&the_index, ...)
if there are many callers that want to work on the default in-core
istate.
^permalinkrawreply [flat|nested] 30+ messages in thread