Commit Message

The 'write' mode of the 'commit-graph' supports input from a number of
different sources: pack indexes over stdin, commits over stdin, commits
reachable from all references, and so on. Each of these options are
specified with a unique option: '--stdin-packs', '--stdin-commits', etc.
Similar to our replacement of 'git config [--<type>]' with 'git config
[--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support
`--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
deprecate '[--<input>]' in favor of '[--input=<source>]'.
This makes it more clear to implement new options that are combinations
of other options (such as, for example, "none", a combination of the old
"--append" and a new sentinel to specify to _not_ look in other packs,
which we will implement in a future patch).
Unfortunately, the new enumerated type is a bitfield, even though it
makes much more sense as '0, 1, 2, ...'. Even though *almost* all
options are pairwise exclusive, '--stdin-{packs,commits}' *is*
compatible with '--append'. For this reason, use a bitfield.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-commit-graph.txt | 26 +++++-----
builtin/commit-graph.c | 77 ++++++++++++++++++++++--------
t/t5318-commit-graph.sh | 46 +++++++++---------
t/t5324-split-commit-graph.sh | 44 ++++++++---------
4 files changed, 114 insertions(+), 79 deletions(-)

On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:
> The 'write' mode of the 'commit-graph' supports input from a number of> different sources: pack indexes over stdin, commits over stdin, commits> reachable from all references, and so on. Each of these options are> specified with a unique option: '--stdin-packs', '--stdin-commits', etc.>> Similar to our replacement of 'git config [--<type>]' with 'git config> [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support> `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly> deprecate '[--<input>]' in favor of '[--input=<source>]'.>> This makes it more clear to implement new options that are combinations> of other options (such as, for example, "none", a combination of the old> "--append" and a new sentinel to specify to _not_ look in other packs,> which we will implement in a future patch).
Makes sense.
> Unfortunately, the new enumerated type is a bitfield, even though it> makes much more sense as '0, 1, 2, ...'. Even though *almost* all> options are pairwise exclusive, '--stdin-{packs,commits}' *is*> compatible with '--append'. For this reason, use a bitfield.
> -With the `--append` option, include all commits that are present in the> -existing commit-graph file.> +With the `--input=append` option, include all commits that are present> +in the existing commit-graph file.
Would it be too crazy to call this `--input=existing` instead, and have
it be the same as `--append`? I find that `--append` makes a lot of
sense (it's a mode we can turn on or off), whereas "input = append"
seems more odd.
From the next commit message, we learn that a long `--input=append`
triggers `fill_oids_from_all_packs()`, which wouldn't match my expecting
from "--input=existing". So...
Does this hint that we could leave `--append` alone? We'd have lots of
different inputs to choose from using `--input`, and an `--append` mode
on top of that. That would make your inputs truly mutually exclusive and
you don't need the bitfield anymore, as you mention above. Hmm?
Would that mean that the falling back to `fill_oids_from_all_packs()`
would follow from "is there an --input?", as opposed to from "is there
an --input except --input=append?"?
(I don't know whether these inputs really *have* to be exclusive, or if
that's more of an implementation detail. That is, even without an
"append" input, might we one day be able to handle more inputs at once?
Maybe this is not the time to worry about that.)
Martin

On Fri, Jan 31, 2020 at 08:34:41PM +0100, Martin Ågren wrote:
> On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:> > The 'write' mode of the 'commit-graph' supports input from a number of> > different sources: pack indexes over stdin, commits over stdin, commits> > reachable from all references, and so on. Each of these options are> > specified with a unique option: '--stdin-packs', '--stdin-commits', etc.> >> > Similar to our replacement of 'git config [--<type>]' with 'git config> > [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support> > `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly> > deprecate '[--<input>]' in favor of '[--input=<source>]'.> >> > This makes it more clear to implement new options that are combinations> > of other options (such as, for example, "none", a combination of the old> > "--append" and a new sentinel to specify to _not_ look in other packs,> > which we will implement in a future patch).>> Makes sense.>> > Unfortunately, the new enumerated type is a bitfield, even though it> > makes much more sense as '0, 1, 2, ...'. Even though *almost* all> > options are pairwise exclusive, '--stdin-{packs,commits}' *is*> > compatible with '--append'. For this reason, use a bitfield.>> > -With the `--append` option, include all commits that are present in the> > -existing commit-graph file.> > +With the `--input=append` option, include all commits that are present> > +in the existing commit-graph file.>> Would it be too crazy to call this `--input=existing` instead, and have> it be the same as `--append`? I find that `--append` makes a lot of> sense (it's a mode we can turn on or off), whereas "input = append"> seems more odd.
Hmm. When I wrote this, I was thinking of introducing equivalent options
that are identical in name and functionality as '--input=<mode>' instead
of '--<mode>'. So, I guess that is to say that I didn't spend an awful
amount of time thinking about whether or not '--input=append' made sense
given anything else.
So, I don't think that '--input=existing' is a bad idea at all, but I do
worry about advertising this deprecation as "'--<mode>' becomes
'--input=<mode>', except when your mode is 'append', in which case it
becomes '--input=existing'".
I suppose that, on the other hand, if we *were* to introduce such a
change, now would be the time to do it, before '--input=<mode>' is on
master and tagged in a release, but I'm not sure that '--input=append'
is so much worse.
I'm inclined to leave it as is, unless there are others that feel
strongly, in which case we can/should come back to it before this moves
towards being queued.
> >From the next commit message, we learn that a long `--input=append`> triggers `fill_oids_from_all_packs()`, which wouldn't match my expecting> from "--input=existing". So...>> Does this hint that we could leave `--append` alone? We'd have lots of> different inputs to choose from using `--input`, and an `--append` mode> on top of that. That would make your inputs truly mutually exclusive and> you don't need the bitfield anymore, as you mention above. Hmm?>> Would that mean that the falling back to `fill_oids_from_all_packs()`> would follow from "is there an --input?", as opposed to from "is there> an --input except --input=append?"?>> (I don't know whether these inputs really *have* to be exclusive, or if> that's more of an implementation detail. That is, even without an> "append" input, might we one day be able to handle more inputs at once?> Maybe this is not the time to worry about that.)>> Martin
Thanks,
Taylor

On Mon, Feb 03, 2020 at 08:51:24PM -0800, Taylor Blau wrote:
> On Fri, Jan 31, 2020 at 08:34:41PM +0100, Martin Ågren wrote:> > On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:> > > The 'write' mode of the 'commit-graph' supports input from a number of
s/mode/subcommand/
> > > different sources:
I note that you use the word "sources" here, in the subject line as
well (as '--input=<source>'), and in the code as well (e.g. the in
the error message "unrecognized --input source, %s"). I like this
word, I think the words "input" and "source" go really well together.
> > > pack indexes over stdin, commits over stdin, commits> > > reachable from all references, and so on.
It's interesting to see that you stopped listing and went for "and so
on" right when it got interesting/controversial with '--append'... :)
> > > Each of these options are> > > specified with a unique option: '--stdin-packs', '--stdin-commits', etc.
It also supports the very inefficient scanning through all objects in
all pack files to find commit objects, which, sadly, ended up being
the default, and thus doesn't have its own --option. Should there be
a corresponding '--input=<source>' as well? (Note that I don't mean
this as a suggestion to add one; on the contrary, the less exposure it
gets the better.)
> > > Similar to our replacement of 'git config [--<type>]' with 'git config> > > [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support> > > `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly> > > deprecate '[--<input>]' in favor of '[--input=<source>]'.> > >> > > This makes it more clear to implement new options that are combinations> > > of other options (such as, for example, "none", a combination of the old> > > "--append" and a new sentinel to specify to _not_ look in other packs,> > > which we will implement in a future patch).> >> > Makes sense.> >> > > Unfortunately, the new enumerated type is a bitfield, even though it> > > makes much more sense as '0, 1, 2, ...'. Even though *almost* all> > > options are pairwise exclusive, '--stdin-{packs,commits}' *is*> > > compatible with '--append'. For this reason, use a bitfield.> >> > > -With the `--append` option, include all commits that are present in the> > > -existing commit-graph file.> > > +With the `--input=append` option, include all commits that are present> > > +in the existing commit-graph file.> >> > Would it be too crazy to call this `--input=existing` instead, and have> > it be the same as `--append`? I find that `--append` makes a lot of> > sense (it's a mode we can turn on or off), whereas "input = append"> > seems more odd.> > Hmm. When I wrote this, I was thinking of introducing equivalent options> that are identical in name and functionality as '--input=<mode>' instead> of '--<mode>'. So, I guess that is to say that I didn't spend an awful> amount of time thinking about whether or not '--input=append' made sense> given anything else.> > So, I don't think that '--input=existing' is a bad idea at all, but I do> worry about advertising this deprecation as "'--<mode>' becomes> '--input=<mode>', except when your mode is 'append', in which case it> becomes '--input=existing'".
But here you suddenly start using the word "mode" both in
'--input=<mode>' and in '--<mode>'.
On one hand, I don't think that the word "mode" goes as well with
"input" as "source" does.
On the other, is '--append' really a source/mode, like '--reachable'
and '--stdin-commits' are? Source, no: from wordsmithing perspective
it doesn't fit with "source", and being orthogonal to the "real"
source options while they are mutually exclusive seems to be a clear
indication that it isn't. Mode, yes: it's a mode of operation where
no longer reachable/present commits are not discarded from the
commit-graph.
So I don't think that adding '--input=append' is a good idea, even if
we were call it differently, e.g. '--input=existing' as suggested
above.
However, I do think that '--input=existing' would better express what
'--input=none' in the next patch wants to achieve.

On Thu, Feb 13, 2020 at 12:33:13PM +0100, SZEDER Gábor wrote:
> On Mon, Feb 03, 2020 at 08:51:24PM -0800, Taylor Blau wrote:> > On Fri, Jan 31, 2020 at 08:34:41PM +0100, Martin Ågren wrote:> > > On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:> > > > The 'write' mode of the 'commit-graph' supports input from a number of> > s/mode/subcommand/> > > > > different sources:> > I note that you use the word "sources" here, in the subject line as> well (as '--input=<source>'), and in the code as well (e.g. the in> the error message "unrecognized --input source, %s"). I like this> word, I think the words "input" and "source" go really well together.> > > > > pack indexes over stdin, commits over stdin, commits> > > > reachable from all references, and so on.> > It's interesting to see that you stopped listing and went for "and so> on" right when it got interesting/controversial with '--append'... :)> > > > > Each of these options are> > > > specified with a unique option: '--stdin-packs', '--stdin-commits', etc.> > It also supports the very inefficient scanning through all objects in> all pack files to find commit objects, which, sadly, ended up being> the default, and thus doesn't have its own --option. Should there be> a corresponding '--input=<source>' as well? (Note that I don't mean> this as a suggestion to add one; on the contrary, the less exposure it> gets the better.)> > > > > Similar to our replacement of 'git config [--<type>]' with 'git config> > > > [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support> > > > `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly> > > > deprecate '[--<input>]' in favor of '[--input=<source>]'.> > > >> > > > This makes it more clear to implement new options that are combinations> > > > of other options (such as, for example, "none", a combination of the old> > > > "--append" and a new sentinel to specify to _not_ look in other packs,> > > > which we will implement in a future patch).> > >> > > Makes sense.> > >> > > > Unfortunately, the new enumerated type is a bitfield, even though it> > > > makes much more sense as '0, 1, 2, ...'. Even though *almost* all> > > > options are pairwise exclusive, '--stdin-{packs,commits}' *is*> > > > compatible with '--append'. For this reason, use a bitfield.> > >> > > > -With the `--append` option, include all commits that are present in the> > > > -existing commit-graph file.> > > > +With the `--input=append` option, include all commits that are present> > > > +in the existing commit-graph file.> > >> > > Would it be too crazy to call this `--input=existing` instead, and have> > > it be the same as `--append`? I find that `--append` makes a lot of> > > sense (it's a mode we can turn on or off), whereas "input = append"> > > seems more odd.> > > > Hmm. When I wrote this, I was thinking of introducing equivalent options> > that are identical in name and functionality as '--input=<mode>' instead> > of '--<mode>'. So, I guess that is to say that I didn't spend an awful> > amount of time thinking about whether or not '--input=append' made sense> > given anything else.> > > > So, I don't think that '--input=existing' is a bad idea at all, but I do> > worry about advertising this deprecation as "'--<mode>' becomes> > '--input=<mode>', except when your mode is 'append', in which case it> > becomes '--input=existing'".> > But here you suddenly start using the word "mode" both in> '--input=<mode>' and in '--<mode>'.> > On one hand, I don't think that the word "mode" goes as well with> "input" as "source" does.> > On the other, is '--append' really a source/mode, like '--reachable'> and '--stdin-commits' are?
Well, re-reading this question got me confused right after sending it,
so let me try to rephrase.
Is '--append' really a "source", like '--reachable' and
'--stdin-commits' are? No:
> Source, no: from wordsmithing perspective> it doesn't fit with "source", and being orthogonal to the "real"> source options while they are mutually exclusive seems to be a clear> indication that it isn't.
Or is it a "mode" modifying how other options are handled? Yes:
> Mode, yes: it's a mode of operation where> no longer reachable/present commits are not discarded from the> commit-graph.> > So I don't think that adding '--input=append' is a good idea, even if> we were call it differently, e.g. '--input=existing' as suggested> above.> > However, I do think that '--input=existing' would better express what> '--input=none' in the next patch wants to achieve.>

On Thu, Feb 13, 2020 at 12:48:00PM +0100, SZEDER Gábor wrote:
> On Thu, Feb 13, 2020 at 12:33:13PM +0100, SZEDER Gábor wrote:> > On Mon, Feb 03, 2020 at 08:51:24PM -0800, Taylor Blau wrote:> > > On Fri, Jan 31, 2020 at 08:34:41PM +0100, Martin Ågren wrote:> > > > On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:> > > > > The 'write' mode of the 'commit-graph' supports input from a number of> >> > s/mode/subcommand/
Sure.
> > > > > different sources:> >> > I note that you use the word "sources" here, in the subject line as> > well (as '--input=<source>'), and in the code as well (e.g. the in> > the error message "unrecognized --input source, %s"). I like this> > word, I think the words "input" and "source" go really well together.> >> > > > > pack indexes over stdin, commits over stdin, commits> > > > > reachable from all references, and so on.> >> > It's interesting to see that you stopped listing and went for "and so> > on" right when it got interesting/controversial with '--append'... :)
Only because I figured that I had illustrated the point as "here are the
input sources that we currently understand", like "--stdin-packs",
"--stdin-commits", and so on, not because I find this option to be
controversial.
> > > > > Each of these options are> > > > > specified with a unique option: '--stdin-packs', '--stdin-commits', etc.> >> > It also supports the very inefficient scanning through all objects in> > all pack files to find commit objects, which, sadly, ended up being> > the default, and thus doesn't have its own --option. Should there be> > a corresponding '--input=<source>' as well? (Note that I don't mean> > this as a suggestion to add one; on the contrary, the less exposure it> > gets the better.)
Maybe... although I (like you, I think) have a hard time imagining that
this would ever get used, since it *is* the default source of input, so
you could just as easily *not* write '--input' anything and get the same
effect.
> > > > > Similar to our replacement of 'git config [--<type>]' with 'git config> > > > > [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support> > > > > `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly> > > > > deprecate '[--<input>]' in favor of '[--input=<source>]'.> > > > >> > > > > This makes it more clear to implement new options that are combinations> > > > > of other options (such as, for example, "none", a combination of the old> > > > > "--append" and a new sentinel to specify to _not_ look in other packs,> > > > > which we will implement in a future patch).> > > >> > > > Makes sense.> > > >> > > > > Unfortunately, the new enumerated type is a bitfield, even though it> > > > > makes much more sense as '0, 1, 2, ...'. Even though *almost* all> > > > > options are pairwise exclusive, '--stdin-{packs,commits}' *is*> > > > > compatible with '--append'. For this reason, use a bitfield.> > > >> > > > > -With the `--append` option, include all commits that are present in the> > > > > -existing commit-graph file.> > > > > +With the `--input=append` option, include all commits that are present> > > > > +in the existing commit-graph file.> > > >> > > > Would it be too crazy to call this `--input=existing` instead, and have> > > > it be the same as `--append`? I find that `--append` makes a lot of> > > > sense (it's a mode we can turn on or off), whereas "input = append"> > > > seems more odd.
I don't think that I have a strong preference here, since I don't find
'--input=append' to be out of the ordinary, so I'd be happy with either.
If you'd strongly prefer that we call this '--input=existing', then
that's fine with me.
I called it '--input=append' to translate 1-to-1 from '--<source>' to
'--input=<source>'. I would worry a little about saying, "yeah, if you
want to use the new '--input'-style options, just write what you used to
write *unless* that thing is '--append' in which case write 'existing'
instead".
Maybe '--append' was poorly-named to begin with, so I think the
question is really: if we believe that it is poorly named, is now the
right time to correct that naming?
> > > Hmm. When I wrote this, I was thinking of introducing equivalent options> > > that are identical in name and functionality as '--input=<mode>' instead> > > of '--<mode>'. So, I guess that is to say that I didn't spend an awful> > > amount of time thinking about whether or not '--input=append' made sense> > > given anything else.> > >> > > So, I don't think that '--input=existing' is a bad idea at all, but I do> > > worry about advertising this deprecation as "'--<mode>' becomes> > > '--input=<mode>', except when your mode is 'append', in which case it> > > becomes '--input=existing'".> >> > But here you suddenly start using the word "mode" both in> > '--input=<mode>' and in '--<mode>'.> >> > On one hand, I don't think that the word "mode" goes as well with> > "input" as "source" does.> >> > On the other, is '--append' really a source/mode, like '--reachable'> > and '--stdin-commits' are?>> Well, re-reading this question got me confused right after sending it,> so let me try to rephrase.>> Is '--append' really a "source", like '--reachable' and> '--stdin-commits' are? No:>> > Source, no: from wordsmithing perspective> > it doesn't fit with "source", and being orthogonal to the "real"> > source options while they are mutually exclusive seems to be a clear> > indication that it isn't.>> Or is it a "mode" modifying how other options are handled? Yes:>> > Mode, yes: it's a mode of operation where> > no longer reachable/present commits are not discarded from the> > commit-graph.> >> > So I don't think that adding '--input=append' is a good idea, even if> > we were call it differently, e.g. '--input=existing' as suggested> > above.> >> > However, I do think that '--input=existing' would better express what> > '--input=none' in the next patch wants to achieve.
So, I guess I'm left wondering what we can do to move forward here. For
my part, I see a couple of options:
(a) we could replace every instance of '--input=<source>' with
'--input=<mode>', which seems it would make "append" a valid
pattern-match for "mode", and move on.
(b) we could stick with '--input=<source>' and 's/append/existing',
and move on
(c) or do something else that I didn't think of here and go forward
with that instead.
Let me know which you'd prefer that I do, and I'd be happy to send an
updated version of the series for you to look at.
Thanks,
Taylor