*Re: [PATCH 0/3] Wildcard matching for credentials
2020-02-14 23:58 ` [PATCH 0/3] Wildcard matching for credentials Taylor Blau
@ 2020-02-15 0:13 ` brian m. carlson0 siblings, 0 replies; 8+ messages in thread
From: brian m. carlson @ 2020-02-15 0:13 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Jeff King
[-- Attachment #1: Type: text/plain, Size: 1736 bytes --]
On 2020-02-14 at 23:58:22, Taylor Blau wrote:
> Hi brian,
>
> On Fri, Feb 14, 2020 at 10:59:26PM +0000, brian m. carlson wrote:
> > This series introduces wildcard matching (that is, urlmatch support) for
> > credential config options, just like for the http options. This is
> > helpful in corporate environments where custom credentials should be
> > used across a wide variety of subdomains.
> >
> > In addition, there's an additional test for urlmatch behavior with
> > multiple subdomains and a mailmap update for the email address used in
> > this series.
>
> I can imagine that this is perhaps for Git LFS, which I could see
> benefiting from this change. My review has nothing to do with my
> affiliation (or lack thereof) to LFS.
It was originally prompted by a discussion that someone started on the
Git LFS issue tracker, yes. I pointed out that it's actually Git that
controls credential helper matching and so I sent a patch.
I've also worked somewhere with multiple internal Git servers, so I can
imagine this would also be valuable in such an environment.
> I gave your patches a review, and they all look quite good to me. Thanks
> especially for 2/3, which I would have suggested were it not already
> there ;-).
I think there was some discussion on the list about whether allowing
multiple wildcards and wildcards in any part of the domain name was
intentional or not, and it was decided that it was. I promised to
follow up with a patch to the testsuite, and here I am, some time later.
> This looks good to me, so please have my:
>
> Reviewed-by: Taylor Blau <me@ttaylorr.com>
Thanks.
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]^permalinkrawreply [flat|nested] 8+ messages in thread

*Re: [PATCH 3/3] credential: allow wildcard patterns when matching config
2020-02-14 22:59 ` [PATCH 3/3] credential: allow wildcard patterns when matching config brian m. carlson
@ 2020-02-16 6:13 ` Jeff King
2020-02-16 20:53 ` brian m. carlson0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2020-02-16 6:13 UTC (permalink / raw)
To: brian m. carlson; +Cc: git
On Fri, Feb 14, 2020 at 10:59:29PM +0000, brian m. carlson wrote:
> From: brian m. carlson <bk2204@github.com>
>
> In some cases, a user will want to use a specific credential helper for
> a wildcard pattern, such as https://*.corp.example.com. Use the urlmatch
> code to match a pattern for a URL to allow this behavior.
>
> Since we are handling URLs using urlmatch, remove the custom code to
> match URLs, since it is no longer needed.
I think it's a good idea to unify the matching for these two subsystems
if we can. Playing devil's advocate for a moment:
- are there cases that would have matched before that won't now?
Basic stuff like:
echo url=https://example.com/foo |
GIT_TERMINAL_PROMPT=0 git \
-c credential.https://example.com/foo.helper='!echo >&2 full helper' \
-c credential.https://example.com/fo.helper='!echo >&2 partial helper' \
credential fill
seems to behave the same (we respect only the full match both before
and after your patch). I wondered if we might run into problems
matching usernames. But this:
echo url=https://user@example.com/ |
GIT_TERMINAL_PROMPT=0 git \
-c credential.https://example.com.helper='!echo >&2 bare helper' \
-c credential.https://user@example.com.helper='!echo >&2 user helper' \
credential fill
seems to be behave the same both before and after (it runs both
helpers). And this:
echo url=https://example.com/ |
GIT_TERMINAL_PROMPT=0 git \
-c credential.https://example.com.helper='!echo >&2 bare helper' \
-c credential.https://user@example.com.helper='!echo >&2 user helper' \
credential fill
likewise behaves the same before and after, running only the bare
helper (I think we _could_ match the user-helper by doing a second
pass over the config after getting the username from config, from
another helper, or from the user, but I doubt anybody is clamoring
for that feature).
So I think we're OK here, unless you can come up with any more
obscure case.
- are there cases that match now that didn't before? Those are a bit
dangerous become they may mean unexpectedly sharing credentials with
hosts the user didn't mean to.
Obviously anything with a "*" in it will behave differently, but I
think that's OK (i.e,. I doubt anybody wrote "*" in a hostname and
didn't mean it for it to glob).
One interesting one is that the credential code currently requires a
full path match instead of a prefix match. So:
echo url=https://example.com/foo/bar |
GIT_TERMINAL_PROMPT=0 git \
-c credential.https://example.com/foo.helper='!echo >&2 run helper' \
credential fill
doesn't currently trigger the helper, but does with your patch. I
think this is _probably_ OK. I think the existing behavior was done
to just always err on the side of strictness in matching, with the
thought that nobody cared that much about path matching anyway (by
default we ignore the paths completely).
- The rules for ordering are a bit different, in that the credential
code takes any matches in the order in which they appear in the
config file. Whereas urlmatch won't pass on any less-specific
matches. So doing:
echo url=https://example.com/foo |
GIT_TERMINAL_PROMPT=0 git \
-c credential.https://example.com/foo.helper='!echo >&2 path helper' \
-c credential.https://example.com.helper='!echo >&2 bare-domain helper' \
credential fill
right now will trigger both helpers (path then bare-domain), but
after your patch will just trigger the path helper. Curiously, if you
flip the order of the two config keys, it will still run both. That's
because urlmatch assumes that the receiver is handling the "last one
wins" semantics, but the credential code is actually building a list
of helpers.
Weirder still, the current strategy for single-valued items like
credential.*.username is "first one wins". So:
echo url=https://example.com/foo |
git \
-c credential.https://example.com.username=domain-user \
-c credential.https://example.com/foo.username=path-user \
credential fill
will prompt for the password of domain-user, but flipping it will ask
for path-user. Which seems...weird. That actually doesn't change with
your patch, because we'll either send both to the credential callback
(in the order above) or we'd just send the first one (if the order is
flipped).
I'd be in favor of changing that to the usual "last one wins",
because it makes no sense and is unlike the rest of Git's config.
However, I think the reason it's written that way is so that:
echo url=https://foo@example.com |
git \
-c credential.https://example.com.username=bar \
credential fill
still takes the username=foo from the command-line. We'd want to take
care not to break that case, which implies remembering whether
c->username came from a previous config value, or was part of the
initial credential lookup.
That doesn't _technically_ have to be part of your patch, because
your patch doesn't make it any worse than it is now. But it would be
nice to cleanup (since urlmatch gives us nice "most specific wins"
semantics).
But I do think we need to deal with the helper-list behavior as part
of your patch, because it _does_ regress. And while the example above
is probably fairly contrived, I think urlmatch is responsible for
matching credential.helper, too (as the least specific possible
match), and it's not outrageous to have something like:
[credential]
helper = cache
[credential "https://example.com"]
helper = !some-host-specific-helper
Some options are:
- teach urlmatch to pass matching config keys to our callback even
if they're "worse" than a previously-seen key, so that we can
then record all helpers in the order they appear in the config
(retaining the existing behavior)
- use urlmatch's cmp_matches() to order the list of helper. This
would be a change in behavior, but I wonder if it might be what
people prefer. I suspect it would make some happy (if the
host-specific helper can answer the query above, you'd just as
soon not run the cache helper at all) and others not (if the
host-specific one is expensive or requires user interaction, you
may want to try the cache first). So I'm not sure if it would be
a good idea or not.
A few comments on the patch itself:
> --- a/Documentation/gitcredentials.txt
> +++ b/Documentation/gitcredentials.txt
> @@ -131,7 +131,9 @@ context would not match:
> because the hostnames differ. Nor would it match `foo.example.com`; Git
> compares hostnames exactly, without considering whether two hosts are part of
> the same domain. Likewise, a config entry for `http://example.com` would not
> -match: Git compares the protocols exactly.
> +match: Git compares the protocols exactly. However, you may use wildcards in
> +the domain name and other pattern matching techniques as with the `http.<url>.*`
> +options.
You'd probably want to review the documentation to accommodate any of
the behavior changes discussed above that we end up with.
> static void credential_apply_config(struct credential *c)
> {
> + char *normalized_url;
> + struct urlmatch_config config = { STRING_LIST_INIT_DUP };
Here we initialize a string_list, but I don't think we ever free it. It
looks like the urlmatch_config_entry() callback will stash items here
for its "more specific match" checks.
I think we need:
string_list_clear(&config.vars, 1);
which builtin/config.c:get_urlmatch() does. It also seems to free
config.url.url, which is touched by the normalization process. I _think_
this is always the same as the return value normalized_url, so we'd be
OK there.
> + config.section = "credential";
> + config.key = NULL;
> + config.collect_fn = credential_config_callback;
> + config.cascade_fn = git_default_config;
> + config.cb = c;
I don't think the old code would ever call git_default_config (we _just_
want to load values for this specific URL). So I think you'd want to
leave the cascade_fn NULL here?
> + credential_describe(c, &url);
> + normalized_url = url_normalize(url.buf, &config.url);
The purpose of credential_describe() so far has been to show the URL to
the user. It won't do any %-encoding that would be required for more
exotic URLs. But I assume we'd want that in whatever we feed to
url_normalize. So for example:
echo url=https://example.com/%2566 |
GIT_TERMINAL_PROMPT=0 git \
-c credential.https://example.com/%2566.helper='!echo >&2 run helper'
credential fill
matches in the current code, but not after your patch (we decode %25
into just "%", and then feed "%66" to the url normalizer, which decodes
it to "f".
-Peff
^permalinkrawreply [flat|nested] 8+ messages in thread

*Re: [PATCH 3/3] credential: allow wildcard patterns when matching config
2020-02-16 6:13 ` Jeff King@ 2020-02-16 20:53 ` brian m. carlson0 siblings, 0 replies; 8+ messages in thread
From: brian m. carlson @ 2020-02-16 20:53 UTC (permalink / raw)
To: Jeff King; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 3526 bytes --]
On 2020-02-16 at 06:13:23, Jeff King wrote:
> So I think we're OK here, unless you can come up with any more
> obscure case.
Yeah, I made this implicit in my patch, but I couldn't think of a
situation where we'd hit this case. I'll update the commit message to
reflect that we don't intend for this behavior to change.
> Some options are:
>
> - teach urlmatch to pass matching config keys to our callback even
> if they're "worse" than a previously-seen key, so that we can
> then record all helpers in the order they appear in the config
> (retaining the existing behavior)
>
> - use urlmatch's cmp_matches() to order the list of helper. This
> would be a change in behavior, but I wonder if it might be what
> people prefer. I suspect it would make some happy (if the
> host-specific helper can answer the query above, you'd just as
> soon not run the cache helper at all) and others not (if the
> host-specific one is expensive or requires user interaction, you
> may want to try the cache first). So I'm not sure if it would be
> a good idea or not.
I think it should be reasonably simple to adjust the logic to do the
former. I'd like to avoid making non-backwards compatible changes in
this series. I'll add some tests for this case as well, since I think
it's going to be important to get right. Thanks for the sanity check.
> A few comments on the patch itself:
>
> > --- a/Documentation/gitcredentials.txt
> > +++ b/Documentation/gitcredentials.txt
> > @@ -131,7 +131,9 @@ context would not match:
> > because the hostnames differ. Nor would it match `foo.example.com`; Git
> > compares hostnames exactly, without considering whether two hosts are part of
> > the same domain. Likewise, a config entry for `http://example.com` would not
> > -match: Git compares the protocols exactly.
> > +match: Git compares the protocols exactly. However, you may use wildcards in
> > +the domain name and other pattern matching techniques as with the `http.<url>.*`
> > +options.
>
> You'd probably want to review the documentation to accommodate any of
> the behavior changes discussed above that we end up with.
As mentioned, my hope is to not need to do this.
> > + config.section = "credential";
> > + config.key = NULL;
> > + config.collect_fn = credential_config_callback;
> > + config.cascade_fn = git_default_config;
> > + config.cb = c;
>
> I don't think the old code would ever call git_default_config (we _just_
> want to load values for this specific URL). So I think you'd want to
> leave the cascade_fn NULL here?
Okay, can do.
> > + credential_describe(c, &url);
> > + normalized_url = url_normalize(url.buf, &config.url);
>
> The purpose of credential_describe() so far has been to show the URL to
> the user. It won't do any %-encoding that would be required for more
> exotic URLs. But I assume we'd want that in whatever we feed to
> url_normalize. So for example:
>
> echo url=https://example.com/%2566 |
> GIT_TERMINAL_PROMPT=0 git \
> -c credential.https://example.com/%2566.helper='!echo >&2 run helper'
> credential fill
>
> matches in the current code, but not after your patch (we decode %25
> into just "%", and then feed "%66" to the url normalizer, which decodes
> it to "f".
Good point. I'll fix this and add a test.
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]^permalinkrawreply [flat|nested] 8+ messages in thread