Discussions

Reviews

Jacob Keller sent a patch adding a test that fails. He wrote in the
commit message that the “git rebase” interactive mode causes exec
commands to be run with GIT_DIR set, and that afterwards running a git
command in a subdirectory fails because GIT_DIR=".git". He suspected
the regression was introduced in some recent rebase--helper changes to
speed up the interactive rebase and convert some shell scripts to C
code.

Johannes Schindelin, alias Dscho, replied to Jacob and suggested a fix
as well as a number of improvements to Jacob’s patch. He also asked if
Jacob could take care of creating a proper patch for the fix. Jacob
agreed with Dscho’s comments and said he would prepare a proper patch.

Phillip Wood then chimed in, stating that Dscho’s suggested fix might not
be right:

Just clearing GIT_DIR does not match the behavior of the shell version
(tested by passing -p to avoid rebase--helper) as that passes GIT_DIR to
exec commands if it has been explicitly set. I think that users that set
GIT_DIR on the command line would expect it to be propagated to exec
commands.

At that point Junio Hamano, the Git maintainer, Jacob and Phillip
started discussing the possible impact of the bug and whether it was
worth delaying the release to allow some time to properly test a fix.

Dscho then explained where the bug could come from:

When you look at git_dir_init in git-sh-setup, you will see that
Unix shell scripts explicitly get their GIT_DIR turned into an
absolute path.

He then suggested a fix in the C code of rebase--helper, which has
replaced the shell code that relied on git-sh-setup. The fix turns the
content of the GIT_DIR environment variable into an absolute path
before running the exec command.
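
As a rough illustration of that idea (not Dscho’s actual C patch), the
following Python sketch makes a GIT_DIR value absolute before it is handed
to a child command. The function name env_for_exec and its parameters are
hypothetical. The sketch assumes the relative value should be resolved
against the directory the rebase was started from, and it leaves an
explicitly set GIT_DIR in place, which matches the behavior Phillip
described for the shell version.

```python
import os

def env_for_exec(env, cwd):
    """Return a copy of env in which GIT_DIR, if set, is an absolute path.

    env_for_exec is a hypothetical helper, not part of Git.
    """
    new_env = dict(env)
    git_dir = new_env.get("GIT_DIR")
    if git_dir:
        # Resolve a relative value such as ".git" against cwd, so that it
        # still points at the repository when the exec command later runs
        # a git command from a subdirectory. An absolute value is kept.
        new_env["GIT_DIR"] = os.path.normpath(os.path.join(cwd, git_dir))
    return new_env
```

A caller would pass the result as the environment of the exec command, for
example through the env argument of subprocess.run.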

Jacob again agreed to turn Dscho’s fix into a proper patch and then
sent it to the list.
The patch has subsequently been merged into the master branch.

Support

Lars Schneider realized after migrating a large repository to Git that
“all text files in the index of the repo have CRLF line endings”. He
then asked:

In general this seems not to be a problem as the project is developed exclusively on Windows.

However, I wonder if there are any “hidden consequences” of this setup?

Jonathan Nieder answered:

There are no hidden consequences that I’m aware of. If you later
decide that you want to become a cross-platform project, then you may
want to switch to LF endings, in which case I suggest the “single
fixup commit” strategy.

He did suggest, though, explicitly declaring all the files as non-text
files in .gitattributes using the -text flag, so that Git will not be
tempted to change line endings.

Torsten Bögershausen agreed with Jonathan, saying:

If you don’t specify .gitattributes, then all people who have
core.autocrlf=true will suffer from a runtime penalty.

because:

At each checkout Git needs to figure out that the file has CRLF in
the repo, so that there is no conversion done.

and also:

Those who have core.autocrlf=false would produce commits with CRLF
for new files, and those developers who have core.autocrlf=true would
produce files with LF in the index and CRLF in the worktree. This may
(most probably will) cause confusion later, when things are pushed and
pulled.

Lars thanked Jonathan for the idea of using the -text flag but
wondered about its implications, saying:

For whatever reason I always thought this is the way to tell
Git that a particular file is binary with the implication that
Git should not attempt to diff it.

To this Jonathan replied:

No other implications. You’re thinking of -diff. There is also a
shortcut “binary” which simply means -text -diff.

In his first email, Jonathan also asked a related question of his own:

I’d be interested to hear what happens when diff-ing across a line
ending fixup commit. Is this an area where Git needs some
improvement? “git merge” knows an -Xrenormalize option to deal with a
related problem — it’s possible that “git diff” needs to learn a
similar trick.

To that, Torsten replied:

That is a tricky thing.
Sometimes you want to see the CLRF - LF as a diff, (represented as “^M”),
and sometimes not.

Junio Hamano then also gave his “knee-jerk reaction” on this, saying
that “the end user definitely wants to see preimage and postimage
lines are different in such a commit by default, one side has and the
other side lacks ^M at the end” and also that when one does not want
to see those changes “one of the ‘whitespace ignoring’ options […]
may suffice, but if not, it should be easy to invent a new one”.

Junio then posted a sample patch to implement --ignore-cr-at-eol.

Stefan Beller reviewed this patch, which was further improved by Junio
and then discussed a few times, so this new flag is likely to
appear in the next Git release.
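
To make the idea concrete, here is a minimal Python sketch of a diff that
treats CRLF and LF line endings as equal. It is not Junio’s patch, which
works inside Git’s C diff code. The function names strip_cr_at_eol and
diff_ignoring_cr are hypothetical, and the sketch relies on Python’s
standard difflib module.

```python
import difflib

def strip_cr_at_eol(line):
    """Drop one carriage return at the end of a line, if present."""
    return line[:-1] if line.endswith("\r") else line

def diff_ignoring_cr(old_text, new_text):
    """Return unified-diff lines, treating CRLF and LF endings as equal."""
    # Split on LF only, so a CRLF ending leaves a trailing CR to strip.
    old = [strip_cr_at_eol(line) for line in old_text.split("\n")]
    new = [strip_cr_at_eol(line) for line in new_text.split("\n")]
    return list(difflib.unified_diff(old, new, lineterm=""))
```

With this helper, a commit that only converts line endings produces an
empty diff, while a real content change on a line still shows up.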

A sub-thread of the discussion dealt with making big changes to the
xdiff code, which was originally “borrowed” from a separate open source
project. This discussion did not reach a clear conclusion, though.

Johannes Sixt also replied directly to Lars’ first email:

I’ve been working on a project with CRLF in every source file for a
decade now. It’s C++ source, and it isn’t even Windows-only: when
checked out on Linux, there are CRs in the files, with no bad
consequences so far. GCC is happy with them.

To that Johannes Schindelin, alias Dscho, replied:

I envy you for the blessing of such a clean C++ source that you do
not have any, say, Unix shell script in it.

In a separate reply to Torsten’s first email, Dscho also confirmed
that completely switching off line ending conversions can give “around
5-15% speed improvement”.

A discussion then started about the merits of having an entry like
“*.sh text eol=lf” in the .gitattributes for shell scripts, compared
to having Git convert no files at all. In the end it looks like such an
entry could help, though there could be shell scripts that don’t use the
.sh extension.
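
One way to find shell scripts that lack the .sh extension, and so would
not match a “*.sh” pattern, is to look at their first line. The Python
sketch below is only an illustration and was not proposed in the thread.
The function name looks_like_shell_script and the list of interpreters
are assumptions.

```python
def looks_like_shell_script(first_line):
    """Guess from a shebang line whether a file is a shell script."""
    if not first_line.startswith("#!"):
        return False
    # split() also discards a trailing CR left over from a CRLF ending.
    parts = first_line[2:].split()
    if not parts:
        return False
    interpreter = parts[0].rsplit("/", 1)[-1]
    # "#!/usr/bin/env bash" names the real interpreter as the next word.
    if interpreter == "env" and len(parts) > 1:
        interpreter = parts[1]
    # Assumed list of common shells, not an exhaustive one.
    return interpreter in {"sh", "bash", "dash", "ksh", "zsh"}
```

Files flagged this way could then be listed by name in .gitattributes
with the same “text eol=lf” attributes.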

Developer Spotlight: Torsten Bögershausen

Who are you and what do you do?

Originally a hardware developer, these days are filled with software
development for embedded systems.

What would you name your most important contribution to Git?

The precomposeunicode feature for Mac OS was an important thing to go
cross-platform, but the Git users may have a different point of view.

What are you doing on the Git project these days, and why?

The last years it was CRLF handling, also known as EOL or line ending.
Mainly because I am using it myself.

If you could get a team of expert developers to work full time on
something in Git for a full year, what would it be?

The Git code base is in a pretty good shape.
Improve the on-disk or even over-the-wire protocol to include
information if a file is binary or text with CRLF (2 bits).
Please let me know, when you have the team.

If you could remove something from Git without worrying about
backwards compatibility, what would it be?

“git checkout -b” is certainly good for experienced people,
hard to understand for beginners.
“git add -A” or --all is certainly my favorite thing to be removed…
Don’t accept commit messages which are not unicode any more.
Remove the core.autocrlf from the code base, demand that people
set up a .gitattributes file on Windows.

What is your favorite Git-related tool/library, outside of Git itself?

Probably Gerrit, even if I like the pull-request workflow which allows
people to collaborate.