> Past objections and rebuttals could be summarized as:> > - Violates POSIX.> - POSIX didn't consider this situation and it's not useful to follow> a broken specification at the cost of security.

Correct.

> - Might break unknown applications that use this feature.> - Applications that break because of the change are easy to spot and> fix. Applications that are vulnerable to symlink ToCToU by not having> the change aren't. Additionally, no applications have yet been found> that rely on this behavior.

Are there *known* applications that break?

> - Applications should just use mkstemp() or O_CREATE|O_EXCL.> - True, but applications are not perfect, and new software is written> all the time that makes these mistakes; blocking this flaw at the> kernel is a single solution to the entire class of vulnerability.

Ugh - and people continue to get exploited from a preventable, fixable and already fixed VFS design flaw.

> This patch is based on the patch in Openwall and grsecurity, > along with suggestions from Al Viro. I have added a sysctl to > enable the protected behavior, documentation, and a > ratelimited warning.

> +config PROTECTED_STICKY_SYMLINKS> + bool "Protect symlink following in sticky world-writable directories"> + help> + A long-standing class of security issues is the symlink-based> + time-of-check-time-of-use race, most commonly seen in> + world-writable directories like /tmp. The common method of> + exploitation of this flaw is to cross privilege boundaries> + when following a given symlink (i.e. a root process follows> + a malicious symlink belonging to another user).> +> + Enabling this solves the problem by permitting symlinks to only> + be followed when outside a sticky world-writable directory,> + or when the uid of the symlink and follower match, or when> + the directory and symlink owners match.

Unless there are known apps that regress, shouldnt this be default y? Even if we were forced to not actually release such a final kernel, doing it in -rc1 would flush weird apps out of the woodwork.

Taking the global /tmp and /lib, /usr/lib dentry spinlocks here every time we follow a symlink is a bit sad: there are a lot of high-profile symlinks in a Linux installation in those places, followed by non-owners.

Nor is it really a realistic race that happens often: neither /tmp nor the library directories are being renamed or their permissions changed all that often for this parent lock taking to matter in practice.

I don't think the get_task_comm() extra layer is really necessary here - this is called in the *current* task's context - and it's not like that tasks are changing their own names all that often while they are busy executing VFS code, right? The race with some other task writing to /proc/PID/comm is uninteresting.

The more important piece of forensic information to log would be the file name that got attempted and perhaps the directory name as well. If there's an unknown hole in an unknown privileged app then this warning alone does not give the admin any clues where to look - the name of the expoiting task is probably obfuscated anyway, it's the identity of the attack vector that matters - and that name can generally not be controlled and obfuscated by the attacker.