On Tue, Feb 17, 2004 at 07:29:18PM +0000, Jamie Lokier wrote:> What happens is that one program or library checks an incoming path> for ".." components - that code knows nothing about UTF-8 of course.> > Then it passes the string to another program which assumes the path> has been subject to appropriate security checks, munges it in UTF-8,> and eventually does a file operation with it. The munging generates> ".." components from non-minimal UTF-8 forms - if it's not obeying the> Unicode rejection requirement (which wasn't in earlier versions), that is.

Why the hell would it _ever_ do such normalization?

> A realistic example is where the second program reads files whose> paths are mentioned in a text file which is parsed as UTF-8, after the> first program has done a security check by grepping for ".."> components.> > Unicode says the second program shouldn't accept malformed UTF-8,> precisely because in real scenarios (like this one) there's a mix of> programs and libraries, some aware of UTF-8, some not, and the latter> are involved in security decisions.> > Here on linux-kernel we're saying that if the second program accepts> any old byte sequence in a filename, it should preserve the byte> sequence exactly. But any program whose parser-tokeniser is scanning> UTF-8 is very unlikely to do that - it's just too complicated to say> some bits of a text stream must be remembered as literal bytes, and> others must be scanned as multibyte characters.

So what you are saying is that conversion of invalid multibyte sequencesinto non-error wide chars followed by conversion back into UTF-8 canlead to trouble? *DUH*

> The holes only arise because software which is interpreting UTF-8 is> mixed with software which isn't. That's one of the most useful> features of UTF-8, after all - that's why we use it for filenames.

The holes only arise because software which is interpreting UTF-8 doesn'tcare to do it properly. Software that doesn't interpret it (including thekernel) doesn't enter the picture at all.-To unsubscribe from this list: send the line "unsubscribe linux-kernel" inthe body of a message to majordomo@vger.kernel.orgMore majordomo info at http://vger.kernel.org/majordomo-info.htmlPlease read the FAQ at http://www.tux.org/lkml/