Change History (12)

Most file-system-related functions in Python (such as os.stat) accept unicode strings, which are then encoded using the default encoding of the file system (see http://docs.python.org/library/sys.html#sys.getfilesystemencoding). This is actually the only sane thing to do: if you pass a manually encoded string, you have no guarantee that it will match what was actually written on the FS. On Unix platforms, the file system encoding depends on the user's *locale*. Thus, if the user the server runs as doesn't have LC_ALL, LANG, etc. properly set in their environment, the FS encoding will be assumed to be ASCII and os.stat will fail on non-ASCII file names.
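To illustrate the point above, here is a minimal sketch (written for Python 3, where str is already unicode) that checks whether a file name can be encoded with a given encoding. The helper name can_encode_for_filesystem and the sample file name are made up for this example; only sys.getfilesystemencoding() comes from the discussion.

```python
import sys


def can_encode_for_filesystem(name, encoding=None):
    """Return True if `name` can be encoded with `encoding`.

    When no encoding is given, the interpreter's file system encoding
    (sys.getfilesystemencoding()) is used.
    """
    encoding = encoding or sys.getfilesystemencoding()
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical non-ASCII file name, used only for illustration.
filename = "r\u00e9sum\u00e9.txt"

print("FS encoding:", sys.getfilesystemencoding())
print("encodable as ASCII:", can_encode_for_filesystem(filename, "ascii"))
print("encodable as UTF-8:", can_encode_for_filesystem(filename, "utf-8"))
```

With an ASCII file system encoding the first check fails, which is the situation that makes os.stat choke on such a name.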

I'm marking this as accepted, because I think it's worth putting a note about this in FileStorage docs.

What I'm saying is that my system is configured to use UTF-8: sys.getfilesystemencoding() gives me "UTF-8", but safe_join won't give me a UTF-8 encoded string. A .encode("utf-8") would have to be applied to the string before it is passed to os.stat, but File.save() doesn't handle that. I figured it might be a bug in safe_join not to return a string in the same encoding it was passed.

In any case, I can't get safe_join to return a string that my os.stat can handle. Is it as safe as it claims to be?

Anyhow, I agree with you that the FileStorage docs could be more precise on some points, including this one.

You're actually right. For some reason, Django wasn't picking up the locale settings of my user. In fact, the fcgi script wasn't either, even though my envvars script was properly configured.
I had to set DefaultInitEnv LANG "en_US.UTF-8" in my sites-available/default.
Now my view gets the right file system encoding, and that solves it.
Thanks for your help!
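
For anyone hitting the same thing, a small diagnostic along these lines can be run from inside the server process (for example from a throwaway view) to see which locale variables it actually received. This is only a sketch: the function name locale_report is invented here, and the list of variables it inspects is an assumption about what is relevant.

```python
import os
import sys


def locale_report(environ=None):
    """Collect the locale variables visible to this process.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to try out without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    report = {name: environ.get(name) for name in ("LC_ALL", "LC_CTYPE", "LANG")}
    # The encoding Python will use for unicode file names.
    report["filesystemencoding"] = sys.getfilesystemencoding()
    return report


for key, value in sorted(locale_report().items()):
    print(key, "=", value)
```

If LANG and LC_ALL both show up as None here while they are set in your shell, the web server is not passing them on to the process.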
