-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
M.-A. Lemburg wrote:
> Shouldn't this encoding guessing be a separate function that you call
> on either a file or a seekable stream?
>
> After all, detecting encodings is just as useful to have for non-file
> streams.
Other stream sources typically have out-of-band ways to signal the
encoding: only when reading from the filesystem do we pretty much
*have* to guess, and in that case the BOM / signature is the best
heuristic we have. Also, some non-file streams are not seekable, and so
their encoding can't be guessed via a pre-pass.
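
For concreteness, here is a minimal sketch of the BOM / signature
heuristic I have in mind. The helper name guess_encoding_from_bom is
hypothetical (nothing like it exists in the codecs module today); only
the codecs.BOM_* constants are real. It inspects the leading bytes only
and makes no attempt to guess when no BOM is present.

```python
import codecs

# Hypothetical helper, not part of the stdlib codecs module.
# Order matters: the UTF-32 LE BOM begins with the UTF-16 LE BOM,
# so the UTF-32 signatures must be tested first.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding_from_bom(head, default=None):
    """Return a codec name if the bytes `head` start with a known BOM."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    # No signature found: the caller has to fall back on something else.
    return default
```

The returned codec names ("utf-8-sig", "utf-16", "utf-32") are the ones
that consume the BOM themselves while decoding.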
> You'd then avoid having to stuff everything into
> a single function call and also open up the door for more complex
> application specific guess work or defaults.
>
> The whole process would then have two steps:
>
> 1. guess encoding
>
>    import codecs
>    encoding = codecs.guess_file_encoding(filename)
A filename alone is not enough information: or do you mean for that API
to actually open the stream?
> 2. open the file with the found encoding
>
>    f = open(filename, encoding=encoding)
>
> For seekable streams f, you'd have:
>
> 1. guess encoding
>
>    import codecs
>    encoding = codecs.guess_stream_encoding(f)
>
> 2. wrap the stream with a reader for the found encoding
>
>    reader_class = codecs.getreader(encoding)
>    g = reader_class(f)
>
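
To make the seekable-stream case concrete, here is a rough sketch of the
two steps above. Note that guess_stream_encoding is only a hypothetical
stand-in for the proposed API (codecs has no such function), and the
"utf-8" default is my assumption; codecs.getreader and the BOM constants
are real. The guess has to rewind the stream afterwards, which is
exactly what a non-seekable stream cannot do.

```python
import codecs
import io


def guess_stream_encoding(stream, default="utf-8"):
    """Hypothetical stand-in for the proposed codecs.guess_stream_encoding.

    Peeks at the first bytes of a seekable binary stream, then restores
    the original position so the reader still sees the BOM.
    """
    pos = stream.tell()
    head = stream.read(4)
    stream.seek(pos)
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # UTF-32 must be tested before UTF-16 (the LE BOMs share a prefix).
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return default  # assumed fallback when no BOM is present


# 1. guess encoding
f = io.BytesIO("hello".encode("utf-16"))  # the "utf-16" codec writes a BOM
encoding = guess_stream_encoding(f)

# 2. wrap the stream with a reader for the found encoding
reader_class = codecs.getreader(encoding)
g = reader_class(f)
text = g.read()
```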
Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAktHoU4ACgkQ+gerLs4ltQ5o3QCeLOJ7J91E+5f66vhgu1BUhYh4
9UgAnR2IeCd0BCsPez8ZilGNHJfhRn3Y
=SoPb
-----END PGP SIGNATURE-----