Which is correct, and what I expect. When I execute the following (same) lines in a script:
file = open("./web.txt","r")
line = file.readline()

The output is:
■-♠0♠A♠

If I replace those two lines with
file = codecs.open("./web.txt", "r", "utf-16le")
file.readline()
the output is:
File "c:\Python23\lib\encodings\utf_16_le.py", line 26, in readline
raise NotImplementedError, '.readline() is not implemented for UTF-16-LE'

UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-2: character maps to <undefined>
Note the----------------^^^
(0-2: character maps to <undefined>, as opposed to 0-3).

Furious George
Monday, July 26, 2004

Works for me - given a file containing the string you specify, and the Python code you specify, I get a "truncated character" as expected on the \n; appending another null byte to the file, the string reads in correctly.
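
One way a "truncated character" on the \n can come about, sketched in Python under assumptions: the sample string "-0A" is hypothetical (it is not the actual contents of web.txt), and the slice imitates a byte-oriented readline that stops right after the first 0x0A byte. In UTF-16LE the newline is the two bytes 0A 00, so the cut leaves an odd number of bytes.

```python
# Hypothetical sample text, encoded as UTF-16LE (no BOM).
data = u"-0A\n".encode("utf-16-le")   # bytes: 2D 00 30 00 41 00 0A 00

# Imitate a byte-oriented readline: keep everything up to and
# including the first 0x0A byte, which drops the newline's 00 byte.
cut = data[:data.index(b"\n") + 1]

try:
    cut.decode("utf-16-le")
    truncated = False
except UnicodeDecodeError:
    # Odd byte count: the last code unit is incomplete.
    truncated = True

# Appending the missing null byte makes the line decodable again.
repaired = (cut + b"\x00").decode("utf-16-le")
```

Here the cut line is 7 bytes long and fails to decode, while the repaired one decodes cleanly.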

I'm not familiar enough with Python to be able to guess what's happening differently when you run it.

Iago
Monday, July 26, 2004

>>> file = open("./web.txt", "r")

This is not correct. You should never manually open a Unicode file in text mode. Always use binary mode ("rb").
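
A minimal sketch of the binary-mode approach, written so it also runs on newer Python versions than 2.3. The file contents and the temporary path are assumptions made for illustration (the real web.txt is not shown in this thread), and the sketch assumes the file starts with a byte-order mark. The idea is to read the raw bytes, decode them in one step, and only then split into lines.

```python
import os
import tempfile

# Hypothetical sample file: "-0A" plus a newline, UTF-16LE with a BOM.
path = os.path.join(tempfile.mkdtemp(), "web.txt")
with open(path, "wb") as f:
    f.write(b"\xff\xfe" + u"-0A\n".encode("utf-16-le"))

# Read raw bytes in binary mode; no newline translation takes place.
with open(path, "rb") as f:
    raw = f.read()

# The "utf-16" codec consumes the BOM and uses it to pick the byte order.
text = raw.decode("utf-16")

# Split into lines only after decoding, so no code unit is cut in half.
first_line = text.splitlines()[0]
```

Decoding the whole buffer first sidesteps the readline problem entirely, at the cost of holding the file in memory.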

>>> When I execute the following (same) lines in a script:
>>> The output is: ■-♠0♠A♠

You're probably using a Unicode-aware console, or writing it to a file and using a Unicode-aware editor.

It works. Try print repr(line). I bet you're trying to print it on a limited text console. Try using a graphical Unicode-aware console like IDLE (it comes with the Python distribution). Then you don't need to use repr(line).
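
To illustrate the point about limited consoles, here is a small sketch. The string and the cp1252 code page are assumptions standing in for whatever the script actually read and whatever the console actually uses. Encoding for the console fails on characters the code page lacks, while an escaped form is always safe to print. In Python 2, repr(line) gives a similar escaped form; the "backslashreplace" handler used here should behave the same on newer versions too.

```python
# Hypothetical decoded line containing characters outside cp1252.
line = u"\u25a0-\u2660"

# Roughly what print attempts on a limited console: encode to its code page.
try:
    line.encode("cp1252")
    encodable = True
except UnicodeEncodeError:
    # The "character maps to <undefined>" case.
    encodable = False

# ASCII-only escaped form that any console can display.
escaped = line.encode("ascii", "backslashreplace").decode("ascii")
```

Printing escaped shows the code points as \uXXXX escapes instead of raising an error.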