-- _645126031
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
At 12:25 08/09/22, Michael Selig wrote:
>On Mon, 22 Sep 2008 12:35:49 +1000, Martin Duerst <duerst / it.aoyama.ac.jp>
>wrote:
>
>>
>> Therefore, I think we should seriously consider this proposal,
>> and hopefully implement it before Sept. 25th. In terms of
>> implementation, I don't think it should be that difficult,
>> but it may be quite a bit of work to check
>> Encoding::default_internal in all the affected methods.
>
>Wow, that is rather ambitious - 3 days?

Well, that's the deadline for feature changes for 1.9.1.
It would be a real pity to wait for 2.0 for this.

The feature freeze wiki at
http://redmine.ruby-lang.org/wiki/ruby/DevelopersMeeting20080922
says that default_internal is currently pending, but that
this should be discussed/settled this week.

Anyhow, I had a look at the code, and it doesn't seem to be that
difficult. The function io_extract_encoding_option in io.c
seems to be central. I'm attaching a patch, which I hope is
a good start. I'm also writing to ruby-dev (in Japanese)
because that's where the real experts are.

The patch isn't as strict as your proposal with respect
to re-setting, but I'm fine either way.
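
Decoded from the attachment, the io.c part of the patch comes down to one decision about which encodings a readable stream ends up with. A rough Ruby sketch of that decision follows. The method name pick_encodings and its keyword arguments are made up for illustration; enc and enc2 mirror the variables in io_extract_encoding_option (as I read the patch, enc is the encoding strings come back in, and enc2 is the on-disk encoding when transcoding is needed).

```ruby
# Illustrative sketch only: mirrors the if/else added to io.c by the
# attached patch. Not part of Ruby's API; the method name is hypothetical.
def pick_encodings(enc, enc2, readable:, default_internal:, default_external:)
  binary = Encoding::ASCII_8BIT
  # The patch only acts on readable streams for which no transcoding
  # pair was chosen yet (enc2 unset), and only if default_internal is set.
  return [enc, enc2] unless readable && enc2.nil? && default_internal

  if enc.nil?
    # No encoding given at all: transcode from default_external,
    # unless that is already default_internal or is binary.
    if default_external != default_internal && default_external != binary
      return [default_internal, default_external]
    end
  elsif enc != default_internal && enc != binary
    # An external encoding was given: keep it as the source (enc2)
    # and hand strings back in default_internal.
    return [default_internal, enc]
  end
  [enc, enc2]
end
```

Called with enc set to Shift_JIS, enc2 nil, and default_internal set to UTF-8, the sketch yields UTF-8 as enc and Shift_JIS as enc2; with enc set to ASCII-8BIT it leaves both values alone.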
I have tested this patch with code like the following
(called with -Eutf-8, -Eshift_jis, -Eeuc-jp, and without -E
option, in all combinations)
>>>>
Encoding.default_internal = 'utf-8'
# tested with 'utf-8', 'shift_jis', and 'euc-jp'
s = "\u3042\u3044\u3046\u3048\u304A"
File.open('testout1.txt', 'w:shift_jis') do |f| f.write s end
File.open('testout2.txt', 'w:euc-jp') do |f| f.write s end
File.open('testout3.txt', 'w:utf-8') do |f| f.write s end
File.open('testout1.txt', 'r:shift_jis') do |f| s = f.read; p s.encoding end
File.open('testout2.txt', 'r:euc-jp') do |f| s = f.read; p s.encoding end
File.open('testout3.txt', 'r:utf-8') do |f| s = f.read; p s.encoding end
File.open('testout3.txt', 'r:ASCII-8BIT') do |f| s = f.read; p s.encoding end
# for next line, change file number to pick up default_internal
File.open('testout3.txt', 'r') do |f| s = f.read; p s.encoding end
>>>>
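
For comparison, here is a self-contained variant of the test above that cleans up after itself. It assumes a Ruby in which Encoding.default_internal= is available and behaves as in current releases, which may differ in detail from the attached patch. read_back_encoding is a hypothetical helper name, not something from the patch.

```ruby
require 'tempfile'

# Hypothetical helper: writes `text` to a temporary file in the `external`
# encoding, reads it back with mode "r:<external>" while default_internal
# is set to `internal`, and returns the encoding of the string read.
def read_back_encoding(text, external, internal)
  saved = Encoding.default_internal
  verbose, $VERBOSE = $VERBOSE, nil # silence any warning about re-setting
  Encoding.default_internal = internal
  Tempfile.create('default_internal') do |tmp|
    tmp.close
    File.open(tmp.path, "w:#{external}") { |f| f.write(text) }
    File.open(tmp.path, "r:#{external}") { |f| f.read.encoding }
  end
ensure
  # Restore global state so the helper has no lasting side effects.
  Encoding.default_internal = saved
  $VERBOSE = verbose
end

text = "\u3042\u3044\u3046\u3048\u304A"
%w[shift_jis euc-jp utf-8].each do |external|
  p [external, read_back_encoding(text, external, 'utf-8')]
end
```

Because default_internal is process-wide state, the helper saves and restores it, which is also why the setter is meant for the main application rather than for libraries.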

>The bulk of the implementation will be in the libraries, and I think many
>of them need updating to cope with non-ascii encodings anyhow.

Yes. I'm not sure how libraries are affected by the feature
freeze, but they have to be fixed anyhow, completely independently
of default_internal. And I agree that this cannot be done in 3 days.

Regards, Martin.

>> - We should think through various scenarios for output.
>> I can't think of any problems just now, I just noticed
>> the absence of considerations for output below.
>
>I did think about output to a certain extent, and one good thing is that
>IO already seems to automatically transcode to the "external" encoding at
>the moment. As for other classes, again I think most need updating to
>support multiple encodings anyhow. They will at a minimum need a way of
>having the user pass the "external" encoding (defaulting to
>"default_external"), and do the transcode as necessary, based on the
>encoding of the data to be output. However, as with IO, this behaviour
>probably should happen no matter whether "default_internal" is implemented
>or not.
>
>Cheers
>Mike
>
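
The automatic transcoding on output that Mike describes can be checked with a small sketch like the following. bytes_written is a hypothetical helper, and the sketch assumes IO behaves as in current Ruby releases.

```ruby
require 'tempfile'

# Hypothetical helper: writes `text` through a File opened with the given
# external encoding and returns the raw bytes that ended up on disk.
def bytes_written(text, external)
  Tempfile.create('external') do |tmp|
    tmp.close
    File.open(tmp.path, "w:#{external}") { |f| f.write(text) }
    File.binread(tmp.path)
  end
end

# U+3042 is a UTF-8 string here; IO transcodes it on the way out.
p bytes_written("\u3042", 'shift_jis').unpack('C*')
p bytes_written("\u3042", 'euc-jp').unpack('C*')
```

The string is never converted by hand; the conversion happens inside IO because of the "w:<encoding>" mode, independently of default_internal.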
#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst / it.aoyama.ac.jp
-- _645126031
Content-Type: application/octet-stream; name="patch_default_internal.txt";
 x-mac-type="54455854"; x-mac-creator="74747874"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="patch_default_internal.txt"
SW5kZXg6IGVuY29kaW5nLmMNCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0NCi0tLSBlbmNvZGluZy5jCShyZXZpc2lvbiAx
OTUxMCkNCisrKyBlbmNvZGluZy5jCSh3b3JraW5nIGNvcHkpDQpAQCAtMTA2Miw2ICsxMDYyLDY3
IEBADQogI2VuZGlmDQogfQ0KIA0KK3N0YXRpYyBpbnQgZGVmYXVsdF9pbnRlcm5hbF9pbmRleCA9
IC0xOw0KK3N0YXRpYyByYl9lbmNvZGluZyAqZGVmYXVsdF9pbnRlcm5hbCA9IDA7DQorDQorDQor
cmJfZW5jb2RpbmcgKg0KK3JiX2RlZmF1bHRfaW50ZXJuYWxfZW5jb2Rpbmcodm9pZCkNCit7DQor
ICAgIHJldHVybiBkZWZhdWx0X2ludGVybmFsOw0KK30NCisNCitWQUxVRQ0KK3JiX2VuY19kZWZh
dWx0X2ludGVybmFsKHZvaWQpDQorew0KKyAgICByZXR1cm4gZGVmYXVsdF9pbnRlcm5hbD09MCA/
IFFuaWwgOiANCisJcmJfZW5jX2Zyb21fZW5jb2RpbmcoZGVmYXVsdF9pbnRlcm5hbCk7DQorfQ0K
Kw0KKy8qDQorICogY2FsbC1zZXE6DQorICogICBFbmNvZGluZy5kZWZhdWx0X2ludGVybmFsID0+
IGVuYw0KKyAqDQorICogUmV0dXJucyBkZWZhdWx0IGludGVybmFsIGVuY29kaW5nIChuaWwgaWYg
dW51c2VkKS4NCisgKg0KKyAqLw0KK3N0YXRpYyBWQUxVRQ0KK2dldF9kZWZhdWx0X2ludGVybmFs
KFZBTFVFIGtsYXNzKQ0KK3sNCisgICAgcmV0dXJuIHJiX2VuY19kZWZhdWx0X2ludGVybmFsKCk7
DQorfQ0KKw0KK3ZvaWQNCityYl9lbmNfc2V0X2RlZmF1bHRfaW50ZXJuYWwoVkFMVUUgZW5jb2Rp
bmcpDQorew0KKyAgICBpZiAoZGVmYXVsdF9pbnRlcm5hbCkNCisJcmJfd2FybigiUmVzZXR0aW5n
IEVuY29kaW5nLmRlZmF1bHRfaW50ZXJuYWwiKTsNCisgICAgaWYgKGVuY29kaW5nID09IFFuaWwp
IHsNCisgICAgICAgIGRlZmF1bHRfaW50ZXJuYWwgPSAwOw0KKyAgICAgICAgZGVmYXVsdF9pbnRl
cm5hbF9pbmRleCA9IC0xOw0KKyAgICB9DQorICAgIGVsc2Ugew0KKwlkZWZhdWx0X2ludGVybmFs
ID0gcmJfdG9fZW5jb2RpbmcoZW5jb2RpbmcpOw0KKwlkZWZhdWx0X2ludGVybmFsX2luZGV4ID0g
cmJfZW5jX3RvX2luZGV4KGRlZmF1bHRfaW50ZXJuYWwpOw0KKyAgICB9DQorfQ0KKw0KKy8qDQor
ICogY2FsbC1zZXE6DQorICogICBFbmNvZGluZy5kZWZhdWx0X2ludGVybmFsPSBlbmMgPT4gZW5j
DQorICoNCisgKiBTZXRzIGRlZmF1bHQgaW50ZXJuYWwgZW5jb2RpbmcgKGRlZmF1bHQgaXMgbmls
LCBpLmUuIHVudXNlZCkuDQorICogRm9yIHVzZSBpbiBtYWluIGFwcGxpY2F0aW9uOyBuZXZlciB1
c2UgaW4gYSBsaWJyYXJ5IQ0KKyAqIFJldHVybnMgbmlsLiBQcm9kdWNlcyBhIHdhcm5pbmcgaWYg
cmVzZXQuDQorICoNCisgKi8NCitzdGF0aWMgVkFMVUUNCitzZXRfZGVmYXVsdF9pbnRlcm5hbChW
QUxVRSBrbGFzcywgVkFMVUUgZW5jb2RpbmcpDQorew0KKyAgICByYl9lbmNfc2V0X2RlZmF1bHRf
aW50ZXJuYWwoZW5jb2RpbmcpOw0KKyAgICByZXR1cm4gUW5pbDsNCit9DQorDQogc3RhdGljIHZv
aWQNCiBzZXRfZW5jb2RpbmdfY29uc3QoY29uc3QgY2hhciAqbmFtZSwgcmJfZW5jb2RpbmcgKmVu
YykNCiB7DQpAQCAtMTIxNCw2ICsxMjc1LDkgQEANCiAgICAgcmJfZGVmaW5lX3NpbmdsZXRvbl9t
ZXRob2QocmJfY0VuY29kaW5nLCAiZGVmYXVsdF9leHRlcm5hbCIsIGdldF9kZWZhdWx0X2V4dGVy
bmFsLCAwKTsNCiAgICAgcmJfZGVmaW5lX3NpbmdsZXRvbl9tZXRob2QocmJfY0VuY29kaW5nLCAi
bG9jYWxlX2NoYXJtYXAiLCByYl9sb2NhbGVfY2hhcm1hcCwgMCk7DQogDQorICAgIHJiX2RlZmlu
ZV9zaW5nbGV0b25fbWV0aG9kKHJiX2NFbmNvZGluZywgImRlZmF1bHRfaW50ZXJuYWwiLCAgIGdl
dF9kZWZhdWx0X2ludGVybmFsLCAwKTsNCisgICAgcmJfZGVmaW5lX3NpbmdsZXRvbl9tZXRob2Qo
cmJfY0VuY29kaW5nLCAiZGVmYXVsdF9pbnRlcm5hbD0iLCAgc2V0X2RlZmF1bHRfaW50ZXJuYWws
IDEpOw0KKw0KICAgICBsaXN0ID0gcmJfYXJ5X25ldzIoZW5jX3RhYmxlLmNvdW50KTsNCiAgICAg
UkJBU0lDKGxpc3QpLT5rbGFzcyA9IDA7DQogICAgIHJiX2VuY29kaW5nX2xpc3QgPSBsaXN0Ow0K
SW5kZXg6IGlvLmMNCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT0NCi0tLSBpby5jCShyZXZpc2lvbiAxOTUxMCkNCisrKyBp
by5jCSh3b3JraW5nIGNvcHkpDQpAQCAtMzg4NSw2ICszODg1LDcgQEANCiAgICAgVkFMVUUgZWNv
cHRzOw0KICAgICBpbnQgaGFzX2VuYyA9IDAsIGhhc192bW9kZSA9IDA7DQogICAgIFZBTFVFIGlu
dG1vZGU7DQorICAgIHJiX2VuY29kaW5nICpkZWZfaW50ZXJuYWw7DQogDQogICAgIHZtb2RlID0g
KnZtb2RlX3A7DQogDQpAQCAtMzk3Miw2ICszOTczLDIwIEBADQogDQogICAgICpvZmxhZ3NfcCA9
IG9mbGFnczsNCiAgICAgKmZtb2RlX3AgPSBmbW9kZTsNCisgICAgaWYgKGZtb2RlJkZNT0RFX1JF
QURBQkxFICYmICFlbmMyICYmIChkZWZfaW50ZXJuYWw9cmJfZGVmYXVsdF9pbnRlcm5hbF9lbmNv
ZGluZygpKSkgew0KKwlyYl9lbmNvZGluZyAqZGVmX2V4dGVybmFsID0gcmJfZGVmYXVsdF9leHRl
cm5hbF9lbmNvZGluZygpOw0KKwlyYl9lbmNvZGluZyAqYXNjaWlfOGJpdCA9IHJiX2VuY19maW5k
KCJBU0NJSS04QklUIik7DQorCWlmICghZW5jKSB7DQorCSAgICBpZiAoZGVmX2V4dGVybmFsIT1k
ZWZfaW50ZXJuYWwgJiYgZGVmX2V4dGVybmFsIT1hc2NpaV84Yml0KSB7DQorCSAgICAgICAgZW5j
ICA9IGRlZl9pbnRlcm5hbDsNCisJICAgICAgICBlbmMyID0gZGVmX2V4dGVybmFsOw0KKwkgICAg
fQ0KKwl9DQorCWVsc2UgaWYgKGVuYyE9ZGVmX2ludGVybmFsICYmIGVuYyE9YXNjaWlfOGJpdCkg
ew0KKwkgICAgZW5jMiA9IGVuYzsNCisJICAgIGVuYyA9IGRlZl9pbnRlcm5hbDsNCisJfQ0KKyAg
ICB9DQogICAgIGNvbnZjb25maWdfcC0+ZW5jID0gZW5jOw0KICAgICBjb252Y29uZmlnX3AtPmVu
YzIgPSBlbmMyOw0KICAgICBjb252Y29uZmlnX3AtPmVjZmxhZ3MgPSBlY2ZsYWdzOw0K
-- _645126031 --