About the program

Welcome!
You may find this site useful if you have received text that you believe is written in the Cyrillic alphabet but that is displayed as a strange combination of bizarre characters. This program will try to guess the encoding; if it cannot, it will show samples of all encoding combinations so that you can select the correct one.
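To illustrate how such scrambled text typically arises, here is a minimal Python sketch. It assumes one common scenario (Windows-1251 bytes wrongly read as Windows-1252); the encodings that actually affected your text may differ, and this is not the program's own code.

```python
# Illustrative only: one common way Cyrillic text gets scrambled.
original = "кирилица"

# The sender's software stores the text as Windows-1251 bytes ...
raw = original.encode("cp1251")

# ... but the receiver wrongly interprets those bytes as Windows-1252.
scrambled = raw.decode("cp1252")
print(scrambled)   # êèðèëèöà

# Undoing the two steps in reverse order recovers the text.
recovered = scrambled.encode("cp1252").decode("cp1251")
print(recovered)   # кирилица
```

Decoding is a search for the pair of encodings that were mixed up, which is why the program asks what the text looks like.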

How to

Paste the text to decipher into the large text area. The first few words will be analysed, so they should be part of the (scrambled) text that is supposed to be Cyrillic.

In the list box "The text looks like this", choose the option that most resembles your text. The variants shown are the most frequent problematic encodings.

If you don't know which variant to choose, press the button next to "I don't know, test all combinations".

The program will try to decipher the text and will print the result below.

If the conversion is successful, you will see the text in Cyrillic characters and will be able to copy it and save it if it is important.

If the conversion is not successful (the text is still not in Cyrillic but in the same or other unintelligible characters), choose from the newly created list box the variant that is in Cyrillic (if there is more than one, select the longest). Press the OK button to get the correctly converted text.

If the text is not completely converted, try the other Cyrillic variants in the list box.
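The automatic guess used in the steps above relies on analysing the first words of the text. How the program actually scores them is not described on this page; the following Python sketch shows one simple possibility, assuming the raw bytes are available and that only three candidate encodings are tried. The names COMMON, CANDIDATES, score and guess_encoding are hypothetical.

```python
# Hypothetical sketch of a frequency-based guess, not the program's own code.
COMMON = set("оеаинтсрвл")               # some frequent lowercase Cyrillic letters
CANDIDATES = ["cp1251", "koi8_r", "cp866"]

def score(text):
    """Share of the letters in `text` that belong to COMMON."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c in COMMON for c in letters) / len(letters)

def guess_encoding(raw):
    """Return the candidate encoding whose decoding scores highest."""
    best_name, best_score = None, -1.0
    for name in CANDIDATES:
        try:
            decoded = raw.decode(name)
        except UnicodeDecodeError:
            continue                      # these bytes are invalid in this encoding
        current = score(decoded)
        if current > best_score:
            best_name, best_score = name, current
    return best_name

sample = "това е проба на кирилица".encode("koi8_r")
print(guess_encoding(sample))             # koi8_r
```

This crude score only looks at lowercase letters, so text written entirely in capitals would need extra handling.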

Limits

If your text contains question marks "???? ?? ??????", the problem is on the sender's side and the text cannot be recovered. Ask them to resend the text, possibly as a plain text file.

There is no guarantee that every text can be deciphered, even if you are certain that the text is in Cyrillic.
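A small Python sketch shows why question marks cannot be undone. It assumes, purely for illustration, that the sender's software forced the text into ASCII and substituted "?" for every character it could not represent; the actual cause on the sender's side may differ.

```python
# Assumption for illustration: every character that could not be stored
# was replaced with "?" before the text was sent.
first = "Това не работи".encode("ascii", errors="replace").decode("ascii")
second = "Мама не работи".encode("ascii", errors="replace").decode("ascii")

print(first)             # ???? ?? ??????
print(first == second)   # True: two different texts became identical
```

Because different originals produce the same string of question marks, no program can tell which one was meant.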

The analyzed and converted text is limited to 20 KiB.

100% precision is not always achieved: when converting from one code page to another, some characters may be lost, such as the Bulgarian quotation marks or, rarely, some single letters. Some of this depends on how your Windows Clipboard handles characters.

The program tries at most 4725 variants in two or three levels; a deeper multiple encoding such as koi8(utf(cp1251(utf))) will not be detected or tested. Usually the number of possible correct variants that are displayed is between 32 and 255.

If one part of the text is encoded with one code page and another part with a different code page, the program can recognize only one of the parts at a time.
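The idea of testing combinations in several levels can be sketched in Python as follows. This is only an illustration under assumptions: the list ENCODINGS, the helper names reinterpret, cyrillic_ratio and candidates, and the ranking by share of Cyrillic letters are all hypothetical, and the sketch does not reproduce the program's actual set of variants.

```python
from itertools import product

# Hypothetical, short list; the real program tests many more variants.
ENCODINGS = ["cp1251", "koi8_r", "cp866", "utf-8", "cp1252", "latin-1"]

def reinterpret(text, wrong, right):
    """Undo one layer: re-encode with `wrong`, decode with `right`.
    Returns None when the text cannot pass through this pair."""
    try:
        return text.encode(wrong).decode(right)
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None

def cyrillic_ratio(text):
    """Share of letters that fall in the basic Cyrillic Unicode block."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum("\u0400" <= c <= "\u04FF" for c in letters) / len(letters)

def candidates(text, max_levels=2):
    """Yield (chain, result) for every chain of up to `max_levels` layers."""
    pairs = [(w, r) for w, r in product(ENCODINGS, repeat=2) if w != r]
    frontier = [((), text)]
    for _ in range(max_levels):
        next_frontier = []
        for chain, current in frontier:
            for pair in pairs:
                result = reinterpret(current, *pair)
                if result is not None and result != current:
                    next_frontier.append((chain + (pair,), result))
        yield from next_frontier
        frontier = next_frontier

# Text that was scrambled twice needs a two-level chain to be recovered.
once = "привет".encode("utf-8").decode("cp1251")
twice = once.encode("utf-8").decode("cp1251")
ranked = sorted(candidates(twice), key=lambda item: cyrillic_ratio(item[1]),
                reverse=True)
```

Several wrong chains can also yield text made only of Cyrillic letters, so a ranking like this narrows the choice but still leaves the final selection to the reader. The number of chains grows quickly with each extra level, which is why the depth has to be limited.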

Terms of use

Please note that this freeware program was created in the hope that it will be useful, but it comes with no warranty, not even an implied warranty of fitness for any particular use. Please use it at your own risk.

If you have very long texts to convert, please make sure you have a backup copy.

11.10.06 : The main site is on a new hardware server and should run faster.

11.09.06 : The program now uses PHP5 and should run several times faster.

19.08.06 : Because of a broken DNS entry, this site was inaccessible from 06:00 on 15 August until 15:00 on 18 August. This prompted me to set up two "mirror" sites (5ko.free.fr/decode and 2cyr.accent-bg.com/decode) with the same program. If the original has a problem, you can find the copies via Google and recover your texts.

20.10.05 : Small improvement to the frequency-analysis function for texts written in all-capital letters.

14.10.05 : Two more Gmail Cyrillic encodings were added. In theory, 2112 combinations are now tested.

15.06.05 : A Russian-language interface was added. Big thanks to chAlx!

16.02.05 : One more postfilter decoding was added, for strings like this: "%u043A%u0438%u0440%u0438%u043B%u0438%u0446%u0430".

05.02.05 : More encoding tests were added; the number of tested encodings has doubled, so the program may run slightly slower.

03.02.05 : The frequency analysis function that detects the original encoding works much better now. Currently the program recognises most of the encodings if the first few words are not too weird. It still needs some improvement, though.
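For readers curious about the "%u043A..." strings mentioned in the 16.02.05 entry: each "%uXXXX" group carries the hexadecimal Unicode code of one character, a form produced for example by the old JavaScript escape() function. A minimal Python sketch of such a postfilter might look like this; the function name decode_percent_u is hypothetical and this is not the program's own code.

```python
import re

def decode_percent_u(text):
    """Replace each %uXXXX group (four hex digits) with its character."""
    return re.sub(
        r"%u([0-9A-Fa-f]{4})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )

print(decode_percent_u("%u043A%u0438%u0440%u0438%u043B%u0438%u0446%u0430"))
# кирилица
```

Text outside the "%uXXXX" groups is left unchanged, so the filter can be applied safely to mixed input.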