Thursday, April 24, 2008

We've been running Google Summer of Code™ for four years now, and each year gets better and better. Competition was fierce for 2008, as we received nearly 7,100 applications, almost 1,000 more than last year. We welcomed over 800 more student applicants from over 1,300 colleges and universities. We're also pleased to see that the program is growing geographically; we had participants from 90 countries in 2007, and this year we'll be hacking out with folks from 98 countries around the globe.

Today we're pleased to let you know that we're funding 1125 student developers, almost 25% more students than last year, who will work to improve over 175 Free and Open Source Projects over the next few months. Check out the program website for more details on each participating student and mentoring organization.

For those of you who aren't participating in the program, now is a great time to continue working on your project ideas and learning about Free and Open Source. Each participating project is well placed to provide you with assistance in getting up to speed as a new contributor; take advantage of this opportunity to fix some bugs, hone your skills and, if you'd like, prepare for future instances of the program.

Congratulations to all students whose proposals were accepted, and many thanks to all of our applicants. The community bonding period starts today, and we'd love to hear from all of you how you plan to spend this time getting ready to start coding in six weeks. Feel free to post a comment and share your thoughts.

------------------

I heard there are many students from Sri Lanka too. Congratulations to you all and good luck! Have a good experience :)

Google is doing a great job, isn't it? Handling more than 1,100 students all over the world is not an easy task…

The most recent, and most important, thing I learnt in CS is about the BOM (Byte Order Mark).

We have been doing Moodle localization for the last 8 months or so. We do not use any special tool for this; we mostly use Dreamweaver! As you would expect, we saved our work in UTF-8. Until now, we had only small problems when testing our language pack with Moodle, and we didn't worry much about them. Then yesterday we suddenly got a serious one: when we tried to test our pack, the pages started giving 'headers already sent' error messages. Only then did we realize how serious it was and start digging into the problem.

Today we found that, all this time, we had been saving our work as UTF-8 with a BOM, and Moodle doesn't support a BOM. We removed it and now everything works fine!
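For anyone who hits the same thing, here is a minimal sketch of how the check and the fix could be scripted, assuming Python is available. The likely cause of the error is that PHP treats anything before the opening `<?php` tag as page output, so the three BOM bytes go out before the headers can be set. The helper name `strip_utf8_bom` is our own invention for illustration; it is not part of Moodle or Dreamweaver.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM (EF BB BF) from the file at `path`.

    Returns True if a BOM was found and removed, False if the file
    was already BOM-free (in which case it is left untouched).
    """
    file_path = Path(path)
    data = file_path.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        # Rewrite the file without its first three bytes.
        file_path.write_bytes(data[len(codecs.BOM_UTF8):])
        return True
    return False
```

Running it over every file in a language pack folder would be a simple loop over the file paths. Working on raw bytes means the rest of the file is not re-encoded or otherwise changed.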

Here is a short FAQ from the Unicode site:

Q: What is a BOM?

A: A byte order mark (BOM) consists of the character code U+FEFF at the beginning of a data stream, where it can be used as a signature defining the byte order and encoding form, primarily of unmarked plaintext files. Under some higher level protocols, use of a BOM may be mandatory (or prohibited) in the Unicode data stream defined in that protocol.

Q: Where is a BOM useful?

A: A BOM is useful at the beginning of files that are typed as text, but for which it is not known whether they are in big or little endian format; it can also serve as a hint indicating that the file is in Unicode, as opposed to a legacy encoding, and furthermore it acts as a signature for the specific encoding form used.

Q: When a BOM is used, is it only in 16-bit Unicode text?

A: No, a BOM can be used as a signature no matter how the Unicode text is transformed: UTF-16, UTF-8, UTF-7, etc. The exact bytes comprising the BOM will be whatever the Unicode character FEFF is converted into by that transformation format. In that form, the BOM serves to indicate both that it is a Unicode file, and which of the formats it is in. Examples:

Bytes          Encoding Form
00 00 FE FF    UTF-32, big-endian
FF FE 00 00    UTF-32, little-endian
FE FF          UTF-16, big-endian
FF FE          UTF-16, little-endian
EF BB BF       UTF-8
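The byte patterns listed above can be turned into a small detector. This is only a sketch, assuming Python; the names `detect_bom` and `bom_bytes` are hypothetical helpers we made up, while the `codecs.BOM_*` constants come from the Python standard library. One detail to watch: the UTF-32 little-endian mark begins with the same two bytes as the UTF-16 little-endian mark, so the longer marks have to be tested first.

```python
import codecs

# Longer marks first: FF FE 00 00 (UTF-32 LE) would otherwise be
# mistaken for FF FE (UTF-16 LE).
BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32, big-endian"),
    (codecs.BOM_UTF32_LE, "UTF-32, little-endian"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16, big-endian"),
    (codecs.BOM_UTF16_LE, "UTF-16, little-endian"),
]


def detect_bom(data):
    """Return the encoding form signalled by a leading BOM, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def bom_bytes(encoding):
    """Encode U+FEFF with an explicit-endianness codec such as
    'utf-8', 'utf-16-be' or 'utf-32-le' to see the BOM bytes."""
    return "\ufeff".encode(encoding)
```

A return value of `None` only means that no BOM was found; it says nothing about which encoding the data is actually in. Note also that a UTF-16 little-endian file whose first character after the BOM is U+0000 would be reported as UTF-32 by this simple check.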

Q: Can a UTF-8 data stream contain the BOM character (in UTF-8 form)? If yes, then can I still assume the remaining UTF-8 bytes are in big-endian order?

A: Yes, UTF-8 can contain a BOM. However, it makes no difference as to the endianness of the byte stream. UTF-8 always has the same byte order. An initial BOM is only used as a signature, that is, an indication that an otherwise unmarked text file is in UTF-8. Note that some recipients of UTF-8 encoded data do not expect a BOM. Where UTF-8 is used transparently in 8-bit environments, the use of a BOM will interfere with any protocol or file format that expects specific ASCII characters at the beginning, such as the use of "#!" at the beginning of Unix shell scripts.
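The "#!" case from this last answer is easy to see in a few lines. This is a rough sketch, assuming Python; the two function names are hypothetical, and `utf-8-sig` is the standard Python codec that skips a leading BOM when decoding.

```python
def starts_with_shebang(raw):
    """True only if the very first two bytes are '#!'."""
    return raw.startswith(b"#!")


def decode_ignoring_bom(raw):
    """Decode UTF-8 bytes, dropping a leading BOM if there is one."""
    return raw.decode("utf-8-sig")


script_with_bom = b"\xef\xbb\xbf#!/bin/sh\necho hello\n"
script_without_bom = b"#!/bin/sh\necho hello\n"

# The BOM pushes '#!' away from the start of the file.
print(starts_with_shebang(script_with_bom))
print(starts_with_shebang(script_without_bom))

# Decoded with 'utf-8-sig', both versions give the same text.
print(decode_ignoring_bom(script_with_bom) == decode_ignoring_bom(script_without_bom))
```

Decoding with plain `utf-8` instead would keep U+FEFF as the first character of the text, which is the same kind of stray leading data that caused our problem with Moodle.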