Algorithms and Cryptography

Intelligent algorithms can be the difference between a terrible product and a successful one. Oka and Cohn described new algorithms for selecting which ads will be most profitable to display, for making use of survey-propagation techniques, and for building cryptographic systems "that can withstand the disclosure of partial information about their secret keys." If this is your cup of tea, check out this page.

Helping Writers Find the Right Words

This is a project that I will personally follow, because it is something I could really make use of on a daily basis. The goal is to go beyond conventional thesauruses and terminology lists, which are static and usually quite unhelpful when it comes to specific usage. The tool gives writers an inline contextual thesaurus that uses a paraphrase model and a large language model to suggest rephrasings appropriate to the writer's intended context. Take that and add optional online search functionality for further examples and you've got a potential winner. You can read more about the technology on a Microsoft Research page entitled Next Generation Writing Assistance.
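
To make the "paraphrase model plus language model" idea concrete, here is a toy sketch of how candidates might be ranked by context. It is not the researchers' method: the paraphrase table, the bigram counts, and the names PARAPHRASES, BIGRAMS, context_score, and suggest are all made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical paraphrase table: word -> candidate replacements.
PARAPHRASES: Dict[str, List[str]] = {"big": ["large", "great", "major"]}

# Hypothetical bigram counts standing in for a real language model.
BIGRAMS: Dict[Tuple[str, str], int] = {
    ("a", "large"): 5, ("a", "great"): 9, ("a", "major"): 4,
    ("large", "house"): 8, ("great", "house"): 1,
    ("great", "decision"): 3, ("major", "decision"): 9,
}


def context_score(left: str, candidate: str, right: str) -> int:
    """Score a candidate by how often it appears next to its neighbours.

    One is added to each count so an unseen bigram does not zero the product.
    """
    before = BIGRAMS.get((left, candidate), 0) + 1
    after = BIGRAMS.get((candidate, right), 0) + 1
    return before * after


def suggest(left: str, word: str, right: str) -> List[str]:
    """Return paraphrase candidates for `word`, best fit for the context first."""
    candidates = PARAPHRASES.get(word, [])
    return sorted(candidates, key=lambda c: -context_score(left, c, right))


print(suggest("a", "big", "house"))
print(suggest("a", "big", "decision"))
```

The same word gets a different ordering of suggestions depending on its neighbours, which is the part a static thesaurus cannot do.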

Color-Structured Image Search

Wang and Hua believe that image search can be improved by exploiting color spatial-relation information, via something they call color-layout-sensitive image search. The technology lets users specify a color layout and then reorders image search results, promoting the images that match the requested layout. Color-layout features are extracted as metadata. The researchers claim that their experimental results show that this method is both effective and efficient. They also note that the technique could be extended to other kinds of semantic structures. I've personally never wanted to search for a specific image by color, but there you go.
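
The reordering step can be pictured with a small sketch. This is only an illustration under my own assumptions, not Wang and Hua's feature set or scoring: each image's color-layout metadata is reduced to a coarse grid of color names, and the names Grid, Layout, layout_score, and rerank are hypothetical.

```python
from typing import Dict, List, Tuple

Grid = List[List[str]]                 # dominant color name per grid cell
Layout = Dict[Tuple[int, int], str]    # (row, col) -> requested color


def layout_score(grid: Grid, layout: Layout) -> float:
    """Fraction of the requested cells whose color matches the image."""
    if not layout:
        return 0.0
    hits = sum(1 for (r, c), color in layout.items() if grid[r][c] == color)
    return hits / len(layout)


def rerank(results: List[Tuple[str, Grid]], layout: Layout) -> List[str]:
    """Promote images that match the layout; ties keep their original order."""
    # sorted() is stable, so equal scores preserve the search engine's ranking.
    ranked = sorted(results, key=lambda item: -layout_score(item[1], layout))
    return [name for name, _ in ranked]


# Original search order: forest, sunset, beach.
results = [
    ("forest.jpg", [["green", "green"], ["brown", "green"]]),
    ("sunset.jpg", [["orange", "blue"], ["yellow", "black"]]),
    ("beach.jpg", [["blue", "blue"], ["yellow", "yellow"]]),
]
# The user asks for blue in the top-left cell and yellow in the bottom-left.
wanted = {(0, 0): "blue", (1, 0): "yellow"}
print(rerank(results, wanted))
```

Because the grids are precomputed metadata, the reranking itself is just a cheap comparison per result, which may be part of why the researchers can call the approach efficient.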

Tool Kit for Visualizing Large-Scale Data

This is a tool kit consisting of Silverlight/Ajax controls for visualizing large-scale structured data from various sources. The structure of the data, its trends, and the relationships among data properties can all be represented graphically. The four researchers are also working on a platform that enables rapid development of large-scale data-exploration, analysis, and reporting tools. I can see this maybe being put into an Office application for corporate use.

Concurrency Analysis Platform and Tools

Concurrency bugs are difficult to find and reproduce. The solution, according to this group, is something they call the Concurrency Analysis Platform (CAP). CAP provides predictable control over thread interleavings, and when it finds a concurrency bug, it provides an instantaneous repro. Unfortunately, the project website for CAP isn't particularly extensive, but it does include four concurrency-analysis tools: CHESS, a systematic unit-testing tool for concurrency; Cuzz, a concurrency fuzzer for obtaining more coverage from existing stress tests; FeatherLite, a lightweight data-race detector; and Sober, a tool for finding memory-model errors.
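
To see why controlling interleavings helps, here is a toy sketch. It is not how CAP or CHESS is implemented; it just simulates two "threads" that each do a non-atomic increment (a read step, then a write step) and tries every possible schedule. The names interleavings, run, and buggy are my own.

```python
from itertools import permutations
from typing import Dict, Iterator, Tuple

Schedule = Tuple[int, ...]  # thread id chosen at each step


def interleavings(n_threads: int, steps_per_thread: int) -> Iterator[Schedule]:
    """Yield every distinct schedule; each thread's own steps stay in order."""
    base = [t for t in range(n_threads) for _ in range(steps_per_thread)]
    yield from sorted(set(permutations(base)))


def run(schedule: Schedule) -> int:
    """Replay a non-atomic increment per thread under a fixed schedule."""
    shared = 0
    local: Dict[int, int] = {}   # value each thread last read
    pc: Dict[int, int] = {}      # next step index per thread
    for t in schedule:
        step = pc.get(t, 0)
        if step == 0:
            local[t] = shared        # step 0: read the shared counter
        else:
            shared = local[t] + 1    # step 1: write back the stale value + 1
        pc[t] = step + 1
    return shared


# Two increments should leave the counter at 2; collect schedules that do not.
buggy = [s for s in interleavings(2, 2) if run(s) != 2]
print(len(buggy), "of", len(list(interleavings(2, 2))), "schedules lose an update")
print("repro schedule:", buggy[0], "->", run(buggy[0]))
```

A failing schedule is itself the repro: replaying it gives the same wrong answer every time, instead of the bug showing up once in a thousand stress runs.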

Content Services for Minority Languages

Here are two separate approaches for helping minority languages along: searching scanned books without the need for OCR, and a tool that translates a document into English and then summarizes it in point form. The first, OCRLess, is a language-independent technology that enables search in printed documents for languages that lack OCR or have poor OCR quality. It combines image-based matching with text-based indexing. First, image documents are segmented into shapes: words, parts of words, characters, or parts of characters. Then, similar shapes are clustered and indexed by unique IDs. The second tool is called Trans-Bulletization and is meant as an accompaniment to machine translation solutions. It translates text but outputs the result only as short points. The idea is that when translated text is in bullet form, the user's expectations of fluency and style are lower, and thus the quality of the output can be improved.
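
The OCRLess steps (segment, cluster similar shapes, index by ID) can be sketched roughly as follows. This is a guess at the general shape of such a pipeline, not the actual system: shapes are assumed to be already segmented into tiny binary bitmaps, clustering is a simple greedy match on Hamming distance, and ShapeIndex and its methods are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]  # flattened binary bitmap of one segmented shape


def hamming(a: Shape, b: Shape) -> int:
    """Number of pixels that differ between two equally sized bitmaps."""
    return sum(x != y for x, y in zip(a, b))


class ShapeIndex:
    def __init__(self, max_distance: int = 1) -> None:
        self.max_distance = max_distance
        self.representatives: List[Shape] = []  # one exemplar per cluster ID
        self.postings: Dict[int, List[Tuple[str, int]]] = defaultdict(list)

    def cluster_id(self, shape: Shape, create: bool = True) -> int:
        """Return the ID of the first close-enough cluster, or make a new one."""
        for cid, rep in enumerate(self.representatives):
            if hamming(shape, rep) <= self.max_distance:
                return cid
        if not create:
            return -1
        self.representatives.append(shape)
        return len(self.representatives) - 1

    def add_document(self, doc_id: str, shapes: List[Shape]) -> None:
        """Index each shape's cluster ID the way a text index stores tokens."""
        for pos, shape in enumerate(shapes):
            self.postings[self.cluster_id(shape)].append((doc_id, pos))

    def search(self, shape: Shape) -> List[Tuple[str, int]]:
        """Look up a query shape image; returns (document, position) hits."""
        cid = self.cluster_id(shape, create=False)
        return list(self.postings.get(cid, []))


A = (1, 1, 0, 0)
A_NOISY = (1, 1, 0, 1)   # same shape with one pixel of scanning noise
B = (0, 0, 1, 1)

index = ShapeIndex(max_distance=1)
index.add_document("page1", [A, B])
index.add_document("page2", [B, A_NOISY])
print(index.search(A))
```

Once shapes have IDs, the search side looks like ordinary text retrieval, with no need for the system to know which character or word a shape actually represents.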