Saturday, July 21, 2007

EU Google Competitor Project Gets Aid Worth $166 Million

EU Google Competitor Project Gets Aid Worth $166 Million: [...] Dow Jones reports: "The aim is to develop new search technologies for the next generation Internet, including 'semantic technologies which try to recognize the meaning of content and place it in its proper context.' The semantic Web has been considered the next evolution of the Internet at least since Tim Berners-Lee, widely considered a creator of the current version of the Internet, published an article describing it in 2001. In theory, a semantic Web could receive a user request for information about fishing, for example, and automatically narrow the results according to the user's individual needs rather than blanket the user with pages related to numerous aspects of fishing. The Commission's funding approval Thursday immediately sparked talk of building a potential European challenger to Web search leader Google Inc." (Via Slashdot.)

I fear that the EU is suffering from magical thinking of the kind identified by Drew McDermott in his classic Artificial Intelligence and Natural Stupidity (not available online), which should be required reading for everyone studying or investing in this area. Calling some technology "semantic" doesn't make it so. All search engines try to "recognize the meaning of content and place it in its proper context." It's just that doing so accurately and efficiently in general is extremely hard. Significant progress depends on unpredictable research advances, not on predictable development efforts. Putting around 1,000 person-years into a focused project like this creates false expectations and actually hurts basic research in the field.

Competition in search is good. The major search engines have substantial research efforts, as could for instance be seen from their publications at the recent natural-language processing conferences in Prague, and there are several startups exploring new approaches in the field. More research in this area is good. But the EU should have learned from the limited success of big initiatives from EUROTRA to the framework programs that major advances cannot be willed by bureaucratic fiat.

The seeds of current search technology were not in major coordinated development efforts, but in academic research at schools like Berkeley, CMU, Cornell, and Stanford, and in unpredictable benefits from industrial research at Bell Labs, IBM, and PARC in areas like machine learning and information retrieval. None of this work came from a big grand plan, but rather from the initiative of researchers and research managers in exploiting the resources available to them (and it could be argued that the current funding climate in the US, which puts greater emphasis on top-down initiatives and applicability than before, may well reduce the creativity of the research system here). The most important effect of these efforts was not in technologies, but in creating opportunities for creative people (students, faculty, researchers) to play with new ideas and recognize their potential. Without institutional reform in Europe to open up comparable opportunities through increased flexibility in education, research, and funding, much of this $166 million will end up as institutional welfare payments to hidebound universities and corporations, as has been the case for much of the previous EU investment in research and development.

2 comments:

I think your comments make sense, but I believe that Theseus is a German national (ministry of technology) project, rather than an EU project. Thus the relevant comparisons would be projects such as Verbmobil and Smartweb.

About Me

I am VP and Engineering Fellow at Google, where I lead work on natural-language understanding and machine learning. My previous positions include chair of the Computer and Information Science department of the University of Pennsylvania, head of the Machine Learning and Information Retrieval department at AT&T Labs, and research and management positions at SRI International. I received a Ph.D. in Artificial Intelligence from the University of Edinburgh in 1982, and I have over 120 research publications on computational linguistics, machine learning, bioinformatics, speech recognition, and logic programming, as well as several patents. I was elected AAAI Fellow in 1991 for contributions to computational linguistics and logic programming, ACM Fellow in 2010 for contributions to machine-learning models of natural language and biological sequences, and ACL Fellow in 2017 for contributions to sequence modeling, finite-state methods, and dependency and deductive parsing. I was president of the Association for Computational Linguistics in 1993.