Abstract

Words are the building blocks of sentences, yet the meaning of a sentence goes well beyond the meanings of the words therein. Indeed, while we do have dictionaries for words, we don't seem to need them to infer the meaning of a sentence from the meanings of its constituents. Discovering the process of meaning assignment in natural languages is one of the most foundational issues in linguistics and computer science; its findings will increase our understanding of cognition and intelligence and may assist in automating language-related tasks, such as document search as done by Google.

To date, the compositional logical and the distributional probabilistic models have provided two complementary partial solutions to the problem of meaning assignment in natural languages. The logical approach is based on classic ideas from mathematical logic, mainly Frege's principle that the meaning of a sentence can be derived from the relations of the words in it. The distributional model is more recent; it can be related to Wittgenstein's philosophy of 'meaning as use', whereby meanings of words can be determined from their context. The logical models have been the champions on the theory side, whereas in practice their probabilistic rivals have provided the best predictions. This two-sortedness of the defining properties of meaning, 'logical form' versus 'contextual use', has left the question 'what is the foundational structure of meaning?' even more open than before. This project has ambitious and far-reaching goals: it aims to bring together these two complementary concepts to tackle the question, and to do so by bridging the fields of linguistics, computer science, logic, probability theory, category theory, and even physics. Its scope is foundational, multidisciplinary, and interdisciplinary, with an eye towards applications.

Meaning assignment is a dynamic interactive process involving grammar and logic as well as the meanings of words. Both existing approaches to language miss a crucial aspect of this process: the logical model ignores the meanings of words, and the distributional model ignores grammar and logic. We aim to model the entire dynamic process along three strands: integration, foundations, and applications.

(I) In integration we develop a process of meaning assignment that acts with the compositional forms of the logical model on the contextual word-meaning entities of the distributional model.

(II) In foundations, we go beyond classical logical principles of compositionality and context-based models of meaning to develop more fundamental processes of meaning assignments based on novel information-flow techniques, mainly from physics, but also from other linguistic approaches and other models of word meaning, such as ontological domains and conceptual spaces.

(III) In applications, we evaluate our theories against naturally occurring data and apply the results to practical issues based on meaning inference and similarity, e.g. in search. To be able to work with logical connectives in Google, one needs to re-enter them by hand in the 'advanced search' tab, manually decomposing the logical structure of the sentence and, moreover, providing the extra context for their different meanings. This is fundamentally non-compositional and goes against the spirit of automated search. It is exactly here that the lack of compositional methods in meaning assignment causes practical problems and where our compositional methods become useful. Hence, we aim to apply our results to such problems, e.g. using our sentence similarity models for paraphrasing, question answering, and retrieving documents that have the same meaning and/or are about the same subject. Our proposed partnership with Google ensures access to real-life data and supports the implementation and applicability of our methods at small and large scales.

Planned Impact

On the knowledge side, the proposed research will bring significant scientific advances across the disciplines of logic, linguistics, mathematics, physics, and computer science. It will do so by modeling the process of cognition and natural language generation and by developing new mathematics, logic, and high-level diagrammatic tools.

The project has three partners: Computer Science in Cambridge, Cognitive Sciences and Artificial Intelligence in Utrecht, and, from industry, Google. These extend the geographical reach of the project's impact from the UK to Europe and the US, and also from academia to industry. I also have ongoing collaborations with experts from these various disciplines in the UK, Italy, France, the USA, and Canada.

On the economic and social side, the internet, with its huge pool of services and data, has become an inseparable part of our daily lives. The theoretical results of this project will be used to improve the quality of services on the internet. At the moment, documents are identified as bags of their words. If the relationships between the words are also taken into account, language processing tasks such as information retrieval from text and document search will benefit hugely. As a result, new techniques for applications such as question answering and textual entailment will be developed, better answers to online questions will be provided, and more comprehensible summaries of news and articles will be constructed automatically. The partnership with Google and Cambridge is aimed precisely at following and realizing this pathway.

At the other end of the spectrum, the results will help us understand the nature of intelligence and language understanding. This has conceptual importance of its own and will, in due time, improve the quality of human life by facilitating mutual understanding within society and across societies of different languages.

Finally, the proposed theoretical setting is based on using simple diagrammatic techniques to depict mathematical structure and logic. The simplicity and accessibility of these methods give the public a chance to understand advanced academic developments, a chance which will have an impact on educating society. We have held open sessions to introduce Computer Science research to high school students, and especially to girls, in Oxford. The diagrammatic methods and their application areas prompted much discussion with and among the students.

On the academic side, apart from publishing articles and attending established workshops and conferences, I have asked for funding to organize two workshops. These will fill the interdisciplinary gap and bring together researchers from the different disciplines of the project, so that we can discuss and disseminate ideas and results and help start a multi-subject community across these fields. I will also organize the interdisciplinary seminar series of the logic group of the Computing Lab at Oxford. These are open to all academic fields and also to the public. Other impacts are through training and teaching. I have asked for funding for a doctoral student and plan to continue lecturing in my field of expertise based on the project. I have already been invited to give advanced lecture series on the subject in Utrecht and in a Masters course in Cognitive Science at the University of Latvia.

We have had three recent findings: (1) The tensor-based models of compositional distributional semantics, inspired by mathematical models of quantum mechanics, can reason about entailment as well as similarity. (2) The similarity measures used by compositional distributional semantics are mathematically the same as the relevance measures used in Information Retrieval. (3) Adding bialgebras to the setting of compact closed categories enables the underlying compositional distributional semantics to reason about quantifiers.

Exploitation Route

The findings of the first category above can be applied to entailment tasks other than the ones we have considered.

The findings of the second category should be applied to retrieval datasets and have great potential to improve performance of search engines.

For the third category, the theoretical predictions of the model need to be experimentally evaluated.

Sectors

Digital/Communication/Information Technologies (including Software)

Description

In summer 2014, the companies ClotheNetwork and TelRock decided to implement the academic developments of the project (compositional vector models of natural language) as part of the prototype they were building for their Artificial Intelligence product. They hired one of my students as a summer intern and also paid me consultancy fees.

Two datasets of intransitive and transitive pairs of sentences that fully entail each other.

Type Of Material

Database/Collection of data

Provided To Others?

No

Impact

This is again midway between a dataset of pairs of words that entail each other and one of pairs of long sentences that do. It is designed to measure the effectiveness of tensor-based composition operators in entailment tasks.

Title

Fuzzy Entailment

Description

Two datasets: one of intransitive sentences and phrases, and another of transitive sentences, that entail each other. Gold standards for the degrees of entailment between the pairs of phrases and sentences were collected from Amazon Mechanical Turk.

Type Of Material

Database/Collection of data

Provided To Others?

No

Impact

This is a handmade entailment dataset, the first one that goes beyond words but does not consist of full-blown large sentences of over 15 words. It is designed to measure the effectiveness of tensor-composition methods in entailment tasks.

Title

LACL and COLING datasets

Description

Three datasets of subject-verb, verb-object, and subject-verb-object upward entailment.

Type Of Material

Database/Collection of data

Year Produced

2016

Provided To Others?

Yes

Impact

Evaluating entailment at the phrase and sentence level, using compositional tensor-based methods.

This is a dataset of about 600 pairs of subject-verb-object sentences that are related to each other. These 600 were manually chosen from a set of 1000 for which gold-standard human judgements were collected from Amazon Mechanical Turk. The judgements measure the degree of relevance of the two sentences, treating one as a query and the other as a document.

Type Of Material

Database/Collection of data

Provided To Others?

No

Impact

This is the first dataset designed to compare methods from Information Retrieval with methods from Natural Language Processing. It is in its final stages and will soon be made publicly available, and a paper about it will be published as well.

Title

Sentence similarity dataset

Description

This is a set of pairs of transitive sentences whose meanings range from similar to dissimilar.

Type Of Material

Database/Collection of data

Year Produced

2013

Provided To Others?

Yes

Impact

The dataset has been used by other researchers in the community to validate their sentence representation models.

This is a set of pairs of sentences containing ambiguous verbs; the sentences were used to disambiguate the verb. This was the first time transitive sentences were used in this task, and our dataset was the first of its kind.

Type Of Material

Database/Collection of data

Year Produced

2011

Provided To Others?

Yes

Impact

The paper containing the dataset has about 100 citations now and various researchers in the field of natural language processing use this dataset to validate their sentence representation models and techniques.

co-organizing the Workshop on Statistical and Logical Models of Meaning in the 7th North American Summer School in Logic Language Information in Rutgers University NJ, US.
co-supervising a PhD student together

Collaborator Contribution

Knowledge about type-algebras and their applications to linguistics, being a leader and a senior figure of the field and thus helping in mentoring

Impact

we are now in the process of editing the proceedings of the workshop into a volume of the Journal of Language Modelling: http://jlm.ipipan.waw.pl/index.php/JLM

Start Year

2016

Description

Cambridge

Organisation

University of Cambridge

Department

Computer Laboratory

Country

United Kingdom of Great Britain & Northern Ireland (UK)

Sector

Academic/University

PI Contribution

Working on the implementation and experimental part of the theoretical work proposed in my EPSRC project

Collaborator Contribution

Developing a dataset of relative clauses from real large scale corpora and experimenting with it using the theory developed together.

Impact

We are writing a paper together and plan to submit it to a journal.

Start Year

2013

Description

Chieti-Pescara

Organisation

University of Chieti-Pescara

Country

Italy, Italian Republic

Sector

Academic/University

PI Contribution

We work on pregroup models of natural language together, a work that has resulted in the analysis of many languages such as Persian, Sanskrit, Hungarian, and also French and Italian.

Collaborator Contribution

We work on pregroup models of natural language together, a work that has resulted in the analysis of many languages such as Persian, Sanskrit, Hungarian, and also French and Italian.

Impact

Conference and Festschrift papers, listed in the publication section.

Start Year

2010

Description

Elisabetta

Organisation

University of Padova

Country

Italy, Italian Republic

Sector

Academic/University

PI Contribution

Collaborating on two articles

Collaborator Contribution

expertise on the linguistic side of the research

Impact

two working papers

Start Year

2016

Description

LORIA-Nancy

Organisation

National Center for Scientific Research (Centre National de la Recherche Scientifique CNRS)

Department

Lorraine Research Laboratory in Computer Science and its Applications (LORIA)

Country

France, French Republic

Sector

Public

PI Contribution

I was invited to visit Hans van Ditmarsch's group in LORIA for a week. My visit, including meals, travel, and hotel, was fully funded by van Ditmarsch's ERC project CELLO. During this week, I gave a seminar and collaborated with two members of the group.

Collaborator Contribution

The discussions with van Ditmarsch led to plans for organizing a workshop on Application of Modal Logic to Computer Science. The funding application for half of the expenses of the workshop is under review (submitted to CIMPA). Van Ditmarsch's project is to pay for the other half of the expenses.

The collaboration is with 'University of Oxford', not 'University College Oxford' as indicated in the previous field; the drop-down list did not seem to have 'University of Oxford' or 'Oxford University' on its own.
We work on the same categorical models of meaning.

Collaborator Contribution

We work on the same categorical models of meaning. Our collaboration has resulted in coverage by the New Scientist Magazine, under the cover heading 'Quantum Linguistics, a leap forward for artificial intelligence'.

Impact

conference and journal papers, listed in the publication section.

Start Year

2007

Description

Prague

Organisation

Czech Technical University in Prague

Department

Faculty of Electrical Engineering

Country

Czech Republic

Sector

Academic/University

PI Contribution

Giving seminars in Prague and London, and working on a technical report that has led to a recent submission.

I visited Professor Reinhard Muskens and we collaborated on a paper. My expertise was the vector space models of meaning.

Collaborator Contribution

Professor Muskens' expertise was lambda calculus models of natural language. This is the first time that these models have been endowed with vector semantics.

Impact

a paper in final stages.

Start Year

2015

Description

Utrecht

Organisation

University of Utrecht

Department

Department of Languages, Literature and Communication

Country

Netherlands, Kingdom of the

Sector

Academic/University

PI Contribution

co-organizing the Workshop on Statistical and Logical Models of Meaning in the 7th North American Summer School in Logic Language Information in Rutgers University NJ, US.
co-supervising a PhD student together

Collaborator Contribution

Knowledge about type-algebras and their applications to linguistics, being a leader and a senior figure of the field and thus helping in mentoring

Impact

a workshop.
we are now in the process of editing the proceedings of the workshop into a volume of the Journal of Language Modelling: http://jlm.ipipan.waw.pl/index.php/JLM

Start Year

2016

Title

Compositional Distributional Vector Builder

Description

This software was developed by D. Milajevs, the PhD student supported by my grant. It takes different corpora of text and evaluation datasets as input, turns them into vector spaces, then outputs the results of the evaluation in tabular and graphical forms.

Type Of Technology

Software

Year Produced

2016

Impact

Similar software does exist: the ERC project COMPOSES of Marco Baroni from the Center for Mind/Brain Sciences of the University of Trento is one such example. But this is the first time that one can input different corpora and datasets, as well as normalization schemes, compare the performance of models across all the parameters, and display the results graphically.

A press release, press conference or response to a media enquiry/interview

Part Of Official Scheme?

No

Geographic Reach

International

Primary Audience

Media (as a channel to the public)

Results and Impact

I was asked by the Conversation news website to write an article about Wikipedia's robot trained to distinguish malicious articles. I was provided with access to professional article-writing facilities and given an editor. The article was received, and I got good feedback from colleagues who read it.

30 pupils from schools across London attended my master class on Natural Language Processing. The EPSRC-funded PhD student on my grant (D. Milajevs) and I gave a three-hour master class with light theoretical content and hands-on web-based and programming tasks.

30 pupils from the Scurr Highschool in Tower Hamlets attended this masterclass, which I gave. I repeated the material of my Royal Institute master class on Natural Language Processing, with light theoretical content and hands-on activities.

I gave the fifth public Tungsten lecture, on the topic of Quantum Linguistics. The audience included academics as well as business managers from Tungsten. The talk led to a long discussion afterwards, and my team has since been in contact with the Tungsten research team.
