Automatic, In-Domain, Question/Answer-Set Generation

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a system for automatically generating a set of domain-specific question-answer (QA) pairs from a domain-specific corpus and an existing set of domain-general QA pairs. The output of the system is a high-quality QA set with good coverage suitable for training a QA system to the new domain.

Country

Undisclosed

Language

English (United States)

This text was extracted from a PDF file.

This is the abbreviated version, containing approximately 31% of the total text.

Disclosed is a system for automatically generating a set of domain-specific question-answer (QA) pairs from a domain-specific corpus and an existing set of domain-general QA pairs. The output of the system is a high-quality QA set with good coverage suitable for training a QA system to the new domain. This invention aims to reduce the human element to drastically decrease the time, cost, and expertise needed in question creation.

Statistical QA systems require large quantities of training data in the form of QA pairs. System accuracy is directly correlated to the quantity and quality of the questions provided during the training phase. Currently, creating quality questions is a time-consuming and expensive manual process. Generating QA pairs in order to adapt an existing QA system to handle a new topic domain typically requires upwards of hundreds of person-hours, often from subject matter experts. This expense compounds as clients request handling of multiple new domains in a year.

A system for generating question-answer sets from a set of documents representing a particular domain and a known distribution of question types, the system comprising:

1. Initializing a target distributional specification of question types
2. Pending input of an existing QA corpus, analyzing the existing QA corpus for distributional information
3. Populating the target distributional specification in step 1 with any distributional information from step 2
4. Pending user input of distributional information, modifying the target specification in step 3
5. Receiving as input a document corpus
6. Initializing a set of generated QA pairs
7. Initializing a distributional specification for the generated set in step 6
8. Selecting a question type for generation by sampling from the distributional specification
9. Selecting a source document for question generation by sampling from the corpus
10. Selecting a source section for question generation by sampling from the document
11. Selecting a source paragraph for question generation by conditional sampling from the section
12. Identifying candidate sentences that support generation of the appropriate question type
13. Selecting a source sentence for question generation by sampling from the candidate sentences
14. Generating a question and answer of the selected type from the selected sentence
15. Adding the generated question/answer pair from step 14 to the set of QA pairs in step 6
16. Updating the generated distributional specification in step 7 with information about the QA pair in step 15
17. Repeating steps 8-16
18. At periodic checkpoints in the generation process, comparing the target distributional specification in step 4 with the generated distributional specification in step 7 to identify mismatch
19. Pending a mismatch, generating an intermediate target distributional specification that downweights over-represented categories in the generated specification
20. Modifying step 8 to sample from the intermediate target distribution until the next...
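
The sampling loop in the steps listed above can be illustrated with a minimal Python sketch. It is not the disclosed implementation. Every name in it (generate_qa_set, downweight, make_qa, checkpoint, tolerance, max_attempts) is hypothetical, and so are the nested-list corpus layout, the uniform paragraph sampling used in place of the conditional sampling the disclosure mentions, the mismatch threshold, and the downweighting formula. The question generator itself is left as a caller-supplied function, since the abbreviated text does not describe it.

```python
import random
from collections import Counter


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}


def downweight(target, generated):
    """Build an intermediate target that downweights over-represented types.

    Assumption: a type whose observed share exceeds its target share is
    scaled by target/observed; the disclosure does not give a formula.
    """
    total = sum(generated.values())
    if total == 0:
        return dict(target)
    adjusted = {}
    for qtype, p in target.items():
        observed = generated[qtype] / total
        adjusted[qtype] = p * min(1.0, p / observed) if observed > 0 else p
    return normalize(adjusted)


def generate_qa_set(corpus, target, make_qa, n_pairs,
                    checkpoint=50, tolerance=0.05, seed=0, max_attempts=None):
    """Generate QA pairs whose type distribution tracks `target`.

    corpus:  list of documents; document = list of sections;
             section = list of paragraphs; paragraph = list of sentences.
    target:  dict mapping question type -> probability.
    make_qa: hypothetical callable (qtype, sentence) -> (question, answer),
             or None if the sentence cannot support that question type.
    """
    rng = random.Random(seed)
    qa_pairs = []             # set of generated QA pairs
    generated = Counter()     # distributional record of what was generated
    sampling = dict(target)   # distribution currently used to pick a type
    if max_attempts is None:
        max_attempts = 100 * n_pairs  # guard against corpora with no candidates
    attempts = 0
    while len(qa_pairs) < n_pairs and attempts < max_attempts:
        attempts += 1
        # Pick a question type, then drill down document -> section -> paragraph.
        qtype = rng.choices(list(sampling), weights=list(sampling.values()))[0]
        document = rng.choice(corpus)
        section = rng.choice(document)
        paragraph = rng.choice(section)  # uniform here, not conditional
        # Keep only sentences that can support the chosen question type.
        candidates = [s for s in paragraph if make_qa(qtype, s) is not None]
        if not candidates:
            continue
        sentence = rng.choice(candidates)
        qa_pairs.append(make_qa(qtype, sentence))
        generated[qtype] += 1
        # Periodic checkpoint: compare generated shares with the target.
        if len(qa_pairs) % checkpoint == 0:
            total = sum(generated.values())
            mismatch = any(abs(generated[t] / total - p) > tolerance
                           for t, p in target.items())
            sampling = downweight(target, generated) if mismatch else dict(target)
    return qa_pairs, generated
```

In this sketch the intermediate distribution stays in effect only until the next checkpoint, where it is either recomputed or reset to the original target. A real system would also need the corpus-analysis and user-override steps that populate the target specification, which are omitted here.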