Minimum Redundancy Parts-Of-Speech Data Storage Technique

Publishing Venue

IBM

Related People

Carlgren, RG: AUTHOR

Abstract

This technique minimizes the storage required to represent the basic parts of speech of a dictionary word list. Eight primary parts of speech are used to characterize words in European languages. These are "noun", "verb", "adjective", "adverb", "preposition", "conjunction", "pronoun", and "interjection". Many words can have multiple parts of speech. The number of different combinations of parts of speech for the various words in the English language prevents the use of fewer than eight bits to represent the parts of speech of a word. This technique provides for the representation of parts of speech in an average of much less than eight bits per stored word.

Country

United States

Language

English (United States)

This text was extracted from a PDF file.

This is the abbreviated version, containing approximately
89% of the total text.

This technique minimizes the storage requirement to represent the basic parts
of speech of a dictionary word list. Eight primary parts of speech are used to
characterize words in European languages. These are "noun", "verb",
"adjective", "adverb", "preposition", "conjunction", "pronoun", and "interjection".
Many words can have multiple parts of speech. The number of different
combinations of parts of speech for the various words in the English
language prevents the use of fewer than eight bits to represent the
parts of speech of a word. This technique provides for the representation of
parts of speech in an average of much less than eight bits per stored word. The
technique exploits the frequency distribution of the valid combinations of
parts of speech which occur in European languages.
In English, most words, statistically, can have one or more of the following parts
of speech: "noun", "verb", "adjective", and "adverb". The various combinations of
these parts of speech can be represented in four bits. Because a word is highly
unlikely to have all four or none of these parts of speech, a mask with all four
bits on or all four bits off can be used as a flag to indicate that the actual
parts of speech are encoded in the following eight bits. Hence, a bit mask
representation of all valid parts of speech for a word must be either four bits
long or twelve bits long. It is the s...
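
The four-bit mask with a twelve-bit escape described above can be sketched as follows. This is a minimal illustration, not the disclosure's actual implementation: the bit order, the choice of "1111" as the flag written by the encoder, the use of a plain eight-bit mask for the escaped form, and the names `encode` and `decode` are all assumptions. Bit strings are used instead of packed integers purely for readability.

```python
# Assumed bit order; the source text does not specify one.
COMMON = ("noun", "verb", "adjective", "adverb")
ALL_POS = COMMON + ("preposition", "conjunction", "pronoun", "interjection")

# Assumed flag written by the encoder. The source allows all bits on or
# all bits off as a flag, so the decoder accepts both.
ESCAPE = "1111"
FLAGS = ("0000", "1111")


def _mask(tags, order):
    """Return a bit string with a '1' for each tag present, in the given order."""
    return "".join("1" if pos in tags else "0" for pos in order)


def encode(tags):
    """Encode a set of parts of speech as a 4-bit or 12-bit string."""
    tags = set(tags)
    unknown = tags - set(ALL_POS)
    if unknown:
        raise ValueError(f"unknown parts of speech: {sorted(unknown)}")
    short = _mask(tags, COMMON)
    # Short form: only the four common tags, and the mask is not a flag value.
    if tags <= set(COMMON) and short not in FLAGS:
        return short
    # Long form: flag nibble followed by an eight-bit mask over all tags.
    return ESCAPE + _mask(tags, ALL_POS)


def decode(bits):
    """Decode one entry from the front of a bit string.

    Returns (set_of_tags, number_of_bits_consumed).
    """
    if len(bits) < 4:
        raise ValueError("need at least four bits")
    head = bits[:4]
    if head in FLAGS:
        body = bits[4:12]
        if len(body) != 8:
            raise ValueError("flag nibble must be followed by eight bits")
        order, used = ALL_POS, 12
    else:
        body, order, used = head, COMMON, 4
    return {pos for pos, bit in zip(order, body) if bit == "1"}, used
```

Under these assumptions, a word that is both a noun and a verb is stored in four bits, while a preposition takes the twelve-bit form. Because `decode` reports how many bits it consumed, entries of either length can be read back to back from one stream.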