(Cat? OR feline) AND NOT dog?
Cat? W/5 behavior
(Cat? OR feline) AND traits
Cat AND charact*

This guide provides a more detailed description of the syntax that is supported along with examples.

This search box also supports the look-up of an IP.com Digital Signature (also referred to as Fingerprint); enter the 72-, 48-, or 32-character code to retrieve details of the associated file or submission.

Concept Search - What can I type?

For a concept search, you can enter phrases, sentences, or full paragraphs in English. For example, copy and paste the abstract of a patent application or paragraphs from an article.

Concept search eliminates the need for complex Boolean syntax to inform retrieval. Our Semantic Gist engine uses advanced cognitive semantic analysis to extract the meaning of data. This reduces the chances of missing valuable information, that may result from traditional keyword searching.

Method of Estimating the Size of a Set

Publishing Venue

IBM

Related People

Wegman, MN: AUTHOR

Abstract

The problem solved is: Given any collection of names, estimate the number of unique names in the collection, making only one pass over the data and using a very small amount of memory. The problem is given a fixed amount of memory and a small amount of computing to answer any sequence of requests of either the form "Union an element x into a set", or of the form "How many elements are there in the set". Unioning the same element twice does not change the number of elements. We propose a technique which will give a close estimate to the number of elements in the set. The naive technique is to keep a list of all emements in the set and, when a union request comes along, check to see if the element is already in the set. If not, add it to the list. The length of the list is the number of elements in it.

Country

United States

Language

English (United States)

This text was extracted from a PDF file.

This is the abbreviated version, containing approximately
51% of the total text.

Page 1 of 2

Method of Estimating the Size of a Set

The problem solved is: Given any collection of names, estimate the number of
unique names in the collection, making only one pass over the data and using a
very small amount of memory. The problem is given a fixed amount of memory
and a small amount of computing to answer any sequence of requests of either
the form "Union an element x into a set", or of the form "How many elements are
there in the set". Unioning the same element twice does not change the number
of elements. We propose a technique which will give a close estimate to the
number of elements in the set. The naive technique is to keep a list of all
emements in the set and, when a union request comes along, check to see if the
element is already in the set. If not, add it to the list. The length of the list is the
number of elements in it. Elements are chosen from some universe of elements.
Suppose we divided the universe in half. To keep track of elements in a set, we
would keep only a list of those elements in the set which were in the first half of
the universe. The list would be approximately half as long as the previous
technique, and by doubling the number of elements in the list, we would have a
good estimate of the number of elements in the set. If the list again became too
long, we could restrict our attention to only one quarter of the universe, cut the
size of the list in half (approximately) and now multiply by four to get an estimate
of the actual size of the set. This process may be continued. Every time the list
gets too long, halve the size of the universe, and double the multiplier. The way
in which the universe is partitioned is important. For any such division, there are
sets which will not be estimated correctly. A set of elements completely
contained in the segment of the universe being recorded will be overestimated,
and the set of elements not in that segment will be under-estimated. Rather than
using one fixed partition of the universe, we may look at a class of partitions and
determine how good an estimate is provided by members of the class. A class of
partitions defines a set of possibly overlapping segments, namely, those
segments that the universe is broken into b...