Vanessa Murdock, Barcelona ES


Patent application number

Description

Published

20090024554

Method For Matching Electronic Advertisements To Surrounding Context Based On Their Advertisement Content - A system for selecting electronic advertisements from an advertisement pool to match the surrounding content is disclosed. To select advertisements, the system takes an approach to content match that focuses on capturing subtler linguistic associations between the surrounding content and the content of the advertisement. The system of the present invention implements this goal by means of simple and efficient semantic association measures dealing with lexical collocations such as conventional multi-word expressions like “big brother” or “strong tea”. The semantic association measures are used as features for training a machine learning model. In one embodiment, a ranking SVM (Support Vector Machine) is trained to identify advertisements relevant to a particular context. The trained machine learning model can then be used to rank advertisements for a particular context by supplying the machine learning model with the semantic association measures for the advertisements and the surrounding context.

01-22-2009

20090089244

METHOD OF DETECTING SPAM HOSTS BASED ON CLUSTERING THE HOST GRAPH - Systems and methods for identifying spam hosts are disclosed in which hosts are known to the system and initially classified as spam or non-spam. Then the hosts are partitioned into clusters based on how each host is linked to other hosts. Each cluster is then analyzed and, depending on the number of spam and non-spam hosts it contains, the cluster may be classified as a spam cluster or a non-spam cluster. The hosts within the cluster may then be reclassified based on the cluster's classification. The results may then be used in many different ways including to filter search results based on host classifications so that spam hosts are not displayed or displayed last in a results set.

04-02-2009
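
The cluster-based reclassification described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the cluster assignment is assumed to have been computed already from the host link structure, and the function name `reclassify_by_cluster` and the `threshold` parameter are hypothetical.

```python
from collections import Counter


def reclassify_by_cluster(labels, clusters, threshold=0.5):
    """Relabel every host with the majority-style label of its cluster.

    labels:    host -> "spam" or "nonspam" (initial classification)
    clusters:  host -> cluster id (assumed to come from a graph-partitioning step)
    threshold: assumed cutoff; a cluster is spam if its spam fraction exceeds it
    """
    sizes = Counter()
    spam_counts = Counter()
    for host, cluster_id in clusters.items():
        sizes[cluster_id] += 1
        if labels[host] == "spam":
            spam_counts[cluster_id] += 1

    reclassified = {}
    for host, cluster_id in clusters.items():
        spam_fraction = spam_counts[cluster_id] / sizes[cluster_id]
        reclassified[host] = "spam" if spam_fraction > threshold else "nonspam"
    return reclassified


initial = {"a": "spam", "b": "spam", "c": "nonspam", "d": "nonspam", "e": "nonspam"}
assignment = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
final = reclassify_by_cluster(initial, assignment)
```

Here host "c" sits in a cluster that is two-thirds spam, so it is relabeled as spam, while cluster 1 stays non-spam.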

20090089285

METHOD OF DETECTING SPAM HOSTS BASED ON PROPAGATING PREDICTION LABELS - Systems and methods for identifying spam hosts are disclosed in which hosts are known to the system and initially classified as spam or non-spam by a baseline classifier. The accuracy of the initial host classifications is then improved by propagating them using a random walk algorithm. The random walk used may be modified in order to obtain a weighted or skewed characterization of the host. The hosts may then be reclassified based on the characterization obtained from the random walk to obtain a final spam/non-spam classification. The final classification may then be used in many different ways including to filter search results based on host classifications so that spam hosts are not displayed or displayed last in a results set.

04-02-2009
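
One way to picture the label propagation in the abstract above is a random walk with restart, in which each host's initial spam score both seeds the walk and acts as the restart distribution. The abstract does not specify the walk, so everything here is an assumption for illustration: the function name `propagate`, the `damping` and `iterations` parameters, and the treatment of hosts without out-links.

```python
def propagate(scores, out_links, damping=0.85, iterations=20):
    """Spread initial spam scores along host links (random walk with restart).

    scores:    host -> initial spam score from a baseline classifier
    out_links: host -> list of hosts it links to (all must appear in scores)
    """
    current = dict(scores)
    for _ in range(iterations):
        # Restart step: a (1 - damping) share always returns to the initial scores.
        nxt = {host: (1 - damping) * scores[host] for host in scores}
        for host in scores:
            targets = out_links.get(host, [])
            if not targets:
                # Assumption: a host without out-links keeps its own mass.
                nxt[host] += damping * current[host]
                continue
            share = damping * current[host] / len(targets)
            for target in targets:
                nxt[target] += share
        current = nxt
    return current


seed = {"s": 1.0, "x": 0.0, "y": 0.0}
links = {"s": ["x"], "x": ["s"]}
walked = propagate(seed, links)
```

Host "x", which is linked from the spam host "s", ends up with a positive score, while the isolated host "y" stays at zero. A final classification would then threshold these scores.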

20090089373

SYSTEM AND METHOD FOR IDENTIFYING SPAM HOSTS USING STACKED GRAPHICAL LEARNING - Systems and methods for identifying spam hosts are disclosed in which hosts are known to the system and initially classified as spam or non-spam by a baseline classifier. Then for each node u in the host graph a new feature is computed. This feature is an aggregate function of the initial classifications produced by the baseline classifier for the neighbors of the node u. The set of neighbors can be defined in many different ways: in-link neighbors, out-link neighbors, bi-directional neighbors, k-hop neighbors, etc. The new feature is then added to the existing set of features, and the baseline classifier is trained again, producing new predictions for each node. The results may then be used in many different ways including to filter search results based on host classifications so that spam hosts are not displayed or displayed last in a results set.

04-02-2009
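
The feature-construction step of the stacked approach above might look like the sketch below. It uses the mean of the neighbors' baseline predictions as the aggregate function, which is only one of the choices the abstract allows. The helper names `neighbor_average` and `augment` are hypothetical, and retraining the classifier on the augmented vectors is left out.

```python
def neighbor_average(predictions, neighbors):
    """For each node u, average the baseline predictions of u's neighbors."""
    feature = {}
    for node in predictions:
        nbrs = neighbors.get(node, [])
        if nbrs:
            feature[node] = sum(predictions[n] for n in nbrs) / len(nbrs)
        else:
            # Assumption: a node with no neighbors falls back to its own prediction.
            feature[node] = predictions[node]
    return feature


def augment(features, extra):
    """Append the new feature to each node's existing feature vector."""
    return {node: vector + [extra[node]] for node, vector in features.items()}


baseline = {"u": 0.0, "v": 1.0, "w": 1.0}           # 1.0 = predicted spam
graph = {"u": ["v", "w"], "v": ["u"]}               # neighbor sets (any definition)
stacked = neighbor_average(baseline, graph)
vectors = augment({"u": [0.2], "v": [0.9], "w": [0.8]}, stacked)
```

The augmented vectors would then be fed back into the same learner to produce the second round of predictions.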

20090112840

Method For Selecting Electronic Advertisements Using Machine Translation Techniques - A system for selecting electronic advertisements from an advertisement pool to match the surrounding content is disclosed. To select advertisements, the system takes an approach to content match that takes advantage of machine translation technologies. The system of the present invention implements this goal by means of simple and efficient machine translation features that are extracted from the surrounding context to match with the pool of potential advertisements. The machine translation features are used as features for training a machine learning model. In one embodiment, a ranking SVM (Support Vector Machine) is trained to identify advertisements relevant to a particular context. The trained machine learning model can then be used to rank advertisements for a particular context by supplying the machine learning model with the machine translation feature measures for the advertisements and the surrounding context.

04-30-2009

20090248662

Ranking Advertisements with Pseudo-Relevance Feedback and Translation Models - Methods, computer products, and systems for selecting advertisements in response to an internet query are provided. The method provides for receiving an internet query that includes query terms, then retrieving and ranking a first set of advertisements in response to the internet query using a query likelihood model. The method then selects sampling words using pseudo-relevance feedback and translation models, the internet query, and the first set of ad materials obtained using the query likelihood model. The sampling words are chosen from a distribution of the words in the first set of ad materials, and the pseudo-relevance feedback model is used to select a word w in the distribution based on the probability that word w generates query term q, p(q|w). The translation model is used to calculate the probability p(q|w) based on the translation probability that w translates into q, t(q|w). The method also includes retrieving and ranking a second set of ad materials using an expanded query formed by adding the selected sampling words to the original internet query. The second set of ad materials is then presented to the user. The use of translation models enhances the topicality of the results because the distribution words selected are related to the terms in the original query as indicated by their translation probabilities.

10-01-2009
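
The query-expansion step in the abstract above can be illustrated with a small sketch. It is an approximation under stated assumptions: the translation table is a plain dictionary keyed by (q, w) pairs, a candidate word is scored by its average t(q|w) over the query terms, and the top k words are taken deterministically rather than sampled. The function name `expand_query` and the parameter `k` are hypothetical.

```python
def expand_query(query_terms, first_pass_ads, translation, k=2):
    """Add up to k words from the first-pass ads to the query.

    translation: dict mapping (q, w) -> t(q|w), an assumed toy translation table
    """
    # Candidate words come only from the ads retrieved in the first pass.
    candidates = {w for ad in first_pass_ads for w in ad.split()} - set(query_terms)

    def score(word):
        # Assumption: average translation probability over all query terms.
        return sum(translation.get((q, word), 0.0) for q in query_terms) / len(query_terms)

    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(candidates, key=lambda w: (-score(w), w))
    chosen = [w for w in ranked[:k] if score(w) > 0]
    return list(query_terms) + chosen


table = {
    ("flights", "airline"): 0.4,
    ("flights", "airfare"): 0.3,
    ("cheap", "discount"): 0.5,
    ("cheap", "low"): 0.2,
}
ads = ["low cost airline tickets", "discount airfare deals"]
expanded = expand_query(["cheap", "flights"], ads, table)
```

The expanded query would then be used for the second retrieval and ranking pass.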

20090265230

RANKING USING WORD OVERLAP AND CORRELATION FEATURES - A system and method for ranking results. The system includes a server configured to receive a query and an advertisement engine configured to receive the query from the server. The advertisement engine ranks advertisements based on various features, including at least one word overlap feature and a correlation feature.

10-22-2009

20090265290

OPTIMIZING RANKING FUNCTIONS USING CLICK DATA - A system for optimizing machine-learned ranking functions based on click data. The system determines the weighting for each feature of a plurality of features according to a learning model based on the click data. The system selects an element from a plurality of elements for display on a web page based on the weighting of each feature of the plurality of features. The system may rank the items to form a list on the web page based on the weighted features in order of inferred relevance according to the online learning model.

10-22-2009

20090271388

ANNOTATIONS OF THIRD PARTY CONTENT - The subject matter disclosed herein relates to creating a search query based on content and subject of a web page, for example. In one particular example, such a search query may be established by a selection of one or more keywords in a web page. Consequently, the search query may be affected by a determination of content and/or a subject of the web page.

10-29-2009

20090298594

MEDIA/TAG-BASED WORD GAMES - A method of creating a word game comprising receiving a seed value from a browser, obtaining from a media database a plurality of words associated with the seed value, creating a word game from at least a subset of the obtained plurality of words, integrating the word game into a browser interpretable document, and, returning the browser interpretable document to the browser. Some embodiments further comprise incorporating into the browser interpretable document an advertisement associated with the seed value and/or at least one of the obtained plurality of words. Also disclosed is a system comprising a gaming server which receives a game request; a media server and media tag database; the gaming server requesting from the media server a set of media tags associated with a game seed value, building a word game using at least a subset of the media tags, and transmitting the word game.

12-03-2009

20100010895

PREDICTION OF A DEGREE OF RELEVANCE BETWEEN QUERY REWRITES AND A SEARCH QUERY - A predictor for determining a degree of relevance between a query rewrite and a search query is provided. The predictor may receive a search query from a user via a terminal and identify a set of candidate query rewrites associated with the search query. The predictor may then extract a set of features from advertisements associated with the query rewrites and the search query and determine a degree of relevance between the advertisements and the search query based on a prediction model. The predictor may then determine the degree of relevance between the rewrites and the search query based on the determined degree of relevance between the advertisements and the search query.

01-14-2010

20100131493

LIGHTNING SEARCH BOOKMARK - Disclosed are methods and apparatus for automatically storing and generating bookmarks. In one embodiment, a search query is received. Information identifying a bookmark representing the search query is automatically stored in association with a set of bookmarks. Search results corresponding to the search query are automatically obtained and provided, where the search results identify one or more documents. When one of the documents is selected, a link to the selected one of the documents is automatically stored in association with the bookmark.

05-27-2010

20100131495

LIGHTNING SEARCH AGGREGATE - Disclosed are methods and apparatus for executing a search query. In accordance with one embodiment, a search query is obtained. The search query is classified into one or more of a plurality of categories. The search query is executed for each of the one or more of the plurality of categories. Search results corresponding to the search query are obtained for each of the one or more of the plurality of categories. The search results are then provided for each of the one or more of the plurality of categories.

05-27-2010

20100235346

MULTI-TIERED SYSTEM FOR SEARCHING LARGE COLLECTIONS IN PARALLEL - The system includes a pre-retrieval predictor which determines, with a certain degree of confidence, which collection to submit the query to. The query is then submitted to either one collection, or multiple collections in parallel. When the results are returned, they are assessed and if they are deemed adequate they are shown to the user. If they are inadequate, the results from the smaller and larger collections are merged and shown to the user. Only if the predictor failed to send the query to more than one collection and the result is not adequate is the query sent to other collections and executed in a sequential fashion. Overall, large scale searching can be accomplished much more efficiently with no degradation in the quality of the retrieved results and a small increase in processing cost.

RANKING ENTITY RELATIONS USING EXTERNAL CORPUS - Exemplary methods and apparatuses are disclosed that may be used to provide or otherwise support ranking entity relations utilizing the vocabulary of at least one external corpus for use in search engine information management systems.

03-24-2011

20110087680

Method for Selecting Electronic Advertisements Using Machine Translation Techniques - A system for selecting electronic advertisements from an advertisement pool to match the surrounding content is disclosed. To select advertisements, the system takes an approach to content match that takes advantage of machine translation technologies. The system of the present invention implements this goal by means of simple and efficient machine translation features that are extracted from the surrounding context to match with the pool of potential advertisements. The machine translation features are used as features for training a machine learning model. In one embodiment, a ranking SVM (Support Vector Machine) is trained to identify advertisements relevant to a particular context. The trained machine learning model can then be used to rank advertisements for a particular context by supplying the machine learning model with the machine translation feature measures for the advertisements and the surrounding context.

04-14-2011

20110173150

METHODS AND SYSTEM FOR ASSOCIATING LOCATIONS WITH ANNOTATIONS - Methods, systems and computer program products for associating geographical locations with annotations corresponding to content. In one method, a language model is developed. The language model is developed from the location information and the one or more annotations associated with content uploaded by users. The language model is based on the probabilistic distribution of locations over one or more annotations. Further, when a user provides one or more annotations, the system and the method may use the language model to identify one or more locations associated with the one or more annotations provided by the user. The language model predicts one or more geographical locations based on the probabilistic distribution of locations over the annotations.

METHOD AND INTERFACE FOR DISPLAYING LOCATIONS ASSOCIATED WITH ANNOTATIONS - Methods, systems and computer program products for displaying geographical locations associated with one or more annotations. In a particular embodiment, a language model is used to obtain the probability distribution of the locations over one or more annotations. Further, the system and the method utilize the probability data obtained from the language model to determine a probability score for each location over the one or more annotations. Subsequently, one or more geographical locations are displayed on a world map, based on the probability score of the geographical locations over the one or more annotations. In one embodiment, geographical locations may be highlighted using a color code on a heat map overlaid on the world map. The color code may represent the ranking of the geographical locations based on the calculated probability score for each identified geographical location. Further, when the user provides one or more additional annotations, the world map may be dynamically updated to display the relevant geographical locations associated with the updated annotations.

07-14-2011
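
The annotation-to-location language model described in the two abstracts above could be approximated by simple co-occurrence counts. The sketch below is an assumption-laden illustration: it estimates P(location | tag) from tagged uploads, sums those estimates over the user's tags, and normalizes the result into a ranking score. The names `train` and `rank_locations` are hypothetical, and the patents do not state that the model is built this way.

```python
from collections import Counter, defaultdict


def train(uploads):
    """uploads: iterable of (location, [tags]) pairs from user-contributed content."""
    counts = defaultdict(Counter)
    for location, tags in uploads:
        for tag in tags:
            counts[tag][location] += 1
    return counts


def rank_locations(counts, tags):
    """Return (location, score) pairs, highest score first; scores sum to 1."""
    scores = Counter()
    for tag in tags:
        per_tag = counts.get(tag)
        if not per_tag:
            continue  # unseen tags contribute nothing
        total = sum(per_tag.values())
        for location, count in per_tag.items():
            scores[location] += count / total  # estimate of P(location | tag)
    norm = sum(scores.values())
    if not norm:
        return []
    return sorted(((loc, s / norm) for loc, s in scores.items()),
                  key=lambda pair: (-pair[1], pair[0]))


model = train([
    ("Paris", ["eiffel", "tower"]),
    ("London", ["tower", "bridge"]),
    ("Paris", ["eiffel"]),
])
ranked = rank_locations(model, ["eiffel", "tower"])
```

The resulting scores are the kind of per-location values that a heat-map display could color-code by rank.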

20120011129

FACETED EXPLORATION OF MEDIA COLLECTIONS - Exemplary methods and apparatuses are disclosed that may be used to provide or otherwise support extraction of objects and facets from one or more extraction corpora and ranking of said facets using multiple ranking corpora.

Method For Matching Electronic Advertisements To Surrounding Context Based On Their Advertisement Content - A system for selecting electronic advertisements from an advertisement pool to match the surrounding content is disclosed. To select advertisements, the system takes an approach to content match that focuses on capturing subtler linguistic associations between the surrounding content and the content of the advertisement. The system of the present invention implements this goal by means of simple and efficient semantic association measures dealing with lexical collocations such as conventional multi-word expressions like “big brother” or “strong tea”. The semantic association measures are used as features for training a machine learning model. In one embodiment, a ranking SVM (Support Vector Machine) is trained to identify advertisements relevant to a particular context. The trained machine learning model can then be used to rank advertisements for a particular context by supplying the machine learning model with the semantic association measures for the advertisements and the surrounding context.

METHOD AND SYSTEM TO IDENTIFY GEOGRAPHICAL LOCATIONS ASSOCIATED WITH QUERIES RECEIVED AT A SEARCH ENGINE - Techniques are provided for predicting locations of users that submit search queries. A query is received at a search engine. An inverted index is searched to identify one or more geographical locations associated with one or more terms of the received query. The inverted index lists a plurality of query terms and one or more geographical locations associated with each query term. Each geographic location that is associated with a listed query term in the inverted index is a determined location for at least one user previously having submitted the listed term in a search query. A geographical location is predicted for a user that submitted the received query based on the identified one or more geographical locations. In this manner, a location is predicted for the user based on similar queries previously submitted by users.

06-28-2012
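
The inverted-index lookup in the abstract above can be sketched as follows. The index maps each query term to the locations of users who previously submitted it. Predicting by a simple vote count across the query's terms is an assumption made for illustration, as are the names `build_index` and `predict_location`.

```python
from collections import Counter, defaultdict


def build_index(logged_queries):
    """logged_queries: iterable of (query text, known user location) pairs."""
    index = defaultdict(Counter)
    for query, location in logged_queries:
        for term in query.lower().split():
            index[term][location] += 1
    return index


def predict_location(index, query):
    """Return the location with the most votes across the query's terms, or None."""
    votes = Counter()
    for term in query.lower().split():
        votes.update(index.get(term, {}))
    if not votes:
        return None
    return votes.most_common(1)[0][0]


log = [
    ("sagrada familia tickets", "Barcelona"),
    ("familia restaurant", "Barcelona"),
    ("tower bridge tickets", "London"),
]
index = build_index(log)
guess = predict_location(index, "sagrada tickets")
```

In this toy log, "sagrada" and "tickets" together give Barcelona two votes and London one, so Barcelona is predicted. A query with no indexed terms yields no prediction.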

20130013628

LIGHTNING SEARCH BOOKMARK - Disclosed are methods and apparatus for automatically storing and generating bookmarks. In one embodiment, a search query is received. Information identifying a bookmark representing the search query is automatically stored in association with a set of bookmarks. Search results corresponding to the search query are automatically obtained and provided, where the search results identify one or more documents. When one of the documents is selected, a link to the selected one of the documents is automatically stored in association with the bookmark.