Rules and filtering

Premium operators

Below are the operators available in real-time and historical PowerTrack.

A subset of these operators is available with the premium and enterprise search APIs. See this table for a product-by-product list of available operators.

Operator

Description

keyword

Matches a keyword within the body of a Tweet. This is a tokenized match, meaning that your keyword string will be matched against the tokenized text of the Tweet body – tokenization is based on punctuation, symbol, and separator Unicode basic plane characters. For example, a Tweet with the text “I like coca-cola” would be split into the following tokens: I, like, coca, cola. These tokens would then be compared to the keyword string used in your rule. To match strings containing punctuation (e.g. coca-cola), symbol, or separator characters, you must use a quoted exact match as described below.
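
The tokenized match described above can be approximated in a few lines of Python. This is only a sketch, not the production tokenizer: it assumes that the Unicode general categories P* (punctuation), S* (symbol), and Z* (separator), plus ordinary whitespace, are the token boundaries, and it compares tokens verbatim. The function names tokenize and keyword_matches are hypothetical.

```python
import unicodedata


def tokenize(text: str) -> list[str]:
    """Split text on punctuation, symbol, and separator characters."""
    tokens, current = [], []
    for ch in text:
        # Assumption: categories starting with P, S, or Z act as boundaries.
        if ch.isspace() or unicodedata.category(ch)[0] in "PSZ":
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def keyword_matches(keyword: str, text: str) -> bool:
    """A keyword matches only if it equals one whole token of the text."""
    return keyword in tokenize(text)


print(tokenize("I like coca-cola"))
print(keyword_matches("cola", "I like coca-cola"))
print(keyword_matches("coca-cola", "I like coca-cola"))
```

Under these assumptions the hyphenated string never appears as a single token, which is why a quoted exact match is needed for coca-cola.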

emoji

Matches an emoji within the body of a Tweet. Emojis are a tokenized match, meaning that your emoji will be matched against the tokenized text of the Tweet body – tokenization is based on punctuation, symbol/emoji, and separator Unicode basic plane characters. For example, a Tweet with the text “I like 🍕” would be split into the following tokens: I, like, 🍕. These tokens would then be compared to the emoji used in your rule. Note that if an emoji has a variant, you must enclose it in double quotes to add it to a rule.

"exact phrase match"

Matches an exact phrase within the body of a Tweet.

Note: In 30 Day Search and Full Archive Search, punctuation is not tokenized and is instead treated as whitespace.

e.g. quoted “#hashtag” will match “hashtag” but not #hashtag (use the hashtag # operator without quotes to match on actual hashtags).

e.g. quoted “$cashtag” will match “cashtag” but not $cashtag (use the cashtag $ operator without quotes to match on actual cashtags).

#

Matches any Tweet with the given hashtag.

This operator performs an exact match, NOT a tokenized match, meaning the rule “#2016” will match posts with the exact hashtag “2016”, but not those with the hashtag “2016election”.

Note that the hashtag operator relies on Twitter’s entity extraction to match hashtags, rather than extracting the hashtag from the body itself. See HERE for more information on Twitter Entities JSON attributes.
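
To illustrate the difference between entity-based matching and scanning the body, the sketch below checks a Tweet payload's entities.hashtags array, where each entry carries the hashtag text without the leading # character. The helper name has_hashtag is hypothetical, and the comparison is case-sensitive purely to keep the example small.

```python
def has_hashtag(tweet: dict, hashtag: str) -> bool:
    """Exact match against the hashtags extracted into the entities object."""
    extracted = tweet.get("entities", {}).get("hashtags", [])
    return any(entry.get("text") == hashtag for entry in extracted)


tweet = {
    "text": "Counting down to #2016election",
    "entities": {"hashtags": [{"text": "2016election", "indices": [17, 30]}]},
}

print(has_hashtag(tweet, "2016election"))
print(has_hashtag(tweet, "2016"))
```

Because the whole extracted hashtag is compared, 2016 does not match a Tweet whose only hashtag is 2016election, and a payload without extracted entities never matches, whatever its body text contains.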

@

Matches any Tweet that mentions the given username.

The to: operator returns a subset match of the @mention operator.

The value can be either the username (excluding the @ character) or the user’s numeric Account ID. See HERE or HERE for methods for looking up numeric Twitter Account IDs.

"keyword1 keyword2"~N

Commonly referred to as a proximity operator, this matches a Tweet where the keywords are no more than N tokens from each other.

If the keywords are in the opposite order, they cannot be more than N-2 tokens from each other.

Can have any number of keywords in quotes.

N cannot be greater than 6.
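
The proximity rules above can be sketched for the two-keyword case as follows. The text does not define exactly how distance is counted, so this sketch assumes it is the difference between token positions; the function name proximity_match and that distance definition are assumptions made for illustration only.

```python
MAX_N = 6  # the documented upper bound for N


def proximity_match(tokens: list[str], first: str, second: str, n: int) -> bool:
    """Check "first second"~n against a list of tokens."""
    if n > MAX_N:
        raise ValueError("N cannot be greater than 6")
    first_positions = [i for i, tok in enumerate(tokens) if tok == first]
    second_positions = [i for i, tok in enumerate(tokens) if tok == second]
    for i in first_positions:
        for j in second_positions:
            # Same order as written in the rule: at most n positions apart.
            if j > i and j - i <= n:
                return True
            # Opposite order: the tighter limit of n - 2 applies.
            if j < i and i - j <= n - 2:
                return True
    return False


tokens = "the quick brown fox".split()
print(proximity_match(tokens, "quick", "fox", 2))
print(proximity_match(tokens, "fox", "quick", 2))
print(proximity_match(tokens, "fox", "quick", 4))
```

With these assumptions, reversing the keyword order requires a larger N to match the same text.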

contains:

Substring match for Tweets that have the given substring in the body, regardless of tokenization. In other words, this does a pure substring match and does not consider word boundaries.

Use double quotes to match substrings that contain whitespace or punctuation.

from:

Matches any Tweet from a specific user.

The value must be the user’s Twitter numeric Account ID or username (excluding the @ character). See HERE or HERE for methods for looking up numeric Twitter Account IDs.

to:

Matches any Tweet that is in reply to a particular user.

The value must be the user’s numeric Account ID or username (excluding the @ character). See HERE or HERE for methods for looking up numeric Twitter Account IDs.

url:

Performs a tokenized (keyword/phrase) match on the expanded URLs of a Tweet (similar to url_contains). Tokens and phrases containing punctuation or special characters should be double-quoted. E.g. url:"/developer". While generally not recommended, if you want to match on a specific protocol, enclose in double-quotes: url:"https://developer.twitter.com".

url_title:

Performs a keyword/phrase match on the (new) expanded URL HTML title metadata. See HERE for more information on expanded URL enrichment.

url_description:

Performs a keyword/phrase match on the (new) expanded page description metadata. See HERE for more information on expanded URL enrichment.

url_contains:

Matches Tweets with URLs that literally contain the given phrase or keyword. To search for patterns with punctuation in them (e.g. google.com), enclose the search term in quotes.

NOTE: If you’re using the Expanded URL output format, we will match against the expanded URL as well.

bio:

Matches a keyword or phrase within the user bio of a Tweet. This is a tokenized match within the contents of the 'description' field within the User object.

bio_name:

Matches a keyword within the user bio name of a Tweet. This is a tokenized match within the contents of a user’s “name” field within the User object.

bio_location:

Matches tweets where the User object's location contains the specified keyword or phrase. This operator performs a tokenized match, similar to the normal keyword rules on the message body.

This location is part of the User object and is the account's 'home' location. It is a non-normalized, user-generated, free-form string, and is different from a Tweet's location (when available).

statuses_count:

Matches Tweets when the author has posted a number of statuses that falls within the given range.

If a single number is specified, any number equal to or higher will match.

Additionally, a range can be specified to match any number in the given range (e.g., statuses_count:1000..10000).

followers_count:

Matches Tweets when the author has a followers count within the given range.

If a single number is specified, any number equal to or higher will match.

Additionally, a range can be specified to match any number in the given range (e.g., followers_count:1000..10000).

friends_count:

Matches Tweets when the author has a friends count (the number of users they follow) that falls within the given range.

If a single number is specified, any number equal to or higher will match.

Additionally, a range can be specified to match any number in the given range (e.g., friends_count:1000..10000).

listed_count:

Matches Tweets when the author has been listed on Twitter a number of times that falls within the given range.

If a single number is specified, any number equal to or higher will match.

Additionally, a range can be specified to match any number in the given range (e.g., listed_count:10..100).
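
The four count operators above (statuses_count:, followers_count:, friends_count:, listed_count:) share the same value syntax. The sketch below shows one way to interpret such a value; it assumes that both ends of a range are inclusive, which the text implies but does not state outright. The function names are hypothetical.

```python
from typing import Optional, Tuple


def parse_count_range(value: str) -> Tuple[int, Optional[int]]:
    """Parse '1000' or '1000..10000' into (low, high); high is None if open-ended."""
    low, separator, high = value.partition("..")
    if not separator:
        return int(low), None
    return int(low), int(high)


def count_matches(value: str, count: int) -> bool:
    """Apply a count operator value to an author's actual count."""
    low, high = parse_count_range(value)
    # Assumption: range bounds are inclusive on both ends.
    return count >= low and (high is None or count <= high)


print(count_matches("1000", 25000))          # single number: equal or higher
print(count_matches("1000..10000", 25000))   # range: outside the upper bound
```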

$

Matches any Tweet that contains the specified ‘cashtag’ (where the leading character of the token is the ‘$’ character).

Note that the cashtag operator relies on Twitter’s ‘symbols’ entity extraction to match cashtags, rather than trying to extract the cashtag from the body itself. See HERE for more information on Twitter Entities JSON attributes.

retweets_of:

Matches Tweets that are Retweets of a specified user. Accepts both usernames and numeric Twitter Account IDs (NOT tweet status IDs).
See HERE or HERE for methods for looking up numeric Twitter Account IDs.

retweets_of_status_id:

Deliver only explicit Retweets of the specified Tweet. Note that the status ID used should be the ID of an original Tweet and not a Retweet.

in_reply_to_status_id:

Deliver only explicit replies to the specified Tweet.

sample:

Returns a random sample of Tweets that match a rule rather than the entire set of Tweets. Sample percent must be represented by an integer value between 1 and 100. This operator applies to the entire rule and requires any “OR’d” terms be grouped.

Important Note: The sample operator first reduces the scope of the firehose to X%, then the rule/filter is applied to that sampled subset. If you are using, for example, sample:10, each Tweet has a 10% chance of being in the sample.

Also, the sampling is deterministic, and you will get the same data sample in realtime as you would if you pulled the data historically.
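
The two properties described above (sampling happens before the rule is applied, and the sample is deterministic) can be illustrated with a toy model. The real sampling function is not described here; hashing the Tweet ID into 100 buckets is purely an assumption for illustration, and in_sample and sampled_matches are hypothetical names.

```python
import hashlib


def in_sample(tweet_id: int, percent: int) -> bool:
    """Deterministically place a Tweet in or out of an X% sample."""
    if not 1 <= percent <= 100:
        raise ValueError("sample percent must be an integer between 1 and 100")
    # Assumption: a stable hash of the ID picks one of 100 buckets.
    digest = hashlib.sha256(str(tweet_id).encode("ascii")).hexdigest()
    return int(digest, 16) % 100 < percent


def sampled_matches(tweets, percent, rule):
    """Sample first, then apply the rule to the sampled subset."""
    return [t for t in tweets if in_sample(t["id"], percent) and rule(t)]


tweets = [{"id": i, "text": "gold" if i % 2 == 0 else "silver"} for i in range(1000)]
is_gold = lambda t: "gold" in t["text"]

first_pass = sampled_matches(tweets, 10, is_gold)
second_pass = sampled_matches(tweets, 10, is_gold)
print(first_pass == second_pass)  # the same Tweets are selected on every run
```

Because membership depends only on the Tweet itself, a real-time pass and a later historical pass over the same data select the same Tweets in this model.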

source:

Matches any Tweet generated by the given source application. The value must be either the name of the application or the application’s URL. Cannot be used alone.

lang:

Matches Tweets that have been classified by Twitter as being of a particular language (if, and only if, the tweet has been classified). It is important to note that each Tweet is currently only classified as being of one language, so AND’ing together multiple languages will yield no results.

Note: if no language classification can be made the provided result is ‘und’ (for undefined).

The list below represents the currently supported languages and their corresponding BCP 47 language identifier:

Amharic - am      Hungarian - hu     Portuguese - pt
Arabic - ar       Icelandic - is     Romanian - ro
Armenian - hy     Indonesian - in    Russian - ru
Bengali - bn      Italian - it       Serbian - sr
Bulgarian - bg    Japanese - ja      Sindhi - sd
Burmese - my      Kannada - kn       Sinhala - si
Chinese - zh      Khmer - km         Slovak - sk
Czech - cs        Korean - ko        Slovenian - sl
Danish - da       Lao - lo           Sorani Kurdish - ckb
Dutch - nl        Latvian - lv       Spanish - es
English - en      Lithuanian - lt    Swedish - sv
Estonian - et     Malayalam - ml     Tagalog - tl
Finnish - fi      Maldivian - dv     Tamil - ta
French - fr       Marathi - mr       Telugu - te
Georgian - ka     Nepali - ne        Thai - th
German - de       Norwegian - no     Tibetan - bo
Greek - el        Oriya - or         Turkish - tr
Gujarati - gu     Panjabi - pa       Ukrainian - uk
Haitian - ht      Pashto - ps        Urdu - ur
Hebrew - iw       Persian - fa       Uyghur - ug
Hindi - hi        Polish - pl        Vietnamese - vi
                                     Welsh - cy
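
Since each Tweet is classified with a single language, a rule that targets several languages has to OR the lang: clauses together rather than AND them. A minimal sketch of building such a grouped clause follows; the helper name language_clause and the keyword pizza are made up for the example.

```python
def language_clause(codes: list[str]) -> str:
    """Build a lang: clause, OR-ing and grouping when several codes are given."""
    if not codes:
        raise ValueError("at least one language code is required")
    clauses = [f"lang:{code}" for code in codes]
    if len(clauses) == 1:
        return clauses[0]
    # Group the OR'd terms in parentheses so they combine as one clause.
    return "(" + " OR ".join(clauses) + ")"


rule = f"pizza {language_clause(['en', 'es'])}"
print(rule)
```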

place:

Matches Tweets tagged with the specified location or Twitter place ID (see examples). Multi-word place names (“New York City”, “Palo Alto”) should be enclosed in quotes.

Note: See the GET geo/search public API endpoint for how to obtain Twitter place IDs.

Note: Operators matching on place (Tweet geo) will only include matches from original tweets. Retweets do not contain any place data.

place_country:

Matches Tweets where the country code associated with a tagged place/location matches the given ISO alpha-2 character code.

Note: Operators matching on place (Tweet geo) will only include matches from original tweets. Retweets do not contain any place data.

point_radius:[lon lat radius]

Matches against the Exact Location (x,y) of the Tweet when present, and in Twitter, against a “Place” geo polygon, where the Place is fully contained within the defined region.

Units of radius supported are miles (mi) and kilometers (km).

Radius must be less than 25mi.

Longitude is in the range of ±180.

Latitude is in the range of ±90.

All coordinates are in decimal degrees.

Rule arguments are contained within brackets, space delimited.

Note: Operators matching on place (Tweet geo) will only include matches from original tweets. Retweets do not contain any place data.
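
The constraints listed for point_radius can be checked before a rule is submitted. The sketch below is a hypothetical helper, not part of any Twitter library. It assumes the unit is written as a suffix directly after the radius value, and that a radius given in kilometers is subject to the same 25-mile limit after conversion.

```python
MAX_RADIUS_MI = 25.0
KM_PER_MILE = 1.609344


def point_radius(lon: float, lat: float, radius: float, unit: str = "mi") -> str:
    """Validate the arguments and build a point_radius:[lon lat radius] clause."""
    if unit not in ("mi", "km"):
        raise ValueError("unit must be 'mi' or 'km'")
    if not -180 <= lon <= 180:
        raise ValueError("longitude must be within ±180")
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be within ±90")
    # Assumption: the 25mi limit applies to km values after conversion.
    radius_mi = radius if unit == "mi" else radius / KM_PER_MILE
    if not 0 < radius_mi < MAX_RADIUS_MI:
        raise ValueError("radius must be less than 25mi")
    # Arguments go inside brackets, space delimited, longitude first.
    return f"point_radius:[{lon} {lat} {radius}{unit}]"


print(point_radius(-105.27346, 40.01924, 10))
print(point_radius(2.355128, 48.861118, 16, "km"))
```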

bounding_box:[west_long south_lat east_long north_lat]

Matches against the Exact Location (long, lat) of the Tweet when present, and in Twitter, against a “Place” geo polygon, where the Place is fully contained within the defined region.

west_long and south_lat represent the southwest corner of the bounding box, where west_long is the longitude of that point, and south_lat is the latitude.

east_long and north_lat represent the northeast corner of the bounding box, where east_long is the longitude of that point, and north_lat is the latitude.

Width and height of the bounding box must be less than 25mi.

Longitude is in the range of ±180.

Latitude is in the range of ±90.

All coordinates are in decimal degrees.

Rule arguments are contained within brackets, space delimited.

Note: Operators matching on place (Tweet geo) will only include matches from original tweets. Retweets do not contain any place data.
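
Checking the 25-mile limit for a bounding box requires converting degrees into distance. The sketch below uses the haversine great-circle formula with a mean Earth radius of 3958.8 miles. How the service itself measures width and height is not specified here, so measuring the width along the longer of the two horizontal edges is an assumption, as is the helper name bounding_box. Boxes that cross the antimeridian are not handled.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles
MAX_SIDE_MI = 25.0


def miles_between(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def bounding_box(west_long: float, south_lat: float, east_long: float, north_lat: float) -> str:
    """Validate the corners and build a bounding_box:[...] clause."""
    if not all(-180 <= lon <= 180 for lon in (west_long, east_long)):
        raise ValueError("longitude must be within ±180")
    if not all(-90 <= lat <= 90 for lat in (south_lat, north_lat)):
        raise ValueError("latitude must be within ±90")
    if not (west_long < east_long and south_lat < north_lat):
        raise ValueError("expected southwest corner first, then northeast corner")
    # Assumption: width is taken along the longer horizontal edge.
    width = max(
        miles_between(west_long, south_lat, east_long, south_lat),
        miles_between(west_long, north_lat, east_long, north_lat),
    )
    height = miles_between(west_long, south_lat, west_long, north_lat)
    if width >= MAX_SIDE_MI or height >= MAX_SIDE_MI:
        raise ValueError("width and height must be less than 25mi")
    return f"bounding_box:[{west_long} {south_lat} {east_long} {north_lat}]"


print(bounding_box(-105.301758, 39.964069, -105.178505, 40.09455))
```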

profile_country:

Exact match on the “countryCode” field from the “address” object in the Profile Geo enrichment.

Uses a normalized set of two-letter country codes, based on the ISO-3166-1-alpha-2 specification. For conciseness, this operator is provided in lieu of an operator for the “country” field from the “address” object.

profile_region:

Matches on the “region” field from the “address” object in the Profile Geo enrichment.

This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation.

profile_locality:

Matches on the “locality” field from the “address” object in the Profile Geo enrichment.

This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation.

profile_subregion:

Matches on the “subRegion” field from the “address” object in the Profile Geo enrichment. In addition to targeting specific counties, these operators can be helpful to filter on a metro area without defining filters for every city and town within the region.

This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation.

NOTE: The ‘is:’ and ‘has:’ operators cannot be used as standalone operators and must be combined with another clause (e.g. @TwitterDev has:links).
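
A simple pre-flight check for this constraint might look like the sketch below. It is deliberately naive: it splits the rule on whitespace, so it does not understand quoted phrases or nesting, and it only knows about the is: and has: prefixes. The function name has_standalone_clause is hypothetical.

```python
def has_standalone_clause(rule: str) -> bool:
    """Return True if at least one term is something other than is:/has:."""
    for term in rule.split():
        # Drop grouping parentheses and a leading negation before inspecting.
        term = term.lstrip("-(").rstrip(")")
        if term in ("", "OR", "AND"):
            continue
        if not term.startswith(("is:", "has:")):
            return True
    return False


print(has_standalone_clause("@TwitterDev has:links"))
print(has_standalone_clause("has:links"))
print(has_standalone_clause("-is:retweet has:geo"))
```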

has:geo

Matches Tweets that have Tweet-specific geolocation data provided from Twitter. This can be either “geo” lat-long coordinate, or a “location” in the form of a Twitter “Place”, with the corresponding display name, geo polygon, and other fields.

Note: Operators matching on place (Tweet geo) will only include matches from original tweets. Retweets do not contain any place data.

has:profile_geo

Matches Tweets that have any Profile Geo metadata, regardless of the actual value.

has:links

This operator matches Tweets which contain links in the message body.

is:retweet

Deliver only explicit Retweets that match a rule. Can also be negated to exclude Retweets that match a rule from delivery, so that only original content is delivered.

This operator looks only for true Retweets, which use Twitter’s Retweet functionality. Quoted Tweets and Modified Tweets which do not use Twitter’s Retweet functionality will not be matched by this operator.

is:quote

Delivers only explicit Quote Tweets that match a rule. Can also be negated to exclude Quote Tweets that match a rule from delivery.

is:verified

Deliver only Tweets where the author is “verified” by Twitter. Can also be negated to exclude Tweets where the author is verified.

is:reply

Deliver only replies that match a rule. It can also be negated to exclude delivery of replies that match the specified rule. This operator matches on replies in original Tweets, as well as replies in quoted Tweets and Retweets. You can use is:reply in conjunction with is:retweet and is:quote to only deliver replies to original Tweets.

-is:nullcast

Negation only. Excludes Tweets that are nullcasted (e.g., Tweets that contain the "scopes": {"followers": false} object). For more info on Nullcasted Tweets, see here.

Note: Must be used at highest level of rule when used with the Search API.
Example: (gold AND silver) -is:nullcast