Returns the index value in the input string where a specified matching string is located. Search is conducted left-to-right.

Input can be specified as a column reference or a string literal, although string literal usage is rare.

A column reference can refer to a column of String, Object, or Array type, which makes the FIND function useful for filtering data before it has been completely un-nested into tabular data.

Returned value is from the beginning of the string, regardless of the string index value.

If no match is found, the function returns a null value.

If you need to determine if a value is in an array or not, you can use the MATCHES function, which returns a true/false response. See MATCHES Function.

You can also search a string from the right. For more information, see RIGHTFIND Function.

Basic Usage

Column reference example:

derive type:single value: FIND(MyName,'find this',true,0)

Output: Searches the MyName column value for the string find this from the beginning of the value, ignoring case. If a match is found, the index value where the string is located is written to the new column.

String literal example:

derive type:single value: FIND('Hello, World','lo',false,2)

Output: Searches the string Hello, World for the string lo, in a case-sensitive search, beginning at the third character in the string. Since the match is found at the fourth character, the value 3 is written to the new column.

Output: Searches the SearchPool column value for the string FindIt from the beginning of the value (default). Default behavior is to not ignore case. If the string is found, the new column Finds contains the value found it. Otherwise, the column is empty.

ignore_case

If true, the FIND function ignores case when trying to match the string literal or pattern value.

Default value is false, which means that case-sensitive matching is performed by default.

Usage Notes:

Required?

Data Type

Example Value

No

Boolean

true

start_index

The index of the character in the column or string literal value at which to begin the search. For example, a value of 2 instructs the FIND function to begin searching from the third character in the column or string value.

NOTE: Index values begin at 0. If not specified, the default value is 0, which searches the entire string.

Value must be a non-negative integer value.

If this value is greater than the length of the string, then no match is possible.

Usage Notes:

Required?

Data Type

Example Value

No

Integer (non-negative)

2

Examples

Example - Locate product purchases in transaction stream

Source:

You have the simplified transaction stream listed below in which master information about a transaction (TransactionId and CustomerId) is paired with order detail information that is brought into the application as an array in the OrderDetail column. The array column contains information about product ID, quantity, and the type of transaction.

TransactionId

CustomerId

OrderDetail

12312312

100023

[{"ProdId":"54321","Qty":"5","TransType":"PURCHASE"}]

12312313

100045

[{"ProdId":"94105","Qty":"12","TransType":"PURCHASE"}]

12312314

100066

[{"ProdId":"54321","Qty":"1","TransType":"TEST"}]

12312315

100068

[{"ProdId":"85858","Qty":"9","TransType":"PURCHASE"}]

The transaction stream includes test transactions, which are identified by the value TEST for TransType in the detail column. You want to remove these transactions early in the process, which should simplify your dataset and speed up its processing.

Transform:

First, you must identify the records that contain the test transaction value. The following transform generates a new column containing true/false values for whether the value "TEST" appears in the OrderDetail transform.

Tip: You should include the double-quotes around the value, in case the other fields in the array could contain some version of the value TEST. Note that the double quotes need to be escaped, as in the value below.

derive type:single value:FIND(OrderDetail, '\"TEST\"', false, 0)

When the step is added to the recipe, the find_OrderDetail column is generated, containing the index value returned by the FIND function. In this case, there is only one row that contains a value: 42 for the third transaction.

You can then add the following step to keep the rows where the FIND function returned a null value in the find_OrderDetail column: