Using the Python Operator

Introduction

The Python operator lets the application designer execute arbitrary Python code within StreamBase applications. Its purpose
is to let Python-centric teams reuse their existing code for event processing without major rewrites. This includes
running models produced with SciPy and TensorFlow.

Stateful Python sessions are attached as child processes to StreamBase applications. Python operators interact with these
sessions by setting input variables, executing the script, and reading output variables. The operator guarantees that these
three operations are executed together, in sequence, even if multiple operator instances use the same session or the
operator is running in asynchronous mode.
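
The set, execute, and read cycle can be modeled in plain Python. The following is a minimal conceptual sketch, not the operator's actual implementation: the Session class and its call method are hypothetical names, and a threading.Lock stands in for whatever mechanism the operator uses to keep the three steps of one call together.

```python
import threading


class Session:
    """Hypothetical stand-in for one stateful Python session."""

    def __init__(self):
        self._namespace = {}            # variables survive between calls
        self._lock = threading.Lock()   # serializes whole set/exec/read cycles

    def call(self, input_vars, script, output_names):
        # Hold the lock for all three steps so that cycles issued by
        # different callers sharing this session never interleave.
        with self._lock:
            self._namespace.update(input_vars)      # 1. set input variables
            exec(script, self._namespace)           # 2. execute the script
            return {name: self._namespace[name]     # 3. read output variables
                    for name in output_names}


session = Session()
session.call({}, "counter = 0", [])                 # initialization call
result = session.call({"step": 5}, "counter += step", ["counter"])
print(result)  # {'counter': 5}
```

Because the namespace persists, the second call sees the counter variable created by the first one, which is the sense in which a session is stateful.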

Python operators support any Python runtime compatible with Python 2.7 or 3.x. The script code must be valid for the runtime
in use; that is, it may use only the libraries and language constructs available in that runtime. The operator treats the
script code as opaque and does not attempt to parse or compile it before sending it to the runtime. In return, the full
power of the selected runtime (libraries, Java classes in Jython, .NET access in IronPython) is accessible from the script.

Python Compatibility

The operator integration layer uses a minimal set of language constructs that are compatible with both Python 2.7 and
Python 3.x. It requires only a pickle library and TCP/IP networking.
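
To show why these two requirements are sufficient for exchanging variables with a session, here is a minimal sketch that sends one pickled dictionary over a loopback TCP connection. The wire format used by the integration layer is not documented here; the 4-byte length prefix, the choice of pickle protocol 2, and the helper names send_obj, recv_exact, and recv_obj are all assumptions made for illustration.

```python
import pickle
import socket
import struct


def send_obj(sock, obj):
    """Send one pickled object, prefixed with its 4-byte length."""
    # Protocol 2 can be read by both Python 2.7 and Python 3.x.
    payload = pickle.dumps(obj, protocol=2)
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, size):
    """Read exactly `size` bytes, or raise if the peer closes early."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            raise EOFError("connection closed mid-message")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def recv_obj(sock):
    """Receive one length-prefixed pickled object."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return pickle.loads(recv_exact(sock, length))


# Loopback demonstration: one listening socket, one client, one message.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))    # port 0 lets the OS pick a free port
server.listen(1)
client = socket.create_connection(server.getsockname())
peer, _ = server.accept()

send_obj(client, {"x": 1.5, "label": "tick"})
received = recv_obj(peer)

for s in (client, peer, server):
    s.close()
```

Only standard-library modules are used, which is consistent with the sketch running unchanged on the runtimes listed below, although that has not been verified here for each of them.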

| Runtime | Version | Notes |
|---|---|---|
| Python 2 | 2.7.x | |
| Python 3 | 3.x.x | Tested 3.4.x on CentOS 7 and 3.6.x on Windows 10. |
| PyPy | 5.x | Tested 5.0.1 on CentOS 7 and 5.9.0 on Windows 10. |
| Jython | 2.7.0 | |
| IronPython | 2.7.7 | Requires setting the useTempFile property in the configuration file to true. |

Data Conversion

The datatype of each value passed from the inputVars field is inferred from the field type. When you define the datatypes for the outputVars tuple fields, the operator runtime makes a best-effort attempt to cast the Python objects to those StreamBase types. This table summarizes
the conversion.

Global Python Instance

Define Python instances in the adapter-configurations.xml configuration file or as local module instances. The latter approach allows you to define Python instances that are private
to concurrent regions (for parallelism), but still shared by multiple operators (for example, to separate initialization from
execution calls).

For configuration-defined Python instances, use the adapter-configuration element.

If a value is not present, the default is used. Those values listed without a default are required.

| Property | Type | Default | Description |
|---|---|---|---|
| instance | string | | The name that links the operators together. It is displayed in the drop-down list on each operator's property configuration when using the global instance type. |
| executable | string | python | Path to the Python executable. When absent, the instance is launched with the command python. |
| workingDir | string | . | Working directory for the launched process. When absent, the process is started in the same directory as the parent StreamBase process. |
| useTempFile | boolean | false | Indicates that the integration layer should create a temporary file containing the Python code that wraps the interactions with StreamBase, instead of pushing that code through stdin. The stdin method (the default) works for most Python runtimes. Set this flag when launching IronPython. |
| captureOutput | boolean | false | Modifies the stdout and stderr behavior. By default, both are chained to the parent process's stdout and stderr. For tests that check output, capturing is recommended. |
| envVariables | section | | Environment variable to be passed or overridden when launching the Python interpreter. Use the name attribute to provide the variable name and the val attribute to provide its value. |
| arguments | section | | Argument to the Python interpreter (not to the script). Can be defined multiple times. A commonly used argument is -u, which forces Python to use unbuffered stdin, stdout, and stderr streams. Use the val attribute to provide a value. |
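
To make the effect of these properties concrete, the following sketch shows how they might map onto launching a child interpreter with Python's standard subprocess module. This is an illustration only: StreamBase performs the actual launch itself, and the launch_instance function and its parameter names are hypothetical, chosen to mirror the property names above. The useTempFile alternative is not modeled; the sketch always pushes code through stdin.

```python
import os
import subprocess
import sys


def launch_instance(executable="python", working_dir=".", arguments=(),
                    env_variables=None, capture_output=False):
    """Start an interpreter roughly the way the properties describe."""
    env = dict(os.environ)
    env.update(env_variables or {})      # envVariables: pass or override
    # captureOutput: PIPE captures the streams; None inherits the parent's.
    pipe = subprocess.PIPE if capture_output else None
    return subprocess.Popen(
        [executable] + list(arguments),  # arguments go to the interpreter
        cwd=working_dir,                 # workingDir
        env=env,
        stdin=subprocess.PIPE,           # code is pushed through stdin
        stdout=pipe,
        stderr=pipe,
    )


proc = launch_instance(
    executable=sys.executable,           # the current interpreter, for the demo
    arguments=["-u"],                    # unbuffered streams
    env_variables={"GREETING": "hello"},
    capture_output=True,
)
out, err = proc.communicate(b"import os\nprint(os.environ['GREETING'])\n")
print(out.decode().strip())
```

With capture_output left at False, the child's output would appear on the parent's console instead of being returned by communicate, which matches the default chaining described for captureOutput.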

For Python instances defined in EventFlow, use the Python Instance operator. It uses the same parameters as the configuration
file. Python operators within the same EventFlow can refer to this instance by setting the Instance Type operator property to Local and supplying the instance name in the Local Instance ID property. The instance name is the name of the Python Instance operator within the EventFlow.

Python Operator Properties

This section describes the properties you can set for the Python operator, using the various tabs of the Properties view in
StreamBase Studio.

General Tab

Name: Use this field to specify or change the component's name, which must be unique in the application. The name must contain
only alphabetic characters, numbers, and underscores, and no hyphens or other special characters. The first character must
be alphabetic or an underscore.

Operator: A read-only field that shows the formal name of the operator. If this operator is a global Java operator or your own custom
operator, then this field also shows the fully qualified class name that implements the functionality of this operator. If
you need to reference this class name elsewhere in your application, you can right-click this field and select Copy from the
context menu to place the full class name in the system clipboard.

Start with application: If this field is set to Yes (default) or to a module parameter that evaluates to true, this instance of this operator starts as part of the JVM engine that runs this EventFlow fragment. If this field is set
to No or to a module parameter that evaluates to false, the operator instance is loaded with the engine, but does not start until you send an epadmin container resume command (or its sbadmin equivalent), or until you start the component with StreamBase Manager.

Enable Error Output Port: Select this check box to add an Error Port to this component. In the EventFlow canvas, the Error Port shows as a red output
port, always the last port for the component. See Using Error Ports to learn about Error Ports.

Description: Optionally enter text to briefly describe the component's purpose and function. In the EventFlow canvas, you can see the
description by pressing Ctrl while the component's tooltip is displayed.

Operator Properties Tab

| Property | Type | Description |
|---|---|---|
| Instance Type | radio button | When Local is selected, the operator uses the instance defined in the EventFlow with the Python Instance operator. When Global is selected, the configuration defined in the adapter-configurations.xml file is used. |
| Local Instance ID | text | When Instance Type is set to Local, this provides the name of the local Python Instance operator to use. |
| Global Instance ID | text | When Instance Type is set to Global, this provides the name of the global Python instance configured in the adapter-configurations.xml file. |
| Asynchronous | check box | When checked, the operator executes the script using a non-blocking call. This way, long operations can be executed without suspending the processing in the module. Make sure that module invariants are preserved around the call. Unlike concurrent (parallel) execution in StreamBase, this mode does not allocate additional threads; it uses lightweight job scheduling instead. |
| Log Level | drop-down list | Controls the level of verbosity the adapter uses to issue informational traces to the console. This setting is independent of the containing application's overall log level. Available values, in increasing order of verbosity, are: OFF, ERROR, WARN, INFO, DEBUG, TRACE. |

Script Tab

| Property | Type | Description |
|---|---|---|
| Script | multiline text | Python code to be executed for each incoming tuple. |
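
A script is ordinary Python that reads the session variables set from inputVars and assigns the variables that the operator later reads as output. The sketch below assumes hypothetical input fields named price and quantity and a hypothetical output variable named total; the exec call only emulates a single operator call so that the example can run outside StreamBase.

```python
# Hypothetical script body, as it might be entered in the Script property.
# Assumes inputVars carries fields named `price` and `quantity`, and that
# the Output tab declares a field named `total`.
script = """
total = price * quantity
"""

# Emulate one call outside StreamBase: set inputs, run the script, read output.
session_vars = {"price": 2.5, "quantity": 4}
exec(script, session_vars)
print(session_vars["total"])  # 10.0
```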

Output Tab

| Property | Type | Description |
|---|---|---|
| Output variables | schema definition | Definition of the expected output variables. Each field defined in the schema corresponds to a Python session variable that is expected to have been stored by this operator's script or by any previous call. Each output variable must be of a type that can be cast to the StreamBase field type. Check the type conversion matrix for hints about available types. |

Input and Output Port

The input port accepts any incoming tuple. Two field names are reserved: inputVars and outputVars.

inputVars: optional tuple containing the variables to be set in the Python session. This field is not propagated to the output.

outputVars: tuple with the structure defined in Output Variables, containing the variables read from the Python session. This field is not allowed on the input port.

*: arbitrary pass-through fields. Unrecognized fields are passed through unchanged.
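
The field handling described above can be summarized in a short sketch that models tuples as Python dictionaries. The process_tuple function and the orderId field are hypothetical and serve only to illustrate the pass-through rules; they are not part of the operator.

```python
def process_tuple(in_tuple, output_vars):
    """Build the output tuple from an input tuple (modeled as a dict)."""
    if "outputVars" in in_tuple:
        raise ValueError("outputVars is not allowed on the input port")
    # Copy every field except inputVars, which is consumed, not propagated.
    out_tuple = {k: v for k, v in in_tuple.items() if k != "inputVars"}
    # Attach the variables read back from the Python session.
    out_tuple["outputVars"] = output_vars
    return out_tuple


incoming = {"orderId": 42, "inputVars": {"price": 2.5, "quantity": 4}}
outgoing = process_tuple(incoming, {"total": 10.0})
print(outgoing)  # {'orderId': 42, 'outputVars': {'total': 10.0}}
```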