The datatable module exports a special symbol f, which can be used
to refer to the columns of a frame currently being operated on. If this sounds
cryptic, consider that the most common way to operate on a frame is via the
square-bracket call DT[i,j,by,...]. And it is often the case that within
this expression you would want to refer to individual columns of the frame:
either to create a filter, or a transform, or specify a grouping variable, etc.
In all such cases the f symbol is used, and it is considered to be
evaluated within the context of the frame DT.

For example, consider the expression:

f.price

By itself, it just means a column named “price”, in an unspecified frame.
This expression becomes concrete, however, when used with a particular frame.
For example,:

train_dt[f.price>0,:]

selects all rows in train_dt where the price is positive. Thus, within the
call to train_dt[...], the symbol f refers to the frame train_dt.

The standalone f-expression may occasionally be useful too: it can be saved in
a variable and then re-applied to several different frames. Each time f
will refer to the frame to which it is being applied:

The simple expression f.price can be saved in a variable too. In fact,
there is a Frame helper method .export_names() which does exactly that:
returns a tuple of variables for each column name in the frame, allowing you to
omit the f. prefix:

As you have seen, the expression f.NAME refers to a column called “NAME”.
This notation is handy, but not universal. What do you do if the column’s name
contains spaces or unicode characters? Or if a column’s name is not known, only
its index? Or if the name is in a variable? For these purposes f supports
the square-bracket selectors:

Generally, f[i] means either the column at index i if i is an
integer, or the column with name i if i is a string.

Using an integer index follows the standard Python rule for list subscripts:
negative indices are interpreted as counting from the end of the frame, and
requesting a column with an index outside of [-ncols;ncols) will raise
an error.

This square-bracket form is also useful when you want to access a column
dynamically, i.e. if its name is not known in advance. For example, suppose
there is a frame with columns “2017_01”, “2017_02”, …, “2019_12”. Then
all these columns can be addressed as:

In all these cases a columnset is returned. This columnset may contain a
variable number of columns or even no columns at all, depending on the frame
to which this f-expression is applied.

Applying a slice to symbol f follows the same semantics as if f was a
list of columns. Thus f[:10] means the first 10 columns of a frame, or all
columns if the frame has less than 10. Similarly, f[9:10] selects the 10th
column of a frame if it exists, or nothing if the frame has less than 10
columns. Compare this to selector f[9], which also selects the 10th column
of a frame if it exists, but throws an exception if it doesn’t.

Besides the usual numeric ranges, you can also use name ranges. These ranges
include the first named column, the last named column, and all columns in
between. It is not possible to mix positional and named columns in a range,
and it is not possible to specify a step. If the range is x:y, yet column
x comes after y in the frame, then the columns will be selected in the
reverse order: first x, then the column preceding x, and so on, until
column y is selected last:

Finally, you can select all columns of a particular type by using that type
as an f-selector. You can pass either common python types bool, int,
float, str; or you can pass an stype such as dt.int32, or an ltype such as
dt.ltype.obj. You can also pass None to not select any columns. By itself
this may not be very useful, but occasionally you may need this as a fallback
in conditional expressions:

Columnsets support operations that either add or remove elements from the set.
This is done using methods .extend() and .remove().

The .extend() method takes a columnset as an argument (also a list, or dict,
or sequence of columns) and produces a new columnset containing both the
original and the new columns. The columns need not be unique: the same column
may appear multiple times in a columnset. This method allows to add transformed
columns into the columnset as well:

f[int].extend(f[float])# integer and floating-point columnsf[:3].extend(f[-3:])# the first and the last 3 columnsf.A.extend(f.B)# columns "A" and "B"f[str].extend(dt.str32(f[int]))# string columns, and also all integer# columns converted to strings# All columns, and then one additional column named 'cost', which contains# column `price` multiplied by `quantity`:f[:].extend({"cost":f.price*f.quantity})

When a columnset is extended, the order of the elements is preserved. Thus, a
columnset is closer in functionality to a python list than to a set. In
addition, some of the elements in a columnset can have names, if the columnset
is created from a dictionary. The names may be non-unique too.

The .remove() method is the opposite of .extend(): it takes an existing
columnset and then removes all columns that are passed as the argument:

f[:].remove(f[str])# all columns except columns of type stringf[:10].remove(f.A)# the first 10 columns without column "A"f[:].remove(f[3:-3])# same as `f[:3].extend(f[-3:])`, at least in the# context of a frame with 6+ columns

Removing a column that is not in the columnset is not considered an error,
similar to how set-difference operates. Thus, f[:].remove(f.A) may be
safely applied to a frame that doesn’t have column “A”: the columns that cannot
be removed are simply ignored.

If a columnset includes some column several times, and then you request to
remove that column, then only the first occurrence in the sequence will be
removed. Generally, the multiplicity of some column “A” in columnset
cs1.remove(cs2) will be equal the multiplicity of “A” in cs1 minus the
multiplicity of “A” in cs2, or 0 if such difference would be negative.
Thus,:

f[:].extend(f[int]).remove(f[int])

will have the effect of moving all integer columns to the end of the columnset
(since .remove() removes the first occurrence of a column it finds).

It is not possible to remove a transformed column from a columnset. An error
will be thrown if the argument of .remove() contains any transformed
columns.