Query expressions describe a value or a computation that can be used as part of
an update, create, filter, order by, annotation, or aggregate. There are a
number of built-in expressions (documented below) that can be used to help you
write queries. Expressions can be combined, or in some cases nested, to form
more complex computations.

Django supports negation, addition, subtraction, multiplication, division,
modulo arithmetic, and the power operator on query expressions, using Python
constants, variables, and even other expressions.

fromdjango.db.modelsimportCount,F,Valuefromdjango.db.models.functionsimportLength,Upper# Find companies that have more employees than chairs.Company.objects.filter(num_employees__gt=F('num_chairs'))# Find companies that have at least twice as many employees# as chairs. Both the querysets below are equivalent.Company.objects.filter(num_employees__gt=F('num_chairs')*2)Company.objects.filter(num_employees__gt=F('num_chairs')+F('num_chairs'))# How many chairs are needed for each company to seat all employees?>>>company=Company.objects.filter(...num_employees__gt=F('num_chairs')).annotate(...chairs_needed=F('num_employees')-F('num_chairs')).first()>>>company.num_employees120>>>company.num_chairs50>>>company.chairs_needed70# Create a new company using expressions.>>>company=Company.objects.create(name='Google',ticker=Upper(Value('goog')))# Be sure to refresh it if you need to access the field.>>>company.refresh_from_db()>>>company.ticker'GOOG'# Annotate models with an aggregated value. Both forms# below are equivalent.Company.objects.annotate(num_products=Count('products'))Company.objects.annotate(num_products=Count(F('products')))# Aggregates can contain complex computations alsoCompany.objects.annotate(num_offerings=Count(F('products')+F('services')))# Expressions can also be used in order_by(), either directlyCompany.objects.order_by(Length('name').asc())Company.objects.order_by(Length('name').desc())# or using the double underscore lookup syntax.fromdjango.db.modelsimportCharFieldfromdjango.db.models.functionsimportLengthCharField.register_lookup(Length)Company.objects.order_by('name__length')

An F() object represents the value of a model field or annotated column. It
makes it possible to refer to model field values and perform database
operations using them without actually having to pull them out of the database
into Python memory.

Instead, Django uses the F() object to generate an SQL expression that
describes the required operation at the database level.

This is easiest to understand through an example. Normally, one might do
something like this:

Here, we have pulled the value of reporter.stories_filed from the database
into memory and manipulated it using familiar Python operators, and then saved
the object back to the database. But instead we could also have done:

Although reporter.stories_filed=F('stories_filed')+1 looks like a
normal Python assignment of value to an instance attribute, in fact it’s an SQL
construct describing an operation on the database.

When Django encounters an instance of F(), it overrides the standard Python
operators to create an encapsulated SQL expression; in this case, one which
instructs the database to increment the database field represented by
reporter.stories_filed.

Whatever value is or was on reporter.stories_filed, Python never gets to
know about it - it is dealt with entirely by the database. All Python does,
through Django’s F() class, is create the SQL syntax to refer to the field
and describe the operation.

As well as being used in operations on single instances as above, F() can
be used on QuerySets of object instances, with update(). This reduces
the two queries we were using above - the get() and the
save() - to just one:

We can also use update() to increment
the field value on multiple objects - which could be very much faster than
pulling them all into Python from the database, looping over them, incrementing
the field value of each one, and saving each one back to the database:

Another useful benefit of F() is that having the database - rather than
Python - update a field’s value avoids a race condition.

If two Python threads execute the code in the first example above, one thread
could retrieve, increment, and save a field’s value after the other has
retrieved it from the database. The value that the second thread saves will be
based on the original value; the work of the first thread will simply be lost.

If the database is responsible for updating the field, the process is more
robust: it will only ever update the field based on the value of the field in
the database when the save() or update() is executed, rather
than based on its value when the instance was retrieved.

stories_filed will be updated twice in this case. If it’s initially 1,
the final value will be 3. This persistence can be avoided by reloading the
model object after saving it, for example, by using
refresh_from_db().

If the fields that you’re combining are of different types you’ll need
to tell Django what kind of field will be returned. Since F() does not
directly support output_field you will need to wrap the expression with
ExpressionWrapper:

A class attribute, as a format string, that describes the SQL that is
generated for this function. Defaults to
'%(function)s(%(expressions)s)'.

If you’re constructing SQL like strftime('%W','date') and need a
literal % character in the query, quadruple it (%%%%) in the
template attribute because the string is interpolated twice: once
during the template interpolation in as_sql() and once in the SQL
interpolation with the query parameters in the database cursor.

A class attribute that denotes the number of arguments the function
accepts. If this attribute is set and the function is called with a
different number of expressions, TypeError will be raised. Defaults
to None.

To avoid a SQL injection vulnerability, extra_contextmust
not contain untrusted user input
as these values are interpolated into the SQL string rather than passed
as query parameters, where the database driver would escape them.

The *expressions argument is a list of positional expressions that the
function will be applied to. The expressions will be converted to strings,
joined together with arg_joiner, and then interpolated into the template
as the expressions placeholder.

Positional arguments can be expressions or Python values. Strings are
assumed to be column references and will be wrapped in F() expressions
while other values will be wrapped in Value() expressions.

The **extra kwargs are key=value pairs that can be interpolated
into the template attribute. To avoid a SQL injection vulnerability,
extramust not contain untrusted user input as these values are interpolated
into the SQL string rather than passed as query parameters, where the database
driver would escape them.

The function, template, and arg_joiner keywords can be used to
replace the attributes of the same name without having to define your own
class. output_field can be used to define the expected return type.

An aggregate expression is a special case of a Func() expression that informs the query that a GROUPBY clause
is required. All of the aggregate functions,
like Sum() and Count(), inherit from Aggregate().

Since Aggregates are expressions and wrap expressions, you can represent
some complex computations:

A class attribute determining whether or not this aggregate function
allows passing a distinct keyword argument. If set to False
(default), TypeError is raised if distinct=True is passed.

The expressions positional arguments can include expressions or the names
of model fields. They will be converted to a string and used as the
expressions placeholder within the template.

The output_field argument requires a model field instance, like
IntegerField() or BooleanField(), into which Django will load the value
after it’s retrieved from the database. Usually no arguments are needed when
instantiating the model field as any arguments relating to data validation
(max_length, max_digits, etc.) will not be enforced on the expression’s
output value.

Note that output_field is only required when Django is unable to determine
what field type the result should be. Complex expressions that mix field types
should define the desired output_field. For example, adding an
IntegerField() and a FloatField() together should probably have
output_field=FloatField() defined.

The distinct argument determines whether or not the aggregate function
should be invoked for each distinct value of expressions (or set of
values, for multiple expressions). The argument is only supported on
aggregates that have allow_distinct set to True.

A Value() object represents the smallest possible component of an
expression: a simple value. When you need to represent the value of an integer,
boolean, or string within an expression, you can wrap that value within a
Value().

You will rarely need to use Value() directly. When you write the expression
F('field')+1, Django implicitly wraps the 1 in a Value(),
allowing simple values to be used in more complex expressions. You will need to
use Value() when you want to pass a string to an expression. Most
expressions interpret a string argument as the name of a field, like
Lower('name').

The value argument describes the value to be included in the expression,
such as 1, True, or None. Django knows how to convert these Python
values into their corresponding database type.

The output_field argument should be a model field instance, like
IntegerField() or BooleanField(), into which Django will load the value
after it’s retrieved from the database. Usually no arguments are needed when
instantiating the model field as any arguments relating to data validation
(max_length, max_digits, etc.) will not be enforced on the expression’s
output value.

ExpressionWrapper simply surrounds another expression and provides access
to properties, such as output_field, that may not be available on other
expressions. ExpressionWrapper is necessary when using arithmetic on
F() expressions with different types as described in
Using F() with annotations.

The examples in this section are designed to show how to force
Django to execute a subquery. In some cases it may be possible to
write an equivalent queryset that performs the same task more
clearly or efficiently.

Use OuterRef when a queryset in a Subquery needs to refer to a field
from the outer query. It acts like an F expression except that the
check to see if it refers to a valid field isn’t made until the outer queryset
is resolved.

Instances of OuterRef may be used in conjunction with nested instances
of Subquery to refer to a containing queryset that isn’t the immediate
parent. For example, this queryset would need to be within a nested pair of
Subquery instances to resolve correctly:

Exists is a Subquery subclass that uses an SQL EXISTS statement. In
many cases it will perform better than a subquery since the database is able to
stop evaluation of the subquery when a first matching row is found.

For example, to annotate each post with whether or not it has a comment from
within the last day:

It’s unnecessary to force Exists to refer to a single column, since the
columns are discarded and a boolean result is returned. Similarly, since
ordering is unimportant within an SQL EXISTS subquery and would only
degrade performance, it’s automatically removed.

The initial filter(...) limits the subquery to the relevant parameters.
order_by() removes the default ordering
(if any) on the Comment model. values('post') aggregates comments by
Post. Finally, annotate(...) performs the aggregation. The order in
which these queryset methods are applied is important. In this case, since the
subquery must be limited to a single column, values('total') is required.

This is the only way to perform an aggregation within a Subquery, as
using aggregate() attempts to evaluate the queryset (and if
there is an OuterRef, this will not be possible to resolve).

These extra lookups may not be portable to different database engines (because
you’re explicitly writing SQL code) and violate the DRY principle, so you
should avoid them if possible.

Warning

To protect against SQL injection attacks, you must escape any
parameters that the user can control by using params. params is a
required argument to force you to acknowledge that you’re not interpolating
your SQL with user-provided data.

You also must not quote placeholders in the SQL string. This example is
vulnerable to SQL injection because of the quotes around %s:

Window functions provide a way to apply functions on partitions. Unlike a
normal aggregation function which computes a final result for each set defined
by the group by, window functions operate on frames and
partitions, and compute the result for each row.

You can specify multiple windows in the same query which in Django ORM would be
equivalent to including multiple expressions in a QuerySet.annotate() call. The ORM doesn’t make use of named windows,
instead they are part of the selected columns.

The partition_by argument is a list of expressions (column names should be
wrapped in an F-object) that control the partitioning of the rows.
Partitioning narrows which rows are used to compute the result set.

The output_field is specified either as an argument or by the expression.

The order_by argument accepts a sequence of expressions on which you can
call asc() and
desc(). The ordering controls the order in
which the expression is applied. For example, if you sum over the rows in a
partition, the first result is just the value of the first row, the second is
the sum of first and second row.

The frame parameter specifies which other rows that should be used in the
computation. See Frames for details.

For example, to annotate each movie with the average rating for the movies by
the same studio in the same genre and release year:

This makes it easy to check if a movie is rated better or worse than its peers.

You may want to apply multiple expressions over the same window, i.e., the
same partition and frame. For example, you could modify the previous example
to also include the best and worst rating in each movie’s group (same studio,
genre, and release year) by using three window functions in the same query. The
partition and ordering from the previous example is extracted into a dictionary
to reduce repetition:

Among Django’s built-in database backends, MySQL 8.0.2+, PostgreSQL, and Oracle
support window expressions. Support for different window expression features
varies among the different databases. For example, the options in
asc() and
desc() may not be supported. Consult the
documentation for your database as needed.

Frames narrow the rows that are used for computing the result. They shift from
some start point to some specified end point. Frames can be used with and
without partitions, but it’s often a good idea to specify an ordering of the
window to ensure a deterministic result. In a frame, a peer in a frame is a row
with an equivalent value, or all rows if an ordering clause isn’t present.

The default starting point for a frame is UNBOUNDEDPRECEDING which is the
first row of the partition. The end point is always explicitly included in the
SQL generated by the ORM and is by default UNBOUNDEDFOLLOWING. The default
frame includes all rows from the partition to the last row in the set.

The accepted values for the start and end arguments are None, an
integer, or zero. A negative integer for start results in Npreceding,
while None yields UNBOUNDEDPRECEDING. For both start and end,
zero will return CURRENTROW. Positive integers are accepted for end.

There’s a difference in what CURRENTROW includes. When specified in
ROWS mode, the frame starts or ends with the current row. When specified in
RANGE mode, the frame starts or ends at the first or last peer according to
the ordering clause. Thus, RANGECURRENTROW evaluates the expression for
rows which have the same value specified by the ordering. Because the template
includes both the start and end points, this may be expressed with:

ValueRange(start=0,end=0)

If a movie’s “peers” are described as movies released by the same studio in the
same genre in the same year, this RowRange example annotates each movie
with the average rating of a movie’s two prior and two following peers:

If the database supports it, you can specify the start and end points based on
values of an expression in the partition. If the released field of the
Movie model stores the release month of each movies, this ValueRange
example annotates each movie with the average rating of a movie’s peers
released between twelve months before and twelve months after the each movie.

Below you’ll find technical implementation details that may be useful to
library authors. The technical API and examples below will help with
creating generic query expressions that can extend the built-in functionality
that Django provides.

Query expressions implement the query expression API,
but also expose a number of extra methods and attributes listed below. All
query expressions must inherit from Expression() or a relevant
subclass.

When a query expression wraps another expression, it is responsible for
calling the appropriate methods on the wrapped expression.

Provides the chance to do any pre-processing or validation of
the expression before it’s added to the query. resolve_expression()
must also be called on any nested expressions. A copy() of self
should be returned with any necessary transformations.

query is the backend query implementation.

allow_joins is a boolean that allows or denies the use of
joins in the query.

reuse is a set of reusable joins for multi-join scenarios.

summarize is a boolean that, when True, signals that the
query being computed is a terminal aggregate query.

for_save is a boolean that, when True, signals that the query
being executed is performing a create or update.

Returns a clone (copy) of self, with any column aliases relabeled.
Column aliases are renamed when subqueries are created.
relabeled_clone() should also be called on any nested expressions
and assigned to the clone.

Returns self with any modifications required to reverse the sort
order within an order_by call. As an example, an expression
implementing NULLSLAST would change its value to be
NULLSFIRST. Modifications are only required for expressions that
implement sort order like OrderBy. This method is called when
reverse() is called on a
queryset.

You can write your own query expression classes that use, and can integrate
with, other query expressions. Let’s step through an example by writing an
implementation of the COALESCE SQL function, without using the built-in
Func() expressions.

The COALESCE SQL function is defined as taking a list of columns or
values. It will return the first column or value that isn’t NULL.

We’ll start by defining the template to be used for SQL generation and
an __init__() method to set some attributes:

importcopyfromdjango.db.modelsimportExpressionclassCoalesce(Expression):template='COALESCE( %(expressions)s )'def__init__(self,expressions,output_field):super().__init__(output_field=output_field)iflen(expressions)<2:raiseValueError('expressions must have at least 2 elements')forexpressioninexpressions:ifnothasattr(expression,'resolve_expression'):raiseTypeError('%r is not an Expression'%expression)self.expressions=expressions

We do some basic validation on the parameters, including requiring at least
2 columns or values, and ensuring they are expressions. We are requiring
output_field here so that Django knows what kind of model field to assign
the eventual result to.

Now we implement the pre-processing and validation. Since we do not have
any of our own validation at this point, we just delegate to the nested
expressions:

defas_sql(self,compiler,connection,template=None):sql_expressions,sql_params=[],[]forexpressioninself.expressions:sql,params=compiler.compile(expression)sql_expressions.append(sql)sql_params.extend(params)template=templateorself.templatedata={'expressions':','.join(sql_expressions)}returntemplate%data,paramsdefas_oracle(self,compiler,connection):""" Example of vendor specific handling (Oracle in this case). Let's make the function name lowercase. """returnself.as_sql(compiler,connection,template='coalesce( %(expressions)s )')

as_sql() methods can support custom keyword arguments, allowing
as_vendorname() methods to override data used to generate the SQL string.
Using as_sql() keyword arguments for customization is preferable to
mutating self within as_vendorname() methods as the latter can lead to
errors when running on different database backends. If your class relies on
class attributes to define data, consider allowing overrides in your
as_sql() method.

We generate the SQL for each of the expressions by using the
compiler.compile() method, and join the result together with commas.
Then the template is filled out with our data and the SQL and parameters
are returned.

We’ve also defined a custom implementation that is specific to the Oracle
backend. The as_oracle() function will be called instead of as_sql()
if the Oracle backend is in use.

Finally, we implement the rest of the methods that allow our query expression
to play nice with other query expressions:

Since a Func’s keyword arguments for __init__() (**extra) and
as_sql() (**extra_context) are interpolated into the SQL string rather
than passed as query parameters (where the database driver would escape them),
they must not contain untrusted user input.

For example, if substring is user-provided, this function is vulnerable to
SQL injection:

fromdjango.db.modelsimportFuncclassPosition(Func):function='POSITION'template="%(function)s('%(substring)s' in %(expressions)s)"def__init__(self,expression,substring):# substring=substring is a SQL injection vulnerability!super().__init__(expression,substring=substring)

This function generates a SQL string without any parameters. Since substring
is passed to super().__init__() as a keyword argument, it’s interpolated
into the SQL string before the query is sent to the database.

Here’s a corrected rewrite:

classPosition(Func):function='POSITION'arg_joiner=' IN 'def__init__(self,expression,substring):super().__init__(substring,expression)

With substring instead passed as a positional argument, it’ll be passed as
a parameter in the database query.

If you’re using a database backend that uses a different SQL syntax for a
certain function, you can add support for it by monkey patching a new method
onto the function’s class.

Let’s say we’re writing a backend for Microsoft’s SQL Server which uses the SQL
LEN instead of LENGTH for the Length function.
We’ll monkey patch a new method called as_sqlserver() onto the Length
class:

You can also customize the SQL using the template parameter of as_sql().

We use as_sqlserver() because django.db.connection.vendor returns
sqlserver for the backend.

Third-party backends can register their functions in the top level
__init__.py file of the backend package or in a top level expressions.py
file (or package) that is imported from the top level __init__.py.

For user projects wishing to patch the backend that they’re using, this code
should live in an AppConfig.ready() method.