You can use Model.create() to create a new model instance. This method
accepts keyword arguments, where the keys correspond to the names of the
model’s fields. A new instance is returned and a row is added to the table.

If you are not wrapping the loop in a transaction then each call to
create() happens in its own transaction. That is going to be
really slow!

There is a decent amount of Python logic getting in your way, and each
InsertQuery must be generated and parsed into SQL.

That’s a lot of data (in terms of raw bytes of SQL) you are sending to your
database to parse.

We are retrieving the last insert id, which causes an additional query to
be executed in some cases.

You can get a very significant speedup by simply wrapping this in a
atomic().

# This is much faster.withdb.atomic():fordata_dictindata_source:MyModel.create(**data_dict)

The above code still suffers from points 2, 3 and 4. We can get another big
boost by calling insert_many(). This method accepts a list of
tuples or dictionaries to insert.

# Fastest.MyModel.insert_many(data_source).execute()# Fastest using tuples and specifying the fields being inserted.fields=[MyModel.field1,MyModel.field2]data=[('val1-1','val1-2'),('val2-1','val2-2'),('val3-1','val3-2')]MyModel.insert_many(data,fields=fields)# You can, of course, wrap this in a transaction as well:withdb.atomic():MyModel.insert_many(data,fields=fields)

Depending on the number of rows in your data source, you may need to break it
up into chunks:

SQLite users should be aware of some caveats when using bulk inserts.
Specifically, your SQLite3 version must be 3.7.11.0 or newer to take
advantage of the bulk insert API. Additionally, by default SQLite limits
the number of bound variables in a SQL query to 999. This value can be
modified by setting the SQLITE_MAX_VARIABLE_NUMBER flag.

If the data you would like to bulk load is stored in another table, you can
also create INSERT queries whose source is a SELECT query. Use the
Model.insert_from() method:

If you want to update multiple records, issue an UPDATE query. The following
example will update all Tweet objects, marking them as published, if they
were created before today. Model.update() accepts keyword arguments
where the keys correspond to the model’s field names:

>>> today=datetime.today()>>> query=Tweet.update(is_published=True).where(Tweet.creation_date<today)>>> query.execute()# Returns the number of rows that were updated.4

We can even use a subquery to update the value of a column. Suppose we had a
denormalized column on the User model that stored the number of tweets a
user had made, and we updated this value periodically. Here is how you might
write such a query:

To delete a single model instance, you can use the
Model.delete_instance() shortcut. delete_instance()
will delete the given model instance and can optionally delete any dependent
objects recursively (by specifying recursive=True).

You can use the Model.get() method to retrieve a single instance
matching the given query. For primary-key lookups, you can also use the
shortcut method Model.get_by_id().

This method is a shortcut that calls Model.select() with the given
query, but limits the result set to a single row. Additionally, if no model
matches the given query, a DoesNotExist exception will be raised.

>>> User.get(User.id==1)<__main__.User object at 0x25294d0>>>> User.get_by_id(1)# Same as above.<__main__.User object at 0x252df10>>>> User[1]# Also same as above.<__main__.User object at 0x252dd10>>>> User.get(User.id==1).usernameu'Charlie'>>> User.get(User.username=='Charlie')<__main__.User object at 0x2529410>>>> User.get(User.username=='nobody')UserDoesNotExist: instance matching query does not exist:SQL: SELECT t1."id", t1."username" FROM "user" AS t1 WHERE t1."username" = ?PARAMS: ['nobody']

For more advanced operations, you can use SelectQuery.get(). The
following query retrieves the latest tweet from the user named charlie:

Peewee has one helper method for performing “get/create” type operations:
Model.get_or_create(), which first attempts to retrieve the matching
row. Failing that, a new row will be created.

For “create or get” type logic, typically one would rely on a unique
constraint or primary key to prevent the creation of duplicate objects. As an
example, let’s say we wish to implement registering a new user account using
the example User model. The User model has a unique
constraint on the username field, so we will rely on the database’s integrity
guarantees to ensure we don’t end up with duplicate usernames:

try:withdb.atomic():returnUser.create(username=username)exceptpeewee.IntegrityError:# `username` is a unique column, so this username already exists,# making it safe to call .get().returnUser.get(User.username==username)

You can easily encapsulate this type of logic as a classmethod on your own
Model classes.

The above example first attempts at creation, then falls back to retrieval,
relying on the database to enforce a unique constraint. If you prefer to
attempt to retrieve the record first, you can use
get_or_create(). This method is implemented along the same
lines as the Django function of the same name. You can use the Django-style
keyword argument filters to specify your WHERE conditions. The function
returns a 2-tuple containing the instance and a boolean value indicating if the
object was created.

Here is how you might implement user account creation using
get_or_create():

user,created=User.get_or_create(username=username)

Suppose we have a different model Person and would like to get or create a
person object. The only conditions we care about when retrieving the Person
are their first and last names, but if we end up needing to create a new
record, we will also specify their date-of-birth and favorite color:

We can use Model.select() to retrieve rows from the table. When you
construct a SELECT query, the database will return any rows that correspond
to your query. Peewee allows you to iterate over these rows, as well as use
indexing and slicing operations.

In the following example, we will simply call select() and
iterate over the return value, which is an instance of Select.
This will return all the rows in the User table:

>>> foruserinUser.select():... printuser.username...CharlieHueyPeewee

Note

Subsequent iterations of the same query will not hit the database as the
results are cached. To disable this behavior (to reduce memory usage), call
Select.iterator() when iterating.

When iterating over a model that contains a foreign key, be careful with the
way you access values on related models. Accidentally resolving a foreign key
or iterating over a back-reference can cause N+1 query behavior.

When you create a foreign key, such as Tweet.user, you can use the
backref to create a back-reference (User.tweets). Back-references
are exposed as Select instances:

Note that Peewee uses bitwise operators (& and |) rather than
logical operators (and and or). The reason for this is that Python
coerces the return value of logical operations to a boolean value. This is
also the reason why “IN” queries must be expressed using .in_() rather
than the in operator.

# The following queries are equivalent:Tweet.select().order_by(Tweet.created_date.desc())Tweet.select().order_by(-Tweet.created_date)# Note the "-" prefix.# Similarly you can use "+" to indicate ascending order, though ascending# is the default when no ordering is otherwise specified.User.select().order_by(+User.username)

You can also order across joins. Assuming you want to order tweets by the
username of the author, then by created_date:

When sorting on a calculated value, you can either include the necessary SQL
expressions, or reference the alias assigned to the value. Here are two
examples illustrating these methods:

# Let's start with our base query. We want to get all usernames and the number of# tweets they've made. We wish to sort this list from users with most tweets to# users with fewest tweets.query=(User.select(User.username,fn.COUNT(Tweet.id).alias('num_tweets')).join(Tweet,JOIN.LEFT_OUTER).group_by(User.username))

You can order using the same COUNT expression used in the select clause. In
the example below we are ordering by the COUNT() of tweet ids descending:

Alternatively, you can reference the alias assigned to the calculated value in
the select clause. This method has the benefit of being a bit easier to
read. Note that we are not referring to the named alias directly, but are
wrapping it using the SQL helper:

The resulting query will return User objects with all their normal attributes
plus an additional attribute count which will contain the count of tweets for
each user. We use a left outer join to include users who have no tweets.

Suppose you need to want to get a list of all users whose username begins with
a. There are a couple ways to do this, but one method might be to use some
SQL functions like LOWER and SUBSTR. To use arbitrary SQL functions, use
the special fn() object to construct queries:

# Select the user's id, username and the first letter of their username, lower-casedfirst_letter=fn.LOWER(fn.SUBSTR(User.username,1,1))query=User.select(User,first_letter.alias('first_letter'))# Alternatively we could select only users whose username begins with 'a'a_users=User.select().where(first_letter=='a')>>>foruserina_users:...print(user.username)

There are times when you may want to simply pass in some arbitrary sql. You can
do this using the special SQL class. One use-case is when
referencing an alias:

# We'll query the user table and annotate it with a count of tweets for# the given userquery=(User.select(User,fn.Count(Tweet.id).alias('ct')).join(Tweet).group_by(User))# Now we will order by the count, which was aliased to "ct"query=query.order_by(SQL('ct'))# You could, of course, also write this as:query=query.order_by(fn.COUNT(Tweet.id))

There are two ways to execute hand-crafted SQL statements with peewee:

By default peewee will parameterize queries, so any parameters passed in by the
user will be escaped. The only exception to this rule is if you are writing a
raw SQL query or are passing in a SQL object which may contain untrusted
data. To mitigate this, ensure that any user-defined data is passed in as a
query parameter and not part of the actual SQL query:

# Bad! DO NOT DO THIS!query=MyModel.raw('SELECT * FROM my_table WHERE data = %s'%(user_data,))# Good. `user_data` will be treated as a parameter to the query.query=MyModel.raw('SELECT * FROM my_table WHERE data = %s',user_data)# Bad! DO NOT DO THIS!query=MyModel.select().where(SQL('Some SQL expression %s'%user_data))# Good. `user_data` will be treated as a parameter.query=MyModel.select().where(SQL('Some SQL expression %s',user_data))

Note

MySQL and Postgresql use '%s' to denote parameters. SQLite, on the
other hand, uses '?'. Be sure to use the character appropriate to your
database. You can also find this parameter by checking
Database.param.

peewee comes with basic support for SQL window functions, which can be created
by calling Function.over() and passing in your partitioning or
ordering parameters.

# Get the list of employees and the average salary for their dept.query=(Employee.select(Employee.name,Employee.department,Employee.salary,fn.Avg(Employee.salary).over(partition_by=[Employee.department])).order_by(Employee.name))# Rank employees by salary.query=(Employee.select(Employee.name,Employee.salary,fn.rank().over(order_by=[Employee.salary])))

For general information on window functions, check out the postgresql docs.

PostgresqlDatabase supports a RETURNING clause on UPDATE,
INSERT and DELETE queries. Specifying a RETURNING clause allows you
to iterate over the rows accessed by the query.

For example, let’s say you have an Update that deactivates all
user accounts whose registration has expired. After deactivating them, you want
to send each user an email letting them know their account was deactivated.
Rather than writing two queries, a SELECT and an UPDATE, you can do
this in a single UPDATE query with a RETURNING clause:

query=(User.update(is_active=False).where(User.registration_expired==True).returning(User))# Send an email to every user that was deactivated.fordeactivate_userinquery.execute():send_deactivation_email(deactivated_user)

The RETURNING clause is also available on Insert and
Delete. When used with INSERT, the newly-created rows will be
returned. When used with DELETE, the deleted rows will be returned.

The only limitation of the RETURNING clause is that it can only consist of
columns from tables listed in the query’s FROM clause. To select all
columns from a particular table, you can simply pass in the Model
class.

Here is how you might combine expressions. Comparisons can be arbitrarily
complex.

Note

Note that the actual comparisons are wrapped in parentheses. Python’s operator
precedence necessitates that comparisons be wrapped in parentheses.

# Find any users who are active administrations.User.select().where((User.is_admin==True)&(User.is_active==True))# Find any users who are either administrators or super-users.User.select().where((User.is_admin==True)|(User.is_superuser==True))# Find any Tweets by users who are not admins (NOT IN).admins=User.select().where(User.is_admin==True)non_admin_tweets=Tweet.select().where(~(Tweet.user<<admins))# Find any users who are not my friends (strangers).friends=User.select().where(User.username.in_(['charlie','huey','mickey']))strangers=User.select().where(User.id.not_in(friends))

Warning

Although you may be tempted to use python’s in, and, or and
not operators in your query expressions, these will not work. The
return value of an in expression is always coerced to a boolean value.
Similarly, and, or and not all treat their arguments as boolean
values and cannot be overloaded.

So just remember:

Use .in_() and .not_in() instead of in and notin

Use & instead of and

Use | instead of or

Use ~ instead of not

Use .is_null() instead of isNone or ==None.

Don’t forget to wrap your comparisons in parentheses when using logical operators.

Because SQLite’s LIKE operation is case-insensitive by default,
peewee will use the SQLite GLOB operation for case-sensitive searches.
The glob operation uses asterisks for wildcards as opposed to the usual
percent-sign. If you are using SQLite and want case-sensitive partial
string matching, remember to use asterisks for the wildcard.

Because of the way SQL handles NULL, there are some special operations
available for expressing:

ISNULL

ISNOTNULL

IN

NOTIN

While it would be possible to use the ISNULL and IN operators with the
negation operator (~), sometimes to get the correct semantics you will need
to explicitly use ISNOTNULL and NOTIN.

The simplest way to use ISNULL and IN is to use the operator
overloads:

# Get all User objects whose last login is NULL.User.select().where(User.last_login>>None)# Get users whose username is in the given list.usernames=['charlie','huey','mickey']User.select().where(User.username<<usernames)

If you don’t like operator overloads, you can call the Field methods instead:

# Get all User objects whose last login is NULL.User.select().where(User.last_login.is_null(True))# Get users whose username is in the given list.usernames=['charlie','huey','mickey']User.select().where(User.username.in_(usernames))

To negate the above queries, you can use unary negation, but for the correct
semantics you may need to use the special ISNOT and NOTIN operators:

# Get all User objects whose last login is *NOT* NULL.User.select().where(User.last_login.is_null(False))# Using unary negation instead.User.select().where(~(User.last_login>>None))# Get users whose username is *NOT* in the given list.usernames=['charlie','huey','mickey']User.select().where(User.username.not_in(usernames))# Using unary negation instead.usernames=['charlie','huey','mickey']User.select().where(~(User.username<<usernames))

Because I ran out of python operators to overload, there are some missing
operators in peewee, for instance modulo. If you find that you need to
support an operator that is not in the table above, it is very easy to add your
own.

Here is how you might add support for modulo in SQLite:

frompeeweeimport*frompeeweeimportExpression# the building block for expressionsdefmod(lhs,rhs):returnExpression(lhs,SQL('%'),rhs)

Now you can use these custom operators to build richer queries:

# Users with even ids.User.select().where(mod(User.id,2)==0)

For more examples check out the source to the playhouse.postgresql_ext
module, as it contains numerous operators specific to postgresql’s hstore.

Foreign keys are created using a special field class
ForeignKeyField. Each foreign key also creates a back-reference on
the related model using the specified backref.

Note

In SQLite, foreign keys are not enabled by default. Most things, including
the Peewee foreign-key API, will work fine, but ON DELETE behaviour will be
ignored, even if you explicitly specify on_delete to your ForeignKeyField.
In conjunction with the default PrimaryKeyField behaviour (where deleted
record IDs can be reused), this can lead to surprising (and almost
certainly unwanted) behaviour where if you delete a record in table A
referenced by a foreign key in table B, and then create a new, unrelated,
record in table A, the new record will end up mis-attached to the undeleted
record in table B. To avoid the mis-attachment, you can use
PrimaryKeyAutoIncrementField, but it may be better overall to
ensure that foreign keys are enabled with pragmas=((‘foreign_keys’,
‘on’),) when you instantiate SqliteDatabase.

Referring back to the User and Tweet models, note that
there is a ForeignKeyField from Tweet to User. The foreign key
can be traversed, allowing you access to the associated user instance:

>>> tweet.user.username'charlie'

Note

Unless the User model was explicitly selected when retrieving the
Tweet, an additional query will be required to load the User data. To
learn how to avoid the extra query, see the N+1 query documentation.

The reverse is also true, and we can iterate over the tweets associated with a
given User instance:

SQL makes it easy to select columns from multiple tables and return it all at
once. Peewee makes this possible, too, but since Peewee models form a graph
(via foreign-keys), the selected data is returned as a graph of model
instances. To see what I mean, consider this query:

The question is: where is the “username” attribute to be found? The answer is
that Peewee, because there is a foreign-key relationship between Tweet and
User, will return each row as a Tweet model with the associated User model,
which has it’s username attribute set:

When doing complicated joins, joins where no foreign-key exists (for example
joining on a subquery), etc., it is necessary to tell Peewee where to place the
joined attributes. This is done by putting an alias on the join predicate
expression.

For example, let’s say that in the above query we want to put the joined user
data in the Tweet.foo attribute:

query=(Tweet.select(Tweet.content,Tweet.timestamp,User.username).join(User,on=(Tweet.user==User.id).alias('foo')).order_by(Tweet.timestamp.desc()))fortweetinquery:# Joined user data is stored in "tweet.foo":print(tweet.content,tweet.timestamp,tweet.foo.username)

For queries with complex joins and selections from several models, constructing
this graph can be expensive. If you wish, instead, to have all columns as
attributes on a single model, you can use objects()
method:

fortweetinquery.objects():# Now "username" is on the Tweet model itself:print(tweet.content,tweet.timestamp,tweet.username)

For additional performance gains, consider using dicts(),
tuples() or namedtuples() when
iterating large and/or complex result-sets.

If a foreign key does not exist between two tables you can still perform a
join, but you must manually specify the join predicate.

In the following example, there is no explicit foreign-key between User and
ActivityLog, but there is an implied relationship between the
ActivityLog.object_id field and User.id. Rather than joining on a specific
Field, we will join using an Expression.

Peewee provides a field for representing many-to-many relationships, much like
Django does. This feature was added due to many requests from users, but I
strongly advocate against using it, since it conflates the idea of a field with
a junction table and hidden joins. It’s just a nasty hack to provide convenient
accessors.

To implement many-to-many correctly with peewee, you will therefore create
the intermediary table yourself and query through it:

The ManyToManyField provides a field-like API over many-to-many
fields. For all but the simplest many-to-many situations, you’re better off
using the standard peewee APIs. But, if your models are very simple and your
querying needs are not very complex, you can get a big boost by using
ManyToManyField. Check out the Fields extension module
for details.

frompeeweeimport*fromplayhouse.fieldsimportManyToManyFielddb=SqliteDatabase('school.db')classBaseModel(Model):classMeta:database=dbclassStudent(BaseModel):name=CharField()classCourse(BaseModel):name=CharField()students=ManyToManyField(Student,backref='courses')StudentCourse=Course.students.get_through_model()db.create_tables([Student,Course,StudentCourse])# Get all classes that "huey" is enrolled in:huey=Student.get(Student.name=='Huey')forcourseinhuey.courses.order_by(Course.name):print(course.name)# Get all students in "English 101":engl_101=Course.get(Course.name=='English 101')forstudentinengl_101.students:print(student.name)# When adding objects to a many-to-many relationship, we can pass# in either a single model instance, a list of models, or even a# query of models:huey.courses.add(Course.select().where(Course.name.contains('English')))engl_101.students.add(Student.get(Student.name=='Mickey'))engl_101.students.add([Student.get(Student.name=='Charlie'),Student.get(Student.name=='Zaizee')])# The same rules apply for removing items from a many-to-many:huey.courses.remove(Course.select().where(Course.name.startswith('CS')))engl_101.students.remove(huey)# Calling .clear() will remove all associated objects:cs_150.students.clear()

Another less common approach involves the use of subqueries. Here is another
way we might construct a query to get all the categories whose parent category
is Electronics using a subquery:

Parent=Category.alias()join_query=Parent.select().where(Parent.name=='Electronics')# Subqueries used as JOINs need to have an alias.join_query=join_query.alias('jq')query=(Category.select().join(join_query,on=(Category.parent==join_query.c.id)))

The term N+1 queries refers to a situation where an application performs a
query, then for each row of the result set, the application performs at least
one other query (another way to conceptualize this is as a nested loop). In
many cases, these n queries can be avoided through the use of a SQL join or
subquery. The database itself may do a nested loop, but it will usually be more
performant than doing n queries in your application code, which involves
latency communicating with the database and may not take advantage of indices
or other optimizations employed by the database when joining or executing a
subquery.

Peewee provides several APIs for mitigating N+1 query behavior. Recollecting
the models used throughout this document, User and Tweet, this section will
try to outline some common N+1 scenarios, and how peewee can help you avoid
them.

Note

In some cases, N+1 queries will not result in a significant or measurable
performance hit. It all depends on the data you are querying, the database
you are using, and the latency involved in executing queries and retrieving
results. As always when making optimizations, profile before and after to
ensure the changes do what you expect them to.

The twitter timeline displays a list of tweets from multiple users. In addition
to the tweet’s content, the username of the tweet’s author is also displayed.
The N+1 scenario here would be:

Fetch the 10 most recent tweets.

For each tweet, select the author (10 queries).

By selecting both tables and using a join, peewee makes it possible to
accomplish this in a single query:

query=(Tweet.select(Tweet,User)# Note that we are selecting both models..join(User)# Use an INNER join because every tweet has an author..order_by(Tweet.id.desc())# Get the most recent tweets..limit(10))fortweetinquery:print(tweet.user.username,'-',tweet.message)

Without the join, accessing tweet.user.username would trigger a query to
resolve the foreign key tweet.user and retrieve the associated user. But
since we have selected and joined on User, peewee will automatically
resolve the foreign-key for us.

Let’s say you want to build a page that shows several users and all of their
tweets. The N+1 scenario would be:

Fetch some users.

For each user, fetch their tweets.

This situation is similar to the previous example, but there is one important
difference: when we selected tweets, they only have a single associated user,
so we could directly assign the foreign key. The reverse is not true, however,
as one user may have any number of tweets (or none at all).

Peewee provides an approach to avoiding O(n) queries in this situation. Fetch
users first, then fetch all the tweets associated with those users. Once
peewee has the big list of tweets, it will assign them out, matching them with
the appropriate user. This method is usually faster but will involve a query
for each table being selected.

peewee supports pre-fetching related data using sub-queries. This method
requires the use of a special API, prefetch(). Pre-fetch, as its name
indicates, will eagerly load the appropriate tweets for the given users using
subqueries. This means instead of O(n) queries for n rows, we will do
O(k) queries for k tables.

Here is an example of how we might fetch several users and any tweets they
created within the past week.

week_ago=datetime.date.today()-datetime.timedelta(days=7)users=User.select()tweets=(Tweet.select().where((Tweet.is_published==True)&(Tweet.created_date>=week_ago)))# This will perform two queries.users_with_tweets=prefetch(users,tweets)foruserinusers_with_tweets:print(user.username)fortweetinuser.tweets:print(' ',tweet.message)

Note

Note that neither the User query, nor the Tweet query contained a
JOIN clause. When using prefetch() you do not need to specify the
join.

prefetch() can be used to query an arbitrary number of tables. Check
the API documentation for more examples.

By default peewee will cache the rows returned when iterating of a
Select. This is an optimization to allow multiple iterations as
well as indexing and slicing without causing additional queries. This caching
can be problematic, however, when you plan to iterate over a large number of
rows.

To reduce the amount of memory used by peewee when iterating over a query, use
the iterator() method. This method allows you to iterate
without caching each model returned, using much less memory when iterating over
large result sets.

# Let's assume we've got 10 million stat objects to dump to a csv file.stats=Stat.select()# Our imaginary serializer classserializer=CSVSerializer()# Loop over all the stats and serialize.forstatinstats.iterator():serializer.serialize_object(stat)

When iterating over a large number of rows that contain columns from multiple
tables, peewee will reconstruct the model graph for each row returned. This
operation can be slow for complex graphs.

Ordinarily, when a query contains joins, peewee will reconstruct the graph of
joined data returned by cursor. Using the above helpers returns a simpler
data-structure which can be much more efficient when iterating over large or
very-complex queries.

Note

If no constructor is passed to objects(), then peewee
will return model instances. However, instead of attempting to reconstruct
a graph of any joined data, all columns will be returned as attributes of
the model.

For example:

query=(Tweet.select(Tweet,User).join(User))# Note that the user columns are stored in a separate User instance# accessible at row.user:fortweetinquery:print(tweet.user.username,tweet.content)# Using ".objects()" will put all attributes on the model we are# querying.fortweetinquery.objects():print(tweet.username,tweet.content)