SQLAlchemy Anti-Patterns
This is a list of what I consider SQLAlchemy anti-patterns.
Abusing lazily loaded relationships
Bad:
class Customer(Base):
    @property
    def has_valid_toast(self):
        """Return True if customer has at least one valid toast."""
        return any(toast.kind == 'brioche' for toast in self.toaster.toasts)
This suffers from severe performance inefficiencies:
- The toaster will be loaded, as well as all of its toasts. This involves creating and issuing the SQL queries, waiting for the database to return, and instantiating all those objects.
- has_valid_toast does not actually care about those objects. It just returns a boolean.
A better way would be to issue a SQL EXISTS query so that the database handles this check and only returns a boolean.
Good:
class Customer(Base):
    @property
    def has_valid_toast(self):
        """Return True if customer has at least one valid toast."""
        query = (session.query(Toaster)
                 .join(Toast)
                 .with_parent(self)
                 .filter(Toast.kind == 'brioche'))
        return session.query(query.exists()).scalar()
This query might not always be the fastest option: if those relationships are small and already eagerly loaded, checking them in Python avoids an extra round trip to the database.
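To try the EXISTS approach end to end, here is a minimal sketch against an in-memory SQLite database. The schema is an assumption, since the original models are not shown: the table names, the toaster_id foreign keys, and the many-to-one link from Customer to Toaster are all hypothetical. It also filters on the foreign key directly instead of using with_parent, and looks the session up with object_session rather than relying on a global.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import (declarative_base, object_session, relationship,
                            sessionmaker)

Base = declarative_base()


class Toaster(Base):
    __tablename__ = "toasters"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    toasts = relationship("Toast", back_populates="toaster")


class Toast(Base):
    __tablename__ = "toasts"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    kind = Column(String)
    toaster_id = Column(Integer, ForeignKey("toasters.id"))
    toaster = relationship("Toaster", back_populates="toasts")


class Customer(Base):
    __tablename__ = "customers"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    # Assumption: each customer points at a single toaster.
    toaster_id = Column(Integer, ForeignKey("toasters.id"))
    toaster = relationship("Toaster")

    @property
    def has_valid_toast(self):
        """Return True if customer has at least one valid toast."""
        if self.toaster_id is None:
            return False  # no toaster, so no toast to check
        session = object_session(self)
        query = (session.query(Toast.id)
                 .filter(Toast.toaster_id == self.toaster_id)
                 .filter(Toast.kind == "brioche"))
        # Only the EXISTS result crosses the wire; no Toast is instantiated.
        return session.query(query.exists()).scalar()


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

happy = Customer(toaster=Toaster(toasts=[Toast(kind="rye"),
                                         Toast(kind="brioche")]))
sad = Customer(toaster=Toaster(toasts=[Toast(kind="rye")]))
session.add_all([happy, sad])
session.commit()

print(bool(happy.has_valid_toast), bool(sad.has_valid_toast))
```

The early return for a missing toaster matters: comparing a column to None with == compiles to IS NULL, which would otherwise match toasts that belong to no toaster at all.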
Explicit session passing
TODO
Bad:
def toaster_exists(toaster_id, session):
    ...
Implicit transaction handling
TODO
Loading the full object when checking for object existence
Bad:
def toaster_exists(toaster_id):
    return bool(session.query(Toaster).filter_by(id=toaster_id).first())
This is inefficient because it:
- Queries all the columns from the database (including any eagerly loaded joins)
- Instantiates and maps all data on the Toaster model
The database query would look something like this. You can see that all columns are selected to be loaded by the ORM.
SELECT toasters.id AS toasters_id, toasters.name AS toasters_name,
       toasters.color AS toasters_color
FROM toasters
WHERE toasters.id = 1
LIMIT 1 OFFSET 0
And then it just checks if the result is truthy.
Here’s a better way to do it:
def toaster_exists(toaster_id):
    query = session.query(Toaster).filter_by(id=toaster_id)
    return session.query(query.exists()).scalar()
In this case, we just ask the database whether a record with this id exists. This is much more efficient: no columns are fetched and no Toaster object is instantiated.
SELECT EXISTS (SELECT 1
FROM toasters
WHERE toasters.id = 1) AS anon_1
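The existence check can be exercised with a small self-contained script. This is only a sketch: the in-memory SQLite engine, the module-level session, and the sample row are assumptions made for illustration, and the Toaster columns are taken from the SELECT shown earlier.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Toaster(Base):
    __tablename__ = "toasters"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    color = Column(String)


# Assumption: a throwaway in-memory database and a module-level session.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(Toaster(id=1, name="Toasty", color="red"))
session.commit()


def toaster_exists(toaster_id):
    query = session.query(Toaster).filter_by(id=toaster_id)
    return session.query(query.exists()).scalar()


# Render the statement to confirm that only an EXISTS subquery is sent.
sql = str(session.query(session.query(Toaster).filter_by(id=1).exists()))
print(sql)

print(bool(toaster_exists(1)), bool(toaster_exists(2)))
```

Printing the rendered statement is a cheap way to check what the ORM actually sends before trusting that a rewrite is cheaper.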
Using identity as comparator
Bad:
toasters = session.query(Toaster).filter(Toaster.deleted_at is None).all()
Unfortunately this won’t work at all. The NULL check never reaches the database, so instead of the toasters that were not deleted, this query returns no toasters whatsoever.
The way SQLAlchemy works is that it overrides the magic comparison methods (__eq__, __lt__, etc.). All comparison methods can be overridden except the identity operator (is), which checks for object identity.
What this means is that the expression Toaster.deleted_at is None will be immediately evaluated by the Python interpreter, and since (presumably) Toaster.deleted_at is a sqlalchemy.orm.attributes.InstrumentedAttribute, it’s not None. The expression is therefore False, and thus it’s equivalent to doing:
toasters = session.query(Toaster).filter(False).all()
This renders the filter useless: SQLAlchemy turns it into a constant false condition, so no records match. The opposite mistake, Toaster.deleted_at is not None, evaluates to True and returns all records, including the ones that were deleted.
There are two ways to fix it:
toasters = session.query(Toaster).filter(Toaster.deleted_at == None).all()
Here we use the equality operator, which Python allows overriding.
Behind the scenes, Python calls Toaster.deleted_at.__eq__(None), which gives SQLAlchemy the opportunity to return a comparator that, when coerced to a string, will evaluate to deleted_at IS NULL.
Most linters will issue a warning for equality comparison against None, so you can also do (this is my preferred solution):
toasters = session.query(Toaster).filter(Toaster.deleted_at.is_(None)).all()
See docs for is_.
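The difference between the three spellings can be checked without a database by looking at what each expression produces. The following is a small sketch; the Toaster model here is a stand-in with an assumed deleted_at column of type DateTime.

```python
from sqlalchemy import Column, DateTime, Integer
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Toaster(Base):
    __tablename__ = "toasters"
    id = Column(Integer, primary_key=True)
    deleted_at = Column(DateTime)  # assumed column type


# Evaluated by Python itself: a plain bool, SQLAlchemy never sees it.
identity = Toaster.deleted_at is None

# Both of these go through SQLAlchemy and build a SQL expression.
equality = Toaster.deleted_at == None  # noqa: E711
explicit = Toaster.deleted_at.is_(None)

print(identity)       # False
print(str(equality))  # toasters.deleted_at IS NULL
print(str(explicit))  # toasters.deleted_at IS NULL
```

The noqa comment is there because the equality form is exactly what linters flag, which is one more reason to prefer is_.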
Returning None instead of raising a NoResultFound exception
See Returning nothing instead of raising NotFound exception.