SQLAlchemy is one of the commonly used ORMs in Python. Compared to other ORMs like Django ORM or Peewee, it might be a bit more challenging for beginners. However, it provides a rich set of advanced and flexible features. This article focuses on the topic of model inheritance.
The code in this article is based on
Flask-SQLAlchemy
for demonstration. The differences with native SQLAlchemy are minor, primarily:
db.Model
is roughly equivalent tosqlalchemy.ext.declarative.declarative_base
Model.query
is similar todb.session.query(Model)
{: .note}
Single Table Inheritance
The simplest form of table association is single table inheritance. In this inheritance relationship, regardless of how many subclasses are derived, there is always only one table at the database level, distinguished by a polymorphic_identity
column.
Evidently, this is the most efficient query method among inheritance types. However, it doesn’t allow for the flexibility of adding unique fields to each subclass. It’s suitable for scenarios where the data structure of subclasses is the same, but their business logic differs.
For example, consider a user table where ordinary users require permission checks, but administrator users have all permissions. Of course, one could use if
statements for condition checks, but this can significantly reduce code readability and maintainability in complex conditions. A more object-oriented approach can address this problem, for instance:
class User(db.Model):
id = db.Column(db.Integer, primary_key=True)
user_type = db.Column(db.String(10))
__mapper_args__ = {
'polymorphic_identity': 'normal',
'polymorphic_on': user_type
}
def has_permission(self, permission):
return some_permission_check_logic()
class AdminUser(User):
__mapper_args__ = {
'polymorphic_identity': 'admin',
}
def has_permission(self, permission):
return True
In this manner, when we perform database queries, the ORM will automatically use different class mapping objects based on the polymorphic_identity
, for example:
>>> User.query.all()
[<User: xxx@xxx.com>, <User: xxx2@xxx.com>, <AdminUser: xxx3@xxx.com>]
This approach also brings an additional benefit: when we directly create subclass objects, the corresponding polymorphic_identity
is automatically filled in for us:
>>> admin_user = AdminUser()
>>> print(admin_user.user_type)
admin
Automatic Initialization of Subclass Objects
Since creating a subclass automatically fills in the polymorphic_identity
, one might wonder if passing the corresponding value to the parent class could automatically create a subclass object.
Unfortunately, the answer is no:
>>> user = User(user_type='admin', email='xxx@xxx.com')
>>> print(user)
<User: xxx@xxx.com>
>>> # 但是当我们重新从数据库取出对象的时候,可以正确的使用子类
>>> db.session.add(user)
>>> db.session.commit()
>>> user_id = user.id
>>> del user # 这里需要删除已经生成的对象, 否则sqlalchemy会自动复用已经生成的对象
>>> user = User.query.get(user_id)
>>> print(user)
<AdminUser: xxx@xxx.com>
To achieve this effect, one could use metaclasses (see reference #2). However, I personally advise against using this kind of “black magic” for a couple of reasons. First, modifying the deeply customized metaclasses in SQLAlchemy might introduce some unpredictable issues. Second, initializing one class and ending up with an instance of another class is somewhat unconventional. Therefore, I prefer to implement this using a more straightforward approach with class methods:
class User(db.Model):
# ...
@classmethod
def init_for(cls, user_type, **kwargs):
"""根据user_type初始化不同子类的实例"""
children_classes = {
x.polymorphic_identity: x.class_
for x in cls.__mapper__.self_and_descendants
}
return children_classes[user_type](**kwargs)
Note: Single table inheritance should not define __tablename__
for subclasses
Joined Table Inheritance
Many times, we need to deal with more flexible situations where subclasses inherit some attributes from their parent class but also need to define some of their own attributes. Joined table inheritance can help us handle these situations, but it comes with some performance overhead.
As the name implies, joined table inheritance performs a JOIN
query when hitting the database. The data structure definition is actually the subclass’s primary key acts as a foreign key, one-to-one associating to the parent class table’s primary key, and the subclass’s unique data is stored in a separate table.
See the code:
class User(db.Model):
__tablename__ = 'user'
id = db.Column(db.Integer, primary_key=True)
user_type = db.Column(db.String(10))
__mapper_args__ = {
'polymorphic_identity': 'normal',
'polymorphic_on': user_type
}
class WechatUser(User):
__tablename__ = 'user_wechat'
# Define a foreign key to parent
id = db.Column(db.Integer, db.ForeignKey('user.id', primary_key=True))
open_id = db.Column(db.String(40))
__mapper_args__ = {
'polymorphic_identity': 'wechat',
}
Query Behaviour
Let’s take a look at underlying SQL
>>> print(User.query)
SELECT "user".id AS user_id, "user".user_type AS user_user_type
FROM "user"
>>> print(WechatUser.query)
SELECT user_wechat.id AS user_wechat_id, "user".id AS user_id, "user".user_type AS user_user_type, user_wechat.open_id AS user_wechat_open_id
FROM "user" JOIN user_wechat ON "user".id = user_wechat.id
As you can see, when only querying the parent class, the subclass’s table will not be queried; conversely, it will compose a JOIN
query, and by default, it is an INNER JOIN
, meaning when only querying the subclass, the parent class’s related data will not be listed.
>>> db.session.add(WechatUser())
>>> db.session.add(User())
>>> db.session.commit()
>>>
>>> print(User.query.all())
>>> print(WechatUser.query.all())
[<__main__.WechatUser object at 0x1016a3e10>, <__main__.User object at 0x1016a3ed0>]
[<__main__.WechatUser object at 0x10170b510>]
Using `with_polymorphic`` to control queries
By default, if a subclass object is queried from the parent class, an additional query will be initiated when accessing subclass attributes, similar to the behavior when lazy=True
is set in a relationship
we define normally. For example:
Code: (The first object in the User table in the test data is a WechatUser)
user = User.query.first()
print(user.open_id)
query:
SELECT user.id AS user_id, user.user_type AS user_user_type
FROM user
LIMIT 1 OFFSET 0;
SELECT user_wechat.open_id AS user_wechat_open_id;
When the number of times we access subclass properties is low, such queries are completely acceptable. However, when we anticipate frequent access to subclass properties, doing so will result in a large number of queries, and joining all tables at that time would be a better choice.
Fortunately, SQLAlchemy also provides this kind of support:
>>> mixed = db.with_polymorphic(User, [User, WechatUser])
>>> # 也可以写作 mixed = db.with_polymorphic(User, '*') join所有的子表
>>> query = db.session.query(mixed)
>>> print(query)
SELECT "user".id AS user_id, "user".user_type AS user_user_type, user_wechat.id AS user_wechat_id, user_wechat.open_id AS user_wechat_open_id
FROM "user" LEFT OUTER JOIN user_wechat ON "user".id = user_wechat.id
>>> query.all()
[<__main__.WechatUser object at 0x1016a3e10>, <__main__.User object at 0x1016a3ed0>]
with_polymorphic
also supports query,
>>> print(db.session.query(mixed).filter(mixed.WechatUser.open_id.is_(None)))
SELECT "user".id AS user_id, "user".user_type AS user_user_type, user_wechat.id AS user_wechat_id, user_wechat.open_id AS user_wechat_open_id
FROM "user" LEFT OUTER JOIN user_wechat ON "user".id = user_wechat.id
WHERE user_wechat.open_id IS NULL
The documentation also provides some more refined means of controlling queries, such as of_type
, custom Join
, and the use of .options()
with the joinedload
strategy. This article will not elaborate on these here; please refer to the documentation if needed.
Concrete Inheritance
SQLAlchemy also offers another method of inheritance, called concrete
inheritance, which we might translate as concrete inheritance.
In this inheritance relationship, the parent class and each subclass have their own complete table, and there can be no relationship between the tables at the database level. Moreover, in Python, each subclass must completely define every field, which will not be inherited from the parent class.
Because each subclass has its own independent table, unlike the previous two types of inheritance relationships, there is no need for a polymorphic_identity
to distinguish each subclass. This also avoids the need for JOINs and the issue of needing an additional request when encountering query fields.
However, the cost is that since each class has its own independent table, querying subclasses from the parent class requires UNION
ing all the tables, so subclasses should not be too numerous, otherwise, the performance cost can be quite significant.
(So honestly, I don’t quite understand the purpose of this type of inheritance relationship.)
Class definitions are as follows:
from sqlalchemy.ext.declarative import ConcreteBase
class User(ConcreteBase, db.Model): # 注意: 此处应继承ConcreteBase
__tablename__ = 'user'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String(50))
class WechatUser(User):
__tablename__ = 'user_wechat'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String(50)) # 注意: 子类必须重新定义字段
open_id = db.Column(db.String(40))
__mapper_args__ = {
'concrete': True
}
query:
print(User.query)
SELECT pjoin.id AS pjoin_id, pjoin.name AS pjoin_name, pjoin.type AS pjoin_type, pjoin.open_id AS pjoin_open_id
FROM (SELECT "user".id AS id, "user".name AS name, CAST(NULL AS VARCHAR(40)) AS open_id, 'normal' AS type
FROM "user" UNION ALL SELECT user_wechat.id AS id, user_wechat.name AS name, user_wechat.open_id AS open_id, 'wechat' AS type
FROM user_wechat) AS pjoin
print(WechatUser.query)
SELECT pjoin.id AS pjoin_id, pjoin.name AS pjoin_name, pjoin.open_id AS pjoin_open_id, pjoin.type AS pjoin_type
FROM (SELECT user_wechat.id AS id, user_wechat.name AS name, user_wechat.open_id AS open_id, 'wechat' AS type
FROM user_wechat) AS pjoin
As you can see, this query involves UNION
and CAST
, forming quite a complex query. Moreover, as the number of subclasses increases, the complexity of the query will rapidly increase. I have not used this type of inheritance in a production environment (too risky, and I feel it’s unnecessary).
If anyone has a suitable scenario for this type of inheritance relationship, I hope they can point it out so we can learn together.
Code
The code related to this article can be referred to in this gist.