原文: http://code.google.com/appengine/articles/modeling.html
Modeling Entity Relationships
实体关系建模
Rafe Kaplan, Software Engineer
June 2008
Introduction
导言
Sure, the Getting Started Guide tells you what you need to know in order
to fill properties for a simple App Engine model, but it's going to take more
than that if you want to be able to represent real-world concepts in the
datastore. Whether you are new to web application development or are used to
working with SQL databases, this article is for people who are ready to
take a step into the next dimension of App Engine data representation.
“快速入门”告诉你怎么使用一个简单的AppEngine model, 怎么使用属性. 但是如果你想在datastore中建立起真实世界的映射比这复杂的多. 不管你是一个web应用开发新手还是已经习惯使用SQL数据库, 本文会深入介绍在AppEngine中数据建模.
Why would I need entity relationships?
为什么我需要实体关系?
Imagine you are building a snazzy new web application that includes an
address book where users can store their contacts. For each contact the user
stores, you want to capture the contact's name, birthday (which they mustn't
forget!), address, telephone number, and the company they work for.
设想你在创建一个很酷的web应用, 其中包括了一个地址簿, 让用户存储他们的联系人. 对每个联系人你想保持姓名, 生日(这可不能忘记!), 地址, 电话号码和工作的公司.
When the user wants to add an address, they enter the information into a
form and the form saves the information in a model that looks something like
this:
当用户想增加一个地址时, 他们在一个表单中输入, 然后表单在一个如下的模块中保存地址信息:
from google.appengine.ext import db

class Contact(db.Model):
    # Basic info.
    name = db.StringProperty()
    birth_day = db.DateProperty()
    # Address info.
    address = db.PostalAddressProperty()
    # Phone info.
    phone_number = db.PhoneNumberProperty()
    # Company info.
    company_title = db.StringProperty()
    company_name = db.StringProperty()
    company_description = db.StringProperty()
    company_address = db.PostalAddressProperty()
That's great: your users immediately begin to use their address book and
soon the datastore starts to fill up. Not long after the deployment of your new
application, you hear from someone that they are not happy that there is only
one phone number. What if they want to store someone's work telephone number in
addition to their home number? No problem, you think; you can just add a work
phone number to your structure. You change your data structure to look more
like this:
挺不错, 用户马上开始使用他们的地址簿, 开始输入联系人. 没有多久, 你听说用户对一个联系人只能有一个电话号码不满意. 如果想保持某个人的家里电话和工作电话怎么办? 没有问题, 你想, 只要在Contact中加入一个工作电话号码就成. 你把数据结构改成如下:
    # Phone info.
    phone_number = db.PhoneNumberProperty()
    work_phone_number = db.PhoneNumberProperty()
Update the form with the new field and you are back in business. Soon
after redeploying your application, you get a number of new complaints. When
they see the new phone number field, people start asking for even more fields.
Some people want a fax number field, others want a mobile field. Some people
even want more than one mobile field (boy, modern life sure is hectic)! You
could add another field for fax, and another for mobile, maybe two. What
if people have three mobile phones? What if they have ten? What if someone invents
a phone for a place you've never thought of?
修改了表单, 加入新的字段后, 马上投入了使用. 上线后没有多久, 你收到不少新的抱怨. 当用户们看到新的电话号码字段后, 开始要求再增加几个. 有些想加传真号码, 有些想加手机, 甚至有些用户要求加入不止一个手机号码(天哪, 现代生活可真够忙的)! 你可能会为传真加一个字段, 然后是手机, 也许是两个手机号码. 但如果用户要三个手机号码呢? 如果要十个呢? 如果有一天一种你从来没有想到过的号码被发明出来了呢?
Your model needs to use relationships.
你的数据模型需要使用关系.
One to Many
一对多
The answer is to allow users to assign as many phone numbers to each of
their contacts as they like. To do this, you need to model the phone numbers in
their own class and have a way of associating many phone numbers to a single
Contact. You can easily model the one to many relationship using
ReferenceProperty. Here is a candidate for this new class:
答案是允许用户为联系人添加任意多个电话号码. 为此, 你需要把电话号码单独放在一个类里面, 并且想办法把多个电话号码与一个联系人关联起来. 使用ReferenceProperty可以很容易地为一对多关系建模, 类似这样:
class Contact(db.Model):
    # Basic info.
    name = db.StringProperty()
    birth_day = db.DateProperty()
    # Address info.
    address = db.PostalAddressProperty()
    # The original phone_number property has been replaced by
    # an implicitly created property called 'phone_numbers'.
    # Company info.
    company_title = db.StringProperty()
    company_name = db.StringProperty()
    company_description = db.StringProperty()
    company_address = db.PostalAddressProperty()

class PhoneNumber(db.Model):
    contact = db.ReferenceProperty(Contact,
                                   collection_name='phone_numbers')
    phone_type = db.StringProperty(
        choices=('home', 'work', 'fax', 'mobile', 'other'))
    number = db.PhoneNumberProperty()
The key to making all this work is the contact property. By defining it as
a ReferenceProperty, you have created a property that can only be assigned
values of type Contact. Every time you define a reference property, it creates
an implicit collection property on the referenced class. By default, this
collection is called <name-of-class>_set. In this case, it would
make a property Contact.phonenumber_set. However, it is probably more intuitive
to call that attribute phone_numbers, so the example overrides the default name
using the collection_name keyword parameter to ReferenceProperty.
关键就在contact属性. 因为contact属性是ReferenceProperty, 你只能把Contact类型赋值给这个属性. 每当你定义了一个ReferenceProperty属性, 会在被引用的类上自动创建一个collection属性. 这个collection缺省名字是<类名>_set. 比如说在这个例子中, 缺省名是Contact.phonenumber_set. 但是如果我们叫它phone_numbers可能会更直观些. 使用关键字collection_name, 你可以指定一个新名字, 覆盖掉缺省名.
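To make the implicit collection less mysterious, here is a minimal plain-Python sketch of the idea. It is not the App Engine db API and makes no claim about how ReferenceProperty is implemented: the Model and Reference classes, the in-memory _saved registry, and the Email class are all hypothetical stand-ins, and the code is written for Python 3 so that it runs outside the SDK.

```python
# Plain-Python illustration only; this is NOT the App Engine db API.

class Model:
    """Hypothetical base class that remembers every saved instance."""
    _saved = []

    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)

    def put(self):
        if self not in Model._saved:
            Model._saved.append(self)
        return self


class Reference:
    """Hypothetical stand-in for a reference property.

    Declaring it on one class adds a read-only collection to the
    referenced class, named <lowercased owner class>_set by default.
    """

    def __init__(self, target, collection_name=None):
        self.target = target
        self.collection_name = collection_name

    def __set_name__(self, owner, attr):
        self.attr = attr
        name = self.collection_name or owner.__name__.lower() + '_set'

        def collection(instance):
            # Every saved `owner` whose reference points at `instance`.
            return [obj for obj in Model._saved
                    if isinstance(obj, owner)
                    and getattr(obj, attr, None) is instance]

        setattr(self.target, name, property(collection))

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.attr)

    def __set__(self, obj, value):
        # Only instances of the referenced class may be assigned.
        if not isinstance(value, self.target):
            raise TypeError('expected %s' % self.target.__name__)
        obj.__dict__[self.attr] = value


class Contact(Model):
    pass


class PhoneNumber(Model):
    # Explicit name, as in the article.
    contact = Reference(Contact, collection_name='phone_numbers')


class Email(Model):
    # No name given, so the collection becomes Contact.email_set.
    contact = Reference(Contact)


scott = Contact(name='Scott').put()
PhoneNumber(contact=scott, phone_type='home',
            number='(650) 555 - 2200').put()
Email(contact=scott, address='scott@example.com').put()
```

After this runs, scott.phone_numbers holds the one PhoneNumber and scott.email_set holds the one Email, while Contact has no phonenumber_set attribute because the default name was overridden.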
Creating the relationship between a contact and one of its phone numbers
is easy to do. Let's say you have a contact named "Scott" who has a
home phone and a mobile phone. You populate his contact info like this:
在联系人和电话号码之间创建联系很容易. 比如说你有一个联系人叫Scott, 他有一个家庭号码和一个手机. 你会这样设定联系人信息:
scott = Contact(name='Scott')
scott.put()
PhoneNumber(contact=scott,
            phone_type='home',
            number='(650) 555 - 2200').put()
PhoneNumber(contact=scott,
            phone_type='mobile',
            number='(650) 555 - 2201').put()
Because ReferenceProperty creates this special property on Contact, it
makes it very easy to retrieve all the phone numbers for a given person. If you
wanted to print all the phone numbers for a given person, you can do it like
this:
有了ReferenceProperty自动在Contact上创建的collection属性, 要得到一个联系人的所有电话号码就很容易了. 如果你要打印出某个人的所有电话号码, 你可以这样做:
print 'Content-Type: text/html'
print
for phone in scott.phone_numbers:
    print '%s: %s' % (phone.phone_type, phone.number)
This will produce results that look like:
home: (650) 555 - 2200
mobile: (650) 555 - 2201
Note: The order of the output might be
different as by default there is no ordering in this kind of relationship.
注意: 输出顺序可能会不一样, 因为缺省情况下没有排序.
The phone_numbers virtual attribute is a Query instance, meaning that you
can use it to further narrow down and sort the collection associated with the
Contact. For example, if you only want to get the home phone numbers, you can
do this:
虚拟属性phone_numbers其实是一个Query, 意味著你可以用它来进一步缩小结果集, 排序. 比如说, 如果你只想要返回家庭号码, 你可以这样做
scott.phone_numbers.filter('phone_type =', 'home')
When Scott loses his phone, it's easy enough to delete that record. Just
delete the PhoneNumber instance and it can no longer be queried for:
当Scott弄丢了手机, 非常容易删除那条记录. 只要把对应的实例删除就可以了:
scott.phone_numbers.filter('phone_type =',
                           'mobile').get().delete()
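Chaining get() straight into delete() assumes that a matching record exists; a get() that finds nothing returns None, and calling delete() on None would fail. The following plain-Python sketch shows the filter, order, and get pattern with that guard in place. The Query class here is a hypothetical in-memory stand-in written for Python 3, not the datastore class: it supports only equality filters and ascending order, and it returns a new object from filter() rather than narrowing the existing query.

```python
# Plain-Python illustration only; this is NOT the App Engine db API.

class Query:
    """Hypothetical in-memory stand-in for a datastore query."""

    def __init__(self, rows):
        self._rows = list(rows)

    def filter(self, prop_and_op, value):
        # Only the 'prop =' form used in the article is sketched here.
        prop, op = prop_and_op.split()
        if op != '=':
            raise ValueError('only equality filters are sketched')
        return Query(r for r in self._rows if r[prop] == value)

    def order(self, prop):
        # Ascending order on a single property.
        return Query(sorted(self._rows, key=lambda r: r[prop]))

    def get(self):
        # First match, or None when nothing matches.
        return self._rows[0] if self._rows else None

    def __iter__(self):
        return iter(self._rows)


phones = Query([
    {'phone_type': 'mobile', 'number': '(650) 555 - 2201'},
    {'phone_type': 'home', 'number': '(650) 555 - 2200'},
])

home = phones.filter('phone_type =', 'home').get()
fax = phones.filter('phone_type =', 'fax').get()  # Scott has no fax number

deleted = False
if fax is not None:
    # Only here would it be safe to delete the record.
    deleted = True

ordered_types = [p['phone_type'] for p in phones.order('phone_type')]
```

Sorting explicitly, as in the last line, is also the way to get a predictable order instead of relying on whatever order the collection happens to return.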
Many to Many
多对多
One thing you would like to do is provide the ability for people to
organize their contacts into groups. They might make groups like
"Friends", "Co-workers" and "Family". This would
allow users to use these groups to perform actions en masse, such as
sending an invitation to all their friends for a hack-a-thon. Let's define a
simple Group model like this:
你想要让用户能够对联系人分组. 可能会有类似”朋友”, “同事”和”家庭”这样的分组. 这样用户可以进行批量操作, 如给所有朋友发送” hack-a-thon”活动(Google的邀请. 我们这样定义一个分组:
class Group(db.Model):
    name = db.StringProperty()
    description = db.TextProperty()
You could make a new ReferenceProperty on Contact called group. However,
this would allow contacts to be part of only one group at a time, which is too
restrictive: for example, someone might include some of their co-workers as
friends. You need a way to represent many-to-many relationships.
你可以在Contact中定义一个叫group的新ReferenceProperty. 但是这样就只允许一个联系人属于一个分组. 举例来说, 某个人把他的一些同事也放在朋友组. 你需要一种方法来表示多对多的关系.
List of Keys
One very simple way is to create a list of keys on one side of the
relationship:
一种很简单的办法是在关系的其中一方创建一个keys的列表:
class Contact(db.Model):
    # User that owns this entry.
    owner = db.UserProperty()
    # Basic info.
    name = db.StringProperty()
    birth_day = db.DateProperty()
    # Address info.
    address = db.PostalAddressProperty()
    # Company info.
    company_title = db.StringProperty()
    company_name = db.StringProperty()
    company_description = db.StringProperty()
    company_address = db.PostalAddressProperty()
    # Group affiliation
    groups = db.ListProperty(db.Key)
Adding a contact to a group, or removing one from it, means working with a list
of keys:
将某个用户从group中增加和删除相当于对keys列表操作:
friends = Group.gql("WHERE name = 'friends'").get()
mary = Contact.gql("WHERE name = 'Mary'").get()
if friends.key() not in mary.groups:
    mary.groups.append(friends.key())
    mary.put()
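The snippet above covers adding; removal follows the same pattern on the same list. As a rough sketch in plain Python 3 (no datastore involved), the two operations can be written as small helpers that report whether the list actually changed, which tells the caller whether a save is needed. The helper names join_group and leave_group and the string used as a key are hypothetical.

```python
# Plain-Python illustration only; keys are modelled as strings.

def join_group(group_keys, key):
    """Append key unless already present; return True if the list changed."""
    if key in group_keys:
        return False
    group_keys.append(key)
    return True


def leave_group(group_keys, key):
    """Remove key if present; return True if the list changed."""
    if key not in group_keys:
        return False
    group_keys.remove(key)
    return True


mary_groups = []
friends_key = 'Group:friends'  # hypothetical key value

first = join_group(mary_groups, friends_key)   # list changes, save needed
second = join_group(mary_groups, friends_key)  # already a member, no save
snapshot = list(mary_groups)                   # membership after joining
removed = leave_group(mary_groups, friends_key)
```

In the article's model the equivalent would be to change mary.groups and call mary.put() only when the helper reports a change.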
To get all the members of a group, you can execute a simple query. It
might help to add a helper function to the Group entity:
要得到一个组的所有成员, 你可以执行一个简单的查询. 在Group类中加入一个helper函数:
class Group(db.Model):
    name = db.StringProperty()
    description = db.TextProperty()

    @property
    def members(self):
        return Contact.gql("WHERE groups = :1", self.key())
There are a few limitations to implementing many-to-many relationships
this way. First, you must explicitly retrieve the values on the side of the
collection where the list is stored, since all you have available are Key
objects. Second, and more importantly, you want to avoid storing overly
large lists of keys in a ListProperty. This means you should place the list on
the side of the relationship which you expect to have fewer values. In the example
above, the Contact side was chosen because a single person is not likely to belong
to too many groups, whereas in a large contacts database, a group might contain
hundreds of members.
这种方式实现多对多关系有一些限制. 首先在使用列表一方你只保存了keys, 当你需要返回数据时, 必需你自己手工根据keys读取数据. 另外一个更重要的是你要避免在list中保存太多的keys. 这意味着你应该在list项比较少一方放你的list. 在这个例子中, 一个group会有很多个成员, 但是一个人不大可能属于很多个group, 所以我们在Contact中保存groups list, 而不是相反.
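The asymmetry between the two directions can be sketched in plain Python 3. The dictionaries below are hypothetical stand-ins for the datastore, keyed by made-up string keys. Finding the members of a group is a single filter, because an equality test against a list property matches when any element of the list equals the value; finding the groups of a contact requires looking up each stored key explicitly.

```python
# Plain-Python illustration only; dicts stand in for the datastore.

groups = {
    'g1': {'name': 'Friends'},
    'g2': {'name': 'Co-workers'},
}

contacts = {
    'c1': {'name': 'Mary', 'groups': ['g1', 'g2']},   # short list per contact
    'c2': {'name': 'Scott', 'groups': ['g1']},
}


def members(group_key):
    """Names of contacts whose key list contains group_key (one filter)."""
    return sorted(c['name'] for c in contacts.values()
                  if group_key in c['groups'])


def groups_of(contact_key):
    """Group names for one contact: every stored key is looked up."""
    return [groups[key]['name'] for key in contacts[contact_key]['groups']]


friends_members = members('g1')
mary_group_names = groups_of('c1')
```

Each contact carries only as many keys as the groups it belongs to, which is why the list sits on the Contact side. With the real API, the explicit lookup would typically be a batch fetch of the stored keys rather than a dictionary access.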
Relationship Model
关系模型
One of your users is a big-time saleswoman and knows teams of people in
just one company. She is finding it very tedious to have to enter the same
information about the same company again and again. Couldn't there be a way to
specify a company once and then associate it with each person? If it were
that simple, it would merely be necessary to have a one-to-many relationship
between Contact and Company, but it's more complicated than that. Some of her
contacts are contractors who work at more than one company and have different
titles in each. What now?
你的一个用户是个顶级销售员. 在一个公司她(对, 她)认识很多人. 她觉得要不断地输入同一个公司的信息很繁琐. 难道不能输入一次公司信息然后把它和人关联起来吗? 如果就这么简单的话, 就只需要在联系人和公司之间建立一对多的关系, 但是实际情况有点复杂. 有些联系人是Contractor(合同工?), 不只在一个公司工作, 也不只一个头衔. 现在怎么办?
You need a many-to-many relationship that can describe some additional
information about that relationship. To accomplish this, you can use another
Model to describe the relationship:
你需要建立多对多关系, 而且还需要关于关系的信息. 为此你可以用一个专门的数据模型来描述这种关系:
class Contact(db.Model):
    # User that owns this entry.
    owner = db.UserProperty()
    # Basic info.
    name = db.StringProperty()
    birth_day = db.DateProperty()
    # Address info.
    address = db.PostalAddressProperty()
    # The original organization properties have been replaced by
    # an implicitly created property called 'companies'.
    # Group affiliation
    groups = db.ListProperty(db.Key)

class Company(db.Model):
    name = db.StringProperty()
    description = db.StringProperty()
    company_address = db.PostalAddressProperty()

class ContactCompany(db.Model):
    contact = db.ReferenceProperty(Contact,
                                   required=True,
                                   collection_name='companies')
    company = db.ReferenceProperty(Company,
                                   required=True,
                                   collection_name='contacts')
    title = db.StringProperty()
Adding someone to a company is done by creating a ContactCompany instance:
把某人加到一个公司就是创建一个ContactCompany实例:
mary = Contact.gql("WHERE name = 'Mary'").get()
google = Company.gql("WHERE name = 'Google'").get()
ContactCompany(contact=mary,
               company=google,
               title='Engineer').put()
In addition to being able to store information about a
relationship, this method has the advantage over the list-of-keys method
that you can have large collections on either side of the relationship.
However, you need to be very careful, because traversing the connections of a
collection will require more calls to the datastore. Use this kind of
many-to-many relationship only when you really need to, and do so with care for
the performance of your application.
除了可以保存关于关系的信息, 比起列表这种实现方式还有一个优点, 可以维护很大的关系集合. 然而, 你要非常小心, 逐个遍历集合中的数据会多次访问datastore. 只有当你确实需要的时候才使用这种多对多实现, 而且要注意应用程序的性能.
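To see where the extra datastore calls come from, here is a small plain-Python 3 sketch that counts them. FakeDatastore is a hypothetical in-memory stand-in, not the App Engine API, and the second company and its title are made-up sample data. Listing Mary's companies takes one query for her ContactCompany records plus one fetch per record to reach the Company on the other side.

```python
# Plain-Python illustration only; this is NOT the App Engine db API.

class FakeDatastore:
    """Hypothetical in-memory store that counts read calls."""

    def __init__(self):
        self.entities = {}
        self.calls = 0

    def put(self, key, entity):
        self.entities[key] = entity

    def get(self, key):
        self.calls += 1
        return self.entities[key]

    def query(self, kind, prop, value):
        # One call, however many entities match.
        self.calls += 1
        return [entity for key, entity in sorted(self.entities.items())
                if key.startswith(kind + ':') and entity.get(prop) == value]


store = FakeDatastore()
store.put('Contact:mary', {'name': 'Mary'})
store.put('Company:google', {'name': 'Google'})
store.put('Company:acme', {'name': 'Acme'})
store.put('ContactCompany:1', {'contact': 'Contact:mary',
                               'company': 'Company:google',
                               'title': 'Engineer'})
store.put('ContactCompany:2', {'contact': 'Contact:mary',
                               'company': 'Company:acme',
                               'title': 'Consultant'})

# One query for the relationship records...
links = store.query('ContactCompany', 'contact', 'Contact:mary')
# ...then one fetch per record to follow the reference to the company.
jobs = [(store.get(link['company'])['name'], link['title'])
        for link in links]
```

The number of reads grows with the number of relationship records (here 1 query plus 2 fetches), which is the cost the paragraph above warns about when collections get large.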
Conclusion
结论
App Engine allows the creation of easy-to-use relationships between
datastore entities, which can represent real-world things and ideas. Use
ReferenceProperty when you need to associate an arbitrary number of repeated
types of information with a single entity. Use key lists when you need to allow
lots of different objects to share other instances between each other. You will
find that these two approaches will provide you with most of what you need to
create the model behind great applications.
AppEngine中能很容易地创建实体关系, 来描述真实世界的事物和想法. 使用ReferenceProperty当你需要关联某种类型的多个实例到一个实体. 使用keys列表当你允许不同的对象互相之间关联(?这句很难翻). 在大多数情况下, 你会发现这两种方式就能满足你的需要.