Building Scalable, Complex Apps On App Engine
List properties
What they are, how they work
Example: Microblogging
Maximizing performance
Merge-join
What it is, how it works; list property magic
Example: Modeling the social graph
Moderator
http://tinyurl.com/complextalk
List properties
What is a list property?
# Datastore API used throughout these examples
from google.appengine.ext import db
class Favorites(db.Model):
    colors = db.StringListProperty()
    username = db.StringProperty()
class FavoriteColors(db.Model):
    color = db.StringProperty()
    username = db.StringProperty()
db.GqlQuery(
    "SELECT * FROM FavoriteColors "
    "WHERE username = :1", ...)
Why use list properties? (2)
results = db.GqlQuery(
    "SELECT * FROM FavoriteColors "
    "WHERE color = 'yellow'")
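One way to picture why a membership query on a list property is cheap: the datastore writes one index row per list element, so an equality filter becomes a lookup on a single value. The sketch below models that idea in plain Python. The helpers `build_index` and `query_equal` and the sample data are hypothetical illustrations, not App Engine APIs.

```python
from collections import defaultdict

def build_index(entities, prop):
    """Map each value of a list property to the keys of entities holding it."""
    index = defaultdict(set)
    for key, entity in entities.items():
        for value in entity[prop]:  # one index row per list element
            index[value].add(key)
    return index

def query_equal(index, value):
    """Equality filter on a list property: matches if ANY element equals value."""
    return sorted(index.get(value, ()))

# Made-up sample data standing in for Favorites entities.
favorites = {
    "alice": {"colors": ["yellow", "blue"]},
    "bob": {"colors": ["red"]},
    "carol": {"colors": ["green", "yellow"]},
}
colors_index = build_index(favorites, "colors")
yellow_fans = query_equal(colors_index, "yellow")
```

Here `yellow_fans` holds the keys of every entity whose list contains 'yellow', found without scanning the other entities.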
Gotchas
Uses more CPU for serializing/deserializing the entity when it's accessed
Works with sort orders only if querying a single list property; otherwise indexes "explode"
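To make the "exploding" index gotcha concrete, the following sketch counts the rows a composite index would need for one entity. It assumes the index stores one row per combination of values across the indexed properties. The function `composite_index_rows`, the `tags` and `date` properties, and the list sizes are all hypothetical.

```python
from itertools import product

def composite_index_rows(entity, props):
    """Enumerate index rows for a composite index over the given properties.

    Single-valued properties are treated as one-element lists.
    """
    value_lists = []
    for name in props:
        value = entity[name]
        value_lists.append(value if isinstance(value, list) else [value])
    return list(product(*value_lists))

# Made-up entity: 100 receivers, 50 tags, one sortable value.
message = {
    "receivers": ["user%d" % i for i in range(100)],
    "tags": ["tag%d" % i for i in range(50)],
    "date": 0,
}
one_list = composite_index_rows(message, ["receivers", "date"])
two_lists = composite_index_rows(message, ["receivers", "tags", "date"])
```

With one list property the row count equals the list length (100 here). Adding a second list property multiplies the counts (100 x 50 = 5000 rows for a single entity), which is why sort orders are only practical with a single list property.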
Concrete example: Microblogging
UsersMessages table
User ID    Message ID
1          56
1          82
Concrete example: Microblogging, with RDBMS (2)
class Message(db.Model):
    sender = db.StringProperty()
    receivers = db.StringListProperty()
    body = db.TextProperty()
results = db.GqlQuery(
    "SELECT * FROM Message "
    "WHERE receivers = :1", me)
That's it!
This is how Jaiku works
Concrete example: Microblogging, with JDO
@PersistenceCapable(
    identityType=IdentityType.APPLICATION)
public class Message {
    @PrimaryKey
    @Persistent(valueStrategy=
        IdGeneratorStrategy.IDENTITY)
    Long id;
    // Assumed: the list field that the query below filters on
    @Persistent
    List<String> receivers;
}

PersistenceManager pm = PMF.get().getPersistenceManager();
Query query = pm.newQuery(Message.class);
query.setFilter("receivers == 'foo'");
List<Message> results =
    (List<Message>) query.execute();
Concrete example: Microblogging Demo
List property performance
class Message(db.Model):
    sender = db.StringProperty()
    body = db.TextProperty()
class MessageIndex(db.Model):
    receivers = db.StringListProperty()
Solution-- Relation index entities (2)
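The point of splitting the receivers list into a separate index entity is that a query never has to load the large list just to return message bodies. A minimal in-memory sketch of the two-step lookup follows. It assumes each index entity is stored as a child of its Message, modeled here with a hypothetical `parent` field; the dictionaries and the `messages_for` helper are stand-ins, not datastore calls.

```python
# In-memory stand-ins for the two kinds; keys and contents are made up.
messages = {
    "msg1": {"sender": "alice", "body": "hello"},
    "msg2": {"sender": "bob", "body": "lunch?"},
}
# Each index entity records its parent message key plus the receivers list.
message_indexes = {
    "idx1": {"parent": "msg1", "receivers": ["bob", "carol"]},
    "idx2": {"parent": "msg2", "receivers": ["carol"]},
}

def messages_for(receiver):
    """Two-step lookup: match index entities, then fetch the parent messages."""
    # Step 1: find matching index entities and keep only their parent keys,
    # so the (possibly huge) receivers lists are never handed to the caller.
    parent_keys = [
        idx["parent"]
        for _, idx in sorted(message_indexes.items())
        if receiver in idx["receivers"]
    ]
    # Step 2: fetch the lightweight Message entities by key.
    return [messages[key] for key in parent_keys]

inbox = messages_for("carol")
```

On the datastore itself, step 1 would presumably be a keys-only query on MessageIndex and step 2 a batch get of the parent keys. That mapping is an assumption about how the pattern is applied, not something these lines spell out.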
Scalable indexes
Merge-join
What is merge-join?
Example
Why use merge-join?
class Animal(db.Model):
    has = db.StringListProperty()
    color = db.StringProperty()
    legs = db.IntegerProperty()
results = db.GqlQuery(
    """SELECT * FROM Animal WHERE
       color = 'spots' AND
       has = 'horns' AND
       legs = 4""")
How does merge-join work?
(Tables represent property indexes)
Row key
has=hair,key=cat
has=horns,key=cow
has=jaws,key=lion
has=jaws,key=shark
Example merge-join
(Tables represent property indexes)
Row key
has=hair,key=cat
has=horns,key=cow
has=jaws,key=lion
has=jaws,key=shark
Zig! Zag! (animation frames: the scan steps back and forth between the property indexes)
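The zig-zag walk over the property indexes can be sketched in plain Python. This is a simplified model, not the datastore's implementation: each index is assumed to be a sorted list of unique entity keys matching one equality filter, and the scan seeks forward in each list to the current candidate key. The function `zigzag_merge_join` and the sample animals are hypothetical.

```python
from bisect import bisect_left

def zigzag_merge_join(indexes):
    """Intersect sorted key lists by leapfrogging between them."""
    if not indexes or any(not idx for idx in indexes):
        return []
    results = []
    candidate = indexes[0][0]
    while True:
        agreed = True
        for idx in indexes:
            # Seek to the first key >= candidate in this index.
            pos = bisect_left(idx, candidate)
            if pos == len(idx):
                return results  # one index is exhausted: no more matches
            if idx[pos] != candidate:
                # Zig (or zag): jump ahead to the larger key and start over.
                candidate = idx[pos]
                agreed = False
                break
        if agreed:
            results.append(candidate)
            # Advance just past the match in the first index.
            pos = bisect_left(indexes[0], candidate) + 1
            if pos == len(indexes[0]):
                return results
            candidate = indexes[0][pos]

# Made-up index contents for color='spots', has='horns', legs=4.
spots = ["cow", "giraffe", "leopard"]
horns = ["cow", "giraffe", "goat", "rhino"]
four_legs = ["cat", "cow", "giraffe", "goat", "leopard", "lion", "rhino"]
matches = zigzag_merge_join([spots, horns, four_legs])
```

Each mismatch skips a whole run of keys instead of visiting them one by one, so the cost tracks the number of zigs and zags rather than the size of the largest index.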
Concrete example: Social graph
Person table
User ID    Location         ...
1          San Francisco    ...
2          New York         ...
Friends table
UserA ID    UserB ID
56          5
57          1
Concrete example: Social graph, with RDBMS (2)
class Person(db.Model):
    location = db.StringProperty()
    friends = db.StringListProperty()
db.GqlQuery(
    """SELECT * FROM Person WHERE
       friends = :1 AND
       friends = :2 AND
       location = 'San Francisco'""",
    me, otherguy)
That's it!
Add as many equality filters as you need
Concrete example: Social graph Demo
Merge-join performance
Gotchas
Watch out for pathological datasets!
Too many overlapping values = lots of zig-zagging
Doesn't work with composite indexes because of "exploding" index combinations
That means you can't apply sort orders!
Must sort in memory
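Sorting in memory means fetching every matching entity first and ordering the results in application code, which is only practical when the result set is small. A minimal sketch, with a hypothetical `fetch_and_sort` helper and made-up records standing in for query results:

```python
from operator import itemgetter

def fetch_and_sort(results, sort_key, limit=None, reverse=False):
    """Order already-fetched results in memory, then apply an optional limit."""
    ordered = sorted(results, key=itemgetter(sort_key), reverse=reverse)
    return ordered if limit is None else ordered[:limit]

# Made-up stand-ins for entities returned by a merge-join query.
people = [
    {"name": "carol", "joined": 2008},
    {"name": "alice", "joined": 2006},
    {"name": "bob", "joined": 2007},
]
newest_first = fetch_and_sort(people, "joined", limit=2, reverse=True)
```

Note that the limit is applied after the sort, so the query itself cannot be limited: all matches have to be fetched before the top results are known.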
Wrap-up