Friday, December 7, 2012

Join Query on Google App Engine Datastore

App Engine Datastore is a no sql database. That means you cannot  do standard sql queries. But they do have the basic queries using GQL. It is a no SQL database which is very reliable, does not slow down even with terabytes of data, and has a nice indexing mechanism to fetch data.

You can use Google Cloud SQL if you need a relational database, it's a manage mySQL database by google. By the time I'm writing this, they now support up to 100GB of mySQL database, but still limited to the limitations of a standard mySQL and it's coolness.

So if you really can't avoid a de-normalize table like you want to show the name of a user in a list, I'll show you samples. This will all be using python 2.7 and ndb.

from google.appengine.ext import ndb
# just a sample model for how to efficiently join them
class User(ndb.Model):
    name = ndb.StringProperty()
    photo = ndb.BlobKeyProperty()
    always create a text version of your blobkey and store
    the result of images.get_serving_url(blob) to it one time if
    your app allows it
    photo_path = ndb.TextProperty()

class Page(ndb.Model):
    data = ndb.TextProperty()

class Comment(ndb.Model):
    user = ndb.KeyProperty()
    page = ndb.KeyProperty()
    created = ndb.DateTimeProperty(auto_now_add=True)
    message = ndb.TextProperty()

    def get_comment_async(comment):
        result = comment.to_dict()
        user = yield comment.user.get_async()
        result['user'] = { 'name' :, 'photo' : user.photo_path }
        raise ndb.Return(result)

# on your handler
class MainHandler(webapp.RequestHandler):
    def get(self):
        page = Page.get_by_id(self.request.get('id'))
        # now we query comments of the page
        comments = Comment.query( == page.key)
        futures = []
        # another good use of tasklet is to load fields info asynchronously
        for comment in comments:
            futures.append(yield Comment.get_comment_async(comment))
        # ndb will try to batch what it can so it only does few network hops
        # pass your results to your view
        view_data['comments'] = [future.get_result() for future in futures]