[18-C-4] Google App Engine - To Infinity and Beyond

Takashi Matsuo
Google, Inc.
Feb 18, 2011

App Engine: to infinity and beyond


About me

Takashi Matsuo @tmatsuo

App Engine Developer Advocate
Kay's Daddy
http://code.google.com/p/kay-framework/


> 100,000 developers / month

> 150,000 apps / week

> 1 billion page views / day

Apr 2008 Python launch
May 2008 Memcache API, Images API
Jul 2008 Logs export
Aug 2008 Batch write/delete
Oct 2008 HTTPS support
Dec 2008 Status dashboard, quota details
Feb 2009 Billing, Remote API, Larger HTTP request/response size limits (10MB)
Apr 2009 Java launch, Bulkloader (DB import), Cron jobs, SDC
May 2009 Key-only queries, Quota API
Jun 2009 Task queue API, Django 1.0 support
Sep 2009 XMPP API, Remote API shell, Django 1.1 support
Oct 2009 Incoming email
Dec 2009 Blobstore API
Feb 2010 Datastore cursors, Async URLfetch, App stats
Mar 2010 Denial-of-Service filtering, eventual consistency support
May 2010 OpenID, OAuth, App Engine for Business, new bulkloader
Aug 2010 Namespaces, increased quotas, high perf image serving
Oct 2010 Instances console, datastore admin & bulk entity deletes
Dec 2010 Channel API, 10-minute tasks & cron jobs, AlwaysOn & Warmup
Jan 2011 High Replication datastore, entity copy b/w apps, 10-minute URLfetch
Feb 2011 Improved XMPP and Task Queue, Django 1.2 support

App Engine's first 3 years - a platform that keeps evolving

Roadmap

SSL access on non-appspot.com domains
Full-text Search over Datastore
Support for Python 2.7
Background servers capable of running for longer than 30s
Support for running MapReduce jobs across App Engine datasets
Bulk Datastore Import and Export tool
Improved monitoring and alerting of application serving
Logging system improvements to remove limits on size and storage
Raise HTTP request and response size limits
Integration with Google Storage for Developers
Programmatic Blob creation in Blobstore
Quota and presence improvements for Channel API

http://code.google.com/apis/storage/

Case studies in Japan

Many mixi apps
the Actress - 600,000 users
Mixi Xmas 2010 - 2 million users

Asahi Shimbun - media distribution service
Sony - Chan-Toru
Many business applications as well

What is App Engine for Business?

In addition to the App Engine platform:

99.9% SLA
Cloud SQL
Paid support
Domain console
Hosted SSL

Books are available

Programming Google App Engine (Japanese edition)
http://www.oreilly.co.jp/books/9784873114750/

オープンソース徹底活用 Slim3 on Google App Engine for Java
ISBN-10: 4798026999


Agenda

Things to keep in mind when developing on App Engine

Datastore design
Do the minimum amount of work
Avoid datastore contention

  Sharded counters
  Fork-join queue

Use Memcache effectively
Scaling even higher

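Datastore design

Don't shy away from denormalization - there are no joins
Create only the indexes you need
Keep Entity Groups as small as possible

  Use them only where transactions are required

Use the faster option when you can

  A keys_only query is faster than a query
  A get is faster than a query
  A batch get is faster than multiple gets

Cases where splitting a kind pays off

  Fetching only some of the properties

Accept a small number of failures

  Retries, specifying a deadline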

Datastore design - denormalization

Before

    from google.appengine.ext import db

    class User(db.Model):
        name = db.StringProperty()
        groups = db.ListProperty(db.Key)

    class Group(db.Model):
        name = db.StringProperty()

    user = User.get(user_key)
    # A second datastore round trip is needed just to read the group names
    group_names = [group.name for group in db.get(user.groups)]

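Datastore design - denormalization

After

    from google.appengine.ext import db

    class User(db.Model):
        name = db.StringProperty()
        groups = db.ListProperty(db.Key)
        # Denormalized copy of each group's name, stored on the user
        group_names = db.StringListProperty()

    class Group(db.Model):
        name = db.StringProperty()

    user = User.get(user_key)
    # The names come back with the user entity; no second fetch
    group_names = user.group_names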

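Datastore design - create only the indexes you need

    from google.appengine.ext import db

    class MyModel(db.Model):
        name = db.StringProperty()
        # Never filtered or sorted on, so skip the index
        total = db.IntegerProperty(indexed=False)

Every additional index makes writes slower.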

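Datastore design - Entity Group

How to create an Entity Group

    from google.appengine.ext import db

    class MyModel(db.Model):
        pass  # ...

    class MyOtherModel(db.Model):
        pass  # ...

    # An entity created on its own becomes the root of a new Entity Group
    my_entity = MyModel()
    my_entity.put()

    # Specifying a parent places the entity in the parent's Entity Group
    my_second_entity = MyModel(parent=my_entity)
    my_second_entity.put()

    # Parent and child do not have to be the same kind
    my_third_entity = MyOtherModel(parent=my_second_entity)
    my_third_entity.put()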

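Datastore design - Entity Group

Do not use Entity Groups for every parent-child relationship

  For example, they are a poor fit for BlogEntry and Comment

As a rule, use them where transactions are required
They are also useful when you build your own search index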

Datastore design - Entity Group usage example

Search index - Before

    from google.appengine.ext import db

    class Sentence(db.Model):
        body = db.TextProperty()
        indexes = db.StringListProperty()

    query = Sentence.all().filter("indexes =", search_word)
    search_result = query.fetch(20)

Every fetch deserializes the indexes property, even though the results do not need it.

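Datastore design - Entity Group usage example

Search index - After

    from google.appengine.ext import db

    class Sentence(db.Model):
        body = db.TextProperty()

    class SearchIndex(db.Model):
        indexes = db.StringListProperty()

    # Query only the keys of the index entities, then batch-get their parents
    query = SearchIndex.all(keys_only=True).\
        filter("indexes =", search_word)
    search_result = db.get(
        [key.parent() for key in query.fetch(20)])

The Entity Group is formed at save time: each SearchIndex is stored as a child of its Sentence.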

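Datastore design - splitting entities

Example: a file-sharing service

  The file itself
  Metadata

The file listing page needs only the metadata
Split the file itself and its metadata into separate entities
The listing page becomes faster and does no wasted work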

Datastore design - retries, deadline

    import time

    from google.appengine.ext import db
    from google.appengine.runtime.apiproxy_errors import DeadlineExceededError
    from google.appengine.runtime import DeadlineExceededError as D

    def somefunction(somevalue):
        # Allocate the id up front so that every retry writes the same key
        handmade_key = db.Key.from_path("MyModel", 1)
        new_id = db.allocate_ids(handmade_key, 1)[0]  # returns (start, end)
        new_key = db.Key.from_path("MyModel", new_id)
        my_entity = MyModel(key=new_key, name=somevalue)
        sleep_ms = 100  # initial backoff, doubled after each failed attempt
        try:
            while True:
                try:
                    db.put(my_entity, config=db.create_config(deadline=5))
                    break
                except (DeadlineExceededError, db.Timeout):
                    time.sleep(sleep_ms / 1000.0)  # time.sleep() takes seconds
                    sleep_ms *= 2
        except D:
            # The request deadline hit: ran out of retries.
            # Return an error to users.
            return None

Recipe for automatic retries

http://appengine-cookbook.appspot.com/recipe/autoretry-datastore-timeouts
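The retry loop can also be factored into a reusable helper. The following is a minimal sketch in plain Python, not the cookbook recipe itself: `TransientError`, `call_with_backoff` and `flaky_put` are hypothetical names standing in for the datastore exceptions and the `db.put` call, and the `sleep` parameter exists only so the example can record delays instead of waiting.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as a datastore timeout."""


def call_with_backoff(func, max_attempts=5, initial_delay=0.1, sleep=time.sleep):
    """Call func(), retrying on TransientError with exponential backoff.

    Returns func()'s result, or re-raises the last TransientError once
    max_attempts calls have failed.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay *= 2  # double the wait before the next attempt


# Usage: the call fails twice, then succeeds. Delays are recorded, not slept.
attempts = []
delays = []


def flaky_put():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientError("simulated timeout")
    return "stored"


result = call_with_backoff(flaky_put, sleep=delays.append)
```

Capping the number of attempts keeps a user-facing request from retrying until the request deadline; what cap is appropriate depends on the handler.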

Do the minimum amount of work

Key terms

  User Facing Request
    Everything that arrives from the Internet
  Background Request
    Task Queue, Cron, etc.

Key point

  Return User Facing Requests within 1000 ms

Do the minimum amount of work - use the faster option

keys_only query rather than query
get rather than query
memcache rather than datastore
global variables (static variables) rather than memcache

Do the minimum amount of work - use Task Queue

Hand time-consuming work to Task Queue.
With the Channel API, pushing the result back is easy.

Avoid datastore contention

Writes to a single Entity or Entity Group

  Rule of thumb: about one write per second

Countermeasures:

  Keep Entity Groups as small as possible
  Use techniques such as sharded counters
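The one-write-per-second rule of thumb gives a rough way to size a sharded structure: divide the expected peak write rate by the sustainable rate per Entity Group. The helper below is only back-of-the-envelope arithmetic under that assumption; `shards_needed` is a hypothetical name.

```python
import math


def shards_needed(peak_writes_per_sec, writes_per_group_per_sec=1.0):
    """Estimate how many Entity Groups (shards) a write load must be spread over.

    Assumes each group sustains roughly writes_per_group_per_sec writes,
    following the rule of thumb of about one write per second.
    """
    ratio = peak_writes_per_sec / float(writes_per_group_per_sec)
    return max(1, int(math.ceil(ratio)))


estimate = shards_needed(20)  # 20 writes/sec at 1 write/sec per group
```

Real limits vary with entity size and load, so treat the result as a starting point and measure.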

Sharded counter - a simple implementation

    from google.appengine.ext import db
    import random

    class SimpleCounterShard(db.Model):
        """Shards for the counter"""
        count = db.IntegerProperty(required=True, default=0)

    NUM_SHARDS = 20

    def get_count():
        """Retrieve the value for a given sharded counter."""
        total = 0
        for counter in SimpleCounterShard.all():
            total += counter.count
        return total

    def increment():
        """Increment the value for a given sharded counter."""
        def txn():
            # Pick one shard at random so writes spread over NUM_SHARDS entities
            index = random.randint(0, NUM_SHARDS - 1)
            shard_name = "shard" + str(index)
            counter = SimpleCounterShard.get_by_key_name(shard_name)
            if counter is None:
                counter = SimpleCounterShard(key_name=shard_name)
            counter.count += 1
            counter.put()
        db.run_in_transaction(txn)

http://goo.gl/8dGO
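To see why sharding relieves contention, the same logic can be simulated in memory. This sketch is not App Engine code: it replaces the shard entities with a `Counter` and uses a seeded random generator so the run is repeatable; `simulate_increments` is a hypothetical name.

```python
import random
from collections import Counter

NUM_SHARDS = 20


def simulate_increments(n, num_shards=NUM_SHARDS, seed=0):
    """Apply n increments, each to one randomly chosen shard."""
    rng = random.Random(seed)
    shards = Counter()
    for _ in range(n):
        shard_name = "shard%d" % rng.randint(0, num_shards - 1)
        shards[shard_name] += 1
    return shards


shards = simulate_increments(1000)
total = sum(shards.values())    # what get_count() would return
busiest = max(shards.values())  # writes absorbed by the hottest shard
```

The total is always the number of increments, while each shard takes only a fraction of the writes. The cost is on the read side, where every shard has to be summed.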


Fork-join queue

Building high-throughput data pipelines with Google App Engine

http://goo.gl/ntlH

A simple counter example
http://paste.shehas.net/show/137/
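The core idea of a fork-join queue is that many requests each record a small work item (the fork), and a single background task later applies a whole batch of them in one write (the join). The sketch below is a heavily simplified in-memory illustration of that idea, not the implementation from the linked talk or paste: the class `ForkJoinCounter`, its attributes, and the use of a set to mimic task-name de-duplication are all assumptions made for this example.

```python
class ForkJoinCounter(object):
    """In-memory illustration of batching many updates into one write."""

    def __init__(self):
        self.total = 0          # the contended value
        self.writes = 0         # how many times the value was written
        self.batch_index = 0    # which batch new work items join
        self.pending = {}       # batch index -> list of deltas (work items)
        self.scheduled = set()  # names of join tasks already enqueued

    def add(self, delta):
        """Fork: record a work item and enqueue at most one join task per batch."""
        index = self.batch_index
        self.pending.setdefault(index, []).append(delta)
        # Adding the same name twice is a no-op, like a duplicate named task.
        self.scheduled.add("join-%d" % index)

    def run_join(self):
        """Join: apply every work item of the current batch in a single write."""
        index = self.batch_index
        self.batch_index += 1  # later add() calls go to the next batch
        self.scheduled.discard("join-%d" % index)
        deltas = self.pending.pop(index, [])
        if deltas:
            self.total += sum(deltas)
            self.writes += 1
        return len(deltas)


counter = ForkJoinCounter()
for _ in range(100):
    counter.add(1)
tasks_enqueued = len(counter.scheduled)  # one join task for 100 work items
applied = counter.run_join()
counter.add(5)
counter.run_join()
```

In this run 101 updates reach the counter with only two writes to the contended value. A real implementation would keep the work items in the datastore and run the join from a Task Queue task.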


Using memcache effectively

Examples

  Cache the entire HTML of popular pages
  Counters using memcache.incr

    Fast, but not guaranteed to be 100% accurate

  Cache frequently used entities and query results

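Using memcache effectively

Basic pattern

    from google.appengine.api import memcache

    results = memcache.get('results')
    if results is None:
        # Cache miss: compute the value and store it for later requests
        results = some_complex_work()
        memcache.set('results', results)

    # use results

Discard stale cache entries when the underlying data changes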
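Discarding a stale entry usually means deleting the key whenever the data behind it is written, so the next read recomputes it. The sketch below shows that flow with a dict-backed stand-in so it runs without the SDK; `FakeMemcache`, `source`, `get_results` and `update_results` are hypothetical names.

```python
class FakeMemcache(object):
    """Dict-backed stand-in for a cache client (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


cache = FakeMemcache()
source = {"results": [1, 2, 3]}  # stands in for the underlying data
compute_calls = []               # records each expensive recomputation


def some_complex_work():
    compute_calls.append(1)
    return list(source["results"])


def get_results():
    results = cache.get("results")
    if results is None:
        results = some_complex_work()
        cache.set("results", results)
    return results


def update_results(new_values):
    source["results"] = list(new_values)
    cache.delete("results")  # drop the stale entry; the next read rebuilds it


first = get_results()
again = get_results()   # cache hit, no recomputation
update_results([4, 5])
fresh = get_results()   # recomputed after invalidation
```

Deleting on write is one option; setting an expiration time on the entry is another when slightly stale data is acceptable.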

AppStats - a performance measurement tool

http://goo.gl/eYdhD

[Screenshots: AppStats, before and after]

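Scaling even higher

The Fixed Quota on requests is about 500 QPS by default

  If you need more, apply through the Quota Increase form or contact me

Suppress Tablet Server splits

  This matters when you create entities very frequently (hundreds per second)

    Auto-assigned ids tend to trigger splits
    Use something like a uuid as the key_name

  Even with key_names, splits are likely when the names share similar leading characters

    Prefix the key_name with a hash value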
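Both suggestions can be sketched with the standard library. The helper names below (`make_key_name`, `random_key_name`), the choice of MD5, the 8-character prefix and the `:` separator are assumptions for illustration. The point is only that a hash prefix scatters otherwise similar names across the key space.

```python
import hashlib
import uuid


def make_key_name(natural_id, prefix_len=8):
    """Prefix a natural identifier with part of its hash.

    Names such as "user-000001" and "user-000002" share leading characters;
    the hash prefix makes their key_names start differently.
    """
    digest = hashlib.md5(natural_id.encode("utf-8")).hexdigest()
    return "%s:%s" % (digest[:prefix_len], natural_id)


def random_key_name():
    """Use a random UUID when no natural identifier is needed."""
    return uuid.uuid4().hex


name_a = make_key_name("user-000001")
name_b = make_key_name("user-000002")
random_name = random_key_name()
```

Because the prefix is derived from the identifier itself, the same identifier always maps to the same key_name, so an entity can still be fetched by key without a query.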

Summary

Design for the Datastore
Do the minimum amount of work
Avoid datastore contention
Use Memcache effectively
Measure performance with AppStats
Raise your Fixed Quota
Suppress Tablet Server splits

To infinity and beyond

Takashi Matsuo
Google, Inc.
e-mail: [email protected]
@tmatsuo
