Google Cloud for Data Crunchers - Strata Conf 2011
DESCRIPTION
http://strataconf.com/strata2011/public/schedule/detail/16242
Talk at Strata 2011 with Ryan Boyd and Kirrily Robert
Google is a data business: over the past few years, many of the tools Google created to store, query, analyze, and visualize its data have been exposed to developers as services.
This talk will give you an overview of Google services for data crunchers:
• Google Storage for Developers
• BigQuery: fast interactive queries on terabytes of data
• Machine Learning API: machine learning made easy
• Google App Engine: exposing data APIs is a very common use case for App Engine
• Visualization API: many cool visualization components
TRANSCRIPT
![Page 1: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/1.jpg)
Google Cloud for Data Crunchers
Patrick Chanezon, Developer Advocate, Cloud (@chanezon)
Ryan Boyd, Developer Advocate, Apps (@ryguyrg)
Kirrily Robert, Data Engineer, Freebase.com (@skud)
![Page 2: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/2.jpg)
Google Developer Day 2010
Agenda
• Google App Engine
• Google Storage for Developers
• Prediction API
• BigQuery
• Google Fusion Tables
• Google Refine
![Page 3: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/3.jpg)
Google App Engine
![Page 4: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/4.jpg)
What is cloud computing?
![Page 5: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/5.jpg)
IaaS
PaaS
SaaS
Source: Gartner AADI Summit Dec 2009
Cloud Computing Defined
![Page 6: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/6.jpg)
Google Storage Prediction API
BigQuery
Your Apps
1. Google Apps
2. Third party Apps: Google Apps Marketplace
3. ________
Google App Engine
IaaS
PaaS
SaaS
Google's Cloud Offerings
![Page 7: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/7.jpg)
Google App Engine
- Easy to build
- Easy to maintain
- Easy to scale
![Page 8: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/8.jpg)
Cloud development in a box
• SDK & “The Cloud”
• Hardware
• Networking
• Operating system
• Application runtime
  o Java, Python
• Static file serving
• Services
• Fault tolerance
• Load balancing
![Page 9: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/9.jpg)
App Engine Services
• Blobstore
• Images
• Mail
• XMPP
• Task Queue
• Memcache
• Datastore
• URL Fetch
• User Service
![Page 10: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/10.jpg)
Always free to get started
~5M pageviews/month
• 6.5 CPU hrs/day
• 1 GB storage
• 650K URL Fetch calls/day
• 2,000 recipients emailed
• 1 GB/day bandwidth
• 100,000 tasks enqueued
• 650K XMPP messages/day
![Page 11: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/11.jpg)
Purchase additional resources *
* free monthly quota of ~5 million page views still in full effect
![Page 12: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/12.jpg)
Google App Engine for Business
Same scalable cloud hosting platform. Designed for the enterprise.
• Enterprise application management
  – Centralized domain console
• Enterprise reliability and support
  – 99.9% Service Level Agreement
  – Premium Developer Support
• Hosted SQL
  – Managed relational SQL database in the cloud
• SSL on your domain
  – Including "naked" domain support
• Secure by default
  – Integrated Single Sign On (SSO)
• Pricing that makes sense
  – Pay only for what you use
* Hosted SQL and SSL on your domain available later this year
![Page 13: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/13.jpg)
App Engine for Data Crunchers
• High Performance Image Serving
• OpenID/OAuth integration
• Increased quotas
  • > 1k entities per query
  • 10’’ task queues
• Async URL Fetch
• Mapper API (Reduce coming soon)
• Channel API
• Matcher API
![Page 14: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/14.jpg)
Mapper API
• First component of App Engine’s MapReduce toolkit
• Large scale data manipulation
• Examples include:
  • Report generation
  • Computing statistics and metrics …
• Python Example:
  • http://blog.notdot.net/2010/05/Exploring-the-new-mapper-API
• Java Example:
  • http://ikaisays.com/2010/07/09/using-the-java-mapper-framework-for-app-engine/
![Page 15: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/15.jpg)
Channel API
• Allows for Server Push (Comet) to browser
• Blog post announcement:
  • http://googleappengine.blogspot.com/2010/05/app-engine-at-google-io-2010.html
• External coverage:
  • Sneak Peak from an early trusted tester
  • http://bitshaq.com/2010/09/01/sneak-peak-gae-channel-api/
• Demo code for Dance Dance Robot available here:
  • http://code.google.com/p/dance-dance-robot/
• Also see: https://groups.google.com/group/google-appengine-java/browse_thread/thread/6fa09953ffae2cd3/c1db7de5fdb82b65?pli=1#
![Page 16: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/16.jpg)
Matcher API
• Allows an app to register a set of queries to match against a stream of documents
• Trusted Testers, Python only
• Group post announcement:
  • http://groups.google.com/group/google-appengine/msg/40021537e2e58962
• Docs:
  • http://code.google.com/p/google-app-engine-samples/wiki/AppEngineMatcherService
• Demo code:
  • http://code.google.com/p/google-app-engine-samples/source/browse/#svn/trunk/matcher-sample
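The idea of registering queries and matching them against a stream of documents can be illustrated with a plain-Python toy. This is not the App Engine Matcher API: `ToyMatcher`, `subscribe`, and `match` are hypothetical names, and a "query" here is assumed to be nothing more than a set of words that must all appear in the document.

```python
import re


class ToyMatcher:
    """In-memory illustration of query subscription; not the App Engine API."""

    def __init__(self):
        self._subscriptions = {}  # subscription id -> set of required words

    def subscribe(self, sub_id, query):
        # Hypothetical query model: every word in the query must appear.
        self._subscriptions[sub_id] = set(re.findall(r"\w+", query.lower()))

    def match(self, document):
        # Return the ids of all subscriptions satisfied by this document.
        words = set(re.findall(r"\w+", document.lower()))
        return sorted(
            sub_id
            for sub_id, required in self._subscriptions.items()
            if required <= words
        )


matcher = ToyMatcher()
matcher.subscribe("cloud-news", "google cloud")
matcher.subscribe("ml-news", "machine learning")

stream = [
    "Google opens its cloud storage preview",
    "Machine learning made easy",
]
for doc in stream:
    print(doc, "->", matcher.match(doc))
```

The point of the pattern is the inversion: queries are stored once and each incoming document is tested against them, rather than documents being stored and queried later.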
![Page 17: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/17.jpg)
Google Storage for Developers
Store your data in Google's cloud
![Page 18: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/18.jpg)
What Is Google Storage?
• Store your data in Google's cloud
  o any format, any amount, any time
• You control access to your data
  o private, shared, or public
• Access via Google APIs or 3rd party tools/libraries
![Page 19: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/19.jpg)
Google Storage Technical Details
RESTful API
• Verbs: GET, PUT, POST, HEAD, DELETE
• Resources: identified by URI, like: http://commondatastorage.googleapis.com/bucket/object
• Compatible with S3
Buckets
• Flat containers (no bucket hierarchy)
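As a rough sketch of the resource model above, the snippet below builds (but does not send) an HTTP request for an object URI of the form shown on the slide. The helper names `object_uri` and `build_request` are hypothetical, the bucket and object names are examples, and authentication headers, which non-public data would need, are left out.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://commondatastorage.googleapis.com"  # host shown on the slide


def object_uri(bucket, obj):
    # Percent-encode the object name; quote() leaves "/" unescaped by default.
    return f"{ENDPOINT}/{bucket}/{urllib.parse.quote(obj)}"


def build_request(method, bucket, obj, data=None):
    # Only constructs the request object; nothing is sent over the network.
    return urllib.request.Request(object_uri(bucket, obj), data=data, method=method)


req = build_request("GET", "appdata", "installs")
print(req.get_method(), req.full_url)
```

The same helper covers the other verbs listed above by changing `method` and, for uploads, passing the bytes to store as `data`.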
![Page 20: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/20.jpg)
Performance and Scalability
Object types and size
• Objects of any type and 100GB+ / Object
• Unlimited numbers of objects, 1000s of buckets
• Range-get support for data retrieval
Replication
• All data replicated to multiple US data centers
• Leveraging Google's worldwide network for data delivery
Consistency
• “Read-your-writes” data consistency
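Range-get support means a large object can be fetched in pieces using the standard HTTP `Range` header. A minimal sketch of computing those header values follows; `byte_ranges` is a hypothetical helper and the sizes are arbitrary examples.

```python
def byte_ranges(total_size, chunk_size):
    """Yield HTTP Range header values that together cover total_size bytes."""
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1  # Range ends are inclusive
        yield f"bytes={start}-{end}"


headers = [{"Range": r} for r in byte_ranges(total_size=2500, chunk_size=1000)]
print(headers)
```

Each dictionary would be sent as the headers of one GET request for the same object, so chunks can be downloaded in parallel or resumed after a failure.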
![Page 21: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/21.jpg)
Security and Privacy Features
Authenticated downloads from a web browser
• Sharing with individuals
• Group sharing via Google Groups
• Sharing with Google Apps domains
Permissions set on Buckets or Objects
• READ (an object, or list a bucket’s contents)
• WRITE (applicable to buckets, allows upload/delete/etc)
• FULL_CONTROL (read/write ACLs on objects or buckets)
![Page 22: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/22.jpg)
Tools
• Google Storage Manager
• gsutil
![Page 23: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/23.jpg)
Google Storage Benefits
• High Performance and Scalability: Backed by Google infrastructure
• Strong Security and Privacy: Control access to your data
• Easy to Use: Get started fast with Google & 3rd party tools
![Page 24: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/24.jpg)
Some Early Google Storage Adopters
![Page 25: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/25.jpg)
Google Storage usage within Google
• Haiti Relief Imagery
• USPTO data
• Partner Reporting
• Google BigQuery
• Google Prediction API
![Page 26: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/26.jpg)
Google Storage - Availability
Limited preview in US* currently
• 100GB free storage and network per account
• Sign up for wait list at http://code.google.com/apis/storage/
* Non-US preview available on case-by-case basis
![Page 27: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/27.jpg)
Google Prediction API
Google's prediction engine in the cloud
![Page 28: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/28.jpg)
Introducing the Google Prediction API
• Google's sophisticated machine learning technology
• Available as an on-demand RESTful HTTP web service
![Page 29: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/29.jpg)
A virtually endless number of applications...
• Customer Sentiment
• Transaction Risk
• Species Identification
• Message Routing
• Legal Docket Classification
• Suspicious Activity
• Work Roster Assignment
• Recommend Products
• Political Bias
• Uplift Marketing
• Diagnostics
• Inappropriate Content
• Career Counseling
• Churn Prediction
• Email Filtering
... and many more ...
![Page 30: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/30.jpg)
"english" The quick brown fox jumped over the lazy dog.
"english" To err is human, but to really foul things up you need a computer.
"spanish" No hay mal que por bien no venga.
"spanish" La tercera es la vencida.
? To be or not to be, that is the question.
? La fe mueve montañas.
How does it work?
1. TRAIN: The Prediction API finds relevant features in the sample data during training.
2. PREDICT: The Prediction API later searches for those features during prediction.
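The train-then-predict idea can be made concrete with a deliberately naive toy. This is not the Prediction API's algorithm, which the slides do not describe; it assumes that "features" are simply word counts per label and that prediction picks the label whose training words overlap the input most. The function names `tokenize`, `train`, and `predict` are illustrative.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on word characters (Unicode-aware in Python 3).
    return re.findall(r"\w+", text.lower())


def train(examples):
    # "Features" in this toy are just word counts per label.
    model = defaultdict(Counter)
    for label, text in examples:
        model[label].update(tokenize(text))
    return model


def predict(model, text):
    # Score each label by how often the input's words were seen for it.
    scores = {
        label: sum(counts[word] for word in tokenize(text))
        for label, counts in model.items()
    }
    return max(scores, key=scores.get)


training_data = [
    ("english", "The quick brown fox jumped over the lazy dog."),
    ("english", "To err is human, but to really foul things up you need a computer."),
    ("spanish", "No hay mal que por bien no venga."),
    ("spanish", "La tercera es la vencida."),
]
model = train(training_data)
print(predict(model, "To be or not to be, that is the question."))
print(predict(model, "La fe mueve montañas."))
```

With the four training sentences from the slide, the first query shares "to", "is", and "the" with the English examples and the second shares "la" with the Spanish ones, which is enough for this toy to separate them.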
![Page 31: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/31.jpg)
Introducing the Google Prediction API
![Page 32: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/32.jpg)
A Prediction API Example
Automatically determine application recommendations
• Goal: Increase relevancy on the Apps Marketplace via recommendations
• Customers: Businesses of various sizes and industries using Google Apps around the world
• Data: Sampling of previous installs of applications
• Outcome: Predict applications which would be appropriate for a new customer visiting the site
![Page 33: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/33.jpg)
Using the Prediction API
A simple three step process...
1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions
![Page 34: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/34.jpg)
Step 1: Upload
Upload your training data to Google Storage
• Training data: outputs and input features
• Data format: comma separated value format (CSV), result in first column
"SlideRocket","EDUCATION","us","en","10","5"
"MailChimp","BUSINESS","us","en","7","0"
"MailChimp","STANDARD","se","sv","1","0"
"Smartsheet","BUSINESS","us","en","13","4"
Upload to Google Storage
gsutil cp installs gs://appdata/
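A training file in this shape (result in the first column, features after it, every field quoted) can be produced with Python's standard `csv` module. The sketch below reuses the four rows from the slide; `to_training_csv` is a hypothetical helper, and writing the result to a local file named `installs` for `gsutil cp` is left to the reader.

```python
import csv
import io

rows = [
    ("SlideRocket", "EDUCATION", "us", "en", "10", "5"),
    ("MailChimp", "BUSINESS", "us", "en", "7", "0"),
    ("MailChimp", "STANDARD", "se", "sv", "1", "0"),
    ("Smartsheet", "BUSINESS", "us", "en", "13", "4"),
]


def to_training_csv(rows):
    # QUOTE_ALL reproduces the fully quoted style used on the slide.
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


print(to_training_csv(rows))
```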
![Page 35: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/35.jpg)
Step 2: Train
Create a new model by training on data
To train a model:
POST prediction/v1.1/training?data=appdata%2Finstalls
Training runs asynchronously. To see if it has finished:
GET prediction/v1.1/training/appdata%2Finstalls
{"data":{ "data":"appdata/installs", "modelinfo":"estimated accuracy: 0.xx"}}
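Note that the bucket/object identifier is percent-encoded in both calls, so `/` becomes `%2F`. A small sketch of building the two relative paths and reading the status reply follows; the slide does not show the host, so none is assumed here, and the reply is a literal copy of the slide's shape with its placeholder accuracy.

```python
import json
import urllib.parse

data_id = "appdata/installs"  # bucket/object from the upload step
encoded = urllib.parse.quote(data_id, safe="")  # safe="" so "/" becomes %2F

train_path = f"prediction/v1.1/training?data={encoded}"  # used with POST
status_path = f"prediction/v1.1/training/{encoded}"      # used with GET

# Status reply in the shape shown on the slide; "0.xx" is the slide's placeholder.
reply = json.loads(
    '{"data": {"data": "appdata/installs", "modelinfo": "estimated accuracy: 0.xx"}}'
)

print(train_path)
print(status_path)
print(reply["data"]["modelinfo"])
```

Because training is asynchronous, a client would repeat the GET until the reply carries model information.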
![Page 36: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/36.jpg)
Step 3: Predict
Apply the trained model to make predictions on new data
POST prediction/v1.1/query/appdata%2Finstalls/predict
{ "data":{ "input": { "mixture" : [ "EDUCATION","us","en","10","0" ]}}}
{ "data" : { "kind" : "prediction#output", "outputLabel":"Manymoon", "outputMulti" :[ {"label":"OffiSync", "score": x.xx}, {"label":"Zoho CRM", "score": x.xx}, {"label":"MailChimp", "score": x.xx}]}}
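The request body and reply above can be handled with the standard `json` module. In this sketch `predict_request` and `ranked_labels` are hypothetical helpers, the host is omitted because the slide does not show it, and the scores in `sample_reply` are invented placeholders standing in for the slide's `x.xx`.

```python
import json
import urllib.parse


def predict_request(data_id, features):
    # Relative path and JSON body for the predict call shown on the slide.
    path = f"prediction/v1.1/query/{urllib.parse.quote(data_id, safe='')}/predict"
    body = json.dumps({"data": {"input": {"mixture": features}}})
    return path, body


def ranked_labels(reply):
    # Sort the candidate labels by score, best first.
    candidates = reply["data"]["outputMulti"]
    ordered = sorted(candidates, key=lambda c: c["score"], reverse=True)
    return [c["label"] for c in ordered]


path, body = predict_request("appdata/installs", ["EDUCATION", "us", "en", "10", "0"])
print(path)
print(body)

# Placeholder scores for illustration only; the slide leaves them as x.xx.
sample_reply = {
    "data": {
        "kind": "prediction#output",
        "outputLabel": "Manymoon",
        "outputMulti": [
            {"label": "OffiSync", "score": 0.2},
            {"label": "Zoho CRM", "score": 0.5},
            {"label": "MailChimp", "score": 0.3},
        ],
    }
}
print(ranked_labels(sample_reply))
```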
![Page 37: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/37.jpg)
Demo!
![Page 38: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/38.jpg)
Demo Screenshots
Predicting apps for a 501-1,000 seat educational institution
![Page 39: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/39.jpg)
Demo Screenshots
Predicting apps for a 501-1,000 seat educational institution
![Page 40: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/40.jpg)
Demo Screenshots
Predicting apps for a small business
![Page 41: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/41.jpg)
Demo Screenshots
Predicting apps for a small business
![Page 42: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/42.jpg)
Prediction API Capabilities
Data
• Input Features: numeric or unstructured text
• Output: up to hundreds of discrete categories, or continuous values
Training
• Many machine learning techniques
• Automatically selected
• Performed asynchronously
Access from many platforms:
• Web app from Google App Engine
• Apps Script (e.g. from Google Spreadsheet)
• Desktop app
![Page 43: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/43.jpg)
Prediction API - Pricing
Free Quota in trial/development
• 100 predictions/day, 5MB trained/day
• Available for 6 months
Paid Usage
• $10/month per project includes 10,000 predictions
• Additional predictions are $0.50 per 1,000
• Absolute limit of 60,000 predictions per day
• $0.002 per MB trained (max size per dataset is 100MB)
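To see how the paid rates combine, here is a small worked calculation. It assumes additional predictions are billed pro rata rather than per started block of 1,000, which the slide does not specify; `monthly_cost` is a hypothetical helper, and the daily limit and dataset size cap are not checked.

```python
from decimal import Decimal

BASE_FEE = Decimal("10")              # per project per month
INCLUDED_PREDICTIONS = 10_000         # covered by the base fee
PER_EXTRA_THOUSAND = Decimal("0.50")  # additional predictions
PER_MB_TRAINED = Decimal("0.002")


def monthly_cost(predictions, mb_trained):
    extra = max(predictions - INCLUDED_PREDICTIONS, 0)
    # Assumption: extra predictions are billed pro rata, not per started 1,000.
    return BASE_FEE + PER_EXTRA_THOUSAND * extra / 1000 + PER_MB_TRAINED * mb_trained


# 25,000 predictions and 50 MB trained:
# 10 + 0.50 * 15 + 0.002 * 50 = 10 + 7.50 + 0.10 = 17.60 dollars
print(monthly_cost(25_000, 50))
```

`Decimal` is used instead of floats so the currency arithmetic stays exact.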
![Page 44: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/44.jpg)
Prediction API - Availability
Limited preview in US* currently
• Sign up for wait list at http://code.google.com/apis/predict/
* Non-US preview available on case-by-case basis
![Page 45: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/45.jpg)
Google BigQuery
Interactive analysis of large datasets in Google's cloud
![Page 46: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/46.jpg)
Introducing Google BigQuery
– Google's large data ad hoc analysis technology
  • Analyze massive amounts of data in seconds
– Simple SQL-like query language
– Flexible access
  • REST APIs, JSON-RPC, Google Apps Script
![Page 47: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/47.jpg)
Why BigQuery?
Working with large data is a challenge
![Page 48: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/48.jpg)
Many Use Cases ...
• Spam
• Trends Detection
• Web Dashboards
• Network Optimization
• Interactive Tools
![Page 49: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/49.jpg)
Key Capabilities of BigQuery
• Scalable: Billions of rows
• Fast: Response in seconds
• Simple: Queries in SQL
• Web Service
  o REST
  o JSON-RPC
  o Google Apps Script
![Page 50: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/50.jpg)
Using BigQuery
Another simple three step process...
1. Upload: Upload your raw data to Google Storage
2. Import: Import raw data into BigQuery table
3. Query: Perform SQL queries on table
![Page 51: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/51.jpg)
Writing Queries
Compact subset of SQL
  o SELECT ... FROM ... WHERE ... GROUP BY ... ORDER BY ... LIMIT ...;
Common functions
  o Math, String, Time, ...
Additional statistical approximations
  o TOP
  o COUNT DISTINCT
![Page 52: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/52.jpg)
BigQuery via REST
GET /bigquery/v1/tables/{table name}
GET /bigquery/v1/query?q={query}
Sample JSON Reply:
{
  "results": {
    "fields": { [ {"id":"COUNT(*)","type":"uint64"}, ... ] },
    "rows": [
      {"f":[{"v":"2949"}, ...]},
      {"f":[{"v":"5387"}, ...]},
      ...
    ]
  }
}
Also supports JSON-RPC
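A minimal sketch of working with this REST shape from Python: URL-encode the query into the `q` parameter, then flatten the `rows` of a reply into plain lists. The table name `revisions` is hypothetical, the helpers `query_path` and `rows_from_reply` are illustrative, and the sample reply is a trimmed copy of the slide's example in which `fields` is assumed to be a plain list so that it parses as JSON.

```python
import json
import urllib.parse


def query_path(sql):
    # The query text must be URL-encoded before it goes into ?q=
    return "/bigquery/v1/query?" + urllib.parse.urlencode({"q": sql})


def rows_from_reply(reply):
    # Each row looks like {"f": [{"v": value}, ...]}; flatten to plain lists.
    return [[cell["v"] for cell in row["f"]] for row in reply["results"]["rows"]]


sql = "SELECT COUNT(*) FROM revisions;"  # hypothetical table name
print(query_path(sql))

# Trimmed version of the slide's sample reply ("..." entries dropped).
sample = json.loads(
    '{"results": {"fields": [{"id": "COUNT(*)", "type": "uint64"}],'
    ' "rows": [{"f": [{"v": "2949"}]}, {"f": [{"v": "5387"}]}]}}'
)
print(rows_from_reply(sample))
```

Values arrive as strings in the sample, so a client would convert them using the `type` declared in `fields`.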
![Page 53: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/53.jpg)
Security and Privacy
Standard Google Authentication
• ClientLogin
• OAuth
• AuthSub
HTTPS support
• protects your credentials
• protects your data
Relies on Google Storage to manage access
![Page 54: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/54.jpg)
Large Data Analysis Example
Wikimedia Revision History
Wikimedia revision history data from: http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-history.xml.7z
![Page 55: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/55.jpg)
Using BigQuery Shell
Python DB API 2.0 + B. Clapper's sqlcmd
http://www.clapper.org/software/python/sqlcmd/
![Page 56: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/56.jpg)
BigQuery from a Spreadsheet
![Page 57: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/57.jpg)
Google Fusion Tables
![Page 58: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/58.jpg)
Google Fusion Tables
• Manage large collections of tabular data in the cloud
• 100 MB tables
• Filters, Aggregation, Merge
• ACL, Collaboration, Discuss Data
• Visualizations
• REST API
• Geo queries
• Maps Integration
• FusionTablesLayer
![Page 59: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/59.jpg)
Google Fusion Tables
![Page 60: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/60.jpg)
Google Visualization API
![Page 61: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/61.jpg)
Google Visualization API
• Collection of JavaScript Visualization components
• Some from Google (Chart Tools)
• Some from other developers
• Share the same wire protocol for Data Sources
![Page 62: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/62.jpg)
Example: Weather data
• US National Climatic Data Center
• Weather data at stations around the globe since 1929
• Stored in Google Storage
• Created a table for BigQuery
• Uploaded weather station coordinates to Fusion Tables
• App Engine app
  • Maps API to display weather station maps
  • BigQuery to query average temperature in January
  • A bit of Python to create a JSON Data Source
  • Visualization API
• Just an example: rinse, repeat, enhance!
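The "bit of Python" step could look roughly like the sketch below, which turns (station, temperature) pairs into the `cols`/`rows` table literal that Visualization API charts consume. The station names and temperatures are placeholders rather than real measurements, `to_datatable` is a hypothetical helper, and the response envelope a full Data Source wraps around the table is omitted.

```python
import json


def to_datatable(stations):
    """Build a Visualization API DataTable literal from (name, temp) pairs."""
    return {
        "cols": [
            {"id": "station", "label": "Station", "type": "string"},
            {"id": "avg_temp", "label": "Avg January temp", "type": "number"},
        ],
        "rows": [{"c": [{"v": name}, {"v": temp}]} for name, temp in stations],
    }


# Placeholder values, not real measurements.
sample = [("Station A", 1.5), ("Station B", -3.0)]
table = to_datatable(sample)
print(json.dumps(table, indent=2))
```

In the app described above, the pairs would come from the BigQuery result rather than from a hard-coded list.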
![Page 63: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/63.jpg)
Example: Weather data
![Page 64: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/64.jpg)
Google Refine
![Page 65: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/65.jpg)
Google Refine
• Power tool for working with messy data
• Cleanup
• Transform
• Augment
• (Link with Freebase)
• Desktop software for now
• http://code.google.com/p/google-refine/
![Page 66: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/66.jpg)
Google Refine
![Page 67: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/67.jpg)
Recap
• Google App Engine
  o Easy to build, deploy and manage web apps
• Google Storage
  o High speed data storage on Google Cloud
• Prediction API
  o Google's machine learning technology
• BigQuery
  o Interactive analysis of very large data sets
• Google Fusion Tables
  o Manage collections of tabular data in the cloud
• Google Refine
  o Power tool for working with messy data
• Google Visualization
  o Collection of JavaScript Visualization components
![Page 68: Google Cloud for Data Crunchers - Strata Conf 2011](https://reader033.vdocuments.site/reader033/viewer/2022052504/54c6cb384a79591a568b45b8/html5/thumbnails/68.jpg)
More information
http://code.google.com/apis/
http://code.google.com/more/table/