making sense out of things on the web
![Page 1: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/1.jpg)
![Page 2: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/2.jpg)
MAKING SENSE OUT OF THINGS ON THE WEB
@pradeepbv
![Page 3: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/3.jpg)
3
We have been accumulating a lot of information
![Page 4: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/4.jpg)
4
http://en.wikipedia.org/wiki/File:Jingangjing.jpg
![Page 5: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/5.jpg)
5
http://en.wikipedia.org/wiki/File:Printer_in_1568-ce.png
http://en.wikipedia.org/wiki/File:BuxheimStChristopher.jpg
![Page 6: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/6.jpg)
6
http://en.wikipedia.org/wiki/Odhecaton
![Page 7: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/7.jpg)
7
http://upload.wikimedia.org/wikipedia/commons/f/f1/The_First_Telegraph.jpg
What hath God wrought
![Page 8: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/8.jpg)
8
http://en.wikipedia.org/wiki/File:1891_Telegraph_Lines.jpg
1891 Telegraph Lines
![Page 9: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/9.jpg)
9
Mr. Watson, come here. I want to see you.
http://www.boerner.net/jboerner/?p=9396
![Page 10: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/10.jpg)
10
Radio
![Page 11: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/11.jpg)
11
http://www.elon.edu/e-web/predictions/150/1930.xhtml
![Page 12: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/12.jpg)
12
![Page 13: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/13.jpg)
13
![Page 14: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/14.jpg)
14
![Page 15: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/15.jpg)
15
www
![Page 16: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/16.jpg)
16
http://en.wikipedia.org/wiki/File:NCSA_Mosaic.PNG
![Page 17: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/17.jpg)
17
the Internet had an estimated 16 million users by 1995
![Page 18: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/18.jpg)
18
http://en.wikipedia.org/wiki/Venture_capital
![Page 19: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/19.jpg)
19
People from all over the world started sharing their interests,
hopes and dreams online
![Page 20: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/20.jpg)
20
![Page 21: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/21.jpg)
21
http://electrokami.com/wp-content/uploads/2010/09/the-internet-in-real-life.jpg
![Page 22: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/22.jpg)
22
The number of devices connected to IP networks will be nearly three times as high as the global population in 2016
![Page 23: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/23.jpg)
23
The Zettabyte Era
http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/VNI_Hyperconnectivity_WP.html
kilo, mega, giga, tera, peta, exa, zetta, yotta
1 zebibyte = 9,444,732,965,739,290,427,392 bits (1024 exbibytes)
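The bit count on this slide can be checked with a little arithmetic. A minimal sketch, assuming binary (IEC) prefixes, where each step up multiplies by 1024 and a zebibyte is 2^70 bytes:

```python
# Binary (IEC) units: each prefix step multiplies by 1024 = 2**10.
BITS_PER_BYTE = 8
EXBIBYTE = 2 ** 60  # bytes
ZEBIBYTE = 2 ** 70  # bytes

bits_in_zebibyte = ZEBIBYTE * BITS_PER_BYTE
exbibytes_per_zebibyte = ZEBIBYTE // EXBIBYTE

print(f"{bits_in_zebibyte:,} bits")
print(exbibytes_per_zebibyte, "exbibytes")
```

The figure quoted on the slide is therefore one zebibyte expressed in bits.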
![Page 24: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/24.jpg)
24
“Reports that say that something hasn't happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don't know we don't know.”
Donald Rumsfeld, US Defense Secretary, at a press conference at NATO Headquarters, Brussels, Belgium, June 6, 2002
Image: planetization.org
![Page 25: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/25.jpg)
25
Nicholas Carr worries that the flood of digital information is changing not only our habits, but even our mental capacities: Forced to scan and skim to keep up, we are losing our abilities to pay sustained attention, reflect deeply, or remember what we’ve learned.
![Page 26: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/26.jpg)
26
http://blogs.tusc.k12.al.us/bhslibrary/files/2012/01/Information_overload.jpg
Information overload?
![Page 27: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/27.jpg)
27
http://www.teachersdiary.com/.a/6a0115703931fc970c0128765537ba970c-800wi
DO YOU KNOW WHAT YOU ARE LOOKING FOR?
![Page 28: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/28.jpg)
28
http://www.flickr.com/photos/special/1597251/
DO YOU KNOW WHERE TO FIND WHAT YOU WANT?
![Page 29: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/29.jpg)
29
http://www.flickr.com/photos/sumrow/1267682594/sizes/l/
REGULAR SEARCH #FAIL?
![Page 30: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/30.jpg)
30
http://www.flickr.com/photos/sumrow/1267682594/sizes/l/
IS THERE A SUPERHERO WHO CAN HELP?
![Page 31: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/31.jpg)
BUILD YOUR OWN SEARCH SERVICE
Yes, you are the superhero
![Page 32: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/32.jpg)
BOSS IS BUILD YOUR OWN SEARCH SERVICE
http://developer.yahoo.com/search/boss/
![Page 33: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/33.jpg)
BOSS PROVIDES APIS
TO OUR SEARCH DATA STORES
![Page 34: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/34.jpg)
TO BUILD YOUR OWN POWERFUL
SEARCH APPLICATIONS
![Page 35: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/35.jpg)
BOSS allows you to search over
Web, images, news & Blogs
![Page 36: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/36.jpg)
You can even monetize your applications using Search Ads from BOSS and get support.
![Page 37: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/37.jpg)
What can be done on top of BOSS?
• Blend and re-rank search results
• Your own look and feel
• Mix it with other APIs
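Blending and re-ranking happen on your side, after the results come back. A minimal sketch, assuming each result is a dict with a `url` key (a hypothetical shape, not the documented BOSS response format): `blend` interleaves several ranked lists and drops duplicates, and `rerank` moves results from a preferred domain to the front.

```python
from itertools import zip_longest

def blend(*result_lists):
    """Interleave ranked result lists round-robin, dropping duplicate URLs."""
    seen = set()
    blended = []
    for tier in zip_longest(*result_lists):
        for item in tier:
            # zip_longest pads shorter lists with None.
            if item is None or item["url"] in seen:
                continue
            seen.add(item["url"])
            blended.append(item)
    return blended

def rerank(results, boost_domain):
    """Stable re-rank: results whose URL mentions boost_domain come first."""
    return sorted(results, key=lambda r: boost_domain not in r["url"])

# Made-up sample data standing in for two search responses.
web = [{"url": "http://a.example/1"}, {"url": "http://b.example/2"}]
news = [{"url": "http://b.example/2"}, {"url": "http://c.example/3"}]

merged = blend(web, news)
boosted = rerank(merged, "c.example")
```

Because Python's sort is stable, results that are not boosted keep their blended order.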
![Page 38: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/38.jpg)
BOSS Pricing
![Page 39: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/39.jpg)
Free for building your hacks!!
![Page 40: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/40.jpg)
Where do I start?
![Page 41: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/41.jpg)
What's in it?
RESTful XML and JSON API
• Web
• Image
• Spelling
• News
• Search Ads
http://www.flickr.com/photos/joeshlabotnik/419914250/sizes/o/in/photostream/.jpg
![Page 42: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/42.jpg)
OAuth-based Authentication
http://www.flickr.com/photos/friarsbalsam/5736126308/sizes/o/in/photostream/.jpg
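To give a feel for what signing a request involves, here is a minimal sketch assuming two-legged OAuth 1.0 with HMAC-SHA1, which the consumer key and secret used elsewhere in this deck suggest but which the slides do not spell out. The host `api.example.com`, the key, the secret, the nonce, and the timestamp are all placeholders. In practice an OAuth library would do this for you.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def pct(value):
    """Percent-encode per OAuth 1.0: only unreserved characters stay literal."""
    return quote(str(value), safe="-._~")

def signature_base_string(method, url, params):
    """Join the method, encoded URL, and encoded sorted parameters with '&'."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), pct(url), pct(normalized)])

def sign(method, url, params, consumer_secret, token_secret=""):
    """HMAC-SHA1 over the base string; the token secret is empty in two-legged use."""
    key = f"{pct(consumer_secret)}&{pct(token_secret)}".encode()
    base = signature_base_string(method, url, params).encode()
    digest = hmac.new(key, base, hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

url = "http://api.example.com/ysearch/web"  # placeholder host
params = {
    "q": "dark knight",
    "oauth_consumer_key": "my-key",         # placeholder
    "oauth_nonce": "abc123",                # normally random per request
    "oauth_timestamp": "1330000000",        # normally the current time
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_version": "1.0",
}
base_string = signature_base_string("GET", url, params)
signature = sign("GET", url, params, "my-secret")
```

The resulting `signature` would be sent along with the other `oauth_` parameters as `oauth_signature`.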
![Page 43: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/43.jpg)
What else do I get?
• Web and Limited Web results
• Image attributes like height, width, etc.
• Time span filtering for News Search
• Document type filtering
• Extended abstracts
http://www.flickr.com/photos/acidpix/6021203584/sizes/o/in/photostream/.jpg
![Page 44: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/44.jpg)
BOSS + YQL
• Table Name: boss.search
• e.g. select * from boss.search where ck=… and secret=… and q='openhackindia'

| Parameter | Name | Example |
|---|---|---|
| Consumer Key | ck | - |
| Consumer Secret | secret | - |
| Query Term | q | 'iitd' |
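The `boss.search` statements in this deck all follow one pattern, so they are easy to generate. A minimal sketch: the helper names `yql_literal` and `boss_query` are made up for illustration, and values containing double quotes are rejected rather than escaped, because YQL's escaping rules are not covered here.

```python
def yql_literal(value):
    """Wrap a value in double quotes; embedded quotes are out of scope here."""
    if '"' in value:
        raise ValueError("double quotes are not handled in this sketch")
    return f'"{value}"'

def boss_query(q, ck, secret, **filters):
    """Build a `select * from boss.search` statement.

    Extra keyword arguments (for example service or sites) become
    additional `name="value"` conditions.
    """
    conditions = {"q": q, **filters, "ck": ck, "secret": secret}
    where = " and ".join(
        f"{name}={yql_literal(value)}" for name, value in conditions.items()
    )
    return f"select * from boss.search where {where}"

# "..." stands in for real credentials, as on the slides.
statement = boss_query("openhackindia", ck="...", secret="...", service="images")
print(statement)
```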
![Page 45: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/45.jpg)
Searching “The Dark Knight”
![Page 46: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/46.jpg)
![Page 47: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/47.jpg)
Finding images of “The Dark Knight Rises”
select * from boss.search where q="The Dark Knight Rises" and service="images" and
ck="..." and secret="..."
![Page 48: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/48.jpg)
Finding “The Dark Knight Rises” in IMDB, movies.yahoo.com
select * from boss.search where q="The Dark Knight Rises" and
sites="imdb.com,movies.yahoo.com" and ck="..." and secret="..."
![Page 49: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/49.jpg)
Spell Check and Correction
select * from boss.search where q="The Dark Knight Rises" and service="spelling" and
ck="..." and secret="..."
![Page 50: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/50.jpg)
Finding news on “The Dark Knight Rises”
select * from boss.search where q="The Dark Knight Rises" and service="news" and ck="..."
and secret="..."
![Page 51: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/51.jpg)
And through the BOSS API
• Getting multiple data sets:
/ysearch/web,images,news?q=anna
/ysearch/web,images,news?web.q=anna&images.q=anna&news.q=lokpal
• Searching through sites (a simple movie search):
/ysearch/web?q="Dark Knight"&sites=movies.yahoo.com,netflix.com,imdb.com
• AND/OR operators:
/ysearch/web?q="steve jobs"AND((ipad)OR(iphone))&sites=bestbuy.com,newegg.com
Important: use parentheses or quotes
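The request paths above contain spaces and quotes, so they need URL encoding before they go on the wire. A minimal sketch that builds the relative path only; `boss_path` is a made-up helper name, and the host and OAuth signing are left out.

```python
from urllib.parse import quote, urlencode

def boss_path(services, params):
    """Build a relative /ysearch path for one or more services."""
    # quote (not quote_plus) encodes a space as %20, so a literal '+'
    # is never confused with a space; commas are left readable.
    query = urlencode(params, quote_via=quote, safe=",")
    return "/ysearch/" + ",".join(services) + "?" + query

multi = boss_path(["web", "images", "news"], {"q": "anna"})
per_service = boss_path(
    ["web", "images", "news"],
    {"web.q": "anna", "images.q": "anna", "news.q": "lokpal"},
)
movies = boss_path(
    ["web"],
    {"q": '"Dark Knight"', "sites": "movies.yahoo.com,netflix.com,imdb.com"},
)
print(movies)
```

Passing the parameters as a dict, rather than as keyword arguments, allows dotted names such as `web.q`.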
![Page 52: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/52.jpg)
Unary Operators
• Search for Batman but not "Dark Knight":
q=(batman -"Dark Knight")
• Find pages with "Heath Ledger" but not "Dark Knight":
q=+"heath ledger"-"Dark Knight"&sites=movies.yahoo.com
• Force auto-spelling off:
q=+"drk knight"
AND OR
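The unary operators can be composed programmatically. A minimal sketch: `phrase` and `build_q` are hypothetical helpers, and separating terms with a single space is an assumption based on the first example on this slide.

```python
def phrase(text):
    """Quote a multi-word phrase so it is treated as one term."""
    return f'"{text}"'

def build_q(plain=(), required=(), excluded=()):
    """Compose a q value: '+' marks a required term, '-' an excluded one."""
    parts = list(plain)
    parts += [f"+{term}" for term in required]
    parts += [f"-{term}" for term in excluded]
    return " ".join(parts)

batman_only = build_q(plain=["batman"], excluded=[phrase("Dark Knight")])
ledger = build_q(required=[phrase("heath ledger")], excluded=[phrase("Dark Knight")])
no_autocorrect = build_q(required=[phrase("drk knight")])
print(batman_only)
```

The resulting string still has to be URL-encoded before it is placed in a request.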
![Page 53: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/53.jpg)
Searching in body and in title
• Searching for Dark Knight in the title on Yahoo Movies:
q=reviews intitle:"dark knight"&sites=movies.yahoo.com
• Searching for Dark Knight in the title on Yahoo Movies, with Christian Bale in the body:
q=reviews intitle:"dark knight" inbody:"christian bale"&sites=movies.yahoo.com
![Page 54: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/54.jpg)
Market and document specific Filters
• Search for "Dark Knight" in India-specific sites:
q="Dark Knight"&market=en-in
• Search for PDFs containing "Dark Knight":
q="Dark Knight"&type=pdf
• Search for MS Office documents (except PPTs) containing "Dark Knight":
q="Dark Knight"&type=msoffice,-ppt
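The filter values follow a small convention: comma-separated document types, with a leading minus to exclude one. A minimal sketch that assembles the encoded query string; `type_filter` and `filtered_query` are made-up names, and only the `market` and `type` parameters shown on this slide are used.

```python
from urllib.parse import quote, urlencode

def type_filter(include, exclude=()):
    """Join document types with commas; excluded types get a leading '-'."""
    return ",".join([*include, *(f"-{t}" for t in exclude)])

def filtered_query(q, market=None, doc_types=None):
    """Return an encoded query string with optional market and type filters."""
    params = {"q": q}
    if market:
        params["market"] = market
    if doc_types:
        params["type"] = doc_types
    # Keep commas literal so the type list stays readable.
    return urlencode(params, quote_via=quote, safe=",")

india = filtered_query('"Dark Knight"', market="en-in")
office = filtered_query('"Dark Knight"', doc_types=type_filter(["msoffice"], ["ppt"]))
print(office)
```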
![Page 55: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/55.jpg)
Output
![Page 56: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/56.jpg)
Image search parameters
• Search for images that are not offensive:
/ysearch/images?q="san francisco"&filter=yes
• Search for images that are wallpaper size:
/ysearch/images?q="san francisco"&dimensions=wallpaper
• Search for an image from a certain referrer URL:
/ysearch/images?q=yahoo&refererurl=http://www.flickr.com
• Interesting output fields:
format, file size, height, width, title, total result count
![Page 57: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/57.jpg)
News search parameters
• Search news that is less than 7 days old:
/ysearch/news?q=lokpal&age=7d
• Search news that is between 20 hours and 2 days old:
/ysearch/news?q=lokpal&age=20h2d
• Re-rank news results by date:
/ysearch/news?q=lokpal&ranking=true
• Interesting output fields:
Source, Date, Source URL
![Page 58: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/58.jpg)
EXAMPLE HACKS
![Page 59: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/59.jpg)
Duckduckgo.com
![Page 60: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/60.jpg)
Interceder
![Page 61: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/61.jpg)
Ask-boss (v1)
Hack: http://ask-boss.appspot.com Code: https://github.com/saurabhsahni/Hacks/tree/master/askBOSS
![Page 62: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/62.jpg)
webmeme.in
![Page 63: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/63.jpg)
http://hackyourworld.org/~iitb_pacman/search/
![Page 64: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/64.jpg)
I used BOSS and got data. Now how do I extract information out of it?
![Page 65: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/65.jpg)
make sense out of it?
![Page 66: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/66.jpg)
![Page 67: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/67.jpg)
Content Analysis
select * from contentanalysis.analyze where text="Yahoo! kicks off hackday"
![Page 68: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/68.jpg)
Content Analysis from a URL
select * from contentanalysis.analyze where url="http://www.cnn.com/"
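Both forms of the content analysis statement differ only in the field they filter on. A minimal sketch that builds either one; `analyze_query` is a hypothetical helper, and inputs containing double quotes are rejected rather than escaped.

```python
def analyze_query(text=None, url=None):
    """Build a contentanalysis.analyze statement for raw text or for a URL."""
    if (text is None) == (url is None):
        raise ValueError("pass exactly one of text or url")
    field, value = ("text", text) if text is not None else ("url", url)
    if '"' in value:
        raise ValueError("double quotes are not handled in this sketch")
    return f'select * from contentanalysis.analyze where {field}="{value}"'

from_text = analyze_query(text="Yahoo! kicks off hackday")
from_url = analyze_query(url="http://www.cnn.com/")
print(from_text)
print(from_url)
```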
![Page 69: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/69.jpg)
Term Extraction
select * from search.termextract where context in (select description from rss where url='')
![Page 70: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/70.jpg)
More resources
Yahoo! BOSS: http://developer.yahoo.com/boss
BOSS Technical Documentation: http://developer.yahoo.com/search/boss/boss_api_guide/
YQL: http://developer.yahoo.com/yql
Amazon Web Services: http://aws.amazon.com
OAuth: http://oauth.net/
Open Data: http://theinfo.org
Alt Search Engines: http://www.altsearchengines.com/
![Page 71: Making sense out of things on the web](https://reader036.vdocuments.site/reader036/viewer/2022062617/54bd62ee4a7959a9278b45f0/html5/thumbnails/71.jpg)
Happy hacking!