mashed up playlist

THE MASHED UP PLAYLIST part II David Peterson @davidseth #w3c http://www.flickr.com/photos/soyignatius/

Upload: david-peterson

Post on 30-Oct-2014


Category:

Technology



DESCRIPTION

Presented at Web Directions South 09. The ABC launched three new socially networked digital radio websites in July 2009: ABC Dig Music, ABC Jazz and ABC Country. They are the first of several ABC projects involving content aggregation. As well as having slick, highly usable designs, the music platform integrates with various sources including MusicBrainz, YouTube, Last.fm and Wikipedia. This aggregation functionality graphically illustrates the possibilities of Semantic Web technology for an editorial organisation such as the ABC.

TRANSCRIPT

Page 1: Mashed Up Playlist

THE MASHED UP PLAYLIST part II

David Peterson @davidseth #w3c http://www.flickr.com/photos/soyignatius/

Page 2: Mashed Up Playlist

David Peterson @davidseth

Page 3: Mashed Up Playlist

Challenge

Create a snapshot of an artist

Page 4: Mashed Up Playlist

Problem

<xml>
  <track>
    <title>Purple Rain</title>
    <artistName>Prince</artistName>
  </track>
</xml>
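To make the starting point concrete, here is a minimal Python sketch that reads the feed fragment above. The function name `read_tracks` is an assumption for illustration; the element names come from the slide. All it can recover is a title and an artist name, which is exactly the problem: nothing in the feed says which "Prince" is meant.

```python
import xml.etree.ElementTree as ET

# The playlist fragment from the slide: a title and an artist name, nothing else.
PLAYLIST_XML = """
<xml>
  <track>
    <title>Purple Rain</title>
    <artistName>Prince</artistName>
  </track>
</xml>
"""

def read_tracks(xml_text):
    """Return a list of (title, artist_name) tuples found in the feed."""
    root = ET.fromstring(xml_text.strip())
    tracks = []
    for track in root.iter("track"):
        title = track.findtext("title", default="").strip()
        artist = track.findtext("artistName", default="").strip()
        tracks.append((title, artist))
    return tracks

tracks = read_tracks(PLAYLIST_XML)
```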

Page 5: Mashed Up Playlist

Into

Page 6: Mashed Up Playlist

It’s all about storytelling

Page 7: Mashed Up Playlist

Shared Understanding

• Can’t tell a story if the other person doesn’t get what we mean
• Or doesn’t even speak the same language
• Imagine
  – explaining what a kiwi was
  – or what a sheep was

Page 8: Mashed Up Playlist

• The story matters
• ... but ...
• You never really have all the information you need, whether big or small

Page 9: Mashed Up Playlist

You Just don’t Always Know

• Someone else knows more than you
• How to find it?

Page 10: Mashed Up Playlist

One Exception

Page 11: Mashed Up Playlist

Semantic Web

• Core idea – you never really know the entire picture
• This is a good thing
• Freedom

Page 12: Mashed Up Playlist

Closed World

Open World

http://www.flickr.com/photos/almasryalyoum_e/

Page 13: Mashed Up Playlist

Finding a Solution

• Which APIs to use
• Which APIs can we use
• How can we combine data from multiple sources
• How can we automate it

Page 14: Mashed Up Playlist

The Curse of too much

• There are over 50 APIs listed on programmableweb.com
• Too many to look into
• Each has its own API methods and return data formats
  – JSON, XML, RSS, RDF !!!

Page 15: Mashed Up Playlist

Take your Pick

• APIs everywhere
  – BBC Music
  – Discogs
  – Last.fm
  – MusicBrainz
  – Yahoo Music
  – Flickr
  – YouTube
  – The Hype Machine

Page 16: Mashed Up Playlist

Finding the key

• One common feature was the usage of a MusicBrainz ID
  – Last.fm
  – Discogs
  – Freebase
  – Wikipedia/DBpedia
  – BBC
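A shared identifier is what makes combining sources mechanical. The Python sketch below is only an illustration of that idea: `merge_by_mbid`, the source names and the placeholder payloads are all assumptions, not the ABC implementation or any real API response. The only real value is the MusicBrainz ID shown later in the deck.

```python
from collections import defaultdict

# MusicBrainz ID as it appears later in the deck.
PRINCE_MBID = "070d193a-845c-479f-980e-bef15710653e"

def merge_by_mbid(sources):
    """Fold records from several sources into one profile per MusicBrainz ID.

    `sources` maps a source name to a list of dicts, each carrying an "mbid" key.
    """
    profiles = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            # Normalise the key: some sources wrap the ID in braces or upper-case it.
            mbid = record["mbid"].strip("{}").lower()
            for field, value in record.items():
                if field != "mbid":
                    # Prefix each field with its source so nothing is overwritten.
                    profiles[mbid][source_name + "." + field] = value
    return dict(profiles)

# Placeholder payloads (hypothetical, not real API output).
sources = {
    "source_a": [{"mbid": PRINCE_MBID, "bio": "<bio text from source A>"}],
    "source_b": [{"mbid": "{" + PRINCE_MBID.upper() + "}",
                  "homepage": "<homepage URL from source B>"}],
}
profiles = merge_by_mbid(sources)
```

Both records end up in a single profile because they normalise to the same ID, which is the whole point of having a common key.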

Page 17: Mashed Up Playlist

Eureka!

• Great, now all I had to do was use the MusicBrainz API to look up the ID and I was done. Easy...
• :(
• The search API sucked. It returned too many fuzzy results
• crap

Page 18: Mashed Up Playlist

Back to the future

• This is where the Semantic Web enters the picture
  – All that stuff about storytelling
  – Shared understanding
  – URIs (web links)

Page 19: Mashed Up Playlist

SPARQL

Think of it as Google with a WHERE clause

Page 20: Mashed Up Playlist

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?artist WHERE {
  ?artist foaf:name "Prince"@en .
  ?artist a <http://dbpedia.org/ontology/MusicalArtist> .
}
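One way to send a query like this from code is a plain HTTP GET against a SPARQL endpoint. The Python sketch below only builds the request URL. The endpoint address and the `format` parameter are assumptions about the public DBpedia service (`query` is the standard SPARQL protocol parameter), and the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed address of the public DBpedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

QUERY_TEMPLATE = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist WHERE {{
  ?artist foaf:name "{name}"@en .
  ?artist a <http://dbpedia.org/ontology/MusicalArtist> .
}}
"""

def escape_literal(text):
    """Escape backslashes and double quotes for use inside a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')

def artist_query(name):
    """Fill the artist name from the playlist feed into the query from the slide."""
    return QUERY_TEMPLATE.format(name=escape_literal(name))

def query_url(query, endpoint=DBPEDIA_ENDPOINT):
    """Build a GET URL; the 'format' value is an assumption about the endpoint."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)

url = query_url(artist_query("Prince"))
```

Fetching `url` (for example with `urllib.request.urlopen`) would return the result set, assuming the endpoint is reachable and behaves as described.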

Page 21: Mashed Up Playlist

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia2: <http://dbpedia.org/property/>

SELECT ?artist ?bio ?url ?album WHERE {
  ?artist foaf:name "Prince"@en .
  ?artist a <http://dbpedia.org/ontology/MusicalArtist> .
  ?artist dbpedia2:abstract ?bio .
  ?artist foaf:page ?url .
  OPTIONAL {
    ?album <http://dbpedia.org/ontology/artist> ?artist .
    ?album rdfs:label "Purple Rain"@en .
  }
}
LIMIT 1
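Endpoints commonly return results in the W3C SPARQL JSON results layout (`head.vars` plus `results.bindings`). The Python sketch below reads the single row that `LIMIT 1` allows. The sample response is a hand-written placeholder with example.org URIs, not real DBpedia output, and `first_row` is a hypothetical helper name.

```python
import json

# Hand-written placeholder in the SPARQL JSON results layout (not real DBpedia data).
# ?album is absent from the binding, as happens when the OPTIONAL block does not match.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["artist", "bio", "url", "album"]},
  "results": {"bindings": [
    {
      "artist": {"type": "uri", "value": "http://example.org/resource/artist"},
      "bio": {"type": "literal", "xml:lang": "en", "value": "<abstract text>"},
      "url": {"type": "uri", "value": "http://example.org/wiki/artist"}
    }
  ]}
}
"""

def first_row(response_text):
    """Return the first result row as {variable: value}, or None if there are no rows.

    Variables left unbound by OPTIONAL are reported as None.
    """
    data = json.loads(response_text)
    variables = data["head"]["vars"]
    bindings = data["results"]["bindings"]
    if not bindings:
        return None
    row = bindings[0]
    return {var: row[var]["value"] if var in row else None for var in variables}

result = first_row(SAMPLE_RESPONSE)
```

An empty `bindings` list maps to `None`, matching the "one result or nothing" behaviour the query is designed for.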

Page 22: Mashed Up Playlist

Pinpoint results

• This returns ONE result
• “exactly” what we are looking for (or nothing!)

Page 23: Mashed Up Playlist

{170d193a-845c-479f-980e-bef15710653e}

http://www.flickr.com/photos/riseofphoenix/

Page 24: Mashed Up Playlist

{070d193a-845c-479f-980e-bef15710653e}

http://www.flickr.com/photos/angeldew/

Page 25: Mashed Up Playlist

Raw Data

• Not too pretty to look at
• But computers LOVE this stuff

Page 26: Mashed Up Playlist

So, what do we get

• Disambiguation
• MusicBrainz ID
• Discography
• Related Artists
• Official homepage
• Bio
• Credit card details (in Semantic Web 2.0)

Page 27: Mashed Up Playlist

The Rosetta Stone

• MusicBrainz ID is our key to the wild web of APIs
• Wikipedia URL is the key to Semantic Web
• One happy family

http://www.flickr.com/photos/vportals/
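The Wikipedia URL works as a key because DBpedia resource names follow English Wikipedia article titles. A minimal Python sketch of that mapping, assuming an English Wikipedia article URL as input (the function name `dbpedia_uri` is hypothetical, and edge cases such as special characters in titles are not handled):

```python
from urllib.parse import urlsplit

WIKI_PREFIX = "/wiki/"
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

def dbpedia_uri(wikipedia_url):
    """Map an English Wikipedia article URL to the matching DBpedia resource URI."""
    parts = urlsplit(wikipedia_url)
    if parts.netloc != "en.wikipedia.org" or not parts.path.startswith(WIKI_PREFIX):
        raise ValueError("not an English Wikipedia article URL: " + wikipedia_url)
    # The article title (with underscores) is reused as the resource name.
    title = parts.path[len(WIKI_PREFIX):]
    return DBPEDIA_RESOURCE + title

uri = dbpedia_uri("http://en.wikipedia.org/wiki/Purple_Rain")
```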

Page 28: Mashed Up Playlist

• [insert LOD graph]

Page 29: Mashed Up Playlist

Take a look

[browser]

Page 30: Mashed Up Playlist

Hindsight is 20/20

... or lessons learned

Page 31: Mashed Up Playlist

Drupal Sucks

• Drupal performance, what performance?
• Out of the box it’s been beaten with an ugly stick

Page 32: Mashed Up Playlist

Don’t use Drupal

• To get the best performance out of Drupal, don’t use Drupal

Page 33: Mashed Up Playlist

Pressflow

• Key patches and enhancements
• Releases mirror official Drupal releases
• Big players are using it
  – Drupal.org
  – ABC
  – Music labels
  – Newspapers

Page 34: Mashed Up Playlist

Start your Engines

MySQL base install is ... lacking
• MyISAM == slow
• Use Percona XtraDB
• ... or ... InnoDB

Page 35: Mashed Up Playlist

Reduce your footprint

• APC
  – PHP app is compiled & cached in memory

Page 36: Mashed Up Playlist

Search

• Drupal’s built in search can be a dawg
• Solr
  – Much faster search
  – Offers faceting
  – Can become a platform in its own right.
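Faceting in Solr is switched on through query parameters (`facet=true` plus one `facet.field` per field). The Python sketch below builds such a request URL. The host, path and field names are assumptions for illustration and would depend on the actual Solr install and schema.

```python
from urllib.parse import urlencode

def solr_facet_url(base_url, text, facet_fields, rows=10):
    """Build a Solr select URL that asks for facet counts alongside the hits."""
    params = [("q", text), ("rows", rows), ("wt", "json"), ("facet", "true")]
    # Solr expects the facet.field parameter to be repeated, once per field.
    params += [("facet.field", field) for field in facet_fields]
    return base_url + "?" + urlencode(params)

# Hypothetical local Solr instance and field names.
url = solr_facet_url("http://localhost:8983/solr/select", "purple rain", ["artist", "genre"])
```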

Page 37: Mashed Up Playlist

A Fresh Coat of Paint

• Varnish
  – Last but certainly not least
  – Up to 10 million hits per hour

Page 38: Mashed Up Playlist

What’s Next?

• Project Mercury
• Drupal 7
  – RDFa
  – Views 3
  – FOAF+SSL
    • open social networking
    • everything under your control