jamiefurness

Upload: sicsa2010

Post on 10-Apr-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/8/2019 JamieFurness

    1/21

    The Effects of Churn on Complex Search Techniques

    Jamie Furness, Mario Kolberg

    Dept. of Computing Science andMathematics, University of Stirling

    1 of 18

  • 8/8/2019 JamieFurness

    2/21

    Introduction

    Structured P2P networks dont tend to support all types of complexqueries.

    Unstructured networks do, and hence are more popular. However,they are inefficient.Using efficient broadcasting it is possible to support all types of complex queries over structured P2P.

    We investigate the effects of churn on broadcast search over Chordand Pastry.

    2 of 18

  • 8/8/2019 JamieFurness

    3/21

    What are Complex Queries?

    Exact-match nine inch nails - the slip (2008) - letting you [v0].mp3Keyword nine inch nails, nin, the slip

    Range bit-rate: 256-320Wild-card nine inch nails *Semantic 9 inch nails

    Regex ^nine inch nails .*\.(mp3|ac|alac)$

    3 of 18

  • 8/8/2019 JamieFurness

    4/21

    Unstructured P2P Networks

    No structure, links established arbitrarily.Flooding or random-walks used to retrieve data.Easy to implement.Inefficient, low success rate.

    4 of 18

  • 8/8/2019 JamieFurness

    5/21

    Structured P2P Networks

    Network of computers, all with equal importance.Nodes are assigned a key, often based on their IP address.Data is assigned a key, often based on its le-name.Distributed Hash Table (DHT) interface can store data or retrievedata given its corresponding key.Examples: Chord, Pastry...

    5 of 18

  • 8/8/2019 JamieFurness

    6/21

    Chord

    Specic type of structured P2P network.Nodes placed in a ring, ordered based on their key.Each node maintains links to O (log N ) other nodes, placedlogarithmically around the ring.

    Requests can be routed in O (log N ) messages.

    6 of 18

  • 8/8/2019 JamieFurness

    7/21

    Pastry

    Another type of structured P2P network.

    Each node maintains a routing table with O (logB N ) rows, eachcontaining O (B ) entries.Requests can be routed in O (log2B N ) messages.B is a tunable parameter, default is 4.

    7 of 18

  • 8/8/2019 JamieFurness

    8/21

    Why are Complex Queries Hard?

    Structured networks make use of consistent hashing.Both types of keys are generated using the same hash function,usually SHA-1.

    Reduces arbitrary length keys to a xed identier space.Balances load, relieving hot-spots.

    Exampletrack 42aef171c1c0accaeee38c605d98ab5db51a13f5

    track1 ea6b175de80bd33899cdf4a0530059aabffb8f66track2 08979fbae1fe1e5b06b3646138be36b27d583f34Not locality aware, patterns in keys are lost after hashing.

    8 of 18

  • 8/8/2019 JamieFurness

    9/21

    Broadcasting Complex Queries

    Broadcasting supports all types of complex queries.Performed by forwarding the query to a few nodes, assigning eachof them an area to cover.Queries are processed at each node.Many more messages than regular operations in structurednetworks... but many less than ooding in unstructured networks.

    9 of 18

  • 8/8/2019 JamieFurness

    10/21

    Broadcasting Complex Queries

    Broadcasting supports all types of complex queries.Performed by forwarding the query to a few nodes, assigning eachof them an area to cover.Queries are processed at each node.Many more messages than regular operations in structurednetworks... but many less than ooding in unstructured networks.

    9 of 18

  • 8/8/2019 JamieFurness

    11/21

    Broadcasting Complex Queries

    Broadcasting supports all types of complex queries.Performed by forwarding the query to a few nodes, assigning eachof them an area to cover.Queries are processed at each node.Many more messages than regular operations in structurednetworks... but many less than ooding in unstructured networks.

    9 of 18

  • 8/8/2019 JamieFurness

    12/21

    Broadcasting Complex Queries

    Broadcasting supports all types of complex queries.Performed by forwarding the query to a few nodes, assigning eachof them an area to cover.Queries are processed at each node.Many more messages than regular operations in structurednetworks... but many less than ooding in unstructured networks.

    9 of 18

  • 8/8/2019 JamieFurness

    13/21

    Issues with Broadcasting & Simulation Aims

    Our aim was to compare the performance of broadcasting a searchquery over both Chord and Pastry while the network is under churn,focusing on some specic areas:

    Success rateBandwidth requirementsData replication

    10 of 18

  • 8/8/2019 JamieFurness

    14/21

    Simulation Network set-up

    Simulations developed using OverSim.Network sizes of 1,000 and 10,000 nodes.Average node lifetime from 100 seconds to 10,000 seconds.Replication rate from 1 to 32.

    11 of 18

  • 8/8/2019 JamieFurness

    15/21

    Churn vs. Success Rate

    The broadcast success rate drops as the churn rate increases.A Chord network with maintenance delay of 10 seconds actually hasa higher success rate than Pastry.

    12 of 18

  • 8/8/2019 JamieFurness

    16/21

    Churn vs. Bandwidth

    Chord uses a constant amount of bandwidth.Pastrys use of reactive maintenance causes the bandwidth usage toincrease as the churn rate increases.

    13 of 18

  • 8/8/2019 JamieFurness

    17/21

    Neighbour Replication

    Replicates data at neighbouring nodes.Maintenance is cheap.Used by both Chord and Pastry.Bad for broadcasting.

    14 of 18

  • 8/8/2019 JamieFurness

    18/21

    Multi-publication Replication

    Replicates data evenly around the network.Maintenance is more expensive.Good for broadcasting.

    15 of 18

  • 8/8/2019 JamieFurness

    19/21

    Comparing Neighbour and Multi-publication Replication

    Multi-publication replication results in a higher success rate thanneighbour replication.When using neighbour replication, what node fails is important, not just how many.

    16 of 18

  • 8/8/2019 JamieFurness

    20/21

    How Much Replication?

    With no replica, the success rate dropped as low as 30%.With 16 replica, the success rate was 100%, even under high churn.

    17 of 18

  • 8/8/2019 JamieFurness

    21/21

    Summary

    The term complex query can cover many different types of queries.Complex querying is possible using broadcasting.

    Much more efficient than ooding over unstructured networks.Using a Chord network turned appropriately outperforms Pastry,and uses far less bandwidth.Churn is very bad!

    By choosing an appropriate data replication strategy we canachieve a reasonable success rate even under high churn.

    18 of 18