Author: Rodrigo Fonseca, George Porter, Randy H. Katz, Scott Shenker, Ion Stoica Presenter :Yinzhi Cao

Post on 21-Dec-2015


Page 1:

Authors: Rodrigo Fonseca, George Porter, Randy H. Katz, Scott Shenker, Ion Stoica

Presenter: Yinzhi Cao

Page 2:

Outline

Background
Origin
X-Trace
  Vector
  Flowing Vector
  God
  Overhead
Usage Scenarios
Potential Problems

Page 3:

Background(1)

Network Diagnosis Scenario One (Accessing a Website)

Page 4:

Background(2)

Scenario Two (Distributed File System)

Page 5:

Background(3)

Existing Methods

White Box: X-Trace

Black Box: WAP5, Sherlock

Comparison of White Box and Black Box

                          White Box   Black Box
Overhead                  Large       Small
Modification to Program   Yes         No
Notification of Program   No          Yes
Accuracy                  High        Low

Page 6:

Origin of X-Trace

How to Diagnose a Person?

1. Radioactive Material
Implies: We need a vector flowing in our body.

2. X-Ray Detector
Implies: We need a collector to monitor activities.

3. Overhead
Implies: There is no free lunch.

Page 7:

X-Trace(Vector)

Vector: X-Trace Metadata

Page 8:

X-Trace(Flowing Vector)

Flowing Vector

Vectors alone are of no use. We make them flow, and then we get the information. The following is an entity we want to diagnose.

Page 9:

X-Trace(Flowing Vector) Continued

Let Vectors Flow.

Two Ways: pushNext() and pushDown()
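The two propagation calls can be sketched in a few lines. This is a minimal Python sketch, not the X-Trace library's API: it assumes, following the X-Trace paper's description, that pushNext() carries the metadata to the next hop in the same layer and pushDown() copies it to the layer below. The class `Metadata`, its field names, and the helper `_new_op_id` are hypothetical names chosen for illustration.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    """Hypothetical stand-in for the X-Trace metadata (the "vector")."""
    task_id: str    # shared by every operation that belongs to one task
    parent_id: str  # operation that caused this one
    op_id: str      # identifier of this operation
    edge: str       # "NEXT" (same layer) or "DOWN" (lower layer)


def _new_op_id() -> str:
    # Assumption: a random 4-byte identifier is unique enough for a sketch.
    return secrets.token_hex(4)


def push_next(md: Metadata) -> Metadata:
    """Propagate to the next hop in the same layer."""
    return Metadata(md.task_id, md.op_id, _new_op_id(), "NEXT")


def push_down(md: Metadata) -> Metadata:
    """Propagate to the layer below (for example, HTTP to TCP)."""
    return Metadata(md.task_id, md.op_id, _new_op_id(), "DOWN")
```

In both calls the task identifier is preserved and the current operation becomes the parent of the new one, which is what later lets a collector link reports into a tree.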

Page 10:

X-Trace(Collector)

As with diagnosing a person, we need a god to collect all the data and reconstruct the trees offline.

The question is: how?
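One possible answer, sketched under assumptions: if every report names its task, its own operation, and the operation that caused it, the collector can group reports by task and link children to parents, whatever order the reports arrive in. The report tuple layout, the function names, and the labels below are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def build_task_trees(reports):
    """Group reports by task and index them by parent operation.

    reports: iterable of (task_id, parent_id, op_id, label) tuples.
    Returns {task_id: {parent_id: [(op_id, label), ...]}}.
    A parent_id of None marks the root of a task.
    """
    trees = defaultdict(lambda: defaultdict(list))
    for task_id, parent_id, op_id, label in reports:
        trees[task_id][parent_id].append((op_id, label))
    return trees


def render(tree, parent_id=None, depth=0):
    """Return the tree as indented lines, one per operation."""
    lines = []
    for op_id, label in tree.get(parent_id, []):
        lines.append("  " * depth + label)
        lines.extend(render(tree, op_id, depth + 1))
    return lines


# Reports may arrive out of order and interleaved across tasks.
reports = [
    ("t1", "b", "c", "TCP segment"),
    ("t1", None, "a", "HTTP client"),
    ("t2", None, "x", "DNS query"),
    ("t1", "a", "b", "HTTP proxy"),
]
trees = build_task_trees(reports)
print("\n".join(render(trees["t1"])))
```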

Page 11:

X-Trace(Overhead)

Modification of Existing Program

Page 12:

X-Trace(Overhead) Continued

Influence on Current Network Flow

1. The metadata is very small, so it adds little traffic to the network.

2. Reports are sent over separate channels, so they do not occupy the current network flow.

Page 13:

Usage Scenarios of X-Trace(1)

Web Request and Recursive DNS Queries

Page 14:

Usage Scenarios of X-Trace(2)

A Web Hosting Site

Page 15:

Usage Scenarios of X-Trace(3)

An Overlay Network

Page 16:

Potential Problems Mentioned by the Authors

Report Loss
Managing Report Traffic
Non-Tree Request Structures
Partial Deployment
Security Considerations

Page 17:

We have examined the White Box approach. Now let’s turn to another approach, which may be less accurate but may incur less overhead.

First, we need some models.

Page 18:

Authors: Victor Bahl, Ranveer Chandra, Albert Greenberg, Srikanth Kandula, David A. Maltz, Ming Zhang

Presenter: Yinzhi Cao

Page 19:

Outline

Models
  Node Model
  Network Model
  Relationship Model
How to Use Our Model
Algorithm Efficiency
Evaluation

Page 20:

Models

The main idea of this paper is to build a model of the network and use that model for diagnosis.

The model has three levels: Node, Network, and Relationship.

Page 21:

Node Model

A node has three states: up, troubled, and down.

Page 22:

Network Model

Graph. What’s more? Inference Graph.

Page 23:

Relationship Model(1)

Noisy-Max

Page 24:

Backup Slides 1

First, we use the model below. The circle means that with probability x the output equals the input, and with probability 1-x the output is up.

Let’s use an unordered pair {x,y} to represent node status.

{1,1} = {1}: up
{0,1}: troubled
{0,0} = {0}: down

Page 25:

Backup Slides 2

So the status of the child can be represented as follows.

Status(Child) = |Status(Parent1) • Status(Parent2)|

Here • means the outer product, and we define |(x,y)| = <x,y> = xy.
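The pair encoding can be checked with a few lines of Python. This is a minimal sketch under two assumptions: the two operands of the outer product are the statuses of two different parents, and every edge passes its parent's state through unchanged (x = 1 in the circle model). The names `UP`, `TROUBLED`, `DOWN`, and `combine` are illustrative.

```python
# Unordered pairs from the backup slide, stored as sets:
# {1,1} collapses to {1} and {0,0} collapses to {0}.
UP = frozenset({1})
TROUBLED = frozenset({0, 1})
DOWN = frozenset({0})


def combine(parent1, parent2):
    """Outer product of two statuses, reducing each pair (x, y) to x*y."""
    return frozenset(x * y for x in parent1 for y in parent2)


for a, b in [(UP, UP), (UP, TROUBLED), (TROUBLED, TROUBLED), (TROUBLED, DOWN)]:
    print(sorted(a), sorted(b), "->", sorted(combine(a, b)))
```

Under this encoding the child always ends up in the worse of its parents' states, which is the deterministic core of the noisy-max relationship.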

Page 26:

Relationship Model(2)

Selector

Page 27:

Relationship Model(3)

Failover

Page 28:

Backup Slides 3

We use the definition from before.

Status(Parent1) = {x1,x2}, Status(Parent2) = {y1,y2}.

Status(Child) = {(x1+x2)x1 + not(x1+x2)y1, (x1+x2)x2 + not(x1+x2)y2}

Here + means OR and * means AND; the * is omitted in the formula.
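The failover formula can be tried out directly. The sketch below reads + as OR and the omitted product as AND; that reading is an assumption, chosen because it is the one under which the child follows the primary parent (Parent1) unless the primary is down, and only then falls back to the secondary (Parent2). Pairs are stored as sorted tuples so that x1 <= x2; the names are illustrative.

```python
# Pair encoding from the earlier backup slide, as sorted tuples (x1, x2).
UP = (1, 1)
TROUBLED = (0, 1)
DOWN = (0, 0)


def failover(primary, secondary):
    """Child status under the failover relationship (assumed reading)."""
    x1, x2 = primary
    y1, y2 = secondary
    alive = x1 | x2                          # (x1+x2): 1 unless primary is down
    c1 = (alive & x1) | ((1 - alive) & y1)   # (x1+x2)x1 + not(x1+x2)y1
    c2 = (alive & x2) | ((1 - alive) & y2)   # (x1+x2)x2 + not(x1+x2)y2
    return tuple(sorted((c1, c2)))


print(failover(TROUBLED, UP))   # primary answers, so the child is troubled
print(failover(DOWN, UP))       # primary is down, so the child follows the secondary
```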

Page 29:

How to Use the Model?

Fault Localization on the Inference Graph

Page 30:

Algorithm Efficiency(1)

Calculations inside the Inference Graph (noisy-max relationship)

Reduce time complexity from O(3^n) to O(n)
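To see where the saving comes from: with three states per parent, enumerating every joint parent state costs 3^n terms for n parents, while a product over parents costs n. The sketch below derives a closed form from the circle model on the backup slide (with probability d the parent's state passes through, otherwise the child sees "up") plus the rule that the child takes the worst state it sees. The function names and the example numbers are assumptions for illustration, not the authors' code.

```python
from itertools import product
from math import isclose, prod

UP, TROUBLED, DOWN = 0, 1, 2  # ordered from healthiest to worst


def noisy_max_linear(parents):
    """parents: list of (d, (p_up, p_troubled, p_down)). Runs in O(n)."""
    # Child is up only if every parent looks up through its edge.
    p_up = prod((1 - d) + d * p[UP] for d, p in parents)
    # Child is down if at least one parent looks down through its edge.
    p_down = 1 - prod(1 - d * p[DOWN] for d, p in parents)
    return (p_up, 1 - p_up - p_down, p_down)


def noisy_max_bruteforce(parents):
    """Same distribution by enumerating all 3^n joint states."""
    # State of each parent as seen through its noisy edge.
    seen = [((1 - d) + d * p[UP], d * p[TROUBLED], d * p[DOWN])
            for d, p in parents]
    out = [0.0, 0.0, 0.0]
    for states in product((UP, TROUBLED, DOWN), repeat=len(parents)):
        weight = prod(s[state] for s, state in zip(seen, states))
        out[max(states)] += weight  # child takes the worst state
    return tuple(out)


example = [
    (0.9, (0.7, 0.2, 0.1)),
    (0.5, (0.2, 0.3, 0.5)),
    (1.0, (0.9, 0.05, 0.05)),
]
fast = noisy_max_linear(example)
slow = noisy_max_bruteforce(example)
print(all(isclose(a, b, abs_tol=1e-12) for a, b in zip(fast, slow)))
```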

Page 31:

Algorithm Efficiency(2)

Comparison of Multiple Input and Observation

Two Methods to Use

1. Examine Data Sets with High Probability and Ignore Small Ones

2. Dynamic Programming (Reduce Redundancy)

Page 32:

Algorithm Efficiency(3)

The authors state two observations that support these two methods.

1. It is very likely that at any point in time only a few root-cause nodes are troubled or down.

2. Since a root-cause is assigned to be up in most assignment vectors, the evaluation of an assignment vector only requires re-evaluation of states at the descendants of root-cause nodes that are not up.
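The first observation suggests a simple way to shrink the search space: only generate assignment vectors in which at most k root causes are not up. The sketch below is one way to enumerate them; the function name `candidate_assignments`, the parameter `k`, and the node names are hypothetical, and scoring each candidate against the observations is left out.

```python
from itertools import combinations, product


def candidate_assignments(root_causes, k):
    """Yield assignments with at most k root causes that are not up.

    Each assignment is a dict that lists only the faulty root causes and
    their state; every root cause absent from the dict is taken to be up.
    """
    for size in range(k + 1):
        for faulty in combinations(root_causes, size):
            for states in product(("troubled", "down"), repeat=size):
                yield dict(zip(faulty, states))


roots = ["dns", "web", "sql"]
print(len(list(candidate_assignments(roots, 1))))  # 7 instead of 3**3 = 27
```

Because each candidate lists only the root causes that are not up, it also fits the second observation: evaluating a candidate only needs to revisit the descendants of the nodes named in the dict.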

Page 33:

Evaluation

Inference Graph Established

Page 34:

Accuracy Compared with Others

Page 35:

Time to Localize Faults

Page 36:

Impact of Errors in Inference Graph

Page 37:

Open Issues

The Node Model is very simple: it has only three states. Can we have a continuous model instead?

Can we bring a stochastic-process concept such as a Markov chain into this model?