From W1-S16
Node failure
• The probability that at least one node fails is: f = 1 - (1-p)^n
• When n = 1, then f = p
• Suppose p = 0.0001 but n = 10000, then:
f = 1 - (1 - 0.0001)^10000 = 0.63 [why/how ?]
• This is one of the most important formulas to know (in general).
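The "why/how?" follows from the approximation (1-p)^n ≈ e^(-pn), so with pn = 1 we get f ≈ 1 - e^(-1) ≈ 0.63. A minimal check in Python:

```python
import math

# Probability that at least one of n independent nodes fails,
# given each node fails with probability p: f = 1 - (1 - p)**n.
def prob_any_failure(p, n):
    return 1 - (1 - p) ** n

# Single node: f reduces to p.
print(prob_any_failure(0.0001, 1))       # 0.0001 (up to float rounding)

# 10,000 nodes: (1 - 0.0001)**10000 is close to e**-1, so f is close to 0.63.
print(prob_any_failure(0.0001, 10000))
print(1 - math.exp(-1))                  # the e**-1 approximation
```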
From W2-S9
Example
• For example, suppose the hash function maps {to, Java, road} to one node. Then:
– (to,1) remains (to,1)
– (Java,1);(Java,1);(Java,1) → (Java,[1,1,1])
– (road,1);(road,1) → (road,[1,1])
• Now the REDUCE function converts (Java,[1,1,1]) → (Java,3), etc.
• Remember this is a very simple example… the challenge is to take complex tasks and express them as Map and Reduce!
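The map, shuffle, and reduce phases above can be sketched as a single-process word count (a sketch only; a real MapReduce system distributes these phases across nodes):

```python
from collections import defaultdict

def map_phase(text):
    # MAP: emit (word, 1) for every word.
    return [(word, 1) for word in text.split()]

def shuffle(pairs):
    # SHUFFLE: group values by key, e.g. (Java,1);(Java,1);(Java,1) -> (Java, [1,1,1]).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # REDUCE: sum each value list, e.g. (Java, [1,1,1]) -> (Java, 3).
    return {key: sum(values) for key, values in groups.items()}

counts = reduce_phase(shuffle(map_phase("to Java road Java to road Java")))
print(counts)  # {'to': 2, 'Java': 3, 'road': 2}
```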
From W2-S15
Similarity Example [2]
Notice, it requires some ingenuity to come up with key-value pairs. This is key to using map-reduce effectively.
From W2-S19
K-means algorithm
Let C = initial k cluster centroids (often selected randomly)
Mark C as unstable
While <C is unstable>
    Assign all data points to their nearest centroid in C.
    Compute the centroids of the points assigned to each element of C.
    Update C as the set of new centroids.
    Mark C as stable or unstable by comparing the new centroids with the previous set.
End While
Complexity: O(nkdI)
n: num of points; k: num of clusters; d: dimension; I: num of iterations
Take-away: complexity is linear in n.
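The pseudocode above translates almost line for line into Python. This is a minimal sketch with plain lists and squared Euclidean distance; the `init` parameter is an addition (not in the slide) so runs are reproducible:

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, init=None, max_iter=100):
    # Pick k initial centroids (randomly, as the slide suggests) unless
    # an explicit `init` is given for reproducibility.
    centroids = list(init) if init is not None else random.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):                          # I iterations
        clusters = [[] for _ in range(k)]
        for p in points:                               # n points, k*d work each
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:                 # C stable: stop
            break
        centroids = new_centroids
    return centroids, clusters

centroids, clusters = kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], 2,
                             init=[(0, 0), (10, 10)])
print(sorted(centroids))  # [(0.0, 0.5), (10.0, 10.5)]
```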
From W3-S14
Example: 2 Clusters
[Figure: four points A(-1,2), B(1,2), C(-1,-2), D(1,-2) plotted around the origin (0,0), with the candidate centroids marked c]
K-means Problem: the optimal solution is centroids (0,2) and (0,-2), and the clusters are {A,B} and {C,D}.
K-means Algorithm: suppose the initial centroids are (-1,0) and (1,0); then {A,C} and {B,D} end up as the two clusters.
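One assignment-and-update step from the initial centroids (-1,0) and (1,0) shows why the algorithm gets stuck: those centroids are already a fixed point of the iteration, even though the clustering is sub-optimal.

```python
# The four points and the "bad" initial centroids from the example above.
points = {'A': (-1, 2), 'B': (1, 2), 'C': (-1, -2), 'D': (1, -2)}
centroids = [(-1, 0), (1, 0)]

def nearest(p):
    # Index of the closest centroid (squared Euclidean distance).
    return min(range(2), key=lambda i: (p[0] - centroids[i][0]) ** 2
                                       + (p[1] - centroids[i][1]) ** 2)

clusters = [[name for name, p in points.items() if nearest(p) == i] for i in range(2)]
print(clusters)        # [['A', 'C'], ['B', 'D']]

# Updated centroids are the means of each cluster -- unchanged, so the
# algorithm stops here instead of reaching the optimal {A,B} / {C,D}.
new_centroids = [tuple(sum(points[n][d] for n in cl) / len(cl) for d in range(2))
                 for cl in clusters]
print(new_centroids)   # [(-1.0, 0.0), (1.0, 0.0)]
```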
From W3-S16
Bayes Rule
P(H|E) = P(E|H) · P(H) / P(E)
[P(H) is the Prior; P(H|E) is the Posterior]
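A worked example of the prior-to-posterior update. The numbers (99% sensitivity, 5% false-positive rate, 1% prior) are illustrative assumptions, not values from the slides:

```python
# Bayes rule: P(H|E) = P(E|H) * P(H) / P(E), turning a prior into a posterior.
p_h = 0.01            # prior P(H): 1% of the population has the condition
p_e_given_h = 0.99    # likelihood P(E|H): test sensitivity
p_e_given_not_h = 0.05  # P(E|~H): false-positive rate

# Total probability: P(E) = P(E|H)P(H) + P(E|~H)P(~H)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

posterior = p_e_given_h * p_h / p_e
print(round(posterior, 3))  # 0.167: a positive test moves the 1% prior to ~17%
```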
From W4-S21
Example: Iris Flower
• F = Flower; SL = Sepal Length; SW = Sepal Width; PL = Petal Length; PW = Petal Width
• Data
[Table residue: only the query row survives — SL = Large, SW = Small, PL = Medium, PW = Small, F = ?]
• Compute each class's posterior for the query row and choose the maximum.
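The original data table does not survive the transcript, so the sketch below uses a made-up four-row training set purely to show the "choose the maximum" step of a categorical naive Bayes classifier; the labels and feature values are illustrative assumptions:

```python
from collections import Counter

# Hypothetical training rows: (SL, SW, PL, PW) -> F. Not the slide's data.
train = [
    (('Large', 'Small', 'Medium', 'Small'), 'versicolor'),
    (('Large', 'Medium', 'Large', 'Large'), 'virginica'),
    (('Small', 'Large', 'Small', 'Small'), 'setosa'),
    (('Small', 'Medium', 'Small', 'Small'), 'setosa'),
]

def predict(x):
    # Naive Bayes: score(label) = P(label) * prod_i P(feature_i = x_i | label),
    # with probabilities estimated by counting (no smoothing, for brevity).
    label_counts = Counter(label for _, label in train)
    best, best_score = None, -1.0
    for label, n in label_counts.items():
        prior = n / len(train)
        likelihood = 1.0
        for i, value in enumerate(x):
            match = sum(1 for feats, l in train if l == label and feats[i] == value)
            likelihood *= match / n
        if prior * likelihood > best_score:
            best, best_score = label, prior * likelihood
    return best  # choose the maximum posterior

print(predict(('Large', 'Small', 'Medium', 'Small')))  # versicolor
```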
From W4-S25
Confusion Matrix
                      Actual Label (1)        Actual Label (-1)
Predicted Label (1)   True Positives (N1)     False Positives (N2)
Predicted Label (-1)  False Negatives (N3)    True Negatives (N4)
Label 1 is called Positive, Label -1 is called Negative
Let the number of test samples be N
N = N1 + N2 + N3 + N4.
True Positive Rate (TPR) = N1/(N1+N3)
True Negative Rate (TNR) = N4/(N4+N2)
False Positive Rate (FPR) = N2/(N2+N4)
False Negative Rate (FNR) = N3/(N1+N3)
Accuracy = (N1+N4)/(N1+N2+N3+N4)
Precision = N1/(N1+N2)
Recall = N1/(N1+N3)
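The definitions above, collected into one function; the example counts are hypothetical:

```python
# Metrics from the confusion-matrix counts N1..N4 defined above.
def metrics(n1, n2, n3, n4):
    return {
        'TPR (recall)': n1 / (n1 + n3),
        'TNR': n4 / (n4 + n2),
        'FPR': n2 / (n2 + n4),
        'FNR': n3 / (n1 + n3),
        'accuracy': (n1 + n4) / (n1 + n2 + n3 + n4),
        'precision': n1 / (n1 + n2),
    }

# Example with made-up counts: 40 TP, 10 FP, 20 FN, 30 TN (N = 100).
print(metrics(40, 10, 20, 30))
```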
From W5-S7
ROC (Receiver Operating Characteristic) Curves
• Generally a learning algorithm A will return a real number… but what we want is a label {1 or -1}
• We can apply a threshold T to A's scores:

A score:     0.7  0.6  0.5  0.2  0.1  0.09  0.08  0.02  0.01
True Label:    1    1   -1   -1    1     1    -1    -1    -1

T = 0.1:       1    1    1    1    1    -1    -1    -1    -1   →  TPR = 3/4, FPR = 2/5
T = 0.2:       1    1    1    1   -1    -1    -1    -1    -1   →  TPR = 2/4, FPR = 2/5
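The two threshold rows can be reproduced directly from the scores and true labels:

```python
# Scores from algorithm A and the true labels, as in the example above.
scores = [0.7, 0.6, 0.5, 0.2, 0.1, 0.09, 0.08, 0.02, 0.01]
labels = [1, 1, -1, -1, 1, 1, -1, -1, -1]

def tpr_fpr(threshold):
    # Predict 1 when the score reaches the threshold, -1 otherwise.
    predicted = [1 if s >= threshold else -1 for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == -1)
    return tp / labels.count(1), fp / labels.count(-1)

print(tpr_fpr(0.1))  # (0.75, 0.4): TPR = 3/4, FPR = 2/5
print(tpr_fpr(0.2))  # (0.5, 0.4):  TPR = 2/4, FPR = 2/5
```

Sweeping the threshold from high to low and plotting (FPR, TPR) at each step traces out the ROC curve.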
From W5-S9
Random Variable
• A random variable X can take values in a set which is:
– discrete and finite
  • Toss a coin and let X = 1 if it's a head and X = 0 if it's a tail. X is a random variable.
– discrete and infinite (countable)
  • Let X be the number of accidents in Sydney in a day; then X = 0, 1, 2, …
– infinite (uncountable)
  • Let X be the height of a Sydney-sider: X = 150, 150.11, 150.112, …
From W5-S13
From W7-S2
These slides are from Tan, Steinbach and Kumar
From W7-S7
From W7-S8
From W9-S9
From W9-S12
From W9-S21
From W9-S26
The Key Idea
• Decompose the User x Movie ratings matrix into:
• User x Movie = (User x Genre) x (Genre x Movie)
– The number of genres is typically small
• Or
• R ≈ UV
• Find U and V such that ||R – UV|| is minimized…
– Almost like k-means clustering… why?
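A sketch of one way to find U and V: stochastic gradient descent on the squared error of each entry. The toy ratings matrix, learning rate, and iteration count are illustrative assumptions, not values from the slides, and the textbook's own treatment may differ in detail:

```python
import random

random.seed(0)
R = [[1, 2, 4],
     [2, 1, 5],
     [3, 1, 7]]            # user x movie ratings (rank 2 by construction)
n, m, k = 3, 3, 2          # k plays the role of the small number of genres

# Random initial factors: U is n x k, V is k x m.
U = [[random.random() for _ in range(k)] for _ in range(n)]
V = [[random.random() for _ in range(m)] for _ in range(k)]

lr = 0.01
for _ in range(10000):
    for i in range(n):
        for j in range(m):
            # Error on entry (i, j) of R - UV.
            err = R[i][j] - sum(U[i][t] * V[t][j] for t in range(k))
            for t in range(k):
                u, v = U[i][t], V[t][j]
                U[i][t] += lr * err * v   # gradient step on U
                V[t][j] += lr * err * u   # gradient step on V

sq_err = sum((R[i][j] - sum(U[i][t] * V[t][j] for t in range(k))) ** 2
             for i in range(n) for j in range(m))
print(round(sq_err, 4))    # close to 0: a rank-2 factorization fits this R
```

The "almost like k-means" remark makes sense here: both alternate between fitting assignments/factors and re-estimating centroids/entries to reduce a squared-error objective.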
From W11-S9
UV Computation…
From W11-S15
This example is from Rajaraman, Leskovec and Ullman: see the textbook.