extending the multi- instance problem to model instance collaboration anjali koppal advanced machine...
TRANSCRIPT
Extending the Multi-Instance Problem to
Model Instance Collaboration
Anjali KoppalAdvanced Machine Learning
December 11, 2007
Multi-Instance Learning: Set-upGiven: A set of labeled data points (training set) and unlabeled data
points (test set)
Bag: Each data point is called a bag and is associated with a label (‘+’ or ‘-’)
Instance: A Bag is described by a set of Instances. Each instance is described by a vector of features.
Every instance also has a (hidden/unknown) label.
Problem: Predict the class of an unlabelled bag.
+
Multi-Instance Learning: Set-upGiven: A set of labeled data points (training set) and unlabeled data
points (test set)
Bag: Each data point is called a bag and is associated with a label
(‘+’ or ‘-’)
Instance: A Bag is described by a set of Instances. Each instance is described by a vector of features.
Every instance also has a (hidden/unknown) label.
Problem: Predict the class of an unlabelled bag.
+
Multi-Instance Learning: Set-upGiven: A set of labeled data points (training set) and unlabeled data
points (test set)
Bag: Each data point is called a bag and is associated with a label
(‘+’ or ‘-’)
Instance: A Bag is described by a set of Instances. Each instance is described by a vector of features.
Every instance also has a (hidden/unknown) label.
Problem: Predict the class of an unlabelled bag.+
Multi-Instance Learning: Example
Bag -> An image
Instance -> A ‘region’ in the image
Label -> {‘beach’ ‘not beach’}
beach
sand
http://www.adrhi.com/Waimanalo-Beach.jpg
water
sky
Framework for Solving MI Problems
Key Assumption:
A bag is positive if at least one of its instances is predicted to be positive.
A bag is negative if all its instances are predicted to be negative.
+
+ ––
–––
––
–
– ––
–––
––
Framework for Solving MI Problems:
Diversity DensityDiversity Density of an instance: A measure of how close an instance is to instances of different positive bags while also being far away from all instances of negative bags.
The instance with the maximum DD value is the most positive instance.
O. Maron and T. Lozano-Pérez. A framework for multiple-instance learning. In Advances in Neural Information Processing Systems 10, pages 570-576. Cambridge, MA: MIT Press, 1998
Using DD to Solve MIL Problems
Approach #1: Maron et al:
• Find the point in instance space with the maximum DD value. Call this
p.
• Use the max DD value to compute a distance threshold t.
• A test bag b with instances x_i is classified as positive if min_i(distance(x_i, p)) < t
?
? ?
?
p
d1
d2
d3
d4
Using DD to Solve MIL Problems
Approach #1: Maron et al:
• Find the point in instance space with the maximum DD value. Call this
p.
• Use the max DD value to compute a distance threshold t.
• A test bag b with instances x_i is classified as positive if min_i(distance(x_i p)) < t
+
– –
–
p
d1 < td3
> t
d2 > td
4 > t
DD-SVMApproach #2: (Y. Chen and J. Wang, image categorization by learning and reasoning with regions,
The Journal of Machine Learning Research, 5, p.913-939)
The previous approach uses one instance point (the one with the globally maximal DD value) as a prototype for all positive instances.
Idea: Represent the positive instance space by a set of instance prototypes (locally max DD values).
A bag is represented by a vector of minimum distances to each of the prototypes.
d1d2
d3
d4
d5
d1
d2
d3
d4
d5
Now train a SVM in the “bag-feature” space.
Extending DD-SVM : Motivation
Instance Prototypes (Approach #2) are an improvement over single prototypes (Approach #1) because they allow for diversity in positive instances.
e.g: A ‘beach scene’ could contain “water” or “sand” or “sky”.
sand
water
sky
Extending DD-SVM : Motivation
However Approach #2 assumes to some extent that the instances contribute independently to the bag’s class. e.g: The classification rule for ‘beach scene’ might be:
“has water and sand but not only one”.
For what is a beach without sand…
Extending DD-SVM : Proposal
Instead of computing the minimum distance between an instance prototype and any instance in a bag compute the minimum distance between an instance prototype and a linear combination of all instances in the bag.
α1
α2α3
α4
α5
Instance Prototype
For bag B with instances bi find the values of αis such thatdist( Σαibi instancePrototype) is minimized – this is a standard QP Problem.
Extending DD-SVM : Initial Results
Artificial Data Set
Instances come from two normal distributions N1 and N2.
number of bags = 200 with each bag containing 2 instances.Positive bags contain one instance from each of N1 and N2.Negative bags contain two instances from N2 (with probability t)
+ --
Train 60 61
Test 39 40
Total 99 101
Extending DD-SVM: Initial Results
Classifier 10-fold
Cross-validation
# SVs Prediction accuracy
DD-SVM 63.63% 102 62.0253%
EDD-SVM 65.29% 95 70.8861%
SVM used: LIBSVM RBF Kernel
1 -1 0.5483 1 0.50001 -1 0.6220 1 0.58681 -1 0.5593 1 0.55351 -1 0.5329 1 0.50001 1 0.6002 -1 0.53961 -1 0.5205 1 0.68431 -1 0.5379 1 0.51991 -1 0.5152 1 0.5817-1 1 0.5442 -1 0.5075-1 -1 0.5834 1 0.5414-1 1 0.5109 -1 0.5146-1 1 0.5388 -1 0.5071-1 -1 0.5106 1 0.5661
Correct_Label DD_label DD_conf EDD_label EDD_conf
Extending DD-SVM: Initial ResultsMusk 2 Data Set (UCI)
-102 bags (molecules) containing instances represented by 166 dimensions (conformations). The 2 classes are ‘musk’ and ‘non-musk’
- Average number of instances per bag ~ 64 – computationally very slow to calculate instance prototypes. Worked only with bags containing fewer than 20 instances (there were 63 such bags).
Data + --
Train 14 11
Test 28 10
Total 42 21
Classifier 10-fold
Cross-validation
# SVs Prediction accuracy
DD-SVM 71.4286% 40 66.6667% (14/21)
EDD-SVM 76.1905% 40 61.9048% (13/21)
Conclusions & Future Work
1. The Extended DD-SVM is a modification of the DD-SVM algorithm to model the influence of groups of instances instead of single, independent instances in multi instance labeling.
2. Initial results were interesting, but need to run more expansive tests to better understand its performance.
3. Useful to find datasets where a collaborative model is intuitivelya good one (unclear if that is true for the MUSK dataset).
4. Other possible directions:-- Another way of extending DD-SVM: use the average distance of k closest instances to a prototype.-- Another way to model collaborations: convex combinations of nearest neighbours (instead of all instances).
Application
mRNA TS3TS1 TS2
µ-rNA
Bag ~ mRNAInstance ~ target sites (TS)
Positive instance ~ legitimate TSPositive bag ~ protein expression